depicts the x-ray structures of ppGalNAcT-2 with PDB codes 2FFU and 2FFV. A comparison of these two x-ray structures indicates that the conformational changes of the catalytic domain associated with ligand binding and processing are concentrated in two long loops: loop A (residues 89–98), located relatively far from the active site (modeled in 2FFV as in 2FFU); and loop B (residues 361–377), located at one end of the active site. Loop B undergoes a large movement, shifting by as much as 24 Å from one side of the peptide binding groove to the other. In addition, smaller conformational changes of the protein backbone occur in three other short loops: loop C (residues 127–134), an exposed loop on the side opposite the active site; loop D (residues 287–297), forming a short α-helix located more than 9 Å away from the peptide binding groove, that moves by approximately 5 Å in the absence of the peptide; and loop E (residues 330–333), showing a 4 Å shift. These loops are shown in , colored blue in 2FFU and orange in 2FFV. In analyzing our simulations, we are particularly interested in the flexibility of loops A, B and E and their possible role in ligand binding.
To assess the dynamical properties of ppGalNAcT-2, we performed extensive MD simulations of a series of complexes between the enzyme and its different ligands based upon the x-ray structures described above. With solvent molecules treated explicitly, including Na+ and Cl− counter ions, the simulation systems contained up to 95,000 atoms. Simulations of up to 40 ns per trajectory resulted in a total simulation time of more than 600 ns (). To study the relative dynamics of the two domains, our initial simulations included both the catalytic and lectin domains, using the 2FFU crystal structure from which UDP and EA2 had been removed. We also simulated the same system in the presence of the substrates, but with the lectin domain positioned near the catalytic domain, as guided by the mT1 structure. This simulation allowed us to test whether the binding of substrates triggered movement of the lectin domain away from the catalytic domain. A comparison of the results of these simulations indicated that the average flexibility pattern of each domain, as measured by RMSD per residue, is independent of the presence of the other domain or the distance between the domains. Thus, to reduce computational requirements, subsequent simulations focused on the catalytic domain to test the influence of substrates on the dynamics of the enzyme at different steps in the catalytic process. In this second series of MD trajectories, we simulated the catalytic domain as in the ternary complex in the absence of substrates, in the presence of UDP-GalNAc, in the presence of UDP-GalNAc and EA2 (substrate complex), in the presence of UDP and glycosylated EA2 (product complex), and in the presence of only UDP or UDP and EA2. To evaluate the dynamics of the enzyme after the reaction (structure 2FFV), we also simulated the catalytic domain as in the binary complex, either alone or in the presence of the reaction product UDP.
Summary of MD simulations performed on ppGalNAcT-2 and its substrates. The letter “Y” indicates that the corresponding component is included in the simulated system.
We compared the simulations of the catalytic domain starting from the ternary complex 2FFU in the presence or absence of UDP-GalNAc donor and peptide acceptor and starting from the binary complex 2FFV in the presence and absence of the reaction product UDP.
Protein flexibility in the presence and absence of the substrates
As a measure of the average flexibility of various regions in the protein, we first calculated for each residue the root-mean-square deviation (RMSD) of the alpha carbon (Cα) from its average position and compared it to the corresponding RMSD between the two crystal structures, 2FFU and 2FFV. The results of the simulations done in the presence of UDP-GalNAc and EA2 (C(UG)Px), in the presence of UDP-GalNAc only (C(UG)x) and in the absence of substrates (Cx) are represented in . The theoretical and experimental RMSDs agree well and the MD simulations correctly identify the flexible regions in the protein. In particular, the highest peaks in calculated RMSD as a function of residue number correspond to the flexible loops described above.
Figure 2 RMSD per residue calculated for the catalytic domain from simulations starting from the x-ray structure 2FFU in the absence of both substrates (red, trajectory Cx, see for notation rule), in the presence of UDP-GalNAc (green, C(UG)x), and in the …
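As a minimal sketch of the per-residue measure described above, the following NumPy code computes the RMSD of each Cα atom from its trajectory-averaged position, together with the per-residue Cα distance between two structures for comparison. The array layout and function names are illustrative assumptions, the frames are assumed to have been superimposed on a common reference beforehand, and the source does not state which analysis software was used.

```python
import numpy as np

def per_residue_rmsd(traj):
    """RMSD of each C-alpha atom from its average position.

    traj: array of shape (n_frames, n_residues, 3) holding C-alpha
    coordinates in Angstrom. Assumption: frames are already superimposed.
    Returns an array of shape (n_residues,).
    """
    traj = np.asarray(traj, dtype=float)
    mean_pos = traj.mean(axis=0)                    # average structure
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)   # squared deviation per frame
    return np.sqrt(sq_dev.mean(axis=0))             # time average, then root

def per_residue_distance(coords_a, coords_b):
    """C-alpha distance per residue between two superimposed structures."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    return np.linalg.norm(a - b, axis=1)
```

Residues whose value from `per_residue_rmsd` is large in the simulation and whose `per_residue_distance` between the two crystal structures is also large would correspond to the flexible loops discussed in the text.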
The highest RMSD (13 Å) is seen in loop A in the absence of ligands. However, when both UDP-GalNAc and the peptide are present, the RMSD of loop A relative to the average position is reduced significantly, from 13 Å to 4 Å (, peak A). Even though loop A (missing in the 2FFV “open” x-ray structure) was consistently modeled as in the “closed” 2FFU x-ray structure, its dynamics depends not only on the presence of ligands but also on the state of loop B. In simulations starting from the structure with loop B in the closed conformation (), loop A appears more stable, because loops A and B interact through a hydrogen bond between the main chain oxygen atom of N102 and the R362 side chain. In the absence of UDP-GalNAc, loop B opens and this interaction is lost. Consequently, loop A becomes more flexible. In simulations starting from the structure with loop B in an open conformation (), this interaction is absent from the beginning and the conformational space sampled by loop A is much larger.
Figure 3 RMSD of loop A as a function of simulation time (ns) calculated using the conformation in the x-ray structure (2FFU) as reference. Results are shown for simulations starting from closed (a) and open (b) conformations of the long loop, both in the presence …
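A loop RMSD time series relative to a reference structure, as plotted for loop A, can be sketched as follows. This is an illustration under stated assumptions, not the authors' protocol: each frame is first superimposed on the reference using a set of fitting atoms (for example, the rigid core of the catalytic domain) with the Kabsch algorithm, and the RMSD is then evaluated on the loop atoms only. The choice of fitting atoms and all function names are hypothetical.

```python
import numpy as np

def kabsch_fit(mobile, target):
    """Optimal rotation (row-vector convention) and centroids for fitting
    `mobile` onto `target`; both are (n_atoms, 3) arrays."""
    mc, tc = mobile.mean(axis=0), target.mean(axis=0)
    H = (mobile - mc).T @ (target - tc)          # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))           # guard against reflections
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    return R, mc, tc

def loop_rmsd_series(traj, ref, fit_idx, loop_idx):
    """RMSD of the loop atoms per frame after fitting on `fit_idx` atoms.

    traj: (n_frames, n_atoms, 3); ref: (n_atoms, 3).
    fit_idx and loop_idx are index lists (assumed, user-chosen selections).
    """
    ref = np.asarray(ref, dtype=float)
    series = []
    for frame in np.asarray(traj, dtype=float):
        R, mc, tc = kabsch_fit(frame[fit_idx], ref[fit_idx])
        aligned = (frame - mc) @ R + tc          # whole frame in the reference frame
        diff = aligned[loop_idx] - ref[loop_idx]
        series.append(np.sqrt((diff ** 2).sum(axis=1).mean()))
    return np.array(series)
```

Fitting on the core rather than on the loop itself keeps rigid-body displacement of the loop in the RMSD, which is what a plot of loop opening is meant to capture.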
The flexibility of loop B is slightly lower than that of loop A in the absence of substrates but is significantly reduced, from 12 Å to less than 4 Å RMSD, upon UDP-GalNAc binding. However, the subsequent binding of the peptide does not provide additional stabilization. Since this loop interacts directly with the donor and acceptor and its dynamics greatly affects the binding of substrates, its motions will be discussed in more detail in the following section.
Access to the active site and mechanism of donor binding: role of W331
In the simulation starting from the closed conformation, but with both substrates removed (Cx), loop B undergoes a conformational transition towards the open state (). This is a slow process that starts almost 1.5 ns after the simulation begins and lasts for about 25 ns. To identify the factors that trigger this large conformational change, we analyzed the residues in the vicinity of loop B. Interacting with its end, we identified W331 (part of the longer, conserved WGGEN motif within loop E), whose side chain flips by Δχ1 ≈ 180° around the Cα-Cβ bond after 1.5 ns of simulation. This rearrangement () creates an unfavorable steric clash with P366 and Y367, which are both well conserved, and with a cluster of bulky residues (H365, F369, F377). The W331 χ1 rotation also appears to trigger the slow movement of loop B towards the opposite side of the peptide binding groove. W331 maintains the flipped conformation until loop B is stabilized in an intermediate conformation at about 8 Å RMSD relative to the completely open state. At this point W331 is no longer interacting with loop B. Thus, its side chain is free to switch back to the initial conformation ().
Figure 4 Conformational changes of loop B are correlated with the rotameric state of the W331 side-chain. (a, b) RMSD of loop B (line, left y-axis) relative to the 2FFU conformation, and the side-chain orientation of W331 (χ1 angle, dots, right y-axis), …
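The χ1 angle tracked for W331 is the dihedral defined by the N, Cα, Cβ and Cγ atoms of the residue. A minimal sketch of how such a dihedral and the size of a rotamer flip could be computed from atomic coordinates is given below; the function names are illustrative, and extracting the four atom positions from a trajectory is assumed to be done elsewhere.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points.
    For chi1 of a tryptophan: p0=N, p1=CA, p2=CB, p3=CG."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)                 # unit vector along the central bond
    v = b0 - np.dot(b0, b1) * b1                 # b0 projected off the bond axis
    w = b2 - np.dot(b2, b1) * b1                 # b2 projected off the bond axis
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))

def angular_change(chi_a, chi_b):
    """Smallest absolute difference between two angles in degrees,
    accounting for the wrap-around at +/-180."""
    return abs((chi_a - chi_b + 180.0) % 360.0 - 180.0)
```

Applying `dihedral_deg` frame by frame gives a χ1 time series; a flip such as the one described for W331 would show up as an `angular_change` close to 180° between the initial and flipped states.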
To determine whether the opening of loop B observed in the absence of the donor is correlated with the side-chain rotation of W331 or is simply due to the lack of interaction with UDP-GalNAc, we simulated an identical system (i.e., in the absence of the nucleotide-sugar donor) but with the rotameric state of the W331 side chain restrained (, Cwx). In this simulation, we observe that loop B maintains its original conformation and does not open during the 40 ns-long simulation (), providing further evidence that the change in the W331 side chain orientation is indeed required for loop opening.
Interestingly, the conformational change of the W331 side chain occurs only in the absence of donor UDP-GalNAc. The phosphate groups of either the substrate UDP-GalNAc or the product UDP stabilize the inward orientation of W331 through two hydrogen bonds between their oxygen atoms and nitrogen NE1. In the simulations starting with loop B in its open state and with UDP present (, CopenU), the W331 side chain does not change orientation. In contrast, with the loop initially in the open state but without UDP (, Copen), the side chain behaves similarly to the case with the loop B closed (, Cx). The correlated change in the orientation of the W331 side chain with loop B opening suggests that the W331 amino acid may be facilitating donor access into the active site and product release.
Similar mechanisms, in which access to the active site is controlled by the position of a long loop whose dynamics is in turn correlated with motions in a short and conserved loop, have been described for other families of transferases using different molecules as donors and acceptors. For example, β-(1,4)-Gal-T1 undergoes a transition from an open to a closed conformation upon glycosyl donor substrate binding.26,27 The corresponding Trp residue, W314, is again part of a WGGEN motif within a small flexible loop of that protein. W314 plays an important role in the conformational change of the adjacent long loop, by moving toward the catalytic pocket to interact with the complex of donor substrate and Mn2+ ion upon donor substrate binding.28 This conformational change seems to be essential for the subsequent productive binding of an acceptor sugar substrate, since a W314Y mutation drastically reduced both the binding affinity to the sugar nucleotide and the catalytic activity of the enzyme.29 In molecular dynamics simulations of β-(1,4)-Gal-T1 in both implicit and explicit solution, Gunasekaran et al.13,14 reported tight coupling of motions in a long loop (that would correspond to our loop B) to the structure and motions of the corresponding 314-WGG-316 motif. Mutations of the conserved Trp and the flanking Gly residues to Ala reduced the fluctuations also in the long loop. Overall, the simulation results for β-(1,4)-Gal-T1 are consistent with our results for ppGalNAcT-2. However, in our specific case, the main chain of the WGGEN loop stays relatively rigid and the conformational change appears to be triggered by the side chain of W331 alone. Thus, our simulations suggest a mechanism in which W331 acts as a gatekeeper that controls the transition between the closed (unable to accept ligand) and open (able to accept ligand) conformations of loop B.
Evolution of charge distribution in the active site during loop opening
The accessible surface area of the UDP-GalNAc binding site doubles during loop opening (). The newly exposed surface consists mainly of positively charged residues (R201, R362) and hydrophobic residues, while negatively charged residues do not become exposed. The two arginine residues reorient towards solvent, whereas two His residues (H145, H226) do not change orientation but become exposed due to loop repositioning. Both R201 and R362 are highly conserved in the ppGalNAcTs and interact with the uridine ring and phosphate oxygen atoms of UDP-GalNAc. In addition to their favorable electrostatic interactions with the bound UDP, they may also facilitate donor entry into the active site by making weak long-range contacts in the open, highly flexible state of the loop. The observed flexibility of the unstructured regions in the vicinity of the donor binding site may thus be a relevant factor for protein-substrate recognition.
Figure 5 Increase of the accessible surface area (ASA) of the UDP-GalNAc binding site (defined using atoms that are located within 5 Å of UDP-GalNAc) during loop opening in the Cx simulations. (a) Total ASA and contributions from the hydrophobic (P, G, A, …
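The source does not state which algorithm was used to obtain the accessible surface area. As an illustration only, the sketch below uses the Shrake-Rupley approach: each atom's sphere (van der Waals radius plus a probe radius) is covered with test points, and the exposed fraction is the share of points not buried inside any neighboring sphere. The 1.4 Å probe radius, the number of test points, and the function names are assumptions introduced for this example.

```python
import numpy as np

def fibonacci_sphere(n):
    """n approximately evenly spaced points on the unit sphere."""
    i = np.arange(n) + 0.5
    phi = np.arccos(1.0 - 2.0 * i / n)           # polar angle
    theta = np.pi * (1.0 + 5.0 ** 0.5) * i       # golden-angle azimuth
    return np.stack([np.cos(theta) * np.sin(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(phi)], axis=1)

def shrake_rupley(coords, radii, probe=1.4, n_points=960):
    """Per-atom accessible surface area (same length unit squared as input).

    coords: (n_atoms, 3); radii: (n_atoms,) van der Waals radii.
    probe and n_points are assumed values, not taken from the source.
    """
    coords = np.asarray(coords, dtype=float)
    ext = np.asarray(radii, dtype=float) + probe  # probe-expanded radii
    unit = fibonacci_sphere(n_points)
    asa = np.zeros(len(coords))
    for i in range(len(coords)):
        pts = coords[i] + ext[i] * unit           # test points on atom i
        exposed = np.ones(n_points, dtype=bool)
        for j in range(len(coords)):
            if j != i:
                dist = np.linalg.norm(pts - coords[j], axis=1)
                exposed &= dist >= ext[j]         # drop points inside atom j
        asa[i] = 4.0 * np.pi * ext[i] ** 2 * exposed.mean()
    return asa
```

To follow a binding-site ASA over a trajectory, one would sum the per-atom values over the atoms within 5 Å of UDP-GalNAc in each frame; a neighbor list would be needed to make this efficient for a full protein.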
Hydration of Mn2+ and sugar donor UDP-GalNAc. Implications on enzymatic mechanism
A manganese ion is required for both the stability and activity of ppGalNAcTs and is bound by residues in the active site motif DxH and an additional conserved His.3,5 Mn2+ bridges the protein side chains with the diphosphate group of the donor. X-ray structures containing UDP-GalNAc or UDP show that Mn2+ bridges three residues in the active site (2 × His, 1 × Asp) and two oxygen atoms in the phosphate groups of UDP. The presence of one water molecule at a distance of ~2.2 Å results in a coordination number of 6 for the Mn2+ ion. We analyzed the stability of Mn2+ interactions in the presence and absence of substrates and the hydration patterns in the active site, to explore possible implications for the enzymatic mechanism of ppGalNAcTs.
Simulations in the presence of UDP-GalNAc indicate that the distance between Mn2+ and the beta-phosphate-oxygen is maintained essentially constant at ~4 Å, in agreement with the x-ray structures. This is also true for simulations containing free UDP, although the oxygen atoms in the terminal phosphate group interchange positions due to rotation of the second phosphate group.
We find that in simulations in the presence of UDP-GalNAc alone or UDP-GalNAc and peptide, a single water molecule with high residence time remains in the first hydration shell of Mn2+ (). Even in the simulations performed in the absence of crystal water, a water molecule from the surrounding bulk solvent takes this role, suggesting the importance of hydration for the Mn2+ ion in the active site.
Figure 6 Snapshots of water molecules in the first hydration shell of the Mn2+ ion from simulations (a) in the presence of both UDP-GalNAc and peptide, (b) in the presence of UDP-GalNAc or (c) in the absence of ligands. (d) Histogram of the number of water molecules …
In the presence of UDP-GalNAc, the water molecule in the first hydration layer of Mn2+ is part of a larger water network formed with molecules in the vicinity of the glycosidic oxygen. In the presence of both UDP-GalNAc and the EA2 peptide, the glycosidic oxygen interacts with the OG1 atom of the accepting T7 peptide residue via three water molecules: the first bridges only the glycosidic oxygen and OG1, the second additionally bridges the main chain N atom of T7, and the third also bridges the O5* atom of the GalNAc moiety. Taken independently, each of these water molecules has a residence time of around 50% of the simulation time. However, taken together, at least one of these water bridges is present during more than 90% of the simulation time. Therefore, the oxygen of the anomeric carbon and the hydroxyl group of the acceptor Thr are nearly continuously bridged by water.
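The distinction between the occupancy of each individual water bridge and the fraction of time during which at least one bridge is present can be made concrete with a short sketch. The input is assumed to be a boolean table with one row per trajectory frame and one column per bridge type, produced by a separate geometric analysis that is not shown here; the function name is illustrative.

```python
import numpy as np

def bridge_occupancy(bridges):
    """Occupancy statistics for water bridges.

    bridges: (n_frames, n_bridges) boolean-like array; entry [t, k] is True
    when bridge k is formed in frame t (assumed to be precomputed).
    Returns (per-bridge occupancy, fraction of frames with any bridge).
    """
    b = np.asarray(bridges, dtype=bool)
    individual = b.mean(axis=0)          # fraction of frames for each bridge
    any_bridge = b.any(axis=1).mean()    # fraction of frames with >= 1 bridge
    return individual, any_bridge
```

Because the bridges need not be formed simultaneously, the combined occupancy can be much higher than any individual one, which is the pattern reported above.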
In the simulations of the catalytic domain without substrates, Mn2+ is 7-coordinated, with 3 water molecules replacing the two oxygen atoms of the phosphate groups. The first hydration layer of Mn2+ contains 4 water molecules with high residence time (), as indicated also by the peak at 2.5 Å in the histogram in . The average coordination geometry of Mn2+ is a pentagonal bipyramid; the coordination partners are the water molecules, oxygen atoms of D224 stably coordinating during the entire trajectory, and nitrogen atoms of H226 and H359 competing with additional water molecules for Mn2+ coordination. In the absence of crystal water, molecules from bulk solvent take the role of solvating the Mn2+ ion. The same change of coordination number of the Mn2+ ion has been described also in the case of another retaining glycosyltransferase, α-(1,4)-galactosyltransferase LgtC from Neisseria meningitidis where it was confirmed also by ab-initio quantum mechanics calculations. The increase in the coordination number is correlated with a higher accessibility of the active site in the absence of substrates.
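Counting the water molecules in the first hydration shell of Mn2+ in each frame can be sketched as a simple distance criterion. The 3.0 Å cutoff below is an assumed value chosen to lie beyond the ~2.2 to 2.5 Å distances mentioned in the text; it is not taken from the source. Periodic boundary conditions are ignored for brevity, and the function names are illustrative.

```python
import numpy as np

def hydration_counts(mn_xyz, water_o_xyz, cutoff=3.0):
    """Number of water oxygens within `cutoff` of the ion in each frame.

    mn_xyz: (n_frames, 3) ion positions.
    water_o_xyz: (n_frames, n_waters, 3) water oxygen positions.
    cutoff: assumed first-shell cutoff in Angstrom (hypothetical value).
    Note: no minimum-image correction is applied.
    """
    mn = np.asarray(mn_xyz, dtype=float)
    wat = np.asarray(water_o_xyz, dtype=float)
    dist = np.linalg.norm(wat - mn[:, None, :], axis=2)
    return (dist < cutoff).sum(axis=1)

def count_histogram(counts):
    """Fraction of frames observed with 0, 1, 2, ... waters in the shell."""
    counts = np.asarray(counts, dtype=int)
    return np.bincount(counts) / len(counts)
```

In such an analysis, a distribution peaked at one water would correspond to the ligand-bound simulations, and one peaked at four waters to the substrate-free case described above.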
These observations have implications for the enzymatic mechanism. Previously published data do not unequivocally support any of the mechanisms proposed for retaining glycosyltransferases. Our simulations are more consistent with the SNi mechanism than with the double displacement mechanism, since the large conformational changes of the active site, required by a double displacement mechanism to bring putative nucleophiles closer to the beta-phosphate-oxygen, did not occur on the timescale of the simulations. However, a double displacement mechanism is still possible if the timescale of the conformational change needed for the enzyme-sugar intermediate is much longer than the time sampled by molecular dynamics. Alternatively, a single displacement mechanism would imply that the nucleophilic attack on the C1-O bond is performed by the hydroxy group of the acceptor Ser/Thr. The distance between the glycosidic oxygen and the nucleophilic hydroxy group is about 3 Å and is maintained nearly constant during the simulation, which would at least structurally be consistent with a nucleophile role of the acceptor.
Analysis of tightly bound water molecules
Hydration of the active site is important not only for catalysis, but also for the dynamics and flexibility of loop B, which controls ligand access to the active site. Simulations of the system in the presence and absence of crystal water molecules for the catalytic domain alone (, Cx and C) showed that in the absence of water molecules hydrating deep cavities in the active site, the side chain of W331 is stabilized in the inward orientation, and consequently loop B does not open. Interestingly, loop B is the only region whose dynamics is affected in a major way by the removal of x-ray crystallographic water molecules in the initial structure. We found that in the cavity around W331 two internal water molecules (HOH1160 and 1162) do not get replaced by bulk water. These molecules form hydrogen bonds with W331 and E334 in the conserved loop WGGEN. Without these water molecules, W331 and E334 interact directly, preventing the conformational change of the W331 side chain. Therefore, these internal water molecules may have a “lubricating” role in the vicinity of W331. Similar effects of internal water molecules have been reported by calculation of hydration free energies and entropies,30 suggesting that the addition of a water molecule can increase protein flexibility through effectively weakened protein-protein hydrogen bonds.
In the presence of UDP (, CUx and CU) and in the presence of UDP and EA2 (, CUPx and CUP) the RMSD per residue profiles were not significantly different, explained partly by the fact that large conformational changes do not occur in the presence of ligands, and partly by the observation that bulk water molecules from solution occupy the crystallographic water sites close to the protein surface during the simulation.
Peptide flexibility and interactions
One of the fundamental questions regarding the family of ppGalNAcTs is related to the in vivo peptide substrate preferences of individual isoforms. A detailed understanding of peptide conformational preferences is crucial for the evaluation of peptide affinity in sequence scanning experiments. Moreover, an accurate description of the conformational space sampled by the backbone of the peptide in complex with the enzyme and the identification of patterns of flexibility along peptide sequence is essential for understanding peptide specificity31 and the design of alternative peptide substrates. Previous studies showed that by introducing backbone perturbations in conformations of peptide ligands observed in crystal structures, it was possible to identify a larger number of interacting sequences for a given docking target.32
The analysis of the x-ray ternary complex (PDB code 2FFU; ref. 5) shows that EA2 binds in an extended conformation into a shallow groove on the surface of hT2. This conformation was maintained throughout our simulation, except for the terminal residues where the backbone residues sample several conformers (). Hydrogen bonds formed between residues at both ends of the peptide (S5-H365 and K13-G265, ) are less stable, while interactions near the glycosylation site (T6-R362 and T11-S267) are more stable, consistent with the lower flexibility in the center of the peptide. However, the terminal residues are less flexible when the GalNAc moiety is present, either covalently linked to UDP or to the peptide, compared with the trajectory containing UDP only.
Figure 7 (a) Flexibility of the glycosylated and un-glycosylated peptide in complex with the catalytic domain, measured by the RMSD per residue. (b, c) Evolution of least stable hydrogen bonds between peptide and catalytic domain during simulations before and …
A general preference of transferases for peptides adopting extended or random coil conformations has been established both by NMR experiments33,34 and by database analysis,35 and these observations have been used successfully in predictions of mucin-type O-glycosylation sites. The higher backbone flexibility of the peptide termini seen here suggests a less pronounced sequence preference at the ends.