3.1. General comments
The structure of TbGS was solved by molecular replacement using the structure of the yeast enzyme, ScGS, as the search model. The refinement statistics and model geometry indicate that the refinement has produced a low-resolution model of acceptable quality. Four molecules are present in the asymmetric unit, forming two dimers consisting of subunits A with B, and C with D. The subunits are very similar in structure, with r.m.s.d. values of 0.9 Å (A–B), 0.9 Å (A–C) and 0.8 Å (A–D) when they are superimposed with SSM. This is in part a consequence of the application of NCS during the refinement. Subunit A has five disordered sections (residues 131–132, 178–181, 234–248, 444–455, and 537–544), subunit B has four (residues 128–137, 234–246, 401–406, and 426–464), subunit C has seven (residues 127–134, 181–184, 238–246, 414–416, 423–430, 442–464, and 536–543) and subunit D has three (residues 128–132, 235–245, and 425–464). In general, the following discussion will deal with molecule A, which is the most complete of the four subunits in the asymmetric unit. Several regions in the other subunits are better defined than in molecule A, and these will be discussed as appropriate.
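The completeness of each subunit can be checked by counting the residues that fall in the disordered sections listed above. The following is a minimal sketch; the dictionary and function names are illustrative and are not part of the original analysis.

```python
# Disordered (unmodelled) residue ranges per subunit, copied from the text.
# Each range is inclusive at both ends.
DISORDERED = {
    "A": [(131, 132), (178, 181), (234, 248), (444, 455), (537, 544)],
    "B": [(128, 137), (234, 246), (401, 406), (426, 464)],
    "C": [(127, 134), (181, 184), (238, 246), (414, 416),
          (423, 430), (442, 464), (536, 543)],
    "D": [(128, 132), (235, 245), (425, 464)],
}


def missing_residues(ranges):
    """Return the number of residues covered by inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


# Total number of unmodelled residues in each subunit.
missing = {chain: missing_residues(r) for chain, r in DISORDERED.items()}

# The subunit with the fewest unmodelled residues is the most complete.
most_complete = min(missing, key=missing.get)

if __name__ == "__main__":
    for chain in sorted(missing):
        print(chain, missing[chain])
    print("most complete:", most_complete)
```

On these ranges, subunit A has the fewest unmodelled residues, which is consistent with its choice as the reference molecule.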
Superimposition of the TbGS structure on the other available GS enzyme structures revealed strong structural conservation with the S. cerevisiae (31% sequence identity) and human (31% sequence identity) enzymes. These gave r.m.s.d. values of 1.7 Å and 1.6 Å, respectively. The E. coli enzyme, which has only 17% sequence identity, differs significantly from these, with an r.m.s.d. value of 3.2 Å when superimposed on the TbGS structure. Description of the overall structure of TbGS will, therefore, closely follow previous descriptions of GS, and we maintain the nomenclature already defined for these structures [15,16,27–29].
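For readers unfamiliar with the r.m.s.d. values quoted above, the quantity can be sketched as follows. This is a simplified illustration that assumes the two coordinate sets are already paired atom-for-atom and superimposed; a structure-matching program such as SSM also determines the residue pairing and the optimal superposition, which this sketch does not attempt. The coordinates below are made up for demonstration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in Å) between two paired lists of (x, y, z).

    Assumes the points are already matched one-to-one and superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha positions (not taken from any deposited structure).
ref = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
moved = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

if __name__ == "__main__":
    print(rmsd(ref, moved))
```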
The modest resolution to which diffraction data have been acquired necessarily limits the accuracy of the model and a conservative approach has been adopted in reporting structural details. We have assigned hydrogen bonding and salt bridge interactions in general if the distance between relevant functional groups falls in the range 2.5–3.5 Å.
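The distance criterion stated above can be expressed as a simple check. This is a minimal sketch: the function name and the example coordinates are hypothetical, and only the 2.5–3.5 Å range comes from the text. No angular criterion is applied, matching the conservative distance-only assignment described.

```python
import math

# Distance window (in Å) used in the text for assigning hydrogen bonds
# and salt bridges between relevant functional groups.
MIN_DIST = 2.5
MAX_DIST = 3.5


def is_polar_contact(atom_a, atom_b, lo=MIN_DIST, hi=MAX_DIST):
    """Return True if two atoms (x, y, z in Å) lie within the distance window."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(atom_a, atom_b)))
    return lo <= d <= hi


if __name__ == "__main__":
    # Hypothetical donor/acceptor positions, for illustration only.
    print(is_polar_contact((0.0, 0.0, 0.0), (2.9, 0.0, 0.0)))  # inside the window
    print(is_polar_contact((0.0, 0.0, 0.0), (4.0, 0.0, 0.0)))  # too far apart
```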
3.2. Overall structure
The structure of subunit A is presented in Fig. 2A, colored according to secondary structure. The TbGS subunit is composed of two domains, a large “core” domain and a smaller “lid” domain (Fig. 2A), that together form an ATP-grasp fold [30,31]. The core domain is itself constructed from two sub-domains placed on either side of a four-stranded anti-parallel β-sheet formed by β5, β6, β20 and β21. One of the sub-domains comprises four parallel and two anti-parallel β-strands enclosed by short α-helices, for example helices 9, 10, 11, 12, 13, 14, and 15 (Fig. 2A). The second sub-domain consists of three helices (α1, α2, and α7) positioned beside a four-stranded anti-parallel β-sheet (β3, β7, β18 and β19). The lid domain is poorly resolved in the electron density map, with only a short section of anti-parallel β-sheet observed in subunits B, C and D. Subunit A displayed greater order in this region, allowing for model building of all but a short sequence between residues 443 and 446. Secondary structure within the lid domain of subunit A can be assigned with confidence; however, the positioning of some side chains was less clear. The fold of this small domain is similar to that observed in other GS structures, with a core of three β-strands forming one wall and the lid of the active site, and a further three α-helical sections exposed on the protein surface (Fig. 2A).
Fig. 2. Structure of TbGS. (A) Overall structure of a subunit and position of GSH. (B) Structure of the TbGS dimer, with those secondary structure elements identified to be involved in the interface marked. (C) Sequence alignment of TbGS with the sequences for ...
The lid domain appears to be inherently more mobile than the remainder of the enzyme, as indicated by the less ordered electron density and elevated thermal parameters in the model. The function of this lid is to fold down on top of, or grasp, the bound ATP. Disorder of the lid domain has been observed previously in GS [15,16,29] and in related ATP-grasp fold enzymes such as biotin carboxylase. Despite attempts to co-crystallize TbGS with ATP or the non-hydrolysable AMP-PNP, these ligands were not observed. The absence of a ligand in the ATP-binding site may, in part, explain the problems encountered when attempting to interpret diffuse electron density for this region of the molecule. It should be noted, however, that TbGS contains two insertions within this region, of four and six residues (Fig. 2C), a point discussed later. The importance of the lid domain in ATP-grasp enzymes is not simply to restrict the movement of ligand bound prior to catalysis, or to aid in the orientation of the ligands, but also to prevent the intrusion of solvent into the active site, since the phosphate intermediates formed as part of the reaction mechanism are vulnerable to hydrolysis. In TbGS, the lid domain in subunit A is considered to be in an open position (Fig. 3). Comparison with the positioning of the lid domain in both ScGS and HsGS when nucleotide is bound reveals much greater coverage of the top of the active site in those enzymes than is observed in TbGS.
Fig. 3. Stereoview showing the superimposed main chain trace of TbGS, ScGS and HsGS. TbGS is in green, HsGS (Protein Data Bank code 2HGS) in blue and the ScGS structure (Protein Data Bank code 1M0W) in orange. The approximate positions of the A- and G-loops as ...
Size exclusion chromatography indicated that TbGS is a dimer in solution, and the asymmetric unit was found to comprise two such dimers, subunits A–B and C–D. The TbGS dimer is formed in a similar manner to that observed for ScGS and HsGS, with a non-crystallographic twofold rotation axis relating the subunits to each other (Fig. 2B). The subunit–subunit interface was calculated to cover ~1820 Å² or 7.5% of the total surface area of a subunit, and is formed through contacts from a short span of anti-parallel β-sheet (β1–2) and contacts from α2 and a short section of α7 along with α10. Salt bridge interactions are formed between Glu12 and Arg288, Arg13 and Glu292, and Arg176 and Asp251; in addition, both Lys21 and His164 are in a position to interact with Glu279 (data not shown). A short stretch of anti-parallel β-sheet between adjacent β19 strands (residues 486–490), along with salt bridge interactions between Arg491 and Glu16, forms the crystal contacts between dimers within the asymmetric unit (data not shown).
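As a quick consistency check on the interface figures above, the total surface area of a subunit implied by the two quoted numbers can be back-calculated. This is only arithmetic on the rounded values in the text (~1820 Å² and 7.5%); the function name is illustrative, and the result is an approximation rather than a reported measurement.

```python
def implied_total_area(interface_area, fraction):
    """Total surface area implied by an interface area and its fractional share."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    return interface_area / fraction


if __name__ == "__main__":
    # Values quoted in the text; both are approximate.
    total = implied_total_area(1820.0, 0.075)
    print(f"implied subunit surface area: about {total:.0f} Å^2")
```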
The sequence alignment of TbGS with ScGS and HsGS reveals six insertions of greater than four residues in the trypanosomatid enzyme (Fig. 2C). The first two of these, a six-residue insertion within α7 and eight residues inserted before β7, combine to form an elongated loop compared to the yeast and human enzymes. The third and fourth insertions occur on one side of the TbGS core domain formed by helices α8 and α9. They produce an extension to α8 compared to the ScGS and HsGS structures, in addition to a helix (α9) unique to TbGS. Residues 238–245 have not been resolved in the TbGS electron density maps, but the structure becomes ordered at the structurally conserved β8. The fifth insertion, just prior to α20, is part of the lid domain and is largely disordered, but again it would appear to form an extended loop. Finally, a nine-residue insertion forms an elongated twisted β-turn between β20 and β21. All of these insertions occur on the surface of GS, distant from the active site, the substrate/cofactor binding sites and the dimerization interface. An alignment of trypanosomatid GS sequences with those from other eukaryotic species (data not shown) reveals that these insertions are mainly restricted to the trypanosomatid enzymes. An exception is found in GS from the mosquito Anopheles gambiae, where insertions 1 and 2 are present, although there is no appreciable sequence homology in these segments of the proteins.
3.3. Active site structure and ligand binding
GS has an ATP-grasp fold that carries a well-defined nucleotide-binding site. The floor of the nucleotide-binding pocket is formed by the anti-parallel strands β5 and β6, with walls formed on one side by strands β20 and β21 and on the other by the lid domain. The GSH binding site is positioned over the top of a loop linking β6 to α7, with further interactions to residues from the loops that link β8 to α10 and β13 to α12 (A). Sequence alignments of the ATP-grasp family in conjunction with available structural data have highlighted three flexible loop regions as being important for the activity of these enzymes. These flexible segments are known as the substrate binding loop (S-loop), the glycine rich loop (G-loop) and the alanine rich loop (A-loop) [15,28]. The positions of these loops, together with most of the residues that interact with GSH or are predicted to bind ATP, are shown in Fig. 4; an alignment of these regions in 23 eukaryotic GS enzymes is also presented.
Fig. 4. Stereoviews of the active site. A molecule of GSH is bound in the active sites of all four subunits in the asymmetric unit of TbGS. GSH is depicted as a stick model colored by atom type (C: black, O: red, N: blue, and S: yellow). A molecule of ATP is ...
The alignment of the S-, G- and A-loops from 23 eukaryotic GS sequences. While the G-loop shows identity approaching 100% the other two loops show lower, yet still significant levels of conservation. The numbering relates to the sequence of TbGS.
The S-loop extends from β13 and lies across the top of the bound GSH. Conservation is lower in the S-loop than in the G- and A-loops, but three residues important for interaction with ligands are conserved. Tyr322 aids the orientation of Arg324, which interacts with the glutamyl carboxylate of GSH. Tyr327, held in position by a hydrogen bond donated from Gln261 (conserved as glutamine or glutamate in other GS sequences), lies across the face of the cysteinyl moiety of GSH. The presence of the aromatic side chain may afford some protection against side reactions occurring with the reactive thiol group.
The A-loop is in close proximity to the glycyl end of GSH and interacts using main chain functional groups. The amides of Val541 and Met542 donate hydrogen bonds to the glycyl carboxylate group (data not shown). This carboxylate group also accepts a hydrogen bond donated from the side chain of Arg530 (conserved as Arg450 and Arg467 in HsGS and ScGS, respectively). In HsGS, GSH displays a similar interaction pattern with the A-loop, although the residues concerned are Val461 and Ala462. The amino acid type at the latter position varies among the sequences, from the most common alanine to low occurrences of serine, isoleucine, histidine and methionine. The presence of two glycine residues preceding this section is almost completely conserved in eukaryotic GS sequences (valine in a single sequence, that of Dictyostelium discoideum), as are the two residues that follow: usually alanine, otherwise serine or glycine, and then a completely conserved glycine. The sequence alignments alone suggest that ScGS would show a high level of structural similarity with HsGS and TbGS in this region. However, this is not the case. In ScGS the loop in this region extends away from the GSH binding site or is disordered. Neither of the reported ScGS structures possesses a complete molecule of GSH in the active site, with only γ-glutamylcysteine being observed. The lack of the glycyl moiety might preclude the conformational changes that lock down and order residues in this part of the active site, as observed for HsGS and TbGS.
The G-loop, between β16 and α19, is a highly conserved region, with eight residues out of twelve found to be identical in the sequences examined. The remaining four residues involve conservative changes. This loop was the most problematic to model and analyze, lying as it does within the flexible lid domain. In TbGS the loop has been successfully modeled in subunit A, with a partial model in subunit C. The loop contains three conserved glycines, one of which has previously been assigned a role in stabilizing the pentavalent phosphate intermediate during the phosphorylation step of the catalytic cycle [15,16]. While this section is modeled in TbGS, the placement of the glycines is distant from the expected position over the active site. Previous structural studies of GS have required AMP-PNP or ADP to be bound within the nucleotide-binding site before the G-loop can be resolved, and this binding appears to cause significant reorganization of the lid domain. The structures of ScGS revealed that binding of AMP-PNP caused a 64° rotation [16,33] and a maximal shift of some 24 Å of this section of the molecule relative to the core domain.
Away from the A- and G-loop regions, several other interactions are observed between GSH and TbGS (Fig. 4B). The γ-glutamyl moiety forms salt bridge interactions with Arg324 and hydrogen bonds to Ser150, Glu264 and Asn266. These residues are strictly conserved in HsGS as Arg267, Ser151, Glu214 and Asn216, and in ScGS as Arg285, Ser153, Glu228 and Asn230. The cysteine carbonyl and amide groups potentially form hydrogen bonds to Arg119 and Ser148, respectively. This pair of residues is strictly conserved as Arg125 and Ser149 in HsGS and as Arg128 and Ser151 in ScGS. The glycyl carboxylate is placed to form salt bridge interactions with TbGS Arg530, and again the interaction appears to be strictly conserved, in HsGS with Arg450 and in ScGS with Arg467.
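The residue equivalences listed above can be collected into a small lookup table, which makes the cross-species numbering easier to follow. This is a minimal sketch that uses only the residue pairs stated in the text; the table and function names are illustrative.

```python
# GSH-contacting residues in TbGS and their stated equivalents in the
# human (HsGS) and yeast (ScGS) enzymes, as listed in the text.
EQUIVALENT_RESIDUES = {
    "Arg324": {"HsGS": "Arg267", "ScGS": "Arg285"},
    "Ser150": {"HsGS": "Ser151", "ScGS": "Ser153"},
    "Glu264": {"HsGS": "Glu214", "ScGS": "Glu228"},
    "Asn266": {"HsGS": "Asn216", "ScGS": "Asn230"},
    "Arg119": {"HsGS": "Arg125", "ScGS": "Arg128"},
    "Ser148": {"HsGS": "Ser149", "ScGS": "Ser151"},
    "Arg530": {"HsGS": "Arg450", "ScGS": "Arg467"},
}


def equivalent(tb_residue, enzyme):
    """Return the residue in `enzyme` that matches a given TbGS residue."""
    return EQUIVALENT_RESIDUES[tb_residue][enzyme]


def is_strictly_conserved(tb_residue):
    """True if the amino acid type (three-letter code) is the same in all three enzymes."""
    residue_type = tb_residue[:3]
    return all(other[:3] == residue_type
               for other in EQUIVALENT_RESIDUES[tb_residue].values())


if __name__ == "__main__":
    for tb in EQUIVALENT_RESIDUES:
        print(tb, equivalent(tb, "HsGS"), equivalent(tb, "ScGS"),
              is_strictly_conserved(tb))
```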
In general, the part of the ATP-grasp enzyme family active site that binds the nucleotide cofactor is well-ordered and does not change conformation upon ligand binding. Therefore, on the basis of sequence and structural homology, we expect ATP to bind TbGS in a similar manner to that discussed for other ATP-grasp proteins, and a model of nucleotide binding, primarily based on the ScGS:AMP-PNP complex structure, is presented in Fig. 4B. The adenine would make a large number of hydrophobic interactions with TbGS Met123, Val142, Met468, Ile471 and Phe473. These residues are highly conserved in HsGS (Met129, Ile143, Met398, and Ile401) and ScGS (Leu132, Val145, Met415, and Ile418). A potential hydrogen bond network can also be identified, with adenine N6 donating to the carbonyl of Ser469 and adenine N1 accepting from the amide of Ile471. Lys532, Glu496 and Glu432 are positioned such that they would be capable of forming hydrogen bonds to the ribose group of ATP. The binding and orientation of the cofactor phosphates would be expected to involve Lys363 and Lys428, matching Lys305 and Lys364 of HsGS, and also the Lys324, Lys382 pair in ScGS. However, in the TbGS:ATP model only the first lysine residue matches the position found in the other species. The latter, Lys428, is pointing out of the active site due to the orientation of the G-loop in the TbGS structure. We predict that a significant re-organization of the G-loop would accompany nucleotide binding, and therefore an interaction with Lys428 may well be conserved. The phosphate tail of ATP would be expected to bind nearby, interacting with residues on the highly conserved G-loop. These residues are not shown in Fig. 4B since in the TbGS structure the G-loop, which is part of the lid domain, is in an open conformation.
Two magnesium ions stabilize the position of the phosphate tail of the cofactor in the active site of ATP-grasp enzymes and are thought to influence the stability of the reaction intermediates during catalysis. The positions of such cations are clearly observed in the ScGS:AMP-PNP complex structure, where four acidic residues either coordinate the Mg²⁺ ions directly or bind coordinating water molecules in conjunction with direct phosphate-cation coordination. In ScGS these residues are Asp130, Glu146, Glu386 and Glu442. All four residues are strictly conserved in TbGS and HsGS (Fig. 2C). In TbGS, three of these residues, Asp121, Glu146 and Glu496, are placed to fulfill the same role. The fourth residue, Glu442, is observed to be distant from the cofactor-binding site since the mobile lid domain is in an open conformation. Our model also indicates that Arg119 and Arg530 of TbGS, residues important for catalysis and for substrate recognition through interaction with the cysteine and glycine moieties respectively, as discussed previously, are structurally conserved.