ADP-ribosylation factor-like (ARL) proteins are small GTPases that undergo conformational changes upon nucleotide binding; these changes regulate the affinity of ARLs for other proteins, lipids or membranes. There is a paucity of structural data on this family of proteins in the Kinetoplastida, despite studies implicating them in key events related to vesicular transport and regulation of microtubule-dependent processes. The crystal structure of Leishmania major ARL1 in complex with GDP has been determined to 2.1 Å resolution and reveals a high degree of structural conservation with human ADP-ribosylation factor 1 (ARF1). Putative L. major and Trypanosoma brucei ARF/ARL family members have been classified based on structural considerations and amino acid sequence conservation, combined with functional data on Kinetoplastid and human orthologues. This classification may guide future studies designed to elucidate the function of specific family members.
ADP-ribosylation factor-like (ARL) proteins are a heterogeneous group of GTPase enzymes within the Ras (Rat sarcoma gene) family, and related to the ADP-ribosylation factor (ARF) group [1, 2]. ARLs lack the defining ARF functions; they are not cholera toxin cofactors, cannot complement the lethal arf-/arf2- phenotype in Saccharomyces cerevisiae, and are incapable of activating phospholipase D. Functional data on ARF/ARL proteins are limited and the primitive eukaryotic Kinetoplastida may provide appropriate model systems to investigate this protein family since genomic data are available as are reagents and protocols for genetic studies.
ARL1 is conserved among the trypanosomatids. In Trypanosoma brucei, ARL1 (TbARL1) is expressed only in the bloodstream form and RNAi knockdown is lethal. Both L. major ARL1 (LmARL1) and L. donovani ARL1 (LdARL1), which share 98 % sequence identity, are myristoylated on Gly2 for localization to the kinetoplast and the flagellar pocket trans-Golgi network, where they likely contribute to intracellular trafficking. Mutations have revealed that GTP binding is required for correct localization.
Crystal structures of human and yeast ARLs are known but not of any protist ARL. We determined the structure of LmARL1 to 2.1 Å resolution following protocols commonly applied in our laboratory. Crystallographic statistics are presented in Table 1 and detailed in Protein Data Bank entry 2X77. The model supports sequence-structure comparisons of LmARL1 with ARF/ARL proteins predicted from L. major and T. brucei genomic data [3, 5], exploiting what is known of other ARF/ARL proteins [1, 6]. A classification of these GTPases that may guide future investigations of family members in the Kinetoplastida is presented.
Analytical gel filtration indicated that dimeric and monomeric species of LmARL1 were present and that the ratio varied in a time-dependent manner due to disulfide bond formation. The dimer peak gave ordered crystals in space group P21 with two molecules in the asymmetric unit. Molecule A consists of residues 4 to 184, and molecule B of residues 3 to 48, 55 to 73 and 83 to 183. The molecules are linked via a Cys83-Cys83 disulfide bond similar to that observed in the S. cerevisiae orthologue, ScARL1. LmARL1 Cys83, conserved in ARL1 orthologues (data not shown), is placed on the loop linking β4 with β5 (Figure 1A). Mutation of this cysteine in ScARL1 prevents dimerization, yet the mutant protein can rescue cold sensitivity of arl− cells, implying that biological function is retained by the monomer and that the dimer is an artifact generated during purification. The r.m.s.d. of 165 Cα positions common to molecules A and B is 0.63 Å. No non-crystallographic symmetry restraints were imposed, so this low value indicates highly similar molecules; only molecule A is discussed unless otherwise stated.
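The r.m.s.d. quoted above summarizes the displacement of paired Cα atoms between two molecules. As a minimal sketch, assuming two coordinate lists that are already superposed and paired residue by residue (the least-squares superposition step itself is not shown), the quantity could be computed as below. The function name and the toy coordinates are illustrative only and are not taken from the deposited model.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) positions in Å.

    Assumes the two sets are already superposed and listed in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy coordinates (hypothetical): every atom displaced by 1 Å along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(toy_a, toy_b))  # 1.0
```

Only atoms present in both molecules can be paired, which is why the comparison is restricted to the Cα positions common to the two chains.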
Structural comparisons using molecule A in DALI identify human ARF1 as having the highest similarity to LmARL1, with an r.m.s.d. of 0.64 Å for superposition of 174 Cα atoms, and indicate strong conservation of structure within the ARF/ARL protein family. The fold of LmARL1 is similar to that of other Ras family members, which display a β-sheet core of six strands surrounded by five α-helices; in LmARL1 a seventh β-strand is observed (Figure 1A). ARF/ARL proteins undergo conformational changes upon nucleotide binding. They possess two segments, switch I and switch II, which direct contacts with GTPase-activating proteins and guanine nucleotide exchange factors, and a phosphate-binding segment (P-loop), which interacts with nucleotide phosphates (Figure 1A). ARLs differ from other Ras family members in that they display an interswitch toggle effect: upon binding GTP, the interswitch region (those residues positioned between switch I and II) undergoes a conformational change and displaces the amino-terminal α-helix into a position to interact with membranes.
In LmARL1 the switch I region consists of residues Gly43 to Val54 and switch II comprises Gly72 to Tyr84. Strands β3 and β4 together with the turn linking them form the interswitch section (Figure 1A). Like other ARL:GDP complexes, α1, on the protein surface, is positioned between α2, α5 and the interswitch region [1, 7]. The GDP binding site is formed between α2 and β1 and the loops linking β5 - α3 and β6 - α4 (Figure 1A). Residues interacting with the ligand are shown in Figure 1B. The P-loop, with the Walker A motif GxxxxGK(S/T), coordinates the phosphates in combination with Ser35, Asn30, and Ala31. A Mg2+, coordinated by four water molecules, OG1 Thr34 and GDP β-phosphate, occupies one end of the active site. Here, Glu57 and Asp70 contribute to binding the hydrated Mg2+ by water-mediated interactions (Figure 1B). Octahedral coordination of the Mg2+ ion stabilizes the switch I and interswitch regions in a conformation which allows α1 to adopt a retracted position.
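The Walker A consensus GxxxxGK(S/T) can be written as a simple pattern for scanning protein sequences. The following is a minimal sketch using Python's standard `re` module; the function name and the toy fragment are hypothetical and the fragment is not the LmARL1 sequence.

```python
import re

# Walker A (P-loop) consensus GxxxxGK(S/T):
# Gly, any four residues, Gly, Lys, then Ser or Thr.
WALKER_A = re.compile(r"G.{4}GK[ST]")

def find_walker_a(sequence):
    """Return (1-based start position, matched residues) for each match."""
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(sequence.upper())]

# Hypothetical fragment for illustration only.
toy = "MKILMVGLDAAGKTTILYKLK"
print(find_walker_a(toy))  # [(7, 'GLDAAGKT')]
```

A pattern match only flags a candidate P-loop; whether the segment actually binds nucleotide phosphates has to be judged from structural or biochemical data.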
Guanine binds at the β6 - α4 region of the active site making van der Waals contacts with Lys130 on one side and Met152 from a symmetry related molecule in the crystal lattice on the other (data not shown). GDP N7 forms a hydrogen bond with a buried water molecule that also hydrogen bonds with Gly32, Asn129 (not shown) and Ser163. The amide of Ser163 and hydroxyl of Ser162 form hydrogen bonds with GDP O6 whilst Asp132 accepts hydrogen bonds donated from N1 and N2.
The amino acid conservation of the Leishmania ARF/ARL proteins has been mapped onto LmARL1. The most highly conserved residues constitute the core of the protein structure and the nucleotide-binding site (Supplementary data, Figures S1, S2). Of the fourteen residues described above with a role in binding GDP:Mg2+, we note that seven (Gly32, Lys33, Thr34, Asp70, Asn129, Lys130 and Asp132) are strictly conserved in eleven (of twelve) putative ARF/ARL sequences encoded in the L. major genome (Supplementary data, Figure S1). Ser162 is strictly conserved in seven sequences, and replaced by cysteine in another three. Asn30 and Ala31 are strictly conserved in eight and ten of the other sequences respectively. Ser35 is strictly conserved in four sequences and replaced by threonine in the remainder. Glu57, part of the interswitch region, is strictly conserved in five sequences with an aspartate in another. Residues not involved in GDP binding and located on the protein surface show poor conservation and are likely involved in complex formation with the membrane and different effectors, thus providing functional specificity to the ARF/ARL family.
Previous work on ScARL1 suggested that the position of a tyrosine on α2 (Tyr38 in LmARL1) and interactions with the conserved glutamate (Glu57 in LmARL1) could influence Mg2+ binding and nucleotide exchange. The tyrosine displays two conformations in LmARL1 (Figure 1B). In molecule A, Tyr38 is directed towards Glu57 yet remains over 6 Å from the acidic side chain. In molecule B, Tyr38 is directed away from Glu57 to participate in a hydrogen bonding interaction with Thr155 OG of a symmetry related molecule (data not shown). This difference between the two molecules does not affect the occupancy of Mg2+ or GDP in the active site, contrary to previous predictions.
Based on the crystal structure of LmARL1, data on trypanosomatid ARF/ARL family members and comparative data with human proteins, we sought to extend previous studies [3, 4] and organize both the L. major and T. brucei ARF/ARL families into functional groups. All L. major and T. brucei proteins labeled as proven or putative ARFs or ARLs in GeneDB were identified by use of BLAST and text searches: twelve in L. major and thirteen in T. brucei, although two of the T. brucei proteins (Tb09.211.4470 and Tb09.211.4480) differ only by a single conservative substitution involving valine and alanine. Each sequence was subjected to a BLAST search against the human genomic database since this links to a significant body of functional data and a naming convention is available. A sequence identity distance tree for L. major and T. brucei proteins was calculated (Figure 2). We excluded LmjF36.0820 from the alignment presented in Supplementary data, Figure S1 due to an unusual N-terminal extension and an insertion before the P-loop.
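A sequence identity distance tree starts from pairwise identities between aligned sequences. The sketch below illustrates one common convention, in which columns containing a gap in either sequence are ignored and distance is taken as one minus the fractional identity. This convention, the function names and the toy sequences are assumptions made for illustration; the procedure used to produce Figure 2 is not specified in this much detail here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

def distance_matrix(aligned):
    """Map each (name, name) pair to 1 - fractional identity."""
    names = sorted(aligned)
    return {
        (p, q): 1.0 - percent_identity(aligned[p], aligned[q]) / 100.0
        for p in names
        for q in names
    }

# Hypothetical aligned fragments, not real ARF/ARL sequences.
toy = {"p1": "ACDE", "p2": "ACDF", "p3": "AC-E"}
print(percent_identity(toy["p1"], toy["p2"]))  # 75.0
print(distance_matrix(toy)[("p1", "p3")])      # 0.0
```

Such a matrix could then be passed to a clustering method, for example neighbour joining, to draw the tree; the choice of denominator for identity affects the values and should be reported alongside them.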
Two major groups, referred to as class I and class II respectively, are apparent. In class I there are four L. major proteins and five T. brucei proteins sharing greatest identity with ARF/ARL family members involved in membrane trafficking. Class II, comprising those most closely related to proteins implicated in regulation of microtubule-dependent events, contains three proteins from each species of parasite.
One of the assigned class I proteins (LmjF31.2280) is identical in amino acid sequence to L. donovani ARF1, a protein shown to be essential in vesicular trafficking and in the structural maintenance of the Golgi network. Another, LmjF31.2790, shares 86 % sequence identity with the Golgi-localizing T. brucei ARF1 (Tb09.211.4480), which is implicated in the maintenance of post-Golgi transport and endocytosis. The final group member, LmjF04.0480, awaits functional studies. LmjF04.0340, with 45 % sequence identity to LmARL1, is closely related to LmjF04.0480 (Figure 2) and is assigned to class I.
The class II proteins are most similar (sequence identities of 53 to 62 %) to human ARL variants, many of which are implicated in the regulation of microtubule-dependent events. LmjF29.0880 and LmjF36.6230 are currently assigned as ARL3 and the L. donovani and T. brucei orthologues have been localized to the flagellum [13, 14]. In Drosophila melanogaster, ARL3 is localized in chemo- and mechanosensory ciliated cells, so it would be interesting to test if LmjF29.0880 and LmjF36.6230 are localized to the flagellum. The amino acid sequence of the remaining class II member, LmjF35.0130, does not display any unusual features and experimental work would be necessary to advance understanding of function.
LmjF30.2370 is an outlier, lacking the myristoylation signal sequence and with a three-residue extension in the switch I region. LmjF05.0030 also lacks the myristoylation signal sequence; however, we tentatively assign it as a Sar-like protein since it shares 51 % amino acid sequence identity with the human Secretion-associated and Ras-related (Sar) protein. Sar is a small GTPase within the Ras superfamily implicated in vesicular coat initiation and formation in similar fashion to ARF1, except that Sar initiates anterograde transport (ER to Golgi) via coat protein complex II instead of retrograde transport (Golgi to ER) via complex I. The Sar orthologue in T. brucei, Tb927.5.4500, shares 78 % sequence identity with LmjF05.0030.
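N-myristoylation is attached to a glycine at position 2, so the presence of Gly2 is a simple first screen when asking whether a sequence could carry the myristoylation signal. The sketch below checks only this necessary condition; it is not a full predictor of myristoylation, and the function name and toy sequences are hypothetical.

```python
def has_gly2(sequence):
    """True if the sequence begins Met-Gly, i.e. has glycine at position 2.

    Gly2 is required for N-myristoylation but does not guarantee it.
    """
    seq = sequence.strip().upper()
    return len(seq) >= 2 and seq[0] == "M" and seq[1] == "G"

# Hypothetical N-terminal fragments for illustration only.
print(has_gly2("MGLLSKIF"))  # True
print(has_gly2("MSLLSKIF"))  # False
```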
LmjF36.6230 shares 47 % amino acid sequence identity with Tb10.6k15.1960, a T. brucei protein identified in the flagellar proteome. The closest human homologue, ARL13B, is located in cilia, and mutations are implicated in the ciliopathies Bardet-Biedl and Joubert syndromes [e.g. 18]. With a degree of caution we suggest that LmjF36.6230 may be a class II ARL although it appears divergent in the sequence distance tree (Figure 2). The divergence may be due to the Leishmania protein possessing an extended N-terminal section (data not shown) compared to Tb10.6k15.1960.
The N-terminal region of LmjF16.1380 is unusually short compared to other ARF/ARL proteins and may not be helical. The L. major sequence also lacks the glycine pair at the start of switch II, which is replaced by Ser61-Gly62 (Supplementary data, Figure S1). The lack of a Gly-Gly motif in human ARL6 may reduce flexibility at the interswitch-switch II section.
As outlined earlier, we also identified predicted ARF/ARL family members in T. brucei and investigated those sequences as described for the L. major proteins. The classification we arrive at is very similar in both organisms (data not shown). This observation agrees with findings of a study of T. brucei GTPases.
A search of the L. major and T. brucei genomic data identified twelve and thirteen ARF/ARL proteins respectively. On the basis of sequence and structure considerations, combined with biological data where available, we suggest that eight of the L. major proteins and nine of the T. brucei proteins fall into the two main groups of ARF/ARL proteins: namely those likely involved in microtubule interactions and those, like LmARL1, implicated in vesicular transport. There is a single Sar-like GTPase predicted in each organism. Further data, e.g. localization and phenotypic analysis following gene knockout or RNA interference, will be essential to assist assignment of biological function for the remaining proteins and to refine this classification of ARF/ARL proteins. Knowledge based on the Kinetoplastids might then form a solid basis upon which to advance work on the human ARF/ARL family.
Supported by BBSRC [PhD studentship and Structural Proteomics of Rational Targets grant BBS/B/14434], and The Wellcome Trust [grant numbers WT082596 and WT083481]. We thank ESRF Grenoble for synchrotron beam time and support.