Leishmaniasis, a disease of the developing world, affects about 12 million people, and few therapeutic interventions are available. The L-type lectins of the Endoplasmic Reticulum Golgi Intermediate Compartment/Vesicular Integral Protein family (ERGIC-53/VIP36) are involved in protein sorting in the luminal compartments of animal cells and are important for parasite biology. A lectin homologue was identified through bioinformatics analysis of the Leishmania genome; it has a conserved N-terminal carbohydrate recognition domain (CRD) and a unique C-terminal region rich in repetitive amino acids and a polyglutamine tract. The N-terminal CRD region was cloned and expressed in Escherichia coli, where it accumulated as insoluble protein that was re-solubilized by on-column refolding. Fold integrity was checked by CD, fluorescence and a functional hemagglutination assay using rabbit erythrocytes. Bioinformatics analysis identified 15 members from Tritryps (Leishmania spp., Trypanosoma spp.), which separate as a distinct clade in a global phylogenetic analysis of all ERGIC-53/VIP36 sequences downloaded from UniProt. Our analysis shows that the extended C-terminal region with repeats is unique to Tritryps and that the repeat pattern differs between sequences from Leishmania spp. and Trypanosoma spp. These features make this protein an interesting candidate for further detailed studies.
Leishmaniasis is a disease of the developing world that affects about 12 million people and is caused by digenetic, obligate parasites belonging to the genus Leishmania. These parasites exist as intracellular amastigotes within mammalian phagocytes and as flagellated promastigotes in the sandfly (www.who.int/tdr/diseases/le-ish). Leishmaniasis can be visceral, cutaneous or muco-cutaneous, depending upon the causative species.
In earlier studies, an ERGIC-53/VIP36 homologue was identified from Trypanosoma cruzi and shown to be antigenic because of a unique repetitive amino acid region in its C-terminus. ERGIC-53 is a mannose-specific membrane lectin that operates as a cargo receptor for the transport of glycoproteins from the ER to the ERGIC. Lack of functional ERGIC-53 was reported to cause a selective defect in the secretion of glycoproteins in cultured cells and hemophilia in humans. Similarly, a vesicular integral protein of 36 kDa, VIP36, was reported to play significant roles in vesicular transport from the ER to the Golgi complex. A similar ERGIC-53/VIP36 homologue has been identified from Leishmania spp. In this communication we report its cloning, expression, biochemical characterization and bioinformatics-based analysis to better understand the structural and functional aspects of this protein, which could be useful for further research in infection biology.
The L. major gene (LmjF.13.0670/Q4QG87) was identified as the candidate gene, and primers were designed from its sequence to generate two clones: the full-length gene (referred to as LdEg) and a truncated gene containing only the CRD region (referred to as LdEgTr). The complete LdEg gene was PCR amplified using the forward primer 5′-GTA CAT ATG ATG GCC GCC GCA AGG-3′ and the reverse primer 5′-CG GGA TCC CTA CTC ATC CTG CTC CAC G-3′. For LdEgTr, the same forward primer was used with a different reverse primer, 5′-CG GGA TCC AGA CAG ATG AAC GAA CAT GAT G-3′, designed to amplify only the CRD coding region. The PCR reaction mixture consisted of 1 μl genomic DNA (50 ng/ml), 0.2 mM dNTPs, 10 pmol of each primer, 3 mM MgCl2 and 2 U i-proof Taq polymerase (BioRad) in a total reaction volume of 50 μl. The PCR conditions were initial denaturation at 98 °C for 1 min, followed by 35 cycles of 98 °C for 30 s, 70 °C for 25 s and 72 °C for 25 s, and a final extension at 72 °C for 10 min. PCR products were analyzed by 1% agarose gel electrophoresis and purified using a gel extraction kit (Qiagen) following the manufacturer’s instructions. The forward and reverse restriction sites were NdeI and BamHI (Fermentas) in both cases, and the amplified products were cloned into pET15b and pET16b (Novagen) vectors for LdEg and LdEgTr, respectively. The ligated products were transformed into Escherichia coli DH5α by the heat-shock method, and 50 mg/ml ampicillin was used as the selection marker. Positive clones were confirmed through plasmid shift, colony PCR, restriction digestion with NdeI and BamHI, and finally nucleotide sequencing.
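As a quick sanity check on the primer design described above, the sketch below searches each primer for the recognition sequences of NdeI (CATATG) and BamHI (GGATCC). The primer sequences are copied from the text with spaces removed; the dictionary keys and the function name `find_sites` are illustrative names only and are not part of the original protocol.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}

# Primer sequences from the text (5'->3'), written without spaces.
# The dictionary keys are illustrative labels, not names used by the authors.
PRIMERS = {
    "LdEg_forward": "GTACATATGATGGCCGCCGCAAGG",
    "LdEg_reverse": "CGGGATCCCTACTCATCCTGCTCCACG",
    "LdEgTr_reverse": "CGGGATCCAGACAGATGAACGAACATGATG",
}


def find_sites(primer):
    """Return {enzyme: 0-based position} for every site found in the primer."""
    primer = primer.upper().replace(" ", "")
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt", find_sites(seq))
```

With these sequences, the forward primer carries only the NdeI site and both reverse primers carry only the BamHI site, which is consistent with directional cloning between NdeI and BamHI.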
The selected positive clones were expressed in E. coli Rosetta (DE3) cells (Novagen), and protein expression was induced using the auto-induction method of Studier in 100 ml culture. Cells were harvested by centrifugation, re-suspended in 3 ml of buffer A (20 mM Tris–HCl pH 8.0, 300 mM NaCl and 1 mM phenylmethylsulfonyl fluoride) and lysed using an ultrasonicator (Sonics model VCX750). The cell extract was centrifuged at 12,000 × g (Kubota 6500) for 20 min to obtain the soluble supernatant and the insoluble pellet. The insoluble inclusion bodies in the pellet were solubilized in buffer B (20 mM Tris–HCl pH 8.0, 300 mM NaCl, 8 M urea, 10 mM β-mercaptoethanol and 20 mM imidazole) and incubated overnight at 4 °C with constant shaking. The resulting suspension was centrifuged again at 12,000 × g for 20 min at 4 °C, and the clear supernatant containing the denatured protein was used for refolding and purification.
The denatured protein was loaded on a Ni-NTA resin (Sigma) column pre-equilibrated with buffer B, and on-column refolding and purification were performed. The column was washed extensively with buffer B to remove non-specifically bound contaminants before the renaturation step. For renaturation, the column was washed with 10 column volumes (CV) of buffer A containing 0.1% Triton X-100 (Sigma), followed by 10 CV of buffer A containing 5 mM β-cyclodextrin (Sigma) in place of Triton X-100 to remove the detergent and allow the protein to refold. Finally, the column was washed with 20 mM Tris–HCl (pH 8.0), 0.1 M NaCl to remove remaining impurities and β-cyclodextrin, and the protein was eluted with buffer A supplemented with 250 mM imidazole. All eluted fractions were analyzed on 12% SDS–PAGE gels.
The selected Coomassie-stained protein band was excised from the SDS–PAGE gel, destained and digested in-gel with trypsin. The extracted peptides were subjected to MALDI-TOF/MS/MS using an AB Sciex QSTAR Elite LC MS/MS system. The peak list generated by Analyst or 4000 Series Explorer software was analyzed using the MASCOT search engine (http://www.matrixscience.com) and the GPS Explorer or Protein Pilot software to identify the peptides.
Steady-state fluorescence measurements were performed on a SpectraMax Me2 spectro-fluorimeter equipped with a 96-well plate reader. Experiments were performed at room temperature using a 0.2 mg/ml solution of protein in 20 mM Tris–HCl buffer (pH 7.5), with an excitation wavelength of 280 nm and an emission slit width of 5 nm; background signals were corrected. To monitor the unfolding transition, both the change in fluorescence intensity and the shift in the fluorescence maximum were recorded after overnight incubation of the protein sample at 4 °C with guanidinium chloride at concentrations from 0 to 6 M.
Far UV CD spectra were recorded on a Jasco J-715 spectropolarimeter equipped with a Peltier-type temperature control system (PTC-348WI model). Protein at 0.2 mg/ml in 2 mM citric acid-glycine-HEPES (CGH) buffer, pH 7.5, was used, and CD spectra were recorded at 20 °C with a time constant of 4 s, a band width of 1 nm and a scan rate of 5 nm min⁻¹; each spectrum was signal-averaged over at least 3 scans.
The near UV spectra of refolded LdEgTr were measured at room temperature using a Jasco 815 CD spectrometer. The spectra were recorded in the wavelength range of 250–300 nm using a rectangular quartz cell of 1 mm path length. The protein concentration used for the CD measurements was 0.2 mg/ml in 10 mM Tris buffer, pH 7.5, and each spectrum was the average of 3 scans.
The hemagglutination activity of the purified refolded protein was checked using a 1% rabbit erythrocyte (RBC) suspension in 0.9% NaCl. In each well of a U-shaped micro-titer plate, 50 μl of serially diluted protein solution (2 mg/ml in 0.9% NaCl), 50 μl of rabbit RBC suspension and 50 μl of 0.9% NaCl solution were combined and incubated at room temperature for one hour. Agglutination in the presence of refolded protein was assessed visually (RBC sediment as button formation). The hemagglutination unit (HU), defined as the reciprocal of the highest dilution exhibiting visible hemagglutination, was noted, and the specific activity was calculated as the number of hemagglutination units per mg of protein. For inhibition studies, 50 μl of sugar solution (5–100 mM in 0.9% NaCl) was added instead of the blank 0.9% NaCl solution and button formation was observed. The sugars tested in the study were d-glucose, d-mannose, d-galactose, maltose, fructose, sucrose, lactose, N-acetyl-d-galactosamine, methyl-d-glucopyranoside and methyl-d-mannopyranoside, together with the glycoprotein fetuin.
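The two quantities defined above can be written out as a short calculation. The sketch below is illustrative only: the well readings are hypothetical, the function names are not from the original work, and taking the protein mass as that contained in 50 μl of the 2 mg/ml stock (0.1 mg) is an assumption about how the authors normalized the activity.

```python
def hemagglutination_units(wells):
    """Reciprocal of the highest dilution that still shows agglutination.

    `wells` is a list of (dilution_factor, agglutinated) pairs, where a
    dilution factor of 8 stands for a 1:8 dilution. Returns 0 if no well
    shows agglutination.
    """
    positive = [factor for factor, agglutinated in wells if agglutinated]
    return max(positive) if positive else 0


def specific_activity(units, protein_mg):
    """Hemagglutination units per mg of protein."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return units / protein_mg


if __name__ == "__main__":
    # Hypothetical two-fold dilution series; NOT data from this study.
    readings = [(1, True), (2, True), (4, True), (8, False), (16, False)]
    units = hemagglutination_units(readings)

    # Assumption: 50 ul of a 2 mg/ml stock, i.e. 0.05 ml * 2 mg/ml = 0.1 mg.
    protein_mg = 0.05 * 2.0
    print(units, "HU;", specific_activity(units, protein_mg), "HU/mg")
```

The same calculation applies to any dilution scheme as long as the dilution factors are recorded alongside the visual readout.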
Multiple sequence alignment was done using the program ClustalW (http://www.ebi.ac.uk/Tools/msa/clustalw2/). The I-TASSER server (http://zhang.bioinformatics.ku.edu/I-TASSER/) was used to generate 3-D models of the proteins, and the quality of the final models was assessed with the PROCHECK (http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html) and VERIFY-3D (http://nihserver.mbi.ucla.edu/Verify_3D) programs. Structural similarity search was performed using the DALI server (http://ekhidna.biocenter.helsinki.fi/dali_server). Active site analysis was done using the CASTp program (http://sts.bioengr.uic.edu/castp), and structural superposition was done using the SUPERPOSE program (http://wishart.biology.ualberta.ca/SuperPose/). Phylogenetic analysis of 44 well-annotated ERGIC-53 and VIP36-like lectins from the UniProt database (www.uniprot.org) and 15 sequences from TriTryp was performed by the Neighbor Joining method with bootstrap using the program MEGA 5. The full-length sequences, CRD and C-terminal region of these proteins were analyzed separately.
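Neighbor Joining operates on a matrix of pairwise distances derived from the multiple sequence alignment. The sketch below computes the simplest such distance, the p-distance (the fraction of aligned, ungapped positions that differ). It is an illustration only: the text does not state which distance model was selected in MEGA 5, and the three short sequences are hypothetical toy data, not sequences from this study.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing sites between two aligned sequences.

    Columns with a gap in either sequence are ignored (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_matrix(alignment):
    """Return {(name_i, name_j): p-distance} for every ordered pair."""
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m in alignment
        for n in alignment
    }


if __name__ == "__main__":
    # Hypothetical toy alignment used only to exercise the functions.
    toy = {
        "seqA": "MKT-LLVA",
        "seqB": "MKTALLIA",
        "seqC": "MRT-LFIA",
    }
    for pair, dist in sorted(distance_matrix(toy).items()):
        print(pair, round(dist, 3))
```

For two aligned sequences, percent identity over the compared columns is 100 × (1 − p-distance), although alignment tools can differ in how they count gapped columns.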
PCR amplification of both LdEg and LdEgTr was achieved, and positive clones were identified through plasmid shift and restriction digestion; the PCR amplification of LdEgTr is shown (Fig. 1a). Significant expression of both LdEg and LdEgTr was observed under autoinduction, but both proteins were located in inclusion bodies. The column-refolded LdEgTr was analyzed by 12% SDS–PAGE, where a distinct band was observed at around 30 kDa, the expected molecular weight of LdEgTr. As LdEgTr possesses an internal cysteine residue, intermolecular disulphide bridges could form, resulting in LdEgTr dimers. The column-refolded LdEgTr protein was therefore separated on SDS–PAGE under reducing as well as non-reducing conditions. Under both conditions, LdEgTr appeared at the same 30 kDa position, indicating that the protein did not dimerize during refolding (Fig. 1b). The protein identity was further confirmed by MALDI MS/MS analysis, in which a homologous lectin from L. major was the significant hit (Fig. 1c). However, repeated attempts to re-solubilize LdEg failed, and it was therefore not taken up for further studies.
LdEgTr contains three tryptophans, ten tyrosines and 15 phenylalanines, which allow conformational changes to be investigated by fluorescence spectroscopy. When the refolded LdEgTr was excited at 280 nm, the emission maximum (λmax) was observed at 340 nm, which is characteristic of tryptophan residues that are at least partly shielded from the polar solvent. Upon denaturation with increasing concentrations of guanidine–HCl (0–6 M), a red shift and a decrease in intensity were observed (Fig. 2). The shift in λmax from 340 nm to 350 nm indicates that tryptophan residues that were buried in the folded state became exposed after denaturation. This indicates that the re-solubilized LdEgTr is in a folded form, which was further confirmed through CD studies and the hemagglutination assay.
The CD spectrum of LdEgTr showed a minimum at 214 nm, which is characteristic of beta-sheet, and a dip at 209 nm, which indicates the presence of alpha helix in the structure; this was similar to the in silico predicted secondary structure of the protein (Fig. 3a, c). The near UV spectrum indicated the presence of the aromatic residues Tyr, Phe and Trp in the sequence (Fig. 3b).
LdEgTr agglutinated rabbit erythrocytes with a specific activity of 1 HU/mg. Agglutination was inhibited only by fetuin, a glycoprotein, at a minimum inhibitory concentration (MIC) of 0.12%, and not by the simple sugars used in this study, indicating specificity towards complex sugars.
BLAST analysis with LdEg as the query gives L-type lectins as hits (Fig. 4a), which align with the N-terminal CRD region (residues 1–226); there is also a 160 amino acid long C-terminal region that is unique and found only in homologues from Tritryps. LdEg has 31.2% sequence identity and 44% sequence similarity with the T. cruzi sequence Q1ZY26_TYCR, which has been reported earlier. Modeling of the CRD region of the query sequence with the I-TASSER server showed the characteristic con-A fold of ERGIC-53/VIP36 proteins. The model superposed very well on the crystal structures 1gv9 (mammalian ERGIC-53) and 2dur (VIP36), with RMSDs of 0.66 and 2.44, respectively (Fig. 4b), and most of the key residues reported to interact with sugars in mammalian ERGIC-53, such as S66, E66, D100 and N166, were found to be well conserved. Sequence analysis of the query also showed the absence of the characteristic KKXX motif thought to interact with coat protein 1 complex (COP-I) coated vesicles. The cysteine residues that are well known for dimerization of subunits are also absent in this region of LdEgTr.
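A check for the C-terminal KKXX motif mentioned above can be expressed as a simple pattern match on the last four residues. The sketch below is illustrative: the function name is not from the original analysis, and the example sequences are hypothetical strings chosen only to exercise the pattern.

```python
import re

# Two lysines followed by any two residues at the very end of the sequence.
KKXX = re.compile(r"KK[A-Z]{2}$")


def has_kkxx(sequence):
    """True if the sequence ends with a KKXX dilysine motif."""
    return bool(KKXX.search(sequence.strip().upper()))


if __name__ == "__main__":
    # Hypothetical sequences for illustration; not taken from the study.
    examples = {
        "ends_with_KKFF": "MASLLKKFF",
        "ends_with_KRFF": "MASLLKRFF",
        "internal_KK_only": "KKFFMASLL",
    }
    for label, seq in examples.items():
        print(label, has_kkxx(seq))
```

Anchoring the pattern with `$` matters here because the motif is described as a C-terminal signal, so an internal KK pair should not count.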
Phylogenetic analysis of the CRD regions of VIP36/ERGIC-53 sequences downloaded from UniProt shows three distinct clades: VIP36, ERGIC-53 and Tritryp, the last comprising homologues of ERGIC-53 (Fig. 5a). Phylogenetic analysis of the full-length sequences showed four distinct clades, i.e. Tritryp, fungal and two from eukaryotes (Fig. 5b).
In an earlier study we reported the identification, through bioinformatics analysis of the Leishmania major genome, of a lectin homologue that has a con-A like fold and is a homologue of ERGIC-53/VIP36 and other L-type lectins. A previous study reported the isolation of two cDNAs from a Trypanosoma cruzi amastigote library immuno-screened with sera from patients with Chagas disease; these were identified as VIP36/ERGIC-53 homologues. The proteins they encode have an N-terminal lectin domain and, in the C-terminal region, repetitive amino acids and a poly-glutamine tract. They are expressed in all life stages and are located in the vicinity of the flagellar pocket membrane. The lectin identified by us is a close homologue of this protein, and based on sequence analysis a total of fifteen homologues were identified from Tritryp species (Fig. 6). Phylogenetic analysis of VIP36/ERGIC-53 and their homologues from Tritryps shows separation into distinct clades, indicating that the Tritryp ERGIC-53 homologues differ from the VIP36/ERGIC-53 of other species. Indeed, the earlier work by Macedo identified them as antigens and attributed the antigenicity to the C-terminal repeats. Usually a distinct signal motif, KK or KR, at the C-terminus of these proteins is identified as responsible for interaction with the coat protein 1 complex (COPI) during ER-Golgi recycling. In fungi and protozoans there is marked variation in these residues, which differ from those in Tritryps; this needs to be studied further.
Since the antigenicity is due to the amino acid repeats and polyglutamines, we investigated their pattern in different species. The repeats could be identified only in the Tritryps, and the pattern of repeats in the C-terminal region differed between members of Leishmania and Trypanosoma species (Fig. 7). The repeat patterns [EQP(Q)n(H/Y)]n, [D(X)n]n and [(E)n(X)n]n were found in Trypanosoma spp., whilst the patterns [QEA(H/Q)PA(Q)n]n and [ETQPAPQ]n were found in Leishmania spp. These differences could have implications for antigenic behavior and function, but this needs further study.
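Tandem repeats of the kind listed above can be located with regular expressions. The sketch below searches for consecutive copies of a fixed repeat unit such as ETQPAPQ and for polyglutamine tracts. The function name, the minimum copy numbers and the toy sequence are assumptions made for illustration; the text does not describe how the repeats were actually detected.

```python
import re


def tandem_runs(sequence, unit, min_copies=2):
    """Find runs of at least `min_copies` back-to-back copies of `unit`.

    Returns a list of (0-based start position, number of copies).
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    return [
        (match.start(), len(match.group()) // len(unit))
        for match in pattern.finditer(sequence)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence; NOT a real Leishmania or Trypanosoma protein.
    toy = "MAS" + "ETQPAPQ" * 3 + "GH" + "QQQQQ" + "ETQPAPQ"

    # Three consecutive ETQPAPQ units starting at position 3; the isolated
    # final copy is not reported because it is not a tandem repeat.
    print(tandem_runs(toy, "ETQPAPQ"))

    # Polyglutamine tract, using an assumed threshold of four residues.
    print(tandem_runs(toy, "Q", min_copies=4))
```

Degenerate patterns such as [QEA(H/Q)PA(Q)n]n would need a character class and a quantifier inside the unit rather than an escaped literal, so this helper covers only exact-unit repeats.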
The ER-Golgi intermediate compartment (ERGIC) marker ERGIC-53 is a mannose-specific membrane lectin operating as a cargo receptor for the transport of glycoproteins from the ER to the ERGIC. Lack of functional ERGIC-53 leads to a selective defect in the secretion of glycoproteins in cultured cells and to hemophilia in humans. Similarly, VIP36 is reported to be involved in the trafficking of glycoproteins and glycolipids during post-Golgi trafficking. The presence of an additional antigenic C-terminal repeat region makes the similar protein from Tritryps a very interesting candidate for further studies. Here, the CRD domain of LdEg has been cloned and refolded by on-column refolding; its activity has been confirmed by hemagglutination assay, and it has been further characterized through biochemical and biophysical studies.
Taking a lead from our earlier bioinformatics work, we have cloned and characterized a lectin from the Leishmania donovani genome in the present study. The protein is a homologue of a protein from T. cruzi that was reported earlier as an antigen. It has an ERGIC-53/VIP36 region at the N-terminus (residues 1–250) and a repeat region of 160 residues towards the C-terminus. Two clones were generated: the full-length gene (referred to as LdEg) and a truncated gene containing only the CRD region (referred to as LdEgTr). Re-solubilized LdEgTr was obtained by on-column refolding and was characterized by MS/MS analysis, CD and fluorescence spectra. The refolded protein shows hemagglutination activity with rabbit erythrocytes (2%), with a specific activity of 1 HU/mg, and it also shows specificity towards fetuin, a complex glycoprotein. Bioinformatics analysis identifies some unique features: a distinct clade in phylogenetic analysis, unique repeats in the C-terminus and a possible substitution of the KK motif at the C-terminus with arginine. These features make it an interesting candidate for further detailed studies.
The authors wish to thank DBT, Govt. of India and Puri Foundation for Education in India for financial support. The kind help of Dr. Hemanta Majumder and Dr. Partha Saha in obtaining L. donovani genomic DNA is gratefully acknowledged. Authors RSK and MK acknowledge senior research fellowships from CSIR and ICMR, New Delhi, respectively. The kind help of Dr. R Vardarajan from MBU, IISc Bangalore in conducting some experiments is gratefully acknowledged.