As the midgut is the primary organ of the sand fly in which the Leishmania parasite develops, cDNA libraries of the midgut tissue were constructed, sequenced and analyzed to identify molecules that may mediate important interactions between these two organisms. In total, five cDNA libraries were constructed from the midgut tissue of female L. longipalpis under different conditions of feeding and digestion. These conditions included one library combining the midguts from sand flies allowed to feed on a sucrose solution (SF), a pool of midgut tissue from sand flies fully engorged on an artificial blood meal 1, 2, and 3 days post blood meal ingestion (BF), and a pool of midguts from gravid sand flies 5, 6, and 7 days post blood meal digestion (PBMD). The conditions chosen and the pooling of those times after blood meal ingestion allow better coverage of the most abundant molecules transcribed in the midgut, as well as a comparison of the molecules present prior to blood feeding, while the blood bolus is present, during digestion of the blood meal, and after the blood byproducts have been excreted. Two further cDNA libraries were constructed from the equivalent pools of time points after blood feeding, using midgut tissue from L. longipalpis sand flies that had ingested amastigote-infected macrophages in an artificial blood meal (BFi and PBMDi), a more natural presentation of parasites to the blood-feeding sand fly.
Once the libraries were constructed, approximately 2300 phage plaques were picked and sequenced for each of the five cDNA libraries, generating a total of 9601 high quality sequences from the midgut tissue of L. longipalpis. These sequences have been submitted to the NCBI EST database under accession numbers EW987149 – EW996682. Table summarizes the results of sequence quality and bioinformatics analysis of each library and of the combination of all libraries: the number of sequences analyzed, the number of high quality sequences used in the bioinformatics analysis, the number of contigs, the number of singletons and the average number of sequences per contig. Each library generated a similar number of sequences, and sequence recovery from the phage plaques ranged from 79–85%. After discarding low quality sequences, each library retained 71–80% of its sequences, with an average of 73% of the total 11,520 phage producing high quality sequence data. Clustering similar sequences into contigs, based on sequence homology, produced a comparable number of contigs and a similar number of singletons for each library. The comparable number of high quality sequences, contigs and singletons produced from each library allows for a better comparison between the sequence abundance of specific molecules of interest and the respective biological condition of the midgut under which they were recovered. The average number of sequences per contig varied slightly between libraries. The BF, PBMD, and PBMDi cDNA libraries contained an average sequence per cluster ratio of 8.4, 8.10 and 8.76, respectively. The SF cDNA library had a sequence per cluster ratio of 6.86 and the BFi cDNA library produced an average of 6.34 sequences per cluster. Combining all cDNA library sequences produced 655 contigs, 2279 singletons and an average of 9.45 sequences per contig.
Each cluster was assigned a putative function and placed in a functional class based on sequence homology to molecules identified by BLAST searches of the NCBI non-redundant protein, Gene Ontology, conserved domain, rRNA and mitochondrial databases. Figure shows an overall view of the sequence abundance of the functional classes during the processes of sugar feeding, blood feeding and after the digestion of the blood meal. The clusters of those three cDNA libraries with an E-value less than 10E-5 in the KOG BLAST were grouped according to general functional class. Although this is a summation of a large number of different clusters, the total number of sequences in each functional class can highlight overall trends that are potentially important in the processes of blood feeding and digestion.
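The grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the functional class names and the example values are hypothetical, and the cutoff is passed in explicitly because the exact threshold convention is taken here as an assumption (1e-5).

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Cluster(NamedTuple):
    """One assembled cluster with its best hit (all field names are hypothetical)."""
    cluster_id: int
    functional_class: str
    e_value: float
    n_sequences: int


def tally_functional_classes(clusters: Iterable[Cluster],
                             max_e_value: float = 1e-5) -> Counter:
    """Sum the number of sequences per functional class.

    Only clusters whose best-hit E-value is below max_e_value are counted.
    The default cutoff is an assumption made for this sketch.
    """
    totals: Counter = Counter()
    for cluster in clusters:
        if cluster.e_value < max_e_value:
            totals[cluster.functional_class] += cluster.n_sequences
    return totals


# Made-up example records, not data from the libraries.
example = [
    Cluster(1, "protein digestion", 1e-40, 120),
    Cluster(2, "protein digestion", 3e-12, 35),
    Cluster(3, "transport", 2e-3, 50),   # dropped: E-value above the cutoff
    Cluster(4, "transport", 1e-8, 10),
]
print(tally_functional_classes(example))
```

Running the same tally separately for each library would give the per-class totals that a histogram of functional classes compares.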
Overall examination of the 5 individual cDNA libraries and the combined analysis
Figure 1 Histogram of the number of sequences grouped into functional classes from the sugar fed, blood fed and post blood meal digestion cDNA libraries. Sequences from clusters of those three cDNA libraries, with an E-value less than 10E-5 result of the COG ...
Following is a more detailed description of the most abundant transcripts identified in this analysis:
Proteases were among the most abundant transcripts captured in the random sequencing of the midgut cDNA libraries and included trypsin-like serine proteases, chymotrypsins, carboxypeptidases, and an astacin-like metalloprotease. Table shows the putative proteases identified in the midgut transcriptome. The Sanger Institute's Lutzomyia longipalpis EST database was searched using BLAST to find the best matches, and results are shown with the corresponding E value. The proteases described here are most similar to those described in the sand fly Phlebotomus papatasi or the mosquitoes Aedes aegypti or Anopheles gambiae, with the exception of cluster 91, which encodes a putative carboxypeptidase that shares homology with a molecule from the beetle Tribolium castaneum. Table shows the transcript producing a full length, high quality sequence for each cluster and the putative function of the identified transcripts. The number of sequences that each cluster contributed to each of the cDNA libraries is also shown; from this it can be seen that most proteases are, as expected, more abundant in the blood fed (BF) and blood fed-Leishmania infected (BFi) libraries. An interesting observation is that cluster 18, which encodes a putative trypsin, is most abundant in the SF, PBMD and PBMDi cDNA libraries, indicating that this putative trypsin may have a role other than blood meal digestion or is produced and stored prior to the ingestion of a blood meal. Table describes the predicted localization, molecular weight and isoelectric point of these proteases. All of the identified proteases possess a potential signal peptide, and the molecular weight and isoelectric point given are those of the predicted mature and secreted protein.
Putative midgut-associated proteases; best matched results and corresponding E values from BLAST inquiries of a GenBank-derived non-redundant protein database and Lutzomyia longipalpis EST database
Putative midgut-associated proteases; putative function and sequence distribution contributed from each cDNA library
Putative midgut-associated proteases; localization, molecular weight and isoelectric point of putative midgut proteins
Four trypsin-like transcripts were identified in the transcriptome with high homology to the described P. papatasi midgut trypsins [3]. Clusters 18, 35, 60 and 83 are similar to P. papatasi Pptryp1, and Pptryp3, respectively. Recently, two transcripts from L. longipalpis midgut EST sequencing were partially characterized and named Lltryp1, which corresponds with Cluster 35 identified in our cDNA libraries, and Lltryp2, which corresponds with Cluster 18 [3]. Lltryp2 was found in highest abundance, 434 sequences, with a unique sequence distribution among the five cDNA libraries in that most sequences were contributed by the sugar fed and post blood meal digestion groups. In order of decreasing abundance of sequences are Lltryp1, and LuloTryp3 had relatively homogeneous sequence distribution among the cDNA libraries, although LuloTryp3 was underrepresented in the PBMDi cDNA library with only one sequence identified. The distribution of Lltryp1 sequences between the cDNA libraries correlates with published reverse transcriptase-PCR results showing the expression of Lltryp1 during the presence of a blood meal in the female sand fly midgut [3]. Further information about the putative trypsin molecules can be found in Table , which shows molecular weights ranging from 26.0 to 26.2 kDa. The isoelectric points (pI) of these putative trypsins vary: Lltryp1 has a higher pI of 6.32, Lltryp2 has a lower pI of 4.95, and LuloTryp3 and LuloTryp4 have similar pI values of 5.67 and 5.52, respectively. Phylogenetic analysis of amino acid sequences from Dipteran trypsin molecules and a trypsin from Blattella germanica resulted in two major clades, one containing the A. gambiae trypsin molecules (Group I) and another containing the remaining sequences. Within the other major clade the sand fly trypsins from L. longipalpis and P. papatasi form two subclades (Group II) (Figure ). As previously published [3], Pptryp1 and Pptryp2 form a clade apart from the clade containing Pptryp3 and Pptryp4. The putative trypsin molecules identified in L. longipalpis midgut share a high homology with the P. papatasi molecules, being grouped into the same clades. Multiple sequence alignment of the trypsin molecules of L. longipalpis depicts the potential secretory signal peptide, the H/D/S catalytic site residues and substrate specifying residues (Figure ).
Figure 2 Sequence analysis of trypsin-like serine proteases. (A) Phylogenetic analysis of amino acid sequences from Anopheles gambiae (Antryp), Culicoides sonorensis (Cs), Blattella germanica (Bg), Lutzomyia longipalpis (Lulo and Ll), Phlebotomus papatasi (Pp), ...
A novel midgut-associated serine protease, LuloSerPro, was identified in the sequencing and annotation of these midgut cDNA libraries. LuloSerPro is predicted to be secreted and have a mature molecular weight of 29.0 kDa, slightly larger than the other trypsin-like serine proteases in the midgut, and has an unusually high predicted pI of 8.26 (Table ). This molecule, while found in low abundance, was present in the sugar fed, blood fed-Leishmania infected, and post blood meal digestion-Leishmania infected cDNA libraries (Table ). Phylogenetic analysis and multiple sequence alignments of the midgut trypsin molecules and LuloSerPro show that while this molecule is very similar to other trypsin molecules and retains the catalytic residues, this is a distinctly different serine protease (Figure ). Additionally, there is a difference in the residues that determine the substrate specificity (Lys to Val) between the other midgut trypsins and LuloSerPro (Figure ).
Chymotrypsin is another serine protease found in abundance in the midgut of this hematophagous insect. This study identified five clusters with homology to chymotrypsin molecules described in P. papatasi and one cluster with homology to a putative larval chymotrypsin found in A. aegypti (Tables , , ). Clusters 33, 32, 64, 87, 30 and 31 were named LuloChym1A, LuloChym1B, LuloChym2, LuloChym3, LuloChym4 and LuloChym5, respectively. LuloChym4 was found in higher abundance in the sugar fed cDNA library, and LuloChym5 sequences were found in relatively equal numbers between the blood fed and sugar fed cDNA libraries. In contrast, the other chymotrypsin molecules appear in highest abundance in the blood fed and blood fed-Leishmania infected cDNA libraries (Table ). According to sequence numbers between the cDNA libraries, it appears that chymotrypsin transcription is quiescent after the blood meal has been digested and excreted. The L. longipalpis chymotrypsin sequences have a predicted molecular weight of mature and secreted protein ranging from 25.8 to 27.6 kDa (Table ).
Phylogenetic analysis of chymotrypsin amino acid sequences shows that sequence homology is conserved between the L. longipalpis and P. papatasi chymotrypsin molecules (Figure ). LuloChym1A, LuloChym1B, LuloChym4 and LuloChym5 form a subclade within a clade containing only sand fly chymotrypsin molecules. The short phylogenetic distance between LuloChym1A and LuloChym1B and the 95% amino acid identity they share suggest that these transcript sequences may represent polymorphisms. Further comparisons between the amino acid sequences of the midgut-associated chymotrypsin molecules show that the cysteine and catalytic residues H/D/S are conserved (Figure ).
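Percent identity between two aligned sequences, such as the value quoted for LuloChym1A and LuloChym1B, can be computed along the following lines. This is a minimal sketch and not necessarily the procedure used in this study: it assumes the two sequences are already aligned to equal length with "-" as the gap character, and it counts only columns in which neither sequence has a gap, which is one of several conventions in use. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the ungapped columns of a pairwise alignment.

    Assumption for this sketch: columns where either sequence has a gap
    are excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy sequences, not taken from the transcriptome.
print(round(percent_identity("ACDEFG", "ACDQFG"), 1))
```

Other definitions divide by the full alignment length or by the length of the shorter sequence, so reported identities should always be read together with the convention used.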
Figure 3 Chymotrypsin sequence analysis. (A) Phylogenetic analysis of chymotrypsin sequences from Phlebotomus papatasi (Pp), Lutzomyia longipalpis (Lulo), Anopheles gambiae (Ag), Aedes aegypti (Aa), and Culicoides sonorensis (Cs). Accession numbers are shown in ...
The three longest transcripts encoding putative proteases identified in the analysis are similar to zinc metallocarboxypeptidases found in other insects and show significant similarity to ESTs from the Sanger Institute database (Table ). These transcripts, from clusters 104, 107 and 91, were named LuloCpepA1, LuloCpepA2 and LuloCpepB and have molecular weights of 45.8, 46.0 and 45.9 kDa and pI values of 5.36, 5.41 and 4.73, respectively (Table ). Although LuloCpepA2 appears to be an incomplete transcript with a 5' truncation, a putative mature protein based on homology and predicted signal peptide sequences can be used in further characterization and comparison. Most of the sequences grouped to produce the carboxypeptidase clusters were captured from the blood fed library, suggesting that these molecules are likely induced by the ingestion or presence of blood in the midgut of the sand fly (Table ). The classification of these molecules as members of the A or B class of metallocarboxypeptidases was determined from phylogenetic analysis of the amino acid sequences (Figure ). The phylogenetic tree produced by this analysis shows distinct clades containing insect sequences nearly all annotated as either carboxypeptidase A or carboxypeptidase B molecules. The high node support values of the sand fly carboxypeptidases in the phylogenetic tree imply conservation of these molecules between the Old World sand fly P. papatasi and the New World sand fly L. longipalpis. Similarity between the two sand flies with regard to the carboxypeptidase molecules can be seen in amino acid sequence alignments, which depict the high level of identity and retention of the catalytic residues necessary for metallocarboxypeptidase activity (Figure ). Furthermore, the amino acid sequence alignment depicts the differences that separate LuloCpepA1 from LuloCpepA2 (Figure ).
Figure 4 Analysis of putative carboxypeptidase molecules. (A) Phylogenetic analysis of carboxypeptidases from Lutzomyia longipalpis (Lulo), Phlebotomus papatasi (Pp), Ochlerotatus triseriatus (Ot), Aedes aegypti (Ae), Anopheles gambiae (Ag), Drosophila melanogaster ...
A putative zinc metalloprotease was identified as a likely astacin-like molecule based on results of a search of the conserved domains database. This molecule was derived from clusters 58 and 59, both encoding the same putative protein but separated by the bioinformatics software due to differing lengths of the 5'- and 3'-UTRs. The astacin-like metalloprotease was named LuloAstacin and is predicted to have a molecular weight of 28 kDa once secreted and a pI of 5.36 (Table ). LuloAstacin was most abundant in the sugar fed cDNA library, in contrast to PpAstacin, an astacin-like molecule identified in P. papatasi midgut, which was most abundant in the blood fed cDNA library (Table ). Phylogenetic analysis of other putative astacin amino acid sequences illustrates that one clade is an assemblage of the Dipteran sequences. LuloAstacin branches out of the subclade containing PpAstacin and away from the other Dipteran sequences (Figure ). Further differences in amino acid sequence can be visualized in the multiple sequence alignment of Dipteran astacins; while LuloAstacin diverges from the other astacin molecules, the residues responsible for zinc-binding and activity are conserved (Figure ).
Figure 5 Astacin-like metalloprotease sequence comparison and analysis. (A) Phylogenetic analysis of amino acid sequences from Lutzomyia longipalpis (Lulo), Phlebotomus papatasi (Pp), Mus musculus (Mm), Homo sapiens (Hs), Glossina morsitans morsitans (Gm), Drosophila ...
A number of molecules were identified as containing chitin binding domains based on results from the conserved domains database (Tables , , ). Three of the transcripts resembled previously identified peritrophin molecules based on sequence homology with peritrophin-A domains. The most abundant of these putative peritrophin transcripts was named LuloPer1 (Cluster 77/78); it was overrepresented in the blood fed Leishmania-infected cDNA library and encodes a likely secreted protein of 27.8 kDa (Tables and ). LuloPer1 consists of four chitin-binding domains (Figure ), in contrast to the other two peritrophin molecules, LuloPer2 and LuloPer3, which have a single chitin-binding domain (Figure ). LuloPer2 and LuloPer3 sequences originated in higher numbers from the blood fed midgut cDNA libraries and were in relatively equal numbers between the infected and uninfected sand flies. These small putative peritrophins are predicted to have mature molecular weights of 9.2 and 7.5 kDa and isoelectric points of 4.38 and 3.8 for LuloPer2 and LuloPer3, respectively (Table ). LuloPer1 is likely to have a role in cross linking chitin fibrils that will form the peritrophic matrix around the ingested blood bolus. LuloPer2 and LuloPer3 may have roles in capping the ends of chitin fibrils or sequestering free chitinous molecules within the midgut lumen. However, the two sequences share only 39% identity and 44% similarity, conserving primarily the cysteine residues, suggesting they may have very different ligand specificities or roles in peritrophic matrix formation or chitin management within the midgut (data not shown). Phylogenetic analysis of the individual chitin-binding domains from several other insect peritrophin and mucin molecules demonstrates conservation of the LuloPer1 domain arrangement when compared with P. papatasi PpPer1, suggesting that, if the domains arose by gene duplication, those events occurred prior to speciation (Figure ). Additionally, the chitin-binding domains of the small putative peritrophin molecules LuloPer2 and LuloPer3 form a clade containing another chitin-binding domain from a small peritrophin of P. papatasi (Figure ).
Putative midgut-associated peritrophin proteins; best matched results and corresponding E values from BLAST inquiries of a GenBank-derived non-redundant protein database and Lutzomyia longipalpis EST database
Putative midgut-associated peritrophin proteins; putative function and sequence distribution contributed from each cDNA library
Putative midgut-associated peritrophin proteins; localization, molecular weight and isoelectric point of putative midgut proteins
Figure 6 Characterization of peritrophin sequences. (A) Diagrammatic representation of Lutzomyia longipalpis peritrophin-like molecules showing the predicted signal peptide and chitin binding domains. (B) Phylogenetic analysis of predicted chitin binding domains ...
In addition to the putative peritrophin molecules, a transcript with homology to a predicted chitin-binding domain was identified from the clustering of 6 sequences collected primarily from the blood fed Leishmania-infected cDNA library. This domain has homology to a much larger chitin-binding domain than those found in the putative peritrophin molecules. The identified transcript, LuloChiBi, has one of these domains and is predicted to have a mature molecular weight of 20.9 kDa (Table ).
Among the most abundant sequences identified in the cDNA libraries were transcripts encoding putative microvillar-associated proteins with homology to insect allergens identified in Periplaneta americana and Blattella germanica (Table ). By BLAST analysis, high homology was also found to molecules in the mosquito Aedes aegypti. In order of decreasing overall sequence abundance, clusters 27, 29, 48, 66 and 36 were named LuloMVP1, LuloMVP2, LuloMVP3, LuloMVP4 and LuloMVP5, respectively (Table ). In general the microvillar proteins were most abundant in the blood fed cDNA libraries, although LuloMVP3 (cluster 48) sequences were underrepresented in the blood fed cDNA libraries and were identified in relatively equal numbers in the sugar fed and post blood meal digestion cDNA libraries. LuloMVP1, LuloMVP2 and LuloMVP5 have nearly equal mature molecular weights of 21 kDa, based on the cleavage of the predicted signal peptide present in all of the microvillar proteins, and LuloMVP3 and LuloMVP4 are slightly larger, at around 23 kDa. A notable difference in isoelectric point among the microvillar proteins was observed: LuloMVP3 has a predicted value of 8.84, whereas the isoelectric points of the other microvillar molecules range from 4.46 to 5.12 (Table ).
Putative midgut-associated microvillar proteins; best matched results and corresponding E values from BLAST inquiries of a GenBank-derived non-redundant protein database and Lutzomyia longipalpis EST database
Putative midgut-associated microvillar proteins; putative function and sequence distribution contributed from each cDNA library
Putative midgut-associated microvillar proteins; localization, molecular weight and isoelectric point of putative midgut proteins
The L. longipalpis microvillar proteins share homology with similar molecules identified in the midgut of P. papatasi, as demonstrated by amino acid phylogenetic analysis (Figure ). The sand fly microvillar proteins are separated from the clade containing the cockroach sequences. Additionally, LuloMVP2 and LuloMVP5 are in a subclade with the microvillar proteins of A. aegypti and A. gambiae, while the other molecules pair with the P. papatasi microvillar proteins (Figure ). Sequence alignment of the L. longipalpis microvillar proteins shows little sequence homology, suggesting that the classification of microvillar proteins is rather broad and that these molecules may in fact have different functions altogether.
Figure 7 Sequence analysis of microvillar proteins. (A) Phylogenetic analysis of amino acid sequences from Blattella germanica (Bg), Periplaneta americana (Pa), Tenebrio molitor (Tm), Aedes aegypti (Aa), Anopheles gambiae (Ag), Phlebotomus papatasi (Pp) and Lutzomyia ...
Oxidative stress molecules
The sand fly, being an obligate blood feeding insect, must cope with the physiological challenges posed by the digestion of blood, which include the generation of reactive oxygen species (ROS) released by free heme and metabolic radicals produced in abundance during the digestion of the blood meal [7]. Five molecules were identified in the midgut cDNA libraries which have putative roles as antioxidants, such as glutathione s-transferase (GST), catalase, copper-zinc superoxide dismutase (SOD) and peroxiredoxin (PRX) (Table ). In addition to the protection these molecules may impart through the regulation of ROS generated by blood meal digestion, there is evidence that antioxidants interact with and can impact the outcomes of infection by bacterial and parasitic agents [8]. Two transcripts were identified with homology to GST molecules of the Class Sigma and Class Delta and Epsilon subfamilies and were named LuloGST1 and LuloGST2, respectively. Phylogenetic analysis of the putative GST molecules supports their separation and classification into the subfamily classes of Sigma and Delta/Epsilon. Additionally, LuloGST1 is grouped in a subclade with other dipteran GST molecules while LuloGST2 diverges from the dipteran Delta/Epsilon GST molecules (Figure ). The LuloGST1 cluster was generated from sequences from each of the cDNA libraries made and analyzed, while LuloGST2 consists of one sequence from the sugar fed library and two sequences from the blood fed Leishmania-infected cDNA library. Additional antioxidant molecules include a catalase (LuloCAT), copper-zinc superoxide dismutase (LuloSOD), and peroxiredoxin (LuloPRX), of which LuloSOD and LuloPRX are both predicted to be secreted based on the presence of a likely signal peptide sequence. ROS and reactive nitrogen oxide species (RNOS) are important in host defenses against microorganisms, and LuloCAT, LuloSOD and LuloPRX are molecules which may serve to regulate and prevent damage of the sand fly midgut by the ROS and RNOS defenses, similar to the protective effect of a peroxiredoxin in Anopheles stephensi.
Putative midgut-associated oxidative stress molecules; best matched results and corresponding E values from BLAST inquiries of a GenBank-derived non-redundant protein database and Lutzomyia longipalpis EST database
Figure 8 Phylogenetic analysis of glutathione s-transferase molecules. Sequences analysed from Lutzomyia longipalpis (Lulo), Phlebotomus papatasi (Pp), Drosophila melanogaster (Dm), Aedes aegypti (Ae), Anopheles gambiae (Ag), Musca domestica (Md), Bombyx mori ...
Upon the ingestion of a blood meal by a hematophagous insect, a large amount of iron and heme is released during digestion. To combat the toxic effects of free iron and the generation of damaging reactive oxygen species, ferritin is produced to sequester the iron and hemoglobin that is liberated by the digestion of red blood cells. Ferritin molecules are commonly associated with iron metabolism and it is likely that the molecules identified in this transcriptome engage in metabolic function; however, given the relative size of the blood meal in comparison with the sand fly, ferritin molecules within the midgut likely serve a large role in preventing the generation of oxygen radicals by the Fenton reaction. Two transcripts from clusters 76 and 79 were identified with homology to ferritin light-chain and ferritin heavy-chain molecules and were named LuloFLC and LuloFHC, respectively (Tables and ). The expression of LuloFLC and LuloFHC appears to be constitutive based on the number of sequences generated in each cDNA library spanning the conditions of sugar fed, blood fed, and post blood meal digestion (Table ).
Putative midgut-associated oxidative stress molecules; localization, molecular weight and isoelectric point of putative midgut proteins
Putative midgut-associated oxidative stress molecules; putative function and sequence distribution contributed from each cDNA library
Serine protease inhibitors
Two types of serine protease inhibitors were identified in the cDNA libraries: a single sequence with homology to SERPIN and a cluster of 17 sequences with homology to a Kazal-type serine protease inhibitor (Tables , , ). SERPIN molecules within the midgut of the sand fly may serve to counteract damaging proteases produced by microorganisms; however, LuloSRPN lacks a predicted signal peptide sequence and thus may serve an intracellular housekeeping function. LuloKZL, identified from cluster 112, is a small molecule of 6.3 kDa and is predicted to be secreted. Comparison of LuloKZL with Kazal-type serine protease inhibitors found in a transcriptome analysis of the midgut of P. papatasi identified PpKZL1 as a highly conserved homolog (data not shown). Kazal-type protease inhibitors, such as rhodniin and infestin identified in Rhodnius prolixus and Triatoma infestans, respectively, have been characterized as thrombin inhibitors; these molecules would thereby prevent coagulation of ingested blood to facilitate successful digestion of the blood meal [10]. LuloKZL sequences are more abundant prior to and during blood meal digestion based on the number of sequences in the sugar fed, blood fed and post blood meal digestion cDNA libraries. Additionally, LuloKZL was not identified in an EST analysis of whole sand fly L. longipalpis and is therefore more likely a midgut-specific molecule found in abundance only in the alimentary tissue [5]. Thus, a prudent hypothesis would be that LuloKZL serves a similar function, allowing the blood bolus to remain in a colloidal suspension within the gut to facilitate peristalsis and digestion.
Housekeeping and low-abundance transcripts from the midgut of L. longipalpis; best matched results and corresponding E values from BLAST inquiries of a GenBank-derived non-redundant protein database and Lutzomyia longipalpis EST database
Housekeeping and low-abundance transcripts from the midgut of L. longipalpis; putative function and sequence distribution contributed from each cDNA library
Housekeeping and low-abundance transcripts from the midgut of L. longipalpis; localization, molecular weight and isoelectric point of putative midgut proteins
Two molecules, originating from clusters 235 and 1960, encode a putative peptidoglycan recognition protein (LuloPGRP) and defensin (LuloDEF), respectively. LuloPGRP is similar to other predicted peptidoglycan recognition proteins found in Glossina morsitans morsitans and mosquitoes and is phylogenetically distinct from lepidopteran molecules (Figure ). This is the first report of a putative PGRP identified in sand flies, and a search of a midgut transcriptome database of P. papatasi identified a molecule with 87% identity. LuloPGRP may serve as a pattern recognition protein, specifically for the conserved structure of peptidoglycan as indicated by the conservation of the amino acid sequence among insects, as a component of the sand fly immune defense against bacterial pathogens (Figure ). PGRP molecules characterized in Bombyx mori and Trichoplusia ni have been shown to be expressed primarily in the fat body and hemocytes, and it is conceivable that the identification of LuloPGRP transcripts arose due to a contamination of the tissue sample [12]. It is also possible that the midgut tissue of sand flies expresses a PGRP for protection against microorganisms ingested during sugar and blood feeding, as a PGRP was identified as preferentially expressed in the midgut of Samia cynthia ricini.
Figure 9 Sequence analysis of peptidoglycan recognition proteins. (A) Phylogenetic analysis of amino acid sequences of peptidoglycan recognition proteins from Lutzomyia longipalpis (Lulo), Phlebotomus papatasi (Pp), Anopheles gambiae (Ag), Aedes aegypti (Ae), ...
Defensins are another type of innate immune defense that insects possess to ward off pathogenic bacteria. A single sequence, named LuloDEF, was identified in the post blood meal digestion midgut cDNA library with homology to a defensin molecule characterized in A. aegypti. Like other insect defensin molecules, LuloDEF has a predicted secretion signal peptide, and most homology is found in the carboxyl half of the sequence and in the conservation of cysteine residues (Figure ). LuloDEF shares 47% identity and 61% similarity with a defensin characterized in Phlebotomus duboscqi which is induced by the presence of wild type Leishmania major [ ]. Both immunity-associated genes, LuloPGRP and LuloDEF, may have an impact on the progression and result of a midgut infection by Leishmania parasites, either directly or by indirect effects if co-colonization of the midgut with bacteria is an intermediary confounding factor.
Figure 10 Multiple sequence alignment of putative defensin sequences. Aligned sequences from Lutzomyia longipalpis (Lulo), Culicoides sonorensis (Cs), Phlebotomus duboscqi (Pd), Drosophila melanogaster (Dm), Anopheles gambiae (Ag), Aedes aegypti (Ae) and Muscus ...
Transcripts differentially expressed by blood feeding and digestion
A comparison between the sugar fed and blood fed libraries, and between the blood fed and post blood meal digestion libraries, was conducted using Pearson's chi-square test to identify overrepresented transcripts within each cluster. As was previously seen in P. papatasi, a number of digestion-associated transcripts were overabundant in the blood fed cDNA library [6]. We anticipated similar results in the analysis of the L. longipalpis midgut cDNA libraries, with the added advantage of a cDNA library produced from midguts that had fully processed and excreted the blood meal byproducts. We hypothesized that transcript abundance in the post blood meal midgut would be most similar to that in the sugar fed midgut prior to a blood meal. Overall, the number of sequences per cluster in the sugar fed cDNA library was similar to that in the post blood meal digestion cDNA library, and most transcripts were overrepresented in the blood fed library (Table ). There are several exceptions to both observations, however. Most of the microvillar protein transcripts are abundant in the blood fed cDNA library except for LuloMVP3, which is highly represented in the sugar fed and post blood meal digestion cDNA libraries. This reinforces the suggestion that the microvillar proteins are likely functionally different molecules grouped solely on homology to previously annotated sequences. In general, proteases appear to be induced by the act of blood feeding or by the presence of a blood meal within the midgut. The exceptions are Lltryp2, which is significantly more abundant in both the sugar fed and the post blood meal digestion cDNA libraries, and LuloAstacin, which is more abundant in the sugar fed cDNA library (Tables and ). These molecules may be produced and stored prior to blood feeding for immediate use in digestion, or may have a role other than digestion altogether, such as immunity. Other proteases, such as LuloChym4, are present in higher or nearly equal numbers in the sugar fed library when compared with the blood fed library. Other molecules, such as the peritrophin LuloPer1, are also more plentiful in the blood fed cDNA library, suggesting that these molecules may be transcribed only in response to blood feeding. A transcript encoding a predicted protein of unknown function, derived from cluster 40, was most abundant in the post blood meal digestion cDNA library, suggesting that it may play a role outside of blood meal digestion, such as oogenesis.
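The library-to-library comparison described above can be illustrated with a small sketch of a Pearson chi-square test on a 2 x 2 table (sequences inside versus outside one cluster, in two libraries). The text states only that Pearson's chi-square was used; the 2 x 2 layout, the absence of a continuity correction, the function name chi_square_2x2 and the counts in the usage lines are assumptions made for illustration and are not values from this study.

```python
import math


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    count_a / count_b: sequences of one cluster found in library A / B.
    total_a / total_b: all high quality sequences in library A / B.
    Returns (chi2, p_value).
    """
    in_cluster = count_a + count_b
    grand_total = total_a + total_b
    if in_cluster == 0 or in_cluster == grand_total:
        raise ValueError("cluster must be present in, and not exhaust, the libraries")

    # Observed 2 x 2 table: rows are libraries, columns are in/out of the cluster.
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [in_cluster, grand_total - in_cluster]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 40 of 2000 sequences in one library, 10 of 2000 in the other.
chi2, p_value = chi_square_2x2(40, 2000, 10, 2000)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

A cluster would be called overrepresented in the library with the larger proportion of its sequences when the p value falls below the chosen significance threshold. With small expected counts, an exact test may be preferable to the chi-square approximation.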
Sequence distribution altered during sugar feeding and blood meal digestion; clusters overrepresented in the sugar-fed, blood-fed and post blood meal digestion midgut cDNA libraries as determined by χ2 statistical analysis
Sequence distribution altered during sugar feeding and blood meal digestion; clusters that appear overabundant in the sugar-fed, blood-fed and post blood meal digestion midgut cDNA libraries
Transcripts differentially expressed by the presence of Leishmania infantum chagasi
To evaluate the effects of the presence of L. infantum chagasi parasites on transcript abundance in the midgut tissue of the sand fly, we compared the number of sequences in each cluster between the blood fed and blood fed Leishmania-infected cDNA libraries, and between the post blood meal digestion and post blood meal digestion Leishmania-infected cDNA libraries, using chi-square analysis (Tables , , , ). We hypothesized that the effects of the parasites' presence in the blood engorged sand fly would likely mirror what we had observed in a similar comparison of P. papatasi infected with L. major. Additionally, we hypothesized that the analysis of the post blood meal digestion midgut tissue would reveal a large number of differentially abundant transcripts, as during this time period Leishmania parasites are interacting with the midgut epithelium, replicating, and differentiating to the metacyclic form. In accordance with what we observed previously in blood engorged P. papatasi infected with L. major, there was an underrepresentation of the microvillar protein transcripts [6]. Similar trends in abundance between infected P. papatasi and infected L. longipalpis also occur for transcripts encoding the putative digestion enzymes trypsin (Lltryp2) and chymotrypsin (LuloChym1A). Two other digestive proteases, LuloAstacin and LuloCpepA1, were identified as differentially abundant in the presence of L. infantum chagasi, with a reduction in the number of transcripts captured in the blood fed Leishmania-infected library; however, only the LuloCpepA1 difference was statistically significant. The modulation of peritrophin transcript abundance shows a striking contrast between the two species. In the midgut of infected P. papatasi, peritrophin transcripts decrease, whereas L. longipalpis infected with L. infantum chagasi shows a significant overrepresentation of peritrophin (LuloPer1) and an overrepresentation of the putative chitin-binding molecule (LuloChiBi). Actin transcripts appear to be downregulated by the presence of L. infantum chagasi parasites in the midgut. We speculate that this could be a tactic of the parasite to decrease the cytoskeletal rearrangement that occurs after blood feeding as a means of decreasing peristalsis, which may aid in the retention of the parasite within the gut of the sand fly.
Sequence distribution altered by Leishmania infantum chagasi; clusters overrepresented in the blood-fed and blood-fed Leishmania infantum chagasi-infected midgut cDNA libraries as determined by χ2 statistical analysis
Sequence distribution altered by Leishmania infantum chagasi; clusters overrepresented in the post blood meal digestion and post blood meal digestion Leishmania infantum chagasi-infected midgut cDNA libraries as determined by χ2 statistical analysis
Sequence distribution altered by Leishmania infantum chagasi; clusters that appear overabundant in the blood-fed or blood-fed Leishmania infantum chagasi-infected midgut cDNA libraries
Sequence distribution altered by Leishmania infantum chagasi; LuloTryp3 appears underrepresented in the post blood meal digestion Leishmania infantum chagasi-infected midgut cDNA library
In terms of abundant transcripts, the post blood meal digestion midgut infected with L. infantum chagasi is relatively quiescent. Only one transcript, encoding a putative trypsin molecule, was identified as significantly different in abundance. Lltryp2 sequences were 1.54 times more abundant in the L. infantum chagasi-infected post blood meal digestion cDNA library, which corroborates the observed overrepresentation of Lltryp2 sequences in the blood fed infected cDNA library. It is possible that the increase in sand fly Lltryp2 occurs due to the presence of a perceived pathogen or as a consequence of a non-specific perception of contents within the midgut. Conversely, LuloTryp3 transcripts were captured at a lower frequency in the L. infantum chagasi-infected midgut after blood meal digestion.