|Home | About | Journals | Submit | Contact Us | Français|
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Malaria is a tropical disease caused by protozoan parasite, Plasmodium, which is transmitted to humans by various species of female anopheline mosquitoes. Anopheles stephensi is one such major malaria vector in urban parts of the Indian subcontinent. Unlike Anopheles gambiae, an African malaria vector, transcriptome of A. stephensi midgut tissue is less explored. We have therefore carried out generation, annotation, and analysis of expressed sequence tags from sugar-fed and Plasmodium yoelii infected blood-fed (post 24 h) adult female A. stephensi midgut tissue.
We obtained 7061 and 8306 ESTs from the sugar-fed and P. yoelii infected mosquito midgut tissue libraries, respectively. ESTs from the combined dataset formed 1319 contigs and 2627 singlets, totaling to 3946 unique transcripts. Putative functions were assigned to 1615 (40.9%) transcripts using BLASTX against UniProtKB database. Amongst unannotated transcripts, we identified 1513 putative novel transcripts and 818 potential untranslated regions (UTRs). Statistical comparison of annotated and unannotated ESTs from the two libraries identified 119 differentially regulated genes. Out of 3946 unique transcripts, only 1387 transcripts were mapped on the A. gambiae genome. These also included 189 novel transcripts, which were mapped to the unannotated regions of the genome. The EST data is available as ESTDB at http://mycompdb.bioinfo-portal.cdac.in/cgi-bin/est/index.cgi.
3946 unique transcripts were successfully identified from the adult female A. stephensi midgut tissue. These data can be used for microarray development for better understanding of vector-parasite relationship and to study differences or similarities with other malaria vectors. Mapping of putative novel transcripts from A. stephensi on the A. gambiae genome proved fruitful in identification and annotation of several genes. Failure of some novel transcripts to map on the A. gambiae genome indicates existence of substantial genomic dissimilarities between these two potent malaria vectors.
Anopheles stephensi is a major malaria vector in the Indian subcontinent . Rapid urbanization and development in the region has stimulated a corresponding increase in their population resulting in frequent malaria outbreaks . Although, recent malaria epidemics occurred at higher frequencies, mortality was considerably low. For example during 2003, of the reported 1.78 million cases, only 1006 deaths were recorded in India .
Absence of an efficient vaccine , evolution of drug-resistance in the parasites , and insecticide-resistance in the mosquitoes  accentuate the need of an effective malaria control strategy. Human immunization against parasite proteins through transmission blocking vaccines (TBVs)  is one such strategy. Bacterial and fungal-based mosquito control methods are other alternatives but these suffer from major difficulties in practical application [7,8]. Transgenic mosquitoes could provide another control method , but successful application in field will require designing of appropriate vector-parasite study model . Ito et al. , showed transgenic expression of an antiparasitic peptide SM1 in mosquitoes leading to impairment in Plasmodium berghei development. However, the peptide failed to show such activity against Plasmodium falciparum [10,11], the human malaria parasite. Study of naturally occurring P. falciparum-resistant A. gambiae mosquitoes revealed a Plasmodium-responsive gene, Anopheles Plasmodium-responsive leucine-rich repeat 1 (APL1) , which could form a potent target for the transgenic approach against P. falciparum. Many other antiparasitic and/or immunologically active genes like SRPN6  from A. gambiae and A. stephensi, TEP1  and leucine-rich repeat protein (LRIM1)  from A. gambiae have also been identified recently. Moreover, availability of A. gambiae genome sequence  has improved the chances of discovery of more such potential genes in this insect.
In the pre-genomic era, EST (Expressed Sequence Tag) based studies were adopted to understand A. gambiae [17-20] and its role in malaria transmission . However, despite its importance as a malaria vector, A. stephensi has not been intensively investigated. Although EST [22-24] and microarray-based  studies on A. stephensi and Plasmodium exist, no major transcriptome based contributions have been reported so far. Here, we report the first large-scale effort in construction and analysis of EST libraries from midgut tissue of sugar-fed (SF) and P. yoelii infected blood-fed (post 24 h) (BF) female A. stephensi. In light of limited genomic and transcriptomic information for A. stephensi, these data would significantly enrich the molecular aspects of this insect and its role in malaria transmission.
Two cDNA libraries, SF and BF were prepared from sugar-fed and P. yoelii infected blood-fed (post 24 h) adult female A. stephensi mosquito midgut tissues, respectively. Single-pass sequencing yielded 15367 ESTs from both, SF (7061 ESTs) and BF (8306 ESTs) libraries as analyzed by phred [26,27] (quality ≥ 20) with minimum length greater than 100 bases. Vector, adapter, and primer sequences were removed using cross_match . Mouse and Plasmodium sequences were filtered using stand-alone BLAST. Average length of ESTs from both the libraries was approximately 380 bases and approximately 61% of the sequences were above 300 nucleotides. Table Table11 shows the summary of EST data obtained in this study. All the ESTs are deposited in GenBank with continuous accession numbers [GenBank: EX212289 – EX227655].
CAP3  based EST assembly and clustering of the combined dataset (15367 ESTs) resulted in 1319 contigs and 2627 singlets, forming 3946 unique transcripts (UTs). Similarly, independent assemblies were also performed for SF and BF libraries. Details are given in Table Table11.
To assign putative function to ESTs and UTs, we performed BLASTX search against the UniProtKB database. Summary of BLASTX results for SF, BF, and combined UTs are shown in Table Table2.2. BLASTX results for combined UTs are given in Additional file 1. Figure Figure11 shows BLAST hit distribution across various species of organisms for all UTs from the combined dataset. Putative functions were assigned only to 8946 ESTs (58.2%) and 1615 UTs (40.9%) (E-value ≤ 1e-5). Non-coding EST sequences usually fail to find a homolog in the protein databases during BLASTX search . We therefore, screened unannotated (with no BLAST hits) UTs (2331) for the presence of putative coding region using ESTScan program . 1513 UTs were predicted to contain a coding region, thereby suggesting these as novel genes. The remaining sequences could be potential untranslated regions (UTRs). Table Table33 shows ESTScan results of both the libraries and combined dataset.
A. stephensi UTs were also compared with the EST sequences of A. gambiae, Aedes aegypti, and Drosophila melanogaster using TBLASTX (E-value ≤ 1e-5). As compared to ESTs from Ae. aegypti and D. melanogaster, a higher number of UTs (45%) were homologous to A. gambiae ESTs. Only 39% and 34% UTs were identified homologous to ESTs from Ae. aegypti and D. melanogaster, respectively (Table (Table44).
Gene Ontology (GO) categories were assigned only to 505, 601, and 629 UTs from the SF, BF, and combined datasets, respectively using Blast2GO program . Additional file 2: Figure S2 illustrates percent similarity and E-value distribution obtained for all the UTs. Table Table55 shows percent distribution of UTs among various assigned GO terms (2nd level) according to the GO consortium  for SF and BF libraries. Details of assigned GO terms to each UT from combined dataset are given in Additional file 1. Statistical comparison of GO terms between the two libraries revealed an overrepresentation of metabolic process-related genes in BF library, whereas cellular process-related house keeping genes were dominant in SF library (Figure (Figure2).2). For details refer Additional file 3.
IDEG6 analysis  for BLASTX-annotated ESTs identified 114 differentially expressed genes (P < 0.05) between the two libraries. In brief, 58 genes (20 overexpressed and 38 exclusively expressed) in SF and 56 genes (23 overexpressed and 33 exclusively expressed) in BF library were found to be differentially regulated. Few unannotated genes (unannotated ESTs) were also found to be significantly altered in expression. Two unannotated genes (PU_Contig136 (P = 0) and PU_Contig398 (P = 0)) were exclusively expressed in SF library, while only one such exclusive gene (PI_Contig553 (P = 0)) was found significantly expressed in BF library. Moreover, PU_Contig33 and PI_Contig478 (P ≤ 0.05), PU_Contig9 and PI_Contig24 (P ≤ 0.05) pairs of unannotated genes were observed in both the libraries with significant changes in gene expression. Many genes were also exclusive to each library but showed no statistical significance. Details are given in Additional file 4.
To identify insect-specific genes in our UTs, we used data from Zhang et al. . Only 20 transcripts encoding insect-specific proteins were observed in our data and most of them were related to metabolic processes such as reductases and deaminases. A few receptor proteins, sensory and immunity-related proteins were also observed (Additional file 1).
Genome mapping and alignment of A. stephensi UTs on A. gambiae genome revealed many homologous genes between them (Additional file 1). 1387 UTs were successfully mapped on the A. gambiae genome, which also included 189 novel UTs. Remaining UTs (2812) failed to show any alignment with the A. gambiae genome. Table Table66 illustrates the details of mapping study.
ESTDB http://mycompdb.bioinfo-portal.cdac.in/cgi-bin/est/index.cgi, a database housing the entire EST dataset along with its annotations has also been developed. The database supports text- and sequence-based queries through a user-friendly interface. It also provides graphical display of contigs along with assembled ESTs.
A. stephensi is a predominant malaria vector in urban parts of the Indian subcontinent. In spite of its importance as a malaria vector, no in-depth transcriptomic information is available on the midgut tissue of A. stephensi during sugar feeding and parasite infection. We herein report generation, annotation, and analysis of ESTs from sugar-fed and P. yoelii infected adult female A. stephensi midgut tissues.
7061 high quality ESTs were obtained from the sugar-fed cDNA library and 8306 ESTs from the 24 h post blood-fed infected cDNA library. With 15367 ESTs, our study represents the first intensive effort in complementing gene sequence information for this mosquito. Although the genome of the closely related anopheline species, A. gambiae is available, discovery of novel transcripts (1513) in A. stephensi suggests a significant interspecies variation. In addition, mapping of novel transcripts (189) to the A. gambiae genome testifies the usefulness of our data in gene discovery process.
Like other insects, mosquitoes are also equipped with genes responsible for adaptation to environment changes. Identification of insect-specific genes could prove useful in understanding the molecular basis of their success in various ecological niches. Recently, Zhang et al.  identified stress and immune-response related proteins as a major fraction of insect-specific proteins. We have identified 20 such insect-specific genes like reductases, deaminases, receptor proteins, sensory- and immunity-related proteins in this study.
Comparative analysis of GO terms demonstrated striking differences between the two stages of the vector examined. Based on molecular function, gene ontologies, peptidase activity (GO:0008233, P = 0.00247923), catalytic activity (GO:0003824, P = 0.00488696), and endopeptidase activity (GO:0004175, P = 0.00597939) were found significantly upregulated in blood-fed infected condition (Figure (Figure2).2). In biological process ontologies such as digestion (GO:0007586), proteolysis (GO:0006508), cellular lipid metabolic process (GO:0044255), electron transport (GO:0006118), and acyl-CoA metabolic process (GO:0006637) were found overrepresented upon blood feeding (details in Additional file 3). We also identified many dominant and differentially expressed transcripts in the two cDNA libraries indicating their prominent role in the stage specific physiological and/or biochemical processes in the mosquito vector. Based on IDEG6 analysis of EST data, 114 BLASTX-annotated genes were differentially regulated upon sugar feeding and parasite infection with blood ingestion. Sugar-fed condition exhibited a significant upregulation in the expression of many ribosomal proteins, mitochondrial proteins and other housekeeping genes (P < 0.05, Additional file 4). Dominance of proteases like various isoforms of trypsin, chymotrypsin precursors, carboxypeptidases, and other proteases (P < 0.05) characterized the blood-fed infected female mosquitoes as reported earlier [17,36,37]. However, some isoforms of serine proteases were also significantly overexpressed in the sugar-fed tissue. We also identified 5 unannotated genes (PU_Contig136 (P ≤ 0.05), PU_Contig398 (P ≤ 0.05), PU_Contig398 (P ≤ 0.05), PU_Contig33 & PI_Contig478 (P ≤ 0.05), and PU_Contig9 & PI_Contig24 (P ≤ 0.05)), which were significantly altered during the two conditions. These require further characterization (Additional file 4).
Encountering antimicrobial peptides like cecropin-A precursor (P = 0.014491) and cecropin-B precursor (P = 0) with a prominent dominance in sugar-fed condition is a unique observation. It is noteworthy that cecropins are the first reported antimicrobial peptides from insects , also known to have antiparasitic activity in mosquitoes . However, the other well known antimicrobial peptide, defensins, with characteristic six cysteine/three disulfide bridge pattern [40,41], showed no differential expression (P > 0.05) between the two conditions. Defensins are primarily active against gram-positive bacteria and are induced by Plasmodium or other microbial infections in mosquitoes . Another transcript encoding an uncharacterized immune response-related protein [GenBank: EX221808] (P = 0.044556) was overexpressed upon sugar feeding. Lysozyme C-7 and salivary lysozyme (homologous to Lysozyme C-1 from A. gambiae) transcripts were also expressed in the sugar-fed tissue (P > 0.05). These molecules participate in innate immunity  by catalyzing hydrolysis of the peptidoglycan layer of bacterial cell wall.
Blood feeding causes excess protein and iron overload in mosquitoes . Blood-induced expression of protease transcripts would therefore be expected [17,36,37]. These proteolytic enzymes not only help in protein digestion but also facilitate establishment of parasite infection through proteolytic activation of enzymes, e.g., conversion of pro-chitinase to chitinase in Plasmodium gallinaceum, which digests the peritrophic matrix . Post-iron overdose caused by blood feeding also induces synthesis and secretion of iron storing molecules like ferritin, which defend mosquito cells from iron toxicity . In our study, increased expression of putative ferritin transcripts in the blood-fed tissue, e.g., ferritin subunit 1 and secreted ferritin G subunit (P < 0.05, Additional file 4) substantiated this fact. Transcripts encoding Protein G12 precursor were exclusively (n = 335, P = 0) seen in the blood-fed tissue as reported earlier . This protein shows homology with Bla g1 and Per a1, allergens from cockroaches, which are shed in the insect feces and upon inhalation these cause asthma in human beings . The other protein G12 counterparts, ANG12 from A. gambiae  and AEG12 from Ae. aegypti  are both induced upon blood feeding. Interestingly, AEG12 is postulated to have a function in digestion and it maps to a genomic region affecting susceptibility to parasite infection .
Serpins are serine protease inhibitors, deriving their name from their activity . Many studies have identified different genes and isoforms of serpins in A. gambiae [17,51,52]. In A. gambiae, Serpin 2 (SRPN2) is reported to negatively regulate ookinete killing and melanization thereby assisting midgut invasion by malaria parasites . Encountering this transcript [GenBank: EX215382] (P > 0.05) in blood-fed infected tissue corroborates this fact.
Mosquito and Plasmodium chitinases are shown to promote successful establishment of the parasite by digesting the midgut peritrophic matrix [53-55]. Chitinase expression is reported to increase upon bacterial and pathogen infection . A similar increase in chitinase expression (P = 0.008282) in the ookinete-infected tissue substantiates the fact.
As reported earlier, parasite invasion in mosquito midgut epithelia induces a cascade of changes leading to cell death by apoptosis . In the blood-fed infected tissue, we also observed expression of apoptotic transcripts like caspase-6 and ancaspase-7 . Transcripts encoding anti-apoptotic proteins, which modulate caspase activity were found to be expressed in sugar-fed mosquito tissues, e.g., defender against programmed cell death. In blood-fed infected tissue, we also observed an increase in the number of several enzymes participating in redox metabolism and detoxification, such as superoxide dismutase, peroxidase, isoforms of metallothionein, cytochrome P450, and glutathione-S-transferase (Additional file 4). Some of the oxidoreductases were found in both the libraries but an overall upregulation is evident upon blood feeding and parasite infection, as reported earlier [57,58].
Cytoskeletal remodeling in host cells is a hallmark of pathogen attachment and invasion during infection [59-61]. We also found many transcripts encoding cytoskeletal and its associated proteins during both the conditions. As the regulation of formation of the actin network in cell cytoskeleton is centered at Arp2/3 complex (ARP) , its overexpression is necessary during infection. A significant increase in ARPs, α and β tubulins are reported to be upregulated during parasite invasion in A. gambiae midgut . However, we did not find such difference. Pathogen establishment is a stress to the host cell  accompanied with oxidative burst leading to misfolding of proteins [65,66]. A vast variety of stress induced proteins, especially, heat shock proteins and chaperonins are produced by the cell to carry out proper protein folding during stress. Many such transcripts were also observed in our data (Additional file 1).
Tetraspanins are conserved membrane proteins traversing cell membrane four times . These are found associated with many other proteins, especially integrins. They are involved in intracellular signaling, cellular motility, and metastasis. We found tetraspanin transcripts (Additional file 1) in both conditions. In Drosophila , the tetraspanin family comprises more than 30 members suggesting a possibility of many such proteins in mosquitoes. Interestingly, in Manduca sexta, tetraspanin-integrin interactions have been reported necessary for transition of hemocytes during cell-mediated immune responses .
Proteins containing leucine-rich repeats (LRRs) like APL1 , LRIM1, and LRIM2  demonstrate inhibitory activity against Plasmodium infection in A. gambiae and Anopheles quadriannulatus . Many other LRR domain containing proteins like toll receptors are reported in insects and other organisms, which primarily participate in protein-protein interactions . They exhibit diverse functionality but a definitive role has not been established in insects. We also found a few transcripts encoding proteins with LRR domains in our study (Additional file 1).
ICHIT protein contains mucin domains, which participate in the formation of extracellular matrix , and in trapping microbial pathogens through their lectin-liking characteristics . It possesses two putative chitin-binding domains flanking a mucin domain, and is observed to increase upon bacterial and malaria challenge in A. gambiae . However, we observed an increase in ICHIT (P = 0.000019) transcripts in sugar-fed condition. This protein is also believed to be associated with the peritrophic matrix, which separates the blood meal from the midgut membrane. Found across many other organisms, a possible role of ICHIT in immune response is predicted against pathogens .
Septins are GTPases thought to be associated with cell division especially nuclear division, membrane trafficking, and organizing the cytoskeleton . As in other studies , we also observed septins and smt3 transcripts in the sugar-fed tissue. These together play a role in toll signaling . We found many other insignificantly expressed transcripts in both the conditions (Additional file 4), which might bear an indispensable role in mosquito life cycle, e.g., vitellogenin, which is an abundant yolk precursor protein participating in egg maturation .
In summary, our study identifies numerous transcripts from A. stephensi midgut tissue with known and unknown functions (Additional file 1). However, despite of massive sequencing, loss of rare transcripts is possible. This could be due to the overexpression of certain stage specific genes, e.g., blood-induced genes like trypsin. In addition, our study differs with respect to the use of incubation temperature (28°C) for parasite development in mosquitoes from work reported earlier (24°C) . However, we observed reasonable number of oocyst formation (average 65.3, n = 20). Blood feeding by these parasite-carrying mosquitoes also induced a significant parasitemia in uninfected mice, confirming completion of parasite life cycle in the vector at 28°C. Furthermore, in the view of low genomic and proteomic resemblance between P. yoelii and the human malaria parasites , observations from rodent models like ours, need an essential analysis and assessment before extrapolation. Nevertheless, the information generated in the form of transcriptome could certainly prove a boon in investigating other malaria parasites.
We have successfully obtained 3946 transcripts from the adult female A. stephensi mosquito midgut, which would be of considerable use in future research on this malaria vector. Mapping of transcripts onto the A. gambiae genome was beneficial in the gene discovery process.
Plasmodium parasites (P. yoelii) were obtained from the Malaria Research Center (Delhi, India). The parasite was first inoculated in adult BALB/c mice (UK/AIIMS strains) by intraperitoneal route. Time for effective parasitemia was determined on various post inoculation days (PID). Parasites were maintained in vivo throughout the study.
A. stephensi (NIV strain) mosquitoes were maintained on 10% glucose until blood feeding. Adult females (4 days old) were allowed to blood feed on P. yoelii infected-BALB/c mice. Prior to blood feeding, blood parasitemia levels in the infected mice were determined using Giemsa stain. Mice showing gametocyte percentages above 0.5 were used for blood feeding experiments as reported earlier . Fully engorged females were separated using an aspirator and maintained in the insectory with controlled temperature (28 ± 2°C) and humidity (80 ± 5%) under 12 h alternating dark/light cycles. At 24 h post blood feeding, mosquito midguts were dissected, stained with 0.5% mercurochrome, and oocyst numbers per midgut were determined using a light-contrast microscope (Olympus) at 100× magnification. Dissected midguts were stored in liquid nitrogen until cDNA library preparation. To confirm completion of Plasmodium sporogony cycle in mosquitoes at 28°C, after every 4th or 5th passage, the natural route of infection was confirmed i.e. parasite-infected mosquitoes (14–15 days post blood feeding) were allowed to feed on uninfected BALB/c mice and parasitemia was recorded. To determine ookinete infection in the infected tissue, we additionally performed a qualitative assay based on reverse transcriptase PCR (RT-PCR) for P. yoelii ookinete specific genes, pyCTRP  and pyECP1  (for details refer Additional file 2).
A set of 20–40 blood-fed infected and sugar-fed adult female midgut tissues were used for cDNA library preparation. The tissues were crushed in trizol (Invitrogen) using RNase-free glass dounce homogenizer. RNA was subsequently extracted, following the manufacturer's protocol. Quantification of RNA was performed using ND-1000 Nanodrop (Thermo Scientific). RNA integrity was checked using denaturing agarose gel electrophoresis. cDNA libraries were constructed using the Creator™ SMART™ cDNA construction kit (Clontech, Takara Bio Inc.) according to the manufacturer's protocol using 1 μg of total RNA. After digestion with Sfi I, cDNA fragments were size fractionated using CHROMA SPIN-400 columns according to the instructions provided. Fractions were checked on 1.5% agarose/EtBr gels. cDNA fragments ranging from 300 bp to 3 kb were pooled. All further steps including ligation to pDNR-LIB, precipitation, and electroporation (Biorad GenePulser) in DH10B E. coli (Invitrogen) were carried out following the supplier's instructions. Libraries were screened for inserts by colony PCR. Thereafter, primary libraries were amplified and stored in 25% glycerol stocks at -80°C. When required, clones were plated using LB (Luria-Bertani) agar containing 30 μg/ml chloramphenicol and incubated overnight at 37°C. Colonies were manually inoculated in 1 ml 2× LB broth containing chloramphenicol in a 96-well inoculation plate. Plasmid isolation was done using Montage Plasmid Miniprep96 kit (LSKP09624, Millipore Corporation) following the manufacturer's instructions. Plasmid concentrations were determined for a random set of clones from each 96-well plate using nanodrop and quality was checked on 1% agarose/EtBr gels. Approximately, 300–500 ng of plasmids containing cDNA insert were sequenced from their 5' end using BigDye Terminator version 3.1 chemistry (Applied Biosystems, Foster City, CA) and M13 primer (5'-GTAAAACGACGGCCAGTAGATCT-3') on an ABI 3730 Genetic analyzer (Applied Biosystems) following the manufacturer's protocol.
The EST analysis was performed using an in-house developed EST pipeline (Additional file 2). Base-calling of the trace files was performed using phred [25,26] (quality value ≥ 20). The vector, primer and adapter sequences were masked using cross_match. PolyA tails were removed using a program in PERL script. Trimmed ESTs less than 100 bases in length were discarded. An additional round of filtering was performed to remove vector sequence, adapter sequence, and polyA tail using seqclean . EST sequences representing mouse and Plasmodium genes were identified and removed using BLAST analysis. ESTs from BF and SF libraries were assembled separately and together using CAP3 program . The UTs (contigs plus singlets) obtained from both libraries were combined and assembled using CAP3. These were searched against the UniProtKB database using BLASTX and EST data of A. gambiae, Ae. aegypti, and D. melanogaster, using TBLASTX. UTs showing no significant hits with the UniProtKB database were scanned using ESTScan  to verify the presence of putative coding region. GO terms were assigned to all the UTs using Blast2GO program . Classifications were based on molecular function, biological processes, and cellular components. To identify overrepresented GO terms between the libraries, enrichment analysis (using Fisher's exact test at a significance threshold value of 0.05) was carried out in Blast2GO program.
Statistical comparison of gene expression in both the libraries was performed using the online version of IDEG6  implementing pairwise Fisher exact test (significance threshold of 0.05). The analysis was performed for BLASTX-annotated and unannotated ESTs separately. For unannotated ESTs, only ESTs containing a putative coding region were considered.
Files representing the 2L, 2R, 3R, 3L, X, unknown, and unplaced Y chromosomal sequences were downloaded from Ensembl . UTs were mapped onto the A. gambiae genome using Gmap version 2007-09-28  using default parameters. Information comprising number of exons, chromosome name, and locus, was parsed using PERL script.
MySQL relational database management system was used as the back-end and the front-end was designed using various modules of PERL (CGI, DBI and GD). The database is hosted on the web using Apache web-server.
All the ESTs were deposited in the GenBank database with accession numbers from EX212289 to EX227655.
ESTs: Expressed Sequence Tags; BF: Plasmodium yoelii infected blood-fed; SF: Sugar-fed; UTRs: untranslated regions; PERL: Practical Extraction and Reporting Language; and UTs: Unique transcripts (refers to singlets and contigs together).
DPP, DPD, and AD were involved in construction of cDNA libraries. DPP, VSM, SAW, RKC, GJK, PSG, AS, and KMD contributed in sequencing the libraries. SA designed the pipeline for EST sequence trimming, assembly, and construction of ESTDB database. ESTscan and UniprotKB BLAST analyses were performed by SA with the help of BB. DPD has performed GO analysis, insect-specific transcript identification, IDEG6 analysis, and mapping of ESTs on A. gambiae genome with the help of DPP & NG. DTM performed mosquito rearing, parasite maintenance, and parasite infection in mosquitoes. DTM, RRJ, and MSP were co-investigators with YSS. DPP and DPD were involved in preparing the manuscript. All authors read and approved the final manuscript.
Detailed report of EST analysis. Contains detailed information obtained for all transcripts by various analyses. The worksheet includes user ID, GenBank accession, library-specificity (SF/BF/Both), insect-specific (YES/NO), insect-specific protein ID (if present), genome mapping to A. gambiae ('YES', if mapped and 'No', if not mapped), BLAST results (e.g., sequence description, length of query sequence, no. of BLAST hits obtained (maximum 10), maximum E-value, mean similarity (average similarity of top 10 BLAST hits)), no. of GO IDs, GO IDs, EC No., and other genome mapping details.
Assessment of Plasmodium infection in mosquito midgut and additional figures. Contains protocol for PCR based assessment of Plasmodium yeolii ookinete infection in the female A. stephensi mosquito midgut and results (Additional file 2: Figure S1). E-value-, percent-, and similarity-distribution for SF, BF, and combined UTs (Additional file 2: Figure S2). Flow chart depicting flow of analysis for EST pre-processing and functional annotation (Additional file 2: Figure S3).
Statistical comparison of GO terms. The file contains detailed output of Blast2GO's Enrichment analysis based on Fisher's exact test (only significantly (P < 0.05) altered GO term representations are shown).
Statistical comparison for differential gene expression (IDEG6 analysis). The file contains statistical comparison of the genes expressed in both the libraries using IDEG6 tool. Comparison of annotated and unannotated genes is given in "Annotated" and "Unannotated" worksheets, respectively. Different degrees of blue shades in the normalized tag values represent the extent of gene expression. Significant P-values are shaded yellow.
We are truly grateful to Prof. David W. Severson (Department of Biological Sciences, University of Notre Dame, USA), Dr. R. N. Sharma (National Chemical Laboratory, Pune, India), and Dr. Kiran Pawar (NCCS) for their comments and critical review of the manuscript. We thank the Department of Biotechnology, Government of India for providing financial support to the project. We also thank the Council of Scientific and Industrial Research (CSIR, New Delhi, India) and Department of Biotechnology (New Delhi, India) for supporting DPP and DPD with fellowships, respectively. We express our gratitude towards Mr. Sarang Satoor, Incharge, DNA Sequencing Facility, NCCS, for his kind help and suggestions in sequencing. We gratefully acknowledge the active support and encouragement provided by the Director, NCCS, Padmashree, Dr. G. C. Mishra.