Infections with the human parasite Plasmodium falciparum continue to present a great challenge to global health. Fundamental questions regarding the molecular basis of virulence and immune evasion in P. falciparum have been only partially answered. Because of the parasite's intracellular location and complex life cycle, standard genetic approaches to the study of the pathogenesis of malaria have been limited. The present study presents a novel approach to the identification of the biological processes involved in host-pathogen interactions, one that is based on the analysis of in vivo P. falciparum transcripts. We demonstrate that a sufficient quantity of P. falciparum RNA transcripts can be derived from a small blood sample from infected patients for whole-genome microarray analysis. Overall, excellent correlation was observed between the transcriptomes derived from in vivo samples and in vitro samples with ring-stage P. falciparum 3D7 reference strain. However, gene families that encode surface proteins are overexpressed in vivo. Moreover, this analysis has identified a new family of hypothetical genes that may encode surface variant antigens. Comparative studies of the transcriptomes derived from in vivo samples and in vitro 3D7 samples may identify important strategies used by the pathogen for survival in the human host and highlight, for vaccine development, new candidate antigens that were not previously identified through the use of in vitro cultures.
The human parasite Plasmodium falciparum continues to pose a great challenge to the health of most of the world's population and to the economies of most of the world's countries [1, 2]. Patients infected with P. falciparum present with a range of outcomes, from asymptomatic parasitemia to severe disease and death. The host and parasite factors that mediate the severity of disease are only partially defined [3–9]. One approach to the identification of parasite virulence factors is the characterization of in vivo parasite biological processes. It has been demonstrated that malarial transcripts encoded on chromosome 2 from P. falciparum–infected patients' blood samples can be reliably assessed, despite the abundance of human RNA. The present study reports important differences in the expression of genes detected in freshly obtained in vivo samples and the in vitro 3D7 transcriptome and provides a new approach to the study of the pathogenesis of malaria.
During the transmission season in October 2003, patients with malaria were recruited from a clinic in a region of Senegal where P. falciparum is hypoendemic. The present study was conducted as part of an ongoing drug-resistance study that has been described elsewhere and was approved by the institutional review boards at the Harvard School of Public Health and Cheikh Anta Diop University. Children and adults who presented with fever and whose blood smears were positive for P. falciparum were enrolled in the study. Blood (5–15 mL) was drawn into EDTA-coated Vacutainer tubes (Becton Dickinson) and was centrifuged to collect serum and to deplete the buffy coat. RNA was stabilized within 10 min by use of 4.5 volumes of Tri Reagent BD (Molecular Research Center). Samples were stored at −80°C until RNA extraction was completed, as described elsewhere.
Venous blood was applied to IsoCode paper for each sample, and DNA was later extracted in accordance with the manufacturer's instructions (Schleicher and Schuell). To determine the number of clones infecting each patient, genotyping was performed for the merozoite surface protein (MSP) 1 and MSP 2 alleles by use of polymerase chain reaction (PCR) with nested primers under standard PCR conditions. Samples with late-stage parasites, as detected by microscopy, whose species could not be determined morphologically were subjected to PCR with species-specific 18S rRNA primers.
To determine whether a sufficient number of mRNA transcripts could be identified in in vivo samples, the 5 samples that had the largest volume of blood and the highest parasitemias were analyzed by use of an oligonucleotide array, which contained probes that are based on the sequence of 3D7, the laboratory-adapted P. falciparum reference strain for which the complete genome is available. Total RNA was extracted by use of Tri Reagent BD, in accordance with the manufacturer's instructions. Labeling and hybridization of total RNA were performed as described elsewhere.
The match-only integral distribution algorithm was used to assess the expression level for each transcript. Background was assessed on the basis of the probe intensities of 100 negative control genes. Only those probe sets that had >10 probes per gene and that had a signal that was ≥ 1.5-fold higher than background were analyzed. In addition, a transcript was considered to be present if the expression level was >10 expression units and its probe signal distribution had log P < −.5. After background was subtracted, the average intensity of genes between the 30th and 90th percentiles was normalized to 200 expression units across all samples, to allow comparison.
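The filtering and scaling rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names, the dictionary layout, and the exact percentile convention (rank positions obtained by integer truncation) are assumptions made for this example, and the match-only integral distribution algorithm itself is not reproduced.

```python
def is_analyzable(n_probes, signal, background):
    """Probe-set filter: >10 probes per gene and signal >= 1.5-fold background."""
    return n_probes > 10 and signal >= 1.5 * background


def is_present(expression_level, log_p):
    """Presence call: >10 expression units and log P below -0.5."""
    return expression_level > 10 and log_p < -0.5


def trimmed_mean(values, lo=0.30, hi=0.90):
    """Mean of the values ranked between the lo and hi quantiles.

    The truncating rank convention used here is an assumption.
    """
    ordered = sorted(values)
    window = ordered[int(len(ordered) * lo):int(len(ordered) * hi)]
    if not window:
        raise ValueError("too few values for a 30th-90th percentile window")
    return sum(window) / len(window)


def normalize(expression, target=200.0):
    """Scale background-subtracted levels so the trimmed mean equals `target`."""
    scale = target / trimmed_mean(expression.values())
    return {gene: level * scale for gene, level in expression.items()}
```

Because every sample is scaled to the same trimmed mean, expression units become comparable across samples while the extreme tails of each distribution do not drive the scaling factor.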
We compared the expression level of each transcript in the in vivo samples and each life-cycle stage from the previously reported 3D7 reference strain cell cycle transcriptome by use of Spearman's rank correlation coefficient. Genes were considered to be overexpressed in vivo when the in vivo expression levels were ≥ 2-fold greater than those derived from in vitro 3D7 samples with parasites of all life-cycle stages. The statistical significance of differences between the number of transcripts expressed in each gene family for each in vivo sample compared with in vitro 3D7 was tested by use of Fisher's exact test (Stata; version 7.0; Stata Corporation).
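The rank-based comparison and the 2-fold rule can be illustrated as follows. This is a sketch under stated assumptions rather than the analysis code of the study: the function names and the dictionary layout (gene identifier mapped to expression level, one dictionary per in vitro stage) are hypothetical, ties are given average ranks, and a gene that is absent from a stage is treated as having a level of 0 in that stage.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def overexpressed_in_vivo(in_vivo, in_vitro_by_stage, fold=2.0):
    """Genes whose in vivo level is >= fold x the level in every in vitro stage."""
    return sorted(
        gene
        for gene, level in in_vivo.items()
        if level > 0
        and all(level >= fold * stage.get(gene, 0.0)
                for stage in in_vitro_by_stage.values())
    )
```

Requiring the threshold to hold against every stage, and not only the ring stage, guards against calling a gene overexpressed merely because a sample contains a few parasites at a later stage.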
In addition to testing for significant differences in mRNA levels for individual genes between the in vivo and in vitro 3D7-derived RNA, genes grouped on the basis of function were evaluated. To do this, the gene ontology (GO) annotation (http://www.geneontology.org) and malarial metabolic pathways from PlasmoDB (http://www.plasmoDB.org) were downloaded into Genespring (version 6.1; Silicon Genetics). A list that contained genes that were at least 2-fold overexpressed, compared with expression in in vitro 3D7 samples, was generated for each in vivo sample. These lists were tested for overlap with malarial metabolic pathways and GO annotations by use of a hypergeometric distribution equation (Bioscript Library; version 2.0; Silicon Genetics), with P < .05 considered to be significant.
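The overlap test can be illustrated with the upper tail of the hypergeometric distribution. The sketch below is an assumption about the form of the test, because the exact equation implemented in the Bioscript Library is not given here; the function and argument names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, category_size, list_size, overlap):
    """P(X >= overlap) for X ~ Hypergeometric.

    population:    number of genes on the array
    category_size: number of those genes annotated to the pathway or GO term
    list_size:     number of genes in the overexpressed list
    overlap:       number of listed genes that carry the annotation
    """
    total = comb(population, list_size)
    upper = min(category_size, list_size)
    favorable = sum(
        comb(category_size, i) * comb(population - category_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total
```

A small P value indicates that the overexpressed list contains more members of the category than would be expected if the list had been drawn at random from all genes on the array.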
To confirm the overexpression of PF14_0752 in the in vivo samples, real-time PCR quantification of cDNA from in vivo and in vitro 3D7 samples was performed. cDNA was generated from the in vivo isolated RNA of samples 12 and 29 and of 2 additional samples, 35 and 43, that had been collected from children with similar symptoms of mild malaria. 3D7 RNA was obtained from sorbitol-synchronized parasites in in vitro culture containing 10% serum under standard conditions, and total RNA was isolated at ring stage by use of Tri Reagent BD. cDNA was synthesized from all samples by use of the SuperScript First-Strand Synthesis System (Invitrogen). One aliquot of total RNA from each sample was treated without RT. cDNA from these 4 in vivo samples and from samples with 3D7 ring-stage parasites was subjected to PCR with primers for PF14_0752 (forward, 5′-GAATTTAAAATGACGGAGGATTGTT-3′; reverse, 5′-AAGATCTAGTATGTTCGGTTTCATT-3′). PFB0120w was used as a loading control for parasite RNA, since it demonstrated similar expression under in vivo and in vitro conditions (forward, 5′-CAGCCCTCTTAGCTCTCAACTTC-3′; reverse, 5′-AGCAACAGCAGAGGCTATAGAACT-3′). Standard curves generated from genomic DNA were used to quantify cDNA in each sample, and results are reported relative to PFB0120w. Duplicate reactions were analyzed for each sample by use of real-time PCR with 1 μL of cDNA, gene-specific primers, and the fluorescent dye SYBR Green (SYBR Green PCR Master Mix; Applied Biosystems) in a 50-μL reaction volume. For each sample, real-time PCR quantification of cDNA generated without RT was subtracted as background. The reactions were performed by use of an ABI Prism 7700 sequence detector (Applied Biosystems).
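The quantification steps can be sketched as follows. This is an illustration under assumptions, not the calculation performed by the instrument software: it assumes a linear standard curve of threshold cycle (Ct) against log10 of the genomic DNA quantity, averages the duplicate reactions, subtracts the quantity measured in the reaction without RT, and floors the result at 0. All function names are hypothetical.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mx, my = sum(xs) / n, sum(cts) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, cts))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def quantity_from_ct(ct, curve):
    """Invert the standard curve to convert a Ct value into a quantity."""
    slope, intercept = curve
    return 10 ** ((ct - intercept) / slope)


def corrected_quantity(duplicate_cts, no_rt_ct, curve):
    """Mean quantity of the duplicates minus the no-RT background."""
    mean_q = (sum(quantity_from_ct(ct, curve) for ct in duplicate_cts)
              / len(duplicate_cts))
    return max(mean_q - quantity_from_ct(no_rt_ct, curve), 0.0)


def relative_expression(target_q, control_q):
    """Target quantity expressed relative to the loading control."""
    return target_q / control_q
```

Dividing the relative expression of an in vivo sample by that of the 3D7 ring-stage sample then gives a fold difference between the two conditions.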
To assess steady-state mRNA levels of P. falciparum in vivo, blood samples were obtained from 5 P. falciparum–infected patients with fever, symptoms compatible with mild malaria, and a parasitemia >1% who were evaluated at an outpatient clinic in Senegal, and the blood samples were analyzed by use of microarray-based hybridization (table 1). Microscopic examination demonstrated early ring-stage parasites in all blood smears. Sample 7 also contained rare schizont forms (<0.1% of parasite forms). Total RNA was isolated from the blood samples, fluorescently labeled, and hybridized to an array of high-density oligonucleotides.
Between 1872 and 2988 transcripts were detected in each of the in vivo samples (table 1). The expression level of each transcript was normalized between all samples and ordered from highest to lowest, to derive a transcriptome that reflects steady-state mRNA expression. Comparison of the rank correlation of the transcriptome in the in vivo samples (samples 6, 8, 12, and 29) with each life-cycle stage of the 3D7 strain grown in vitro demonstrated the highest similarity with early ring stages (correlation coefficient, 0.80–0.93) (figure 1). This is consistent with the observation that ring-stage parasites are predominant in peripheral blood of P. falciparum–infected patients. These 4 samples also had a high correlation with the 3D7 in vitro merozoite stage (correlation coefficient, 0.75–0.85) and the late schizont stage (correlation coefficient, 0.74–0.80). Sample 7 had a lower correlation with all life-cycle stages; its highest correlation was with the late schizont stage.
The results suggest that the majority of parasite genes expressed in in vitro 3D7 samples are also expressed in vivo. In addition, the steady-state levels of mRNA for most genes are similar. This analysis suggests that the in vivo biological processes of field isolates are very similar to those of the 3D7 strain grown in laboratory culture.
Although expression of the majority of genes was well correlated between the in vivo samples and in vitro 3D7 culture, we nevertheless were able to identify a number of parasite genes displaying a ≥ 2-fold expression level in vivo, compared with that in samples with all in vitro 3D7 life-cycle stages. The number of overexpressed genes ranged from 28 to 553 for each sample. These included genes that encode membrane proteins, proteins involved in metabolic processing, and hypothetical proteins (tables 2–7).
We wanted to determine whether the genes overexpressed in vivo could be correlated to any malarial metabolic pathways or specific gene functions. Therefore, the list of overexpressed genes from each in vivo sample was tested for overlap with the malarial metabolic pathways annotated in PlasmoDB and with gene function families by use of the GO annotations. Analysis of these differentially expressed genes did not reveal statistically significant overrepresentation of genes in any malarial metabolic pathways. However, overrepresentation was observed in the GO plasma membrane category for samples 8 (18 genes; P = 9.18 × 10⁻⁷) and 12 (19 genes; P = 3.2 × 10⁻⁷). This plasma membrane cellular component category includes genes that encode surface antigens such as rifins and vars that are known to play important roles in infection and pathogenesis.
We then examined individual genes that were commonly overexpressed in all in vivo samples, compared with expression in in vitro 3D7 samples. Three genes were overexpressed in all in vivo samples (table 2): RESA-2 (PF11_0512), a putative long-chain fatty-acid ligase (PFC0050c), and a stevor (PFD0065w). RESA-2 was not detected in any 3D7 stage or in previous in vitro studies of laboratory-adapted strains [18, 19]. However, RESA-2 transcript was detected in freshly obtained samples from infected patients in French Guiana. This gene is a homologue of RESA (PFA0110w), which is a ring-stage surface antigen and a potential vaccine candidate. Other genes overexpressed in 4 of the 5 samples include a putative diphosphate synthetase (MAL8P1.22) and 2 additional genes that encode rifins (PF14_0004 and PFI0030c).
Five of the 12 genes we identified as being overexpressed in vivo encoded hypothetical proteins that did not have significant homology to other proteins of known function. The National Center for Biotechnology Information TBLASTN program was used to further characterize these genes. TBLASTN analysis revealed that PF14_0752 demonstrated high amino acid homology to 9 other P. falciparum hypothetical proteins (figure 2). Eight of these 10 homologous genes are predicted to encode proteins with a single transmembrane domain. Nine of the 10 genes are located in the subtelomeric regions of chromosomes. Whole-genome single nucleotide polymorphism analysis showed that 7 of the 10 genes are within the top 10% of all genes with respect to allelic variation (C. Kidgell, J. Borevitz, J. Johnson, S. Volkman, D. Plouffe, K. Le Roch, D. Wirth, Y. Zhou, and E. Winzeler, unpublished data). This is consistent with polymorphic DNA sequences found in other genes that encode surface proteins. Recently described host targeting motifs that are present in proteins found at the red cell surface in infected erythrocytes are present in the majority of these sequences [24, 25]. The overexpression of PF14_0752 in vivo was verified by real-time PCR of cDNA from 4 in vivo samples (figure 3). An 8–28-fold overexpression was observed relative to that in samples with ring-stage 3D7 parasites. These results imply that this putative gene family encodes surface transmembrane proteins similar to genes in the GO function plasma membrane category.
The only GO function that is significantly overrepresented in the lists of genes that were ≥ 2-fold overexpressed in vivo contains genes that encode parasite proteins expressed at the red cell membrane. The members of these multigene families have polymorphic regions that are thought to play a role in immune evasion. To examine the expression pattern of these gene families in detail, we determined the presence or absence of transcript for each rifin, var, and stevor gene family member by use of methods described above in each in vivo sample and in the 3D7 ring stage (figure 4). The number of rifin gene family members detected was significantly greater in all of the in vivo samples, compared with that in the in vitro 3D7 samples (P < .001). Six rifins (PF10_0394, PF10_0402, PF11_0011, PFB0060w, PFI1810w, and PFC1100w) were exclusively expressed in vivo. The rifins comprise the largest surface variant gene family. Unlike var genes, where a single transcript is expressed as a protein, multiple rifins appear to be translated. It has been reported that rifin gene expression is more prevalent in wild-type isolates but is absent or faint in long-term cultured laboratory strains, which is consistent with the present data.
Similarly, there was a trend toward a greater number of stevor genes being overexpressed in vivo, with sample 8 reaching statistical significance (P = .021). Stevor PF14_0767 was uniquely expressed in 1 in vivo sample and was not detected in in vitro 3D7 samples with parasites of any 3D7 life-cycle stage. Stevor proteins, which are localized to Maurer's cleft, are under immune pressure, and in vitro studies have shown multiple transcripts per single-cell parasite [28–30]. These results could have been caused by the presence of multiple clones in some samples; however, the 3 samples (samples 6, 8, and 29) that had monoclonal parasites also contained a proportionally greater number of distinct genes in the rifin and stevor families that were expressed. Conversely, a greater number of var gene transcripts were detected in in vitro 3D7 samples, compared with that in samples 6 (P = .022), 8 (P = .036), and 29 (P = .007).
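The family-level comparisons reported above rely on Fisher's exact test. As a rough sketch, a 2 × 2 table can be built from the numbers of family members detected and not detected in an in vivo sample and in the in vitro 3D7 sample. This table layout is an assumption, the study itself used Stata, and the function name is hypothetical. The two-sided P value sums the probabilities of all tables with the same margins that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Assumed layout: a = detected in vivo,  b = not detected in vivo,
                    c = detected in vitro, d = not detected in vitro.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that equally likely tables are included.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)
```

An exact test is a natural choice here because the gene families are small and some cells of the table can contain very few genes.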
Sample 7 was distinctive, with a lower correlation to in vitro 3D7 samples with parasites of all life-cycle stages as well as to the other in vivo samples (figure 1). This sample contained late-stage parasites in the peripheral blood that were confirmed to be P. falciparum by detection of species-specific 18S rRNA (data not shown). This sample had the largest number of genes (553) that were at least 2-fold overexpressed, compared with in vitro 3D7 stage-specific transcriptomes. As with the other in vivo samples, there were no significant differences in any annotated malarial metabolic pathway, and the only GO function differences were also in the GO-annotated plasma membrane gene list (72 genes; P = 2.6 × 10⁻²¹). The number of rifin and stevor gene family members expressed in this sample was significantly greater than that in the in vitro 3D7 sample with a ring-stage parasite (P < .001) (figure 4).
The present study has demonstrated that sufficient amounts of RNA can be extracted from in vivo samples to support a whole-genome microarray analysis of P. falciparum gene expression. An excellent correlation between expression levels of parasite transcripts in in vivo samples and in in vitro 3D7 samples was found. More importantly, this analysis was sensitive enough to detect specific differences in expression in vivo and in vitro. This approach has demonstrated the overexpression of genes that encode surface proteins and has identified a putative novel gene family that encodes surface proteins that may play a role in in vivo parasite biological processes.
The analysis of genes differentially expressed in vivo and in vitro has identified virulence genes in bacterial systems [31, 32]. This approach was adopted to study the pathogenesis of malaria in vivo. The total number of parasite transcripts detected in a small blood sample approaches the number of transcripts reported to be expressed in in vitro cultivated 3D7 ring-stage parasites. For this first analysis of whole-genome in vivo steady-state mRNAs, the potential host effect on parasite selection and RNA expression was minimized through selection of samples from patients with similar demographics and disease presentation. A surprisingly high correlation between the transcriptomes in all the samples obtained from children and those in the in vitro 3D7 samples with ring-stage parasites was found by use of a rank-based correlation statistic. The high correlation between the in vivo samples and in vitro 3D7 samples with parasites in stages preceding the ring stage is notable. Although there are clearly defined morphologic differences between the schizont, merozoite, and ring stages, biological processes may be shared in contiguous stages. For example, developmental clustering between stages was noted for a set of 300 open-reading frames involved in host cell invasion in these stages [15, 33]. These data demonstrate the reproducibility of this method and validate the results.
The high correlation between the in vivo transcriptomes and the 3D7 ring-stage transcriptome, as well as the lack of significant difference in malarial metabolic pathways and most GO functions, is striking. This observation implies that the basic molecular processes of natural isolates and the 3D7 strain are highly conserved, consistent with the observation that the 3D7 strain can infect human volunteers and Anopheles mosquitoes in experimental settings.
The lack of overall diversity among the in vivo transcriptomes obtained from a homogenous cohort of patients suggests that distinct transcriptomes, when identified, could be informative. The transcriptome derived from the only adult patient had a lower correlation with the other in vivo samples and with 3D7 stage-specific transcriptomes. The presence of parasites in stages other than rings in the peripheral blood may account for some of these differences; however, the overexpressed gene list contains transcripts that are at least 2-fold more abundant than those for all 3D7 life-cycle stages, including the schizont stage. Similar to the other in vivo samples, there was no difference in genes that encode malarial metabolic pathways or GO functions, aside from the plasma membrane GO function. The presence of late forms, as detected by microscopy, has been shown to correlate with a more severe outcome, and it is intriguing that this was found in the sample with a distinct transcriptome [35, 36]. A larger study that enrolls patients with differences in age and disease severity will be necessary to interpret the significance of unique transcriptomes.
The specific overexpression of genes that encode surface proteins, hypothesized to be involved in immune evasion, is consistent with the biological processes of the in vivo environment, which is rich in immune cells and factors, compared with the in vitro environment. Previous work has demonstrated changes in expression of surface-expressed proteins in P. falciparum when the parasite is in prolonged culture or under biologic or immune selection [37–39]. Gene family members that are uniquely expressed in vivo may simply represent diversity of gene expression or be required for in vivo survival. Var genes were not found to be overexpressed in vivo. Because more-recent data examining geographically distinct strains (C. Kidgell, J. Borevitz, J. Johnson, S. Volkman, D. Plouffe, K. Le Roch, D. Wirth, Y. Zhou, and E. Winzeler, unpublished data) have found marked sequence polymorphism in the region from which the var probes used in our microarray analysis were derived, our detection methods may have resulted in an underestimation of the number of var transcripts present. Because parasite-encoded surface proteins interact with the host immune system, comprehensive analysis of genes in this functional class isolated from patients who demonstrate immunity, compared with those isolated from patients who do not demonstrate immunity, may provide insight into this critical aspect of the host-pathogen interaction.
A major goal of this work was to identify, for further analysis, genes that encode hypothetical proteins that may play a role in in vivo biological processes. One gene that encodes a hypothetical protein (PF14_0752) was identified as being overexpressed in 4 of the 5 in vivo samples. Further analysis revealed additional P. falciparum homologues that encode hypothetical proteins; these homologues were predominantly located at the telomeres, with the majority containing a predicted transmembrane domain, a vacuolar export/host targeting signal, and sequence polymorphism. Taken together, these characteristics suggest a new family of surface proteins, and analysis of these genes is under way to test this hypothesis.
Parasites residing in vivo are challenged with unique features that are not present under in vitro conditions. The in vivo environment contains immune factors, endothelial ligands, and variation in microenvironments secondary to sequestration in different organs. The molecular analysis of in vivo biological processes by means of this new approach will identify genes that are important for survival of the parasite in the human host and provide additional candidates for vaccine development, to lessen disease severity and provide immunity.
National Institute of Allergy and Infectious Diseases (5K23AI054518 to J.P.D.); Fogarty International Training Grant (5D43TW001503 to D.F.W.); Ellison Medical Research Foundation (New Scholars Award to E.A.W.).