Plasmodium falciparum is the most prevalent and lethal of the malaria parasites infecting humans, yet the origin and evolutionary history of this important pathogen remain controversial. Here, we developed a novel polymerase chain reaction based single genome amplification strategy to identify and characterize Plasmodium spp. DNA sequences in fecal samples of wild-living apes. Among nearly 3,000 specimens collected from field sites throughout central Africa, we found Plasmodium infection in chimpanzees (Pan troglodytes) and western gorillas (Gorilla gorilla), but not in eastern gorillas (Gorilla beringei) or bonobos (Pan paniscus). Ape plasmodial infections were highly prevalent, widely distributed, and almost always comprised of mixed parasite species. Analysis of more than 1,100 mitochondrial, apicoplast and nuclear gene sequences from chimpanzees and gorillas revealed that 99% grouped within one of six host-specific lineages representing distinct Plasmodium species within the subgenus Laverania. One of these from western gorillas was comprised of parasites that were nearly identical to P. falciparum. In phylogenetic analyses of full-length mitochondrial sequences, human P. falciparum formed a monophyletic lineage within the gorilla parasite radiation. These findings indicate that P. falciparum is of gorilla and not of chimpanzee, bonobo or ancient human origin.
Malaria is a blood infection caused by mosquito (Anopheles spp.) borne apicomplexan parasites of the genus Plasmodium1-3. Of five Plasmodium species known to infect humans, P. falciparum causes by far the greatest morbidity and mortality, with several hundred million cases of clinical malaria and more than one million deaths occurring annually1,2. While much progress has been made in the treatment and prevention of P. falciparum4, the origin and natural reservoir(s) of this and related plasmodial pathogens remain controversial. Until recently, the closest known relative of P. falciparum was a chimpanzee parasite, P. reichenowi, which was assumed to have diverged from its human counterpart at the same time as the ancestors of chimpanzees and humans, more than 5 Myr ago5-8. Within the past year, other closely related Plasmodium strains were detected in chimpanzees (Pan troglodytes), western gorillas (Gorilla gorilla), and bonobos (Pan paniscus), raising the possibility that P. falciparum in humans could have arisen as a consequence of cross-species transmission from one or more of these apes9-12. However, all of these studies were limited by an analysis of only few apes, many of which were captive and living in close proximity to humans. In addition, all prior studies employed non-limiting dilution polymerase chain reaction (PCR) amplification methods that were prone to generate artifactual mosaic sequences by recombination between genetically distinct templates. Here, we used conventional and single template PCR amplification methods to screen and analyze wild-living chimpanzee, bonobo and gorilla populations across sub-Saharan Africa for P. falciparum related parasites.
To determine the geographic distribution, species association and prevalence of ape Plasmodium spp. infections, we adapted a previously described PCR based diagnostic method10 to amplify a 956 bp fragment of Plasmodium cytochrome B (cytB) sequences from fecal DNA (Supplementary Fig. 1a). Ape fecal samples were selected from existing specimen banks that we had collected earlier for molecular epidemiological studies of simian retrovirus infections13-16. Except for 28 samples from one habituated gorilla community at the DS field site (Fig. 1), all other specimens were derived from nonhabituated apes living in remote forest areas (Supplementary Table 1). Chimpanzee (n=1,827), gorilla (n=805) and bonobo (n=107) samples were subjected to diagnostic PCR, and all amplification products were sequenced to confirm Plasmodium infection. In addition, a subset of samples (n=1,027), including all specimens from eastern gorillas and bonobos, was subjected to microsatellite analysis of host genomic DNA14-16 to determine the number of individuals tested at particular field sites (Supplementary Table 1). Microsatellite analysis also provided quantitative estimates of specimen integrity (Supplementary Table 2) and redundant sampling (Supplementary Table 3), thereby allowing us to determine the sensitivity of the non-invasive diagnostic test by identifying the proportion of PCR positive specimens from infected apes who were sampled more than once. From 32 such individuals, we estimated the test sensitivity to be 57% (Supplementary Table 4) and calculated the prevalence of ape infection at each field site (Supplementary Table 1). The results revealed widespread Plasmodium infection in chimpanzees and western gorillas, but not in eastern gorillas or bonobos.
Ape malaria parasites were detected at 32 of 45 chimpanzee collection sites, and at 17 of 20 western gorilla collection sites (Fig. 1), including every site where at least 10 individuals were estimated to have been sampled. Plasmodium infection was endemic in Nigeria-Cameroon (P. t. ellioti), central (P. t. troglodytes) and eastern (P. t. schweinfurthii) chimpanzees as well as in western lowland gorillas (G. g. gorilla), with estimated prevalence rates ranging from 32% to 48% (Table 1). The true infection rates are likely to be higher still, since Plasmodium detection in fecal samples can be expected to be less sensitive than detection in blood, as is the case for urine and saliva samples17. Although wild-living western chimpanzees (P. t. verus) and Cross River gorillas (G. g. diehli) were not tested in this study, these two subspecies have previously been shown to harbor Plasmodium parasites in the wild9,10. Based on these data, it is clear that chimpanzees and western gorillas represent substantial Plasmodium reservoirs. Surprisingly, we did not find this to be true for eastern gorillas or bonobos. Screening 71 and 58 members of these two species at multiple field sites, we failed to detect Plasmodium infection in any of them (Supplementary Table 1). These findings suggest that malaria parasites are rare or absent in some wild-living ape communities, possibly reflecting regional, ecological or seasonal differences in the distribution and/or host specificities of the transmitting mosquito vector(s). Additional field studies are needed to determine whether eastern gorillas and bonobos are infected by Plasmodium parasites at other locations or if they harbor divergent parasites not detected by current diagnostic assays.
To examine the evolutionary relationships of the newly identified Plasmodium parasites, we constructed phylogenetic trees for a subset of the diagnostic cytB sequences. This analysis showed that all sequences, except for one P. ovale-like strain, fell into one large monophyletic clade that also included P. reichenowi and P. falciparum (Supplementary Fig. 2). Parasites related to P. reichenowi and P. falciparum have previously been classified into a subgenus, termed Laverania, to distinguish them from more divergent Plasmodium species18. Our results thus indicated that parasites from this subgenus were common and widespread among wild ape populations. However, the topology of the Laverania clade was highly unusual, characterized by only few discrete clades and multiple sequences with very short branches attached to internal branches. Moreover, repeated PCR analysis of the same fecal samples yielded sequences that clustered variably in different parts of the tree (Supplementary Fig. 2). These findings indicated simultaneous infection with genetically diverse Plasmodium parasites and suggested that conventional (bulk) PCR amplification had generated in vitro recombinants. To examine this possibility, we re-analyzed the same Plasmodium positive fecal samples by single genome amplification (SGA), a molecular strategy that has been used extensively to characterize the genetic identity and quasispecies complexity of human and simian immunodeficiency viruses (HIV/SIV)19-23. Fecal DNA was diluted so that fewer than 30% of all PCR reactions yielded an amplification product, which ensured amplification of single Plasmodium templates in most reactions19-23. All amplicons were sequenced directly and sequences containing mixed bases indicative of more than one amplified template were discarded. 
Using this approach to characterize the genetic complexity of malaria parasites in fecal samples, we could eliminate both Taq polymerase-induced recombination (template switching) and nucleotide misincorporations in finished sequences, thereby ensuring an accurate representation of plasmodial variants as they existed in vivo21-23.
Fig. 2 depicts the phylogenetic relationships of a subset of SGA derived mitochondrial cytB sequences (the entire set of 697 sequences is analyzed in Supplementary Fig. 3). As in the corresponding tree of bulk PCR-derived sequences (Supplementary Fig. 2), all SGA derived sequences, except for seven P. ovale, P. vivax and P. malariae-like strains, grouped within the Laverania radiation. However, unlike in the bulk PCR tree, Laverania sequences in the SGA tree clustered in a strictly host species-specific manner, forming three chimpanzee (C1-C3) and three gorilla (G1-G3) specific clades, each supported by high bootstrap values. Interestingly, this host specificity did not extend to the subspecies level, since P. t. ellioti, P. t. troglodytes and P. t. schweinfurthii derived sequences were interspersed; however, cytB sequences from P. t. schweinfurthii segregated into distinct subclades within two of the three chimpanzee lineages (C1, C2), suggesting a phylogeographic distribution of certain Plasmodium variants (Supplementary Figs. 3a and b). None of 363 chimpanzee derived Plasmodium cytB sequences was closely related to human P. falciparum. Instead, all human sequences grouped within a single clade of parasites (G1) that infected western gorillas at numerous sites in Cameroon (LB, BB, CP, NK, BQ, DD, MM, LM), the Central African Republic (DS, ND) and the Republic of Congo (GT) (Fig. 2 and Supplementary Fig. 3d). A notable finding of the SGA analysis, which was obscured by bulk PCR analysis, was that most apes were co-infected with parasites representing multiple different plasmodial lineages, including variants from (i) the same Laverania clade, (ii) different Laverania clades, or (iii) Laverania and non-Laverania clades (Supplementary Table 5). 
Of 65 chimpanzee and 53 gorilla samples characterized, 48 (74%) and 37 (70%), respectively, harbored more than one genetically distinct parasite strain, and 36 (55%) and 23 (43%) contained members of two or more major Plasmodium clades (Supplementary Fig. 3). Given this high frequency of co-infection with divergent parasites, conventional recombination-prone PCR approaches are not appropriate for generating ape Plasmodium sequences for phylogenetic analysis. Moreover, previously reported ape Plasmodium sequences9-12 must be interpreted with caution since they were subject to these same confounding factors.
To test the robustness of the phylogenetic relationships depicted in Fig. 2, we used SGA to amplify additional genomic regions from cytB positive fecal samples, targeting loci in the mitochondrial, apicoplast and nuclear Plasmodium genomes. These regions included 390 bp of the caseinolytic protease C (clpC) gene (n=126), 772 bp of the lactate dehydrogenase (ldh) gene (n=46), and 3.4 kb (n=165) and 3.3 kb (n=127) fragments that together spanned the entire mitochondrial genome (Supplementary Fig. 1a). Phylogenetic analyses of each of these genomic loci revealed very similar topologies. In trees of clpC (Supplementary Fig. 4), ldh (Supplementary Fig. 5) and mitochondrial sequences (Supplementary Figs. 6 and 7), Laverania sequences formed the same number of chimpanzee (C1-C3) and gorilla (G1-G3) specific clades, albeit with some variations in the relationships among these lineages. Importantly, there was no evidence of recombination between chimpanzee and gorilla specific parasites, although many of them infected apes at the same field sites. This suggested that Laverania parasites are largely host specific (recombination between parasites infecting the same host species could not be assessed because of mixed Plasmodium infections). These findings, together with the extent of genetic diversity that distinguishes the various clades, argue strongly for the existence of six distinct Plasmodium species within the Laverania subgenus (Supplementary Figs. 3-8). Formal classification of these lineages must await additional taxonomic evaluation.
The new SGA-derived ape Plasmodium sequences call for a reassessment of the origin of human P. falciparum. Among over 600 sequences derived from ape samples spanning most of central Africa, we failed to find a single chimpanzee parasite that was sufficiently closely related to P. falciparum to represent a progenitor (Fig. 2 and Supplementary Figs. 3-8). Thus, P. reichenowi, as well as other chimpanzee Plasmodium species, can be excluded as precursors of P. falciparum. Instead, all new phylogenetic evidence points to a western gorilla origin of human P. falciparum (Fig. 2 and Supplementary Figs. 3-8). To investigate how often gorilla parasites might have colonized humans, we constructed phylogenetic trees from complete mitochondrial genome equivalents of the closest Plasmodium relatives of human P. falciparum (Fig. 3). In a tree of concatenated CytB, CoxI and CoxIII protein sequences (980 amino acids), all available human P. falciparum sequences (n=105) coalesced to a single common ancestor nested within the G1 clade of gorilla parasites (Fig. 3a). Nucleotide sequences from the remaining (non-coding) portions of the mitochondrial genome yielded a very similar topology, again showing that human P. falciparum formed a monophyletic lineage within the gorilla P. falciparum radiation (Fig. 3b). These findings, together with the observation that human parasites exhibit substantially less sequence diversity than the various ape Plasmodium species, including the closest gorilla relatives (Table 2), provide compelling evidence for a gorilla origin of human P. falciparum. Moreover, the monophyly of the human parasite sequences (Fig. 3) may indicate that all extant human strains evolved from a single gorilla-to-human cross-species transmission event. Notably, four recently reported Plasmodium sequences from captive bonobos11 also clustered closely with P. falciparum. However, unlike the gorilla sequences, the bonobo sequences were interspersed with the human sequences (Fig. 3).
This finding, together with the fact that the bonobo parasites encoded dihydrofolate reductase-thymidylate synthase (dhfr-ts) drug resistance mutations prevalent in the local human population11, suggests that the bonobos became infected with human parasites while housed in an urban sanctuary. In fact, the topologies in Fig. 3 are consistent with more than one human-to-bonobo transmission, although some (or all) of the substitutions that distinguish bonobo and human sequences could represent PCR misincorporations since they were not generated by SGA methods11.
Using single template amplification strategies and a much larger collection of ape specimens than previously analyzed, we show here that wild-living chimpanzees and western gorillas are naturally infected with at least nine Plasmodium species. Among more than 1,100 SGA derived mitochondrial, apicoplast and nuclear gene sequences from 80 chimpanzee and 55 gorilla samples, we found a total of nine sequences that were related to P. malariae, P. ovale or P. vivax (Supplementary Table 5). All others grouped within one of six chimpanzee- or gorilla-specific lineages representing distinct Plasmodium species, three of which had not previously been described. Importantly, all currently available human P. falciparum sequences comprised a single lineage nested within the G1 clade of gorilla parasites. This finding indicates that human P. falciparum is of gorilla origin, and not of chimpanzee9,10,12, bonobo11 or ancient human5 origin, and that all known human strains may have resulted from a single cross-species transmission event. What is still unclear is when gorilla P. falciparum entered the human population and whether present day ape populations represent a source for recurring human infection. It has been suggested that the limited levels of genetic diversity seen at many loci in human P. falciparum reflect a relatively recent selective sweep8. Our data suggest that this bottleneck or “Eve event” was instead the consequence of cross-species transmission of a gorilla parasite. It is difficult to time this event without reliable dates to calibrate the Plasmodium phylogenetic trees. Previous estimates of dates in the evolution of Plasmodium have relied largely on the belief that P. falciparum and P. reichenowi diverged at the same time as the ancestors of humans and chimpanzees6-8,24, an assumption that is now groundless. 
Others have proposed a much shorter time scale coincident with the emergence of agricultural societies in sub-Saharan Africa, the incomplete penetration of protective human gene polymorphisms (e.g., hemoglobin C) that are selected by P. falciparum infection, or the speciation of African mosquito vectors3,25. Complete sequence analysis of members of the ape Plasmodium species identified here may help to resolve this conundrum. The second question, whether additional cross-species transmissions of Laverania parasites have given rise to human infections, is more immediately approachable. An alignment of over 100 ape Plasmodium mitochondrial genome sequences reveals ape-specific single nucleotide polymorphisms (Supplementary Fig. 1b), which can now be used to screen plasmodial sequences from humans living in close proximity to wild gorillas and chimpanzees. Such studies can inform malaria eradication efforts about potential zoonotic Plasmodium reservoirs and provide insights into adaptive changes that might be required for ape Plasmodium infection of humans26.
For more detailed methods see Supplementary Methods.
Fecal samples from wild-living chimpanzees, gorillas and bonobos were selected from existing specimen banks13-16 based on their geographic location, available host genetic information, and species and subspecies origin (Supplementary Table 1).
Plasmodium mitochondrial, apicoplast and nuclear gene sequences were amplified as described9,10,27,28, but using modified primers and PCR conditions suitable for fecal DNA. Bulk PCR positive fecal samples were subsequently subjected to SGA analysis.
The prevalence rates of Plasmodium infection were estimated based on the proportion of PCR positive fecal samples, correcting for specimen degradation, repeat sampling, and the sensitivity of the diagnostic test.
We thank Severin Loul, Aime Mebanga, Bienvenue Yangda, and Florian Liegeois for field work in Cameroon; the Cameroonian Ministries of Health, Forestry and Wildlife, and Research for permission to collect samples in Cameroon; the Water and Forest Ministry for permission to collect samples in the Central African Republic; the Ministries of Science and Technology and Forest Economy for permission to collect samples in the Republic of Congo; the Ministry of Scientific Research and Technology, and the Department of Ecology and Management of Plant and Animal Resources of the University of Kisangani for permission to collect samples in the DRC; Mwanza Ndunda (Centre de Recherche en Ecologie et Foresterie), Sally Coxe (Bonobo Conservation Initiative), Albert Lokasola (Vie Sauvage), and Angelique Todd (Max Planck Institute for Evolutionary Anthropology) for logistical support; Richard Carter for helpful discussions; Maria Salazar, Yalu Chen and Barry Cochran for expert technical assistance; and Jamie White for artwork and manuscript preparation. This work was supported by grants from the National Institutes of Health (R01 AI50529, R01 AI58715, U19 AI 067854, R03 AI074778, T32 GM008111, T32 AI007245, P30 AI 27767), the National Science Foundation (0755823), the Agence Nationale de Recherche sur le Sida (12152/12182), the Great Ape Conservation Fund of the U.S. Fish and Wildlife Service, the Arthur L. Greene Fund, the Wallace Global Fund, the Bristol Myers Freedom to Discover Program, and the Wellcome Trust. RSR was supported by a Howard Hughes Medical Institute Med-into-Grad Fellowship.
To screen wild ape populations for Plasmodium infection, we selected 2,739 fecal samples from an existing bank of chimpanzee (P. troglodytes), western gorilla (G. gorilla), eastern gorilla (G. beringei), and bonobo (Pan paniscus) specimens previously collected for molecular epidemiological studies of simian immunodeficiency virus (SIVcpz and SIVgor)14-16,31 and simian foamy virus (SFVcpz)13. All of these specimens, except for 28 samples from a group of habituated western gorillas (Makumba group) at the DS field site, were collected from non-habituated apes living in remote forest areas. Fecal samples were first subjected to host mitochondrial DNA analysis to determine their species and subspecies origin13-16,31. A subset was then selected for host microsatellite analysis to determine the number of individuals at particular field sites (Supplementary Table 1). These included 198 chimpanzee samples from the GT field site, 189 eastern gorilla samples from the KE, LU and OP field sites, and 119 bonobo samples from the LK and KR field sites (Supplementary Table 2). For estimates of sample degradation (Supplementary Table 2) and oversampling (Supplementary Table 3), we also included microsatellite results that we had obtained earlier for specimens collected from central chimpanzees14-16 and western gorillas15 in Cameroon, as well as from eastern chimpanzees31 in the Democratic Republic of Congo.
Fecal DNA was extracted14 and used to amplify four (GT) or eight polymorphic microsatellite loci (KE, LU, OP, LK, KR) as described14,15. Amplification products were analyzed on an automated sequencer (Applied Biosystems) and sized using GeneMapper 4.0 (Applied Biosystems). For individual identification, samples were first grouped by field site and mitochondrial DNA haplotype. Within each haplotype, samples were then grouped by microsatellite genotypes, but allowing for one allelic mismatch between samples. Specimens were classified as degraded if they failed to amplify two or more (GT), or three or more (all other sites) microsatellite loci. Samples with evidence of DNA admixture (multiple peaks for the same locus) were discarded.
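The individual identification procedure described above can be illustrated with a minimal sketch. It assumes a simple greedy rule in which each sample joins the first existing group whose reference genotype differs by no more than one allele; the function names, the data layout and the greedy rule itself are illustrative assumptions and not the procedure actually used in this study.

```python
from collections import Counter, defaultdict


def allelic_mismatches(g1, g2):
    """Count allele differences across loci that amplified in both samples.

    Each genotype is a tuple with one entry per locus: a pair of allele
    sizes, or None if the locus failed to amplify (hypothetical layout).
    """
    total = 0
    for locus1, locus2 in zip(g1, g2):
        if locus1 is None or locus2 is None:
            continue  # skip loci missing in either sample
        # Multiset difference, so (150, 152) is treated like (152, 150)
        total += sum((Counter(locus1) - Counter(locus2)).values())
    return total


def group_individuals(samples, max_mismatch=1):
    """Greedily group samples into putative individuals.

    `samples` is a list of (sample_id, site, haplotype, genotype) tuples.
    Samples are first stratified by field site and mtDNA haplotype, then
    grouped by genotype, tolerating `max_mismatch` allelic mismatches.
    """
    by_stratum = defaultdict(list)
    for sample_id, site, haplotype, genotype in samples:
        by_stratum[(site, haplotype)].append((sample_id, genotype))

    individuals = []
    for members in by_stratum.values():
        clusters = []  # each cluster: (reference genotype, [sample ids])
        for sample_id, genotype in members:
            for ref_genotype, ids in clusters:
                if allelic_mismatches(ref_genotype, genotype) <= max_mismatch:
                    ids.append(sample_id)
                    break
            else:
                clusters.append((genotype, [sample_id]))
        individuals.extend(ids for _, ids in clusters)
    return individuals
```

Because the grouping is greedy, the outcome can depend on the order of samples when genotypes lie close to the mismatch threshold; a more careful analysis would compare all pairs within each haplotype group.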
Fecal samples were first screened for Plasmodium cytB sequences (956 bp) as described10, but using modified PCR conditions as well as a different second round reverse primer to generate an amplicon that was 166 bp longer. Nested PCR was performed using DW2 (5’-TAATGCCTAGACGTATTCCTGATTATCCAG-3’) and DW4 (5’-TGTTTGCTTGGGAGCTGTAATCATAATGTG-3’) in the first round, and Pfcytb1 (5’-CTCTATTAATTTAGTTAAAGCACA-3’) and PLAS2a (5’-GTGGTAATTGACATCCWATCC-3’) in the second round of PCR. For the first round, 2.5μl of fecal DNA was used in a 25μl reaction volume, containing 0.5μl dNTPs (10mM of each dNTP), 20pmol of each primer (DW2 and DW4), 2.5μl PCR buffer, and 0.25μl Expand Long Template enzyme mix (Roche). Cycling conditions included an initial denaturation step of 2 min at 94°C, followed by 15 cycles of denaturation (94°C, 10 sec), annealing (45°C, 30 sec), and elongation (68°C, 2 min), followed by 35 cycles of denaturation (94°C, 10 sec), annealing (48°C, 30 sec), and elongation (68°C, 2 min; with 15-second increments for each successive cycle), followed by a final elongation step of 10 min at 68°C. For the second PCR round, 1μl of the first round product was used in a 25μl reaction volume, containing 0.5μl dNTPs (10mM of each dNTP), 20pmol of each primer (Pfcytb1 and PLAS2a), 2.5μl PCR buffer, and 0.25μl Expand Long Template enzyme mix. Cycling conditions included an initial denaturation step of 2 min at 94°C, followed by 60 cycles of denaturation (94°C, 10 sec), annealing (52°C, 30 sec) and elongation (68°C, 1 min), followed by a final elongation step of 10 min at 68°C. Amplified products were gel purified and sequenced directly to confirm Plasmodium infection.
Samples positive for Plasmodium cytB sequences were then screened for apicoplast clpC (390 bp) and nuclear ldh (772 bp) sequences. Amplification of the clpC fragment was performed as described9, but using modified PCR conditions and a different second round reverse primer to generate an amplicon that was 117 bp longer. Nested PCR was performed using primers TFM1421+ (5’-AAAACTGAATTAGCAAAAATATTA-3’) and TFM1423RC (5’-CGAGCTCCATATAAAGGAT-3’) in the first round, and CLPCF1 (5’-TCTAAACAATTATTTGGTTCTG-3’) and CLPCR2 (5’-GTTAATCTATTTARTAATTCHGGTTTAA-3’) in the second round of PCR. Amplification of the ldh fragment was also performed as described28,32,33, but using different PCR conditions. Nested PCR was performed using primers JNB272 (5’-ATGGCACCAAAAGCAAAAAT-3’) and JNB273 (5’-GCCTTCATTCTSYTAGTTTCAGC-3’) for the first round, and LDH1 (5’-GGNTCDGGHATGATHGGAGG-3’) and Fv2n (5’-AACRASAGGWGTACCACC-3’) for the second round. PCR conditions were the same as described for the cytB fragment. Amplified products were gel purified and sequenced directly to confirm Plasmodium infection.
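Several of the primers listed above contain IUPAC degenerate base codes (for example W, R, S, Y, H, D and N), so each primer name denotes a pool of distinct oligonucleotides. The following sketch, which is an illustration and not part of the original protocol, enumerates the sequences in such a pool using the standard IUPAC nucleotide code table; the function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand_primer(primer):
    """List every non-degenerate sequence encoded by the primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(bases) for bases in product(*pools)]


if __name__ == "__main__":
    # PLAS2a contains a single W (A or T) and so encodes two sequences
    for seq in expand_primer("GTGGTAATTGACATCCWATCC"):
        print(seq)
    # LDH1 contains N, D, H and H: 4 * 3 * 3 * 3 = 108 sequences
    print(degeneracy("GGNTCDGGHATGATHGGAGG"))
```

Highly degenerate primers such as LDH1 dilute the effective concentration of any single matching oligonucleotide, which is worth keeping in mind when fixed primer amounts are used per reaction.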
Finally, cytB positive samples were subjected to nested PCR, aiming to amplify larger fragments (3.4kb and 3.3kb in length, respectively), which together spanned the entire Plasmodium mitochondrial genome (Supplementary Fig. 1a). The 3.4kb fragment was amplified using Pf936p (5’-GAGAAAAATGYAATCCWGTWACACAATA-3’) and DW4 in the first round, and Pf1031p (5’-GATGCAAAACATTRWCCTAATAAGTA-3’) and PLAS2a in the second round of PCR. The 3.3kb fragment was amplified using McytP (5’-TATCCAAATCTATTAAGTCTTG-3’) and Pf1916n (5’-GCGTTCGTTCTTATAGTGTAGGC-3’) in the first round, and Pf4450p (5’-CTGTTCCTATTATATGGTTTATGTGTGC-3’) and Pf1880n (5’-CCTTTAATGTAGTTTCCTCACAGCTT-3’) in the second round of PCR. For the first round of amplification, 2.5μl of fecal DNA was used in a 25μl reaction volume, containing 0.5μl dNTPs (10mM of each dNTP), 20pmol of each first round primer, 2.5μl PCR buffer, and 0.25μl Expand Long Template enzyme mix (Roche). Cycling conditions included an initial denaturation step of 2 min at 94°C, followed by 15 cycles of denaturation (94°C, 10 sec), annealing (45°C, 30 sec) and elongation (68°C, 4 min), followed by 35 cycles of denaturation (94°C, 10 sec), annealing (48°C, 30 sec) and elongation (68°C, 4 min; with 15-second increments for each successive cycle), followed by a final elongation step of 10 min at 68°C. For the second round of amplification, 2μl of the first round PCR product was used in a 50μl volume, containing 1μl dNTPs (10mM of each dNTP), 20pmol of each second round primer, 5μl PCR buffer, and 0.5μl Expand Long Template enzyme mix. Cycling conditions included an initial denaturation step of 2 min at 94°C, followed by 60 cycles of denaturation (94°C, 10 sec), annealing (52°C, 30 sec) and elongation (68°C, 4 min), followed by a final elongation step of 10 min at 68°C. Amplified products were gel purified, but only a small fragment was sequenced to confirm Plasmodium infection.
To derive sequences suitable for phylogenetic analyses, a subset of bulk PCR positive chimpanzee (n=80) and gorilla (n=55) fecal samples was subjected to SGA analyses21,22. According to a Poisson distribution, the DNA dilution that yields PCR products in no more than 30% of wells contains one amplifiable template per positive PCR more than 80% of the time. Fecal DNA was thus endpoint diluted in 96-well plates, and the dilution that yielded fewer than 30% positive wells was used to generate between 1 and 40 different SGA sequences per sample (Supplementary Table 5). The same primers and PCR conditions used for bulk amplification of cytB, mtDNA-3.4kb, mtDNA-3.3kb, clpC and ldh fragments were also used for SGA analyses. Amplification products were gel purified and sequenced directly, and the sequences were analyzed using Sequencher version 4.6 (Gene Codes Corporation). Sequences that contained double peaks as an indicator of more than one amplified template were discarded.
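The Poisson argument above can be checked with a short calculation. The sketch below assumes that amplifiable templates are distributed across wells according to a Poisson distribution with mean λ, so that the fraction of positive wells is 1 − e^(−λ) and the fraction of positive wells holding exactly one template is λe^(−λ) / (1 − e^(−λ)). The function name is illustrative.

```python
import math


def single_template_fraction(positive_fraction):
    """Fraction of positive wells expected to contain exactly one template,
    assuming templates are Poisson-distributed across wells."""
    if not 0.0 < positive_fraction < 1.0:
        raise ValueError("positive_fraction must be strictly between 0 and 1")
    # P(well is positive) = 1 - exp(-lam)  =>  lam = -ln(1 - positive_fraction)
    lam = -math.log(1.0 - positive_fraction)
    p_exactly_one = lam * math.exp(-lam)
    return p_exactly_one / positive_fraction


if __name__ == "__main__":
    # At 30% positive wells, about 83% of positives hold a single template
    print(f"{single_template_fraction(0.30):.3f}")  # prints 0.832
```

Lower positive fractions raise the single-template proportion further (about 95% at 10% positive wells), at the cost of more reactions per recovered sequence.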
To estimate the sensitivity of the diagnostic cytB PCR test, we determined the proportion of PCR positive specimens from Plasmodium infected apes that were sampled more than once on the same day. Other replicate samples were excluded since the duration of natural ape Plasmodium infections is unknown. The sensitivity of Plasmodium nucleic acid detection was then calculated as the fraction of positive tests per total number of samples tested (Supplementary Table 4). Including data from 32 such apes, we estimated the sensitivity to be 57% (with confidence limits determined assuming binomial sampling). It should be noted that this approach led to a systematic (albeit small) overestimation of the assay sensitivity, since it did not account for infected apes that yielded only negative replicate samples. Moreover, Plasmodium detection in fecal samples is very likely less sensitive than in blood, as is the case in urine and saliva17. Thus, the prevalence rates in Table 1 and Supplementary Table 1 should be interpreted as minimum estimates of Plasmodium infection rates in wild apes. The specificity of fecal Plasmodium detection was 1.00, since all amplification products were sequence confirmed.
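As an illustration of how a sensitivity estimate and its binomial confidence limits might be computed, the sketch below uses the Wilson score interval. The text does not state which binomial interval was used, so the Wilson interval is an assumption made here, and the counts in the usage example are hypothetical; the actual data are in Supplementary Table 4.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Point estimate and Wilson score interval for a binomial proportion.

    z=1.96 corresponds to approximately 95% confidence.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("require trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half_width = z * math.sqrt(
        p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)
    ) / denom
    return p, center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical counts: 40 PCR positive specimens among 70 same-day
    # replicate samples from apes known to be infected
    estimate, lower, upper = wilson_interval(40, 70)
    print(f"sensitivity {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The overestimation noted in the text arises because an ape enters this calculation only if at least one of its replicates was positive, so infected apes with only negative replicates are never counted in the denominator.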
For sites where the number of sampled chimpanzees was known (Supplementary Table 1), Plasmodium prevalence rates were estimated based on the proportion of infected individuals. For each ape, the probability that it would be detected as being infected, if it was truly infected, was calculated taking into consideration the sensitivity of the diagnostic PCR test and the number of samples analyzed, with 95% confidence limits determined assuming binomial sampling. For the remaining field sites where the number of sampled individuals was not known, prevalence rates were estimated based on the number of fecal samples, but correcting for specimen degradation and oversampling. As shown in Supplementary Table 2, microsatellite analysis of 1,027 fecal samples indicated an average degradation factor of 13%. Microsatellite analyses also provided a quantitative estimate of oversampling. Because of regional differences in sample collection, oversampling values were calculated separately for the different ape species and subspecies. As shown in Supplementary Table 3, central chimpanzees, western gorillas, eastern chimpanzees, eastern gorillas and bonobos were each assumed to have been sampled on average 1.77, 1.84, 3.74, 2.01 and 1.84 times, respectively. Using these corrections, the proportion of Plasmodium infected chimpanzees was estimated for each field site, again taking into account the sensitivity of the diagnostic test. From these determinations, prevalence rates and their confidence limits were calculated.
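The corrections described above can be sketched as follows. This is a simplified illustration under stated assumptions and not the exact procedure of the study (see Supplementary Methods): repeat samples from an infected ape are assumed to test positive independently with probability equal to the assay sensitivity, and the number of individuals at a site is approximated as the number of fecal samples, reduced by the degradation factor and divided by the oversampling factor. All function names are hypothetical.

```python
def detection_probability(sensitivity, n_samples):
    """Probability that an infected ape yields at least one positive sample,
    assuming independent tests with the given per-sample sensitivity."""
    return 1.0 - (1.0 - sensitivity) ** n_samples


def estimated_individuals(n_fecal_samples, degradation, oversampling):
    """Approximate number of individuals represented by a set of samples."""
    return n_fecal_samples * (1.0 - degradation) / oversampling


def corrected_prevalence(n_positive, n_individuals, sensitivity,
                         samples_per_individual):
    """Prevalence adjusted for imperfect detection, capped at 1.0."""
    p_detect = detection_probability(sensitivity, samples_per_individual)
    return min(1.0, n_positive / (n_individuals * p_detect))


if __name__ == "__main__":
    # Hypothetical site: 100 central chimpanzee fecal samples, using the
    # 13% degradation and 1.77-fold oversampling factors quoted in the text
    individuals = estimated_individuals(100, 0.13, 1.77)
    print(round(individuals, 1))
    # Two samples per ape raise the chance of detection well above 57%
    print(round(detection_probability(0.57, 2), 3))
```

Under these assumptions, an ape sampled twice would be detected with probability 1 − 0.43² ≈ 0.82, which is why sites with repeat sampling need a smaller upward correction than sites with single samples.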
Ape derived Plasmodium sequences were aligned with human and simian reference sequences using CLUSTAL W34. Sites that could not be aligned unambiguously were excluded. Trees were constructed from mitochondrial cytB sequences (956 bp, Supplementary Figs. 2 and 3; 240 bp, Supplementary Fig. 8), apicoplast clpC sequences (390 bp, Supplementary Fig. 4), nuclear ldh sequences (772 bp, Supplementary Fig. 5), and mitochondrial half genomes (3,361 bp, Supplementary Fig. 6; 3,277 bp, Supplementary Fig. 7). In addition, trees were constructed from mitochondrial coding (Fig. 3a) and non-coding regions (Fig. 3b). Deduced CoxI, CoxIII, and CytB protein sequences were concatenated into a single 980 amino acid sequence. The non-protein coding portion of the mtDNA-3.3kb fragment comprised 2,447 nucleotides following the removal of ambiguous sites. Phylogenetic trees were inferred using PhyML30. The class of evolutionary model was chosen using ModelTest35, and parameters were iteratively estimated in PhyML, using the GTR+I+G model for nucleotide and the LG+I+G36 model for amino acid sequence trees. Bootstrap values were calculated with 100 replicates37. Posterior probability values were calculated using MrBayes29, using an average standard deviation of partition frequencies < 0.01 as a convergence diagnostic. A neighbor-joining phylogenetic tree (Supplementary Fig. 8) was calculated with CLUSTAL W, using the Kimura 2-parameter model of evolution with bootstrap support based on 1,000 bootstrap replicates34,37.
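For readers unfamiliar with the Kimura 2-parameter model used for the neighbor-joining tree, the sketch below computes the corresponding pairwise distance, d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. This is an independent illustration of the standard formula and not the CLUSTAL W implementation; skipping gapped or ambiguous positions is an assumption made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Positions where either sequence has a gap or ambiguous base are skipped.
    math.log raises ValueError if the sequences are too divergent for the
    correction to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log(1.0 - 2.0 * p - q) - 0.25 * math.log(1.0 - 2.0 * q)
```

The model separates the two substitution classes because transitions typically accumulate faster than transversions, so an uncorrected proportion of differing sites would underestimate the true divergence.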
All new SGA derived ape Plasmodium sequences have been submitted to GenBank, with accession numbers listed in Supplementary Table 6.