This article outlines genome-scale approaches that can be used to identify mutations in malaria (Plasmodium) parasites that underlie drug resistance and contribute to treatment failure. These approaches include genetic mapping by linkage or genome-wide association studies, drug selection and characterization of resistant mutants, and the identification of genome regions under strong recent selection. While these genomic approaches can identify candidate resistance loci, genetic manipulation is needed to demonstrate causality. We therefore also describe the growing arsenal of available transfection approaches for direct incrimination of mutations suspected to play a role in resistance. Our intention is both to review past progress and highlight promising approaches for future investigations.
The successful management of drug resistance is key to effective malaria control. The spread of resistance to chloroquine (CQ) and sulfadoxine-pyrimethamine in Plasmodium falciparum effectively ended international efforts to eliminate malaria in the 1960s and 1970s, resulting in a resurgence of malaria-related deaths and a dampening of political will to control this disease. Since then, resistance has arisen to a succession of other antimalarial drugs, including mefloquine, quinine and atovaquone. The current treatments of choice for malaria are artemisinin-based combination therapies (ACTs). These consist of artemisinin derivatives, which kill parasites rapidly but have short half-lives in the blood, combined with a partner compound such as mefloquine, lumefantrine, amodiaquine or piperaquine, which have longer half-lives and clear the remaining parasites. The rationale behind the use of drug combinations is both to maximize treatment efficacy and to reduce the rate at which resistance arises and erodes that efficacy. The current success of ACTs has led to a renewed interest in malaria elimination. ACTs have now been introduced into the majority of malaria-endemic countries, and malaria mortality rates have been dramatically reduced in many regions. However, reduced clearance rates of parasites from the blood of artesunate or ACT-treated patients in western Cambodia now raise concerns regarding the emergence of artemisinin resistance [5–7]. Drug resistance is also a barrier to control of other human malarias. CQ resistance in Plasmodium vivax is now established and complicates treatment of this pathogen; similarly, resistance to sulfadoxine-pyrimethamine is widespread in P. vivax and Plasmodium malariae [9–11].
Understanding the genetic basis of antimalarial drug resistance has three major benefits. First, this knowledge can be used to track the spread of resistance alleles. If the mutations that underlie resistance are known, then PCR-based screening of infections can be used to map the distribution and rate of spread of resistance mutations and inform local treatment policies [12,13]. Without knowledge of the genetic basis of resistance, such decisions have previously been based on clinical drug efficacy trials that are costly and labor intensive. Second, the identification of the causal mutations provides a means to delineate mechanisms of drug action , and can suggest ways in which drugs may be modified to restore efficacy. Finally, discerning the genetic basis of resistance provides the tools needed to better understand disease evolution and to elaborate strategies to curtail resistance [15,16].
Key point mutations have already been identified in a variety of genes that affect levels of drug susceptibility in P. falciparum. These include the CQ resistance transporter (pfcrt) and the multidrug resistance gene (pfmdr1), which influence parasite susceptibility to CQ, quinine, lumefantrine and mefloquine. Similarly, mutations in the mitochondrial cytochrome b gene can mediate atovaquone resistance, and mutations in dihydrofolate reductase (dhfr) and dihydropteroate synthase (dhps) underlie resistance to the antifolate drugs pyrimethamine and sulfadoxine, respectively. However, with the possible exception of atovaquone resistance, known genes do not fully explain variation in drug response, suggesting that additional genes are involved [17,18]. For artemisinin, the genetic basis of tolerance, manifesting as reduced rates of parasite clearance from the blood, is unknown in P. falciparum, as is the basis of CQ resistance in P. vivax.
Classical genetics, utilizing genetic crosses and examination of segregation patterns of drug resistance phenotypes and genetic loci in the resulting progeny, has been extremely effective in studies of antimalarial drug resistance . Three crosses of P. falciparum have been completed. The most comprehensively analyzed is the cross between a Southeast Asian (Dd2) and a Central American (HB3) parasite . Remarkably, this single genetic cross resulted in the mapping and subsequent identification of mutations involved in a wide range of drug-related phenotypes (Table 1). In addition, crosses of 7G8 (South America) × GB4 (Ghana)  and 3D7 (Africa) × HB3 have recently been genotyped . Genetic crosses are cumbersome to perform with P. falciparum and require specialized facilities, as both parasite-infected mosquitoes and chimpanzees are needed. Furthermore, increased restrictions on the use of nonhuman primates severely restrict the implementation of this approach. In the case of P. falciparum, the parent lines must be cultured in vitro until gametocytes are produced and allowed to infect mosquitoes. After 2 weeks, the mosquitoes are then allowed to feed on splenectomized chimpanzees. The progeny of the cross can then be cloned by limiting dilution from infected chimpanzee blood, the recombinant parasites identified and the drug resistance phenotypes measured. The requirement for cloning independent progeny renders it impractical to apply classical genetics to the other human parasite species that cannot be cultured in vitro, although this restriction does not apply for rodent malaria models. Carlton et al. were able to clone 59 recombinant progeny in mice by limiting dilution from two Plasmodium chabaudi crosses .
In the P. falciparum cross between HB3 and Dd2, 35 genetically distinct progeny were recovered after the examination of over 1000 cloned parasites from chimpanzee blood. Similarly, 32 independent progeny were isolated from the 7G8 × GB4 cross, and 51 recombinants were obtained from the earlier 3D7 × HB3 cross. A striking feature of P. falciparum crosses is that quantitative trait loci (QTL) can be mapped to short regions of the genome. For example, Su et al. were able to map resistance to CQ to a 40-kb segment of chromosome 7 containing nine genes, while Yuan et al. were able to map genes for trimethoprim and triamterene resistance to a 59-kb region of chromosome 4 that harbored ten genes, including dhfr. This allows the regions harboring the primary determinants to be mapped with a precision that would be unattainable in human linkage studies, where QTL regions typically span 10–20 cM (equivalent to 10–20 Mb) and contain 50–200 genes. The high resolution of QTL mapping in P. falciparum is due to this parasite’s particularly high recombination rate (see Box 1).
The small numbers of independent progeny from genetic crosses in Plasmodium limit the statistical power, but these numbers are sufficient to detect major genes underlying a broad range of resistance phenotypes (Table 1). Isolation of greater numbers of progeny would both improve the power of genetic crosses and increase the resolution of QTL mapping. We note that HB3 features in both the 3D7 × HB3 and the Dd2 × HB3 cross; hence, both crosses can be combined in a single pedigree. Their joint analysis could be particularly valuable for mapping resistance phenotypes with low penetrance or complex multigenic traits.
Densely spaced markers are not essential for classical linkage mapping, because a single round of recombination separates the parents from each progeny. To date, RFLPs [20,27] or microsatellite maps have been used to localize chromosomal regions associated with resistance. These methods will soon be replaced by SNP-based markers, which are quicker and easier to genotype and score. SNP maps also contain more markers, and thus provide greater resolution to define the boundaries of genome blocks inherited from the two parents. A variety of microarray-based SNP genotyping platforms are now available for P. falciparum [28–30]. Similarly, a molecular inversion probe array containing 10,000 common P. falciparum SNPs is now available.
In addition to locating QTLs, classical linkage mapping also allows interactions between loci to be defined, because the phenotypes of progeny carrying one or more loci linked to resistance can be compared. Sá et al. described mapping amodiaquine resistance in two P. falciparum crosses. This study provided compelling evidence that mutant pfcrt and pfmdr1 alleles could combine to mediate amodiaquine resistance, and that some PfMDR1 haplotypes can influence the level of CQ resistance mediated primarily by mutant PfCRT. Similarly, Ferdig et al. identified interactions between QTLs that influence resistance to quinine.
Two recent studies have expanded the scope of classical linkage analysis in P. falciparum by incorporating high-throughput methods to measure resistance phenotypes. Yuan et al. measured in vitro drug resistance to 1279 chemicals (from the Library of Pharmacologically Active Compounds [LOPAC] collection of known bioactives) in the parents of the three genetic crosses and two additional parasite lines (D10 and W2). These authors identified 149 compounds that demonstrated differential responses in one or more of the crosses, and were therefore amenable to linkage analysis. For three exemplar drugs, the authors determined the levels of susceptibility in each progeny, clearly mapping the dihydroergotamine methanesulfonate response to pfmdr1 on chromosome 5, and the trimethoprim and triamterene response to dhfr on chromosome 4. Further mapping of responses to chemicals in this and other compound libraries has the potential to rapidly identify the genes that drive parasite susceptibility, and subsequently help define gene function. Intriguingly, Yuan et al. observed dramatic differences in the chemical response profiles of Dd2 and W2 parasites: W2 was more sensitive than Dd2 to most of the 149 drugs exhibiting differential activity. Dd2 was derived from W2 following selection with mefloquine and underwent pfmdr1 gene amplification, thereby implicating copy number variation in this gene as an important genetic change resulting in altered parasite susceptibility to these chemically diverse agents. In the second study, Gonzalez et al. examined QTLs for expression variation at 5150 transcripts in the progeny of the Dd2 × HB3 genetic cross, and identified 981 transcripts that mapped to QTLs. Remarkably, levels of 269 different transcripts encoded by genes at multiple locations in the genome mapped to a QTL on chromosome 5 that contained pfmdr1 and 13 other genes. This chromosome 5 region is amplified in the Dd2 parent and was associated with upregulation of 228 of the 269 transcripts in progeny that inherited the pfmdr1 segment from Dd2. This study underscores how drug-selected copy number alterations at a single locus can have dramatic consequences for expression throughout the genome.
Linkage group selection (LGS) represents an elegant new approach to genetic mapping that has recently been developed by groups working on rodent malaria models [35,36]. In LGS, uncloned populations of progeny parasites are selected using drugs (or other selection pressures) and are then compared with control progeny populations that remain unselected. Control and drug-treated pools of parasites are then genotyped across the genome using a quantitative typing method capable of measuring allele frequencies within each population. The central premise of this approach is that alleles involved in drug resistance will rise to a high frequency in populations treated with drugs relative to untreated control populations, while allele frequencies will remain essentially unchanged in control and treatment groups in genome regions distant from the resistance determinant(s) (Figure 1).
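The comparison at the heart of LGS can be sketched in a few lines. The example below is a minimal illustration, not the analysis used in the cited studies: it assumes each pool has been typed at the same markers and that, for each marker, we have the number of reads carrying the allele from the resistant parent and the total number of reads. The marker positions, read counts and function names are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical record: (marker position in kb,
#                       reads carrying the resistant-parent allele,
#                       total reads at the marker)
Marker = Tuple[int, int, int]


def allele_frequency(reads_allele: int, reads_total: int) -> float:
    """Frequency of the resistant-parent allele in one pool at one marker."""
    if reads_total <= 0:
        raise ValueError("marker has no reads")
    return reads_allele / reads_total


def lgs_scan(control: List[Marker], selected: List[Marker]) -> List[Tuple[int, float]]:
    """Return (position, frequency shift) for every marker.

    The shift is the allele frequency in the drug-selected pool minus the
    frequency in the unselected control pool.
    """
    if len(control) != len(selected):
        raise ValueError("pools must be typed at the same markers")
    scan = []
    for (pos_c, a_c, n_c), (pos_s, a_s, n_s) in zip(control, selected):
        if pos_c != pos_s:
            raise ValueError("pools must be typed at the same markers")
        shift = allele_frequency(a_s, n_s) - allele_frequency(a_c, n_c)
        scan.append((pos_c, shift))
    return scan


def peak_marker(scan: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Marker with the largest increase of the resistant-parent allele."""
    return max(scan, key=lambda item: item[1])


# Made-up read counts along one chromosome.
control = [(100, 52, 100), (200, 48, 100), (300, 50, 100), (400, 51, 100)]
selected = [(100, 55, 100), (200, 80, 100), (300, 97, 100), (400, 70, 100)]

scan = lgs_scan(control, selected)
peak = peak_marker(scan)
print(peak)
```

In this toy data set the shift peaks at the 300-kb marker and declines on either side, which is the pattern expected around a resistance determinant; markers far from it stay close to their control frequencies.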
To date, LGS has been exclusively used for mapping traits in rodent malaria parasites (P. chabaudi and Plasmodium yoelii) [35,36] and in yeast. In yeast, the method was termed extreme QTL analysis, as extreme phenotypes within the population were selected. Using this approach, genes underlying resistance to artemisinin, obtained by repeated passaging under drug pressure, were mapped in P. chabaudi (Table 1) [36,38]. Furthermore, candidate resistance mutations were identified by comparing the genome sequences of resistant parasites used in the crosses with the sensitive line from which they were derived.
Linkage group selection has a number of advantages over classical linkage analysis (Table 2). First, cumbersome cloning and identification of recombinants is not needed, and phenotyping is not necessary as resistant phenotypes are enriched by drug selection. Furthermore, costs of genotyping are reduced because populations, rather than individual progeny, are genotyped. Eliminating the need to clone progeny is particularly important, because this extends the range of Plasmodium species that are amenable to linkage analysis. For example, LGS provides a potentially powerful approach for mapping CQ resistance in P. vivax, which is logistically challenging using a classical genetics approach. Crosses should be achievable by propagating CQ-sensitive and -resistant parasites in nonhuman primates (such as macaques), feeding blood mixtures to mosquitoes, and subsequently, allowing those mosquitoes to infect additional animals, with the recombinant progeny produced via meiosis in the mosquito midgut. CQ selection could then be applied either in vivo, by treating the primates, or in vitro using a short-term culture. Similarly for P. falciparum, selection of crosses could be carried out using pooled asexual blood stage progeny in culture flasks (once the progeny have developed through the liver stage into the culture-amenable asexual blood stage forms). Eliminating the need to clone increases the numbers of recombinants that can be analyzed, thereby improving statistical power and narrowing QTL regions. LGS thus represents a particularly powerful approach to analyzing the genetic basis of complex multigenic phenotypes.
Effective LGS requires genotyping methods that can accurately measure allelic representation in pools of progeny parasites. To date, this has been performed using pyrosequencing or amplified fragment length polymorphisms (AFLPs) [35,36]. Next-generation sequencing provides an attractive genotyping approach for LGS, because allele frequencies within pools can be assessed by measuring sequence read depth. The Plasmodium genome can now be sequenced to approximately 60-times read depth with 108-bp paired end reads in a single lane of a flow cell on the Illumina (CA, USA) genome analyzer. Multiplexing of samples within lanes is feasible. Hence, just a single lane is sufficient to compare control and selected lines with approximately 30-times depth at a cost of approximately US$2000 (as of late 2010). In this way both the initial identification of QTL regions and the fine mapping of candidate genes and mutations could be achieved in a single step.
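To give a rough sense of what approximately 30-times depth means for pool genotyping, the sketch below computes the binomial sampling error of an allele frequency estimated from read counts at a single site. It assumes that reads are independent draws from the pool, which ignores sequencing error and amplification bias, and the function name is illustrative only.

```python
import math


def allele_freq_standard_error(true_freq: float, depth: int) -> float:
    """Binomial standard error of an allele frequency estimated from reads.

    Assumes each read is an independent draw from the pooled parasites.
    """
    if not 0.0 <= true_freq <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    if depth <= 0:
        raise ValueError("depth must be positive")
    return math.sqrt(true_freq * (1.0 - true_freq) / depth)


lane_depth = 60  # approximate depth for one sample per lane (from the text)
pools = 2        # control pool and drug-selected pool share the lane
depth_per_pool = lane_depth // pools

se_30 = allele_freq_standard_error(0.5, depth_per_pool)
se_60 = allele_freq_standard_error(0.5, lane_depth)
print(depth_per_pool, round(se_30, 3), round(se_60, 3))
```

Under these assumptions, an allele at 50% frequency is estimated with a standard error of roughly 0.09 at 30-times depth, so single-site estimates are noisy and large frequency shifts are the ones most readily detected.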
Genome-wide association studies (GWAS) analyze statistical associations between genetic markers and trait phenotypes in parasite samples derived from natural populations. While GWAS has considerable promise, this approach has not yet been widely used for Plasmodium. GWAS is now commonly used for studies of complex genetic disease in humans. Such studies typically use thousands of cases and controls to detect the genes involved. We suggest that GWAS of plants [40,41] may provide a better model for Plasmodium for the following reasons. First, Plasmodium are hermaphrodites with mixed mating systems involving both inbreeding and outbreeding, similar to many plants, and parasite population structures parallel those observed in certain plants. Second, drug resistance traits tend to involve major gene effects and causative alleles are at high frequency, as expected for phenotypic traits driven by strong selection, similar to domestication traits in crops such as maize. Third, most drug resistance phenotypes have a strong heritable basis, and can be accurately measured in the laboratory [45,46]. These three characteristics of drug resistance traits should allow successful GWAS of drug resistance in Plasmodium to be conducted with much smaller sample sizes than are typical for human GWAS, where traits of interest are often complex and difficult to measure accurately. For example, Wootton et al. were able to effectively map pfcrt on chromosome 7 using a panel of 80 parasites and microsatellite markers. Key features of Plasmodium population structure that are relevant to GWAS are summarized below.
The quality of phenotype data is critical to the success of both GWAS and linkage analysis. Drug resistance is measured by examining the rate of growth of cultured parasites exposed to different concentrations of drugs. Fluorophore-based labeling of parasites provides a means for rapid indirect measurement of parasite growth and development of high-throughput assays [48–50], and such methods are increasingly replacing more cumbersome approaches using radiolabeling. For P. falciparum, tissue culture-based propagation of parasites allows replicated measurement of drug resistance under standardized conditions, and precise quantification of experimental error. For other species such as P. vivax, long-term parasite culture is not possible and resistance must be measured in parasites collected directly from patients. This limits the possibility for replication, and there is also the concern that host red blood cell characteristics such as hemoglobinopathies may influence resistance measures. One study estimated the proportion of variation explained by genetic factors in resistance data generated using short-term culture of P. falciparum obtained directly from patients. Genetics explained a large proportion (49–79%) of the variance in resistance for lumefantrine, quinine and mefloquine. However, only 17–39% of the variance in susceptibility to artemisinin derivatives or CQ was attributable to genetics. By implication, nongenetic factors can explain much of the measured variation for these drugs.
Plasmodium parasite population structure varies dramatically between locations, with a spectrum that is similar to inbred plant populations [42,51]. This has important practical and statistical implications for association mapping (Table 3). African parasite populations characteristically have high transmission rates (as high as three infective bites per person per night), while South American and Southeast Asian populations have much lower levels of transmission (often <1 infective bite per person per year). As a consequence most infections in Africa contain multiple genotypes. To obtain single genotype infections for GWAS, either a large number of infections must be genotyped in order to identify rare hosts infected with single clones, or individual parasites must be cloned from multiple infections. By contrast, less than 50% of Southeast Asian and less than 10% of South American infections carry multiple genotypes, which greatly simplifies the collection of parasite samples. In this case, infections that contain a single predominant genotype can be identified by genotyping a small number of SNP or microsatellite markers [42,52].
The outcrossing rate is critical for GWAS because this determines the rate at which association between adjacent markers (linkage disequilibrium [LD]) is broken down by recombination. Populations with high rates of transmission and frequent polyclonal infections will tend to have higher levels of outcrossing than parasites from low transmission areas. LD is rapidly eroded by recombination in high-transmission areas, resulting in minimal LD. Accordingly, at a continental level, the decay of LD occurs more rapidly in African P. falciparum populations than in South American and Asian populations (Figure 2). For example, r², which measures the correlation between pairs of markers, decays to half its maximal value between markers spaced approximately 100 bp apart in many African populations, but extends to approximately 150 kb in South American populations (Figure 2) [29,51,53]. The extent of recombination and LD influences the marker density needed to detect associations with drug resistance loci, as well as the precision with which resistance genes can be located (Figure 3). In African populations, complete genome sequencing and the direct genotyping of causative loci may be needed, but localization of causal genes and mutations can be very precise. By contrast, genome regions underlying resistance may be detectable using sparse genotyping in South America, but narrowing these regions to specific candidate genes or mutations may be difficult.
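As a concrete illustration of the LD statistic discussed above, the following sketch computes the squared correlation between two biallelic markers from haploid genotypes, as would be obtained from single-clone blood-stage infections. It uses the standard definition, in which D is the difference between the observed haplotype frequency and the product of the two allele frequencies, and the squared correlation is D squared divided by the product of the allele-frequency variances at the two markers. The 0/1 allele coding and the example data are assumptions made for illustration.

```python
from typing import Sequence


def r_squared(locus_a: Sequence[int], locus_b: Sequence[int]) -> float:
    """Squared correlation between two biallelic markers.

    Each sequence holds one allele (coded 0 or 1) per haploid parasite
    genome, in the same isolate order for both markers.
    """
    if len(locus_a) != len(locus_b) or len(locus_a) == 0:
        raise ValueError("markers must be typed in the same, non-empty sample")
    n = len(locus_a)
    p_a = sum(locus_a) / n
    p_b = sum(locus_b) / n
    # Frequency of genomes carrying allele 1 at both markers.
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic in the sample")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denom


# Toy genotypes for four isolates.
complete_ld = r_squared([1, 1, 0, 0], [1, 1, 0, 0])  # alleles always co-occur
no_ld = r_squared([1, 1, 0, 0], [1, 0, 1, 0])        # alleles independent
print(complete_ld, no_ld)
```

Computing this statistic for all marker pairs and plotting it against physical distance gives an LD decay curve of the kind summarized in Figure 2.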
When drug resistance alleles spread through populations they carry along flanking alleles, generating islands of LD and purging variation. This genetic hitchhiking causes drug resistance alleles to show unexpectedly strong associations with nearby marker loci. Marker spacing, based on genome-wide LD between markers, may therefore be conservative. Hitchhiking has been well documented in Plasmodium drug resistance loci. In the case of both pfcrt and dhfr, genetic variation can be essentially nil for approximately 10 kb while variation is often reduced for more than 100 kb [15,47].
Population structure is important for GWAS, because deviations from random mating can generate false-positive associations. In P. falciparum there are substantial differences in allele frequencies between continents. The fixation index (FST), a standard measure of population differentiation, ranges from 0.24 to 0.43 in comparisons between Asia, Africa and South America. Population subdivision also differs dramatically between locations. Within Africa, parasite populations show limited spatial structure. However, Southeast Asian and South American populations often demonstrate stronger differentiation [42,54]. Furthermore, parasites sampled from a single location may contain representatives of two or more populations, which differ in both allele frequencies and in resistance phenotypes. Mu et al. provide an example of this from Cambodia. Their analyses revealed that a Cambodian population sample contained a set of parasites that grouped with parasites from Thailand, and a separate group that showed limited diversity. In addition to population subdivision, South American and Southeast Asian populations frequently exhibit a strong relatedness structure. For example, 27 groups of 2–8 parasite isolates were identical at more than 95% of 335 microsatellite markers in a sample of 185 Thai isolates. A total of 74% of parasites in this sample showed significant relatedness to one or more other isolates. The inclusion of isolates sampled from two or more populations differing in both phenotype and allele frequency inevitably results in spurious associations between resistance and genetic markers that play no role in resistance. Similarly, the sampling of multiple related organisms that share similar alleles and phenotypes will also drive spurious associations (Figure 4).
False-positive associations can be minimized both by sensible sampling and by statistical methods (Box 2). Ideally, samples should be collected from a single country over a limited time period. In addition, preliminary genotyping should be used to exclude polyclonal infections and to identify identical multilocus genotypes. Remaining bias due to sampling can be taken into account using statistical methods. Principal component analysis (PCA)-based methods allow for the identification of outlier genotypes, and statistical adjustment to account for differences in allele frequency. Powerful mixed-model approaches have been developed by plant geneticists [40,56]. These methods simultaneously adjust for both differences between and relatedness within populations. We believe this approach will prove to be extremely effective for Plasmodium, particularly in regions of low transmission. Finally, population structure tends to result in an excess of significant test results. This inflation can be quantified by comparing the expected and observed p-value distributions.
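One common way to compare observed and expected p-value distributions is the genomic inflation factor, often written λ. The text does not name a specific statistic, so this choice is an assumption made for illustration. The sketch converts each p-value to a one-degree-of-freedom chi-square statistic and divides the observed median by the median expected under the null hypothesis; values near 1 indicate little inflation. It uses only the Python standard library, and the function name and example p-values are hypothetical.

```python
from statistics import NormalDist, median
from typing import Iterable

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# obtained as the square of the 75th percentile of the standard normal.
_NULL_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values: Iterable[float]) -> float:
    """Ratio of observed to expected median chi-square (1 df) statistic."""
    chi2_stats = []
    for p in p_values:
        if not 0.0 < p < 1.0:
            raise ValueError("p-values must lie strictly between 0 and 1")
        # Two-sided p-value -> |z| score -> chi-square statistic with 1 df.
        z = _STD_NORMAL.inv_cdf(1.0 - p / 2.0)
        chi2_stats.append(z * z)
    if not chi2_stats:
        raise ValueError("no p-values supplied")
    return median(chi2_stats) / _NULL_MEDIAN_CHI2


# Evenly spaced p-values mimic a well-behaved null distribution.
null_like = [i / 1000 for i in range(1, 1000)]
# Squaring them shifts the distribution towards small p-values (inflation).
inflated = [p * p for p in null_like]

lambda_null = genomic_inflation(null_like)
lambda_inflated = genomic_inflation(inflated)
print(round(lambda_null, 3), lambda_inflated > 1.0)
```

A value well above 1 after sampling controls have been applied would suggest residual structure or relatedness that the PCA or mixed-model adjustments described above have not fully absorbed.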
Mu et al. have completed the only SNP-based GWAS of P. falciparum to date, but a flurry of additional studies is currently underway. Mu et al. assembled a collection of parasites from Southeast Asia (Thailand and Cambodia, totaling 146 samples), Africa (26 samples) and South America (14 samples), and measured in vitro resistance to seven antimalarial drugs. They genotyped all isolates using an Affymetrix (CA, USA) molecular inversion probe array designed to interrogate 3354 SNPs. Quantitative drug response data (taking into account population structure) were then analyzed using two different approaches. Strong associations were observed between parasite responses to CQ and mutant pfcrt, and between quinine and pfmdr1. In addition, responses to mefloquine and dihydroartemisinin were strongly associated with SNPs in a SURFIN gene on chromosome 1. While it is unlikely that a SURFIN is involved in drug resistance, a neighboring causal gene may be associated with SNPs in this locus. This landmark study provides some encouragement to those interested in association analyses, given that strong associations were observed with a relatively low density of polymorphic SNPs. In the Southeast Asian population studied there were 1216 polymorphic SNPs with a minor allele frequency of more than 0.2 (one polymorphic SNP every 19 kb). Complete genome sequences are optimal for GWAS, because the high density SNP information generated increases the power to detect associations, and causative SNPs may be genotyped directly. We expect that next-generation sequencing will increasingly replace indirect genotyping methods for Plasmodium GWAS.
Genome-wide association studies may generate a number of candidate gene regions, some of which will be false positives. How can we identify and prioritize candidate regions and weed out these false positives? Drug resistance alleles have spread recently within parasite populations owing to strong selection. In this situation, neutral mutations ‘hitchhike’ to high frequency with selected mutations, generating haplotypes that show unexpectedly high levels of LD [57,58]. Sequence variation can be examined across the genome to identify regions where characteristic hitchhiking has occurred. Importantly, selection for traits other than drug resistance may also generate such signatures, so not all genome regions under selection will contain drug resistance genes. However, if GWAS hits coincide with independent evidence for selection at a particular genome region, this provides a strong rationale for prioritization. Two statistical approaches are particularly useful for detection of recent positive selection within genomes.
When alleles spread rapidly through populations, flanking mutations hitchhike alongside selected bases and generate islands of LD. LD is expected to be strong around newly arisen mutations, but to break down rapidly over time. Hence, the observation that common alleles at a locus show extended regions of LD relative to other alleles strongly suggests the action of selection. Long-range haplotype (LRH) tests identify genome regions containing unexpectedly long haplotypes by examining the size of haplotype blocks surrounding different alleles at a locus [59,60]. Mu et al. used this approach in their genome scan . LRH tests identified selective events that localize to pfcrt, pfmdr1 and the SURFIN locus on chromosome 1, providing proof-of-principle for the utility of this approach. Furthermore, these results were replicated in the three populations, giving confidence in the results. In the case of pfcrt, resistant alleles were fixed in the Southeast Asian population, precluding successful identification of this locus using standard LRH tests. However, implementation of a cross population Extended Haplotype Homozygosity (XP-EHH) test, which compares haplotype structure between populations, provided evidence for the action of strong selection at this locus.
Comparisons of genetic structure can provide powerful tests for selection. This was first suggested by Lewontin and Krakauer who reasoned that loci involved in local adaptation should show greater levels of differentiation between populations than neutrally evolving loci in which allele frequencies are determined by genetic drift alone . As an example, this approach was effective at identifying known selected genes in comparisons of different dog breeds  and different ecotypes of the plant Arabidopsis . Malaria drug treatment frequently varies between countries. As a consequence, resistant alleles show elevated population differentiation relative to neutral alleles. Genome-wide comparisons of FST, a measure of genetic differentiation, provide a robust method of identifying genome regions involved in local adaptation, as illustrated by two malaria studies [64,65]. Similarly, longitudinal sampling of parasite populations provides a powerful approach to identifying drug resistance genes, as resistance alleles are expected to increase in frequency under drug pressure.
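A per-marker FST scan of the kind described above can be illustrated with a simple two-population estimator. The sketch below is an assumption-laden simplification: it applies Wright's definition, FST = (HT − HS)/HT, to the allele frequencies of a biallelic marker, weights the two populations equally, and applies no sample-size correction, so it is not the estimator used in the cited studies. Marker names and frequencies are invented.

```python
from typing import Dict, Tuple


def expected_heterozygosity(freq: float) -> float:
    """Expected heterozygosity 2p(1 - p) for a biallelic marker."""
    return 2.0 * freq * (1.0 - freq)


def fst_two_populations(freq_pop1: float, freq_pop2: float) -> float:
    """Wright's FST = (HT - HS) / HT for two equally weighted populations."""
    for freq in (freq_pop1, freq_pop2):
        if not 0.0 <= freq <= 1.0:
            raise ValueError("allele frequencies must lie between 0 and 1")
    mean_freq = (freq_pop1 + freq_pop2) / 2.0
    h_total = expected_heterozygosity(mean_freq)
    if h_total == 0.0:
        return 0.0  # marker is monomorphic in the combined sample
    h_within = (expected_heterozygosity(freq_pop1)
                + expected_heterozygosity(freq_pop2)) / 2.0
    return (h_total - h_within) / h_total


# Invented allele frequencies (population 1, population 2) at three markers.
markers: Dict[str, Tuple[float, float]] = {
    "marker_neutral_1": (0.50, 0.50),
    "marker_neutral_2": (0.40, 0.45),
    "marker_candidate": (0.90, 0.10),
}

fst_by_marker = {name: fst_two_populations(p1, p2)
                 for name, (p1, p2) in markers.items()}
outlier = max(fst_by_marker, key=fst_by_marker.get)
print(outlier, round(fst_by_marker[outlier], 2))
```

In a genome scan, markers whose FST falls far above the genome-wide distribution, like the invented candidate here, would be the ones flagged as possible targets of local selection.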
Candidate regions identified by GWAS and/or selection scans may contain a number of promising candidate SNPs, of which only a subset may influence resistance. Determining which SNPs should be targeted for transfection is therefore critical. Biological features provide one obvious criterion that can be used to identify putative causative SNPs. For example, nonsynonymous SNPs (causing amino acid changes) that reside within the active site of an enzyme would provide a more promising candidate polymorphism than synonymous SNPs. Similarly, derived SNPs that have arisen within P. falciparum following divergence from the related species Plasmodium reichenowi and that have spread to high frequency are particularly good candidates. High-frequency derived alleles are expected to be enriched in sites targeted by selection . Derived SNPs can be identified by comparing P. falciparum with P. reichenowi.
This approach proved valuable for identifying the key genetic determinant of CQ resistance, namely pfcrt. Initial work revealed a number of SNPs that distinguished wild-type and resistant alleles. Comparison of resistance alleles from four independent origins (Southeast Asia/Africa, two independent South America alleles and Papua New Guinea) demonstrated that pfcrt K76T was the only SNP that distinguished all CQ-resistant strains from CQ-sensitive parasites. The key role for K76T in CQ resistance was subsequently confirmed by transfection studies.
SNPs that show strong association with drug resistance phenotypes in GWAS, and that demonstrate evidence for strong recent selection, provide particularly attractive candidates. Evidence from multiple independent tests of selection can be combined to give such corroborative evidence a statistical basis. Grossman et al. applied this approach to localize SNPs that are targeted by selection in the human genome. Similarly, results from GWAS and selection scans can also be combined, using a similar total evidence approach.
Selection-based approaches allow drug-resistance determinants to be detected before resistance has evolved in the field. In vitro selection of drug-resistant P. falciparum is possible because this parasite can be propagated to very large numbers and maintained indefinitely under standard conditions of asexual culture. Selection is also possible in rodent parasites. Gene identification is made possible by incorporating sequencing, traditional genotyping tools and/or high-density oligonucleotide microarrays. Experimental selection can also help identify drug targets and unravel mechanisms of drug action, thereby benefitting the design of new or improved drugs.
The drug selection/molecular characterization approach to identifying drug resistance genes is only truly useful if the same genes are selected in both the laboratory and in the field. The results of selection experiments using pyrimethamine, CQ, mefloquine, halofantrine and atovaquone are interesting in this respect because the major genetic determinants of resistance in parasite isolates from endemic settings have now been clearly elucidated. The majority of the laboratory-based studies have identified mutations within the same genes as those selected in the field (Table 4). For example, mutations within pfcrt, dhfr and cytochrome b (cyt b) occur in independent selection experiments using CQ, pyrimethamine and atovaquone, respectively, while pfmdr1 copy number amplification is repeatedly observed under mefloquine selection. The same amino acids are affected in some cases. For example, residues 16, 108 and 164 of DHFR have been observed to mutate following selection with pyrimethamine. However, in other experiments, while the same loci are affected in vitro and in vivo, different amino acid changes can occur in the laboratory (Table 4). For example, atovaquone selection experiments have identified multiple mutations in cytochrome b that have not been observed in the field. Similarly, CQ selection (of an unusual CQ-sensitive line harboring all PfCRT mutations except the one at amino acid 76) resulted in a PfCRT K76I mutation rather than the K76T mutation that has arisen multiple times in the field. Likewise, while point mutations in dhfr underlie pyrimethamine resistance in nature, copy number amplification of this gene has been observed in selection experiments on several occasions. However, some studies have failed to find mutations in the expected genes. Lim et al. observed copy number amplification on chromosome 3 in CQ-selected parasites: deamplification resulted in loss of the resistance phenotype, suggesting a causal relationship. Similarly, two studies failed to observe copy number or sequence changes in pfmdr1 following mefloquine or halofantrine selection [72,73].
Overall these studies demonstrate that: laboratory selection experiments tend to accurately identify the genes that respond to selection in the field; the changes observed may differ from those occurring in nature; and there are limited numbers of pathways for the evolution of drug resistance. These results are encouraging, but at odds with results from some other organisms. Laboratory-selected resistance to insecticides tends to have a polygenic basis, while in the field this trait is generally monogenic or oligogenic (i.e., based on one or a few genes) [76,77]. Population size and the strength of selection are key factors determining the outcome of selection experiments, as mutations causing large phenotypic effects are rare in relation to mutations with small effects [44,75]. Hence, experimental selection regimens involving small numbers of organisms are likely to select for many mutations of relatively small effect, while selection regimens involving large populations are more likely to select for fixation of fewer mutations of large phenotypic effect. This can also explain the polygenic nature of laboratory-derived mosquito resistance to insecticides [75,78–80], as there is a limit to the size of mosquito colonies used in selection experiments. In comparison, Plasmodium selection experiments characteristically involve large numbers of parasites (typically 10^7–10^9 per flask).
Plasmodium falciparum selection experiments are often conducted by gradually increasing drug doses (step-wise selection) in relatively small parasite populations (Table 4). Because the selection of rare mutations conferring high-level drug resistance requires large population sizes, an alternative selection protocol termed ‘single-step selection’ has also been applied. This involves exposing large populations of parasites to high doses of drugs [81–83]. This type of selection approximates that occurring in drug-treated parasite populations in patients, and has been successfully used to select parasites resistant to CQ, 5-fluoroorotate and atovaquone [82–84]. In addition to population size, the success of drug selection experiments is determined by parasite mutation rate. Rathod et al. demonstrated that cultured P. falciparum lines can vary by up to 1000-fold in the rate at which they acquire drug resistance-conferring mutations. These authors reported that certain lines readily developed resistance to structurally and mechanistically unrelated compounds, a propensity termed ‘accelerated resistance to multiple drugs’. The use of parasites with a high mutation rate is one approach to increasing the probability of selecting for mutants.
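The dependence of selection success on population size and mutation rate can be made concrete with a back-of-the-envelope calculation. The sketch below assumes that resistance mutants arise independently at a fixed per-parasite rate, so that the number of mutants in a culture is approximately Poisson distributed. The function name and the rates used in the example are illustrative assumptions, not measured values.

```python
import math


def prob_at_least_one_mutant(population_size, mutation_rate):
    """Probability that a culture contains at least one resistant mutant.

    Assumes a Poisson model in which the expected number of mutants is
    population_size * mutation_rate.
    """
    if population_size < 0 or mutation_rate < 0:
        raise ValueError("population size and mutation rate must be non-negative")
    expected_mutants = population_size * mutation_rate
    # 1 - exp(-x), computed with expm1 for accuracy when x is small.
    return -math.expm1(-expected_mutants)


# Hypothetical rate of one resistance mutation per 10^9 parasites.
small_culture = prob_at_least_one_mutant(1e7, 1e-9)
large_culture = prob_at_least_one_mutant(1e9, 1e-9)
# The same small culture using a line with a 1000-fold higher rate.
mutator_line = prob_at_least_one_mutant(1e7, 1e-6)
```

Under these assumed numbers, a 100-fold larger culture or a 1000-fold higher mutation rate each raises the chance of recovering a mutant from about 1% to well over 50%, which is the rationale for single-step selection in large populations and for the use of high-mutation-rate lines.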
The genetic background may also be critical to the success of selection experiments, as illustrated by the work on CQ resistance. Multiple mutations in pfcrt, forming geographically distinct haplotypes, are associated with CQ resistance in parasites from diverse geographical origins . In vitro selection of resistant parasites bearing the full complement of mutations necessary for viable CQ resistance is therefore highly unlikely when starting from a wild-type fully sensitive parasite. However, by using a CQ-sensitive parasite line that carried all but one mutation (at position 76), selection of CQ-resistant parasites was achieved [66,84]. When conducting selection experiments, it is critical to start with a newly cloned parasite line to purge the accumulation of neutral mutations that occur during long-term culture, as such neutral mutations complicate the detection of causal mutations underlying adaptation to drug selection.
The ultimate goal of in vitro drug selection experiments is to generate resistant parasites that can be characterized for functional mutations. This has previously been carried out by inspecting candidate genes. However, genomic approaches (next-generation sequencing and tiling microarrays) now allow genome-wide searches for mutations that have occurred following selection. A recent study, in which next-generation sequencing uncovered functional variants acquired by a laboratory-selected yeast strain, clearly demonstrates the feasibility of this approach . This approach is well suited for P. falciparum because of the small size of its genome and its haploid nature, which facilitates sequencing. However, the major challenge is how to distinguish functional variants from false positives. One direct approach is to generate replicate selection lines, if possible, in order to identify mutations that have repeatedly occurred in the same codon, gene or biochemical pathway. These can then be prioritized for verification by classical Sanger sequencing. Transfection can then provide a definitive proof of the functional role of identified variants (see below).
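The replicate-line strategy can be sketched as a simple tally of the genes that are mutated independently in more than one selected line. The sketch assumes that variant calling has already produced a list of (gene, change) pairs for each replicate. The function name, gene names and changes are placeholders.

```python
from collections import defaultdict


def recurrent_genes(mutations_by_line, min_lines=2):
    """Return genes mutated in at least `min_lines` independent replicate lines.

    `mutations_by_line` maps a replicate line name to an iterable of
    (gene, change) pairs observed in that line after selection.
    """
    lines_hit = defaultdict(set)
    for line, mutations in mutations_by_line.items():
        for gene, _change in mutations:
            lines_hit[gene].add(line)
    return {
        gene: sorted(lines)
        for gene, lines in lines_hit.items()
        if len(lines) >= min_lines
    }


# Placeholder calls from three hypothetical replicate selections.
calls = {
    "rep1": [("geneA", "c.100A>T"), ("geneB", "c.55G>C")],
    "rep2": [("geneA", "c.230C>G")],
    "rep3": [("geneC", "c.12T>A"), ("geneA", "c.100A>T")],
}
prioritized = recurrent_genes(calls)
```

Genes hit in a single line only (here geneB and geneC) are more likely to be background or false-positive calls, whereas the recurrently mutated gene would be carried forward to Sanger verification and transfection. The same tally could be made at the level of codons or pathways.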
Elegant work using the rodent malaria parasite P. chabaudi provides proof-of-principle that the laboratory selection/sequencing strategy is feasible for Plasmodium (Figure 5) [38,86]. These studies used 36-bp single reads and 50-bp paired end reads on the Illumina Genome Analyzer to sequence parasite lines from a clonal lineage of P. chabaudi that had been exposed to selection with a succession of different drugs. This approach permitted the identification of point mutations occurring in lines selected with CQ, mefloquine, artemisinin, or sulfadoxine/pyrimethamine [38,87,88]. In these studies the genome regions containing putatively causal mutations were first defined using LGS or classical linkage mapping.
Tiling microarrays can also efficiently screen the parasite genome for copy number changes and point mutations. Dharia et al. used tiling microarrays to examine laboratory mutants showing resistance to fosmidomycin, leading them to identify a 100-kb amplification that contained 23 genes, including 1-deoxy-d-xylulose 5-phosphate reductoisomerase (pfdxr), the target of fosmidomycin. Recently, Rottmann et al. used this approach to identify the P-type cation-transporter PfATPase4 as the cause of laboratory-selected resistance to the promising drug class spiroindolones. They selected six replicate cultures of the parasite line Dd2 with this drug and examined mutations in the resulting resistant parasites using tiling arrays. All six resistant mutants contained nonsynonymous point mutations and/or copy number amplification at the pfatp4 locus, and direct causality was established by transfection. Similarly, Jiang et al. examined mutants derived following CQ and quinine selection experiments using both expression and comparative genomic hybridization microarrays. These authors observed multiple changes, including copy number amplification at pfmdr1, deletion of 15 genes from chromosome 10, and altered expression of ten genes including the vacuolar type H+ pumping pyrophosphatase 2 (PfVP2) as well as Ca2+/H+, drug/metabolite and lipid transporters.
The methods described in the earlier sections of this article allow the identification of candidate genes involved in drug response. Genetic manipulation represents a robust approach to conclusively determine whether these genes play a role in antimalarial drug resistance. In this section we describe the range of transfection methods available for Plasmodium, focusing on an appropriate use of these approaches to functionally analyze candidate resistance genes in P. falciparum (summarized in Table 5).
Since the first reports of transformation of Plasmodium asexual blood stages [91–94] were published, transfection technologies have developed into a versatile set of tools for the analysis of parasite gene function. In P. falciparum, DNA is introduced either by directly electroporating ring stage-infected erythrocytes, or by preloading erythrocytes that are subsequently infected (detailed protocols for both techniques can be found in , available from ). Transfection efficiency remains low, in the range of 10^−6, presumably because the transfected DNA has to cross four membranes to reach the nucleus. Owing to its AT-richness and repetitive nature, Plasmodium DNA is notoriously difficult to clone in Escherichia coli, and when plasmid size exceeds approximately 9.5 kb, plasmids often become unstable. All transfection approaches that require a gene to be cloned in its entirety are therefore technically challenging for larger candidate genes (approximately 3–4 kb; considering that most commonly used Plasmodium vector backbones are approximately 5–6 kb), although this depends on the individual gene sequence. If the entire sequence cannot be cloned, insights into gene function can be gained from expressing individual functional domains, or through approaches that do not require cloning the entire coding sequence (e.g., gene disruption or allelic exchange).
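The size constraint described above amounts to simple arithmetic, sketched below. The 9.5-kb stability limit and the 5–6-kb backbone sizes are the approximate figures quoted in the text; they are rules of thumb that depend on the individual sequence. The function names are hypothetical.

```python
def max_insert_bp(backbone_bp, stability_limit_bp=9500):
    """Largest insert (bp) that keeps the plasmid at or below the stability limit."""
    return max(0, stability_limit_bp - backbone_bp)


def fits_in_vector(insert_bp, backbone_bp, stability_limit_bp=9500):
    """True if insert plus backbone stays within the approximate stability limit."""
    return insert_bp <= max_insert_bp(backbone_bp, stability_limit_bp)


# Room left in a 5.5-kb and a 6-kb backbone under the approximate 9.5-kb limit.
room_small_backbone = max_insert_bp(5500)
room_large_backbone = max_insert_bp(6000)
```

A candidate gene longer than the remaining capacity would be a case for expressing individual domains, or for gene disruption or allelic exchange constructs that carry only a fragment of the coding sequence.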
Available positive selectable markers include human DHFR, Aspergillus terreus blasticidin S deaminase (bsd), neomycin phosphotransferase (neo) from transposon Tn5, and Streptomyces alboniger puromycin-N-acetyltransferase (pac). These confer resistance to WR99210, blasticidin, G418 and puromycin, respectively. Of these, human DHFR and bsd are the most widely used. In addition, negative selectable markers have recently become available, including Herpes simplex virus thymidine kinase (tk) and yeast cytosine deaminase/uracil phosphoribosyl transferase (cdup). These render parasites susceptible to the ‘suicide’ drugs ganciclovir and 5-fluorocytosine (5-FC), respectively. However, the usefulness of the tk system is limited by incomplete killing of parasites with a single marker gene copy, and a significant bystander effect on neighboring parasites that do not possess the marker gene.
The genetic background of the recipient strain can significantly impact results obtained from transfection experiments, and thus the choice of strain is critical. The cleanest experimental approach is to transfect candidate drug-resistance determinants into a drug-sensitive strain, and test for gain of resistance. This was achieved with pfcrt, whose mutant alleles (Dd2 and 7G8) were shown to confer CQ resistance in the GC03 background . Recently, a similar allelic exchange strategy with the 7G8 allele revealed that the degree of pfcrt-mediated CQ resistance was strain-dependent: in the D10 strain mutant pfcrt only conferred a tolerance phenotype, characterized by an ability to survive several generations of exposure to high concentrations of drug, and reduced susceptibility to the monodesethyl-CQ metabolite, contrasting with a barely altered IC50 value to the parent drug CQ . Similarly, mutations in the 3´ end of the pfmdr1 gene were found to affect the degree of CQ resistance in the 7G8 background but not in the 3BA6 background [104,105].
In P. falciparum, DNA is introduced in a circular form, so that even if integration is the ultimate goal, the plasmid will at first be maintained episomally. Growth rates of episomally transformed parasites are lower than those of parasites with an integrated selectable marker, as unequal segregation of the plasmid in mature schizont forms means that daughter cells will inherit different plasmid copy numbers and some daughter cells will not inherit any plasmid copies at all. This variability in plasmid copy number between individual cells is a major disadvantage of the episomal approach, as it renders the direct comparison of different parasite populations difficult. Nevertheless, episomal transformation can provide valuable insights into gene function. For example, episomal transformation provided the first evidence for the role of pfcrt mutations in CQ resistance . Episomal expression was also used to demonstrate the ability of human DHFR to mediate resistance to the potent antimalarial agent WR99210, illustrating functional complementation of the parasite dhfr target by the human ortholog . Resistance was lost upon sustained release of drug pressure leading to the attrition of episomes from the culture, providing an internal control that the resistance phenotype was a consequence of the introduced plasmid DNA. Episomal transformation can also be used to overexpress a candidate gene , if gene amplification is the suspected resistance mechanism. Finally, episomal transformation with an epitope-tagged version of a gene of interest offers a rapid way to determine protein localization in the cell . Of note, episomal plasmids are generally not maintained through the sexual and mosquito cycle.
In P. falciparum the integration of transfected DNA into the genome occurs infrequently, making the generation of stable genetically modified parasite lines an inefficient, slow and tedious process. Parasites tend to maintain the introduced plasmid DNA as large episomal concatemers. Long periods of culture with repeated rounds of cycling off and on drug are therefore required to obtain integration and to rid parasites of persistent episomes, so that the rare integrant parasites can be successfully selected. Integration predominantly occurs by single crossover homologous recombination and only rarely by double crossover homologous recombination.
Allelic exchange remains the ‘gold-standard experiment’ to prove that mutations in a candidate gene cause a certain phenotype. For example, replacement of a wild-type/drug-sensitive allele of a gene with a mutated allele from a drug-resistant strain should decrease parasite susceptibility to the drug in question. This strategy allows the comparison of different alleles expressed from the same, endogenous promoter within the same genetic background. Importantly, integration is usually achieved by single crossover homologous recombination, so that the endogenous allele is not actually removed from the genome. The allelic exchange strategy should therefore be designed so that integration of the allelic exchange plasmid renders the endogenous allele nonfunctional (by separating it from its promoter and/or truncating the coding sequence) (Figure 6A).
While technically challenging, this approach has been tremendously useful in the study and confirmation of the genetic basis of drug-resistance in P. falciparum. The first to use allelic exchange as a means to study drug resistance determinants were Triglia and colleagues who showed that mutations in the pfdhps gene confer sulfadoxine resistance . Later, a series of allelic exchange experiments unequivocally demonstrated that mutations in the pfcrt gene can confer verapamil-reversible CQ resistance  and that this phenotype was critically dependent on presence of the K76T polymorphism . Allelic exchange also confirmed that pfmdr1 mutations could modulate the degree of parasite susceptibility to a wide variety of drugs [104,105].
In P. falciparum, gene disruptions have been traditionally achieved by single crossover homologous recombination (Figure 6B) . For example, a single crossover strategy was used to disrupt one of the two copies of pfmdr1 in the drug-resistant P. falciparum strain, resulting in increased parasite susceptibility to mefloquine, lumefantrine, halofantrine, quinine and artemisinin . Similarly, gene disruption by a single crossover recombination implicated a role for P. falciparum multidrug-resistance-associated protein (PfMRP) in the transport of glutathione and several antimalarial drugs .
A major drawback of the single crossover approach is the fact that the gene of interest is not actually removed from the genome. Homologous sequences can thus theoretically recombine again and restore the wild-type locus. This means that drug pressure has to be maintained even after obtaining a pure, clonal gene-disrupted parasite line. However, with the recent development of negative selectable markers it has now become possible to select for rare double crossover recombination events and thus achieve a true gene deletion (Figure 6C) . In a groundbreaking study, Maier and colleagues used this approach to successfully delete a staggering 53 genes encoding proteins exported into the host red blood cell .
A major restriction to the functional analysis of Plasmodium genes by gene targeting is presented by the haploid nature of the asexual blood stages, in which transformation and selection take place. This precludes the functional analysis of genes that are essential at this stage of development, or even of genes whose disruption results in severe growth defects. Repeated failure to disrupt a gene in well-controlled experiments can in fact provide supporting evidence for an essential role in the asexual blood stages – ideally, these experiments should demonstrate that the chromosomal gene locus is accessible to targeting if the gene in question is provided at the same time on a separate complementation plasmid .
Another approach to study the function of a gene of interest is the downregulation of its expression by decreasing mRNA stability through genetic truncation of the 3´-UTR. For example, single crossover-mediated exchange of the endogenous pfcrt 3´-UTR with a truncated version (known to reduce expression levels based on a luciferase reporter assay) resulted in a 30–40% decrease in PfCRT expression levels, and rendered the recombinant parasites less resistant to CQ . Integration of the same truncated pfcrt 3´-UTR in the pfnhe-1 locus had similar effects on PfNHE-1 expression levels and, in some genetic backgrounds, quinine resistance . This approach is particularly suitable for the functional analysis of genes that are refractory to disruption owing to their essential nature in the asexual blood stages.
The Bxb1 mycobacteriophage integrase system enables the rapid integration of plasmid-borne sequences into the P. falciparum genome. This occurs via site-specific recombination between chromosomal attB and plasmid-contained attP sites, which is mediated by the Bxb1 integrase provided on a helper plasmid  (Figure 6D). Using this system, genetically and phenotypically homogeneous recombinant parasite populations can be generated within as little as 18–24 days, making this system an attractive alternative to conventional transgene expression systems. The recombinant lines usually contain single copy integrations and are genetically stable, as the attL and attR sites resulting from the recombination event are asymmetric and thus refractory to excision by the integrase. The only limitation to this approach is that attB receptor sites must first be engineered into the genome using conventional methodologies (by either single or double crossover homologous recombination). However, several parasite lines containing the attB target sequence within the nonessential cg6 gene (in 3D7 and Dd2) , or subtelomeric var gene (in the cytoadherent A4 line)  have already been generated and are available from the Fidock laboratory, NY, USA.
Apicomplexa lack naturally occurring transposons, but the lepidopteran transposable element piggyBac has recently been adapted to function in P. falciparum. The piggyBac transposase, transiently expressed from a helper plasmid, mediates the integration of a drug selectable marker flanked by inverted terminal repeats into a TTAA target sequence. Integration occurs randomly, at a transformation efficiency of up to 10^−3, and is stable in the absence of the transposase, making this system particularly useful for transgene expression, for trapping genes or gene elements (reviewed in ), and for random mutagenesis. In a ‘tour de force’ of 81 independent transfections, Balu and colleagues used this system to obtain 177 unique mutant clones, almost all with single insertions, which were distributed throughout the genome. However, the tendency for integration to occur in the 5´-UTR, and not the coding region itself, means that a vast number of mutants will be required to achieve the near saturation of the genome required for a random mutagenesis screen.
The recently developed FK506 binding protein (FKBP) destabilization domain system is particularly suitable for the expression of transgenes that have deleterious effects on parasite growth. The addition of a mutant version of human FKBP12 (FK506-binding protein 12, F36V L106P) to either the N- or C-terminus of a protein results in the degradation of the fusion protein, unless it is being protected from degradation by Shld1, an analog of the FKBP12-ligand rapamycin . This allows for the suppression of transgene expression during transfection and selection of recombinant parasites. The FKBP system is functional both for episomally expressed and integrated transgenes, the latter also enabling conditional knockdown approaches through addition of the FKBP domain to an endogenous gene .
As an alternative to the FKBP system, a tetracycline-regulated system can be used to control expression of transgenes. In this system, transgene expression is activated by an artificial Toxoplasma gondii transactivator, which binds tet operator elements fused to a minimal promoter. In the presence of anhydrotetracycline, the transactivator cannot bind to the tet operator and transgene expression is rapidly, but reversibly, downregulated. Of note, low levels of anhydrotetracycline are nontoxic to the parasite even for prolonged periods. This allows the transfection of parasites with deleterious transgenes, as anhydrotetracycline can simply be added continuously to the culture medium to suppress transgene expression throughout the time required to select transfected parasites.
Transfection technologies have also been developed for several model malaria parasites, including the nonhuman primate malaria parasites Plasmodium knowlesi (reviewed in ) and Plasmodium cynomolgi , and the rodent malaria species Plasmodium berghei (discussed below), P. yoelii [125,126] and P. chabaudi . Transient transfection has also been reported for P. vivax but remains of limited practical value in the absence of a continuous in vitro culture system for this important human pathogen .
To date, transfection technologies are most advanced for P. berghei. In this species integration of transfected, linearized DNA occurs rapidly, efficiently and almost exclusively by homologous recombination. Combined with a transfection efficiency of 10^−2 to 10^−3 (when using a recently improved protocol) [129,130], this allows relatively fast and straightforward epitope tagging (often through single crossover homologous recombination) or disruption (usually achieved by double crossover homologous recombination) of genes of interest. Indeed, clonal double-crossover knockout lines can be obtained in as little as 4 weeks, allowing higher-throughput functional analysis. This has resulted in the disruption of several dozen genes to date (a database containing information on genetically modified rodent malaria parasite lines can be found at [132,202]). A recently developed conditional knockout method, based on the Flp recombinase-mediated excision of a FRT-flanked gene sequence during the mosquito stages, also enables the functional analysis, at other stages of development (notably the mosquito and liver stages), of genes that are essential for asexual blood stage growth. Transgenes can also easily be stably inserted into the ssu-rrna target sequence of the functionally redundant c or d rrna genes. Finally, the allelic exchange of a P. berghei gene with its P. falciparum ortholog offers another exciting approach to study gene function that is particularly attractive for stages that are technically difficult or ethically impossible to obtain from the human malaria parasite [135,136].
One limitation of the P. berghei model is that drugs used for selection must not be toxic to the rodent host. This has allowed at most two consecutive rounds of genetic manipulation and drug selection, first using T. gondii dhfr-ts that confers resistance to pyrimethamine, then human DHFR that confers resistance to both pyrimethamine and WR99210 . However, the development of the negative selectable marker yeast cytosine deaminase/uridyl phosphoribosyl transferase (yFCU) now allows marker recycling , and recent dramatic improvements of transfection technologies also make it possible to use green fluorescent protein combined with fluorescence activated cell sorting as a selection procedure [129,130], further extending the usefulness and versatility of this malaria model system.
Genetic technologies have advanced rapidly since the first reports of Plasmodium transfection in 1995. Yet allelic exchange and knockout experiments in P. falciparum are still plagued by the requirement to introduce circular DNA and screen for rare events of homologous recombination. Understanding the molecular basis of why P. falciparum cannot be transfected by linear DNA to directly obtain double crossover events, and why P. berghei can, would greatly facilitate efforts to expedite the genetic manipulation of P. falciparum. Similarly, new DNA delivery techniques are required to permit whole-genome mutagenesis screens, and there is a striking need to identify new systems to stably propagate AT-rich Plasmodium DNA, as Escherichia coli is often unsuitable.
We list a number of research priorities in the executive summary. Here, we focus on one of these – the problem of delayed parasite clearance following ACT treatment – that is particularly urgent. Slow clearance of P. falciparum following treatment with artemisinin has now been confirmed in western Cambodia . Furthermore, this trait clearly has a genetic basis as genetically related parasites from different patients tend to show similar clearance rates . The spread of alleles conferring the delayed clearance to neighboring Southeast Asian countries or sub-Saharan Africa would be a major setback to control efforts. Identification of the genetic determinants of delayed clearance is a priority if we are to effectively manage this problem and prevent the spread of alleles underlying this phenotype.
A major challenge now is to identify a robust phenotype associated with delayed clearance that can be measured in the laboratory, because available data show that standard measures of resistance involving growth inhibition assays do not correlate with clearance times. Robust phenotype assays that can be replicated in cultured parasites would allow for the mapping of the determinants of delayed clearance using classical linkage mapping approaches or GWAS. Transcriptome analysis shows promise in this respect. Transcriptional signatures that are predictive of the delayed clearance phenotype could be used in lieu of parasitological phenotypes [Bozdech Z, Pers. Comm.] for genetic mapping purposes, just as cholesterol levels provide a useful proxy phenotype for studies of heart disease in humans.
If replicated laboratory measurement of this phenotype proves difficult, two alternative approaches might prove useful. First, GWAS studies may be possible using the delayed clearance phenotype itself. The recent report that parasite genetics explains more than 50% of the variance in clearance rate is encouraging in this respect, although confidence intervals are large . The accuracy with which this trait can be measured in the field could also be improved by a pseudoreplication strategy. In Cambodia, identical parasite genotypes are commonly sampled from different patients. Inclusion of multiple identical genotypes does not contribute to the power of GWAS studies, but averaging clearance rates across like genotypes can improve accuracy of phenotype measurement.
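The pseudoreplication strategy can be sketched as a grouping step carried out before association testing: clearance measurements from patients infected with the same parasite genotype are averaged, so that each genotype contributes one, more precisely estimated, phenotype value. The sketch assumes that each record is a (genotype identifier, clearance rate) pair. The identifiers, values and units below are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_clearance_by_genotype(records):
    """Average clearance rates across patients carrying identical parasite genotypes.

    `records` is an iterable of (genotype_id, clearance_rate) pairs, one per patient.
    Returns a dict mapping each genotype to (mean rate, number of patients).
    """
    grouped = defaultdict(list)
    for genotype_id, clearance_rate in records:
        grouped[genotype_id].append(clearance_rate)
    return {g: (mean(rates), len(rates)) for g, rates in grouped.items()}


# Invented example: genotype "G1" was sampled from three patients.
patients = [
    ("G1", 4.0),
    ("G1", 6.0),
    ("G1", 5.0),
    ("G2", 3.0),
]
phenotypes = mean_clearance_by_genotype(patients)
```

Keeping the number of patients per genotype alongside the mean allows well-replicated genotypes to be weighted more heavily in downstream analysis, should that be desired.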
Second, scans for selection may be effective at identifying genome regions involved in delayed clearance. Clearance phenotypes differ dramatically between western Cambodia and neighboring countries in Southeast Asia, while gene flow between these locations is otherwise high [31,64,65]. We would therefore expect alleles conferring delayed clearance to be at a high frequency in western Cambodia, but at much lower frequencies in neighboring countries with faster rates of clearance. Systematic scans for genome regions showing high FST or use of the XP-EHH statistic provide two approaches to identify genome regions exposed to strong selection in Cambodia. The rapid and effective mobilization of available genomic methods is essential to rapidly determine the genetic basis of this phenotype and thus aid in preventing its spread.
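A scan of this kind can be sketched for a single biallelic SNP using a basic Wright-style estimator, FST = (HT − HS)/HT, computed from allele frequencies in two populations. This is a deliberately simple form that weights the two populations equally and applies no sample-size correction; published scans typically use corrected estimators. The function names, SNP labels and frequencies below are hypothetical.

```python
def fst_two_populations(p1, p2):
    """Basic FST for one biallelic SNP from allele frequencies in two populations."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie between 0 and 1")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # SNP is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def high_fst_snps(frequencies, threshold):
    """Return (snp, FST) pairs at or above `threshold`, highest FST first.

    `frequencies` maps a SNP label to (frequency in population 1,
    frequency in population 2).
    """
    scored = [(snp, fst_two_populations(p1, p2)) for snp, (p1, p2) in frequencies.items()]
    hits = [(snp, fst) for snp, fst in scored if fst >= threshold]
    return sorted(hits, key=lambda item: -item[1])


# Hypothetical frequencies in western Cambodia versus a neighboring country.
freqs = {
    "snp_1": (0.9, 0.1),    # strongly differentiated
    "snp_2": (0.3, 0.3),    # no differentiation
    "snp_3": (0.55, 0.45),  # weak differentiation
}
outliers = high_fst_snps(freqs, threshold=0.5)
```

Because gene flow between these locations is otherwise high, most SNPs would be expected to resemble snp_2 or snp_3, and the rare loci resembling snp_1 would be candidates for regions under local selection.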
The authors received funding support from the NIH: R01 AI048071 and R01 AI075145 (to Tim Anderson) and R01 AI50234 and 085584 (to David Fidock). David Fidock also gratefully acknowledges funding support from the Medicines for Malaria Venture (Geneva). Andrea Ecker is currently supported by a long-term fellowship from the International Human Frontier Science Program Organization.
Financial & competing interests disclosure
The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
No writing assistance was utilized in the production of this manuscript.
Papers of special note have been highlighted as:
of considerable interest