Although asexual reproduction via clonal propagation has been proposed as the principal reproductive mechanism across parasitic protozoa of the Leishmania genus, sexual recombination has long been suspected, based on hybrid marker profiles detected in field isolates from different geographical locations. The recent experimental demonstration of a sexual cycle in Leishmania within sand flies has confirmed the occurrence of hybridisation, but knowledge of the parasite life cycle in the wild still remains limited. Here, we use whole genome sequencing to investigate the frequency of sexual reproduction in Leishmania, by sequencing the genomes of 11 Leishmania infantum isolates from sand flies and 1 patient isolate in a focus of cutaneous leishmaniasis in the Çukurova province of southeast Turkey. This is the first genome-wide examination of a vector-isolated population of Leishmania parasites. A genome-wide pattern of patchy heterozygosity and SNP density was observed both within individual strains and across the whole group. Comparisons with other Leishmania donovani complex genome sequences suggest that these isolates are derived from a single cross of two diverse strains with subsequent recombination within the population. This interpretation is supported by a statistical model of the genomic variability for each strain compared to the L. infantum reference genome strain as well as genome-wide scans for recombination within the population. Further analysis of these heterozygous blocks indicates that the two parents were phylogenetically distinct. Patterns of linkage disequilibrium indicate that this population reproduced primarily clonally following the original hybridisation event, but that some recombination also occurred. This observation allowed us to estimate the relative rates of sexual and asexual reproduction within this population, to our knowledge the first quantitative estimate of these events during the Leishmania life cycle.
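As an illustration of the linkage disequilibrium measure referred to above, the following is a minimal sketch, not the authors' pipeline. It assumes unphased diploid genotypes coded as alternate-allele counts (0, 1 or 2) per isolate, and uses the squared Pearson correlation between two loci as a common approximation of r². The function names and example genotypes are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A locus with no variation carries no LD information.
        raise ValueError("monomorphic locus")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def ld_r2(genotypes_a, genotypes_b):
    """Approximate r^2 between two loci from genotype dosages (0/1/2)."""
    return pearson_r(genotypes_a, genotypes_b) ** 2


# Hypothetical genotypes for six isolates at three loci.
locus_1 = [0, 1, 2, 0, 1, 2]
locus_2 = [0, 1, 2, 0, 1, 2]   # identical pattern: complete LD with locus_1
locus_3 = [2, 0, 1, 1, 2, 0]   # shuffled pattern: weaker LD with locus_1

print(ld_r2(locus_1, locus_2))
print(ld_r2(locus_1, locus_3))
```

In a largely clonal population, r² would be expected to stay high between distant loci, whereas recombination lets it decay with distance; that contrast is what makes LD patterns informative about the mode of reproduction.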
Sexual reproduction is predicted to be a rare event in Leishmania parasites, as evidenced by detection of rare parasite hybrids in natural populations using molecular methods. Recently, a sexual cycle has been detected experimentally in parasites within the sand fly vector (that transmits this pathogenic microorganism to mammalian species including man, causing human leishmaniasis). In this study, we have used whole genome sequencing to investigate genetic variation at the highest level of resolution in Leishmania parasites isolated from sand flies in a defined focus of leishmaniasis in southeast Turkey. Using a range of analytical tools, we show that variation in these parasites arose following a single cross between two diverse strains and subsequent recombination between the progeny, despite mainly clonal reproduction in the parasite population. We have thus been able to derive quantitative estimates of the relative rates of sexual and asexual reproduction during the Leishmania life cycle for the first time, information that will be critical to our understanding of the epidemiology and evolution of this genus.
Defining mechanisms by which Plasmodium virulence is regulated is central to understanding the pathogenesis of human malaria. Serial blood passage of Plasmodium through rodents [1-3], primates [4] or humans [5] increases parasite virulence, suggesting that vector transmission regulates Plasmodium virulence within the mammalian host. In agreement, disease severity can be modified by vector transmission [6-8], which is assumed to ‘reset’ Plasmodium to its original character [3]. However, direct evidence that vector transmission regulates Plasmodium virulence is lacking. Here we utilise mosquito transmission of serially blood passaged (SBP) Plasmodium chabaudi chabaudi [9] to interrogate regulation of parasite virulence. Analysis of SBP P.c. chabaudi before and after mosquito transmission demonstrates that vector transmission intrinsically modifies the asexual blood-stage parasite, which in turn modifies the elicited mammalian immune response, which in turn attenuates parasite growth and associated pathology. Attenuated parasite virulence associates with modified expression of the pir multi-gene family. Vector transmission of Plasmodium therefore regulates gene expression of probable variant antigens in the erythrocytic cycle, modifies the elicited mammalian immune response, and thus regulates parasite virulence. These results place the mosquito at the centre of our efforts to dissect mechanisms of protective immunity to malaria for the development of an effective vaccine.
The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host–parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria.
Chromatin diminution is the programmed elimination of specific DNA sequences during development. It occurs in diverse species, but the function(s) of diminution and the specificity of sequence loss remain largely unknown. Diminution in the nematode Ascaris suum occurs during early embryonic cleavages and leads to the loss of germline genome sequences and the formation of a distinct genome in somatic cells. We found that ~43 Mb (~13%) of genome sequence is eliminated in A. suum somatic cells, including ~12.7 Mb of unique sequence. The eliminated sequences and location of the DNA breaks are the same in all somatic lineages from a single individual, and between different individuals. At least 685 genes are eliminated. These genes are preferentially expressed in the germline and during early embryogenesis. We propose that diminution is a mechanism of germline gene regulation that specifically removes a large number of genes involved in gametogenesis and early embryogenesis.
Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.
Genome assembly; validation; evaluation
Genome projects now produce draft assemblies within weeks thanks to advanced high-throughput sequencing technologies. For milestone projects like E. coli or H. sapiens, teams of scientists were employed to manually curate and finish these genomes to a high standard. Nowadays, this is not feasible for most projects and the quality of genomes is generally of a much lower standard. This protocol describes software (PAGIT, post-assembly genome-improvement toolkit) to improve the quality of draft genomes. It offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence, and exploit reference genomes (if available) for improving scaffolding and generating annotations. The protocol is most accessible for bacterial and small eukaryotic genomes (up to 300 Mb), such as pathogenic bacteria, malaria and parasitic worms. Applying PAGIT to an E. coli assembly takes approximately 24 hours: it doubles the average contig size and annotates over 4300 gene models.
Next generation sequencing; automatic finishing; gap closing; genome annotation; contig ordering
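To make the assembly statistics mentioned above concrete (for example, average contig size), here is a minimal sketch that summarises a list of contig lengths. It is not part of PAGIT or REAPR; the function name, the returned keys and the example lengths are hypothetical. N50 is included because it is a widely used companion to the mean.

```python
def contig_stats(lengths):
    """Summarise contig lengths: count, total bases, mean size and N50.

    N50 is the length of the contig at which the running total, taken over
    contigs sorted from longest to shortest, first reaches half of the
    total assembly size.
    """
    if not lengths:
        raise ValueError("no contigs supplied")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "count": len(ordered),
        "total": total,
        "mean": total / len(ordered),
        "n50": n50,
    }


# Hypothetical contig lengths (in bases) before and after gap closing.
before = [2, 3, 4, 5, 6, 10]
after = [9, 11, 10]          # same 30 bases joined into fewer, longer contigs

print(contig_stats(before))
print(contig_stats(after))
```

Comparing the two summaries shows how joining contigs raises the mean contig size while leaving the total assembly size unchanged.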
The cell surface of Trypanosoma brucei, like many protistan blood parasites, is crucial for mediating host-parasite interactions and is instrumental to the initiation, maintenance and severity of infection. Previous comparisons with the related trypanosomatid parasites T. cruzi and Leishmania major suggest that the cell-surface proteome of T. brucei is largely taxon-specific. Here we compare genes predicted to encode cell surface proteins of T. brucei with those from two related African trypanosomes, T. congolense and T. vivax. We created a cell surface phylome (CSP) by estimating phylogenies for 79 gene families with putative surface functions to understand the more recent evolution of African trypanosome surface architecture. Our findings demonstrate that the transferrin receptor genes essential for bloodstream survival in T. brucei are conserved in T. congolense but absent from T. vivax and include an expanded gene family of insect stage-specific surface glycoproteins that includes many currently uncharacterized genes. We also identify species-specific features and innovations and confirm that these include most expression site-associated genes (ESAGs) in T. brucei, which are absent from T. congolense and T. vivax. The CSP presents the first global picture of the origins and dynamics of cell surface architecture in African trypanosomes, representing the principal differences in genomic repertoire between African trypanosome species and provides a basis from which to explore the developmental and pathological differences in surface architectures. All data can be accessed at: http://www.genedb.org/Page/trypanosoma_surface_phylome.
The African trypanosome (Trypanosoma brucei) is a single-celled, vector-borne parasite that causes Human African Trypanosomiasis (or ‘sleeping sickness’) throughout sub-Saharan Africa and, along with related species T. congolense and T. vivax, a similar disease in wild and domestic animals. Together, the African trypanosomes have significant effects on human and animal health and associated costs for socio-economic development in Africa. Genes expressed on the trypanosome cell surface are instrumental in causing disease and sustaining infection by resisting the host immune system. Here we compare repertoires of genes with predicted cell-surface expression in T. brucei, T. congolense and T. vivax and estimate the phylogeny of each predicted cell-surface gene family. This ‘cell-surface phylome’ (CSP) provides a detailed analysis of species-specific gene families and of gene gain and loss in shared families, aiding the identification of surface proteins that may mediate specific aspects of pathogenesis and disease progression. Overall, the CSP suggests that each trypanosome species has modified its surface proteome uniquely, indicating that T. brucei, T. congolense and T. vivax have subtly distinct mechanisms for interacting with both vertebrate and insect hosts.
Schistosome infection begins with the penetration of cercariae through healthy unbroken host skin. This process leads to the transformation of the free-living larvae into obligate parasites called schistosomula. This irreversible transformation, which occurs in as little as two hours, involves casting the cercaria tail and complete remodelling of the surface membrane. At this stage, parasites are vulnerable to host immune attack and oxidative stress. Consequently, the mechanisms by which the parasite recognises and swiftly adapts to the human host are still the subject of many studies, especially in the context of development of intervention strategies against schistosomiasis. Because obtaining enough material from in vivo infections is not always feasible for such studies, the transformation process is often mimicked in the laboratory by application of shear pressure to a cercarial sample, resulting in mechanically transformed (MT) schistosomula. These parasites share remarkable morphological and biochemical similarity with their naturally transformed counterparts and have been considered a good proxy for parasites undergoing natural infection. Relying on this equivalency, MT schistosomula have been used almost exclusively in high-throughput studies of gene expression, identification of drug targets and identification of effective drugs against schistosomes. However, the transcriptional equivalency between skin-transformed (ST) and MT schistosomula has never been proven. To compare these two types of schistosomula preparations and to explore differences in gene expression triggered by the presence of a skin barrier, we performed RNA-seq transcriptome profiling of ST and MT schistosomula at 24 hours post transformation. We report that these two very distinct schistosomula preparations differ only in the expression of 38 genes (out of ∼11,000), providing convincing evidence to resolve the long-standing skin vs. mechanical controversy.
Schistosomiasis is an endemic parasitic disease affecting ∼200 million people in the most socioeconomically deprived regions of the world. Human infection occurs during water contact where free-living larvae called cercariae penetrate host skin and become parasitic organisms called schistosomula. This stage represents the first encounter of the parasites with the host and is also regarded as one of the most vulnerable stages of the parasite's life cycle. Therefore, schistosomula are the focus of many studies, many of which look at changes in the expression of genes as a way of understanding the process of infection and identifying potential drug targets and vaccine candidates. Because collecting enough parasitic material from natural infections is not possible for certain types of studies (for example, gene expression studies), a mechanical transformation of the cercariae into schistosomula is often used instead and assumed to be a good proxy for the natural transformation process. However, the equivalency of gene expression profiles between naturally transformed parasites and their mechanically transformed counterparts has never been studied. In this report, we analyse differences in gene expression patterns between these two different parasite preparations and provide enough data to resolve a long-standing controversy.
The cost of whole-genome sequencing (WGS) is decreasing rapidly as next-generation sequencing technology continues to advance, and the prospect of making WGS available for public health applications is becoming a reality. So far, a number of studies have demonstrated the use of WGS as an epidemiological tool for typing and controlling outbreaks of microbial pathogens. Success of these applications is hugely dependent on efficient generation of clean genetic material that is free from host DNA contamination for rapid preparation of sequencing libraries. The presence of large amounts of host DNA severely affects the efficiency of characterizing pathogens using WGS and is therefore a serious impediment to clinical and epidemiological sequencing for health care and public health applications. We have developed a simple enzymatic treatment method that takes advantage of the methylation of human DNA to selectively deplete host contamination from clinical samples prior to sequencing. Using malaria clinical samples with over 80% human host DNA contamination, we show that the enzymatic treatment enriches Plasmodium falciparum DNA up to ∼9-fold and generates high-quality, nonbiased sequence reads covering >98% of 86,158 catalogued typeable single-nucleotide polymorphism loci.
Molecular interactions between a parasite and its host are key to the ability of the parasite to enter the host and persist. Our understanding of the genes and proteins involved in these interactions is limited. To better understand these processes it would be advantageous to have a range of methods to predict pairs of genes involved in such interactions. Correlated gene expression profiles can be used to identify molecular interactions within a species. Here we have extended the concept to different species, showing that genes with correlated expression are more likely to encode proteins that directly or indirectly participate in host–parasite interaction. We go on to examine our predictions of molecular interactions between the malaria parasite and both its mammalian host and insect vector. Our approach could be applied to study any interaction between species, for example, between a host and its parasites or pathogens, but also symbiotic and commensal pairings.
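The idea of correlating expression profiles across two species can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: it assumes host and parasite expression were measured on the same set of samples or time points, uses plain Pearson correlation, and applies an arbitrary cut-off of 0.9. The gene names, values and the `correlated_pairs` helper are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def correlated_pairs(host_profiles, parasite_profiles, threshold=0.9):
    """Return (host_gene, parasite_gene, r) for pairs with |r| >= threshold.

    Both arguments map gene names to expression values measured on the
    same ordered set of samples.
    """
    pairs = []
    for host_gene, host_values in host_profiles.items():
        for parasite_gene, parasite_values in parasite_profiles.items():
            r = pearson_r(host_values, parasite_values)
            if abs(r) >= threshold:
                pairs.append((host_gene, parasite_gene, r))
    # Strongest correlations first.
    return sorted(pairs, key=lambda item: -abs(item[2]))


# Hypothetical expression values across four matched time points.
host = {"hostA": [1, 2, 3, 4]}
parasite = {"parX": [2, 4, 6, 8], "parY": [1, 3, 2, 1]}

for host_gene, parasite_gene, r in correlated_pairs(host, parasite):
    print(host_gene, parasite_gene, round(r, 3))
```

A high correlation only nominates a candidate pair; as the text notes, the encoded proteins may participate in the interaction directly or indirectly, so candidates still need independent validation.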
The cestode Echinococcus granulosus - the agent of cystic echinococcosis, a zoonosis affecting humans and domestic animals worldwide - is an excellent model for the study of host-parasite cross-talk that interfaces with two mammalian hosts. To develop the molecular analysis of these interactions, we carried out an EST survey of E. granulosus larval stages. We report the salient features of this study with a focus on genes reflecting physiological adaptations of different parasite stages.
We generated ∼10,000 ESTs from two sets of full-length enriched libraries (derived from oligo-capped and trans-spliced cDNAs) prepared with three parasite materials: hydatid cyst wall, larval worms (protoscoleces), and pepsin/H+-activated protoscoleces. The ESTs were clustered into 2700 distinct gene products. In the context of the biology of E. granulosus, our analyses reveal: (i) a diverse group of abundant long non-protein coding transcripts showing homology to a middle repetitive element (EgBRep) that could either be active molecular species or represent precursors of small RNAs (like piRNAs); (ii) an up-regulation of fermentative pathways in the tissue of the cyst wall; (iii) highly expressed thiol- and selenol-dependent antioxidant enzyme targets of thioredoxin glutathione reductase, the functional hub of redox metabolism in parasitic flatworms; (iv) candidate apomucins for the external layer of the tissue-dwelling hydatid cyst, a mucin-rich structure that is critical for survival in the intermediate host; (v) a set of tetraspanins, a protein family that appears to have expanded in the cestode lineage; and (vi) a set of platyhelminth-specific gene products that may offer targets for novel pan-platyhelminth drug development.
This survey has greatly increased the quality and the quantity of the molecular information on E. granulosus and constitutes a valuable resource for gene prediction on the parasite genome and for further genomic and proteomic analyses focused on cestodes and platyhelminths.
Cestodes are a neglected group of platyhelminth parasites, despite causing chronic infections in humans and domestic animals worldwide. We used Echinococcus granulosus as a model to study the molecular basis of the host-parasite cross-talk during cestode infections. For this purpose, we carried out a survey of the genes expressed by parasite larval stages interfacing with definitive and intermediate hosts. Sequencing from several high quality cDNA libraries provided numerous insights into the expression of genes involved in important aspects of E. granulosus biology, e.g. its metabolism (energy production and antioxidant defences) and the synthesis of key parasite structures (notably, the one exposed to humans and livestock intermediate hosts). Our results also uncovered the existence of an intriguing set of abundant repeat-associated non-protein coding transcripts that may participate in the regulation of gene expression in all surveyed stages. The dataset now generated constitutes a valuable resource for gene prediction on the parasite genome and for further genomic and proteomic studies focused on cestodes and platyhelminths. In particular, the detailed characterization of a range of newly discovered genes will contribute to a better understanding of the biology of cestode infections and, therefore, to the development of products allowing their efficient control.
The new release of SchistoDB (http://SchistoDB.net) provides a rich resource of genomic data for key blood flukes (genus Schistosoma) which cause disease in hundreds of millions of people worldwide. SchistoDB integrates whole-genome sequence and annotation of three species of the genus and provides enhanced bioinformatics analyses and data-mining tools. A simple, yet comprehensive web interface provided through the Strategies Web Development Kit is available for the mining and visualization of the data. Genomic scale data can be queried based on BLAST searches, annotation keywords and gene ID searches, gene ontology terms, sequence motifs, protein characteristics and phylogenetic relationships. Search strategies can be saved within a user’s profile for future retrieval and may also be shared with other researchers using a unique web address.
The infectious form of many parasitic nematodes, which afflict over one billion people globally, is a developmentally arrested third-stage larva (L3i). The parasitic nematode Strongyloides stercoralis differs from other nematode species that infect humans, in that its life cycle includes both parasitic and free-living forms, which can be leveraged to investigate the mechanisms of L3i arrest and activation. The free-living nematode Caenorhabditis elegans has a similar developmentally arrested larval form, the dauer, whose formation is controlled by four pathways: cyclic GMP (cGMP) signaling, insulin/IGF-1-like signaling (IIS), transforming growth factor β (TGFβ) signaling, and biosynthesis of dafachronic acid (DA) ligands that regulate a nuclear hormone receptor. We hypothesized that homologous pathways are present in S. stercoralis, have similar developmental regulation, and are involved in L3i arrest and activation. To test this, we undertook a deep-sequencing study of the polyadenylated transcriptome, generating over 2.3 billion paired-end reads from seven developmental stages. We constructed developmental expression profiles for S. stercoralis homologs of C. elegans dauer genes identified by BLAST searches of the S. stercoralis genome as well as de novo assembled transcripts. Intriguingly, genes encoding cGMP pathway components were coordinately up-regulated in L3i. In comparison to C. elegans, S. stercoralis has a paucity of genes encoding IIS ligands, several of which have abundance profiles suggesting involvement in L3i development. We also identified seven S. stercoralis genes encoding homologs of the single C. elegans dauer regulatory TGFβ ligand, three of which are only expressed in L3i. Putative DA biosynthetic genes did not appear to be coordinately regulated in L3i development. Our data suggest that while dauer pathway genes are present in S. stercoralis and may play a role in L3i development, there are significant differences between the two species. 
Understanding the mechanisms governing L3i development may lead to novel treatment and control strategies.
Parasitic nematodes infect over one billion people worldwide and cause many diseases, including strongyloidiasis, filariasis, and hookworm disease. For many of these parasites, including Strongyloides stercoralis, the infectious form is a developmentally arrested and long-lived third-stage larva (L3i). Upon encountering a host, L3i quickly resume development and mature into parasitic adults. In the free-living nematode Caenorhabditis elegans, a similar developmentally arrested third-stage larva, known as the dauer, is regulated by four key cellular mechanisms. We hypothesized that similar cellular mechanisms control L3i arrest and activation. Therefore, we used deep-sequencing technology to characterize the S. stercoralis transcriptome (RNAseq), which allowed us to identify S. stercoralis homologs of components of these four mechanisms and examine their temporal regulation. We found similar temporal regulation between S. stercoralis and C. elegans for components of two mechanisms, but dissimilar temporal regulation for two others, suggesting conserved as well as novel modes of developmental regulation for L3i. Understanding L3i development may lead to novel control strategies as well as new treatments for strongyloidiasis and other diseases caused by parasitic nematodes.
The concept of specific chemotherapy was developed a century ago by Paul Ehrlich and others. Dyes and arsenical compounds that displayed selectivity against trypanosomes were central to this work [1,2], and the drugs that emerged remain in use for treating Human African Trypanosomiasis (HAT) [3]. Ehrlich recognised the importance of understanding the mechanisms underlying selective drug action and resistance for the development of improved HAT therapies, but these mechanisms have remained largely mysterious. Here, we use all five current HAT drugs for genome-scale RNA interference (RNAi) target sequencing (RIT-seq) screens in Trypanosoma brucei, revealing the transporters, organelles, enzymes and metabolic pathways that function to facilitate anti-trypanosomal drug action. RIT-seq profiling identifies both known drug importers [4,5] and the only known pro-drug activator [6], and links more than fifty additional genes to drug action. A specific bloodstream stage invariant surface glycoprotein (ISG75) family mediates suramin uptake while the AP-1 adaptin complex, lysosomal proteases and major lysosomal transmembrane protein, as well as spermidine and N-acetylglucosamine biosynthesis all contribute to suramin action. Further screens link ubiquinone availability to nitro-drug action, plasma membrane P-type H+-ATPases to pentamidine action, and trypanothione and multiple putative kinases to melarsoprol action. We also demonstrate a major role for aquaglyceroporins in pentamidine and melarsoprol cross-resistance. These advances in our understanding of mechanisms of anti-trypanosomal drug efficacy and resistance will aid the rational design of new therapies and help to combat drug resistance, and provide unprecedented levels of molecular insight into the mode of action of anti-trypanosomal drugs.
DFMO; eflornithine; ISG75; nifurtimox; RNAi
Functional studies will facilitate characterization of role and essentiality of newly available genome sequences of the human schistosomes, Schistosoma mansoni, S. japonicum and S. haematobium. To develop transgenesis as a functional approach for these pathogens, we previously demonstrated that pseudotyped murine leukemia virus (MLV) can transduce schistosomes leading to chromosomal integration of reporter transgenes and short hairpin RNA cassettes. Here we investigated vertical transmission of transgenes through the developmental cycle of S. mansoni after introducing transgenes into eggs. Although MLV infection of schistosome eggs from mouse livers was efficient in terms of snail infectivity, >10-fold higher transgene copy numbers were detected in cercariae derived from in vitro laid eggs (IVLE). After infecting snails with miracidia from eggs transduced by MLV, sequencing of genomic DNA from cercariae released from the snails also revealed the presence of transgenes, demonstrating that transgenes had been transmitted through the asexual developmental cycle, and thereby confirming germline transgenesis. High-throughput sequencing of genomic DNA from schistosome populations exposed to MLV mapped widespread and random insertion of transgenes throughout the genome, along each of the autosomes and sex chromosomes, validating the utility of this approach for insertional mutagenesis. In addition, the germline-transmitted transgene encoding neomycin phosphotransferase rescued cultured schistosomules from toxicity of the antibiotic G418, and PCR analysis of eggs resulting from sexual reproduction of the transgenic worms in mice confirmed that retroviral transgenes were transmitted to the next (F1) generation. These findings provide the first description of wide-scale, random insertional mutagenesis of chromosomes and of germline transmission of a transgene in schistosomes. 
Transgenic lines of schistosomes expressing antibiotic resistance could advance functional genomics for these significant human pathogens.
Sequence data from this study have been submitted to the European Nucleotide Archive (http://www.ebi.ac.uk/embl) under accession number ERP000379.
Schistosomes, or blood flukes, are responsible for the major neglected tropical disease called schistosomiasis, which afflicts over 200 million people in impoverished regions of the developing world. The genome sequence of these parasites has been decoded. Integration sites of retroviral transgenes into the chromosomes of schistosomes were investigated by high-throughput sequencing. Transgene integrations were mapped to the genome sequence of Schistosoma mansoni. Integrations were distributed apparently randomly across each of the eight chromosomes, including the seven autosomes and the sex chromosomes Z and W. Integration events of transgenes were characterized in chromosomes of cercariae that were progeny of schistosome eggs infected with pseudotyped virions. Also, transgenic cercariae were employed to infect mice and transgenes were detected in the F1 eggs. Together these findings confirmed vertical transmission of transgenes through the schistosome germline, through both the asexual and the sexual reproductive phases of the developmental cycle. Moreover, germline-transmitted retroviral transgenes encoding drug resistance to the aminoglycoside antibiotics allowed schistosomes to survive toxic concentrations of the antibiotic G418. These findings represent the first reports of wide-scale insertional mutagenesis of schistosome chromosomes and vertical transmission of a transgene through the schistosome germline.
Genome sequencing of many eukaryotic pathogens and the volume of data available on public resources have created a clear requirement for a consistent vocabulary to describe the range of developmental forms of parasites. Consistent labeling of experimental data and external data, in databases and the literature, is essential for integration, cross database comparison, and knowledge discovery. The primary objective of this work was to develop a dynamic and controlled vocabulary that can be used for various parasites. The paper describes the Ontology for Parasite Lifecycle (OPL) and discusses its application in parasite research.
The OPL is based on the Basic Formal Ontology (BFO) and follows the rules set by the OBO Foundry consortium. The first version of the OPL models complex life cycle stage details of a range of parasites, such as Trypanosoma sp., Leishmania sp., Plasmodium sp., and Schistosoma sp. In addition, the ontology also models necessary contextual details, such as host information, vector information, and anatomical locations. OPL is primarily designed to serve as a reference ontology for parasite life cycle stages that can be used for database annotation purposes and in the lab for data integration or information retrieval as exemplified in the application section below.
OPL is freely available at http://purl.obolibrary.org/obo/opl.owl and has been submitted to the BioPortal site of NCBO and to the OBO Foundry. We believe that database and phenotype annotations using OPL will help run fundamental queries on databases to know more about gene functions and to find intervention targets for various parasites. The OPL is under continuous development and new parasites and/or terms are being added.
The pir genes comprise the largest multi-gene family in Plasmodium, with members found in P. vivax, P. knowlesi and the rodent malaria species. Despite comprising up to 5% of the genome, little is known about the functions of the proteins encoded by pir genes. P. chabaudi causes chronic infection in mice, which may be due to antigenic variation. In this model, pir genes are called cirs and may be involved in this mechanism, allowing evasion of host immune responses. In order to fully understand the role(s) of CIR proteins during P. chabaudi infection, a detailed characterization of the cir gene family was required.
The cir repertoire was annotated and a detailed bioinformatic characterization of the encoded CIR proteins was performed. Two major sub-families were identified, which have been named A and B. Members of each sub-family displayed different amino acid motifs, and were thus predicted to have undergone functional divergence. In addition, the expression of the entire cir repertoire was analyzed via RNA sequencing and microarray. Up to 40% of the cir gene repertoire was expressed in the parasite population during infection, and dominant cir transcripts could be identified. In addition, some differences were observed in the pattern of expression between the cir subgroups at the peak of P. chabaudi infection. Finally, specific cir genes were expressed at different time points during asexual blood stages.
In conclusion, the large number of cir genes and their expression throughout the intraerythrocytic cycle of development indicates that CIR proteins are likely to be important for parasite survival. In particular, the detection of dominant cir transcripts at the peak of P. chabaudi infection supports the idea that CIR proteins are expressed, and could perform important functions in the biology of this parasite. Further application of the methodologies described here may allow the elucidation of CIR sub-family A and B protein functions, including their contribution to antigenic variation and immune evasion.
Toxoplasma gondii is a zoonotic protozoan parasite which infects nearly one third of the human population and is found in an extraordinary range of vertebrate hosts. Its epidemiology depends heavily on horizontal transmission, especially between rodents and its definitive host, the cat. Neospora caninum is a recently discovered close relative of Toxoplasma, whose definitive host is the dog. Both species are tissue-dwelling Coccidia and members of the phylum Apicomplexa. They share many common features, but Neospora neither infects humans nor shares the same wide host range as Toxoplasma; rather, it shows a striking preference for highly efficient vertical transmission in cattle. These species therefore provide a remarkable opportunity to investigate mechanisms of host restriction, transmission strategies, virulence and zoonotic potential. We sequenced the genome of N. caninum and transcriptomes of the invasive stage of both species, undertaking an extensive comparative genomics and transcriptomics analysis. We estimate that these organisms diverged from their common ancestor around 28 million years ago and find that both genomes and gene expression are remarkably conserved. However, in N. caninum we identified an unexpected expansion of surface antigen gene families and the divergence of secreted virulence factors, including rhoptry kinases. Specifically, we show that the rhoptry kinase ROP18 is pseudogenised in N. caninum and that, as a possible consequence, Neospora is unable to phosphorylate host immunity-related GTPases, as Toxoplasma does. This defense strategy is thought to be key to virulence in Toxoplasma. We conclude that the ecological niches occupied by these species are influenced by a relatively small number of gene products which operate at the host-parasite interface and that the dominance of vertical transmission in N. caninum may be associated with the evolution of reduced virulence in this species.
Coccidian parasites have a major impact on human and animal health world-wide and are among the most successful and widespread parasitic protozoa. They include Neospora caninum which is a leading cause of abortion in cattle and one of its nearest relatives, Toxoplasma gondii. Despite its close phylogenetic relationship to Toxoplasma, Neospora has a far more restricted host range, does not infect humans and its epidemiology depends predominantly on efficient vertical transmission. The divergent biology of these two closely related species provides a unique opportunity to study the mechanisms of host specificity, pathogenesis and zoonotic potential not only in these, but other Coccidia. We have sequenced the genome of Neospora and the transcriptomes of both species to show that despite diverging some 28 million years ago, both genome and gene expression remain remarkably conserved. Evolution has focused almost exclusively on molecules which control the interaction of the parasite with the host cell. We show that some secreted invasion-related proteins and surface genes which are known to control virulence and host cell interactions in Toxoplasma are dramatically altered in their expression and functionality in Neospora and propose that evolution of these genes may underpin the ecological niches inhabited by coccidian parasites.
So-called next-generation sequencing (NGS) has provided the ability to sequence on a massive scale at low cost, enabling biologists to perform powerful experiments and gain insight into biological processes. BamView has been developed to visualize and analyse sequence reads from NGS platforms, which have been aligned to a reference sequence. It is a desktop application for browsing the aligned or mapped reads [Ruffalo, M, LaFramboise, T, Koyutürk, M. Comparative analysis of algorithms for next-generation sequencing read alignment. Bioinformatics 2011;27:2790–6] at different levels of magnification, from nucleotide level, where the base qualities can be seen, to genome or chromosome level, where overall coverage is shown. To enable in-depth investigation of NGS data, various views are provided that can be configured to highlight interesting aspects of the data. Multiple read alignment files can be overlaid to compare results from different experiments, and filters can be applied to facilitate the interpretation of the aligned reads. As well as being a standalone application, it can be used as an integrated part of the Artemis genome browser. BamView allows the user to study NGS data in the context of the sequence and annotation of the reference genome. Single nucleotide polymorphism (SNP) density and candidate SNP sites can be highlighted and investigated, and read-pair information can be used to discover large structural insertions and deletions. The application will also perform simple analyses of the read mapping, including reporting the read counts and reads per kilobase per million mapped reads (RPKM) for genes selected by the user.
Availability: BamView and Artemis are freely available software. These can be downloaded from their home pages:
Requirements: Java 1.6 or higher.
genome browser; next-generation sequencing; visualization; Artemis; BamView
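The RPKM value that BamView reports for user-selected genes can be illustrated with a small calculation. The sketch below assumes the commonly used definition of RPKM (reads mapped to a gene, scaled per kilobase of gene length and per million mapped reads in the sample). The function name `rpkm` and the example numbers are hypothetical and are not taken from BamView itself.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads.

    Assumed definition (not taken from the BamView source):
        RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    The factor 1e9 combines the per-kilobase (1e3) and
    per-million-reads (1e6) scaling.
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp gene,
# in a sample with 10 million mapped reads.
example = rpkm(500, 2000, 10_000_000)
print(example)  # 25.0
```

Because the value is normalised for both gene length and sequencing depth, it allows a rough comparison of expression between genes of different sizes and between samples sequenced to different depths.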
Schistosomiasis is one of the most prevalent parasitic diseases, affecting millions of people in developing countries. Amongst the human-infective species, Schistosoma mansoni is also the most commonly used in the laboratory and here we present the systematic improvement of its draft genome. We used Sanger capillary and deep-coverage Illumina sequencing from clonal worms to upgrade the highly fragmented draft 380 Mb genome to one with only 885 scaffolds and more than 81% of the bases organised into chromosomes. We have also used transcriptome sequencing (RNA-seq) from four time points in the parasite's life cycle to refine gene predictions and profile their expression. More than 45% of predicted genes have been extensively modified and the total number has been reduced from 11,807 to 10,852. Using the new version of the genome, we identified trans-splicing events occurring in at least 11% of genes and identified clear cases where it is used to resolve polycistronic transcripts. We have produced a high-resolution map of temporal changes in expression for 9,535 genes, covering an unprecedented dynamic range for this organism. All of these data have been consolidated into a searchable format within the GeneDB (www.genedb.org) and SchistoDB (www.schistodb.net) databases. With further transcriptional profiling and genome sequencing increasingly accessible, the upgraded genome will form a fundamental dataset to underpin further advances in schistosome research.
Schistosomiasis is a disease caused by parasitic blood flukes of the genus Schistosoma. Human-infective species are prevalent in developing countries, where they represent a major disease burden as well as an impediment to socioeconomic development. In addition to its clinical relevance, Schistosoma mansoni is the species most widely used for laboratory experimentation. In 2009, the first drafts of the S. mansoni and S. japonicum genomes were published. Both genome sequences represented a great step forward for schistosome research, but their highly fragmented nature compromised the quality of potential downstream analyses. In this study, we have substantially improved both the genome and the transcriptome resources for S. mansoni. We collated existing data and added deep DNA sequence data from clonal worms and RNA sequence data from four key time points in the life cycle of the parasite. We were able to identify transcribed regions to single-base resolution and have profiled gene expression from the free-living larvae to the early human parasitic stage. We uncovered extensive use of single transcripts from multiple genes, which the organism subsequently resolves by trans-splicing. All data from this study comprise a major new release of the genome, which is publicly and easily accessible.
MicroRNAs (miRNAs) play key roles in regulating post-transcriptional gene expression and are essential for development in the free-living nematode Caenorhabditis elegans and in higher organisms. Whether microRNAs are involved in regulating developmental programs of parasitic nematodes is currently unknown. Here we describe the miRNA repertoire of two important parasitic nematodes as an essential first step in addressing this question.
The small RNAs from larval and adult stages of two parasitic species, Brugia pahangi and Haemonchus contortus, were identified using deep-sequencing and bioinformatic approaches. Comparative analysis to known miRNA sequences reveals that the majority of these miRNAs are novel. Some novel miRNAs are abundantly expressed and display developmental regulation, suggesting important functional roles. Despite the lack of conservation in the miRNA repertoire, genomic positioning of certain miRNAs within or close to specific coding genes is remarkably conserved across diverse species, indicating selection for these associations. Endogenous small-interfering RNAs and Piwi-interacting (pi)RNAs, which regulate gene and transposon expression, were also identified. piRNAs are expressed in adult stage H. contortus, supporting a conserved role in germline maintenance in some parasitic nematodes.
This in-depth comparative analysis of nematode miRNAs reveals the high level of divergence across species and identifies novel sequences potentially involved in development. Expression of novel miRNAs may reflect adaptations to different environments and lifestyles. Our findings provide a detailed foundation for further study of the evolution and function of miRNAs within nematodes and for identifying potential targets for intervention.
► Microsatellite typing of Leishmania donovani complex isolates discriminates intercontinental groups.
► Genome-wide SNP profiling reveals diversity in a homogeneous population.
► Identification of a novel divergent lineage within a small geographic region.
► SNP-typing of samples resistant and sensitive to treatment drugs.
The species of the Leishmania donovani species complex cause visceral leishmaniasis, a debilitating infectious disease transmitted by sandflies. Understanding molecular changes associated with population structure in these parasites can help unravel their epidemiology and spread in humans. In this study, we used a panel of standard microsatellite loci and genome-wide SNPs to investigate population-level diversity in L. donovani strains recently isolated from a small geographic area spanning Bihar (India) and Nepal, and compared their variation to that found in diverse L. donovani complex isolates from Europe, Africa and Asia. Microsatellites and SNPs could clearly resolve the phylogenetic relationships of the strains between continents, and microsatellite phylogenies indicated that certain older Indian strains were closely related to African strains. In the context of the anti-malaria spraying campaigns in the 1960s, this was consistent with a pattern of episodic population size contractions and clonal expansions in these parasites that was supported by population history simulations. In sharp contrast to the low resolution provided by microsatellites, SNPs retained a much more fine-scale resolution of population-level variability, to the extent that they identified four different lineages from the same region, one of which was more closely related to African and European strains than to Indian or Nepalese ones. Combining the results of in vitro testing of antimonial drug sensitivity with the phylogenetic signals from the SNP data highlighted protein-level mutations revealing a distinct drug-resistant group of Nepalese and Indian L. donovani. This study demonstrates the power of genomic data for exploring parasite population structure. Furthermore, markers defining different genetic groups have been discovered that could potentially be applied to investigate drug resistance in clinical Leishmania strains.
MLMT, multi-locus microsatellite typing; VL, visceral leishmaniasis; CL, cutaneous leishmaniasis; PKDL, post kala-azar dermal leishmaniasis; SSG, sodium stibogluconate; Leishmania infantum; Visceral leishmaniasis; Diversity; Population markers; Population genetics; Drug resistance
Motivation: High-throughput sequencing (HTS) technologies have made low-cost sequencing of large numbers of samples commonplace. An explosion in the type, not just number, of sequencing experiments has also taken place, including genome re-sequencing, population-scale variation detection, whole transcriptome sequencing and genome-wide analysis of protein-bound nucleic acids.
Results: We present Artemis as a tool for integrated visualization and computational analysis of different types of HTS datasets in the context of a reference genome and its corresponding annotation.
Availability: Artemis is freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute websites: http://www.sanger.ac.uk/resources/software/artemis/.
Candida parapsilosis is one of the most common causes of Candida infection worldwide. However, the genome sequence annotation was made without experimental validation and little is known about the transcriptional landscape. The transcriptional response of C. parapsilosis to hypoxic (low oxygen) conditions, such as those encountered in the host, is also relatively unexplored.
We used next-generation sequencing (RNA-seq) to determine the transcriptional profile of C. parapsilosis growing in several conditions including different media, temperatures and oxygen concentrations. We identified 395 novel protein-coding sequences that had not previously been annotated. We removed more than 300 unsupported gene models, and corrected approximately 900. We mapped the 5' and 3' UTR for thousands of genes. We also identified 422 introns, including two introns in the 3' UTR of one gene. This is the first report of 3' UTR introns in the Saccharomycotina. Comparing the introns in coding sequences with other species shows that small numbers have been gained and lost throughout evolution. Our analysis also identified a number of novel transcriptionally active regions (nTARs). We used both RNA-seq and microarray analysis to determine the transcriptional profile of cells grown in normoxic and hypoxic conditions in rich media, and we showed that there was a high correlation between the approaches. We also generated a knockout of the UPC2 transcriptional regulator, and we found that, as in C. albicans, Upc2 is required for conferring resistance to azole drugs, and for regulation of expression of the ergosterol pathway in hypoxia.
We provide the first detailed annotation of the C. parapsilosis genome, based on gene predictions and transcriptional analysis. We identified a number of novel ORFs and other transcribed regions, and detected transcripts from approximately 90% of the annotated protein coding genes. We found that the transcription factor Upc2 has a conserved role as a major regulator of the hypoxic response in C. parapsilosis and C. albicans.
Transcriptional profiling, pathogenesis, RNA-seq, Candida