Generating the raw data for a de novo genome assembly project for a target eukaryotic species is relatively easy. This democratization of access to large-scale data has allowed many research teams to plan to assemble the genomes of non-model organisms. These new genome targets are very different from the traditional, inbred, laboratory-reared model organisms. They are often small, and cannot be isolated free of their environment – whether ingested food, the surrounding host organism of parasites, or commensal and symbiotic organisms attached to or within the individuals sampled. Preparation of pure DNA originating from a single species can be technically impossible, but assembly of mixed-organism DNA can be difficult, as most genome assemblers perform poorly when faced with multiple genomes in different stoichiometries. This class of problem is common in metagenomic datasets that deliberately try to capture all the genomes present in an environment, but replicon assembly is not often the goal of such programs. Here we present an approach to extracting, from mixed DNA sequence data, subsets that correspond to single species’ genomes and thus improving genome assembly. We use both numerical (proportion of GC bases and read coverage) and biological (best-matching sequence in annotated databases) indicators to aid partitioning of draft assembly contigs, and the reads that contribute to those contigs, into distinct bins that can then be subjected to rigorous, optimized assembly, through the use of taxon-annotated GC-coverage plots (TAGC plots). We also present Blobsplorer, a tool that aids exploration and selection of subsets from TAGC-annotated data. Partitioning the data in this way can rescue poorly assembled genomes, and reveal unexpected symbionts and commensals in eukaryotic genome projects. The TAGC plot pipeline script is available from https://github.com/blaxterlab/blobology, and the Blobsplorer tool from https://github.com/mojones/Blobsplorer.
next-generation sequencing; metagenomics; assembly; parasites; symbionts; commensals; contaminants
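The two numerical indicators behind a TAGC plot (GC proportion and read coverage) and the taxon-based grouping of contigs can be illustrated with a minimal sketch. This is not the blobology pipeline itself: the function names, the input layout (sequence plus total mapped read bases per contig) and the example contigs and taxon labels are all assumptions made for illustration.

```python
from collections import defaultdict


def gc_fraction(seq):
    """Proportion of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def mean_coverage(mapped_read_bases, contig_length):
    """Average read depth: total mapped read bases divided by contig length."""
    return mapped_read_bases / contig_length


def partition_contigs(contigs, taxon_of):
    """Group contigs by taxon label, keeping (name, GC, coverage) per contig.

    contigs  : {name: (sequence, mapped_read_bases)}   (hypothetical layout)
    taxon_of : {name: taxon of best database match}    (hypothetical layout)
    Contigs without a database match go to the 'unannotated' bin.
    """
    bins = defaultdict(list)
    for name, (seq, mapped_bases) in contigs.items():
        taxon = taxon_of.get(name, "unannotated")
        bins[taxon].append(
            (name, gc_fraction(seq), mean_coverage(mapped_bases, len(seq)))
        )
    return dict(bins)


# Toy data, invented for illustration only.
contigs = {
    "c1": ("ATGCATGCAT", 500),
    "c2": ("GGCCGGCCGA", 50),
    "c3": ("ATATATGCAT", 480),
}
taxon_of = {"c1": "Nematoda", "c2": "Proteobacteria"}
bins = partition_contigs(contigs, taxon_of)
```

Plotting coverage against GC for each contig, coloured by bin, gives the kind of view a TAGC plot provides; unannotated contigs that fall inside a cluster of annotated ones are candidates for the same bin.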
Chromatin diminution is the programmed elimination of specific DNA sequences during development. It occurs in diverse species, but the function(s) of diminution and the specificity of sequence loss remain largely unknown. Diminution in the nematode Ascaris suum occurs during early embryonic cleavages and leads to the loss of germline genome sequences and the formation of a distinct genome in somatic cells. We found that ~43 Mb (~13%) of genome sequence is eliminated in A. suum somatic cells, including ~12.7 Mb of unique sequence. The eliminated sequences and location of the DNA breaks are the same in all somatic lineages from a single individual, and between different individuals. At least 685 genes are eliminated. These genes are preferentially expressed in the germline and during early embryogenesis. We propose that diminution is a mechanism of germline gene regulation that specifically removes a large number of genes involved in gametogenesis and early embryogenesis.
Next-generation DNA sequencing technologies have made it possible to generate transcriptome data for novel organisms quickly and cheaply, to the extent that the effort required to annotate and publish a new transcriptome is greater than the effort required to sequence it. Often, following publication, details of the annotation effort are only available in summary form, hindering subsequent exploitation of the data. To promote best-practice in annotation and to ensure that data remain accessible, we have written afterParty, a web application that allows users to assemble, annotate and publish novel transcriptomes using only a web browser.
afterParty is a robust web application that implements best-practice transcriptome assembly, annotation, browsing, searching, and visualization. Users can turn a collection of reads (from Roche 454 chemistry) or assembled contigs (from any sequencing chemistry, including Illumina Solexa RNA-Seq) into a searchable, browsable transcriptome resource and quickly make it publicly available. Contigs are functionally annotated based on similarity to known sequences and protein domains. Once assembled and annotated, transcriptomes derived from multiple species or libraries can be compared and searched. afterParty datasets can either be created using the existing afterParty server, or using local instances that can be built easily using a virtual machine. afterParty includes powerful visualization tools for transcriptome dataset exploration and uses a flexible annotation architecture which will allow additional types of annotation to be added in the future.
afterParty's main use case scenario is one in which a working biologist has generated a large volume of transcribed sequence data and wishes to turn it into a useful resource that has some durability. By reducing the effort, bioinformatics skills, and computational resources needed to annotate and publish a transcriptome, afterParty will facilitate the annotation and sharing of sequence data that would otherwise remain unavailable. A typical metazoan transcriptome containing several tens of thousands of contigs can be annotated in a few minutes of interactive time and a few days of computational time.
Transcriptome; Assembly; Annotation
Wolbachia, endosymbiotic bacteria of the order Rickettsiales, are widespread in arthropods but also present in nematodes. In arthropods, A and B supergroup Wolbachia are generally associated with distortion of host reproduction. In filarial nematodes, including some human parasites, multiple lines of experimental evidence indicate that C and D supergroup Wolbachia are essential for the survival of the host, and here the symbiotic relationship is considered mutualistic. The origin of this mutualistic endosymbiosis is of interest for both basic and applied reasons: How does a parasite become a mutualist? Could intervention in the mutualism aid in treatment of human disease? Correct rooting and high-quality resolution of Wolbachia relationships are required to resolve this question. However, because of the large genetic distance between Wolbachia and the nearest outgroups, and the limited number of genomes so far available for large-scale analyses, current phylogenies do not provide robust answers. We therefore sequenced the genome of the D supergroup Wolbachia endosymbiont of Litomosoides sigmodontis, revisited the selection of loci for phylogenomic analyses, and performed a phylogenomic analysis including available complete genomes (from isolates in supergroups A, B, C, and D). Using 90 orthologous genes with reliable phylogenetic signals, we obtained a robust phylogenetic reconstruction, including a highly supported root to the Wolbachia phylogeny between a (A + B) clade and a (C + D) clade. Although we currently lack data from several Wolbachia supergroups, notably F, our analysis supports a model wherein the putatively mutualist endosymbiotic relationship between Wolbachia and nematodes originated from a single transition event.
Wolbachia; phylogenomics; mutualism; Litomosoides sigmodontis; endosymbiosis
The left-right asymmetry of snails, including the direction of shell coiling, is determined by the delayed effect of a maternal gene on the chiral twist that takes place during early embryonic cell divisions. Yet, despite being a well-established classical problem, the identity of the gene and the means by which left-right asymmetry is established in snails remain unknown. We here demonstrate the power of new genomic approaches for identification of the chirality gene, “D”. First, heterozygous (Dd) pond snails Lymnaea stagnalis were self-fertilised or backcrossed, and the genotype of more than six thousand offspring inferred, either dextral (DD/Dd) or sinistral (dd). Then, twenty of the offspring were used for Restriction-site-Associated DNA Sequencing (RAD-Seq) to identify anonymous molecular markers that are linked to the chirality locus. A local genetic map was constructed by genotyping three flanking markers in over three thousand snails. The three markers lie on either side of the chirality locus, with one very tightly linked (<0.1 cM). Finally, bacterial artificial chromosomes (BACs) were isolated that contained the three loci. Fluorescent in situ hybridization (FISH) of pachytene cells showed that the three BACs tightly cluster on the same bivalent chromosome. Fibre-FISH identified a region of greater than ∼0.4 Mb between two BAC clone markers that must contain D. This work therefore establishes the resources for molecular identification of the chirality gene and the variation that underpins sinistral and dextral coiling. More generally, the results also show that combining genomic technologies, such as RAD-Seq and high resolution FISH, is a robust approach for mapping key loci in non-model systems.
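To make the scale of a map distance such as "<0.1 cM" concrete, the sketch below converts recombinant counts into centimorgans. It assumes a simple design in which each scored snail represents one informative meiosis; the counts used are hypothetical and are not taken from the study. For tightly linked markers the direct estimate (100 × recombination fraction) and Haldane's mapping function give nearly the same value.

```python
import math


def recombination_fraction(recombinants, meioses):
    """Fraction of informative meioses showing a marker-locus recombination."""
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    return recombinants / meioses


def map_distance_cm(recombinants, meioses):
    """Direct estimate in centimorgans; adequate for small distances."""
    return 100.0 * recombination_fraction(recombinants, meioses)


def haldane_cm(recombinants, meioses):
    """Haldane's mapping function, which corrects for double crossovers."""
    r = recombination_fraction(recombinants, meioses)
    if r >= 0.5:
        raise ValueError("markers appear unlinked (r >= 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical counts: 3 recombinants among 3000 scored offspring.
direct = map_distance_cm(3, 3000)
corrected = haldane_cm(3, 3000)
```

With counts of this order, both estimates are about 0.1 cM, which shows why several thousand genotyped offspring are needed to resolve linkage this tight.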
Summary: High-quality draft genomes are now easy to generate, as sequencing and assembly costs have dropped dramatically. However, building a user-friendly searchable Web site and database for a newly annotated genome is not straightforward. Here we present Badger, a lightweight and easy-to-install genome exploration environment designed for next generation non-model organism genomes.
Availability: Badger is released under the GPL and is available at http://badger.bio.ed.ac.uk/. We show two working examples: (i) a test dataset included with the source code, and (ii) a collection of four filarial nematode genomes.
Linking behavioural phenotypes to their underlying genotypes is crucial for uncovering the mechanisms that underpin behaviour and for understanding the origins and maintenance of genetic variation in behaviour. Recently, interest has begun to focus on the transcriptome as a route for identifying genes and gene pathways associated with behaviour. For many behavioural traits studied at the phenotypic level, we have little or no idea of where to start searching for “candidate” genes: the transcriptome provides such a starting point. Here we consider transcriptomic changes associated with oviposition in the parasitoid wasp Nasonia vitripennis. Oviposition is a key behaviour for parasitoids, as females are faced with a variety of decisions that will impact offspring fitness. These include choosing between hosts of differing quality, as well as making decisions regarding clutch size and offspring sex ratio. We compared the whole-body transcriptomes of resting or ovipositing female Nasonia using a “DeepSAGE” gene expression approach on the Illumina sequencing platform. We identified 332 tags that were significantly differentially expressed between the two treatments, with 77% of the changes associated with greater expression in resting females. Oviposition therefore appears to focus gene expression away from a number of physiological processes, with gene ontologies suggesting that aspects of metabolism may be down-regulated during egg-laying. Nine of the most abundant differentially expressed tags showed greater expression in ovipositing females though, including the genes purity-of-essence (associated with behavioural phenotypes in Drosophila) and glucose dehydrogenase (GLD). The GLD protein has been implicated in sperm storage and release in Drosophila and so provides a possible candidate for the control of sex allocation by female Nasonia during oviposition. 
Oviposition in Nasonia therefore clearly modifies the transcriptome, providing a starting point for the genetic dissection of oviposition.
In the last decade, many diverse RNAi (RNA interference) pathways have been discovered that mediate gene silencing at epigenetic, transcriptional and post-transcriptional levels. The diversity of RNAi pathways is inherently linked to the evolution of Ago (Argonaute) proteins, the central protein component of RISCs (RNA-induced silencing complexes). An increasing number of diverse Agos have been identified in different species. The functions of most of these proteins are not yet known, but they are generally assumed to play roles in development, genome stability and/or protection against viruses. Recent research in the nematode Caenorhabditis elegans has expanded the breadth of RNAi functions to include transgenerational epigenetic memory and, possibly, environmental sensing. These functions are inherently linked to the production of secondary siRNAs (small interfering RNAs) that bind to members of a clade of WAGOs (worm-specific Agos). In the present article, we review briefly what is known about the evolution and function of Ago proteins in eukaryotes, including the expansion of WAGOs in nematodes. We postulate that the rapid evolution of WAGOs enables the exceptional functional plasticity of nematodes, including their capacity for parasitism.
Argonaute; helminth; microRNA (miRNA); nematode; RNA interference (RNAi); small interfering RNA (siRNA); Ago, Argonaute; ALG, Ago-like gene; At, Arabidopsis thaliana; CSR, chromosome segregation- and RNAi-deficient; miRNA, microRNA; piRNA, piwi-interacting RNA; PRG, Piwi-related gene; RdRP, RNA-dependent RNA polymerase; RDE, RNAi-defective; RISC, RNA-induced silencing complex; RNAi, RNA interference; siRNA, small interfering RNA; WAGO, worm-specific Ago
Anguillicola crassus is an economically and ecologically important parasitic nematode of eels. The native range of A. crassus is in East Asia, where it infects Anguilla japonica, the Japanese eel. A. crassus was introduced into European eels, Anguilla anguilla, 30 years ago. The parasite is more pathogenic in its new host than in its native one, and is thought to threaten the endangered An. anguilla across its range. The molecular basis for the increased pathogenicity of the nematodes in their new hosts is not known.
A reference transcriptome was assembled for A. crassus from Roche 454 pyrosequencing data. Raw reads (756,363 total) from nematodes from An. japonica and An. anguilla hosts were filtered for likely host contaminants and ribosomal RNAs. The remaining 353,055 reads were assembled into 11,372 contigs of a high confidence assembly (spanning 6.6 Mb) and an additional 21,153 singletons and contigs of a lower confidence assembly (spanning an additional 6.2 Mb). Roughly 55% of the high confidence assembly contigs were annotated with domain- or protein sequence similarity derived functional information. Sequences conserved only in nematodes, or unique to A. crassus were more likely to have secretory signal peptides. Thousands of high quality single nucleotide polymorphisms were identified, and coding polymorphism was correlated with differential expression between individual nematodes. Transcripts identified as being under positive selection were enriched in peptidases. Enzymes involved in energy metabolism were enriched in the set of genes differentially expressed between European and Asian A. crassus.
The reference transcriptome of A. crassus is of high quality, and will serve as a basis for future work on the invasion biology of this important parasite. The polymorphisms identified will provide a key tool set for analysis of population structure and identification of genes likely to be involved in increased pathogenicity in European eel hosts. The identification of peptidases under positive selection is a first step in this programme.
Heliconius butterflies represent a recent radiation of species, in which wing pattern divergence has been implicated in speciation. Several loci that control wing pattern phenotypes have been mapped and two were identified through sequencing. These same gene regions play a role in adaptation across the whole Heliconius radiation. Previous studies of population genetic patterns at these regions have sequenced small amplicons. Here, we use targeted next-generation sequence capture to survey patterns of divergence across these entire regions in divergent geographical races and species of Heliconius. This technique was successful both within and between species for obtaining high coverage of almost all coding regions and sufficient coverage of non-coding regions to perform population genetic analyses. We find major peaks of elevated population differentiation between races across hybrid zones, which indicate regions under strong divergent selection. These ‘islands’ of divergence appear to be more extensive between closely related species, but there is less clear evidence for such islands between more distantly related species at two further points along the ‘speciation continuum’. We also sequence fosmid clones across these regions in different Heliconius melpomene races. We find no major structural rearrangements but many relatively large (greater than 1 kb) insertion/deletion events (including gain/loss of transposable elements) that are variable between races.
Heliconius; colour pattern; divergence; target enrichment; speciation; genomic islands
The cestode Echinococcus granulosus - the agent of cystic echinococcosis, a zoonosis affecting humans and domestic animals worldwide - is an excellent model for the study of host-parasite cross-talk that interfaces with two mammalian hosts. To develop the molecular analysis of these interactions, we carried out an EST survey of E. granulosus larval stages. We report the salient features of this study with a focus on genes reflecting physiological adaptations of different parasite stages.
We generated ∼10,000 ESTs from two sets of full-length enriched libraries (derived from oligo-capped and trans-spliced cDNAs) prepared with three parasite materials: hydatid cyst wall, larval worms (protoscoleces), and pepsin/H+-activated protoscoleces. The ESTs were clustered into 2700 distinct gene products. In the context of the biology of E. granulosus, our analyses reveal: (i) a diverse group of abundant long non-protein coding transcripts showing homology to a middle repetitive element (EgBRep) that could either be active molecular species or represent precursors of small RNAs (like piRNAs); (ii) an up-regulation of fermentative pathways in the tissue of the cyst wall; (iii) highly expressed thiol- and selenol-dependent antioxidant enzyme targets of thioredoxin glutathione reductase, the functional hub of redox metabolism in parasitic flatworms; (iv) candidate apomucins for the external layer of the tissue-dwelling hydatid cyst, a mucin-rich structure that is critical for survival in the intermediate host; (v) a set of tetraspanins, a protein family that appears to have expanded in the cestode lineage; and (vi) a set of platyhelminth-specific gene products that may offer targets for novel pan-platyhelminth drug development.
This survey has greatly increased the quality and the quantity of the molecular information on E. granulosus and constitutes a valuable resource for gene prediction on the parasite genome and for further genomic and proteomic analyses focused on cestodes and platyhelminths.
Cestodes are a neglected group of platyhelminth parasites, despite causing chronic infections to humans and domestic animals worldwide. We used Echinococcus granulosus as a model to study the molecular basis of the host-parasite cross-talk during cestode infections. For this purpose, we carried out a survey of the genes expressed by parasite larval stages interfacing with definitive and intermediate hosts. Sequencing from several high quality cDNA libraries provided numerous insights into the expression of genes involved in important aspects of E. granulosus biology, e.g. its metabolism (energy production and antioxidant defences) and the synthesis of key parasite structures (notably, the one exposed to humans and livestock intermediate hosts). Our results also uncovered the existence of an intriguing set of abundant repeat-associated non-protein coding transcripts that may participate in the regulation of gene expression in all surveyed stages. The dataset now generated constitutes a valuable resource for gene prediction on the parasite genome and for further genomic and proteomic studies focused on cestodes and platyhelminths. In particular, the detailed characterization of a range of newly discovered genes will contribute to a better understanding of the biology of cestode infections and, therefore, to the development of products allowing their efficient control.
The heartworm Dirofilaria immitis is an important parasite of dogs. Transmitted by mosquitoes in warmer climatic zones, it is spreading across southern Europe and the Americas at an alarming pace. There is no vaccine, and chemotherapy is prone to complications. To learn more about this parasite, we have sequenced the genomes of D. immitis and its endosymbiont Wolbachia. We predict 10,179 protein coding genes in the 84.2 Mb of the nuclear genome, and 823 genes in the 0.9-Mb Wolbachia genome. The D. immitis genome harbors neither DNA transposons nor active retrotransposons, and there is very little genetic variation between two sequenced isolates from Europe and the United States. The differential presence of anabolic pathways such as heme and nucleotide biosynthesis hints at the intricate metabolic interrelationship between the heartworm and Wolbachia. Comparing the proteome of D. immitis with other nematodes and with mammalian hosts, we identify families of potential drug targets, immune modulators, and vaccine candidates. This genome sequence will support the development of new tools against dirofilariasis and aid efforts to combat related human pathogens, the causative agents of lymphatic filariasis and river blindness.—Godel, C., Kumar, S., Koutsovoulos, G., Ludin, P., Nilsson, D., Comandatore, F., Wrobel, N., Thompson, M., Schmid, C. D., Goto, S., Bringaud, F., Wolstenholme, A., Bandi, C., Epe, C., Kaminsky, R., Blaxter, M., Mäser, P. The genome of the heartworm, Dirofilaria immitis, reveals drug and vaccine targets.
comparative genomics; filaria; transposon; Wolbachia
Caenorhabditis elegans is a preeminent model organism, but the natural ecology of this nematode has been elusive. A four-year survey of French orchards published in BMC Biology reveals thriving populations of C. elegans (and Caenorhabditis briggsae) in rotting fruit and plant stems. Rather than being simply a 'soil nematode', C. elegans appears to be a 'plant-rot nematode'. These studies signal a growing interest in the integrated genomics and ecology of these tractable animals.
See research article http://www.biomedcentral.com/1741-7007/10/59
Restriction site-associated DNA sequencing (RAD-Seq) is a genome complexity reduction technique that facilitates large-scale marker discovery and genotyping by sequencing. Recent applications of RAD-Seq have included linkage and QTL mapping with a particular focus on non-model species. In the current study, we have applied RAD-Seq to two Atlantic salmon families from a commercial breeding program. The offspring from these families were classified into resistant or susceptible based on survival/mortality in an Infectious Pancreatic Necrosis (IPN) challenge experiment, and putative homozygous resistant or susceptible genotype at a major IPN-resistance QTL. From each family, the genomic DNA of the two heterozygous parents and seven offspring of each IPN phenotype and genotype was digested with the SbfI enzyme and sequenced in multiplexed pools.
Sequence was obtained from approximately 70,000 RAD loci in both families and a filtered set of 6,712 segregating SNPs was identified. Analyses of genome-wide RAD marker segregation patterns in the two families suggested SNP discovery on all 29 Atlantic salmon chromosome pairs, and highlighted the dearth of male recombination. The use of pedigreed samples allowed us to distinguish segregating SNPs from putative paralogous sequence variants resulting from the relatively recent genome duplication of salmonid species. Of the segregating SNPs, 50 were linked to the QTL. A subset of these QTL-linked SNPs was converted to a high-throughput assay and genotyped across large commercial populations of IPNV-challenged salmon fry. Several SNPs showed highly significant linkage and association with resistance to IPN, and population linkage-disequilibrium-based SNP tests for resistance were identified.
We used RAD-Seq to successfully identify and characterise high-density genetic markers in pedigreed aquaculture Atlantic salmon. These results underline the effectiveness of RAD-Seq as a tool for rapid and efficient generation of QTL-targeted and genome-wide marker data in a large complex genome, and its possible utility in farmed animal selection programs.
Atlantic salmon; RAD sequencing; Aquaculture; Infectious pancreatic necrosis; Recombination; Single nucleotide polymorphism; Paralogous sequence variant
Anguillicolidae Yamaguti, 1935 is a family of parasitic nematodes infecting freshwater eels of the genus Anguilla, comprising five species in the genera Anguillicola and Anguillicoloides. Anguillicoloides crassus is of particular importance, as it has recently spread from its endemic range in the Eastern Pacific to Europe and North America, where it poses a significant threat to new, naïve hosts such as the economically important eel species Anguilla anguilla and Anguilla rostrata. The Anguillicolidae are therefore all potentially invasive taxa, but the relationships of the described species remain unclear. Anguillicolidae is part of Spirurina, a diverse clade made up of only animal parasites, but placement of the family within Spirurina is based on limited data.
We generated an extensive DNA sequence dataset from three loci (the 5' one-third of the nuclear small subunit ribosomal RNA, the D2-D3 region of the nuclear large subunit ribosomal RNA and the 5' half of the mitochondrial cytochrome c oxidase I gene) for the five species of Anguillicolidae and used this to investigate specific and generic boundaries within the family, and the relationship of Anguillicolidae to other spirurine nematodes. Neither nuclear nor mitochondrial sequences supported monophyly of Anguillicoloides. Genetic diversity within the African species Anguillicoloides papernai was suggestive of cryptic taxa, as was the finding of distinct lineages of Anguillicoloides novaezelandiae in New Zealand and Tasmania. Phylogenetic analysis of the Spirurina grouped the Anguillicolidae together with members of the Gnathostomatidae and Seuratidae.
The Anguillicolidae is part of a complex radiation of parasitic nematodes of vertebrates with wide host diversity (chondrichthyes, teleosts, squamates and mammals), most closely related to other marine vertebrate parasites that also have complex life cycles. Molecular analyses do not support the recent division of Anguillicolidae into two genera. The described species may hide cryptic taxa, identified here by DNA taxonomy, and this DNA barcoding approach may assist in tracking species invasions. The propensity for host switching, and thus the potential for invasive behaviour, is found in A. crassus, A. novaezelandiae and A. papernai, and thus may be common to the group.
Anguillicola; Anguillicoloides; Invasive; Host switch; Cryptic species; DNA-taxonomy; Barcoding
Drug resistance in the malaria parasite Plasmodium falciparum severely compromises the treatment and control of malaria. Knowledge of the critical mutations conferring resistance to particular drugs is important in understanding modes of drug action and mechanisms of resistance. Such knowledge is required to design better therapies and limit drug resistance.
A mutation in the gene (pfcrt) encoding a membrane transporter has been identified as a principal determinant of chloroquine resistance in P. falciparum, but we lack a full account of higher level chloroquine resistance. Furthermore, the determinants of resistance in the other major human malaria parasite, P. vivax, are not known. To address these questions, we investigated the genetic basis of chloroquine resistance in an isogenic lineage of rodent malaria parasite P. chabaudi in which high level resistance to chloroquine has been progressively selected under laboratory conditions.
Loci containing the critical genes were mapped by Linkage Group Selection, using a genetic cross between the high-level chloroquine-resistant mutant and a genetically distinct sensitive strain. A novel high-resolution quantitative whole-genome re-sequencing approach was used to reveal three regions of selection, on chr11, chr03 and chr02, that appear progressively at increasing drug doses. Whole-genome sequencing of the chloroquine-resistant parent identified just four point mutations in different genes on these chromosomes. Three mutations are located at the foci of the selection valleys and are therefore predicted to confer different levels of chloroquine resistance. The critical mutation conferring the first level of chloroquine resistance is found in aat1, a putative amino acid transporter.
Quantitative trait loci conferring selectable phenotypes, such as drug resistance, can be mapped directly using progressive genome-wide linkage group selection. Quantitative genome-wide short-read genome resequencing can be used to reveal these signatures of drug selection at high resolution. The identities of three genes (and mutations within them) conferring different levels of chloroquine resistance generate insights regarding the genetic architecture and mechanisms of resistance to chloroquine and other drugs. Importantly, their orthologues may now be evaluated for critical or accessory roles in chloroquine resistance in human malarias P. vivax and P. falciparum.
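The idea of locating a "selection valley" can be sketched as follows: after drug selection of the cross progeny, the frequency of alleles inherited from the sensitive parent drops around a locus conferring resistance, and the focus of the valley is the position where that frequency is lowest. The code below is a minimal illustration under that assumption, not the authors' analysis; the function names, the window size and the allele frequencies are invented.

```python
def smooth(values, window):
    """Centred moving average; windows are truncated at the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def selection_valley(positions, sensitive_allele_freq, window=3):
    """Return (position, smoothed frequency) at the valley minimum."""
    smoothed = smooth(sensitive_allele_freq, window)
    i = min(range(len(smoothed)), key=smoothed.__getitem__)
    return positions[i], smoothed[i]


# Hypothetical marker positions (kb) and sensitive-parent allele
# frequencies in the drug-selected progeny pool.
positions = [100, 200, 300, 400, 500, 600, 700]
freqs = [0.50, 0.42, 0.20, 0.05, 0.18, 0.40, 0.52]
valley_pos, valley_freq = selection_valley(positions, freqs)
```

Repeating the scan on pools selected at increasing drug doses would, under this model, show additional valleys appearing as further loci come under selection.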
Second-generation sequencing has made possible the sequencing of genomes of interest for even small research groups. However, obtaining separate clean cultures and clonal or inbred samples of metazoan hosts and their bacterial symbionts is often difficult. We present a computational pipeline for separating metazoan and bacterial DNA in silico rather than at the bench. The method relies on the generation of deep coverage of all the genomes in a mixed sample using Illumina short-read sequencing technology, and using aggregate properties of the different genomes to identify read sets belonging to each. This inexpensive and rapid approach has been used to sequence several nematode genomes and their bacterial endosymbionts in the last year in our laboratory and can also be used to visualize and identify unexpected contaminants (or possible symbionts) in genomic DNA samples. We hope that this method will enable researchers studying symbiotic systems to move from gene-centric to genome-centric approaches.
Symbiont; Second-generation sequencing; Genome; Nematode; Illumina
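The in silico separation step can be pictured as two operations: assign each preliminary contig to a bin from its aggregate properties, then collect the reads that map to the contigs in each bin for separate reassembly. The sketch below assumes a simple two-way split on GC with a minimum-coverage filter; the thresholds, bin names and input dictionaries are hypothetical and would need to be chosen per dataset.

```python
from collections import defaultdict


def assign_bin(gc, coverage, gc_split=0.45, min_cov=10.0):
    """Classify a contig from its GC fraction and mean coverage.

    gc_split and min_cov are illustrative thresholds, not recommended values.
    """
    if coverage < min_cov:
        return "low_coverage"
    return "low_gc" if gc < gc_split else "high_gc"


def partition_reads(read_to_contig, contig_stats):
    """Group read identifiers by the bin of the contig they map to.

    read_to_contig : {read_id: contig_name}
    contig_stats   : {contig_name: (gc_fraction, mean_coverage)}
    Reads mapping to contigs without statistics are left 'unassigned'.
    """
    bins = defaultdict(set)
    for read, contig in read_to_contig.items():
        stats = contig_stats.get(contig)
        label = assign_bin(*stats) if stats else "unassigned"
        bins[label].add(read)
    return {label: sorted(reads) for label, reads in bins.items()}


# Toy data, invented for illustration only.
contig_stats = {"c1": (0.30, 60.0), "c2": (0.62, 150.0), "c3": (0.50, 2.0)}
read_to_contig = {
    "r1/1": "c1", "r1/2": "c1", "r2/1": "c2", "r3/1": "c3", "r4/1": "c9",
}
read_bins = partition_reads(read_to_contig, contig_stats)
```

Each read set can then be assembled on its own, so that the assembler sees a single genome at a roughly uniform coverage.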
Some organisms can survive extreme desiccation by entering into a state of suspended animation known as anhydrobiosis. Panagrolaimus superbus is a free-living anhydrobiotic nematode that can survive rapid environmental desiccation. The mechanisms that P. superbus uses to combat the potentially lethal effects of cellular dehydration may include the constitutive and inducible expression of protective molecules, along with behavioural and/or morphological adaptations that slow the rate of cellular water loss. In addition, inducible repair and revival programmes may also be required for successful rehydration and recovery from anhydrobiosis.
To identify constitutively expressed candidate anhydrobiotic genes we obtained 9,216 ESTs from an unstressed mixed stage population of P. superbus. We derived 4,009 unigenes from these ESTs. These unigene annotations and sequences can be accessed at http://www.nematodes.org/nembase4/species_info.php?species=PSC. We manually annotated a set of 187 constitutively expressed candidate anhydrobiotic genes from P. superbus. Notable among those is a putative lineage expansion of the lea (late embryogenesis abundant) gene family. The most abundantly expressed sequence was a member of the nematode specific sxp/ral-2 family that is highly expressed in parasitic nematodes and secreted onto the surface of the nematodes' cuticles. There were 2,059 novel unigenes (51.7% of the total), 149 of which are predicted to encode intrinsically disordered proteins lacking a fixed tertiary structure. One unigene may encode an exo-β-1,3-glucanase (GHF5 family), most similar to a sequence from Phytophthora infestans. GHF5 enzymes have been reported from several species of plant parasitic nematodes, with horizontal gene transfer (HGT) from bacteria proposed to explain their evolutionary origin. This P. superbus sequence represents another possible HGT event within the Nematoda. The expression of five of the 19 putative stress response genes tested was upregulated in response to desiccation. These were the antioxidants glutathione peroxidase, dj-1 and 1-Cys peroxiredoxin, an shsp sequence and an lea gene.
P. superbus appears to utilise a strategy of combined constitutive and inducible gene expression in preparation for entry into anhydrobiosis. The apparent lineage expansion of lea genes, together with their constitutive and inducible expression, suggests that LEA3 proteins are important components of the anhydrobiotic protection repertoire of P. superbus.
The sequencing of the complete genome of the nematode Caenorhabditis elegans was a landmark achievement and ushered in a new era of whole-organism, systems analyses of the biology of this powerful model organism. The success of the C. elegans genome sequencing project also inspired communities working on other organisms to approach genome sequencing of their species. The phylum Nematoda is rich and diverse and of interest to a wide range of research fields from basic biology through ecology and parasitic disease. For all these communities, it is now clear that access to genome scale data will be key to advancing understanding, and in the case of parasites, developing new ways to control or cure diseases. The advent of second-generation sequencing technologies, improvements in computing algorithms and infrastructure, and growth in bioinformatics and genomics literacy are making the addition of genome sequencing to the research goals of any nematode research program a less daunting prospect. To inspire, promote and coordinate genomic sequencing across the diversity of the phylum, we have launched a community wiki and the 959 Nematode Genomes initiative (www.nematodegenomes.org/). Just as the deciphering of the developmental lineage of the 959 cells of the adult hermaphrodite C. elegans was the gateway to broad advances in biomedical science, we hope that a nematode phylogeny with (at least) 959 sequenced species will underpin further advances in understanding the origins of parasitism, the dynamics of genomic change and the adaptations that have made Nematoda one of the most successful animal phyla.
genome; nematode; next-generation sequencing; second-generation sequencing; wiki
Understanding polyphenism, the ability of a single genome to express multiple morphologically and behaviourally distinct phenotypes, is an important goal for evolutionary and developmental biology. Polyphenism has been key to the evolution of the Hymenoptera, and particularly the social Hymenoptera where the genome of a single species regulates distinct larval stages, sexual dimorphism and physical castes within the female sex. Transcriptomic analyses of social Hymenoptera will therefore provide unique insights into how changes in gene expression underlie such complexity. Here we describe gene expression in individual specimens of the pre-adult stages, sexes and castes of the key pollinator, the buff-tailed bumblebee Bombus terrestris.
cDNA was prepared from mRNA from five life cycle stages (one larva, one pupa, one male, one gyne and two workers) and a total of 1,610,742 expressed sequence tags (ESTs) were generated using Roche 454 technology, substantially increasing the sequence data available for this important species. Overlapping ESTs were assembled into 36,354 B. terrestris putative transcripts, and functionally annotated. A preliminary assessment of differences in gene expression across non-replicated specimens from the pre-adult stages, castes and sexes was performed using R-STAT analysis. Individual samples from the life cycle stages of the bumblebee differed in the expression of a wide array of genes, including genes involved in amino acid storage, metabolism, immunity and olfaction.
Detailed analyses of immune and olfaction gene expression across phenotypes demonstrated how transcriptomic analyses can inform our understanding of processes central to the biology of B. terrestris and the social Hymenoptera in general. For example, examination of immunity-related genes identified high conservation of important immunity pathway components across individual specimens from the life cycle stages while olfactory-related genes exhibited differential expression with a wider repertoire of gene expression within adults, especially sexuals, in comparison to immature stages. As there is an absence of replication across the samples, the results of this study are preliminary but provide a number of candidate genes which may be related to distinct phenotypic stage expression. This comprehensive transcriptome catalogue will provide an important gene discovery resource for directed programmes in ecology, evolution and conservation of a key pollinator.
Next-generation sequencing technologies are making a substantial impact on many areas of biology, including the analysis of genetic diversity in populations. However, genome-scale population genetic studies have been accessible only to well-funded model systems. Restriction-site associated DNA sequencing, a method that samples at reduced complexity across target genomes, promises to deliver high resolution population genomic data—thousands of sequenced markers across many individuals—for any organism at reasonable costs. It has found application in wild populations and non-traditional study species, and promises to become an important technology for ecological population genomics.
RADSeq; population genetics; next-generation sequencing; genetic marker discovery; SNP discovery
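As a rough illustration of what sampling "at reduced complexity" means, the following Python sketch locates every occurrence of a restriction enzyme recognition site in a sequence and collects the short flanking stretches that RAD-style sequencing would read. The choice of the SbfI site (CCTGCAGG), the tag length, the toy sequence, and the function names rad_tags and sampled_fraction are illustrative assumptions, not details taken from the abstract above. Real protocols read much longer tags and involve library preparation steps that are not modelled here.

```python
import re


def rad_tags(genome, site="CCTGCAGG", tag_len=10):
    """Return (position, upstream_tag, downstream_tag) for each occurrence of `site`.

    `site` defaults to the SbfI recognition sequence; this is an assumed,
    illustrative choice of enzyme.
    """
    genome = genome.upper()
    tags = []
    # A lookahead pattern reports overlapping occurrences of the site as well.
    for match in re.finditer(f"(?={re.escape(site)})", genome):
        start = match.start()
        end = start + len(site)
        upstream = genome[max(0, start - tag_len):start]
        downstream = genome[end:end + tag_len]
        tags.append((start, upstream, downstream))
    return tags


def sampled_fraction(genome, site="CCTGCAGG", tag_len=10):
    """Fraction of bases that fall within a site or one of its flanking tags."""
    if not genome:
        return 0.0
    covered = [False] * len(genome)
    for start, _, _ in rad_tags(genome, site, tag_len):
        lo = max(0, start - tag_len)
        hi = min(len(genome), start + len(site) + tag_len)
        for i in range(lo, hi):
            covered[i] = True
    return sum(covered) / len(genome)


# Toy sequence (hypothetical): two SbfI sites separated by filler bases.
toy = "A" * 20 + "CCTGCAGG" + "T" * 20 + "CCTGCAGG" + "C" * 5
for position, up, down in rad_tags(toy):
    print(position, up, down)
print(round(sampled_fraction(toy), 3))
```

In a real genome the recognition sites are sparse, so the sampled fraction is a small part of the total sequence, and the same sites recur across individuals; that recurrence is what allows the flanking tags to serve as comparable markers.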
Genome sequencing has been democratized by second-generation technologies, and even small labs can now sequence metazoan genomes. In this article, we describe ‘959 Nematode Genomes’, a community-curated semantic wiki to coordinate the sequencing efforts of individual labs to collectively sequence 959 genomes spanning the phylum Nematoda. The main goal of the wiki is to track sequencing projects that have been proposed, are in progress, or have been completed. Wiki pages for species and strains are linked to pages for people and organizations, using machine- and human-readable metadata that users can query to see the status of their favourite worm. The site is based on the same platform that runs Wikipedia, with semantic extensions that allow the underlying taxonomy and data storage models to be maintained and updated more easily than in a conventional database-driven web site. The wiki also provides a way to track and share preliminary data if those data are not polished enough to be submitted to the official sequence repositories. In just over a year, this wiki has already fostered new international collaborations and attracted newcomers to the enthusiastic community of nematode genomicists. The wiki is available at www.nematodegenomes.org.