
Related Articles

1.  MitoCOGs: clusters of orthologous genes from mitochondria and implications for the evolution of eukaryotes 
BMC Evolutionary Biology  2014;14(1):237.
Mitochondria are ubiquitous membranous organelles of eukaryotic cells that evolved from an alpha-proteobacterial endosymbiont and possess a small genome that encompasses from 3 to 106 genes. Accumulation of thousands of mitochondrial genomes from diverse groups of eukaryotes provides an opportunity for a comprehensive reconstruction of the evolution of the mitochondrial gene repertoire.
Clusters of orthologous mitochondrial protein-coding genes (MitoCOGs) were constructed from all available mitochondrial genomes and complemented with nuclear orthologs of mitochondrial genes. With minimal exceptions, the mitochondrial gene complements of eukaryotes are subsets of the superset of 66 genes found in jakobids. Reconstruction of the evolution of mitochondrial genomes indicates that the mitochondrial gene set of the last common ancestor of the extant eukaryotes was slightly larger than that of jakobids. This superset of mitochondrial genes likely represents an intermediate stage following the loss and transfer to the nucleus of most of the endosymbiont genes early in eukaryote evolution. Subsequent evolution in different lineages involved largely parallel transfer of ancestral endosymbiont genes to the nuclear genome. The intron density in nuclear orthologs of mitochondrial genes typically is nearly the same as in the rest of the genes in the respective genomes. However, in land plants, the intron density in nuclear orthologs of mitochondrial genes is almost 1.5-fold lower than the genomic mean, suggestive of ongoing transfer of functional genes from mitochondria to the nucleus.
The MitoCOGs are expected to become an important resource for the study of mitochondrial evolution. The nearly complete superset of mitochondrial genes in jakobids likely represents an intermediate stage in the evolution of eukaryotes after the initial, extensive loss and transfer of the endosymbiont genes. In addition, the bacterial multi-subunit RNA polymerase that is encoded in the jakobid mitochondrial genomes was replaced by a single-subunit phage-type RNA polymerase in the rest of the eukaryotes. These results are best compatible with the rooting of the eukaryotic tree between jakobids and the rest of the eukaryotes. The land plants are the only eukaryotic branch in which the gene transfer from the mitochondrial to the nuclear genome appears to be an active, ongoing process.
Electronic supplementary material
The online version of this article (doi:10.1186/s12862-014-0237-5) contains supplementary material, which is available to authorized users.
PMCID: PMC4256733  PMID: 25421434
Mitochondria; Genome evolution; Gene loss; Gene transfer; Introns; Clusters of orthologous genes
2.  Metabolic modeling of endosymbiont genome reduction on a temporal scale 
This study explores the order in which individual metabolic genes are lost in an in silico evolutionary process leading from the metabolic network of Escherichia coli to that of the genome-reduced endosymbiont Buchnera aphidicola.
Simulating the reductive evolutionary process under several growth conditions yields a remarkable correlation between in silico and phylogenetically reconstructed gene loss times. A gene's k-robustness (its depth of backups) is a prime determinant of its loss time. In silico gene loss time is a better predictor of actual loss time than genomic features and network properties. Simulating the reductive evolutionary process by the loss of large blocks followed by single-gene deletions, as known to occur in evolution, yields a remarkable correspondence with the phylogenetic reconstruction and the block loss reported in the literature.
An open fundamental challenge in Systems Biology is whether a genome-scale model can predict patterns of genome evolution by realistically accounting for the associated biochemical constraints. In this study, we explore the order in which individual genes are lost in an in silico evolutionary process, leading from the metabolic network of Escherichia coli to that of the endosymbiont Buchnera aphidicola.
To evaluate the in silico gene loss time, we repeated the reductive evolutionary process introduced by Pál et al (2006), denoting the in silico deletion time of a gene in a single run of the reductive evolutionary process as the number of genes deleted before its own deletion occurred. By comparing the in silico evaluations of the gene loss time to that obtained by a phylogenetic reconstruction (Figure 1), we could evaluate the ability of an in silico process to predict temporal patterns of genome reduction. Applying this procedure on a literature-based viable media, we obtained a mean Spearman's correlation of 0.46 (53% of the maximal correlation, empirical P-value <9.9e−4) between in silico and phylogenetically reconstructed loss times. In order to provide an upper bound on evolutionary necessity stemming from metabolic constraints, we searched the space of potential growth media and biomass functions via a simulated annealing search algorithm aimed at identifying an environment/biomass function that maximizes the target correlation between in silico and reconstructed loss times. Simulating the reductive evolutionary process under the growth conditions and biomass function obtained in this process, we managed to improve the correlation between in silico and reconstructed loss times to a mean Spearman's correlation of 0.54 (63% of the maximal correlation, empirical P-value <9.9e−4, Figure 3).
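The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names are hypothetical, each run is assumed to be supplied as a complete ordered list of deleted genes, and the viability checks against the metabolic model that decide the order in the actual study are not reproduced here. Spearman's correlation is computed by hand (Pearson correlation of average ranks) so the block has no dependencies.

```python
def deletion_times(order):
    """In silico deletion time of each gene in one run:
    the number of genes deleted before it."""
    return {gene: i for i, gene in enumerate(order)}


def mean_deletion_times(runs):
    """Average the deletion time of each gene over several runs.
    Assumes every run lists the same set of genes (an assumption of this sketch)."""
    totals = {gene: 0.0 for gene in runs[0]}
    for order in runs:
        for gene, t in deletion_times(order).items():
            totals[gene] += t
    return {gene: total / len(runs) for gene, total in totals.items()}


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    indexed = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(indexed):
        j = i
        while j + 1 < len(indexed) and values[indexed[j + 1]] == values[indexed[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[indexed[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

Given mean in silico deletion times and a list of phylogenetically reconstructed loss times for the same genes, `spearman` would return the kind of correlation reported in the abstract; the gene names and values used to exercise it are entirely up to the reader.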
Examining the dependency of the predicted loss time of each gene on its intrinsic network-level properties we find a very strong inverse Spearman's correlation of −0.84 (empirical P-value <9.9e−4) between the order of gene loss predicted in silico and the k-robustness levels of the genes, the latter denoting the depth of their functional backups in the network (Deutscher et al, 2006). Moreover, in order to examine whether the relative loss time of a gene is influenced by its functional dependencies with other genes, we performed a flux-coupling analysis and identified pairs of reactions whose activities asymmetrically depend on each other, i.e., are directionally coupled (Burgard et al, 2004). We find that genes encoding reactions whose activity is needed for activating the other reaction (and not vice versa) have a tendency to be lost later, as one would expect (binomial P-value <1e−14).
To assess the scale of these results, we examined as a control how well genomic features and network properties predict the phylogenetically reconstructed gene loss times. We examined the dependency of the latter on several factors that are known to be inversely correlated with the propensity of a gene to be lost (Brinza et al, 2009; Delmotte et al, 2006; Tamames et al, 2007), including the genes' mRNA levels, tAI values (Covert et al, 2004; Reis et al, 2004; Sharp and Li, 1987; Tuller et al, 2010a) and the number of partners the gene products have in a protein–protein interaction network. Remarkably, these genomic features yield considerably lower Spearman's correlation than that obtained by the in silico simulations. Moreover, multiply regressing the loss times from the phylogenetic reconstruction on the in silico gene loss time predictions and the genomic and network variables, we found that the (normalized) coefficient of the in silico predictions in the regression is much higher than those of the genomic features, further testifying to the considerable independent predictive power of the metabolic model.
Finally, simulating the evolutionary process as large block deletions at first followed by single-gene deletions as is thought to occur in evolution (Moran and Mira, 2001; van Ham et al, 2003), a remarkable correspondence with the phylogenetic reconstruction was found. Namely, we find that after a certain amount of genes are deleted from the genome, no further block deletions can occur due to the increasing density of essential genes. Notably, the maximum amount of genes that can be deleted in blocks (i.e., until no more blocks can be deleted) corresponds to the number of genes appearing in our phylogenetic reconstruction from the LCA (last common ancestor of Buchnera and E. coli) to the LCSA (last common symbiotic ancestor, nodes 1–3 in Figure 1A), as described in the literature.
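The two-phase process described above (block deletions until none is possible, then single-gene deletions) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the published procedure: `reduce_genome`, `block_size`, and the `is_viable` callback are hypothetical names, the genome is modelled as a simple ordered list, and viability is an arbitrary user-supplied predicate standing in for the metabolic model.

```python
import random


def reduce_genome(genome, is_viable, block_size, rng):
    """Delete contiguous blocks while any viable block deletion exists,
    then delete single genes until no further deletion is viable."""
    genome = list(genome)

    # Phase 1: contiguous block deletions, tried in random order.
    deleted_in_blocks = []
    while True:
        starts = list(range(0, len(genome) - block_size + 1))
        rng.shuffle(starts)
        for s in starts:
            candidate = genome[:s] + genome[s + block_size:]
            if is_viable(candidate):
                deleted_in_blocks.extend(genome[s:s + block_size])
                genome = candidate
                break
        else:
            break  # no block can be removed any more

    # Phase 2: single-gene deletions, tried in random order.
    deleted_singly = []
    progress = True
    while progress:
        progress = False
        positions = list(range(len(genome)))
        rng.shuffle(positions)
        for i in positions:
            candidate = genome[:i] + genome[i + 1:]
            if is_viable(candidate):
                deleted_singly.append(genome[i])
                genome = candidate
                progress = True
                break

    return genome, deleted_in_blocks, deleted_singly


def toy_viable(genes):
    """Toy rule (an assumption): 'e1' is essential and 'b1'/'b2' back each other up."""
    present = set(genes)
    return "e1" in present and bool(present & {"b1", "b2"})


toy_genome = ["e1", "x1", "x2", "b1", "x3", "x4", "b2", "x5"]
final, in_blocks, singly = reduce_genome(toy_genome, toy_viable, 2, random.Random(0))
```

In this toy setting, once one of the two backup genes is lost the other becomes essential, which mirrors in miniature the increasing density of essential genes that eventually prevents further block deletions.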
A fundamental challenge in Systems Biology is whether a cell-scale metabolic model can predict patterns of genome evolution by realistically accounting for associated biochemical constraints. Here, we study the order in which genes are lost in an in silico evolutionary process, leading from the metabolic network of Escherichia coli to that of the endosymbiont Buchnera aphidicola. We examine how this order correlates with the order by which the genes were actually lost, as estimated from a phylogenetic reconstruction. By optimizing this correlation across the space of potential growth and biomass conditions, we compute an upper bound estimate on the model's prediction accuracy (R=0.54). The model's network-based predictive ability outperforms predictions obtained using genomic features of individual genes, reflecting the effect of selection imposed by metabolic stoichiometric constraints. Thus, while the timing of gene loss might be expected to be a completely stochastic evolutionary process, remarkably, we find that metabolic considerations, on their own, make a marked 40% contribution to determining when such losses occur.
PMCID: PMC3094061  PMID: 21451589
constraint-based modeling; endosymbiont; evolution; metabolism
3.  Genome Evolution and Phylogenomic Analysis of Candidatus Kinetoplastibacterium, the Betaproteobacterial Endosymbionts of Strigomonas and Angomonas 
Genome Biology and Evolution  2013;5(2):338-350.
It has long been known that insect-infecting trypanosomatid flagellates from the genera Angomonas and Strigomonas harbor bacterial endosymbionts (Candidatus Kinetoplastibacterium or TPE [trypanosomatid proteobacterial endosymbiont]) that supplement the host metabolism. Based on previous analyses of other bacterial endosymbiont genomes from other lineages, a stereotypical path of genome evolution in such bacteria over the duration of their association with the eukaryotic host has been characterized. In this work, we sequence and analyze the genomes of five TPEs, perform their metabolic reconstruction, carry out an extensive phylogenomic analysis with all available Betaproteobacteria, and compare the TPEs with their nearest betaproteobacterial relatives. We also identify a number of housekeeping and central metabolism genes that seem to have undergone positive selection. Our genome structure analyses show total synteny among the five TPEs despite millions of years of divergence, and that this lineage follows the common path of genome evolution observed in other endosymbionts of diverse ancestries. As previously suggested by cell biology and biochemistry experiments, Ca. Kinetoplastibacterium spp. preferentially maintain those genes necessary for the biosynthesis of compounds needed by their hosts. We have also shown that metabolic and informational genes related to the cooperation with the host are overrepresented amongst genes shown to be under positive selection. Finally, our phylogenomic analysis shows that, while being in the Alcaligenaceae family of Betaproteobacteria, the closest relatives of these endosymbionts are not in the genus Bordetella as previously reported, but more likely in the Taylorella genus.
PMCID: PMC3590767  PMID: 23345457
endosymbiont biology; phylogenomics; comparative genomics; Trypanosomatidae; selective pressure
4.  The Wolbachia Genome of Brugia malayi: Endosymbiont Evolution within a Human Pathogenic Nematode 
PLoS Biology  2005;3(4):e121.
Complete genome DNA sequence and analysis is presented for Wolbachia, the obligate alpha-proteobacterial endosymbiont required for fertility and survival of the human filarial parasitic nematode Brugia malayi. Although, quantitatively, the genome is even more degraded than those of closely related Rickettsia species, Wolbachia has retained more intact metabolic pathways. The ability to provide riboflavin, flavin adenine dinucleotide, heme, and nucleotides is likely to be Wolbachia's principal contribution to the mutualistic relationship, whereas the host nematode likely supplies amino acids required for Wolbachia growth. Genome comparison of the Wolbachia endosymbiont of B. malayi (wBm) with the Wolbachia endosymbiont of Drosophila melanogaster (wMel) shows that they share similar metabolic trends, although their genomes show a high degree of genome shuffling. In contrast to wMel, wBm contains no prophage and has a reduced level of repeated DNA. Both Wolbachia have lost a considerable number of membrane biogenesis genes that apparently make them unable to synthesize lipid A, the usual component of proteobacterial membranes. However, differences in their peptidoglycan structures may reflect the mutualistic lifestyle of wBm in contrast to the parasitic lifestyle of wMel. The smaller genome size of wBm, relative to wMel, may reflect the loss of genes required for infecting host cells and avoiding host defense systems. Analysis of this first sequenced endosymbiont genome from a filarial nematode provides insight into endosymbiont evolution and additionally provides new potential targets for elimination of cutaneous and lymphatic human filarial disease.
Analysis of this Wolbachia genome, which resides within filarial parasites, offers insight into endosymbiont evolution and the promise of new strategies for the elimination of human filarial disease
PMCID: PMC1069646  PMID: 15780005
5.  Evolutionary Convergence and Nitrogen Metabolism in Blattabacterium strain Bge, Primary Endosymbiont of the Cockroach Blattella germanica 
PLoS Genetics  2009;5(11):e1000721.
Bacterial endosymbionts of insects play a central role in upgrading the diet of their hosts. In certain cases, such as aphids and tsetse flies, endosymbionts complement the metabolic capacity of hosts living on nutrient-deficient diets, while the bacteria harbored by omnivorous carpenter ants are involved in nitrogen recycling. In this study, we describe the genome sequence and inferred metabolism of Blattabacterium strain Bge, the primary Flavobacteria endosymbiont of the omnivorous German cockroach Blattella germanica. Through comparative genomics with other insect endosymbionts and free-living Flavobacteria we reveal that Blattabacterium strain Bge shares the same distribution of functional gene categories only with Blochmannia strains, the primary Gamma-Proteobacteria endosymbiont of carpenter ants. This is a remarkable example of evolutionary convergence during the symbiotic process, involving very distant phylogenetic bacterial taxa within hosts feeding on similar diets. Despite this similarity, different nitrogen economy strategies have emerged in each case. Both bacterial endosymbionts code for urease but display different metabolic functions: Blochmannia strains produce ammonia from dietary urea and then use it as a source of nitrogen, whereas Blattabacterium strain Bge codes for the complete urea cycle that, in combination with urease, produces ammonia as an end product. Not only does the cockroach endosymbiont play an essential role in nutrient supply to the host, but also in the catabolic use of amino acids and nitrogen excretion, as strongly suggested by the stoichiometric analysis of the inferred metabolic network. Here, we explain the metabolic reasons underlying the enigmatic return of cockroaches to the ancestral ammonotelic state.
Author Summary
Bacterial endosymbionts from insects are subjected to a process of genome reduction from the moment they interact with their host, especially when the symbiosis is strict (the partners live together permanently) and the endosymbiont is maternally inherited. The type of genes that are retained correlates with specific metabolic host requirements. Here, we report the genome sequence of Blattabacterium strain Bge, the primary endosymbiont of the German cockroach B. germanica. Cockroaches are omnivorous insects and Blattabacterium cooperates with their metabolism, not only with essential nutrient metabolism but also through an efficient use of amino acids and the nitrogen excretion by the combination of a urea cycle and urease activity. The repertoires of functions that are maintained in Blattabacterium are similar to those already observed in Blochmannia spp., the primary endosymbiont of carpenter ants, also an omnivorous insect. This constitutes a nice example of evolutionary convergence of two endosymbionts belonging to very different bacterial phyla that have evolved a similar repertoire of functions according to the host. However, the current set of genes and, more importantly, those that were lost in the process of genome reduction in both endosymbiont lineages have also contributed to a different involvement of Blattabacterium and Blochmannia in nitrogen metabolism.
PMCID: PMC2768785  PMID: 19911043
6.  The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate? 
Biology Direct  2006;1:22.
Ever since the discovery of 'genes in pieces' and mRNA splicing in eukaryotes, origin and evolution of spliceosomal introns have been considered within the conceptual framework of the 'introns early' versus 'introns late' debate. The 'introns early' hypothesis, which is closely linked to the so-called exon theory of gene evolution, posits that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. Under this scenario, the absence of spliceosomal introns in prokaryotes is considered to be a result of "genome streamlining". The 'introns late' hypothesis counters that spliceosomal introns emerged only in eukaryotes, and moreover, have been inserted into protein-coding genes continuously throughout the evolution of eukaryotes. Beyond the formal dilemma, the more substantial side of this debate has to do with possible roles of introns in the evolution of eukaryotes.
I argue that several lines of evidence now suggest a coherent solution to the introns-early versus introns-late debate, and the emerging picture of intron evolution integrates aspects of both views although, formally, there seems to be no support for the original version of introns-early. Firstly, there is growing evidence that spliceosomal introns evolved from group II self-splicing introns which are present, usually, in small numbers, in many bacteria, and probably, moved into the evolving eukaryotic genome from the α-proteobacterial progenitor of the mitochondria. Secondly, the concept of a primordial pool of 'virus-like' genetic elements implies that self-splicing introns are among the most ancient genetic entities. Thirdly, reconstructions of the ancestral state of eukaryotic genes suggest that the last common ancestor of extant eukaryotes had an intron-rich genome. Thus, it appears that ancestors of spliceosomal introns, indeed, have existed since the earliest stages of life's evolution, in a formal agreement with the introns-early scenario. However, there is no evidence that these ancient introns ever became widespread before the emergence of eukaryotes, hence, the central tenet of introns-early, the role of introns in early evolution of proteins, has no support. However, the demonstration that numerous introns invaded eukaryotic genes at the outset of eukaryotic evolution and that subsequent intron gain has been limited in many eukaryotic lineages implicates introns as an ancestral feature of eukaryotic genomes and refutes radical versions of introns-late. Perhaps, most importantly, I argue that the intron invasion triggered other pivotal events of eukaryogenesis, including the emergence of the spliceosome, the nucleus, the linear chromosomes, the telomerase, and the ubiquitin signaling system. This concept of eukaryogenesis, in a sense, revives some tenets of the exon hypothesis, by assigning to introns crucial roles in eukaryotic evolutionary innovation.
The scenario of the origin and evolution of introns that is best compatible with the results of comparative genomics and theoretical considerations goes as follows: self-splicing introns since the earliest stages of life's evolution – numerous spliceosomal introns invading genes of the emerging eukaryote during eukaryogenesis – subsequent lineage-specific loss and gain of introns. The intron invasion, probably, spawned by the mitochondrial endosymbiont, might have critically contributed to the emergence of the principal features of the eukaryotic cell. This scenario combines aspects of the introns-early and introns-late views.
This article was reviewed by W. Ford Doolittle, James Darnell (nominated by W. Ford Doolittle), William Martin, and Anthony Poole.
PMCID: PMC1570339  PMID: 16907971
7.  Endosymbiotic associations within protists 
The establishment of an endosymbiotic relationship typically seems to be driven through complementation of the host's limited metabolic capabilities by the biochemical versatility of the endosymbiont. The most significant examples of endosymbiosis are represented by the endosymbiotic acquisition of plastids and mitochondria, introducing photosynthesis and respiration to eukaryotes. However, there are numerous other endosymbioses that evolved more recently and repeatedly across the tree of life. Recent advances in genome sequencing technology have led to a better understanding of the physiological basis of many endosymbiotic associations. This review focuses on endosymbionts in protists (unicellular eukaryotes). Selected examples illustrate the incorporation of various new biochemical functions, such as photosynthesis, nitrogen fixation and recycling, and methanogenesis, into protist hosts by prokaryotic endosymbionts. Furthermore, photosynthetic eukaryotic endosymbionts display a great diversity of modes of integration into different protist hosts.
In conclusion, endosymbiosis seems to represent a general evolutionary strategy of protists to acquire novel biochemical functions and is thus an important source of genetic innovation.
PMCID: PMC2817226  PMID: 20124339
metabolic complementation; prokaryotic endosymbionts; eukaryotic endosymbionts; evolution; integration
8.  Origin and evolution of spliceosomal introns 
Biology Direct  2012;7:11.
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’, held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes.
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
PMCID: PMC3488318  PMID: 22507701
Intron sliding; Intron gain; Intron loss; Spliceosome; Splicing signals; Evolution of exon/intron structure; Alternative splicing; Phylogenetic trees; Mobile domains; Eukaryotic ancestor
9.  Diverse Phage-Encoded Toxins in a Protective Insect Endosymbiont
Applied and Environmental Microbiology  2008;74(21):6782-6791.
The lysogenic bacteriophage APSE infects “Candidatus Hamiltonella defensa,” a facultative endosymbiont of aphids and other sap-feeding insects. This endosymbiont has established a beneficial association with aphids, increasing survivorship following attack by parasitoid wasps. Although APSE and “Ca. Hamiltonella defensa” are effectively maternally transmitted between aphid generations, they can also be horizontally transferred among insect hosts, which results in genetically distinct “Ca. Hamiltonella defensa” strains infecting the same aphid species and sporadic distributions of both APSE and “Ca. Hamiltonella defensa” among hosts. Aphids infected only with “Ca. Hamiltonella defensa” have significantly less protection than those infected with both “Ca. Hamiltonella defensa” and APSE. This protection has been proposed to be connected to eukaryote-targeted toxins previously discovered in the genomes of two characterized APSE strains. In this study, we have sequenced partial genomes from seven additional APSE strains to address the evolution and extent of toxin variation in this phage. The APSE lysis region has been a hot spot for nonhomologous recombination of novel virulence cassettes. We identified four new toxins from three protein families, Shiga-like toxin, cytolethal distending toxin, and YD-repeat toxins. These recombination events have also resulted in reassortment of the downstream lysozyme and holin genes. Analysis of the conserved APSE genes flanking the variable toxin cassettes reveals a close phylogenetic association with phage sequences from two other facultative endosymbionts of insects. Thus, phage may act as a conduit for ongoing gene exchange among heritable endosymbionts.
PMCID: PMC2576707  PMID: 18791000
10.  Possible import routes of proteins into the cyanobacterial endosymbionts/plastids of Paulinella chromatophora 
Theory in Biosciences  2011;131(1):1-18.
The rhizarian amoeba Paulinella chromatophora harbors two photosynthetically active and deeply integrated cyanobacterial endosymbionts acquired ~60 million years ago. Recent genomic analyses of P. chromatophora have revealed the loss of many essential genes from the endosymbiont’s genome, and have identified more than 30 genes that have been transferred to the host cell’s nucleus through endosymbiotic gene transfer (EGT). This indicates that, similar to classical primary plastids, Paulinella endosymbionts have evolved a transport system to import their nuclear-encoded proteins. To deduce how these proteins are transported, we searched for potential targeting signals in genes for 10 EGT-derived proteins. Our analyses indicate that five proteins carry potential signal peptides, implying they are targeted via the host endomembrane system. One sequence encodes a mitochondrial-like transit peptide, which suggests an import pathway involving a channel protein residing in the outer membrane of the endosymbiont. No N-terminal targeting signals were identified in the four other genes, but their encoded proteins could utilize non-classical targeting signals contained internally or in C-terminal regions. Several amino acids more often found in the Paulinella EGT-derived proteins than in their ancestral set (proteins still encoded in the endosymbiont genome) could constitute such signals. Characteristic features of the EGT-derived proteins are low molecular weight and nearly neutral charge, which both could be adaptations to enhance passage through the peptidoglycan wall present in the intermembrane space of the endosymbiont’s envelope. Our results suggest that Paulinella endosymbionts/plastids have evolved several different import routes, as has been shown in classical primary plastids.
PMCID: PMC3334493  PMID: 22209953
Paulinella chromatophora; Endosymbiosis; Plastid; Pre-sequence; Targeting signal; Endosymbiotic gene transfer; Life Sciences; Philosophy of Biology; Theoretical Ecology/Statistics; Bioinformatics; Statistical Physics, Dynamical Systems and Complexity; Mathematical and Computational Biology; Evolutionary Biology
11.  The process of genome shrinkage in the obligate symbiont Buchnera aphidicola 
Genome Biology  2001;2(12):research0054.1-research0054.12.
Very small genomes have evolved repeatedly in eubacterial lineages that have adopted obligate associations with eukaryotic hosts. Complete genome sequences have revealed that small genomes retain very different gene sets, raising the question of how final genome content is determined. To examine the process of genome reduction, the tiny genome of the endosymbiont Buchnera aphidicola was compared to the larger ancestral genome, reconstructed on the basis of the phylogenetic distribution of gene orthologs among fully sequenced relatives of Escherichia coli and Buchnera.
The reconstructed ancestral genome contained 2,425 open reading frames (ORFs). The Buchnera genome, containing 564 ORFs, consists of 153 fragments of 1-34 genes that are syntenic with reconstructed ancestral regions. On the basis of this reconstruction, 503 genes were eliminated within syntenic fragments, and 1,403 genes were lost from the gaps between syntenic fragments, probably in connection with genome rearrangements. Lost regions are sometimes large, and often span functionally unrelated genes. In addition, individual genes and regulatory regions have been lost or eroded. For the categories of DNA repair genes and rRNA genes, most lost loci fall in regions between syntenic fragments. This history of gene loss is reflected in the sequences of intergenic spacers at positions where genes were once present.
The most plausible interpretation of this reconstruction is that Buchnera lost many genes through the fixation of large deletions soon after the acquisition of an obligate endosymbiotic lifestyle. An implication is that final genome composition may be partly the chance outcome of initial deletions and that neighboring genes influence the likelihood of loss of particular genes and pathways.
PMCID: PMC64839  PMID: 11790257
12.  The cyanobacterial endosymbiont of the unicellular algae Rhopalodia gibba shows reductive genome evolution 
Bacteria occur in facultative association and intracellular symbiosis with a diversity of eukaryotic hosts. Recently, we have helped to characterise an intracellular nitrogen fixing bacterium, the so-called spheroid body, located within the diatom Rhopalodia gibba. Spheroid bodies are of cyanobacterial origin and exhibit features that suggest physiological adaptation to their intracellular life style. To investigate the genome modifications that have accompanied the process of endosymbiosis, here we compare gene structure, content and organisation in spheroid body and cyanobacterial genomes.
Comparison of the spheroid body's genome sequence with corresponding regions of near free-living relatives indicates that multiple modifications have occurred in the endosymbiont's genome. These include localised changes that have led to elimination of some genes. This gene loss has been accompanied either by deletion of the respective DNA region or replacement with non-coding DNA that is AT rich in composition. In addition, genome modifications have led to the fusion and truncation of genes. We also report that in the spheroid body's genome there is an accumulation of deleterious mutations in genes for cell wall biosynthesis and processes controlled by transposases. Interestingly, the formation of pseudogenes in the spheroid body has occurred in the presence of intact, and presumably functional, recA and recF genes. This is in contrast to the situation in most investigated obligate intracellular bacterium-eukaryote symbioses, where at least either recA or recF has been eliminated.
Our analyses suggest highly specific targeting/loss of individual genes during the process of genome reduction and establishment of a cyanobacterial endosymbiont inside a eukaryotic cell. Our findings confirm, at the genome level, earlier speculation on the obligate intracellular status of the spheroid body in Rhopalodia gibba. This association is the first example of an obligate cyanobacterial symbiosis involving nitrogen fixation for which genomic data are available. It represents a new model system to study molecular adaptations of genome evolution that accompany a switch from free-living to intracellular existence.
PMCID: PMC2246100  PMID: 18226230
13.  Metabolic Networks of Sodalis glossinidius: A Systems Biology Approach to Reductive Evolution 
PLoS ONE  2012;7(1):e30652.
Genome reduction is a common evolutionary process affecting bacterial lineages that establish symbiotic or pathogenic associations with eukaryotic hosts. Such associations yield highly reduced genomes with greatly streamlined metabolic abilities shaped by the type of ecological association with the host. Sodalis glossinidius, the secondary endosymbiont of tsetse flies, represents one of the few complete genomes available of a bacterium at the initial stages of this process. In the present study, genome reduction is studied from a systems biology perspective through the reconstruction and functional analysis of genome-scale metabolic networks of S. glossinidius.
The functional profile of ancestral and extant metabolic networks sheds light on the evolutionary events underlying transition to a host-dependent lifestyle. Meanwhile, reductive evolution simulations on the extant metabolic network can predict possible future evolution of S. glossinidius in the context of genome reduction. Finally, knockout simulations in different metabolic systems reveal a gradual decrease in network robustness to different mutational events for bacterial endosymbionts at different stages of the symbiotic association.
Stoichiometric analysis reveals few gene inactivation events whose effects on the functionality of S. glossinidius metabolic systems are drastic enough to account for the ecological transition from a free-living to host-dependent lifestyle. The decrease in network robustness across different metabolic systems may be associated with the progressive integration in the more stable environment provided by the insect host. Finally, reductive evolution simulations reveal the strong influence that external conditions exert on the evolvability of metabolic systems.
PMCID: PMC3265509  PMID: 22292008
14.  Polyploidy of Endosymbiotically Derived Genomes in Complex Algae 
Genome Biology and Evolution  2014;6(4):974-980.
Chlorarachniophyte and cryptophyte algae have complex plastids that were acquired by the uptake of a green or red algal endosymbiont via secondary endosymbiosis. The plastid is surrounded by four membranes, and a relict nucleus, called the nucleomorph, remains in the periplastidal compartment that is the remnant cytoplasm of the endosymbiont. Thus, these two algae possess four different genomes in a cell: Nuclear, nucleomorph, plastid, and mitochondrial. Recently, sequencing of the nuclear genomes of the chlorarachniophyte Bigelowiella natans and the cryptophyte Guillardia theta has been completed, and all four genomes have been made available. However, the copy number of each genome has never been investigated. It is important to know the actual DNA content of each genome, especially the highly reduced nucleomorph genome, for studies on genome evolution. In this study, we calculated genomic copy numbers in B. natans and G. theta using a real-time quantitative polymerase chain reaction approach. The nuclear genomes were haploid in both species, whereas the nucleomorph genomes were estimated to be diploid and tetraploid, respectively. Mitochondria and plastids contained a large copy number of genomic DNA in each cell. In the secondary endosymbioses of chlorarachniophytes and cryptophytes, the endosymbiont nuclear genomes were highly reduced in size and in the number of coding genes, whereas the chromosomal copy number was increased, as in bacterial endosymbiont genomes. This suggests that polyploidization is a general characteristic of highly reduced genomes in broad prokaryotic and eukaryotic endosymbionts.
PMCID: PMC4007541  PMID: 24709562
chlorarachniophyte; cryptophyte; endosymbiosis; nucleomorph; plastid
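The abstract above says only that genomic copy numbers were calculated with a real-time quantitative PCR approach. One common way to do this is absolute quantification against a standard curve. The sketch below assumes that method; the function names, the standard dilutions, the Ct values and the cell count are all hypothetical and are not taken from the paper.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to get template copies in the reaction."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_cell(ct, slope, intercept, cells_in_reaction):
    """Genome copies per cell, given the number of cells that supplied the template."""
    return copies_from_ct(ct, slope, intercept) / cells_in_reaction


if __name__ == "__main__":
    # Hypothetical dilution series: 10^3 to 10^6 copies of a plasmid standard.
    standards_log10 = [3.0, 4.0, 5.0, 6.0]
    standards_ct = [30.0, 26.68, 23.36, 20.04]
    slope, intercept = fit_standard_curve(standards_log10, standards_ct)
    # Hypothetical sample: Ct of 26.68 from DNA extracted from 5,000 cells.
    print(copies_per_cell(26.68, slope, intercept, cells_in_reaction=5000))
```

Under this reading, a result near 1 copy per cell would correspond to a haploid genome, near 2 to a diploid one, and near 4 to a tetraploid one.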
15.  Energetics and genetics across the prokaryote-eukaryote divide 
Biology Direct  2011;6:35.
All complex life on Earth is eukaryotic. All eukaryotic cells share a common ancestor that arose just once in four billion years of evolution. Prokaryotes show no tendency to evolve greater morphological complexity, despite their metabolic virtuosity. Here I argue that the eukaryotic cell originated in a unique prokaryotic endosymbiosis, a singular event that transformed the selection pressures acting on both host and endosymbiont.
The reductive evolution and specialisation of endosymbionts to mitochondria resulted in an extreme genomic asymmetry, in which the residual mitochondrial genomes enabled the expansion of bioenergetic membranes over several orders of magnitude, overcoming the energetic constraints on prokaryotic genome size, and permitting the host cell genome to expand (in principle) over 200,000-fold. This energetic transformation was permissive, not prescriptive; I suggest that the actual increase in early eukaryotic genome size was driven by a heavy early bombardment of genes and introns from the endosymbiont to the host cell, producing a high mutation rate. Unlike prokaryotes, with lower mutation rates and heavy selection pressure to lose genes, early eukaryotes without genome-size limitations could mask mutations by cell fusion and genome duplication, as in allopolyploidy, giving rise to a proto-sexual cell cycle. The side effect was that a large number of shared eukaryotic basal traits accumulated in the same population, a sexual eukaryotic common ancestor, radically different to any known prokaryote.
The combination of massive bioenergetic expansion, release from genome-size constraints, and high mutation rate favoured a protosexual cell cycle and the accumulation of eukaryotic traits. These factors explain the unique origin of eukaryotes, the absence of true evolutionary intermediates, and the evolution of sex in eukaryotes but not prokaryotes.
This article was reviewed by: Eugene Koonin, William Martin, Ford Doolittle and Mark van der Giezen. For complete reports see the Reviewers' Comments section.
PMCID: PMC3152533  PMID: 21714941
16.  Phylogenomic evidence supports past endosymbiosis, intracellular and horizontal gene transfer in Cryptosporidium parvum 
Genome Biology  2004;5(11):R88.
An analysis of Cryptosporidium parvum genes of likely endosymbiont or prokaryotic origin supports the hypothesis that C. parvum evolved from a plastid-containing lineage.
The apicomplexan parasite Cryptosporidium parvum is an emerging pathogen capable of causing illness in humans and other animals and death in immunocompromised individuals. No effective treatment is available and the genome sequence has recently been completed. This parasite differs from other apicomplexans in its lack of a plastid organelle, the apicoplast. Gene transfer, either intracellular from an endosymbiont/donor organelle or horizontal from another organism, can provide evidence of a previous endosymbiotic relationship and/or alter the genetic repertoire of the host organism. Given the importance of gene transfers in eukaryotic evolution and the potential implications for chemotherapy, it is important to identify the complement of transferred genes in Cryptosporidium.
We have identified 31 genes of likely plastid/endosymbiont (n = 7) or prokaryotic (n = 24) origin using a phylogenomic approach. The findings support the hypothesis that Cryptosporidium evolved from a plastid-containing lineage and subsequently lost its apicoplast during evolution. Expression analyses of candidate genes of algal and eubacterial origin show that these genes are expressed and developmentally regulated during the life cycle of C. parvum.
Cryptosporidium is the recipient of a large number of transferred genes, many of which are not shared by other apicomplexan parasites. Genes transferred from distant phylogenetic sources, such as eubacteria, may be potential targets for therapeutic drugs owing to their phylogenetic distance or the lack of homologs in the host. The successful integration and expression of the transferred genes in this genome has changed the genetic and metabolic repertoire of the parasite.
PMCID: PMC545779  PMID: 15535864
17.  Tertiary Endosymbiosis in Two Dinotoms Has Generated Little Change in the Mitochondrial Genomes of Their Dinoflagellate Hosts and Diatom Endosymbionts 
PLoS ONE  2012;7(8):e43763.
Mitochondria or mitochondrion-derived organelles are found in all eukaryotes with the exception of secondary or tertiary plastid endosymbionts. In these highly reduced systems, the mitochondrion has been lost in all cases except the diatom endosymbionts found in a small group of dinoflagellates, called ‘dinotoms’, the only cells with two evolutionarily distinct mitochondria. To investigate the persistence of this redundancy and its consequences on the content and structure of the endosymbiont and host mitochondrial genomes, we report the sequences of these genomes from two dinotoms.
Methodology/Principal Findings
The endosymbiont mitochondrial genomes of Durinskia baltica and Kryptoperidinium foliaceum exhibit nearly identical gene content with other diatoms, and highly conserved gene order (nearly identical to that of the raphid pennate diatom Fragilariopsis cylindrus). These two genomes are differentiated from other diatoms' by the fission of nad11 and by an insertion within nad2, in-frame and unspliced from the mRNA. Durinskia baltica is further distinguished from K. foliaceum by two gene fusions and its lack of introns. The host mitochondrial genome in D. baltica encodes cox1 and cob plus several fragments of the LSU rRNA gene in a hugely expanded genome that includes numerous pseudogenes, and a trans-spliced cox3 gene, as in other dinoflagellates. Over 100 distinct contigs were identified through 454 sequencing, but intact full-length genes for cox1, cob and the 5′ exon of cox3 were present as a single contig each, suggesting most of the genome is pseudogenes. The host mitochondrial genome of K. foliaceum was difficult to identify, but fragments of all three protein-coding genes, corresponding transcripts, and transcripts of several LSU rRNA fragments were all recovered.
Overall, the endosymbiont and host mitochondrial genomes in the two dinotoms have changed surprisingly little from those of free-living diatoms and dinoflagellates, irrespective of their long coexistence side by side in dinotoms.
PMCID: PMC3423374  PMID: 22916303
18.  The Dual Origin of the Yeast Mitochondrial Proteome 
Yeast (Chichester, England)  2000;17(3):170-187.
We propose a scheme for the origin of mitochondria based on phylogenetic reconstructions with more than 400 yeast nuclear genes that encode mitochondrial proteins. Half of the yeast mitochondrial proteins have no discernable bacterial homologues, while one-tenth are unequivocally of α-proteobacterial origin. These data suggest that the majority of genes encoding yeast mitochondrial proteins are descendants of two different genomic lineages that have evolved in different modes. First, the ancestral free-living α-proteobacterium evolved into an endosymbiont of an anaerobic host. Most of the ancestral bacterial genes were lost, but a small fraction of genes supporting bioenergetic and translational processes were retained and eventually transferred to what became the host nuclear genome. In a second, parallel mode, a larger number of novel mitochondrial genes were recruited from the nuclear genome to complement the remaining genes from the bacterial ancestor. These eukaryotic genes, which are primarily involved in transport and regulatory functions, transformed the endosymbiont into an ATP-exporting organelle.
PMCID: PMC2448374  PMID: 11025528
19.  Vertical Transmission of a Drosophila Endosymbiont Via Cooption of the Yolk Transport and Internalization Machinery 
mBio  2013;4(2):e00532-12.
Spiroplasma is a diverse bacterial clade that includes many vertically transmitted insect endosymbionts, including Spiroplasma poulsonii, a natural endosymbiont of Drosophila melanogaster. These bacteria persist in the hemolymph of their adult host and exhibit efficient vertical transmission from mother to offspring. In this study, we analyzed the mechanism that underlies their vertical transmission, and here we provide strong evidence that these bacteria use the yolk uptake machinery to colonize the germ line. We show that Spiroplasma reaches the oocyte by passing through the intercellular space surrounding the ovarian follicle cells and is then endocytosed into oocytes within yolk granules during the vitellogenic stages of oogenesis. Mutations that disrupt yolk uptake by oocytes inhibit vertical Spiroplasma transmission and lead to an accumulation of these bacteria outside the oocyte. Impairment of yolk secretion by the fat body results in Spiroplasma not reaching the oocyte and a severe reduction of vertical transmission. We propose a model in which Spiroplasma first interacts with yolk in the hemolymph to gain access to the oocyte and then uses the yolk receptor, Yolkless, to be endocytosed into the oocyte. Cooption of the yolk uptake machinery is a powerful strategy for endosymbionts to target the germ line and achieve vertical transmission. This mechanism may apply to other endosymbionts and provides a possible explanation for endosymbiont host specificity.
Most insect species, including important disease vectors and crop pests, harbor vertically transmitted endosymbiotic bacteria. Studies have shown that many facultative endosymbionts, including Spiroplasma, confer protection against different classes of parasites on their hosts and therefore are attractive tools for the control of vector-borne diseases. The ability to be efficiently transmitted from females to their offspring is the key feature shaping associations between insects and their inherited endosymbionts, but to date, little is known about the mechanisms involved. In oviparous animals, yolk accumulates in developing eggs and serves to meet the nutritional demands of embryonic development. Here we show that Spiroplasma coopts the yolk transport and uptake machinery to colonize the germ line and ensure efficient vertical transmission. The uptake of yolk is a female germ line-specific feature and therefore an attractive target for cooption by endosymbionts that need to maintain high-fidelity maternal transmission.
PMCID: PMC3585447  PMID: 23462112
20.  Eukaryote-to-eukaryote gene transfer gives rise to genome mosaicism in euglenids 
Euglenophytes are a group of photosynthetic flagellates possessing a plastid derived from a green algal endosymbiont, which was incorporated into an ancestral host cell via secondary endosymbiosis. However, the impact of endosymbiosis on the euglenophyte nuclear genome is not fully understood due to its complex nature as a 'hybrid' of a non-photosynthetic host cell and a secondary endosymbiont.
We analyzed an EST dataset of the model euglenophyte Euglena gracilis using a gene mining program designed to detect laterally transferred genes. We found E. gracilis genes showing affinity not only with green algae, from which the secondary plastid in euglenophytes evolved, but also red algae and/or secondary algae containing red algal-derived plastids. Phylogenetic analyses of these 'red lineage' genes suggest that E. gracilis acquired at least 14 genes via eukaryote-to-eukaryote lateral gene transfer from algal sources other than the green algal endosymbiont that gave rise to its current plastid. We constructed an EST library of the aplastidic euglenid Peranema trichophorum, which is a eukaryovorous relative of euglenophytes, and also identified 'red lineage' genes in its genome.
Our data show genome mosaicism in E. gracilis and P. trichophorum. One possible explanation for the presence of these genes in these organisms is that some or all of them were independently acquired by lateral gene transfer and contributed to the successful integration and functioning of the green algal endosymbiont as a secondary plastid. Alternative hypotheses include the presence of a phagocytosed alga as the single source of those genes, or a cryptic tertiary endosymbiont harboring secondary plastid of red algal origin, which the eukaryovorous ancestor of euglenophytes had acquired prior to the secondary endosymbiosis of a green alga.
PMCID: PMC3101172  PMID: 21501489
21.  Genome-Wide Functional Divergence after the Symbiosis of Proteobacteria with Insects Unraveled through a Novel Computational Approach 
PLoS Computational Biology  2009;5(4):e1000344.
Symbiosis has been among the most important evolutionary steps to generate biological complexity. The establishment of symbiosis required an intimate metabolic link between biological systems with different complexity levels. The strict endo-cellular symbiotic bacteria of insects are beautiful examples of the metabolic coupling between organisms belonging to different kingdoms, a eukaryote and a prokaryote. The host (eukaryote) provides the endosymbiont (prokaryote) with a stable cellular environment while the endosymbiont supplements the host's diet with essential metabolites. For such communication to take place, endosymbionts' genomes have suffered dramatic modifications and reconfigurations of proteins' functions. Two of the main modifications, loss of genes redundant for endosymbiotic bacteria or the host and bacterial genome streamlining, have been extensively studied. However, no studies have accounted for possible functional shifts in the endosymbiotic proteomes. Here, we develop a simple method to screen genomes for evidence of functional divergence between two species clusters, and we apply it to identify functional shifts in the endosymbiotic proteomes. Despite the strong effects of genetic drift in the endosymbiotic systems, we unexpectedly identified genes to be under stronger selective constraints in endosymbionts of aphids and ants than in their free-living bacterial relatives. These genes are directly involved in supplementing the host's diet with essential metabolites. A test of functional divergence supports a strong relationship between the endosymbiosis and the functional shifts of proteins involved in the metabolic communication with the insect host. The correlation between functional divergence in the endosymbiotic bacterium and the ecological requirements of the host uncovers their intimate biochemical and metabolic communication and provides insights on the role of symbiosis in generating species diversity.
Author Summary
Biological complexity has emerged on earth by the combination of living forms. This combination, called symbiosis, had to overcome the problems caused by the uncoupled metabolisms of the organisms involved. One way to do so was through the loss of genes that were no longer needed for the endosymbiont in the protected cellular environment provided by the host. Another step necessary to adjust both metabolisms was through the change in the function of bacterial proteins to perform new roles in the symbiotic system. In this article, we test such events in symbiotic systems involving an insect and a bacterium by developing a new and simple method to identify proteome-wide functional shifts. Our results show that most of the functional changes occurred at genes involved in metabolic communication with the host and are correlated with the host's ecological traits.
PMCID: PMC2659769  PMID: 19343224
22.  Genome Sequence of “Candidatus Walczuchella monophlebidarum” the Flavobacterial Endosymbiont of Llaveia axin axin (Hemiptera: Coccoidea: Monophlebidae) 
Genome Biology and Evolution  2014;6(3):714-726.
Scale insects (Hemiptera: Coccoidea) constitute a very diverse group of sap-feeding insects with a large diversity of symbiotic associations with bacteria. Here, we present the complete genome sequence, metabolic reconstruction, and comparative genomics of the flavobacterial endosymbiont of the giant scale insect Llaveia axin axin. The gene repertoire of its 309,299 bp genome was similar to that of other flavobacterial insect endosymbionts though not syntenic. According to its genetic content, essential amino acid biosynthesis is likely to be the flavobacterial endosymbiont's principal contribution to the symbiotic association with its insect host. We also report the presence of a γ-proteobacterial symbiont that may be involved in waste nitrogen recycling and also has amino acid biosynthetic capabilities that may provide metabolic precursors to the flavobacterial endosymbiont. We propose “Candidatus Walczuchella monophlebidarum” as the name of the flavobacterial endosymbiont of insects from the Monophlebidae family.
PMCID: PMC3971599  PMID: 24610838
scale insect; γ-Proteobacteria; symbiosis; comparative genomics
23.  The Genome of Cardinium cBtQ1 Provides Insights into Genome Reduction, Symbiont Motility, and Its Settlement in Bemisia tabaci 
Genome Biology and Evolution  2014;6(4):1013-1030.
Many insects harbor inherited bacterial endosymbionts. Although some of them are not strictly essential and are considered facultative, they can be a key to host survival under specific environmental conditions, such as parasitoid attacks, climate changes, or insecticide pressures. The whitefly Bemisia tabaci is at the top of the list of organisms inflicting agricultural damage and outbreaks, and changes in its distribution may be associated with global warming. In this work, we have sequenced and analyzed the genome of Cardinium cBtQ1, a facultative bacterial endosymbiont of B. tabaci and propose that it belongs to a new taxonomic family, which also includes Candidatus Amoebophilus asiaticus and Cardinium cEper1, endosymbionts of amoeba and wasps, respectively. Reconstruction of their last common ancestors’ gene contents revealed an initial massive gene loss from the free-living ancestor. This was followed in Cardinium by smaller losses, associated with settlement in arthropods. Some of these losses, affecting cofactor and amino acid biosynthetic encoding genes, took place in Cardinium cBtQ1 after its divergence from the Cardinium cEper1 lineage and were related to its settlement in the whitefly and its endosymbionts. Furthermore, the Cardinium cBtQ1 genome displays a large proportion of transposable elements, which have recently inactivated genes and produced chromosomal rearrangements. The genome also contains a chromosomal duplication and a multicopy plasmid, which harbors several genes putatively associated with gliding motility, as well as two other genes encoding proteins with potential insecticidal activity. As gene amplification is very rare in endosymbionts, an important function of these genes cannot be ruled out.
PMCID: PMC4007549  PMID: 24723729
Amoebophilaceae; IS elements; gliding motility; Candidatus Cardinium hertigii; host–symbiont interaction
24.  Genome-Based Reconstruction of the Protein Import Machinery in the Secondary Plastid of a Chlorarachniophyte Alga 
Eukaryotic Cell  2012;11(3):324-333.
Most plastid proteins are encoded by their nuclear genomes and need to be targeted across multiple envelope membranes. In vascular plants, the translocons at the outer and inner envelope membranes of chloroplasts (TOC and TIC, respectively) facilitate transport across the two plastid membranes. In contrast, several algal groups harbor more complex plastids, the so-called secondary plastids, which are surrounded by three or four membranes, but the plastid protein import machinery (in particular, how proteins cross the membrane corresponding to the secondary endosymbiont plasma membrane) remains unexplored in many of these algae. To reconstruct the putative protein import machinery of a secondary plastid, we used the chlorarachniophyte alga Bigelowiella natans, whose plastid is bounded by four membranes and still possesses a relict nucleus of a green algal endosymbiont (the nucleomorph) in the intermembrane space. We identified nine homologs of plant-like TOC/TIC components in the recently sequenced B. natans nuclear genome, adding to the two that remain in the nucleomorph genome (B. natans TOC75 [BnTOC75] and BnTIC20). All of these proteins were predicted to be localized to the plastid and might function in the inner two membranes. We also show that the homologs of Der1, a protein known to mediate transport across the second membrane in several lineages with secondary plastids of red algal origin, are not associated with plastid protein targeting in B. natans. How plastid proteins cross this membrane remains a mystery, but it is clear that the protein transport machinery of chlorarachniophyte plastids differs from that of red algal secondary plastids.
PMCID: PMC3294442  PMID: 22267775
25.  Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology 
PLoS ONE  2012;7(11):e49202.
High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. 
In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers.
PMCID: PMC3504011  PMID: 23185309
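Two of the quantities reported in the abstract above, estimated coverage as a percentage of the genome and the share of non-target sequences in a library, reduce to simple ratios. The sketch below is an illustration under stated assumptions rather than the authors' workflow: it assumes coverage is total sequenced bases divided by genome size, and that each read has already been given a source label by some upstream classifier. The function names and the "target" label are hypothetical.

```python
def estimated_coverage_percent(read_lengths, genome_size_bp):
    """Total sequenced bases as a percentage of the (estimated) genome size."""
    return 100.0 * sum(read_lengths) / genome_size_bp


def contaminant_fraction(read_labels, target_label="target"):
    """Fraction of reads whose assigned source is anything other than the target."""
    if not read_labels:
        raise ValueError("read_labels must not be empty")
    non_target = sum(1 for label in read_labels if label != target_label)
    return non_target / len(read_labels)


if __name__ == "__main__":
    # Hypothetical library: 1,000 reads of 400 bp from a 200 Mb genome.
    print(estimated_coverage_percent([400] * 1000, 200_000_000))
    # Hypothetical labels: 9 target reads and 1 bacterial read.
    print(contaminant_fraction(["target"] * 9 + ["bacteria"]))
```

A library whose contaminant fraction approaches the 10% upper figure mentioned in the abstract would be a candidate for filtering before marker development.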
