The establishment of an endosymbiotic relationship typically seems to be driven through complementation of the host's limited metabolic capabilities by the biochemical versatility of the endosymbiont. The most significant examples of endosymbiosis are represented by the endosymbiotic acquisition of plastids and mitochondria, introducing photosynthesis and respiration to eukaryotes. However, there are numerous other endosymbioses that evolved more recently and repeatedly across the tree of life. Recent advances in genome sequencing technology have led to a better understanding of the physiological basis of many endosymbiotic associations. This review focuses on endosymbionts in protists (unicellular eukaryotes). Selected examples illustrate the incorporation of various new biochemical functions, such as photosynthesis, nitrogen fixation and recycling, and methanogenesis, into protist hosts by prokaryotic endosymbionts. Furthermore, photosynthetic eukaryotic endosymbionts display a great diversity of modes of integration into different protist hosts.
In conclusion, endosymbiosis seems to represent a general evolutionary strategy of protists to acquire novel biochemical functions and is thus an important source of genetic innovation.
metabolic complementation; prokaryotic endosymbionts; eukaryotic endosymbionts; evolution; integration
It has long been known that insect-infecting trypanosomatid flagellates from the genera Angomonas and Strigomonas harbor bacterial endosymbionts (Candidatus Kinetoplastibacterium or TPE [trypanosomatid proteobacterial endosymbiont]) that supplement the host metabolism. Based on previous analyses of bacterial endosymbiont genomes from other lineages, a stereotypical path of genome evolution in such bacteria over the duration of their association with the eukaryotic host has been characterized. In this work, we sequence and analyze the genomes of five TPEs, perform their metabolic reconstruction, carry out extensive phylogenomic analyses with all available Betaproteobacteria, and compare the TPEs with their nearest betaproteobacterial relatives. We also identify a number of housekeeping and central metabolism genes that seem to have undergone positive selection. Our genome structure analyses show total synteny among the five TPEs despite millions of years of divergence, and that this lineage follows the common path of genome evolution observed in other endosymbionts of diverse ancestries. As previously suggested by cell biology and biochemistry experiments, Ca. Kinetoplastibacterium spp. preferentially maintain those genes necessary for the biosynthesis of compounds needed by their hosts. We have also shown that metabolic and informational genes related to the cooperation with the host are overrepresented amongst genes shown to be under positive selection. Finally, our phylogenomic analysis shows that, while being in the Alcaligenaceae family of Betaproteobacteria, the closest relatives of these endosymbionts are not in the genus Bordetella as previously reported, but more likely in the Taylorella genus.
endosymbiont biology; phylogenomics; comparative genomics; Trypanosomatidae; selective pressure
Very small genomes have evolved repeatedly in eubacterial lineages that have adopted obligate associations with eukaryotic hosts. Complete genome sequences have revealed that small genomes retain very different gene sets, raising the question of how final genome content is determined. To examine the process of genome reduction, the tiny genome of the endosymbiont Buchnera aphidicola was compared to the larger ancestral genome, reconstructed on the basis of the phylogenetic distribution of gene orthologs among fully sequenced relatives of Escherichia coli and Buchnera.
The reconstructed ancestral genome contained 2,425 open reading frames (ORFs). The Buchnera genome, containing 564 ORFs, consists of 153 fragments of 1-34 genes that are syntenic with reconstructed ancestral regions. On the basis of this reconstruction, 503 genes were eliminated within syntenic fragments, and 1,403 genes were lost from the gaps between syntenic fragments, probably in connection with genome rearrangements. Lost regions are sometimes large, and often span functionally unrelated genes. In addition, individual genes and regulatory regions have been lost or eroded. For the categories of DNA repair genes and rRNA genes, most lost loci fall in regions between syntenic fragments. This history of gene loss is reflected in the sequences of intergenic spacers at positions where genes were once present.
The most plausible interpretation of this reconstruction is that Buchnera lost many genes through the fixation of large deletions soon after the acquisition of an obligate endosymbiotic lifestyle. An implication is that final genome composition may be partly the chance outcome of initial deletions and that neighboring genes influence the likelihood of loss of particular genes and pathways.
Genome reduction is a common evolutionary process affecting bacterial lineages that establish symbiotic or pathogenic associations with eukaryotic hosts. Such associations yield highly reduced genomes with greatly streamlined metabolic abilities shaped by the type of ecological association with the host. Sodalis glossinidius, the secondary endosymbiont of tsetse flies, represents one of the few complete genomes available of a bacterium at the initial stages of this process. In the present study, genome reduction is studied from a systems biology perspective through the reconstruction and functional analysis of genome-scale metabolic networks of S. glossinidius.
The functional profile of ancestral and extant metabolic networks sheds light on the evolutionary events underlying transition to a host-dependent lifestyle. Meanwhile, reductive evolution simulations on the extant metabolic network can predict possible future evolution of S. glossinidius in the context of genome reduction. Finally, knockout simulations in different metabolic systems reveal a gradual decrease in network robustness to different mutational events for bacterial endosymbionts at different stages of the symbiotic association.
Stoichiometric analysis reveals few gene inactivation events whose effects on the functionality of S. glossinidius metabolic systems are drastic enough to account for the ecological transition from a free-living to host-dependent lifestyle. The decrease in network robustness across different metabolic systems may be associated with the progressive integration in the more stable environment provided by the insect host. Finally, reductive evolution simulations reveal the strong influence that external conditions exert on the evolvability of metabolic systems.
Most plastid proteins are encoded by their nuclear genomes and need to be targeted across multiple envelope membranes. In vascular plants, the translocons at the outer and inner envelope membranes of chloroplasts (TOC and TIC, respectively) facilitate transport across the two plastid membranes. In contrast, several algal groups harbor more complex plastids, the so-called secondary plastids, which are surrounded by three or four membranes, but the plastid protein import machinery (in particular, how proteins cross the membrane corresponding to the secondary endosymbiont plasma membrane) remains unexplored in many of these algae. To reconstruct the putative protein import machinery of a secondary plastid, we used the chlorarachniophyte alga Bigelowiella natans, whose plastid is bounded by four membranes and still possesses a relict nucleus of a green algal endosymbiont (the nucleomorph) in the intermembrane space. We identified nine homologs of plant-like TOC/TIC components in the recently sequenced B. natans nuclear genome, adding to the two that remain in the nucleomorph genome (B. natans TOC75 [BnTOC75] and BnTIC20). All of these proteins were predicted to be localized to the plastid and might function in the inner two membranes. We also show that the homologs of Der1, a protein known to mediate transport across the second membrane in several lineages with secondary plastids of red algal origin, are not associated with plastid protein targeting in B. natans. How plastid proteins cross this membrane remains a mystery, but it is clear that the protein transport machinery of chlorarachniophyte plastids differs from that of red algal secondary plastids.
Mitochondria or mitochondrion-derived organelles are found in all eukaryotes with the exception of secondary or tertiary plastid endosymbionts. In these highly reduced systems, the mitochondrion has been lost in all cases except the diatom endosymbionts found in a small group of dinoflagellates, called ‘dinotoms’, the only cells with two evolutionarily distinct mitochondria. To investigate the persistence of this redundancy and its consequences on the content and structure of the endosymbiont and host mitochondrial genomes, we report the sequences of these genomes from two dinotoms.
The endosymbiont mitochondrial genomes of Durinskia baltica and Kryptoperidinium foliaceum exhibit nearly identical gene content with other diatoms, and highly conserved gene order (nearly identical to that of the raphid pennate diatom Fragilariopsis cylindrus). These two genomes are differentiated from those of other diatoms by the fission of nad11 and by an insertion within nad2, in-frame and unspliced from the mRNA. Durinskia baltica is further distinguished from K. foliaceum by two gene fusions and its lack of introns. The host mitochondrial genome in D. baltica encodes cox1 and cob plus several fragments of the LSU rRNA gene in a hugely expanded genome that includes numerous pseudogenes, and a trans-spliced cox3 gene, as in other dinoflagellates. Over 100 distinct contigs were identified through 454 sequencing, but intact full-length genes for cox1, cob and the 5′ exon of cox3 were present as a single contig each, suggesting most of the genome is pseudogenes. The host mitochondrial genome of K. foliaceum was difficult to identify, but fragments of all three protein-coding genes, corresponding transcripts, and transcripts of several LSU rRNA fragments were all recovered.
Overall, the endosymbiont and host mitochondrial genomes in the two dinotoms have changed surprisingly little from those of free-living diatoms and dinoflagellates, irrespective of their long coexistence side by side in dinotoms.
Wolbachia endosymbionts are widespread in arthropods and are generally considered reproductive parasites, inducing various phenotypes including cytoplasmic incompatibility, parthenogenesis, feminization and male killing, which serve to promote their spread through populations. In contrast, Wolbachia infecting filarial nematodes that cause human diseases, including elephantiasis and river blindness, are obligate mutualists. Purifying DNA for efficient genomic sequencing of these unculturable bacteria has proven difficult with a variety of techniques. To efficiently capture endosymbiont DNA for studies that examine the biology of symbiosis, we devised a parallel strategy to an earlier array-based method by creating a set of SureSelect™ (Agilent) 120-mer target enrichment RNA oligonucleotides (“baits”) for solution hybrid selection. These were designed from Wolbachia complete and partial genome sequences in GenBank and were tiled across each genomic sequence with 60 bp overlap. Baits were filtered for homology against host genomes containing Wolbachia using BLAT and sequences with significant host homology were removed from the bait pool. Filarial parasite Brugia malayi DNA was used as a test case, as the complete sequences of both Wolbachia and its host are known. DNA eluted from capture was size selected and sequencing samples were prepared using the NEBNext® Sample Preparation Kit. One-third of a 50 nt paired-end sequencing lane on the HiSeq™ 2000 (Illumina) yielded 53 million reads and the entirety of the Wolbachia genome was captured. We then used the baits to isolate more than 97.1% of the genome of a distantly related Wolbachia strain from the crustacean Armadillidium vulgare, demonstrating that the method can be used to enrich target DNA from unculturable microbes over large evolutionary distances.
Wolbachia; Obligate endosymbiont; Target enrichment; NextGen sequencing; DNA capture; SureSelect™
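The bait layout described in the Wolbachia abstract above (120-mer baits tiled across each genomic sequence with 60 bp overlap) can be illustrated with a minimal Python sketch. The function name `tile_baits` and the handling of the sequence end (one extra bait anchored at the final position when the regular tiling stops short) are assumptions made for illustration only; the abstract does not state how sequence ends were treated, and the subsequent BLAT-based host filtering is not modelled here.

```python
def tile_baits(sequence, bait_len=120, overlap=60):
    """Tile fixed-length baits across a sequence.

    Returns a list of (start, bait) pairs. Consecutive baits share
    `overlap` bases, so the step between starts is bait_len - overlap.
    """
    if not 0 <= overlap < bait_len:
        raise ValueError("overlap must satisfy 0 <= overlap < bait_len")
    step = bait_len - overlap
    baits = []
    for start in range(0, len(sequence) - bait_len + 1, step):
        baits.append((start, sequence[start:start + bait_len]))
    # Assumption (not from the source): if the regular tiling leaves the
    # end of the sequence uncovered, add one bait anchored at the end.
    last_start = len(sequence) - bait_len
    if last_start >= 0 and (not baits or baits[-1][0] != last_start):
        baits.append((last_start, sequence[last_start:]))
    return baits


if __name__ == "__main__":
    toy_genome = "ACGT" * 75  # 300 bp toy sequence, not real data
    for start, bait in tile_baits(toy_genome):
        print(start, len(bait))
```

With a 60-base step, every internal position of the target is covered by two baits, which is the practical consequence of tiling 120-mers at 60 bp overlap.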
The lysogenic bacteriophage APSE infects “Candidatus Hamiltonella defensa,” a facultative endosymbiont of aphids and other sap-feeding insects. This endosymbiont has established a beneficial association with aphids, increasing survivorship following attack by parasitoid wasps. Although APSE and “Ca. Hamiltonella defensa” are effectively maternally transmitted between aphid generations, they can also be horizontally transferred among insect hosts, which results in genetically distinct “Ca. Hamiltonella defensa” strains infecting the same aphid species and sporadic distributions of both APSE and “Ca. Hamiltonella defensa” among hosts. Aphids infected only with “Ca. Hamiltonella defensa” have significantly less protection than those infected with both “Ca. Hamiltonella defensa” and APSE. This protection has been proposed to be connected to eukaryote-targeted toxins previously discovered in the genomes of two characterized APSE strains. In this study, we have sequenced partial genomes from seven additional APSE strains to address the evolution and extent of toxin variation in this phage. The APSE lysis region has been a hot spot for nonhomologous recombination of novel virulence cassettes. We identified four new toxins from three protein families, Shiga-like toxin, cytolethal distending toxin, and YD-repeat toxins. These recombination events have also resulted in reassortment of the downstream lysozyme and holin genes. Analysis of the conserved APSE genes flanking the variable toxin cassettes reveals a close phylogenetic association with phage sequences from two other facultative endosymbionts of insects. Thus, phage may act as a conduit for ongoing gene exchange among heritable endosymbionts.
Published data suggest that hydrogenosomes, organelles found in diverse anaerobic eukaryotes that make energy and hydrogen, were once mitochondria. As hydrogenosomes generally lack a genome, the conversion is probably one way. The sources of the key hydrogenosomal enzymes, pyruvate:ferredoxin oxidoreductase (PFO) and hydrogenase, are not resolved by current phylogenetic analyses, but it is likely that both were present at an early stage of eukaryotic evolution. Once thought to be restricted to a few unusual anaerobic eukaryotes, the proteins are intimately integrated into the fabric of diverse eukaryotic cells, where they are targeted to different cell compartments, and not just hydrogenosomes. There is no evidence supporting the view that PFO and hydrogenase originated from the mitochondrial endosymbiont, as posited by the hydrogen hypothesis for eukaryogenesis. Other organelles derived from mitochondria have now been described in anaerobic and parasitic microbial eukaryotes, including species that were once thought to have diverged before the mitochondrial symbiosis. It thus seems possible that all eukaryotes may eventually be shown to contain an organelle of mitochondrial ancestry, to which different types of biochemistry can be targeted. It remains to be seen if, despite their obvious differences, this family of organelles shares a common function of importance for the eukaryotic cell, other than energy production, that might provide the underlying selection pressure for organelle retention.
Symbiosis has been among the most important evolutionary steps to generate biological complexity. The establishment of symbiosis required an intimate metabolic link between biological systems with different complexity levels. The strict endo-cellular symbiotic bacteria of insects are beautiful examples of the metabolic coupling between organisms belonging to different kingdoms, a eukaryote and a prokaryote. The host (eukaryote) provides the endosymbiont (prokaryote) with a stable cellular environment while the endosymbiont supplements the host's diet with essential metabolites. For such communication to take place, endosymbionts' genomes have suffered dramatic modifications and reconfigurations of proteins' functions. Two of the main modifications, loss of genes redundant for endosymbiotic bacteria or the host and bacterial genome streamlining, have been extensively studied. However, no studies have accounted for possible functional shifts in the endosymbiotic proteomes. Here, we develop a simple method to screen genomes for evidence of functional divergence between two species clusters, and we apply it to identify functional shifts in the endosymbiotic proteomes. Despite the strong effects of genetic drift in the endosymbiotic systems, we unexpectedly identified genes to be under stronger selective constraints in endosymbionts of aphids and ants than in their free-living bacterial relatives. These genes are directly involved in supplementing the host's diet with essential metabolites. A test of functional divergence supports a strong relationship between the endosymbiosis and the functional shifts of proteins involved in the metabolic communication with the insect host. The correlation between functional divergence in the endosymbiotic bacterium and the ecological requirements of the host uncovers their intimate biochemical and metabolic communication and provides insights on the role of symbiosis in generating species diversity.
Biological complexity has emerged on earth by the combination of living forms. This combination, called symbiosis, had to overcome the problems caused by the uncoupled metabolisms of the organisms involved. One way to do so was through the loss of genes that were no longer needed for the endosymbiont in the protected cellular environment provided by the host. Another step necessary to adjust both metabolisms was through the change in the function of bacterial proteins to perform new roles in the symbiotic system. In this article, we test such events in symbiotic systems involving an insect and a bacterium by developing a new and simple method to identify proteome-wide functional shifts. Our results show that most of the functional changes occurred at genes involved in metabolic communication with the host and are correlated with the host's ecological traits.
The rhizarian amoeba Paulinella chromatophora harbors two photosynthetically active and deeply integrated cyanobacterial endosymbionts acquired ~60 million years ago. Recent genomic analyses of P. chromatophora have revealed the loss of many essential genes from the endosymbiont’s genome, and have identified more than 30 genes that have been transferred to the host cell’s nucleus through endosymbiotic gene transfer (EGT). This indicates that, similar to classical primary plastids, Paulinella endosymbionts have evolved a transport system to import their nuclear-encoded proteins. To deduce how these proteins are transported, we searched for potential targeting signals in genes for 10 EGT-derived proteins. Our analyses indicate that five proteins carry potential signal peptides, implying they are targeted via the host endomembrane system. One sequence encodes a mitochondrial-like transit peptide, which suggests an import pathway involving a channel protein residing in the outer membrane of the endosymbiont. No N-terminal targeting signals were identified in the four other genes, but their encoded proteins could utilize non-classical targeting signals contained internally or in C-terminal regions. Several amino acids more often found in the Paulinella EGT-derived proteins than in their ancestral set (proteins still encoded in the endosymbiont genome) could constitute such signals. Characteristic features of the EGT-derived proteins are low molecular weight and nearly neutral charge, which both could be adaptations to enhance passage through the peptidoglycan wall present in the intermembrane space of the endosymbiont’s envelope. Our results suggest that Paulinella endosymbionts/plastids have evolved several different import routes, as has been shown in classical primary plastids.
Paulinella chromatophora; Endosymbiosis; Plastid; Pre-sequence; Targeting signal; Endosymbiotic gene transfer
In one small group of dinoflagellates, photosynthesis is carried out by a tertiary endosymbiont derived from a diatom, giving rise to a complex cell that we collectively refer to as a ‘dinotom’. The endosymbiont is separated from its host by a single membrane and retains plastids, mitochondria, a large nucleus, and many other eukaryotic organelles and structures, a level of complexity suggesting an early stage of integration. Although the evolution of these endosymbionts has attracted considerable interest, the plastid genome has not been examined in detail, and indeed no tertiary plastid genome has yet been sequenced.
Here we describe the complete plastid genomes of two closely related dinotoms, Durinskia baltica and Kryptoperidinium foliaceum. The D. baltica (116470 bp) and K. foliaceum (140426 bp) plastid genomes map as circular molecules featuring two large inverted repeats that separate distinct single copy regions. The organization and gene content of the D. baltica plastid closely resemble those of the pennate diatom Phaeodactylum tricornutum. The K. foliaceum plastid genome is much larger, has undergone more reorganization, and encodes a putative tyrosine recombinase (tyrC) also found in the plastid genome of the heterokont Heterosigma akashiwo, and two putative serine recombinases (serC1 and serC2) homologous to recombinases encoded by plasmids pCf1 and pCf2 in another pennate diatom, Cylindrotheca fusiformis. The K. foliaceum plastid genome also contains an additional copy of serC1, two degenerate copies of another plasmid-encoded ORF, and two non-coding regions whose sequences closely resemble portions of the pCf1 and pCf2 plasmids.
These results suggest that while the plastid genomes of the two dinotoms share very similar gene content and genome organization with that of the free-living pennate diatom P. tricornutum, the K. foliaceum plastid genome has absorbed two exogenous plasmids. Whether this took place before or after the tertiary endosymbiosis is not clear.
All complex life on Earth is eukaryotic. All eukaryotic cells share a common ancestor that arose just once in four billion years of evolution. Prokaryotes show no tendency to evolve greater morphological complexity, despite their metabolic virtuosity. Here I argue that the eukaryotic cell originated in a unique prokaryotic endosymbiosis, a singular event that transformed the selection pressures acting on both host and endosymbiont.
The reductive evolution and specialisation of endosymbionts to mitochondria resulted in an extreme genomic asymmetry, in which the residual mitochondrial genomes enabled the expansion of bioenergetic membranes over several orders of magnitude, overcoming the energetic constraints on prokaryotic genome size, and permitting the host cell genome to expand (in principle) over 200,000-fold. This energetic transformation was permissive, not prescriptive; I suggest that the actual increase in early eukaryotic genome size was driven by a heavy early bombardment of genes and introns from the endosymbiont to the host cell, producing a high mutation rate. Unlike prokaryotes, with lower mutation rates and heavy selection pressure to lose genes, early eukaryotes without genome-size limitations could mask mutations by cell fusion and genome duplication, as in allopolyploidy, giving rise to a proto-sexual cell cycle. The side effect was that a large number of shared eukaryotic basal traits accumulated in the same population, a sexual eukaryotic common ancestor, radically different to any known prokaryote.
The combination of massive bioenergetic expansion, release from genome-size constraints, and high mutation rate favoured a protosexual cell cycle and the accumulation of eukaryotic traits. These factors explain the unique origin of eukaryotes, the absence of true evolutionary intermediates, and the evolution of sex in eukaryotes but not prokaryotes.
This article was reviewed by: Eugene Koonin, William Martin, Ford Doolittle and Mark van der Giezen. For complete reports see the Reviewers' Comments section.
Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes–a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented.
The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 kbp long intergenic region composed of numerous tandem and dispersed repeat units of 22–336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein-coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages.
Comparison of the H. andersenii and R. salina mitochondrial genomes reveals a number of cryptophyte-specific genomic features, most notably the presence of a large repeat-rich intergenic region. However, unlike R. salina, the H. andersenii mtDNA does not possess introns and lacks a Lys-tRNA, which is presumably imported from the cytosol.
Algae with secondary plastids such as diatoms maintain two different eukaryotic cytoplasms. One of them, the so-called periplastidal compartment (PPC), is the naturally minimized cytoplasm of a eukaryotic endosymbiont. In order to investigate the protein composition of the PPC of diatoms, we applied knowledge of the targeting signals of PPC-directed proteins in searches of the genome data for proteins acting in the PPC and proved their in vivo localization via expressing green fluorescent protein (GFP) fusions. Our investigation increased the knowledge of the protein content of the PPC approximately 3-fold and thereby indicated that this narrow compartment was functionally reduced to some important cellular functions with nearly no housekeeping biochemical pathways.
secondary endosymbiosis; periplastidal compartment (PPC); plastid protein import; bipartite targeting signal (BTS); Phaeodactylum tricornutum; diatom
An analysis of Cryptosporidium parvum genes of likely endosymbiont or prokaryotic origin supports the hypothesis that C. parvum evolved from a plastid-containing lineage.
The apicomplexan parasite Cryptosporidium parvum is an emerging pathogen capable of causing illness in humans and other animals and death in immunocompromised individuals. No effective treatment is available and the genome sequence has recently been completed. This parasite differs from other apicomplexans in its lack of a plastid organelle, the apicoplast. Gene transfer, either intracellular from an endosymbiont/donor organelle or horizontal from another organism, can provide evidence of a previous endosymbiotic relationship and/or alter the genetic repertoire of the host organism. Given the importance of gene transfers in eukaryotic evolution and the potential implications for chemotherapy, it is important to identify the complement of transferred genes in Cryptosporidium.
We have identified 31 genes of likely plastid/endosymbiont (n = 7) or prokaryotic (n = 24) origin using a phylogenomic approach. The findings support the hypothesis that Cryptosporidium evolved from a plastid-containing lineage and subsequently lost its apicoplast during evolution. Expression analyses of candidate genes of algal and eubacterial origin show that these genes are expressed and developmentally regulated during the life cycle of C. parvum.
Cryptosporidium is the recipient of a large number of transferred genes, many of which are not shared by other apicomplexan parasites. Genes transferred from distant phylogenetic sources, such as eubacteria, may be potential targets for therapeutic drugs owing to their phylogenetic distance or the lack of homologs in the host. The successful integration and expression of the transferred genes in this genome has changed the genetic and metabolic repertoire of the parasite.
It is generally accepted that plastids first arose by acquisition of photosynthetic prokaryotic endosymbionts by non-photosynthetic eukaryotic hosts. It is also accepted that photosynthetic eukaryotes were acquired on several occasions as endosymbionts by non-photosynthetic eukaryote hosts to form secondary plastids. In some lineages, secondary plastids were lost and new symbionts were acquired, to form tertiary plastids. Most recent work has been interpreted to indicate that primary plastids arose only once, referred to as a ‘monophyletic’ origin. We critically assess the evidence for this. We argue that the combination of Ockham's razor and poor taxon sampling will bias studies in favour of monophyly. We discuss possible concerns in phylogenetic reconstruction from sequence data. We argue that improved understanding of lineage-specific substitution processes is needed to assess the reliability of sequence-based trees. Improved understanding of the timing of the radiation of present-day cyanobacteria is also needed. We suggest that acquisition of plastids is better described as the result of a process rather than something occurring at a discrete time, and describe the ‘shopping bag’ model of plastid origin. We argue that dinoflagellates and other lineages provide evidence in support of this.
endosymbiosis; polyphyly; monophyly; plastid; phylogenetics; dinoflagellate
Saccharomyces cerevisiae is the first eukaryotic organism for which a multi-compartment genome-scale metabolic model was constructed. Since then a sequence of improved metabolic reconstructions for yeast has been introduced. These metabolic models have been extensively used to elucidate the organizational principles of yeast metabolism and drive yeast strain engineering strategies for targeted overproductions. They have also served as a starting point and a benchmark for the reconstruction of genome-scale metabolic models for other eukaryotic organisms. In spite of the successive improvements in the details of the described metabolic processes, even the recent yeast model (i.e., iMM904) remains significantly less predictive than the latest E. coli model (i.e., iAF1260). This is manifested by its significantly lower specificity in predicting the outcome of grow/no grow experiments in comparison to the E. coli model.
In this paper we use the automated GrowMatch procedure to restore consistency with single gene deletion experiments in yeast, and we extend the procedure to exploit synthetic lethality data, using the genome-scale model iMM904 as a basis. We identified 120 distinct model modifications, including various regulatory constraints for minimal and YP media, and vetted them against literature sources. The incorporation of the suggested modifications led to a substantial increase in the fraction of correctly predicted lethal knockouts (i.e., specificity) from 38.84% (87 out of 224) to 53.57% (120 out of 224) for the minimal medium and from 24.73% (45 out of 182) to 40.11% (73 out of 182) for the YP medium. Synthetic lethality predictions improved from 12.03% (16 out of 133) to 23.31% (31 out of 133) for the minimal medium and from 6.96% (8 out of 115) to 13.04% (15 out of 115) for the YP medium.
Overall, this study provides a roadmap for the computationally driven correction of multi-compartment genome-scale metabolic models and demonstrates the value of synthetic lethals as curation agents.
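The specificity figures quoted above follow directly from the reported counts. The short sketch below recomputes them, taking specificity as defined in the text (the fraction of experimentally lethal knockouts that the model also predicts to be lethal). The function name and the dictionary layout are illustrative choices, not part of GrowMatch.

```python
def specificity_percent(correct_lethal, total_lethal):
    """Percentage of experimentally lethal knockouts predicted as lethal."""
    return 100.0 * correct_lethal / total_lethal


# Counts as reported in the abstract: (correct before, correct after, total).
reported_counts = {
    ("single knockouts", "minimal medium"): (87, 120, 224),
    ("single knockouts", "YP medium"): (45, 73, 182),
    ("synthetic lethals", "minimal medium"): (16, 31, 133),
    ("synthetic lethals", "YP medium"): (8, 15, 115),
}

for (dataset, medium), (before, after, total) in reported_counts.items():
    print(
        f"{dataset}, {medium}: "
        f"{specificity_percent(before, total):.2f}% -> "
        f"{specificity_percent(after, total):.2f}%"
    )
```

Running it reproduces the percentages given in the text, for example 38.84% and 53.57% for single knockouts on minimal medium.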
Nucleomorphs are the remnant nuclei of algal endosymbionts that were engulfed by nonphotosynthetic host eukaryotes. These peculiar organelles are found in cryptomonad and chlorarachniophyte algae, where they evolved from red and green algal endosymbionts, respectively. Despite their independent origins, cryptomonad and chlorarachniophyte nucleomorph genomes are similar in size and structure: they are both <1 million base pairs in size (the smallest nuclear genomes known), comprise three chromosomes, and possess subtelomeric ribosomal DNA operons. Here, we report the complete sequence of one of the smallest cryptomonad nucleomorph genomes known, that of the secondarily nonphotosynthetic cryptomonad Cryptomonas paramecium. The genome is 486 kbp in size and contains 518 predicted genes, 466 of which are protein coding. Although C. paramecium lacks photosynthetic ability, its nucleomorph genome still encodes 18 plastid-associated proteins. More than 90% of the “conserved” protein genes in C. paramecium (i.e., those with clear homologs in other eukaryotes) are also present in the nucleomorph genomes of the cryptomonads Guillardia theta and Hemiselmis andersenii. In contrast, 143 of 466 predicted C. paramecium proteins (30.7%) showed no obvious similarity to proteins encoded in any other genome, including G. theta and H. andersenii. Significantly, however, many of these “nucleomorph ORFans” are conserved in position and size between the three genomes, suggesting that they are in fact homologous to one another. Finally, our analyses reveal an unexpected degree of overlap in the genes present in the independently evolved chlorarachniophyte and cryptomonad nucleomorph genomes: ∼80% of a set of 120 conserved nucleomorph genes in the chlorarachniophyte Bigelowiella natans were also present in all three cryptomonad nucleomorph genomes. This result suggests that similar reductive processes have taken place in unrelated lineages of nucleomorph-containing algae.
nucleomorph; cryptomonads; chlorarachniophytes; genome reduction; endosymbiosis
The endosymbiotic birth of organelles is accompanied by massive transfer of endosymbiont genes to the eukaryotic host nucleus. In the centric diatom Thalassiosira pseudonana the Psb28 protein is encoded in the plastid genome while a second version is nuclear-encoded and possesses a bipartite N-terminal presequence necessary to target the protein into the diatom complex plastid. Thus it can represent a gene captured during endosymbiotic gene transfer.
To specify the origin of nuclear- and plastid-encoded Psb28 in T. pseudonana we have performed extensive phylogenetic analyses of both mentioned genes. We have also experimentally tested the intracellular location of the nuclear-encoded Psb28 protein (nuPsb28) through transformation of the diatom Phaeodactylum tricornutum with the gene in question fused to EYFP.
We show here that both versions of the psb28 gene in T. pseudonana are transcribed. We also provide experimental evidence for successful targeting of the nuPsb28 fused with EYFP to the diatom complex plastid. Extensive phylogenetic analyses demonstrate that the nucleotide composition of the analyzed genes strongly influences the tree topology and that appropriate methods designed to deal with compositional bias of the sequences and the long branch attraction artefact (LBA) need to be used to overcome this obstacle. We propose that nuclear psb28 in T. pseudonana is a duplicate of a plastid-localized version, and that it has been transferred from its endosymbiont.
Cockroaches are terrestrial insects that strikingly eliminate waste nitrogen as ammonia instead of uric acid. Blattabacterium cuenoti (Mercier 1906) strains Bge and Pam are the obligate primary endosymbionts of the cockroaches Blattella germanica and Periplaneta americana, respectively. The genomes of both bacterial endosymbionts have recently been sequenced, making possible a genome-scale constraint-based reconstruction of their metabolic networks. The mathematical expression of a metabolic network and the subsequent quantitative studies of phenotypic features by Flux Balance Analysis (FBA) represent an efficient functional approach to these uncultivable bacteria.
We report the metabolic models of Blattabacterium strains Bge (iCG238) and Pam (iCG230), comprising 296 and 289 biochemical reactions, associated with 238 and 230 genes, and 364 and 358 metabolites, respectively. The two models reflect both the striking similarities and the singularities of these microorganisms. FBA was used to analyze the properties, potential and limits of the models, assuming some environmental constraints such as aerobic conditions and the net production of ammonia from these bacterial systems, as has been experimentally observed. In addition, in silico simulations with the iCG238 model have enabled a set of carbon and nitrogen sources to be defined, which would also support a viable phenotype in terms of biomass production in the strain Pam, which lacks the first three steps of the tricarboxylic acid cycle. FBA reveals a metabolic condition that renders these enzymatic steps dispensable, thus offering a possible evolutionary explanation for their elimination. We also confirm, by computational simulations, the fragility of the metabolic networks and their host dependence.
The minimized Blattabacterium metabolic networks are surprisingly similar in strains Bge and Pam, after 140 million years of evolution of these endosymbionts in separate cockroach lineages. FBA performed on the reconstructed networks from the two bacteria helps to refine the functional analysis of the genomes enabling us to postulate how slightly different host metabolic contexts drove their parallel evolution.
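To make the FBA reasoning above concrete, the following is a minimal sketch of flux balance analysis on a deliberately tiny, hypothetical network. It is not the iCG238 or iCG230 model, and the reaction names, bounds, and stoichiometry are invented for illustration. It assumes SciPy is available and uses `scipy.optimize.linprog` to maximize a biomass flux subject to the steady-state constraint S·v = 0; a reaction is "knocked out" by fixing its flux to zero.

```python
from scipy.optimize import linprog

# Hypothetical toy network (illustration only):
#   uptake:   -> A
#   route_1:  A -> B
#   route_2:  A -> B   (alternative route)
#   biomass:  B ->
REACTIONS = ["uptake", "route_1", "route_2", "biomass"]

# Stoichiometric matrix: rows are metabolites, columns follow REACTIONS.
S = [
    [1, -1, -1, 0],  # metabolite A
    [0, 1, 1, -1],   # metabolite B
]

# Assumed flux bounds (arbitrary units); uptake is the limiting step.
DEFAULT_BOUNDS = {
    "uptake": (0, 10),
    "route_1": (0, 1000),
    "route_2": (0, 1000),
    "biomass": (0, 1000),
}


def max_biomass(knockouts=()):
    """Maximal biomass flux at steady state, with knocked-out reactions set to 0."""
    bounds = [(0, 0) if r in knockouts else DEFAULT_BOUNDS[r] for r in REACTIONS]
    # linprog minimizes, so the biomass coefficient is negated.
    objective = [0, 0, 0, -1]
    result = linprog(objective, A_eq=S, b_eq=[0, 0], bounds=bounds)
    if not result.success:
        raise RuntimeError(result.message)
    return -result.fun


print("wild type:", max_biomass())
print("route_1 knocked out:", max_biomass(["route_1"]))
print("both routes knocked out:", max_biomass(["route_1", "route_2"]))
```

In this toy case, removing one of the two parallel routes leaves the maximal biomass flux unchanged, whereas removing both abolishes it. This is the kind of in silico dispensability argument the abstract describes, shown here only in schematic form.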
Euglenophytes are a group of photosynthetic flagellates possessing a plastid derived from a green algal endosymbiont, which was incorporated into an ancestral host cell via secondary endosymbiosis. However, the impact of endosymbiosis on the euglenophyte nuclear genome is not fully understood due to its complex nature as a 'hybrid' of a non-photosynthetic host cell and a secondary endosymbiont.
We analyzed an EST dataset of the model euglenophyte Euglena gracilis using a gene mining program designed to detect laterally transferred genes. We found E. gracilis genes showing affinity not only with green algae, from which the secondary plastid in euglenophytes evolved, but also red algae and/or secondary algae containing red algal-derived plastids. Phylogenetic analyses of these 'red lineage' genes suggest that E. gracilis acquired at least 14 genes via eukaryote-to-eukaryote lateral gene transfer from algal sources other than the green algal endosymbiont that gave rise to its current plastid. We constructed an EST library of the aplastidic euglenid Peranema trichophorum, which is a eukaryovorous relative of euglenophytes, and also identified 'red lineage' genes in its genome.
Our data show genome mosaicism in E. gracilis and P. trichophorum. One possible explanation for the presence of these genes in these organisms is that some or all of them were independently acquired by lateral gene transfer and contributed to the successful integration and functioning of the green algal endosymbiont as a secondary plastid. Alternative hypotheses include the presence of a phagocytosed alga as the single source of those genes, or a cryptic tertiary endosymbiont harboring a secondary plastid of red algal origin, which the eukaryovorous ancestor of euglenophytes had acquired prior to the secondary endosymbiosis of a green alga.
As predicted by the nearly neutral model of evolution, numerous studies have shown that reduced Ne accelerates the accumulation of slightly deleterious changes under genetic drift. While such studies have mostly focused on eukaryotes, bacteria also offer excellent models to explore the effects of Ne. Most notably, the genomes of host-dependent bacteria with small Ne show signatures of genetic drift, including elevated Ka/Ks. Here, I explore the utility of an alternative measure of selective constraint: the per-site rate of radical and conservative amino acid substitutions (Dr/Dc). I test the hypothesis that purifying selection against radical amino acid changes is less effective in two insect endosymbiont groups (Blochmannia of ants and Buchnera of aphids), compared to related gamma-Proteobacteria. Genome comparisons demonstrate a significant elevation in Dr/Dc in endosymbionts that affects the majority (66–79%) of shared orthologs examined. The elevation of Dr/Dc in endosymbionts affects all functional categories examined. Simulations indicate that Dr/Dc estimates are sensitive to codon frequencies and mutational parameters; however, estimation biases occur in the opposite direction to the patterns observed in genome comparisons, thereby making the inference of elevated Dr/Dc more conservative. Increased Dr/Dc and other signatures of genome degradation in endosymbionts are consistent with strong effects of genetic drift in their small populations, as well as linkage to selected sites in these asexual bacteria. While relaxed selection against radical substitutions may contribute, genome-wide processes such as genetic drift and linkage best explain the pervasive elevation in Dr/Dc across diverse functional categories that include basic cellular processes.
Although the current study focuses on a few bacterial lineages, it suggests Dr/Dc is a useful gauge of selective constraint and may provide a valuable alternative to Ka/Ks when high sequence divergences preclude estimates of Ks. Broader application of Dr/Dc will benefit from approaches less prone to estimation biases.
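The distinction between radical and conservative substitutions can be illustrated with a short sketch. It simply counts amino acid differences between two aligned protein sequences and labels each as radical or conservative. This is not the Dr/Dc estimator used in the study, which is a ratio of per-site rates and therefore also requires counting the radical and conservative sites available for change. The charge-based classification (positive: R, H, K; negative: D, E; neutral: all others) is an assumption made here for illustration; the text does not specify which physicochemical classification was applied.

```python
POSITIVE = set("RHK")  # assumed positively charged residues
NEGATIVE = set("DE")   # assumed negatively charged residues


def charge_class(residue):
    """Return '+', '-' or '0' for a one-letter amino acid code."""
    if residue in POSITIVE:
        return "+"
    if residue in NEGATIVE:
        return "-"
    return "0"


def count_changes(seq_a, seq_b):
    """Count (radical, conservative) differences between two aligned proteins.

    A difference is radical if the two residues fall in different charge
    classes and conservative otherwise. Alignment gaps ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    radical = conservative = 0
    for a, b in zip(seq_a, seq_b):
        if a == b or "-" in (a, b):
            continue
        if charge_class(a) != charge_class(b):
            radical += 1
        else:
            conservative += 1
    return radical, conservative


# Made-up sequences: K->R and D->E keep the charge class, E->K changes it.
print(count_changes("MKDLE", "MRELK"))  # (1, 2)
```

Raw counts like these are sensitive to composition, which is one reason the per-site normalization and the estimation biases discussed above matter.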
Whole-genome transporter analyses have been conducted on 141 organisms whose complete genome sequences are available. For each organism, the complete set of membrane transport systems was identified with predicted functions, and classified into protein families based on the transporter classification system. Organisms with larger genome sizes generally possessed a relatively greater number of transport systems. In prokaryotes and unicellular eukaryotes, the significant factor in the increase in transporter content with genome size was a greater diversity of transporter types. In contrast, in multicellular eukaryotes, a greater number of paralogs in specific transporter families was the more important factor in the increase in transporter content with genome size. Both eukaryotic and prokaryotic intracellular pathogens and endosymbionts exhibited markedly limited transport capabilities. Hierarchical clustering of phylogenetic profiles of transporter families, derived from the presence or absence of a certain transporter family, showed that clustering patterns of organisms were correlated with both their evolutionary history and their overall physiology and lifestyles.
Membrane transporters are the cell's equivalent of delivery vehicles, garbage disposals, and communication systems—proteins that negotiate through cell membranes to deliver essential nutrients, eject waste products, and help the cell sense environmental conditions around it. Membrane transport systems play crucial roles in fundamental cellular processes of all organisms. The suite of transporters in any one organism also sheds light on its lifestyle and physiology. Up to now, analysis of membrane transporters has been limited mainly to the examination of transporter genes of individual organisms. But advances in genome sequencing have now made it possible for scientists to compare transport and other essential cellular processes across a range of organisms in all three domains of life.
Ren and Paulsen present the first comprehensive bioinformatic analysis of the predicted membrane transporter content of 141 different prokaryotic and eukaryotic organisms. The scientists developed a new computational application of the phylogenetic profiling approach to cluster together organisms that appear to have similar suites of transporters. For example, a group of obligate intracellular pathogens and endosymbionts possess only limited transporter systems in spite of the massive metabolite fluxes one would expect between the symbionts and their host. This is likely due to the relatively static nature of their intracellular environment. In contrast, a cluster of plant/soil-associated microbes encode a robust array of transporters, reflecting the organisms' versatility as well as their exposure to a wide range of different substrates in their natural environment.
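The clustering of organisms by transporter phylogenetic profiles can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' pipeline: the organism names and family labels are invented, the Jaccard distance on presence/absence sets and average linkage are choices made here, and the text does not say which distance or linkage was actually used.

```python
from itertools import combinations

# Hypothetical presence/absence profiles: organism -> transporter families present.
profiles = {
    "endosymbiont_1": {"F1", "F2"},
    "endosymbiont_2": {"F1", "F2", "F3"},
    "soil_microbe_1": {"F1", "F3", "F4", "F5", "F6"},
    "soil_microbe_2": {"F1", "F4", "F5", "F6"},
}


def jaccard_distance(a, b):
    """1 - |intersection| / |union| for two sets of transporter families."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(cluster_1, cluster_2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    distances = [
        jaccard_distance(profiles[x], profiles[y])
        for x in cluster_1
        for y in cluster_2
    ]
    return sum(distances) / len(distances)


def cluster_organisms(profiles, n_clusters):
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [frozenset([name]) for name in profiles]
    while len(clusters) > n_clusters:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


for group in cluster_organisms(profiles, 2):
    print(sorted(group))
```

With these toy profiles the two organisms with small transporter repertoires group together, as do the two with broad repertoires, mirroring in miniature the lifestyle-based clusters described above.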
The availability of complete genome sequence data from both bacteria and eukaryotes provides information about the contribution of bacterial genes to the origin and evolution of mitochondria. Phylogenetic analyses based on genes located in the mitochondrial genome indicate that these genes originated from within the alpha-proteobacteria. A number of ancestral bacterial genes have also been transferred from the mitochondrial to the nuclear genome, as evidenced by the presence of orthologous genes in the mitochondrial genome in some species and in the nuclear genome of other species. However, a multitude of mitochondrial proteins encoded in the nucleus display no homology to bacterial proteins, indicating that these originated within the eukaryotic cell subsequent to the acquisition of the endosymbiont. An analysis of the expression patterns of yeast nuclear genes coding for mitochondrial proteins has shown that genes predicted to be of eukaryotic origin are mainly translated on polysomes that are free in the cytosol whereas those of putative bacterial origin are translated on polysomes attached to the mitochondrion. The strong relationship with alpha-proteobacterial genes observed for some mitochondrial genes, combined with the lack of such a relationship for others, indicates that the modern mitochondrial proteome is the product of both reductive and expansive processes.