Prokaryotic taxonomy is the underpinning of microbiology, as it provides a framework for the proper identification and naming of organisms. The “gold standard” of bacterial species delineation is the overall genome similarity determined by DNA-DNA hybridization (DDH), a technically rigorous yet sometimes variable method that may produce inconsistent results. Improvements in next-generation sequencing have resulted in an upsurge of bacterial genome sequences and bioinformatic tools that compare genomic data, such as average nucleotide identity (ANI), correlation of tetranucleotide frequencies, and the genome-to-genome distance calculator, or in silico DDH (isDDH). Here, we evaluate ANI and isDDH in combination with phylogenetic studies, using as a test case Aeromonas, a taxonomically challenging genus with many described species and several strains that were reassigned to different species. We generated improved, high-quality draft genome sequences for 33 Aeromonas strains and combined them with 23 publicly available genomes. ANI and isDDH distances were determined and compared to phylogenies from multilocus sequence analysis of housekeeping genes, ribosomal proteins, and expanded core genes. The expanded core phylogenetic analysis suggested relationships between distant Aeromonas clades that were inconsistent with studies using fewer genes. ANI values of ≥96% and isDDH values of ≥70% consistently grouped genomes originating from strains of the same species together. Our study confirmed known misidentifications, validated the recent revisions in the nomenclature, and revealed that a number of genomes deposited in GenBank are misnamed. In addition, two strains were identified that may represent novel Aeromonas species.
Improvements in DNA sequencing technologies have resulted in the ability to generate large numbers of high-quality draft genomes and led to a dramatic increase in the number of publicly available genomes. This has allowed researchers to characterize microorganisms using genome data. Advantages of genome sequence-based classification include data and computing programs that can be readily shared, facilitating the standardization of taxonomic methodology and resolving conflicting identifications by providing greater uniformity in an overall analysis. Using Aeromonas as a test case, we compared and validated different approaches. Based on our analyses, we recommend cutoff values for distance measures for identifying species. Accurate species classification is critical not only to obviate the perpetuation of errors in public databases but also to ensure the validity of inferences made on the relationships among species within a genus and proper identification in clinical and veterinary diagnostic laboratories.
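As an illustration of how the cutoffs reported above (ANI ≥96%, isDDH ≥70%) might be applied once pairwise values are in hand, the following minimal Python sketch groups genomes into putative species. It is not the workflow used in the study: the strain names and scores are hypothetical, requiring both cutoffs at once is an assumption, and single-linkage grouping via union-find is an assumed clustering rule chosen for brevity.

```python
# Cutoffs taken from the abstract; everything else here is illustrative.
ANI_CUTOFF = 96.0    # percent average nucleotide identity
ISDDH_CUTOFF = 70.0  # percent in silico DNA-DNA hybridization


def same_species(ani, isddh):
    """Assumed rule: a pair is conspecific only if both cutoffs are met."""
    return ani >= ANI_CUTOFF and isddh >= ISDDH_CUTOFF


def group_genomes(names, pair_scores):
    """Single-linkage grouping of genomes with a small union-find.

    pair_scores maps (name_a, name_b) -> (ani_percent, isddh_percent).
    Returns a sorted list of sorted groups.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), (ani, isddh) in pair_scores.items():
        if same_species(ani, isddh):
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical strains and scores, for illustration only.
names = ["strainA", "strainB", "strainC", "strainD"]
pair_scores = {
    ("strainA", "strainB"): (97.5, 78.0),
    ("strainA", "strainC"): (93.1, 52.0),
    ("strainB", "strainC"): (92.8, 50.5),
    ("strainC", "strainD"): (96.4, 71.2),
}
species_groups = group_genomes(names, pair_scores)
```

Single linkage can chain borderline genomes into one group, so values close to either cutoff are better inspected alongside the phylogenies than accepted automatically.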
We recently reported that the Thermotogales acquired the ability to synthesize vitamin B12 by acquisition of genes from two distantly related lineages, Archaea and Firmicutes (K. S. Swithers et al., Genome Biol. Evol. 4:730–739, 2012). Ancestral state reconstruction suggested that the cobinamide salvage gene cluster was present in the Thermotogales' most recent common ancestor. We also predicted that Thermotoga lettingae could not synthesize B12 de novo but could use the cobinamide salvage pathway to synthesize B12. In this study, these hypotheses were tested, and we found that Tt. lettingae did not synthesize B12 de novo but salvaged cobinamide. The growth rate of Tt. lettingae increased with the addition of B12 or cobinamide to its medium. It synthesized B12 when the medium was supplemented with cobinamide, and no B12 was detected in cells grown on cobinamide-deficient medium. Upstream of the cobinamide salvage genes is a putative B12 riboswitch. In other organisms, B12 riboswitches allow for higher transcriptional activity in the absence of B12. When Tt. lettingae was grown with no B12, the salvage genes were upregulated compared to cells grown with B12 or cobinamide. Another gene cluster with a putative B12 riboswitch upstream is the btuFCD ABC transporter, and it showed a transcription pattern similar to that of the cobinamide salvage genes. The BtuF proteins from species that can and cannot salvage cobinamides were shown in vitro to bind both B12 and cobinamide. These results suggest that Thermotogales species can use the BtuFCD transporter to import both B12 and cobinamide, even if they cannot salvage cobinamide.
The Halobacteria are known to engage in frequent gene transfer and homologous recombination. For stably diverged lineages to persist, some checks on the rate of between-lineage recombination must exist. We surveyed a group of isolates from the Aran-Bidgol endorheic lake in Iran and sequenced a selection of them. Multilocus Sequence Analysis (MLSA) and Average Nucleotide Identity (ANI) revealed multiple clusters (phylogroups) of organisms present in the lake. Patterns of intein and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) presence/absence and their sequence similarity, GC usage, the ANI, and the identities of the genes used in the MLSA revealed that two of these clusters share an exchange bias toward others in their phylogroup while showing reduced rates of exchange with other organisms in the environment. However, a third cluster, composed in part of named species from other areas of central Asia, displayed many indications of variability in exchange partners, from within the lake as well as outside the lake. We conclude that barriers to gene exchange exist between the two purely Aran-Bidgol phylogroups, and that the third cluster with members from other regions is not a single population and likely reflects an amalgamation of several populations.
Halobacteria; Multilocus Sequence Analysis (MLSA); Average Nucleotide Identity (ANI); intein; CRISPR
Halobacteria require high NaCl concentrations for growth and are the dominant inhabitants of hypersaline environments above 15% NaCl. They are well-documented to be highly recombinogenic, both in frequency and in the range of exchange partners. In this study, we examine the genetic and genomic variation of cultured, naturally co-occurring environmental populations of Halobacteria. Sequence data from multiple loci (~2500 bp) identified many closely and more distantly related strains belonging to the genera Halorubrum and Haloarcula. Genome fingerprinting using a random priming PCR amplification method to analyze these isolates revealed diverse banding patterns across each of the genera and surprisingly even for isolates that are identical at the nucleotide level for five protein coding sequenced loci. This variance in genome structure even between identical multilocus sequence analysis (MLSA) haplotypes indicates that accumulation of genomic variation is rapid: faster than the rate of third codon substitutions.
Halobacteria; MLSA; genome fingerprinting; Aran-Bidgol lake; environmental population
The bacterial genomes of Thermotoga species show evidence of significant interdomain horizontal gene transfer from the Archaea. Members of this genus acquired many genes from the Thermococcales, which grow at higher temperatures than Thermotoga species. To study the functional history of an interdomain horizontally acquired gene, we used ancestral sequence reconstruction to examine the thermal characteristics of reconstructed ancestral proteins of the Thermotoga lineage and its archaeal donors. Several ancestral sequence reconstruction methods were used to determine the possible sequences of the ancestral Thermotoga and Archaea myo-inositol-3-phosphate synthase (MIPS). These sequences were predicted to be more thermostable than the extant proteins using an established sequence composition method. We verified these computational predictions by measuring the activities and thermostabilities of purified proteins from the Thermotoga and the Thermococcales species, and eight ancestral reconstructed proteins. We found that the ancestral proteins from both the archaeal donor and the Thermotoga most recent common ancestor recipient were more thermostable than their descendants. We show that there is a correlation between the thermostability of MIPS protein and the optimal growth temperature (OGT) of its host, which suggests that the ancestors of these species of Archaea and Thermotoga had higher OGTs than their descendants.
In vitro studies of the haloarchaeal genus Haloferax have demonstrated their ability to frequently exchange DNA between species, whereas rates of homologous recombination estimated from natural populations in the genus Halorubrum are high enough to maintain random association of alleles between five loci. To quantify the effects of gene transfer and recombination of commonly held (relaxed core) genes during the evolution of the class Halobacteria (haloarchaea), we reconstructed the history of 21 genomes representing all major groups. Using a novel algorithm and a concatenated ribosomal protein phylogeny as a reference, we created a directed horizontal genetic transfer (HGT) network of contemporary and ancestral genomes. Gene order analysis revealed that 90% of testable HGTs were by direct homologous replacement, rather than nonhomologous integration followed by a loss. Network analysis revealed an inverse log-linear relationship between HGT frequency and ribosomal protein evolutionary distance that is maintained across the deepest divergences in Halobacteria. We use this mathematical relationship to estimate the total transfers and amino acid substitutions delivered by HGTs in each genome, providing a measure of chimerism. For the relaxed core genes of each genome, we conservatively estimate that 11–20% of their evolution occurred in other haloarchaea. Our findings are unexpected, because the transfer and homologous recombination of relaxed core genes between members of the class Halobacteria disrupts the coevolution of genes; however, the generation of new combinations of divergent but functionally related genes may lead to adaptive phenotypes not available through cumulative mutations and recombination within a single
homologous recombination; horizontal gene transfer; lateral gene transfer; fitness landscape; populations; microbial evolution
The availability of genome sequences of Thermotogales species from across the order allows an examination of the evolutionary origins of phenotypic characteristics in this lineage. Several studies have shown that the Thermotogales have acquired large numbers of genes from distantly related lineages, particularly Firmicutes and Archaea. Here, we report the finding that some Thermotogales acquired the ability to synthesize vitamin B12 by acquiring the requisite genes from these distant lineages. Thermosipho species, uniquely among the Thermotogales, contain genes that encode the means to synthesize vitamin B12 de novo from glutamate. These genes are split into two gene clusters: the corrinoid synthesis gene cluster, which is unique to Thermosipho, and the cobinamide salvage gene cluster. The corrinoid synthesis cluster was acquired from the Firmicutes lineage, whereas the salvage pathway is an amalgam of bacteria- and archaea-derived proteins. The cobinamide salvage gene cluster has a patchy distribution among Thermotogales species, and ancestral state reconstruction suggests that this pathway was present in the common Thermotogales ancestor. We show that Thermosipho africanus can grow in the absence of vitamin B12, so its de novo pathway is functional. We detected vitamin B12 in the extracts of T. africanus cells to verify the synthetic pathway. Genes in T. africanus with apparent B12 riboswitches were found to be down-regulated in the presence of vitamin B12, consistent with their roles in B12 synthesis and cobinamide salvage.
cobalamin; cobinamide; Thermotogales; vitamin B12; horizontal gene transfer
The unifying structural characteristic of members of the bacterial order Thermotogales is their toga, an unusual cell envelope that includes a loose-fitting sheath around each cell. Only two toga-associated structural proteins have been purified and characterized in Thermotoga maritima: the anchor protein OmpA1 (or Ompα) and the porin OmpB (or Ompβ). The gene encoding OmpA1 (ompA1) was cloned and sequenced and later assigned to TM0477 in the genome sequence, but because no peptide sequence was available for OmpB, its gene (ompB) was not annotated. We identified six porin candidates in the genome sequence of T. maritima. Of these candidates, only one, encoded by TM0476, has all the characteristics reported for OmpB and characteristics expected of a porin including predominant β-sheet structure, a carboxy terminus porin anchoring motif, and a porin-specific amino acid composition. We highly enriched a toga fraction of cells for OmpB by sucrose gradient centrifugation and hydroxyapatite chromatography and analyzed it by LC/MS/MS. We found that the only porin candidate that it contained was the TM0476 product. This cell fraction also had β-sheet character as determined by circular dichroism, consistent with its enrichment for OmpB. We conclude that TM0476 encodes OmpB. A phylogenetic analysis of OmpB found orthologs encoded in syntenic locations in the genomes of all but two Thermotogales species. Those without orthologs have putative isofunctional genes in their place. Phylogenetic analyses of OmpA1 revealed that each species of the Thermotogales has one or two OmpA homologs. T. maritima has two OmpA homologs, encoded by ompA1 (TM0477) and ompA2 (TM1729), both of which were found in the toga protein-enriched cell extracts. These annotations of the genes encoding toga structural proteins will guide future examinations of the structure and function of this unusual lineage-defining cell sheath.
Horizontal gene transfer (HGT) has greatly impacted the genealogical history of many lineages, particularly for prokaryotes, with genes frequently moving in and out of a line of descent. Many genes that were acquired by a lineage in the past likely originated from ancestral relatives that have since gone extinct. During the course of evolution, HGT has played an essential role in the origin and dissemination of genetic and metabolic novelty.
Three divergent forms of leucyl-tRNA synthetase (LeuRS) exist in the archaeal order Halobacteriales, commonly known as haloarchaea. Few haloarchaeal genomes have the typical archaeal form of this enzyme and phylogenetic analysis indicates it clusters within the Euryarchaeota as expected. The majority of sequenced halobacterial genomes possess a bacterial form of LeuRS. Phylogenetic reconstruction puts this larger group of haloarchaea at the base of the bacterial domain. The most parsimonious explanation is that an ancient transfer of LeuRS took place from an organism related to the ancestor of the bacterial domain to the haloarchaea. The bacterial form of LeuRS further underwent gene duplications and/or gene transfers within the haloarchaea, with some genomes possessing two distinct types of bacterial LeuRS. The cognate tRNALeu also reveals two distinct clusters for the haloarchaea; however, these tRNALeu clusters do not coincide with the groupings found in the LeuRS tree, revealing that LeuRS evolved independently of its cognate tRNA.
The study of leucyl-tRNA synthetase in haloarchaea illustrates the importance of gene transfer originating in lineages that went extinct since the transfer occurred. The haloarchaeal LeuRS and tRNALeu did not co-evolve.
Because prokaryotes frequently exchange genetic material, extracting a majority or plurality phylogenetic signal from many gene families, and identifying gene families that significantly conflict with that signal, are frequent tasks in comparative genomics, especially in phylogenomic analyses. Decomposition of gene trees into embedded quartets (unrooted trees each with four taxa) is a convenient and statistically powerful technique to address this challenging problem. This approach was shown to be useful in several studies of completely sequenced microbial genomes.
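To make the idea of embedded quartets concrete, the following Python sketch decomposes small gene trees into the four-taxon topologies they display and tallies how many trees support each one. It is a minimal illustration rather than the server's implementation: trees are written as nested tuples instead of Newick strings, the taxon names are hypothetical, and no bootstrap or posterior support is taken into account.

```python
from collections import Counter
from itertools import combinations


def leaf_sets(tree, clades):
    """Return the leaves below `tree`; record every internal clade in `clades`."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, clades) for child in tree))
    clades.append(leaves)
    return leaves


def embedded_quartets(tree):
    """Set of quartets ab|cd displayed by the tree.

    Each internal branch splits the taxa into two sides; any two taxa from
    one side paired with any two from the other side form a displayed quartet.
    """
    clades = []
    taxa = leaf_sets(tree, clades)
    found = set()
    for clade in clades:
        outside = taxa - clade
        for inside_pair in combinations(sorted(clade), 2):
            for outside_pair in combinations(sorted(outside), 2):
                found.add(frozenset([frozenset(inside_pair),
                                     frozenset(outside_pair)]))
    return found


def quartet_spectrum(trees):
    """Count, over all gene trees, how many trees display each quartet."""
    counts = Counter()
    for tree in trees:
        counts.update(embedded_quartets(tree))
    return counts


def quartet(left, right):
    """Helper to build a quartet key from two strings of one-letter taxa."""
    return frozenset([frozenset(left), frozenset(right)])


# Hypothetical gene trees over taxa A-E; the third tree conflicts with the
# first two in the placement of B and C.
tree1 = ((("A", "B"), "C"), ("D", "E"))
tree2 = ((("A", "C"), "B"), ("D", "E"))
spectrum = quartet_spectrum([tree1, tree1, tree2])
```

For any four taxa, the competing topologies (for example AB|CD versus AC|BD) can then be compared by their counts to identify a plurality signal and the gene families that conflict with it.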
We present here a web server that takes a collection of gene phylogenies, decomposes them into quartets, generates a Quartet Spectrum, and draws a split network. Users are also provided with various data download options for further analyses. Each gene phylogeny is to be represented by an assessment of phylogenetic information content, such as sets of trees reconstructed from bootstrap replicates or sampled from a posterior distribution. The Quartet Decomposition server is accessible at http://quartets.uga.edu.
The Quartet Decomposition server presented here provides a convenient means to perform Quartet Decomposition analyses and will empower users to find statistically supported phylogenetic conflicts.
Thermotoga sp. strain RQ2 is probably a strain of Thermotoga maritima. Its complete genome sequence allows for an examination of the extent and consequences of gene flow within Thermotoga species and strains. Thermotoga sp. RQ2 differs from T. maritima in its genes involved in myo-inositol metabolism. Its genome also encodes an apparent fructose phosphotransferase system (PTS) sugar transporter. This operon is also found in Thermotoga naphthophila strain RKU-10 but no other Thermotogales. These are the first reported PTS transporters in the Thermotogales.
Kosmotoga olearia strain TBF 19.5.1 is a member of the Thermotogales that grows best at 65°C and very well even at 37°C. Information about this organism is important for understanding the evolution of mesophiles from thermophiles. Its genome sequence reveals extensive gene gains and a large content of mobile genetic elements. It also contains putative hydrogenase genes that have no homologs in the other members of the Thermotogales.
Homing endonucleases are site-specific, rare-cutting endonucleases often encoded by intron- or intein-containing genes. They lead to the rapid spread of the genetic element that hosts them by a process termed 'homing', and ultimately the allele containing the element will be fixed in the population.
PI-SceI, an endonuclease encoded as a protein insert or intein within the yeast V-ATPase catalytic subunit encoding gene (vma1), is among the best characterized homing endonucleases. The structures of the Sce VMA1 intein and of the intein bound to its target site are known. Extensive biochemical studies performed on the PI-SceI enzyme provide information useful to recognize critical amino acids involved in self-splicing and endonuclease functions of the protein. Here we describe an insertion of the Green Fluorescent Protein (GFP) into a loop located between the endonuclease and splicing domains of the Sce VMA1 intein. The GFP is functional and the additional GFP domain does not prevent intein excision and endonuclease activity. However, the endonuclease activity of the newly engineered protein differed from that of the wild-type protein in that it required the presence of Mn2+ and not Mg2+ metal cations for activity.
Homing endonuclease; intein; GFP-fusion protein; PI-SceI; Sce VMA
In the presence of horizontal gene transfer (HGT), the concepts of lineage and genealogy in the microbial world become more ambiguous because chimeric genomes trace their ancestry from a myriad of sources, both living and extinct.
We present the evolutionary histories of three aminoacyl-tRNA synthetases (aaRS) to illustrate that the concept of organismal lineage in the prokaryotic world is defined by both vertical inheritance and reticulations due to HGT. The acquisition of a novel gene from a distantly related taxon can be considered as a shared derived character that demarcates a group of organisms, as in the case of the spirochaete Phenylalanyl-tRNA synthetase (PheRS). On the other hand, when organisms transfer genetic material with their close kin, the similarity and therefore relatedness observed among them is essentially shaped by gene transfer. Studying the distribution patterns of divergent genes with identical functions, referred to as homeoalleles, can reveal preferences for transfer partners. We describe the very ancient origin and the distribution of the archaeal homeoalleles for Threonyl-tRNA synthetases (ThrRS) and Seryl-tRNA synthetases (SerRS).
Patterns created through biased HGT can be indistinguishable from those created through shared organismal ancestry. A re-evaluation of the definition of lineage is necessary to reflect genetic relatedness due to both HGT and vertical inheritance. In most instances, HGT bias will maintain and strengthen similarity within groups. Only in cases where HGT bias is due to other factors, such as shared ecological niche, do patterns emerge from gene phylogenies that are in conflict with those reflecting shared organismal ancestry.
This article was reviewed by W. Ford Doolittle, François-Joseph Lapointe, and Frederic Bouchard.
Phylogenetic reconstruction using DNA and protein sequences has allowed the reconstruction of evolutionary histories encompassing all life. We present and discuss a means to incorporate much of this rich narrative into a single model that acknowledges the discrete evolutionary units that constitute the organism. Briefly, this Rooted Net of Life genome phylogeny is constructed around an initial, well resolved and rooted tree scaffold inferred from a supermatrix of combined ribosomal genes. Extant sampled ribosomes form the leaves of the tree scaffold. These leaves, but not necessarily the deeper parts of the scaffold, can be considered to represent a genome or pan-genome, and to be associated with members of other gene families within that sequenced (pan)genome. Unrooted phylogenies of gene families containing four or more members are reconstructed and superimposed over the scaffold. Initially, reticulations are formed where incongruities between topologies exist. Given sufficient evidence, edges may then be differentiated as those representing vertical lines of inheritance within lineages and those representing horizontal genetic transfers or endosymbioses between lineages.
This article was reviewed by W. Ford Doolittle, Eric Bapteste and Robert Beiko.
In 2009, James Lake introduced a new hypothesis in which reticulate phylogeny reconstruction is used to elucidate the origin of Gram-negative bacteria (Nature 460: 967–971). The presented data supported the Gram-negative bacteria originating from an ancient endosymbiosis between the Actinobacteria and Clostridia. His conclusion was based on a presence-absence analysis of protein families that divided all prokaryotes into five groups: Actinobacteria, Double Membrane bacteria (DM), Clostridia, Archaea and Bacilli. Of these five groups, the DM are by far the largest and most diverse group compared to the other groupings. While the fusion hypothesis for the origin of double membrane bacteria is enticing, we show that the signal supporting an ancient symbiosis is lost when the DM group is broken down into smaller subgroups. We conclude that the signal detected in James Lake's analysis in part results from a systematic artifact due to group size and diversity combined with low levels of horizontal gene transfer.
Sequencing of genomes from many different bacterial and archaeal groups is broadening the picture of the prokaryotic pan-genome.
A new initiative provides comparative genomicists with a more complete picture of genome diversity. Here we discuss the improved sampling strategy.
Aeromonas veronii biovar sobria, Aeromonas veronii biovar veronii, and Aeromonas allosaccharophila are a closely related group of organisms, the Aeromonas veronii Group, that inhabit a wide range of host animals as a symbiont or pathogen. In this study, the ability of various strains to colonize the medicinal leech as a model for beneficial symbiosis and to kill wax worm larvae as a model for virulence was determined. Isolates cultured from the leech out-competed other strains in the leech model, while most strains were virulent in the wax worms. Three housekeeping genes, recA, dnaJ and gyrB, the gene encoding chitinase, chiA, and four loci associated with the type three secretion system, ascV, ascFG, aexT, and aexU were sequenced. The phylogenetic reconstruction failed to produce one consensus tree that was compatible with most of the individual genes. The Approximately Unbiased test and the Genetic Algorithm for Recombination Detection both provided further support for differing evolutionary histories among this group of genes. Two contrasting tests detected recombination within aexU, ascFG, ascV, dnaJ, and gyrB but not in aexT or chiA. Quartet decomposition analysis indicated a complex recent evolutionary history for these strains with a high frequency of horizontal gene transfer between several but not among all strains. In this study we demonstrate that at least for some strains, horizontal gene transfer occurs at a sufficient frequency to blur the signal from vertically inherited genes, despite strains being adapted to distinct niches. Simply increasing the number of genes included in the analysis is unlikely to overcome this challenge in organisms that occupy multiple niches and can exchange DNA between strains specialized to different niches. Instead, the detection of genes critical in the adaptation to specific niches may help to reveal the physiological specialization of these strains.
Horizontal gene transfer (HGT) is often considered to be a source of error in phylogenetic reconstruction, causing individual gene trees within an organismal lineage to be incongruent, obfuscating the ‘true’ evolutionary history. However, when identified as such, HGTs between divergent organismal lineages are useful, phylogenetically informative characters that can provide insight into evolutionary history. Here, we discuss several distinct HGT events involving all three domains of life, illustrating the selective advantages that can be conveyed via HGT, and the utility of HGT in aiding phylogenetic reconstruction and in dating the relative sequence of speciation events. We also discuss the role of HGT from extinct lineages, and its impact on our understanding of the evolution of life on Earth. Organismal phylogeny needs to incorporate reticulations; a simple tree does not provide an accurate depiction of the processes that have shaped life's history.
horizontal gene transfer; chlamydiae; cyanobacteria; acetoclastic methanogenesis; pyrrolysine; extinct lineages
Universally conserved positions in ribosomal proteins have significant biases in amino acid usage, likely indicating the expansion of the genetic code at the time leading up to the most recent common ancestor(s) (MRCA). Here, we apply this principle to the evolutionary history of the ribosome before the MRCA. It has been proposed that the experimentally determined order of assembly for ribosomal subunits recapitulates their evolutionary chronology. Given this model, we produce a probabilistic evolutionary ordering of the universally conserved small subunit (SSU) and large subunit (LSU) ribosomal proteins. Optimizing the relative ordering of SSU and LSU evolutionary chronologies with respect to minimizing differences in amino acid usage bias, we find strong compositional evidence for a more ancient origin for early LSU proteins. Furthermore, we find that this ordering produces several trends in specific amino acid usages compatible with models of genetic code evolution.
Reconstructing the 'Tree of Life' is complicated by extensive horizontal gene transfer between diverse groups of organisms. While numerous conceptual and technical obstacles remain, a report in this issue of Journal of Biology from Koonin and colleagues on the largest-scale prokaryotic genomic reconstruction yet attempted shows that such a tree is discernible, although its branches cannot be traced.
Inteins and introns are genetic elements that are removed from proteins and RNA after translation or transcription, respectively. Previous studies have suggested that these genetic elements are found in conserved parts of the host protein. To our knowledge, this type of analysis has not been done for group II introns residing within a gene. Here we provide quantitative statistical support from an analysis of proteins that host inteins, group I introns, group II introns and spliceosomal introns across all three domains of life.
To determine whether or not inteins, group I, group II, and spliceosomal introns are found preferentially in conserved regions of their respective host protein, conservation profiles were generated and intein and intron positions were mapped to the profiles. Fisher's combined probability test was used to determine the significance of the distribution of insertion sites across the conservation profile for each protein. For a subset of studied proteins, the conservation profile and insertion positions were mapped to protein structures to determine if the insertion sites correlate to regions of functional activity. All inteins and most group I introns were found to be preferentially located within conserved regions; in contrast, a bacterial intein-like protein, group II and spliceosomal introns did not show a preference for conserved sites.
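Fisher's combined probability test, named above, pools k independent p-values through the statistic X² = −2 Σ ln pᵢ, which follows a chi-square distribution with 2k degrees of freedom when every null hypothesis holds. The Python sketch below shows only that calculation; the input p-values are hypothetical, and how the study derived per-site p-values from the conservation profiles is not specified here and is not reproduced. Because the degrees of freedom are even, the chi-square tail probability has a closed form, so only the standard library is needed.

```python
import math


def fisher_combined(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i) and
    combined_p is the upper tail of a chi-square with 2k degrees of freedom.
    """
    k = len(p_values)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must be a non-empty list in (0, 1]")

    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k the chi-square survival function is
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical per-site p-values for one host protein, for illustration only.
statistic, combined_p = fisher_combined([0.05, 0.05])
```

With a single p-value the method returns that p-value unchanged, which is a convenient sanity check on any implementation.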
These findings demonstrate that inteins and group I introns are found preferentially in conserved regions of their respective host proteins. Homing endonucleases are often located within inteins and group I introns and these may facilitate mobility to conserved regions. Insertion at these conserved positions decreases the chance of elimination and slows deletion of the elements, since removal of the elements has to be precise so as not to disrupt the function of the protein. Furthermore, functional constraints on the targeted site make it more difficult for hosts to evolve immunity to the homing endonuclease. Therefore, these elements will better survive and propagate as molecular parasites in conserved sites. In contrast, spliceosomal introns and group II introns do not show significant preference for conserved sites and appear to have adopted a different strategy to evade loss.
The concept of a tree of life is prevalent in the evolutionary literature. It stems from attempting to obtain a grand unified natural system that reflects a recurrent process of species and lineage splittings for all forms of life. Traditionally, the discipline of systematics operates in a similar hierarchy of bifurcating (sometimes multifurcating) categories. The assumption of a universal tree of life hinges upon the process of evolution being tree-like throughout all forms of life and all of biological time. In multicellular eukaryotes, the molecular mechanisms and species-level population genetics of variation do indeed mainly cause a tree-like structure over time. In prokaryotes, they do not. Prokaryotic evolution and the tree of life are two different things, and we need to treat them as such, rather than extrapolating from macroscopic life to prokaryotes. In the following we will consider this circumstance from philosophical, scientific, and epistemological perspectives, surmising that phylogeny opted for a single model as a holdover from the Modern Synthesis of evolution.
It was far easier to envision and defend the concept of a universal tree of life before we had data from genomes. But the belief that prokaryotes are related by such a tree has now become stronger than the data to support it. The monistic concept of a single universal tree of life appears, in the face of genome data, increasingly obsolete. This traditional model to describe evolution is no longer the most scientifically productive position to hold, because of the plurality of evolutionary patterns and mechanisms involved. Forcing a single bifurcating scheme onto prokaryotic evolution disregards the non-tree-like nature of natural variation among prokaryotes and accounts for only a minority of observations from genomes.
Prokaryotic evolution and the tree of life are two different things. Hence we will briefly set out alternative models to the tree of life to study their evolution. Ultimately, the plurality of evolutionary patterns and mechanisms involved, such as the discontinuity of the process of evolution across the prokaryote-eukaryote divide, summons forth a pluralistic approach to studying evolution.
This article was reviewed by Ford Doolittle, John Logsdon and Nicolas Galtier.
Prochlorococcus is a genus of marine cyanobacteria characterized by small cell and genome size, an evolutionary trend toward low GC content, the possession of chlorophyll b, and the absence of phycobilisomes. Whereas many shared derived characters define Prochlorococcus as a clade, many genome-based analyses recover them as paraphyletic, with some low-light adapted Prochlorococcus spp. grouping with marine Synechococcus. Here, we use 18 Prochlorococcus and marine Synechococcus genomes to analyze gene flow within and between these taxa. We introduce embedded quartet scatter plots as a tool to screen for genes whose phylogeny agrees or conflicts with the plurality phylogenetic signal, with accepted taxonomy and naming, with GC content, and with the ecological adaptation to high and low light intensities. We find that most gene families support high-light adapted Prochlorococcus spp. as a monophyletic clade and low-light adapted Prochlorococcus sp. as a paraphyletic group. But we also detect 16 gene families that were transferred between high-light adapted and low-light adapted Prochlorococcus sp. and 495 gene families, including 19 ribosomal proteins, that do not cluster designated Prochlorococcus and Synechococcus strains in the expected manner. To explain the observed data, we propose that frequent gene transfer between marine Synechococcus spp. and low-light adapted Prochlorococcus spp. has created a “highway of gene sharing” (Beiko RG, Harlow TJ, Ragan MA. 2005. Highways of gene sharing in prokaryotes. Proc Natl Acad Sci USA. 102:14332–14337) that tends to erode genus boundaries without erasing the Prochlorococcus-specific ecological adaptations.
marine cyanobacteria; horizontal gene transfer; introgression; quartet decomposition; supertree; genome evolution
Analyses of the red algal Cyanidioschyzon genome identified 37 genes that were acquired from non-organellar sources prior to the split of red algae and green plants.
Horizontal gene transfer occurs frequently in prokaryotes and unicellular eukaryotes. Anciently acquired genes, if retained among descendants, might significantly affect the long-term evolution of the recipient lineage. However, no systematic studies on the scope of anciently acquired genes and their impact on macroevolution are currently available in eukaryotes.
Analyses of the genome of the red alga Cyanidioschyzon identified 37 genes that were acquired from non-organellar sources prior to the split of red algae and green plants. Ten of these genes are rarely found in cyanobacteria or have additional plastid-derived homologs in plants. These genes most likely provided new functions, often essential for plant growth and development, to the ancestral plant. Many remaining genes may represent replacements of endogenous homologs with a similar function. Furthermore, over 78% of the anciently acquired genes are related to the biogenesis and functionality of plastids, the defining character of plants.
Our data suggest that, although ancient horizontal gene transfer events did occur in eukaryotic evolution, the number of acquired genes does not predict the role of horizontal gene transfer in the adaptation of the recipient organism. Our data also show that multiple independently acquired genes are able to generate and optimize key evolutionary novelties in major eukaryotic groups. In light of these findings, we propose and discuss a general mechanism of horizontal gene transfer in the macroevolution of eukaryotes.