Oil reservoirs represent a nutrient-rich ecological niche of the deep biosphere. Although most oil reservoirs are occupied by microbial populations, when and how the microbes colonized these environments remains unanswered. To address this question, we compared 11 genomes of Thermotoga maritima-like hyperthermophilic bacteria from two environment types: subsurface oil reservoirs in the North Sea and Japan, and marine sites located in the Kuril Islands, Italy and the Azores. We complemented our genomes with Thermotoga DNA from publicly available subsurface metagenomes from North America and Australia. Our analysis revealed a complex, non-bifurcating evolutionary history of the isolates' genomes, suggesting high amounts of gene flow across all sampled locations, a conjecture supported by numerous recombination events. Genomes from the same type of environment tend to be more similar, and have exchanged more genes with each other than with geographically close isolates from different types of environments. Hence, Thermotoga populations of oil reservoirs do not appear isolated, a requirement of the ‘burial and isolation’ hypothesis, under which reservoir bacteria are descendants of the isolated communities buried with sediments that over time became oil reservoirs. Instead, our analysis supports a more complex view, where bacteria from subsurface and marine populations have been continuously migrating into the oil reservoirs and influencing their genetic composition. The Thermotoga spp. in the oil reservoirs in the North Sea and Japan probably entered the reservoirs shortly after they were formed. An Australian oil reservoir, on the other hand, was likely colonized very recently, perhaps during human reservoir development.
Gene transfer agents (GTAs) are phage-like particles that can package and transfer a random piece of the producing cell’s genome, but are unable to transfer all the genes required for their own production. As such, GTAs represent an evolutionary conundrum: are they selfish genetic elements propagating through an unknown mechanism, defective viruses, or viral structures “repurposed” by cells for gene exchange, as their name implies? In Rhodobacter capsulatus, production of the R. capsulatus GTA (RcGTA) particles is associated with a cluster of genes resembling a small prophage. Utilizing transcriptomic, genetic and biochemical approaches, we report that the RcGTA “genome” consists of at least 24 genes distributed across five distinct loci. We demonstrate that, of these additional loci, two are involved in cell recognition and binding and one in the production and maturation of RcGTA particles. The five RcGTA “genome” loci are widespread within Rhodobacterales, but not all loci have the same evolutionary histories. Specifically, two of the loci have been subject to frequent, probably virus-mediated, gene transfer events. We argue that it is unlikely that RcGTA is a selfish genetic element. Instead, our findings are compatible with the scenario that RcGTA is a virus-derived element maintained by the producing organism due to a selective advantage of within-population gene exchange. The modularity of the RcGTA “genome” is presumably a result of selection on the host organism to retain GTA functionality.
RcGTA; gene exchange; exaptation; prophage; virus; Rhodobacter.
Modern industrial agriculture depends on high-density cultivation of genetically similar crop plants, creating favorable conditions for the emergence of novel pathogens with increased fitness in managed compared with ecologically intact settings. Here, we present the genome sequence of six strains of the cucurbit bacterial wilt pathogen Erwinia tracheiphila (Enterobacteriaceae) isolated from infected squash plants in New York, Pennsylvania, Kentucky, and Michigan. These genomes exhibit a high proportion of recent horizontal gene acquisitions, invasion and remarkable amplification of mobile genetic elements, and pseudogenization of approximately 20% of the coding sequences. These genome attributes indicate that E. tracheiphila recently emerged as a host-restricted pathogen. Furthermore, chromosomal rearrangements associated with phage and transposable element proliferation contribute to substantial differences in gene content and genetic architecture between the six E. tracheiphila strains and other Erwinia species. Together, these data lead us to hypothesize that E. tracheiphila has undergone recent evolution through both genome decay (pseudogenization) and genome expansion (horizontal gene transfer and mobile element amplification). Despite evidence of dramatic genomic changes, the six strains are genetically monomorphic, suggesting a recent population bottleneck and emergence into E. tracheiphila’s current ecological niche.
Cucurbita; Cucumis; Erwinia; mobile DNA; transposase; insertion sequence; pseudogene; host specialization; vector; monomorphic; phage; pumpkin; squash; cucumber
Erwinia tracheiphila is one of the most economically important pathogens of cucumbers, melons, squashes, pumpkins, and gourds in the northeastern and midwestern United States, yet its molecular pathology remains uninvestigated. Here, we report the first draft genome sequence of an E. tracheiphila strain isolated from an infected wild gourd (Cucurbita pepo subsp. texana) plant. The genome assembly consists of 7 contigs and includes a putative plasmid and at least 20 phage and prophage elements.
A large fraction of any bacterial genome consists of hypothetical protein-coding open reading frames (ORFs). While most of these ORFs are present only in one or a few sequenced genomes, a few are conserved, often across large phylogenetic distances. Such conservation provides clues to likely uncharacterized cellular functions that need to be elucidated. Marine cyanobacteria from the Prochlorococcus/marine Synechococcus clade are dominant bacteria in oceanic waters and are significant contributors to global primary production. A Hyper Conserved Protein (PSHCP) of unknown function is 100% conserved at the amino acid level in genomes of Prochlorococcus/marine Synechococcus, but lacks homologs outside of this clade. In this study we investigated Prochlorococcus marinus strains MED4 and MIT 9313 and Synechococcus sp. strain WH 8102 for the transcription of the PSHCP gene using RT-Q-PCR, for the presence of the protein product through quantitative immunoblotting, and for the protein's binding partners in a pull-down assay. Significant transcription of the gene was detected in all strains. The PSHCP protein content varied between 8±1 fmol and 26±9 fmol per µg total protein, depending on the strain. The 50S ribosomal protein L2, the Photosystem I protein PsaD and the Ycf48-like protein were found associated with the PSHCP protein in all strains and not appreciably or at all in control experiments. We hypothesize that PSHCP is a protein associated with the ribosome, and is possibly involved in photosystem assembly.
Horizontal gene transfer is important in the evolution of bacterial and archaeal genomes. An interesting genetic exchange process is carried out by diverse phage-like gene transfer agents (GTAs) that are found in a wide range of prokaryotes. Although GTAs resemble phages, they lack the hallmark capabilities that define typical phages, and they package random pieces of the producing cell’s genome. In this Review, we discuss the defining characteristics of the GTAs that have been identified to date, along with potential functions for these agents and the possible evolutionary forces that act on the genes involved in their production.
Here we describe the genome of Mesotoga prima MesG1.Ag4.2, the first genome of a mesophilic Thermotogales bacterium. Mesotoga prima was isolated from a polychlorinated biphenyl (PCB)-dechlorinating enrichment culture from Baltimore Harbor sediments. Its 2.97 Mb genome is considerably larger than any previously sequenced Thermotogales genomes, which range between 1.86 and 2.30 Mb. This larger size is due to both higher numbers of protein-coding genes and larger intergenic regions. In particular, the M. prima genome contains more genes for proteins involved in regulatory functions, for instance those involved in regulation of transcription. Together with its closest relative, Kosmotoga olearia, it also encodes different types of proteins involved in environmental and cell–cell interactions as compared with other Thermotogales bacteria. Amino acid composition analysis of M. prima proteins implies that this lineage has inhabited low-temperature environments for a long time. A large fraction of the M. prima genome has been acquired by lateral gene transfer (LGT): a DarkHorse analysis suggests that 766 (32%) of predicted protein-coding genes have been involved in LGT after Mesotoga diverged from the other Thermotogales lineages. A notable example of a lineage-specific LGT event is a reductive dehalogenase gene—a key enzyme in dehalorespiration, indicating M. prima may have a more active role in PCB dechlorination than was previously assumed.
lateral gene transfer; thermotogales; mesophilic; temperature adaptation
Because prokaryotes frequently exchange genetic material, comparative genomics, and phylogenomic analyses in particular, routinely require extracting a majority or plurality phylogenetic signal from many gene families and identifying the gene families that significantly conflict with that signal. Decomposition of gene trees into embedded quartets (unrooted trees each with four taxa) is a convenient and statistically powerful technique to address this challenging problem. This approach was shown to be useful in several studies of completely sequenced microbial genomes.
We present here a web server that takes a collection of gene phylogenies, decomposes them into quartets, generates a Quartet Spectrum, and draws a split network. Users are also provided with various data download options for further analyses. Each gene phylogeny is to be represented by an assessment of phylogenetic information content, such as sets of trees reconstructed from bootstrap replicates or sampled from a posterior distribution. The Quartet Decomposition server is accessible at http://quartets.uga.edu.
The Quartet Decomposition server presented here provides a convenient means to perform Quartet Decomposition analyses and will empower users to find statistically supported phylogenetic conflicts.
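As an illustration of the decomposition step described above, the following minimal Python sketch extracts the embedded quartets of a single tree and tallies quartet topologies across a set of trees. It is not the server's code: the nested-tuple tree encoding, the function names, and the simple tally (a stand-in for a Quartet Spectrum) are all assumptions made for this example, and it omits the bootstrap or posterior support that the server expects for each gene family.

```python
from collections import Counter
from itertools import combinations


def leaf_set(node):
    """Return the set of leaf names below a nested-tuple tree node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def nontrivial_splits(tree):
    """Return all leaves and one side of every internal split of the tree."""
    all_leaves = leaf_set(tree)
    splits = set()

    def walk(node):
        if isinstance(node, str):
            return
        side = leaf_set(node)
        # Keep only splits with at least two taxa on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            splits.add(side)
        for child in node:
            walk(child)

    walk(tree)
    return all_leaves, splits


def embedded_quartets(tree):
    """Map every four-taxon subset to its induced topology, e.g. 'A,B|C,D'.

    The value is None when no split of the tree resolves the four taxa.
    """
    all_leaves, splits = nontrivial_splits(tree)
    quartets = {}
    for four in combinations(sorted(all_leaves), 4):
        topology = None
        for side in splits:
            inside = [taxon for taxon in four if taxon in side]
            if len(inside) == 2:
                outside = [taxon for taxon in four if taxon not in side]
                first, second = sorted([inside, outside])
                topology = ",".join(first) + "|" + ",".join(second)
                break
        quartets[four] = topology
    return quartets


def quartet_tally(trees):
    """Count, for each four-taxon subset, how often each topology occurs."""
    tally = {}
    for tree in trees:
        for four, topology in embedded_quartets(tree).items():
            if topology is not None:
                tally.setdefault(four, Counter())[topology] += 1
    return tally
```

For example, `embedded_quartets((("A", "B"), ("C", ("D", "E"))))` yields five quartets, one per choice of four taxa out of five. Four-taxon subsets whose tallies are split between competing topologies are candidates for phylogenetic conflict.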
Prochlorococcus is a genus of marine cyanobacteria characterized by small cell and genome size, an evolutionary trend toward low GC content, the possession of chlorophyll b, and the absence of phycobilisomes. Whereas many shared derived characters define Prochlorococcus as a clade, many genome-based analyses recover them as paraphyletic, with some low-light adapted Prochlorococcus spp. grouping with marine Synechococcus. Here, we use 18 Prochlorococcus and marine Synechococcus genomes to analyze gene flow within and between these taxa. We introduce embedded quartet scatter plots as a tool to screen for genes whose phylogeny agrees or conflicts with the plurality phylogenetic signal, with accepted taxonomy and naming, with GC content, and with the ecological adaptation to high and low light intensities. We find that most gene families support high-light adapted Prochlorococcus spp. as a monophyletic clade and low-light adapted Prochlorococcus sp. as a paraphyletic group. But we also detect 16 gene families that were transferred between high-light adapted and low-light adapted Prochlorococcus sp. and 495 gene families, including 19 ribosomal proteins, that do not cluster designated Prochlorococcus and Synechococcus strains in the expected manner. To explain the observed data, we propose that frequent gene transfer between marine Synechococcus spp. and low-light adapted Prochlorococcus spp. has created a “highway of gene sharing” (Beiko RG, Harlow TJ, Ragan MA. 2005. Highways of gene sharing in prokaryotes. Proc Natl Acad Sci USA. 102:14332–14337) that tends to erode genus boundaries without erasing the Prochlorococcus-specific ecological adaptations.
marine cyanobacteria; horizontal gene transfer; introgression; quartet decomposition; supertree; genome evolution
Lateral gene transfers (LGT) (also called horizontal gene transfers) have been a major force shaping the Thermosipho africanus TCF52B genome, whose sequence we describe here. Firmicutes emerge as the principal LGT partner. Twenty-six percent of phylogenetic trees suggest LGT with this group, while 13% of the open reading frames indicate LGT with Archaea.
The usual BLAST-based methods for assessing gene presence and absence lead to systematic overestimation of within-species gene gain by lateral transfer.
All cultivated isolates of the bacterial order Thermotogales are either thermophiles or hyperthermophiles, but Thermotogales 16S rRNA gene sequences have been detected in many mesophilic anaerobic and microaerophilic environments, particularly within communities involved in the remediation of pollutants. Here we provide metagenomic evidence for the existence of Thermotogales lineages, which we informally call “mesotoga,” that are adapted to growth at lower temperatures. Two fosmid clones containing mesotoga DNA, originating from a low-temperature enrichment culture that degrades a polychlorinated biphenyl congener, were sequenced. Phylogenetic analysis clearly puts this bacterial lineage within the Thermotogales order, with the rRNA gene trees and 21 of 58 open reading frames strongly supporting this relationship. An analysis of protein sequence composition showed that mesotoga proteins are adapted to function at lower temperatures than are their identifiable homologs from thermophilic and hyperthermophilic members of the order Thermotogales, supporting the notion that this bacterium lives and grows optimally at lower temperatures. The phylogenetic analysis suggests that the mesotoga lineage from which our fosmids derive has used both the acquisition of genes from its neighbors and the modification of existing thermophilic sequences to adapt to a mesophilic lifestyle.
Sequencing 16S rRNA genes (SSU) cloned from Aeromonas strains revealed that strains contained up to six copies differing by ≤1.5%. The SSU copies from Aeromonas veronii LMG13695 clustered with sequences from four Aeromonas species. These results demonstrate intragenomic heterogeneity of SSU and suggest caution when using SSU to identify aeromonads.
Dekapentagonal maps depict the phylogenetic relationships of five genomes in a visually appealing diagram and can be viewed as an alternative to a single evolutionary consensus tree. In particular, the generated maps focus attention on those gene families that significantly deviate from the consensus or plurality phylogeny. PentaPlot is a software tool that computes such dekapentagonal maps given an appropriate probability support matrix.
The visualization with dekapentagonal maps critically depends on the optimal layout of unrooted tree topologies representing different evolutionary relationships among five organisms along the vertices of the dekapentagon. This is a difficult optimization problem given the large number of possible layouts. At its core our tool utilizes a genetic algorithm with demes and a local search strategy to search for the optimal layout. The hybrid genetic algorithm performs satisfactorily even in those cases where the chosen genomes are so divergent that little phylogenetic information has survived in the individual gene families.
PentaPlot is being made publicly available as an open source project at .
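To give a sense of the size of the layout problem mentioned above, the short Python sketch below counts the unrooted, fully resolved tree topologies for five taxa and the number of ways to place them around the vertices of a polygon. Treating layouts that differ only by rotation or reflection as equivalent is an assumption of this example, not a documented property of PentaPlot, and the function names are hypothetical.

```python
from math import factorial


def unrooted_binary_topologies(n_taxa):
    """Number of unrooted, fully resolved topologies: (2n - 5)!! for n >= 3."""
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def circular_layouts(n_vertices):
    """Distinct circular arrangements, ignoring rotation and reflection."""
    return factorial(n_vertices - 1) // 2


topologies = unrooted_binary_topologies(5)   # 15, one per dekapentagon vertex
layouts = circular_layouts(topologies)       # 14! / 2 = 43,589,145,600
```

Under this counting there are tens of billions of candidate layouts, which is consistent with the use of a heuristic search such as a genetic algorithm rather than exhaustive enumeration.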
Reconstructing the early evolution of photosynthesis has been guided in part by the geological record, but the complexity and great antiquity of these early events require molecular genetic techniques as the primary tools of inference. Recent genome sequencing efforts have made whole genome data available from representatives of each of the five phyla of bacteria with photosynthetic members, allowing extensive phylogenetic comparisons of these organisms. Here, we have undertaken whole genome comparisons using maximum likelihood to compare 527 unique sets of orthologous genes from all five photosynthetic phyla. Substantiating recent whole genome analyses of other prokaryotes, our results indicate that horizontal gene transfer (HGT) has played a significant part in the evolution of these organisms, resulting in genomes with mosaic evolutionary histories. A small plurality phylogenetic signal was observed, which may be a core of remnant genes not subject to HGT, or may result from a propensity for gene exchange between two or more of the photosynthetic organisms compared.
The methods presented here summarize phylogenetic relationships of genomes in visually appealing and informative figures. Dekapentagonal maps depict phylogenetic information for orthologous genes present in five genomes, and provide a pre-screen for putatively horizontally transferred genes. If the majority of individual gene phylogenies are unresolved, bipartition histograms provide a means of uncovering and analyzing the plurality consensus. Analyses of genomes representing five photosynthetic bacterial phyla and of the prokaryotic contributions to the eukaryotic cell illustrate the utility of the methods.
Maximum likelihood and posterior probability mapping are useful visualization techniques that are used to ascertain the mosaic nature of prokaryotic genomes. However, posterior probabilities, especially when calculated for four-taxon cases, tend to overestimate the support for tree topologies. Furthermore, because of poor taxon sampling, four-taxon analyses are sensitive to the long-branch attraction artifact. Here we extend the probability mapping approach by improving taxon sampling of the analyzed datasets, and by using bootstrap support values, a more conservative tool to assess reliability.
Quartets of orthologous proteins were complemented with homologs from selected reference genomes. The mapping of bootstrap support values from these extended datasets gives results similar to the original maximum likelihood and posterior probability mapping. The more conservative nature of the plotted support values allows further analyses to focus on those protein families that strongly disagree with the majority or plurality of genes present in the analyzed genomes.
Posterior probability is a non-conservative measure for support, and posterior probability mapping only provides a quick estimation of phylogenetic information content of four genomes. This approach can be utilized as a pre-screen to select genes that might have been horizontally transferred. Better taxon sampling combined with subtree analyses prevents the inconsistencies associated with four-taxon analyses, but retains the power of visual representation. Nevertheless, a case-by-case inspection of individual multi-taxon phylogenies remains necessary to differentiate unrecognized paralogy and shared phylogenetic reconstruction artifacts from horizontal gene transfer events.
maximum likelihood mapping; long-branch attraction; horizontal gene transfer; taxon sampling; bootstrap support values mapping
Horizontal gene transfer (HGT) played an important role in shaping microbial genomes. In addition to genes under sporadic selection, HGT also affects housekeeping genes and those involved in information processing, even ribosomal RNA encoding genes. Here we describe tools that provide an assessment and graphic illustration of the mosaic nature of microbial genomes.
We adapted the Maximum Likelihood (ML) mapping to the analyses of all detected quartets of orthologous genes found in four genomes. We have automated the assembly and analyses of these quartets of orthologs given the selection of four genomes. We compared the ML-mapping approach to more rigorous Bayesian probability and Bootstrap mapping techniques. The latter two approaches appear to be more conservative than the ML-mapping approach, but qualitatively all three approaches give equivalent results. All three tools were tested on mitochondrial genomes, which presumably were inherited as a single linkage group.
In some instances of interphylum relationships we find nearly equal numbers of quartets strongly supporting the three possible topologies. In contrast, our analyses of genome quartets containing the cyanobacterium Synechocystis sp. indicate that a large part of the cyanobacterial genome is related to that of low GC Gram positives. Other groups that had been suggested as sister groups to the cyanobacteria contain many fewer genes that group with the Synechocystis orthologs. Interdomain comparisons of genome quartets containing the archaeon Halobacterium sp. revealed that Halobacterium sp. shares more genes with Bacteria that live in the same environment than with Bacteria that are more closely related based on rRNA phylogeny. Many of these genes encode proteins involved in substrate transport and metabolism and in information storage and processing. These analyses demonstrate that relationships among prokaryotes cannot be accurately depicted by or inferred from the tree-like evolution of a core of rarely transferred genes; rather, prokaryotic genomes are mosaics in which different parts have different evolutionary histories. Probability mapping is a valuable tool to explore the mosaic nature of genomes.
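The mapping idea used in this work can be sketched as follows: for one quartet of orthologs, the likelihoods of the three possible four-taxon topologies are normalized into weights, and the weights are plotted as a point inside a triangle whose corners stand for the three topologies. The Python below is a minimal illustration under stated assumptions: normalizing likelihoods as L_i / (L_1 + L_2 + L_3) follows the general likelihood-mapping approach and is assumed here rather than taken from the text above, and the function names and triangle coordinates are hypothetical choices for this example.

```python
import math


def topology_weights(log_likelihoods):
    """Turn three log-likelihoods into weights L_i / (L_1 + L_2 + L_3).

    The maximum is subtracted before exponentiating so that very
    negative log-likelihoods do not underflow to zero.
    """
    highest = max(log_likelihoods)
    relative = [math.exp(value - highest) for value in log_likelihoods]
    total = sum(relative)
    return [value / total for value in relative]


def triangle_point(weights):
    """Place the weights inside an equilateral triangle (barycentric map).

    Corners (assumed for this sketch): topology 1 at (0, 0),
    topology 2 at (1, 0), topology 3 at (0.5, sqrt(3) / 2).
    """
    _, w2, w3 = weights
    x = w2 + 0.5 * w3
    y = (math.sqrt(3) / 2) * w3
    return x, y
```

A quartet with three equal log-likelihoods lands in the center of the triangle (no resolved signal), whereas a quartet that strongly favors one topology lands near the corresponding corner. Plotting one point per quartet of orthologs gives the kind of genome-wide overview described above.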