Here we describe the genome of Mesotoga prima MesG1.Ag4.2, the first genome of a mesophilic Thermotogales bacterium. Mesotoga prima was isolated from a polychlorinated biphenyl (PCB)-dechlorinating enrichment culture from Baltimore Harbor sediments. Its 2.97 Mb genome is considerably larger than any previously sequenced Thermotogales genome; these range between 1.86 and 2.30 Mb. This larger size is due to both higher numbers of protein-coding genes and larger intergenic regions. In particular, the M. prima genome contains more genes for proteins involved in regulatory functions, for instance those involved in regulation of transcription. Together with its closest relative, Kosmotoga olearia, it also encodes different types of proteins involved in environmental and cell–cell interactions as compared with other Thermotogales bacteria. Amino acid composition analysis of M. prima proteins implies that this lineage has inhabited low-temperature environments for a long time. A large fraction of the M. prima genome has been acquired by lateral gene transfer (LGT): a DarkHorse analysis suggests that 766 (32%) of predicted protein-coding genes have been involved in LGT after Mesotoga diverged from the other Thermotogales lineages. A notable example of a lineage-specific LGT event is a gene encoding a reductive dehalogenase, a key enzyme in dehalorespiration, indicating that M. prima may have a more active role in PCB dechlorination than was previously assumed.
lateral gene transfer; thermotogales; mesophilic; temperature adaptation
Integron cassette arrays in a dozen cultivars of the most prevalent group of Vibrio isolates obtained from mucus expelled by a scleractinian coral (Pocillopora damicornis) colony living on the Great Barrier Reef were sequenced and compared. Although all cultivars showed >99% identity across recA, pyrH and rpoB genes, no two had more than 10% of their integron-associated gene cassettes in common, and some individuals shared cassettes exclusively with distantly-related members of the genus. Of cassettes shared within the population, a number appear to have been transferred between Vibrio isolates, as assessed by phylogenetic analysis. Prominent among the mucus Vibrio cassettes with potentially inferable functions are acetyltransferases, some with close similarity to known antibiotic-resistance determinants. A subset of these potential resistance cassettes were shared exclusively between the mucus Vibrio cultivars, Vibrio coral pathogens and human pathogens, thus illustrating a direct link between these microbial niches through exchange of integron-associated gene cassettes.
integrons; coral; Vibrio; gene cassettes; microbial defense
All cultivated Thermotogales are thermophiles or hyperthermophiles. However, optimized 16S rRNA primers successfully amplified Thermotogales sequences from temperate hydrocarbon-impacted sites, mesothermic oil reservoirs, and enrichment cultures incubated at <46°C. We conclude that distinct Thermotogales lineages commonly inhabit low-temperature environments but may be underreported, likely due to “universal” 16S rRNA gene primer bias.
Debates over the status of the tree of life (TOL) often proceed without agreement as to what it is supposed to be: a hierarchical classification scheme, a tracing of genomic and organismal history or a hypothesis about evolutionary processes and the patterns they can generate. I will argue that for Darwin it was a hypothesis, which lateral gene transfer in prokaryotes now shows to be false. I will propose a more general and relaxed evolutionary theory and point out why anti-evolutionists should take no comfort from disproof of the TOL hypothesis.
tree of life; lateral gene transfer; horizontal gene transfer; prokaryote genome evolution; phylogenetics
Prochlorococcus is a genus of marine cyanobacteria characterized by small cell and genome size, an evolutionary trend toward low GC content, the possession of chlorophyll b, and the absence of phycobilisomes. Whereas many shared derived characters define Prochlorococcus as a clade, many genome-based analyses recover them as paraphyletic, with some low-light adapted Prochlorococcus spp. grouping with marine Synechococcus. Here, we use 18 Prochlorococcus and marine Synechococcus genomes to analyze gene flow within and between these taxa. We introduce embedded quartet scatter plots as a tool to screen for genes whose phylogeny agrees or conflicts with the plurality phylogenetic signal, with accepted taxonomy and naming, with GC content, and with the ecological adaptation to high and low light intensities. We find that most gene families support high-light adapted Prochlorococcus spp. as a monophyletic clade and low-light adapted Prochlorococcus sp. as a paraphyletic group. But we also detect 16 gene families that were transferred between high-light adapted and low-light adapted Prochlorococcus sp. and 495 gene families, including 19 ribosomal proteins, that do not cluster designated Prochlorococcus and Synechococcus strains in the expected manner. To explain the observed data, we propose that frequent gene transfer between marine Synechococcus spp. and low-light adapted Prochlorococcus spp. has created a “highway of gene sharing” (Beiko RG, Harlow TJ, Ragan MA. 2005. Highways of gene sharing in prokaryotes. Proc Natl Acad Sci USA. 102:14332–14337) that tends to erode genus boundaries without erasing the Prochlorococcus-specific ecological adaptations.
marine cyanobacteria; horizontal gene transfer; introgression; quartet decomposition; supertree; genome evolution
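Quartet-based screens such as the one in the abstract above rest on a simple combinatorial fact: any four taxa admit exactly three unrooted tree topologies, so each gene family can be scored by which topology it supports for each quartet. The following is a minimal sketch of that enumeration step only, assuming 18 placeholder genome labels; the function names and labels are hypothetical, and the code does not reproduce the authors' embedded quartet scatter plots or their scoring.

```python
from itertools import combinations
from math import comb


def all_quartets(taxa):
    """Return every unordered set of four taxa as a sorted tuple."""
    return list(combinations(sorted(taxa), 4))


def quartet_topologies(quartet):
    """Return the three possible unrooted topologies for four taxa.

    Each topology is written as a pair of pairs: ((a, b), (c, d)) means
    a and b sit on one side of the internal branch, c and d on the other.
    """
    a, b, c, d = quartet
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


# Placeholder labels standing in for 18 genomes (hypothetical names).
taxa = [f"genome_{i:02d}" for i in range(1, 19)]
quartets = all_quartets(taxa)

# The number of quartets grows as n choose 4.
print(len(quartets), comb(len(taxa), 4))
for topology in quartet_topologies(quartets[0]):
    print(topology)
```

With 18 genomes there are 3,060 quartets, so a screen of this kind has to evaluate each gene family against a few thousand four-taxon comparisons at most.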
Lateral gene transfers (LGT) (also called horizontal gene transfers) have been a major force shaping the Thermosipho africanus TCF52B genome, whose sequence we describe here. Firmicutes emerge as the principal LGT partner. Twenty-six percent of phylogenetic trees suggest LGT with this group, while 13% of the open reading frames indicate LGT with Archaea.
Although integrons and their associated gene cassettes are present in ~10% of bacteria and can represent up to 3% of the genome in which they are found, very few have been properly identified and annotated in public databases. These genetic elements have been overlooked in comparison to other vectors that facilitate lateral gene transfer between microorganisms.
By automating the identification of integron integrase genes and of the non-coding cassette-associated attC recombination sites, we were able to assemble a database containing all publicly available sequence information regarding these genetic elements. Specialists manually curated the database and this information was used to improve the automated detection and annotation of integrons and their encoded gene cassettes. ACID (annotation of cassette and integron data) can be searched using a range of queries and the data can be downloaded in a number of formats. Users can readily annotate their own data and integrate it into ACID using the tools provided.
ACID is a community resource providing easy access to annotations of integrons and making tools available to detect them in novel sequence data. ACID also hosts a forum to prompt integron-related discussion, which can hopefully lead to a more universal definition of this genetic element.
Several characteristics of the 16S rRNA gene, such as its essential function, ubiquity, and evolutionary properties, have allowed it to become the most commonly used molecular marker in microbial ecology. However, one fact that has been overlooked is that multiple copies of this gene are often present in a given bacterium. These intragenomic copies can differ in sequence, leading to identification of multiple ribotypes for a single organism. To evaluate the impact of such intragenomic heterogeneity on the performance of the 16S rRNA gene as a molecular marker, we compared its phylogenetic and evolutionary characteristics to those of the single-copy gene rpoB. Full-length gene sequences and gene fragments commonly used for denaturing gradient gel electrophoresis were compared at various taxonomic levels. Heterogeneity found between intragenomic 16S rRNA gene copies was concentrated in specific regions of rRNA secondary structure. Such “heterogeneity hot spots” occurred within all gene fragments commonly used in molecular microbial ecology. This intragenomic heterogeneity influenced 16S rRNA gene tree topology, phylogenetic resolution, and operational taxonomic unit estimates at the species level or below. rpoB provided comparable phylogenetic resolution to that of the 16S rRNA gene at all taxonomic levels, except between closely related organisms (species and subspecies levels), for which it provided better resolution. This is particularly relevant in the context of a growing number of studies focusing on subspecies diversity, in which single-copy protein-encoding genes such as rpoB could complement the information provided by the 16S rRNA gene.
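The intragenomic heterogeneity discussed above is, at its simplest, the fraction of aligned positions at which two copies of the gene differ. Here is a minimal sketch of that calculation, assuming the copies have already been aligned to equal length; the function names and the toy sequences are hypothetical and are not data from the study.

```python
from itertools import combinations


def percent_divergence(seq_a, seq_b):
    """Percent of aligned positions that differ between two sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def max_intragenomic_divergence(copies):
    """Largest pairwise divergence among gene copies from one genome."""
    pairs = list(combinations(copies.values(), 2))
    if not pairs:
        return 0.0  # a single-copy gene has no intragenomic heterogeneity
    return max(percent_divergence(a, b) for a, b in pairs)


# Toy aligned copies from one hypothetical genome (not real 16S rRNA data).
toy_copies = {
    "copy_1": "ACGTACGTAC",
    "copy_2": "ACGTACGTAC",
    "copy_3": "ACGAACGTAC",
}
print(max_intragenomic_divergence(toy_copies))
```

A single-copy gene such as rpoB always returns zero here, which is why it avoids the multiple-ribotype problem that multi-copy genes can create.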
A leading hypothesis for the role of bacteria in inflammatory bowel diseases is that an imbalance in normal gut flora is a prerequisite for inflammation. Testing this hypothesis requires comparisons between the microbiota compositions of ulcerative colitis and Crohn's disease patients and those of healthy individuals. In this study, we obtained biopsy samples from patients with Crohn's disease and ulcerative colitis and from healthy controls. Bacterial DNA was extracted from the tissue samples, amplified using universal bacterial 16S rRNA gene primers, and cloned into a plasmid vector. Insert-containing colonies were picked for high-throughput sequencing, and sequence data were analyzed, yielding species-level phylogenetic data. The clone libraries yielded 3,305 sequenced clones, representing 151 operational taxonomical units. There was no significant difference between floras from inflamed and healthy tissues from within the same individual. Proteobacteria were significantly (P = 0.0007) increased in Crohn's disease patients, as were Bacteroidetes (P < 0.0001), while Clostridia were decreased in that group (P < 0.0001) in comparison with the healthy and ulcerative colitis groups, which displayed no significant differences. Thus, the bacterial flora composition of Crohn's patients appears to be significantly altered from that of healthy controls, unlike that of ulcerative colitis patients. Imbalance in flora in Crohn's disease is probably not sufficient to cause inflammation, since microbiotas from inflamed and noninflamed tissues were of similar compositions within the same individual.
The usual BLAST-based methods for assessing gene presence and absence lead to systematic overestimation of within-species gene gain by lateral transfer.
All cultivated isolates of the bacterial order Thermotogales are either thermophiles or hyperthermophiles, but Thermotogales 16S rRNA gene sequences have been detected in many mesophilic anaerobic and microaerophilic environments, particularly within communities involved in the remediation of pollutants. Here we provide metagenomic evidence for the existence of Thermotogales lineages, which we informally call “mesotoga,” that are adapted to growth at lower temperatures. Two fosmid clones containing mesotoga DNA, originating from a low-temperature enrichment culture that degrades a polychlorinated biphenyl congener, were sequenced. Phylogenetic analysis clearly puts this bacterial lineage within the Thermotogales order, with the rRNA gene trees and 21 of 58 open reading frames strongly supporting this relationship. An analysis of protein sequence composition showed that mesotoga proteins are adapted to function at lower temperatures than are their identifiable homologs from thermophilic and hyperthermophilic members of the order Thermotogales, supporting the notion that this bacterium lives and grows optimally at lower temperatures. The phylogenetic analysis suggests that the mesotoga lineage from which our fosmids derive has used both the acquisition of genes from its neighbors and the modification of existing thermophilic sequences to adapt to a mesophilic lifestyle.
Do we need to describe bacteria as species, and if so, can we?
Whether or not bacteria have species is a perennially vexatious question. Given what we now know about variation among bacterial genomes, we argue that there is no intrinsic reason why the processes driving diversification and adaptation must produce groups of individuals sufficiently coherent in their genetic and phenotypic properties to merit the designation 'species' - although sometimes they might.
Mature saturated brine (crystallizer) communities are largely dominated (>80% of cells) by the square halophilic archaeon "Haloquadratum walsbyi". The recent cultivation of the strain HBSQ001 and the sequencing of its genome allow comparison with the metagenome of this taxonomically simplified environment. Similar studies carried out in other extreme environments have revealed very little diversity in gene content among the cell lineages present.
The metagenome of the microbial community of a crystallizer pond has been analyzed by end sequencing a 2000-clone fosmid library and comparing the sequences obtained with the genome sequence of "Haloquadratum walsbyi". The genome of the sequenced strain was retrieved nearly complete within this environmental DNA library. However, many ORFs that could be ascribed to the "Haloquadratum" metapopulation by common genome characteristics or scaffolding to the strain genome were not present in the specific sequenced isolate. In particular, three regions of the sequenced genome were associated with multiple rearrangements and the presence of different genes from the metapopulation. Many transposition and phage-related genes were found within this pool, which, together with the associated atypical GC content in these areas, supports lateral gene transfer mediated by these elements as the most probable genetic cause of this variability. Additionally, these sequences were highly enriched in putative regulatory and signal transduction functions.
These results point to a large pan-genome (total gene repertoire of the genus/species) even in this highly specialized extremophile and at a single geographic location. The extensive gene repertoire is what might be expected of a population that exploits a diverse nutrient pool, resulting from the degradation of biomass produced at lower salinities.
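Regions of atypical GC content, like those cited above as evidence for lateral gene transfer, are commonly located by comparing GC content in sliding windows against the genome-wide value. The sketch below illustrates that general idea under stated assumptions: the window size, step, and deviation threshold are arbitrary illustrative values, the function names are hypothetical, and this is not the procedure used in the study.

```python
def gc_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq, window, step):
    """GC fraction for each full-length window, as (start, gc) pairs."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def atypical_windows(seq, window, step, threshold):
    """Windows whose GC fraction deviates from the whole-sequence value
    by more than `threshold` (an arbitrary cutoff chosen by the user)."""
    overall = gc_fraction(seq)
    return [
        (start, gc)
        for start, gc in windowed_gc(seq, window, step)
        if abs(gc - overall) > threshold
    ]


# Toy sequence: an AT-rich background with a GC-rich insert in the middle.
toy = "AT" * 10 + "GC" * 5 + "AT" * 10
print(atypical_windows(toy, window=10, step=10, threshold=0.5))
```

On real genomes, windows of several kilobases and a threshold derived from the distribution of window values would be more appropriate than these toy settings.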
Integrons are genetic elements capable of the acquisition, rearrangement and expression of genes contained in gene cassettes. Gene cassettes generally consist of a promoterless gene associated with a recombination site known as a 59-base element (59-be). Multiple insertion events can lead to the assembly of large integron-associated cassette arrays. The most striking examples are found in Vibrio, where such cassette arrays are widespread and can range from 30 kb to 150 kb. Besides those found in completely sequenced genomes, no such array has yet been recovered in its entirety. We describe an approach to systematically isolate, sequence and annotate large integron gene cassette arrays from bacterial strains.
The complete Vibrio sp. DAT722 integron cassette array was determined through the streamlined approach described here. To place it in an evolutionary context, we compared the DAT722 array to known vibrio arrays and performed phylogenetic analyses for all of its components (integrase, 59-be sites, gene cassette encoded genes). It differs extensively in terms of genomic context as well as gene cassette content and organization. The phylogenetic tree of the 59-be sites collectively found in the Vibrio gene cassette pool suggests frequent transfer of cassettes within and between Vibrio species, with slower transfer rates between more phylogenetically distant relatives. We also identify multiple cases where non-integron chromosomal genes seem to have been assembled into gene cassettes and others where cassettes have been inserted into chromosomal locations outside integrons.
Our systematic approach greatly facilitates the isolation and annotation of large integron gene cassette arrays. Comparative analysis of the Vibrio sp. DAT722 integron obtained through this approach to those found in other vibrios confirms the role of this genetic element in promoting lateral gene transfer and suggests a high rate of gene gain/loss relative to most other loci on vibrio chromosomes. We identify a relationship between the phylogenetic distance separating two species and the rate at which they exchange gene cassettes, interactions between the non-mobile portion of bacterial genomes and the vibrio gene cassette pool, as well as intragenomic translocation events of integrons in vibrios.
Golgi bodies are nearly ubiquitous in eukaryotic cells. The apparent lack of such structures in certain eukaryotic lineages might be taken to mean that these protists evolved prior to the acquisition of the Golgi, and it raises questions of how these organisms function in the absence of this crucial organelle. Here, we report gene sequences from five proposed 'Golgi-lacking' organisms (Giardia intestinalis, Spironucleus barkhanus, Entamoeba histolytica, Naegleria gruberi and Mastigamoeba balamuthi). BLAST and phylogenetic analyses show these genes to be homologous to those encoding components of the retromer, coatomer and adaptin complexes, all of which have Golgi-related functions in mammals and yeast. This is, to our knowledge, the first molecular evidence for Golgi bodies in two major eukaryotic lineages (the pelobionts and heteroloboseids). This substantiates the suggestion that there are no extant primitively 'Golgi-lacking' lineages, and that this apparatus was present in the last common eukaryotic ancestor, but has been altered beyond recognition several times.
There are many ways to group completed genome sequences in hierarchical patterns (trees) reflecting relationships between their genes. Such groupings help us organize biological information and bear crucially on underlying processes of genome and organismal evolution. Genome trees make use of all comparable genes but can variously weight the contributions of these genes according to similarity, congruent patterns of similarity, or prevalence among genomes. Here we explore such possible weighting strategies, in an analysis of 142 prokaryotic and 5 eukaryotic genomes. We demonstrate that alternate weighting strategies have different advantages, and we propose that each may have its specific uses in systematic or evolutionary biology. Comparisons of results obtained with different methods can provide further clues to major events and processes in genome evolution.
More than one copy of rRNA operons, which code for both the small-subunit (SSU) and large-subunit (LSU) rRNA, are often found in prokaryotes. It is generally assumed that all rRNA operons within a single cell are almost identical. A notable exception is the extremely halophilic archaeal genus Haloarcula, most species of which are known to harbor highly divergent rRNA operons that differ at ∼5% of the nucleotide positions in the SSU gene and at 1 to 2% of the nucleotide positions in the LSU gene. We report that such intragenomic heterogeneity is not unique to Haloarcula, as high levels of intragenomic sequence variation have been observed for the SSU genes of two other genera of extreme halophiles, Halosimplex and Natrinema. To investigate this in detail, the two rRNA operons of Halosimplex carlsbadense and the four operons of Natrinema sp. strain XA3-1 were cloned and completely sequenced. The SSU and LSU genes of H. carlsbadense show the highest levels of intragenomic heterogeneity observed so far in archaea (6.7 and 2.6%). The operons of Natrinema sp. strain XA3-1 have additional unusual characteristics, such as identical internal transcribed spacers, while one of four SSU genes is 5% divergent and all LSU genes differ from each other by 0.9 to 1.9%. The heterogeneity among the Natrinema sp. strain XA3-1 LSU genes is localized in hot spots, and one of these regions is shown to be the result of a recombination event with a distantly related halophile. This is the first example of interspecies recombination between rRNA genes in archaea, and the recombination occurred over one of the largest phylogenetic distances ever reported for such an event. We suggest that intragenomic heterogeneity of rRNA operons is an ancient and stable trait in several lineages of the Halobacteriales. The impact of this phenomenon on the taxonomy of extremely halophilic archaea is discussed.
Comparisons between genomes of closely related bacteria often show large variations in gene content, even between strains of the same species. Such studies have focused mainly on pathogens; here, we examined Thermotoga maritima, a free-living hyperthermophilic bacterium, by using suppressive subtractive hybridization. The genome sequence of T. maritima MSB8 is available, and DNA from this strain served as a reference to obtain strain-specific sequences from Thermotoga sp. strain RQ2, a very close relative (∼96% identity for orthologous protein-coding genes, 99.7% identity in the small-subunit rRNA sequence). Four hundred twenty-six RQ2 subtractive clones were sequenced. One hundred sixty-six had no DNA match in the MSB8 genome. These differential clones comprise, in sum, 48 kb of RQ2-specific DNA and match 72 genes in the GenBank database. From the number of identical clones, we estimated that RQ2 contains 350 to 400 genes not found in MSB8. Assuming a similar genome size, this corresponds to 20% of the RQ2 genome. A large proportion of the RQ2-specific genes were predicted to be involved in sugar transport and polysaccharide degradation, suggesting that polysaccharides are more important as nutrients for this strain than for MSB8. Several clones encode proteins involved in the production of surface polysaccharides. RQ2 encodes multiple subunits of a V-type ATPase, while MSB8 possesses only an F-type ATPase. Moreover, an RQ2-specific MutS homolog was found among the subtractive clones and appears to belong to a third novel archaeal type MutS lineage. Southern blot analyses showed that some of the RQ2 differential sequences are found in some other members of the order Thermotogales, but the distribution of these variable genes is patchy, suggesting frequent lateral gene transfer within the group.
Class 1 release factor in eukaryotes (eRF1) recognizes stop codons and promotes peptide release from the ribosome. The ‘molecular mimicry’ hypothesis suggests that domain 1 of eRF1 is analogous to the tRNA anticodon stem–loop. Recent studies strongly support this hypothesis and several models for specific interactions between stop codons and residues in domain 1 have been proposed. In this study we have sequenced and identified novel eRF1 sequences across a wide diversity of eukaryotes and re-evaluated the codon-binding site by bioinformatic analyses of a large eRF1 dataset. Analyses of the eRF1 structure combined with estimates of evolutionary rates at amino acid sites allow us to define the residues that are under structural (i.e. those involved in intramolecular interactions) versus non-structural selective constraints. Furthermore, we have re-assessed convergent substitutions in the ciliate variant code eRF1s using maximum likelihood-based phylogenetic approaches. Our results favor the model proposed by Bertram et al. that stop codons bind to three ‘cavities’ on the protein surface, although we suggest that the stop codon may bind in the opposite orientation to the original model. We assess the feasibility of this alternative binding orientation with a triplet stop codon and the eRF1 domain 1 structures using molecular modeling techniques.
In eukaryotes with the universal genetic code a single class I release factor (eRF1) most probably recognizes all stop codons (UAA, UAG and UGA) and is essential for termination of nascent peptide synthesis. It is well established that stop codons have been reassigned to amino acid codons at least three times among ciliates. The codon specificities of ciliate eRF1s must have been modified to accommodate the variant codes. In this study we have amplified, cloned and sequenced eRF1 genes of two hypotrichous ciliates, Oxytricha trifallax (UAA and UAG for Gln) and Euplotes aediculatus (UGA for Cys). We also sequenced/identified three protist and two archaeal class I RF genes to enlarge the database of eRF1/aRF1s with the universal code. Extensive comparisons between universal code eRF1s and those of Oxytricha, Euplotes and Tetrahymena, which represent three lineages that acquired variant codes independently, provide important clues to identify stop codon-binding regions in eRF1. Domain 1 in the five ciliate eRF1s, particularly the TASNIKS heptapeptide and its adjacent region, differs significantly from domain 1 in universal code eRF1s. This observation suggests that domain 1 contains the codon recognition site, but that the mechanism of eRF1 codon recognition may be more complex than proposed by Nakamura et al. or Knight and Landweber.
We describe a mutant (strain 704) of the obligate photoautotroph Anacystis nidulans which behaves like the wild type under continuous illumination but which in the dark rapidly loses viability, respires little, and incorporates label into ribonucleic acid and protein at rates considerably less than observed with the darkened wild type. Extracts of this mutant strain show no detectable 6-phosphogluconate dehydrogenase (EC 1.1.1.44) activity. Spontaneous revertants of mutant 704 were selected as survivors of prolonged incubation in darkness. Of 10 such strains examined, none had regained 6-phosphogluconate dehydrogenase activity, and all had lost detectable glucose-6-phosphate dehydrogenase (EC 1.1.1.49) activity. Although dark survival of these revertants paralleled that of the wild type, rates of dark endogenous respiration and incorporation of labeled precursors into ribonucleic acid were still very low, comparable to those observed with strain 704. These results are consistent with the following hypotheses concerning dark endogenous metabolism in unicellular blue-green bacteria. (i) Although the oxidative pentose phosphate cycle (hexose monophosphate shunt) may play a major role in endogenous metabolism in A. nidulans, as proposed by others, it is not the only pathway capable of providing energy for maintenance of viability in darkness. (ii) Much of the endogenous metabolic activity (respiration and macromolecular synthesis) observed in darkened cultures of wild-type A. nidulans is not required for survival alone, and must therefore serve other functions.
In the dark, the obligately photoautotrophic blue-green alga Anacystis nidulans accumulates large relative amounts of two novel stable ribonucleic acid species (RNAs). These species are also made in illuminated cells but are unstable in them. When darkened cells are reilluminated, these RNAs are rapidly degraded; degradation is inhibited by chloramphenicol. Upon denaturation with heat or urea, one novel species (0.33 × 10^6 daltons) dissociates into two fragments that comigrate with the second novel species (0.16 × 10^6 daltons) on polyacrylamide gels. Both RNAs are associated with particles sedimenting between 30S and 50S through sucrose gradients and are removed from these particles at low magnesium concentration. The function(s) of these RNAs remains unknown.
The maturation of 5S ribosomal ribonucleic acid (rRNA) in the obligately photoautotrophic unicellular blue-green alga Anacystis nidulans has been studied by using polyacrylamide gel electrophoresis and T1 ribonuclease oligonucleotide analysis. A. nidulans mature 5S rRNA (m5) is of approximately the same molecular weight as the 5S rRNA of Escherichia coli, and is derived by cleavage of a precursor (p5) containing a few (three to six) additional nucleotides. Some of these additional nucleotides occur at the 5′ end of the precursor molecule; others may occur at the 3′ end. Kinetic experiments indicate that precursors of mature 5S rRNA larger than p5 either do not exist or are very transient in A. nidulans. These results are discussed in relation to those obtained with other prokaryotes.
Data are presented consistent with the notion that the 23S ribosomal ribonucleic acid (rRNA) of Anacystis nidulans undergoes specific endonucleolytic cleavage in vivo, to produce two fragments with molecular weights of 0.88 × 10^6 and 0.17 × 10^6 daltons. Cleavage occurred at random after 23S rRNA formation and was stimulated by light in this organism, an obligately photoautotrophic unicellular blue-green alga. The half-life of intact 23S rRNA was about 5 h in illuminated cultures and 10 h in unilluminated cultures. 3-(p-Chlorophenyl)-1,1-dimethylurea, an inhibitor of photosystem II, retarded 23S rRNA cleavage in the light. The results are discussed in the context of recent reports of rRNA instability in a variety of eukaryotic and prokaryotic organisms.
Methods are described for preparation of pulse-labeled ribonucleic acid (RNA) from the blue-green alga Anacystis nidulans. Synthesis of labeled RNA was found to be in part dependent on concurrent photosynthesis and was inhibited by the antibiotic streptolydigin. Mature 23S ribosomal RNA (rRNA) appeared before mature 16S rRNA. Formation of either molecule was inhibited by chloramphenicol, and RNA species of lesser mobility accumulated. These species may be precursors of the mature forms. Maturation of 16S rRNA was also inhibited by streptolydigin. (The effect of this antibiotic on 23S rRNA maturation was not examined.) In many respects, ribosomal RNA synthesis and maturation in this blue-green alga appear to follow the pattern already established for bacteria.