Nonulosonic acids (NulOs) encompass a large group of structurally diverse nine-carbon backbone α-keto sugars widely distributed among the three domains of life. Mammals express a specialized version of NulOs called sialic acids, which are displayed in prominent terminal positions of cell surface and secreted glycoconjugates. Within bacteria, the ability to synthesize NulOs has been demonstrated in a number of human pathogens and is phylogenetically widespread. Here we examine the distribution, diversity, evolution, and function of NulO biosynthesis pathways in members of the family Vibrionaceae. Among 27 species of Vibrionaceae examined at the genomic level, 12 species contained nab gene clusters. We document examples of duplication, divergence, horizontal transfer, and recombination of nab gene clusters in different Vibrionaceae lineages. Biochemical analyses, including mass spectrometry, confirmed that many species do, in fact, produce di-N-acetylated NulOs. A library of clinical and environmental isolates of Vibrio vulnificus served as a model for further investigation of nab allele genotypes and levels of NulO expression. The data show that lineage I isolates produce about 20-fold higher levels of NulOs than lineage II isolates. Moreover, nab gene alleles found in a subset of V. vulnificus clinical isolates express 40-fold higher levels of NulOs than nab alleles associated with environmental isolates. Taken together, the data implicate the family Vibrionaceae as a “hot spot” of NulO evolution and suggest that these molecules may have diverse roles in environmental persistence and/or animal virulence.
The steadily increasing number of prokaryotic genomes has accelerated the study of genome evolution; in particular, the availability of sets of genomes from closely related bacteria has facilitated the exploration of the mechanisms underlying genome plasticity. The family Vibrionaceae is found in the Gammaproteobacteria and is abundant in aquatic environments. Taxa from the family Vibrionaceae are diversified in their life styles; some species are free living, others are symbiotic, and others are human pathogens. This diversity makes this family a useful set of model organisms for studying bacterial evolution. This evolution is driven by several forces, among them gene duplication and lateral gene transfer, which are believed to provide raw material for functional redundancy and novelty. The resultant gene copy increase in one genome is then detected as lineage-specific expansion (LSE).
Here we present the results of a detailed comparison of the genomes of eleven Vibrionaceae strains that have distinct life styles and distinct phenotypes. The core genome shared by all eleven strains is composed of 1,882 genes, which make up about 31%–50% of the genome repertoire. We further investigated the distribution and features of genes that have been specifically expanded in one unique lineage of the eleven strains. Abundant duplicate genes have been identified in the eleven Vibrionaceae strains, with 1–11% of the whole genomes composed of lineage-specific radiations. These LSEs occurred in two distinct patterns: the first type yields one or more copies of a single gene; we call this a single gene expansion. The second pattern has a high evolutionary impact, as the expansion involves two or more gene copies in a block, with the duplicated block located next to the original block (a contiguous block expansion) or at some distance from the original block (a discontiguous block expansion). We showed that LSEs involve genes that are tied to defense and pathogenesis mechanisms as well as to the fundamental life cycle of Vibrionaceae species.
Our results provide evidence of genome plasticity and rapid evolution within the family Vibrionaceae. The comparisons point to sources of genomic variation and candidates for lineage-specific adaptations of each Vibrionaceae pathogen or nonpathogen strain. Such lineage specific expansions could reveal components in bacterial systems that, by their enhanced genetic variability, can be tied to responses to environmental challenges, interesting phenotypes, or adaptive pathogenic responses to host challenges.
All members of the Vibrionaceae harbor LuxO, a response regulator that integrates outputs from various signaling systems, ultimately controlling specific traits that are crucial to the distinct biology of each species. LuxO is phosphorylated in response to low cell density, activating the transcription of a family of small RNAs called Qrrs, which in turn, control the levels of a global regulatory protein conserved within the Vibrionaceae. Although the function of each Qrr is similar, the number of qrr genes varies among the different species. Using a bioinformatics approach, we have determined the number of qrr genes in fully-sequenced Vibrionaceae members. Phylogenetic analysis suggests the most recent common ancestor of all Vibrionaceae shared a single, ancestral qrr gene, which duplicated and diverged into multiple qrr genes in some present-day vibrio lineages. To demonstrate that a single qrr gene is sufficient to mediate repression of LitR, the global regulator in Vibrio fischeri, we have performed a series of genetic and phenotypic analyses of the LuxO pathway and its output. Our studies contribute to a better understanding of the ancestral state of these pathways in vibrios, as well as to the evolution and divergence of other sRNAs within different bacterial lineages.
On a global research expedition, over 500 bacterial strains inhibitory towards pathogenic bacteria were isolated. Three hundred of the antibacterial strains were assigned to the Vibrionaceae family. The purpose of the present study was to investigate the phylogeny and bioactivity of five Vibrionaceae strains with pronounced antibacterial activity. These were identified as Vibrio coralliilyticus (two strains), V. neptunius (two strains), and Photobacterium halotolerans (one strain) on the basis of housekeeping gene sequences. The two related V. coralliilyticus and V. neptunius strains were isolated from distant oceanic regions. Chemotyping by LC-UV/MS underlined genetic relationships by showing highly similar metabolite profiles for each of the two V. coralliilyticus and V. neptunius strains, respectively, but a unique profile for P. halotolerans. Bioassay-guided fractionation identified two known antibiotics as being responsible for the antibacterial activity: andrimid (from V. coralliilyticus) and holomycin (from P. halotolerans). Despite the isolation of already known antibiotics, our findings show that marine Vibrionaceae are a resource of antibacterial compounds and may have potential for future natural product discovery.
Vibrio coralliilyticus; Vibrio neptunius; Photobacterium halotolerans; chemotyping; andrimid; holomycin
Vibrionaceae are regarded as important marine chitin degraders, and attachment to chitin regulates important biological functions; yet, the degree of chitin pathway conservation in Vibrionaceae is unknown. Here, a core chitin degradation pathway is proposed based on comparison of 19 Vibrio and Photobacterium genomes with a detailed metabolic map assembled for V. cholerae from published biochemical, genomic, and transcriptomic results. Further, to assess whether chitin degradation is a conserved property of Vibrionaceae, a set of 54 strains from 32 taxa were tested for the ability to grow on various forms of chitin. All strains grew on N-acetylglucosamine (GlcNAc), the monomer of chitin. The majority of isolates grew on α (crab shell) and β (squid pen) chitin and contained chitinase A (chiA) genes. chiA sequencing and phylogenetic analysis suggest that this gene is a good indicator of chitin metabolism but appears subject to horizontal gene transfer and duplication. Overall, chitin metabolism appears to be a core function of Vibrionaceae, but individual pathway components exhibit dynamic evolutionary histories.
The SOS response is a well-known regulatory network present in most bacteria and aimed at addressing DNA damage. It has also been linked extensively to stress-induced mutagenesis, virulence and the emergence and dissemination of antibiotic resistance determinants. Recently, the SOS response has been shown to regulate the activity of integrases in the chromosomal superintegrons of the Vibrionaceae, which encompasses a wide range of pathogenic species harboring multiple chromosomes. Here we combine in silico and in vitro techniques to perform a comparative genomics analysis of the SOS regulon in the Vibrionaceae, and we extend the methodology to map this transcriptional network in other bacterial species harboring multiple chromosomes.
Our analysis provides the first comprehensive description of the SOS response in a family (Vibrionaceae) that includes major human pathogens. It also identifies several previously unreported members of the SOS transcriptional network, including two proteins of unknown function. The analysis of the SOS response in other bacterial species with multiple chromosomes uncovers additional regulon members and reveals that there is a conserved core of SOS genes, and that specialized additions to this basic network take place in different phylogenetic groups. Our results also indicate that across all groups the main elements of the SOS response are always found in the large chromosome, whereas specialized additions are found in the smaller chromosomes and plasmids.
Our findings confirm that the SOS response of the Vibrionaceae is strongly linked with pathogenicity and dissemination of antibiotic resistance, and suggest that the characterization of the newly identified members of this regulon could provide key insights into the pathogenesis of Vibrio. The persistent location of key SOS genes in the large chromosome across several bacterial groups confirms that the SOS response plays an essential role in these organisms and sheds light into the mechanisms of evolution of global transcriptional networks involved in adaptability and rapid response to environmental changes, suggesting that small chromosomes may act as evolutionary test beds for the rewiring of transcriptional networks.
Luminescent bacteria in the family Vibrionaceae (Bacteria: γ-Proteobacteria) are commonly found in complex, bilobed light organs of sepiolid and loliginid squids. Although morphology of these organs in both families of squid is similar, the species of bacteria that inhabit each host have yet to be verified. We utilized sequences of 16S ribosomal RNA, luciferase α-subunit (luxA) and the glyceraldehyde-3-phosphate dehydrogenase (gapA) genes to determine phylogenetic relationships between 63 strains of Vibrio bacteria, which included representatives from different environments as well as unidentified luminescent isolates from loliginid and sepiolid squid from Thailand. A combined phylogenetic analysis was used including biochemical data such as carbon use, growth and luminescence. Results demonstrated that certain symbiotic Thai isolates found in the same geographic area were included in a clade containing bacterial species phenotypically suitable to colonize light organs. Moreover, multiple strains isolated from a single squid host were identified as more than one bacterial species in our phylogeny. This research presents evidence of species of luminescent bacteria that have not been previously described as symbiotic strains colonizing light organs of Indo-West Pacific loliginid and sepiolid squids, and supports the hypothesis of a non-species-specific association between certain sepiolid and loliginid squids and marine luminescent bacteria.
We analyzed the usefulness of rpoA, recA, and pyrH gene sequences for the identification of vibrios. We sequenced fragments of these loci from a collection of 208 representative strains, including 192 well-documented Vibrionaceae strains and 16 presumptive Vibrio isolates associated with coral bleaching. In order to determine the intraspecies variation among the three loci, we included several representative strains per species. The phylogenetic trees constructed with the different genetic loci were roughly in agreement with former polyphasic taxonomic studies, including the 16S rRNA-based phylogeny of vibrios. The families Vibrionaceae, Photobacteriaceae, Enterovibrionaceae, and Salinivibrionaceae were all differentiated on the basis of each genetic locus. Each species clearly formed separated clusters with at least 98, 94, and 94% rpoA, recA, and pyrH gene sequence similarity, respectively. The genus Vibrio was heterogeneous and polyphyletic, with Vibrio fischeri, V. logei, and V. wodanis grouping closer to the Photobacterium genus. V. halioticoli-, V. harveyi-, V. splendidus-, and V. tubiashii-related species formed groups within the genus Vibrio. Overall, the three genetic loci were more discriminatory among species than were 16S rRNA sequences. In some cases, e.g., within the V. splendidus and V. tubiashii group, rpoA gene sequences were slightly less discriminatory than recA and pyrH sequences. In these cases, the combination of several loci will yield the most robust identification. We can conclude that strains of the same species will have at least 98, 94, and 94% rpoA, recA, and pyrH gene sequence similarity, respectively.
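The per-locus similarity floors reported above (98% for rpoA, 94% for recA, 94% for pyrH) can be illustrated with a minimal sketch. It assumes the two sequences for each locus have already been aligned to equal length; the function names and the gap-handling rule are illustrative assumptions, not the procedure used in the study. Because the abstract states only that strains of the same species reach these values, passing the check is consistent with conspecificity rather than proof of it.

```python
# Minimum same-species similarity (%) per locus, taken from the abstract above.
THRESHOLDS = {"rpoA": 98.0, "recA": 94.0, "pyrH": 94.0}


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption for this sketch: columns with a gap ('-') in either
    sequence are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def consistent_with_same_species(similarities: dict) -> bool:
    """True if every locus meets its floor; all three loci are required."""
    return all(similarities[locus] >= cutoff
               for locus, cutoff in THRESHOLDS.items())


if __name__ == "__main__":
    toy = {"rpoA": 99.1, "recA": 95.0, "pyrH": 94.0}  # hypothetical values
    print(consistent_with_same_species(toy))
```

Requiring all three loci mirrors the abstract's remark that combining several loci gives the most robust identification.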
Horizontal gene transfer (HGT) is thought to occur frequently in bacteria in nature and to play an important role in bacterial evolution, contributing to the formation of new species. To gain insight into the frequency of HGT in Vibrionaceae and its possible impact on speciation, we assessed the incidence of interspecies transfer of the lux genes (luxCDABEG), which encode proteins involved in luminescence, a distinctive phenotype. Three hundred three luminous strains, most of which were recently isolated from nature and which represent 11 Aliivibrio, Photobacterium, and Vibrio species, were screened for incongruence of phylogenies based on a representative housekeeping gene (gyrB or pyrH) and a representative lux gene (luxA). Strains exhibiting incongruence were then subjected to detailed phylogenetic analysis of horizontal transfer by using multiple housekeeping genes (gyrB, recA, and pyrH) and multiple lux genes (luxCDABEG). In nearly all cases, housekeeping gene and lux gene phylogenies were congruent, and there was no instance in which the lux genes of one luminous species had replaced the lux genes of another luminous species. Therefore, the lux genes are predominantly vertically inherited in Vibrionaceae. The few exceptions to this pattern of congruence were as follows: (i) the lux genes of the only known luminous strain of Vibrio vulnificus, VVL1 (ATCC 43382), were evolutionarily closely related to the lux genes of Vibrio harveyi; (ii) the lux genes of two luminous strains of Vibrio chagasii, 21N-12 and SB-52, were closely related to those of V. harveyi and Vibrio splendidus, respectively; (iii) the lux genes of a luminous strain of Photobacterium damselae, BT-6, were closely related to the lux genes of the lux-rib2 operon of Photobacterium leiognathi; and (iv) a strain of the luminous bacterium Photobacterium mandapamensis was found to be merodiploid for the lux genes, and the second set of lux genes was closely related to the lux genes of the lux-rib2 operon of P. leiognathi. In none of these cases of apparent HGT, however, did acquisition of the lux genes correlate with phylogenetic divergence of the recipient strain from other members of its species. The results indicate that horizontal transfer of the lux genes in nature is rare and that horizontal acquisition of the lux genes apparently has not contributed to speciation in recipient taxa.
We have examined the presence of methylated adenine at GATC sequences (Dam phenotype) in the DNA of 23 eubacteria and 13 archaebacteria by using isoschizomer restriction enzymes. We have found a completely Dam+ phenotype in bacteria of nine genera related to the families Enterobacteriaceae, Parvobacteriaceae, and Vibrionaceae, and in the five cyanobacteria tested. We have found a partial Dam+ phenotype in the two archaebacteria Halobacterium saccharovorum and Methanobacterium sp. strain Ivanov. All of the other archaebacteria (three genera) and eubacteria (nine genera) tested were Dam-. Phylogenetic analysis, based on the evolutionary tree of Fox et al. (Science 209:457-463, 1980), indicates that dam methylation in the Escherichia coli lineage appeared recently in bacterial evolution and is restricted to a small range of closely related bacteria.
The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data is becoming available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, that share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups.
The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common, especially for large groups of isolates. The subsequent annotation of unique core genes that are present in genophyletic groups indicates a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin.
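One way to picture the unique core gene idea is as a set comparison over a cluster presence table. The sketch below assumes a unique core cluster is one found in every isolate of a group and in no isolate outside it; that reading, the data layout, and the function names are illustrative assumptions and not the authors' pipeline or the orthoMCL output format.

```python
from collections import defaultdict


def unique_core_clusters(presence: dict, group: set) -> list:
    """Clusters present in every isolate of `group` and in no other isolate.

    `presence` maps a cluster id to the set of isolates that carry a member
    of that cluster (a hypothetical layout for this sketch).
    """
    target = frozenset(group)
    return sorted(cid for cid, isolates in presence.items()
                  if frozenset(isolates) == target)


def unique_core_counts(presence: dict) -> dict:
    """Number of unique core clusters for each distinct group of isolates."""
    counts = defaultdict(int)
    for isolates in presence.values():
        counts[frozenset(isolates)] += 1
    return dict(counts)


if __name__ == "__main__":
    # Toy presence table with made-up isolate and cluster names.
    toy = {
        "c1": {"iso1", "iso2", "iso3"},
        "c2": {"iso1", "iso2"},
        "c3": {"iso1", "iso2"},
        "c4": {"iso3"},
    }
    print(unique_core_clusters(toy, {"iso1", "iso2"}))
```

Whether a group with a nonzero count is monophyletic or genophyletic would then be decided by checking it against the MLSA tree, which this sketch does not attempt.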
The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist. These unique core genes encode conserved metabolic functions that can shed light on the adaptation of a species to its ecological niche. Additionally, our study suggests that unique core genes can be used to aid classification of bacteria and contribute to a bacterial species definition on a genomic level. Furthermore, these genes may be of importance in clinical diagnostics and drug development.
Loliginid and sepiolid squid light organs are known to host a variety of bacterial species from the family Vibrionaceae, yet little is known about the species diversity and characteristics among different host squids. Here we present a broad-ranging molecular and physiological analysis of the bacteria colonizing light organs in loliginid and sepiolid squids from various field locations of the Indo-West Pacific (Australia and Thailand). Our PCR-RFLP analysis, physiological characterization, carbon utilization profiling, and electron microscopy data indicate that loliginid squid in the Indo-West Pacific carry a consortium of bacterial species from the families Vibrionaceae and Photobacteriaceae. This research also confirms our previous report of the presence of Vibrio harveyi as a member of the bacterial population colonizing light organs in loliginid squid. pyrH sequence data were used to confirm isolate identity, and indicate that Vibrio and Photobacterium comprise most of the light organ colonizers of squids from Australia, confirming previous reports for Australian loliginid and sepiolid squids. In addition, combined phylogenetic analysis of PCR-RFLP and 16S rDNA data from Australian and Thai isolates associated both Photobacterium and Vibrio clades with both loliginid and sepiolid strains, providing support that geographical origin does not correlate with their relatedness. These results indicate that both loliginid and sepiolid squids demonstrate symbiont specificity (Vibrionaceae), but their distribution is more likely due to environmental factors that are present during the infection process. This study adds significantly to the growing evidence for complex and dynamic associations in nature and highlights the importance of exploring symbiotic relationships in which non-virulent strains of pathogenic Vibrio species could establish associations with marine invertebrates.
Thirty-two genome sequences of various Vibrionaceae members are compared, with emphasis on what makes V. cholerae unique. As few as 1,000 gene families are conserved across all the Vibrionaceae genomes analysed; this fraction roughly doubles for gene families conserved within the species V. cholerae. Of these, approximately 200 gene families that cluster on various locations of the genome are not found in other sequenced Vibrionaceae; these are possibly unique to the V. cholerae species. By comparing gene family content of the analysed genomes, the relatedness to a particular species is identified for two unspeciated genomes. Conversely, two genomes presumably belonging to the same species have suspiciously dissimilar gene family content. We are able to identify a number of genes that are conserved in, and unique to, V. cholerae. Some of these genes may be crucial to the niche adaptation of this species.
The Vibrionaceae family is distantly related to Enterobacteriaceae within the group of bacteria possessing the Dam methylase system. We have cloned, sequenced, and analyzed the dnaA gene region of Vibrio harveyi and found that although the organization of the V. harveyi dnaA region differs from that of Escherichia coli, the expression of both genes is autoregulated and ATP-DnaA binds cooperatively to ATP-DnaA boxes in the dnaA promoter region. The DnaA proteins of V. harveyi and E. coli are interchangeable and function nearly identically in controlling dnaA transcription and the initiation of chromosomal DNA replication despite the evolutionary distance between these bacteria.
Phylogenetic hypotheses based on complete genome data are presented for the Gammaproteobacteria family Vibrionaceae. Two taxon samplings are presented: one including all those taxa for which the genome sequences are complete in terms of arrangement (chromosomal location of fragments; 19 taxa) and one for which the genome sequences contain multiple contigs (44 taxa). Analyses are presented under the Maximum Parsimony and Maximum Likelihood optimality criteria for total evidence datasets, the two chromosomes separately, and individual analyses of locally collinear blocks. Three of the genomes included in the 44 taxon dataset, those of Vibrio gazogenes, Salinivibrio costicola, and Aliivibrio logei have been newly sequenced and their genome sequences are documented here.
Phylogenetic results for the 19-taxon datasets show similar levels of collinear subset of dataset incongruence as a previous study of 22 taxa from the sister family Shewanellaceae, while also echoing the strong phylogenetic performance of random subsets of data also shown in this study. Phylogenetic results for both the 19-taxon and 44-taxon datasets corroborate previous hypotheses about the placement of Photobacterium and Aliivibrio within Vibrionaceae and also highlight problems with how Photobacterium is delimited and indicate that it likely should be dissolved into Vibrio to produce a phylogenetic taxonomy. The 19-taxon and 44-taxon trees based on the large chromosome are congruent for the majority of taxa that are present in both datasets. Analyses of the 44-taxon sampling based on the second, small chromosome are quite different from those based on the large chromosome, which is not surprising given the dramatically divergent nature of the small chromosome and the difficulty in postulating primary homologies.
The phylogenetic analyses presented here represent the most comprehensive genome-level phylogenetic analyses in terms of taxa and data. Based on the availability of genome data for many bacterial species on GenBank, many other bacterial groups would also be amenable to similar genome-scale phylogenetic analyses even when present in multiple contigs. The result that collinear subsets of data are incongruent with the concatenated dataset and with each other while random data subsets show very little incongruence echoes the result of previous work on Shewanellaceae. The 44-taxon phylogenetic analysis presented here thus represents the future of phylogenomic analyses in scope and complexity.
Species of the family Vibrionaceae are ubiquitous in marine environments. Several of these species are important pathogens of humans and marine species. Evidence indicates that genetic exchange plays an important role in the emergence of new pathogenic strains within this family. Data from the sequenced genomes of strains in this family could show how the genes encoded by all these strains, known as the pangenome, are distributed. Information about the core, accessory and panproteome of this family can show how, for example, genes encoding virulence-associated proteins are distributed and help us understand how virulence emerges.
We deduced the complete set of orthologs for eleven strains from this family. The core proteome consists of 1,882 orthologous groups, which is 28% of the 6,629 orthologous groups in this family. There were 4,411 accessory orthologous groups (i.e., proteins that occurred in from 2 to 10 proteomes) and 5,584 unique proteins (encoded once on only one of the eleven genomes). Proteins that have been associated with virulence in V. cholerae were widely distributed across the eleven genomes, but the majority was found only on the genomes of the two V. cholerae strains examined.
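The core, accessory, and unique categories described above amount to counting how many proteomes each orthologous group occurs in. The following is a minimal sketch under that reading; the input layout (group id mapped to the set of genomes containing it) and the function name are assumptions made for illustration, not the method used to deduce the orthologs.

```python
def classify_groups(groups: dict, n_genomes: int) -> dict:
    """Split orthologous groups by how many genomes they occur in.

    core:      present in all n_genomes
    accessory: present in 2 to n_genomes - 1 genomes
    unique:    present in exactly one genome
    `groups` maps a group id to the set of genomes containing it
    (a hypothetical layout for this sketch).
    """
    result = {"core": [], "accessory": [], "unique": []}
    for group_id, genomes in sorted(groups.items()):
        k = len(genomes)
        if k == n_genomes:
            result["core"].append(group_id)
        elif k >= 2:
            result["accessory"].append(group_id)
        elif k == 1:
            result["unique"].append(group_id)
    return result


if __name__ == "__main__":
    # Toy example with three made-up genomes A, B, C.
    toy = {"g1": {"A", "B", "C"}, "g2": {"A", "B"}, "g3": {"C"}}
    print(classify_groups(toy, n_genomes=3))
    # Core share reported in the abstract: 1,882 of 6,629 groups.
    print(round(100 * 1882 / 6629))
```

With eleven genomes, the accessory branch covers groups found in 2 to 10 proteomes, matching the definition given in the text.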
The proteomes are reflective of the differing evolutionary trajectories followed by different strains to similar phenotypes. The composition of the proteomes supports the notion that genetic exchange among species of the Vibrionaceae is widespread and that this exchange aids these species in adapting to their environments.
Bobtail squid from the genera Sepiola and Rondeletiola (Cephalopoda: Sepiolidae) form mutualistic associations with luminous Gram-negative bacteria (Gammaproteobacteria: Vibrionaceae) from the genera Vibrio and Photobacterium. Symbiotic bacteria proliferate inside a bilobed light organ until they are actively expelled by the host into the surrounding environment on a diel basis. This event results in a dynamic symbiont population with the potential to establish the symbiosis with newly hatched sterile (axenic) juvenile sepiolids. In this study, we examined the genetic diversity found in populations of sympatric sepiolid squid species and their symbionts by the use of nested clade analysis with multiple gene analyses. Variation found in the distribution of different species of symbiotic bacteria suggests a strong influence of abiotic factors in the local environment, affecting bacterial distribution among sympatric populations of hosts. These abiotic factors include temperature differences incurred by a shallow thermocline, as well as a lack of strong coastal water movement accompanied by seasonal temperature changes in overlapping niches. Host populations are stable and do not appear to have a significant role in the formation of symbiont populations relative to their distribution across the Mediterranean Sea. Additionally, all squid species examined (Sepiola affinis, S. robusta, S. ligulata, S. intermedia, and Rondeletiola minor) are genetically distinct from one another regardless of location and demonstrate very little intraspecific variation within species. These findings suggest that physical boundaries and distance in relation to population size, and not host specificity, are important factors in limiting or defining gene flow within sympatric marine squids and their associated bacterial symbionts in the Mediterranean Sea.
Evolution of new complex biological behaviour tends to arise by novel combinations of existing building blocks. The functional and evolutionary building blocks of the proteome are protein domains, the function of a protein being dependent on its constituent domains. We clustered completely-sequenced proteomes of prokaryotes on the basis of their protein domain content, as defined by Pfam (release 16.0). This revealed that, although there was a correlation between phylogeny and domain content, other factors also have an influence. This observation motivated an investigation of the relationship between an organism's lifestyle and the complement of domains and domain architectures found within its proteome.
We took a census of all protein domains and domain combinations (architectures) encoded in the completely-sequenced proteobacterial genomes. Nine protein domain families were identified that are found in phylogenetically disparate plant-associated bacteria but are absent from non-plant-associated bacteria. Most of these are known to play a role in the plant-associated lifestyle, but they also included domain of unknown function DUF1427, which is found in plant symbionts and pathogens of the alpha-, beta- and gamma-Proteobacteria, but not known in any other organism. Further, several domains were identified as being restricted to phytobacteria and Eukaryotes. One example is the RolB/RolC glucosidase family, which is found only in Agrobacterium species and in plants. We identified the 0.5% of Pfam protein domain families that were most significantly over-represented in the plant-associated Proteobacteria with respect to the background frequencies in the whole set of available proteobacterial proteomes. These included guanylate cyclase, domains implicated in aromatic catabolism, cellulase and several domains of unknown function.
We identified 459 unique domain architectures found in phylogenetically diverse plant pathogens and symbionts that were absent from non-pathogenic and non-symbiotic relatives. The vast majority of these were restricted to a single species or several closely related species and so their distributions could be better explained by phylogeny than by lifestyle. However, several architectures were found in two or more very distantly related phytobacteria but absent from non-plant-associated bacteria. Many of the proteins with these unique architectures are predicted to be secreted.
In Pseudomonas syringae pathovar tomato, those genes encoding proteins with novel domain architectures tended to have atypical GC contents and were adjacent to insertion sequence elements and phage-like sequences, suggesting acquisition by horizontal transfer.
By identifying domains and architectures unique to plant pathogens and symbionts, we highlighted candidate proteins for involvement in plant-associated bacterial lifestyles. Given that characterisation of novel gene products in vivo and in vitro is time-consuming and expensive, this computational approach may be useful for reducing experimental search space. Furthermore we discuss the biological significance of novel proteins highlighted by this study in the context of plant-associated lifestyles.
The virulence regulatory protein ToxR of Vibrio cholerae is unique in that it contains a cytoplasmic DNA-binding–transcriptional activation domain, a transmembrane domain, and a periplasmic domain. Although ToxR and other transmembrane transcriptional activators have been discovered in other bacteria, little is known about their mechanism of activation. Utilizing degenerate oligonucleotides and PCR, we have amplified internal toxR gene sequences from seven Vibrio and Photobacterium species and subspecies, demonstrating that toxR is an ancestral gene of the family Vibrionaceae. Sequence alignment of all available ToxR amino acid sequences revealed a region between the transcriptional activation and transmembrane domains that displays wide divergence among Vibrio species. We hypothesize that this region merely tethers the transcriptional activation domain to the cytoplasmic membrane and thus can tolerate wide divergence and multiple insertions and deletions. The divergence in the tether region at the nucleotide level may provide a useful tool for the distinction of Vibrio and Photobacterium species.
Genomic DNAs from eubacteria belonging to the gamma-3 subdivision of purple bacteria, as classified by Woese (C.R. Woese, Microbiol. Rev. 51:221-271, 1987), were probed with the argT operon of Escherichia coli encoding 5'-tRNA(Arg)-tRNA(His)-tRNA(Leu)-tRNA(Pro)-3'. The homologous operon from Vibrio harveyi was isolated and sequenced. Comparison of the five available sequences of this tRNA cluster from members of the families Enterobacteriaceae, Aeromonadaceae, and Vibrionaceae led to the conclusion that variations in different versions of this operon arose not only by point mutations but also by duplication and addition-deletion of entire tRNA genes. This database permitted the formulation of a proposal dealing with the evolutionary history of this operon and suggested that DNA regions containing tRNA genes are active centers (hot spots) of recombination. Finally, since the operon from V. harveyi was not highly repetitive and did not contain tRNA pseudogenes, in contrast to the Photobacterium phosphoreum operon, hybridization of genomic DNAs from different photobacterial strains with probes specific for the repeated pseudogene element was performed. We conclude that the phylogenetic distribution of the repetitive DNA is restricted to strains of P. phosphoreum.
In recent years, genome sequencing has been used to characterize new bacterial species, a method of analysis made available by improved methodology and reduced cost. Included in a constantly expanding list of Vibrio species are several that have been reclassified as novel members of the Vibrionaceae. This study describes two putative new Vibrio species, Vibrio sp. RC341 and Vibrio sp. RC586, previously characterized as non-toxigenic environmental variants of V. cholerae, for which we propose the names V. metecus and V. parilis, respectively.
Based on results of whole-genome average nucleotide identity (ANI), average amino acid identity (AAI), rpoB similarity, MLSA, and phylogenetic analysis, the new species are concluded to be phylogenetically closely related to V. cholerae and V. mimicus. Vibrio sp. RC341 and Vibrio sp. RC586 demonstrate features characteristic of V. cholerae and V. mimicus, respectively, on differential and selective media, but their genomes show a 12 to 15% divergence (88 to 85% ANI and 92 to 91% AAI) compared to the sequences of V. cholerae and V. mimicus genomes (ANI <95% and AAI <96% indicative of separate species). Vibrio sp. RC341 and Vibrio sp. RC586 share 2104 ORFs (59%) and 2058 ORFs (56%) with the published core genome of V. cholerae and 2956 (82%) and 3048 ORFs (84%) with V. mimicus MB-451, respectively. The novel species share 2926 ORFs with each other (81% Vibrio sp. RC341 and 81% Vibrio sp. RC586). Virulence-associated factors and genomic islands of V. cholerae and V. mimicus, including VSP-I and II, were found in these environmental Vibrio spp.
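The species cutoffs quoted above (ANI <95% and AAI <96%) can be written as a small check. This is a minimal sketch: the function names are hypothetical, and requiring both criteria at once is an assumption about how the two thresholds are combined.

```python
# Cutoffs as quoted in the text; values below them indicate separate species.
ANI_SPECIES_CUTOFF = 95.0
AAI_SPECIES_CUTOFF = 96.0


def is_separate_species(ani_percent, aai_percent):
    """Return True when both identities fall below the quoted cutoffs
    (combining them with 'and' is an assumption of this sketch)."""
    return ani_percent < ANI_SPECIES_CUTOFF and aai_percent < AAI_SPECIES_CUTOFF


def divergence(ani_percent):
    """Percent divergence, taken here as 100 minus the percent ANI."""
    return 100.0 - ani_percent
```

With the reported values, an ANI of 88% corresponds to 12% divergence and an ANI of 85% to 15%, and both genomes fall below the cutoffs.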
Results of this analysis demonstrate these two environmental vibrios, previously characterized as variant V. cholerae strains, are new species which have evolved from ancestral lineages of the V. cholerae and V. mimicus clade. The presence of conserved integration loci for genomic islands as well as evidence of horizontal gene transfer between these two new species, V. cholerae, and V. mimicus suggests genomic islands and virulence factors are transferred between these species.
Comparisons of gene content and orthologous protein sequence constitute a major strategy in whole-genome comparison studies. It is expected that horizontal gene transfer between phylogenetically distant organisms and lineage-specific gene loss have greater influence on gene content-based phylogenetic analysis than on orthologous protein sequence-based phylogenetic analysis. To determine the evolution of the syntrophic bacterium Symbiobacterium thermophilum, we analyzed phylogenetic relationships among Clostridia on the basis of gene content and orthologous protein sequence comparisons. These comparisons revealed that the 2 resulting phylogenetic relationships are topologically different. Our results suggest that each member of the Clostridia has a species-specific gene content because frequent genetic exchanges or gene losses have occurred during evolution. Specifically, the phylogenetic positions of syntrophic Clostridia differed between these 2 phylogenetic analyses, suggesting that large diversity in the living environments may cause the observed species-specific gene content. S. thermophilum occupied the most distant position from the other syntrophic Clostridia in the gene content-based phylogenetic tree. We identified 32 genes (14 under relaxed selection and 18 under functional constraint) evolving under Symbiobacterium-specific selection on the basis of synonymous-to-nonsynonymous substitution ratios. Five of the 14 genes under relaxed selection are related to transcription. In contrast, none of the 18 genes under functional constraint is related to transcription.
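A gene content comparison of the kind described above can be sketched as a distance between sets of gene families. The abstract does not state which measure was used, so the Jaccard distance here is an assumption, and the function name and family identifiers are hypothetical.

```python
def gene_content_distance(genes_a, genes_b):
    """Jaccard distance between two genomes represented as collections
    of gene-family identifiers: 1 - |shared| / |union|."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0  # two empty gene sets are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical gene-family identifiers for two genomes.
genome_1 = {"fam1", "fam2", "fam3"}
genome_2 = {"fam2", "fam3", "fam4"}
distance = gene_content_distance(genome_1, genome_2)
```

Pairwise distances of this kind could feed a distance-based tree method. Unlike sequence-based distances, they change directly with horizontal transfer and gene loss, which is why the two trees are expected to differ.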
The generation of genome-scale data is becoming more routine, yet the subsequent analysis of omics data remains a significant challenge. Here, an approach that integrates multiple omics datasets with bioinformatics tools was developed that produces a detailed annotation of several microbial genomic features. This methodology was used to characterize the genome of Thermotoga maritima, a phylogenetically deep-branching, hyperthermophilic bacterium. Experimental data were generated for whole-genome resequencing, transcription start site (TSS) determination, transcriptome profiling, and proteome profiling. These datasets, analyzed in combination with bioinformatics tools, served as a basis for the improvement of gene annotation, the elucidation of transcription units (TUs), the identification of putative non-coding RNAs (ncRNAs), and the determination of promoters and ribosome binding sites. This revealed many distinctive properties of the T. maritima genome organization relative to other bacteria. This genome has a high number of genes per TU (3.3), a paucity of putative ncRNAs (12), and few TUs with multiple TSSs (3.7%). Quantitative analysis of promoters and ribosome binding sites showed increased sequence conservation relative to other bacteria. The 5′UTRs follow an atypical bimodal length distribution comprised of “Short” 5′UTRs (11–17 nt) and “Common” 5′UTRs (26–32 nt). Transcriptional regulation is limited by a lack of intergenic space for the majority of TUs. Lastly, a high fraction of annotated genes are expressed independent of growth state and a linear correlation of mRNA/protein is observed (Pearson r = 0.63, p < 2.2 × 10^-16, t-test). These distinctive properties are hypothesized to be a reflection of this organism's hyperthermophilic lifestyle and could yield novel insights into the evolutionary trajectory of microbial life on earth.
Genomic studies have greatly benefited from the advent of high-throughput technologies and bioinformatics tools. Here, a methodology integrating genome-scale data and bioinformatics tools is developed to characterize the genome organization of the hyperthermophilic, phylogenetically deep-branching bacterium Thermotoga maritima. This approach elucidates several features of the genome organization and enables comparative analysis of these features across diverse taxa. Our results suggest that the genome of T. maritima is reflective of its hyperthermophilic lifestyle. Ultimately, constraints imposed on the genome have negative impacts on regulatory complexity and phenotypic diversity. Investigating the genome organization of Thermotogae species will help resolve various causal factors contributing to the genome organization such as phylogeny and environment. Applying a similar analysis of the genome organization to numerous taxa will likely provide insights into microbial evolution.
Phylogenetic reconstruction is the method of choice to determine the homologous relationships between sequences. Difficulties in producing high-quality alignments, which are the basis of good trees, and in automating the analysis of trees have unfortunately limited the use of phylogenetic reconstruction methods to individual genes or gene families. Due to the large number of sequences involved, phylogenetic analyses of proteomes preclude manual steps and therefore require a high degree of automation in sequence selection, alignment, phylogenetic inference and analysis of the resulting set of trees. We present a set of programs that automates the steps from seed sequence to phylogeny and a utility to extract all phylogenies that match specific topological constraints from a database of trees. Two example applications that show the type of questions that can be answered by phylome analysis are provided. The generation and analysis of the Thermoplasma acidophilum phylome with regard to lateral gene transfer between Thermoplasmata and Sulfolobus showed best BLAST hits to be far less reliable indicators of lateral transfer than the corresponding protein phylogenies. The generation and analysis of the Danio rerio phylome provided more than twice as many proteins as described previously, supporting the hypothesis of an additional round of genome duplication in the actinopterygian lineage.
In the arginine biosynthetic pathway of the vast majority of prokaryotes, the formation of ornithine is catalyzed by an enzyme transferring the acetyl group of N-α-acetylornithine to glutamate (ornithine acetyltransferase [OATase]) (argJ encoded). Only two exceptions had been reported—the Enterobacteriaceae and Myxococcus xanthus (members of the γ and δ groups of the class Proteobacteria, respectively)—in which ornithine is produced from N-α-acetylornithine by a deacylase, acetylornithinase (AOase) (argE encoded). We have investigated the gene-enzyme relationship in the arginine regulons of two psychrophilic Moritella strains belonging to the Vibrionaceae, a family phylogenetically related to the Enterobacteriaceae. Most of the arg genes were found to be clustered in one continuous sequence divergently transcribed in two wings, argE and argCBFGH(A) [“H(A)” indicates that the argininosuccinase gene consists of a part homologous to known argH sequences and of a 3′ extension able to complement an Escherichia coli mutant deficient in the argA gene, encoding N-α-acetylglutamate synthetase, the first enzyme committed to the pathway]. Phylogenetic evidence suggests that this new clustering pattern arose in an ancestor common to Vibrionaceae and Enterobacteriaceae, where OATase was lost and replaced by a deacylase. The AOase and ornithine carbamoyltransferase of these psychrophilic strains both display distinctly cold-adapted activity profiles, providing the first cold-active examples of such enzymes.