Growth rate has long been considered one of the most valuable phenotypes that can be measured in cells. Aside from being highly accessible and informative in laboratory cultures, maximal growth rate is often a prime determinant of cellular fitness, and predicting phenotypes that underlie fitness is key to both understanding and manipulating life. Despite this, current methods for predicting microbial fitness typically focus on yields [e.g., predictions of biomass yield using GEnome-scale metabolic Models (GEMs)] or require many empirical kinetic constants or substrate uptake rates, which renders these methods ineffective in cases where fitness derives most directly from growth rate. Here we present a new method for predicting cellular growth rate, termed SUMEX, which does not require any empirical variables apart from a metabolic network (i.e., a GEM) and the growth medium. SUMEX is calculated by maximizing the SUM of molar EXchange fluxes (hence SUMEX) in a genome-scale metabolic model. SUMEX successfully predicts relative microbial growth rates across species, environments, and genetic conditions, outperforming traditional cellular objectives (most notably, the convention assuming biomass maximization). The success of SUMEX suggests that the ability of a cell to catabolize substrates and produce a strong proton gradient enables fast cell growth. Easily applicable heuristics for predicting growth rate, such as what we demonstrate with SUMEX, may contribute to numerous medical and biotechnological goals, ranging from the engineering of faster-growing industrial strains and the modeling of mixed ecological communities to the inhibition of cancer growth.
The ability to exchange DNA between cells is a molecular process that exists in different species in the domain Archaea. Such horizontal gene transfer events were shown to take place between distant species of archaea and to result in the transfer of large genomic regions. Here we describe recent progress in this field, discuss the potential use of natural gene exchange processes to perform genome shuffling, and consider its possible biotechnological applications.
Archaea; Cell fusion; Haloferax volcanii; Haloferax mediterranei; Sulfolobus
Haloferax volcanii uses extracellular DNA as a source for carbon, nitrogen, and phosphorus. However, it can also grow to a limited extent in the absence of added phosphorus, indicating that it contains an intracellular phosphate storage molecule. As Hfx. volcanii is polyploid, it was investigated whether DNA might be used as storage polymer, in addition to its role as genetic material. It could be verified that during phosphate starvation cells multiply by distributing as well as by degrading their chromosomes. In contrast, the number of ribosomes stayed constant, revealing that ribosomes are distributed to descendant cells, but not degraded. These results suggest that the phosphate of phosphate-containing biomolecules (other than DNA and RNA) originates from that stored in DNA, not in rRNA. Adding phosphate to chromosome-depleted cells rapidly restores polyploidy. Quantification of desiccation survival of cells with different ploidy levels showed that under phosphate starvation Hfx. volcanii diminishes genetic advantages of polyploidy in favor of cell multiplication. The consequences of the usage of genomic DNA as phosphate storage polymer are discussed, as well as the hypothesis that DNA might have initially evolved as a storage polymer, with the various genetic benefits evolving later.
Predatory bacteria are taxonomically disparate, exhibit diverse predatory strategies and are widely distributed in varied environments. To date, their predatory phenotypes cannot be discerned in genome sequence data, thereby limiting our understanding of bacterial predation and of its impact in nature. Here, we define the ‘predatome’, that is, sets of protein families that reflect the phenotypes of predatory bacteria. The proteomes of all 11 sequenced predatory bacteria, including two de novo sequenced genomes, and 19 non-predatory bacteria from across the phylogenetic and ecological landscapes were compared. Protein families discriminating between the two groups were identified and quantified, demonstrating that differences in the proteomes of predatory and non-predatory bacteria are large and significant. This analysis allows predictions to be made, as we show by confirming from genome data an overlooked bacterial predator. The predatome exhibits deficiencies in riboflavin and amino acid biosynthesis, suggesting that predators obtain them from their prey. In contrast, these genomes are highly enriched in adhesins, proteases and particular metabolic proteins, used for binding to, processing and consuming prey, respectively. Strikingly, predators and non-predators differ in isoprenoid biosynthesis: predators use the mevalonate pathway, whereas non-predators, like almost all bacteria, use the DOXP pathway. By defining predatory signatures in bacterial genomes, the predatory potential they encode can be uncovered, filling an essential gap for measuring bacterial predation in nature. Moreover, we suggest that full-genome proteomic comparisons are applicable to other ecological interactions between microbes, and provide a convenient and rational tool for the functional classification of bacteria.
microbial predators; bacterial predation; comparative genomics; Bdellovibrio
Extracellular DNA is found in all environments and is a dynamic component of the microbial ecosystem. Microbial cells produce and interact with extracellular DNA through many endogenous mechanisms. Extracellular DNA is processed and internalized for use as genetic information and as a major source of macronutrients, and plays several key roles within prokaryotic biofilms. Hypersaline sites contain some of the highest extracellular DNA concentrations measured in nature, a potentially rich source of carbon, nitrogen, and phosphorus for halophilic microorganisms. We conducted DNA growth studies for the halophilic archaeon Haloferax volcanii DS2 and show that this model Halobacteriales strain is capable of using exogenous double-stranded DNA as a nutrient. Further experiments with varying medium composition, DNA concentration, and DNA types revealed that DNA is utilized primarily as a phosphorus source, that growth on DNA is concentration-dependent, and that DNA isolated from different sources is metabolized selectively, with a bias against highly divergent methylated DNA. Additionally, fluorescence microscopy showed that labeled DNA co-localized with H. volcanii cells. The gene Hvo_1477 was also identified using a comparative genomic approach as a factor likely to be involved in DNA processing at the cell surface, and deletion of Hvo_1477 created a strain deficient in the ability to grow on extracellular DNA. Widespread distribution of Hvo_1477 homologs in archaea suggests metabolism of extracellular DNA may be of broad ecological and physiological relevance in this domain of life.
extracellular DNA; Haloferax volcanii; DNA metabolism; Halobacteria; halophiles; archaea; natural competence; archaeal genetics
The evolutionary history of all life forms is usually represented as a vertical tree-like process. In prokaryotes, however, the vertical signal is partly obscured by the massive influence of horizontal gene transfer (HGT). HGT creates widespread discordance between evolutionary histories of different genes as genomes become mosaics of gene histories. Thus, the Tree of Life (TOL) has been questioned as an appropriate representation of the evolution of prokaryotes. Nevertheless, a common hypothesis is that prokaryotic evolution is primarily tree-like, and a routine effort is made to place new isolates in their appropriate location in the TOL. Moreover, it appears desirable to exploit non–tree-like evolutionary processes for the task of microbial classification. In this work, we present a novel technique that builds on the straightforward observation that gene order conservation (‘synteny’) decreases in time as a result of gene mobility. This is particularly true in prokaryotes, mainly due to HGT. Using a ‘synteny index’ (SI) that measures the average synteny between a pair of genomes, we developed the phylogenetic reconstruction tool ‘Phylo SI’. Phylo SI offers several attractive properties such as easy bootstrapping, high sensitivity in cases where phylogenetic signal is weak, and computational efficiency. Phylo SI was tested both on simulated data and on two bacterial data sets and compared with two well-established phylogenetic methods. Phylo SI is particularly efficient on short evolutionary distances where synteny footprints remain detectable, whereas the nucleotide substitution signal is too weak for reliable sequence-based phylogenetic reconstruction. The method is publicly available at http://research.haifa.ac.il/ssagi/software/PhyloSI.zip.
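The abstract above does not spell out how the synteny index is computed, so the following is only a minimal sketch under stated assumptions: each genome is a circular list of unique gene-family identifiers, the neighborhood of a gene is the k genes on either side of it, and the index is the fraction of shared neighbors averaged over all genes common to both genomes. The function names and the default k are illustrative and are not taken from Phylo SI.

```python
def neighborhood(genome, index, k):
    """Genes within k positions of `index` on a circular genome, excluding the gene itself."""
    n = len(genome)
    flank = {genome[(index + offset) % n] for offset in range(-k, k + 1) if offset != 0}
    flank.discard(genome[index])  # guards against wrap-around on very short genomes
    return flank


def synteny_index(genome_a, genome_b, k=2):
    """Average fraction of shared neighbors over genes present in both genomes.

    Assumption (not from the source): every gene identifier occurs at most once
    per genome, and the per-gene score is |shared neighbors| / (2k).
    """
    pos_a = {gene: i for i, gene in enumerate(genome_a)}
    pos_b = {gene: i for i, gene in enumerate(genome_b)}
    shared = pos_a.keys() & pos_b.keys()
    if not shared:
        return 0.0
    total = 0.0
    for gene in shared:
        near_a = neighborhood(genome_a, pos_a[gene], k)
        near_b = neighborhood(genome_b, pos_b[gene], k)
        total += len(near_a & near_b) / (2 * k)
    return total / len(shared)


# Toy genomes: the second is a rearranged copy of the first.
reference = list("ABCDEFGH")
rearranged = list("AECGBFDH")
si_same = synteny_index(reference, reference)
si_rearranged = synteny_index(reference, rearranged)
```

Under these assumptions, one natural way to obtain a pairwise distance for tree building would be 1 minus the index, but whether Phylo SI uses that transformation is not stated in the abstract.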
Cells of undomesticated species of Bacillus subtilis frequently form complex colonies during spreading on agar surfaces. Given that menaquinone is involved in another form of coordinated behavior, namely, sporulation, we looked for a possible role for menaquinone in complex colony development (CCD) in the B. subtilis strain NCIB 3610. Here we show that inhibition of menaquinone biosynthesis in B. subtilis indeed abolished its ability to develop complex colonies. Additionally, some mutations of B. subtilis which confer defective CCD could be suppressed by menaquinone derivatives. Several such mutants mapped to the dhb operon encoding the genes responsible for the biosynthesis of the iron siderophore, bacillibactin. Our results demonstrate that both menaquinone and iron are essential for CCD in B. subtilis.
Methanosphaera stadtmanae is a commensal methanogenic archaeon found in the human gut. As most of its niche-neighbors are bacteria, it is expected that lateral gene transfer (LGT) from bacteria might have contributed to the evolutionary history of this organism. We performed a phylogenomic survey of putative LGT events in M. stadtmanae, using a phylogenetic pipeline. Our analysis indicates that a substantial fraction of the proteins of M. stadtmanae are inferred to have been involved in inter-domain LGT. Laterally acquired genes have had a large contribution to surface functions, by providing novel glycosyltransferase functions. In addition, several ABC transporters seem to be of bacterial origin, including the molybdate transporter. Thus, bacterial genes contributed to the adaptation of M. stadtmanae to a host-dependent lifestyle by allowing a larger variation in surface structures and increasing transport efficiency in the gut niche which is diverse and competitive.
horizontal gene transfer; microbial evolution; archaeal genomics; archaea; methanogens; human gut
KEOPS is an important cellular complex conserved in Eukarya, with some subunits conserved in Archaea and Bacteria. This complex was recently found to play an essential role in formation of the tRNA modification threonylcarbamoyladenosine (t6A), and was previously associated with telomere length maintenance and transcription. KEOPS subunits are conserved in Archaea, especially in the Euryarchaeota, where they have been studied in vitro. Here we attempted to delete the genes encoding the four conserved subunits of the KEOPS complex in the euryarchaeote Haloferax volcanii and study their phenotypes in vivo. The fused kae1-bud32 gene was shown to be essential, as was cgi121, which is dispensable in yeast. In contrast, pcc1 (encoding the putative dimerizing unit of KEOPS) was not essential in H. volcanii. Deletion of pcc1 led to pleiotropic phenotypes, including decreased growth rate, reduced levels of t6A modification, and elevated levels of intracellular glycation products.
Microbial ecosystems are often assumed to be relatively stable over short periods of time, but this assumption is seldom tested. An urban stream influenced by both flow and varying levels of anthropogenic influences is expected to have high temporal variability in microbial composition, and short-term ecological instability. Thus, we analyzed the bacterioplankton composition of a weir-fragmented urban stream using Automated rRNA Intergenic Spacer Analysis (ARISA). A total of 46 sequential samples were collected every 7 hours over 7 days in July 2009, from both the upstream side of the weir (stream water) and the downstream side of the weir (estuarine water). Bray-Curtis similarity-based analysis showed a clear division between upstream and downstream communities. A sudden pH drop induced change in both communities, but composition stability partially recovered within less than a day. Thus, our results show that microbial ecosystems can change rapidly, but re-establish a new equilibrium relatively quickly.
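For readers unfamiliar with the Bray-Curtis measure mentioned above, here is a minimal sketch of the standard formula, similarity = 1 - Σ|x_i - y_i| / Σ(x_i + y_i). It assumes that each ARISA profile has already been reduced to a vector of non-negative abundances over the same fragment-length bins; that preprocessing and the function name are assumptions, not details from the study.

```python
def bray_curtis_similarity(profile_x, profile_y):
    """Bray-Curtis similarity between two abundance vectors over identical bins.

    Returns 1.0 for identical profiles and 0.0 for profiles with no shared bins.
    """
    if len(profile_x) != len(profile_y):
        raise ValueError("profiles must cover the same bins")
    total = sum(profile_x) + sum(profile_y)
    if total == 0:
        raise ValueError("both profiles are empty")
    difference = sum(abs(x - y) for x, y in zip(profile_x, profile_y))
    return 1.0 - difference / total


# Hypothetical peak abundances for two samples over three fragment-length bins.
upstream = [2, 1, 0]
downstream = [1, 1, 2]
similarity = bray_curtis_similarity(upstream, downstream)
```

Computing this value for every pair of samples yields the similarity matrix on which clustering or ordination of the upstream and downstream communities can then be based.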
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci have been shown to provide prokaryotes with an adaptive immunity against viruses and plasmids. CRISPR arrays are transcribed and processed into small CRISPR RNA molecules, which base-pair with invading DNA or RNA and lead to its degradation by CRISPR-associated (Cas) protein complexes. New spacers can be acquired by active CRISPR/Cas systems, and thus the sequences of these spacers provide a record of the past “infection history” of the organism. Recently we used spacer sequences from archaeal genomes to infer gene exchange events among archaeal species and genera and to demonstrate that at least in this domain of life CRISPR indeed has an anti-viral role.
CRISPR; Lateral Gene Transfer; archaea; horizontal gene transfer; viruses
There is a growing interest in the study of the human gut microbiota, as correlations between changes in bacterial profiles and diseases are increasingly discovered. Studies in this field generally use fecal samples, but it is often easier to obtain colon content aspirates during colonoscopy. This study used automated ribosomal intergenic spacer analysis (ARISA) to examine the extent to which the microbiota of colon aspirate samples obtained after bowel cleansing can reflect interindividual differences and serve as a proxy for fecal samples. Pre-bowel preparation fecal samples as well as colonoscopy aspirate samples from the cecum and rectum were obtained from 19 subjects. DNA was extracted from all samples, and comparative analysis was performed, including analysis of similarity (ANOSIM) and nonmetric multidimensional scaling. ANOSIM confirmed that samples from the same individual were well separated from samples from different individuals. Significantly larger differences were found between samples from different individuals than between samples of the same individual (R = 0.7605, p < 0.0001). These findings show that post-bowel preparation aspirates maintain a strong individual signature. Colonoscopy aspirates can therefore serve as a substitute for fecal samples in studies comparing the microbiota of different clinical study groups, especially when fecal samples are
microbiota analysis; ARISA; ITS; colonoscopy; inter-individual variation; intra-individual variation
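The R value reported in the abstract above is the ANOSIM statistic. As a minimal sketch of how such a value is obtained, the code below implements the standard definition, R = (mean rank of between-group dissimilarities - mean rank of within-group dissimilarities) / (N(N-1)/4), with tied dissimilarities receiving their average rank. The toy dissimilarity matrix and group labels are invented for illustration, and the permutation test that produces the p-value is not included.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dissimilarity, groups):
    """ANOSIM R for a square dissimilarity matrix and one group label per sample."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([dissimilarity[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    if not within or not between:
        raise ValueError("need both within-group and between-group pairs")
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


# Toy data: two samples from each of two hypothetical subjects.
toy_matrix = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.6],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.6, 0.2, 0.0],
]
toy_groups = ["subject_1", "subject_1", "subject_2", "subject_2"]
toy_r = anosim_r(toy_matrix, toy_groups)
```

In the setting of the study, each group would correspond to one subject and contain that subject's fecal, cecal and rectal samples; values of R near 1 indicate that samples group by individual.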
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism.
Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention.
CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma in which the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense.
Open peer review
This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten)
CRISPR; Lateral Gene transfer; Horizontal gene transfer; viruses; archaea; competence
Degradation of mRNA in bacteria is a regulatory mechanism, providing an efficient way to fine-tune protein abundance in response to environmental changes. While the mechanisms responsible for initiation and subsequent propagation of mRNA degradation are well studied, the mRNA features that affect its stability are yet to be elucidated. We calculated three properties for each mRNA in the E. coli transcriptome: G+C content, tRNA adaptation index (tAI) and folding energy. Each of these properties was then correlated with the experimentally measured half-life of each transcript, and significant correlations were detected. A sliding window analysis identified the regions that displayed the maximal signal. The correlation between transcript half-life and both G+C content and folding energy was strongest at the 5′ termini of the mRNAs. Partial correlations showed that each of the parameters contributes separately to mRNA half-life. Notably, mRNAs of recently acquired genes in the E. coli genome, which have a distinct nucleotide composition, tend to be highly stable. This high stability may aid the evolutionary fixation of horizontally acquired genes.
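The sliding window analysis described above can be illustrated for the simplest of the three properties, G+C content. This is only a sketch under assumptions that the abstract does not state: windows are anchored at the 5′ end of each transcript, Pearson correlation is used (the study may have used a different coefficient), and the window width and step are free parameters. The sequences and half-lives in the usage lines are invented toy values.

```python
import math


def gc_content(sequence):
    """Fraction of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in sequence) / len(sequence)


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either variable has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)


def sliding_window_correlation(transcripts, half_lives, width, step):
    """Correlate window G+C content with half-life at each offset from the 5' end.

    Only offsets covered by the shortest transcript are evaluated.
    Returns a list of (window_start, correlation) tuples.
    """
    shortest = min(len(seq) for seq in transcripts)
    results = []
    for start in range(0, shortest - width + 1, step):
        window_gc = [gc_content(seq[start:start + width]) for seq in transcripts]
        results.append((start, pearson(window_gc, half_lives)))
    return results


# Toy transcripts and half-lives (arbitrary units), for illustration only.
toy_transcripts = ["GGGGAAAA", "GGAAAAAA", "AAAAAAAA"]
toy_half_lives = [3.0, 2.0, 1.0]
profile = sliding_window_correlation(toy_transcripts, toy_half_lives, width=4, step=4)
```

The window with the largest absolute correlation marks the region carrying the strongest signal; the same loop could be run with folding energy or tAI in place of `gc_content`, given functions that compute them.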
In recent years, both homing endonucleases (HEases) and zinc-finger nucleases (ZFNs) have been engineered and selected for the targeting of desired human loci for gene therapy. However, enzyme engineering is lengthy and expensive and the off-target effect of the manufactured endonucleases is difficult to predict. Moreover, enzymes selected to cleave a human DNA locus may not cleave the homologous locus in the genome of animal models because of sequence divergence, thus hampering attempts to assess the in vivo efficacy and safety of any engineered enzyme prior to its application in human trials. Here, we show that naturally occurring HEases that cleave desirable human targets can be found. Some of these enzymes are also shown to cleave the homologous sequence in the genome of animal models. In addition, the distribution of off-target effects may be more predictable for native HEases. Based on our experimental observations, we present the HomeBase algorithm, database and web server that allow a high-throughput computational search and assignment of HEases for the targeting of specific loci in the human and other genomes. We validate experimentally the predicted target specificity of candidate fungal, bacterial and archaeal HEases using cell-free, yeast and archaeal assays.
Horizontal gene transfer (HGT) is a major force in microbial evolution. Previous studies have suggested that a variety of factors, including restricted recombination and toxicity of foreign gene products, may act as barriers to the successful integration of horizontally transferred genes. This study identifies an additional central barrier to HGT—the lack of co-adaptation between the codon usage of the transferred gene and the tRNA pool of the recipient organism. Analyzing the genomic sequences of more than 190 microorganisms and the HGT events that have occurred between them, we show that the number of genes that were horizontally transferred between organisms is positively correlated with the similarity between their tRNA pools. Those genes that are better adapted to the tRNA pools of the target genomes tend to undergo more frequent HGT. At the community (or environment) level, organisms that share a common ecological niche tend to have similar tRNA pools. These results remain significant after controlling for diverse ecological and evolutionary parameters. Our analysis demonstrates that there are bi-directional associations between the similarity in the tRNA pools of organisms and the number of HGT events occurring between them. Similar tRNA pools between a donor and a host tend to increase the probability that a horizontally acquired gene will become fixed in its new genome. Our results also suggest that frequent HGT may be a homogenizing force that increases the similarity in the tRNA pools of organisms within the same community.
Inteins are parasitic genetic elements, analogous to introns, that excise themselves at the protein level by self-splicing, allowing the formation of functional non-disrupted proteins. Many inteins contain a homing endonuclease (HEN) gene, and rely on its activity for horizontal propagation. In the halophilic archaeon Haloferax volcanii, the gene encoding DNA polymerase B (polB) contains an intein with an annotated but uncharacterized HEN. Here we examine the activity of the polB HEN in vivo, within its natural archaeal host. We show that this HEN is highly active, and able to insert the intein into both a chromosomal target and an extra-chromosomal plasmid target, by gene conversion. We also demonstrate that the frequency of its incorporation depends on the length of the flanking homologous sequences around the target site, reflecting its dependence on the homologous recombination machinery. Although several evolutionary models predict that the presence of an intein involves a change in the fitness of the host organism, our results show that a strain deleted for the intein sequence shows no significant changes in growth rate compared to the wild type.
Amadori-modified proteins (AMPs) are the products of nonenzymatic glycation formed by reaction of reducing sugars with primary amine-containing amino acids and can develop into advanced glycated end products (AGEs), highly stable toxic compounds. AGEs are known to participate in many age-related human diseases, including cardiovascular, neurological, and liver diseases. The metabolism of these glycated proteins is not yet understood, and the mechanisms that reduce their accumulation are not known so far. Here, we show for Escherichia coli that a conserved glycopeptidase (Gcp, also called Kae1), which is encoded by nearly every sequenced genome in the three domains of life, prevents the accumulation of Amadori products and AGEs. Using mutants, we show that Gcp depletion results in accumulation of AMPs and eventually leads to the accumulation of AGEs. We demonstrate that Gcp binds to glycated proteins, including pyruvate dehydrogenase, previously shown to be a glycation-prone enzyme. Our experiments also show that the severe phenotype of Gcp depletion can be relieved under conditions of low intracellular glycation. As glycated proteins are ubiquitous, the involvement of Gcp in the metabolism of AMPs and AGEs is likely to have been conserved in evolution, suggesting a universal involvement of Gcp in cellular aging and explaining the essentiality of Gcp in many organisms.
Glycated proteins (Amadori-modified proteins [AMPs] and advanced glycated end products [AGEs]) are known to participate in many age-related diseases. Their existence in fast-growing organisms was considered unlikely, as their formation was assumed to be slow. Yet, recent evidence demonstrated their existence in bacteria, and our data suggest a bacterial mechanism that reduces their accumulation. We identify in Escherichia coli a protein, Gcp, which carries out this function. Gcp is conserved in all domains of life and is essential in many organisms. Although it was annotated as a chaperone protease, there were no experimental data to support this function. Our findings are compatible with the annotation and will open up studies of the bacterial metabolism of glycated proteins. Furthermore, the data from the bacterial systems may also be instrumental in understanding the metabolism of glycated proteins, including their toxicity in human health and disease.
We propose a method for deriving enzymatic signatures from short read metagenomic data of unknown species. The short read data are converted to six pseudo-peptide candidates. We search for occurrences of Specific Peptides (SPs) on the latter. SPs are peptides that are indicative of enzymatic function as defined by the Enzyme Commission (EC) nomenclature. The number of SP hits on an ensemble of short reads is counted and then converted to estimates of numbers of enzymatic genes associated with different EC categories in the studied metagenome. Relative amounts of different EC categories define the enzymatic spectrum, without the need to perform genomic assemblies of short reads.
The method is developed and tested on 22 bacteria for which there exist many EC annotations in UniProt. Enzymatic signatures are derived for 3 metagenomes, and their functional profiles are explored.
We extend the SP methodology to taxon-specific SPs (TSPs), allowing us to estimate taxonomic features of metagenomic data from short reads. Using recent Swiss-Prot data we obtain TSPs for different phyla of bacteria, and different classes of proteobacteria. These allow us to analyze the major taxonomic content of 4 different metagenomic data-sets.
The SP methodology can be successfully extended to applications on short read genomic and metagenomic data. This leads to direct derivation of enzymatic signatures from raw short reads. Furthermore, by employing TSPs, one obtains valuable taxonomic information.
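The first two steps of the method described above, converting each short read to six pseudo-peptides and searching them for Specific Peptides, can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: it uses the standard genetic code, translates unknown codons to "X", counts each read at most once per EC label, and takes the SP list as a plain dictionary. The peptide and the label in the usage lines are invented placeholders, and the conversion of hit counts into gene-number estimates is not attempted.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_peptides(read):
    """Three forward-strand frames followed by three reverse-strand frames."""
    read = read.upper()
    strands = (read, reverse_complement(read))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


def count_sp_hits(reads, specific_peptides):
    """Count reads containing each specific peptide, aggregated by EC label.

    `specific_peptides` maps a peptide string to an EC label (hypothetical input).
    """
    counts = {}
    for read in reads:
        frames = six_frame_peptides(read)
        matched_labels = {
            label
            for peptide, label in specific_peptides.items()
            if any(peptide in frame for frame in frames)
        }
        for label in matched_labels:
            counts[label] = counts.get(label, 0) + 1
    return counts


# Placeholder read and SP list, for illustration only.
toy_reads = ["ATGGCCAAA"]
toy_sps = {"MAK": "EC-demo"}
toy_counts = count_sp_hits(toy_reads, toy_sps)
```

Because every read is scanned independently, no assembly is needed, which is the property the abstract emphasizes; the same scan with taxon-specific peptides in the dictionary would give the taxonomic variant.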
Carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1), an immunoglobulin (Ig)-related glycoprotein, serves as cellular receptor for a variety of Gram-negative bacterial pathogens associated with the human mucosa. In particular, Neisseria gonorrhoeae, N. meningitidis, Moraxella catarrhalis, and Haemophilus influenzae possess well-characterized CEACAM1-binding adhesins. CEACAM1 is typically involved in cell-cell attachment, epithelial differentiation, neovascularisation and regulation of T-cell proliferation, and is one of the few CEACAM family members with homologues in different mammalian lineages. However, it is unknown whether bacterial adhesins of human pathogens can recognize CEACAM1 orthologues from other mammals.
Sequence comparisons of the amino-terminal Ig-variable-like domain of CEACAM1 reveal that the highest sequence divergence between human, murine, canine and bovine orthologues is found in the β-strands comprising the bacteria-binding CC'FG-face of the Ig-fold. Using GFP-tagged, soluble amino-terminal domains of CEACAM1, we demonstrate that bacterial pathogens selectively associate with human, but not other mammalian CEACAM1 orthologues. Whereas full-length human CEACAM1 can mediate internalization of Neisseria gonorrhoeae in transfected cells, murine CEACAM1 fails to support bacterial internalization, demonstrating that the sequence divergence of CEACAM1 orthologues has functional consequences with regard to bacterial recognition and cellular invasion.
Our results establish the selective interaction of several human-restricted bacterial pathogens with human CEACAM1 and suggest that co-evolution of microbial adhesins with their corresponding receptors on mammalian cells contributes to the limited host range of these highly adapted infectious agents.
In their natural environments, microorganisms form complex systems of interactions. Understanding the structure and organization of bacterial communities is likely to have broad medical and ecological consequences, yet a comprehensive description of the network of environmental interactions is currently lacking. Here, we mine co-occurrences in the scientific literature to construct such a network and demonstrate an expected pattern of association between the species’ lifestyle and the recorded number of co-occurring partners. We further focus on the well-annotated gut community and show that most co-occurrence interactions of typical gut bacteria occur within this community. The network is then clustered into species-groups that significantly correspond with naturally occurring communities. The relationships between resource competition, metabolic yield and growth rate within the clusters correspond with the r/K selection theory. Overall, these results support the constructed clusters as a first approximation of a bacterial ecosystem model. This comprehensive collection of predicted communities forms a new data resource for further systematic characterization of the ecological design principles shaping communities. Here, we demonstrate its utility for predicting cooperation and inhibition within communities.
The evolutionary origins of genetic robustness are still under debate: it may arise as a consequence of requirements imposed by varying environmental conditions, due to intrinsic factors such as metabolic requirements, or directly due to an adaptive selection in favor of genes that allow a species to endure genetic perturbations. Stratifying the individual effects of each origin requires one to study the pertaining evolutionary forces across many species under diverse conditions. Here we conduct the first large-scale computational study charting the level of robustness of metabolic networks of hundreds of bacterial species across many simulated growth environments. We provide evidence that variations among species in their level of robustness reflect ecological adaptations. We decouple metabolic robustness into two components and quantify the extents of each: the first, environmental-dependent, is responsible for at least 20% of the non-essential reactions and its extent is associated with the species' lifestyle (specialized/generalist); the second, environmental-independent, is associated (correlation = ∼0.6) with the intrinsic metabolic capacities of a species—higher robustness is observed in fast growers or in organisms with an extensive production of secondary metabolites. Finally, we identify reactions that are uniquely susceptible to perturbations in human pathogens, potentially serving as novel drug-targets.
When a species is grown under optimal conditions, the single knockout of most of its genes is not likely to affect its viability. The resilience of biological systems to mutations is termed genetic robustness, and its extent across different species has not yet been systematically described. Since the deletion of a gene can have varying consequences depending on the environmental conditions, the extent of a species' genetic robustness reflects both the range of conditions (or environments) in which it can survive and the availability of alternative cellular routes (compensating for a gene's loss of function). Here, we developed a computational model for estimating the essentiality of metabolic reactions across natural-like environments and applied it to chart species' level of genetic robustness, providing the first systematic description of genetic robustness across species. Studying robustness across a wide collection of natural-like environments enables one to stratify, for each species individually, the extent of environmental-dependent and environmental-independent robustness and hence advances our understanding of its evolutionary origins. Our main finding is that the level of environmental-dependent robustness is associated with the lifestyle of a species (i.e., specialists versus generalists), whereas the level of environmental-independent robustness is associated with its metabolic production capacities.
Thymidylate synthases (Thy) are key enzymes in the synthesis of deoxythymidylate, 1 of the 4 building blocks of DNA. As such, they are essential for all DNA-based forms of life and therefore implicated in the hypothesized transition from RNA genomes to DNA genomes. Two evolutionarily unrelated Thy enzymes, ThyA and ThyX, are known to catalyze the same biochemical reaction. Both enzymes are sporadically distributed within each of the 3 domains of life in a pattern that suggests multiple nonhomologous lateral gene transfer (LGT) events. We present a phylogenetic analysis of the evolution of the 2 enzymes, aimed at unraveling their entangled evolutionary history and tracing their origin back to early life. A novel probabilistic evolutionary model was developed, which allowed us to compute the posterior probabilities and the posterior expectation of the number of LGT events. Simulation studies were performed to validate the model's ability to accurately detect LGT events that have occurred throughout a large phylogeny. Applying the model to the Thy data revealed widespread nonhomologous LGT between and within all 3 domains of life. By reconstructing the ThyA and ThyX gene trees, the most likely donor of each LGT event was inferred. Finally, the role of viruses in LGT of Thy is discussed.
Evolutionary models; lateral gene transfer; thymidylate synthase
Probabilistic evolutionary models have revolutionized our capability to extract biological insights from sequence data. While these models accurately describe the stochastic processes of site-specific substitutions, single-base substitutions represent only a fraction of all the events that shape genomes. Specifically, in microbes, events in which entire genes are gained (e.g. via horizontal gene transfer) and lost play a pivotal evolutionary role. In this research, we present a novel likelihood-based evolutionary model for gene gains and losses, and use it to analyse genome-wide patterns of the presence and absence of gene families. The model assumes a Markovian stochastic process along an underlying phylogenetic tree, in which a gain is represented by a transition from absence to presence and a loss by a transition from presence to absence. To account for differences in the rates of gain and loss among gene families, we assume among-gene-family rate variability, thus allowing for a more accurate description of the data. Using a Bayesian approach, we estimated an evolutionary rate for each gene family. Simulation studies demonstrated that our methodology accurately infers these rates. Our methodology was applied to analyse a large corpus of data, consisting of 4873 gene families spanning 63 species, and revealed novel insights regarding the evolutionary nature of genome-wide gain and loss dynamics.
phyletic pattern; probabilistic evolutionary models; genome evolution; gene gain and loss; horizontal gene transfer; gene content
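The presence/absence Markov process described in the abstract above can be illustrated with a two-state continuous-time Markov chain. The sketch below is an illustration under stated assumptions, not the authors' implementation. It uses the standard closed-form transition probabilities for a two-state chain, considers a single branch rather than a full phylogeny, and models rate variability with a small set of equally likely rate categories (the source does not specify the rate distribution). The function names and the category values are hypothetical.

```python
import math
from typing import List, Sequence


def transition_probs(gain: float, loss: float, t: float) -> List[List[float]]:
    """Transition matrix P(t) for states 0 = absent, 1 = present.

    gain is the rate of 0 -> 1, loss is the rate of 1 -> 0, t is branch length.
    """
    total = gain + loss
    decay = math.exp(-total * t)
    p_gain = (gain / total) * (1.0 - decay)  # P(absent -> present)
    p_loss = (loss / total) * (1.0 - decay)  # P(present -> absent)
    return [[1.0 - p_gain, p_gain], [p_loss, 1.0 - p_loss]]


def posterior_mean_rate(
    start: int,
    end: int,
    t: float,
    gain: float,
    loss: float,
    rate_categories: Sequence[float],
) -> float:
    """Posterior expectation of a gene-family rate multiplier on one branch.

    Assumes a uniform prior over the given rate categories; each category
    scales both the gain and the loss rate.
    """
    likelihoods = [
        transition_probs(gain * r, loss * r, t)[start][end] for r in rate_categories
    ]
    norm = sum(likelihoods)
    return sum(r * lik for r, lik in zip(rate_categories, likelihoods)) / norm


# Hypothetical values for illustration only.
categories = [0.5, 1.0, 2.0]
rate_if_gained = posterior_mean_rate(0, 1, t=0.1, gain=1.0, loss=1.0, rate_categories=categories)
rate_if_unchanged = posterior_mean_rate(0, 0, t=0.1, gain=1.0, loss=1.0, rate_categories=categories)
print(rate_if_gained, rate_if_unchanged)
```

On a short branch, observing a gain shifts the posterior toward the faster rate categories, whereas observing no change shifts it toward the slower ones. A full analysis would instead compute the likelihood of each phyletic pattern over the whole tree.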
Bacterial ecological strategies revealed by metabolic network analysis show that ecological diversity correlates with metabolic flexibility, faster growth rate and intense co-habitation.
The growth rate of an organism is an important phenotypic trait, directly affecting its ability to survive in a given environment. Here we present the first large-scale computational study of the association between ecological strategies and growth rate across 113 bacterial species, occupying a variety of metabolic habitats. Genomic data are used to reconstruct the species' metabolic networks and habitable metabolic environments. These reconstructions are then used to investigate the typical ecological strategies taken by organisms in terms of two basic species-specific measures: metabolic variability, the ability of a species to survive in a variety of different environments; and the co-habitation score vector, the distribution of other species that co-inhabit each environment.
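The two measures defined above can be sketched on a toy viability table. This is one possible reading of the definitions, not the study's actual scoring. It assumes that metabolic variability is the fraction of candidate environments in which a species is viable, and that the co-habitation score vector lists, for each environment a species can inhabit, the number of other species also viable there. The function names, species labels and environment labels are hypothetical.

```python
from typing import Dict, Set


def metabolic_variability(viable: Dict[str, Set[str]], species: str, n_envs: int) -> float:
    """Fraction of all candidate environments in which the species is viable."""
    return len(viable[species]) / n_envs


def cohabitation_vector(viable: Dict[str, Set[str]], species: str) -> Dict[str, int]:
    """For each environment the species inhabits, count the other species viable there."""
    return {
        env: sum(
            1
            for other, envs in viable.items()
            if other != species and env in envs
        )
        for env in sorted(viable[species])
    }


# Hypothetical viability table: species -> environments it can grow in.
toy_viable: Dict[str, Set[str]] = {
    "A": {"e1", "e2", "e3"},  # generalist
    "B": {"e1"},              # specialist
    "C": {"e1", "e2"},
}
N_ENVS = 4  # total number of candidate environments, including one nobody inhabits

for name in toy_viable:
    print(name, metabolic_variability(toy_viable, name, N_ENVS), cohabitation_vector(toy_viable, name))
```

With real data, the viability table would come from the reconstructed metabolic networks and habitable environments, and the resulting scores could then be compared against measured growth rates.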
We find that growth rate is significantly correlated with metabolic variability and the level of co-habitation (that is, competition) encountered by an organism. Most bacterial organisms adopt one of two main ecological strategies: a specialized niche with little co-habitation, associated with a typically slow rate of growth; or ecological diversity with intense co-habitation, associated with a typically fast rate of growth.
The observed pattern suggests a universal principle in which metabolic flexibility is associated with a need to grow fast, possibly in the face of competition. This new ability to produce a quantitative description of the growth rate-metabolism-community relationship lays a computational foundation for the study of a variety of aspects of communal metabolic life.