Since the reclassification of all life forms in three Domains (Archaea, Bacteria, Eukarya), the identity of their alleged forerunner (Last Universal Common Ancestor or LUCA) has been the subject of extensive controversies: progenote or already complex organism, prokaryote or protoeukaryote, thermophile or mesophile, product of a protracted progression from simple replicators to complex cells or born in the cradle of "catalytically closed" entities? We present a critical survey of the topic and suggest a scenario.
LUCA does not appear to have been a simple, primitive, hyperthermophilic prokaryote but rather a complex community of protoeukaryotes with an RNA genome, adapted to a broad range of moderate temperatures, genetically redundant, morphologically and metabolically diverse. LUCA's genetic redundancy predicts loss of paralogous gene copies in divergent lineages to be a significant source of phylogenetic anomalies, i.e. instances where a protein tree departs from the SSU-rRNA genealogy; consequently, horizontal gene transfer may not have the rampant character assumed by many. Examination of membrane lipids suggests LUCA had sn1,2 ester fatty acid lipids from which Archaea emerged from the outset as thermophilic by "thermoreduction," with a new type of membrane, composed of sn2,3 ether isoprenoid lipids; this occurred without major enzymatic reconversion. Bacteria emerged by reductive evolution from LUCA and some lineages further acquired extreme thermophily by convergent evolution. This scenario is compatible with the hypothesis that the RNA to DNA transition resulted from different viral invasions as proposed by Forterre. Beyond the controversy opposing "replication first" to "metabolism first", the predictive arguments of theories on "catalytic closure" or "compositional heredity" heavily weigh in favour of LUCA's ancestors having emerged as complex, self-replicating entities from which a genetic code arose under natural selection.
Life was born complex and the LUCA displayed that heritage. It had the "body" of a mesophilic eukaryote well before maturing by endosymbiosis into an organism adapted to an atmosphere rich in oxygen. Abundant indications suggest reductive evolution of this complex and heterogeneous entity towards the "prokaryotic" Domains Archaea and Bacteria. The word "prokaryote" should be abandoned because it is epistemologically unsound.
This article was reviewed by Anthony Poole, Patrick Forterre, and Nicolas Galtier.
Glutaminyl-tRNA synthetase and asparaginyl-tRNA synthetase evolved from glutamyl-tRNA synthetase and aspartyl-tRNA synthetase, respectively, after the split in the last universal communal ancestor (LUCA). Glutaminyl-tRNAGln and asparaginyl-tRNAAsn were likely formed in LUCA by amidation of the mischarged species, glutamyl-tRNAGln and aspartyl-tRNAAsn, by tRNA-dependent amidotransferases, as is still the case in most bacteria and all known archaea. The amidotransferase GatCAB is found in both domains of life while the heterodimeric amidotransferase, GatDE, is found only in Archaea. The GatB and GatE subunits belong to a unique protein family with Pet112 that is encoded in the nuclear genomes of numerous eukaryotes. GatE was thought to have evolved from GatB after the emergence of the modern lines of descent. Our phylogenetic analysis, though, places the split between GatE and GatB prior to the phylogenetic divide between Bacteria and Archaea and indicates that Pet112 is of mitochondrial origin. In addition, GatD appears to have emerged prior to the bacterial-archaeal phylogenetic divide. Thus, while GatDE is an archaeal signature protein, it likely was present in LUCA together with GatCAB. Archaea retained both amidotransferases while Bacteria emerged with only GatCAB. The presence of GatDE has favored a unique archaeal tRNAGln that may be preventing acquisition of glutaminyl-tRNA synthetase in Archaea. Archaeal GatCAB, on the other hand, has not favored a distinct tRNAAsn, suggesting tRNAAsn recognition is not a major barrier to the retention of asparaginyl-tRNA synthetase in more Archaea.
tRNA-dependent amidotransferase; GatCAB; GatDE; Pet112; LUCA
Despite recent advances in our understanding of diverse aspects of virus evolution, particularly on the epidemiological scale, revealing the ultimate origins of viruses has proven to be a more intractable problem. Herein, I review some current ideas on the evolutionary origins of viruses and assess how well these theories accord with what we know about the evolution of contemporary viruses. I note the growing evidence for the theory that viruses arose before the last universal cellular ancestor (LUCA). This ancient origin theory is supported by the presence of capsid architectures that are conserved among diverse RNA and DNA viruses and by the strongly inverse relationship between genome size and mutation rate across all replication systems, such that pre-LUCA genomes were probably both small and highly error prone and hence RNA virus-like. I also highlight the advances that are needed to come to a better understanding of virus origins, most notably the ability to accurately infer deep evolutionary history from the phylogenetic analysis of conserved protein structures.
RNA metabolism, broadly defined as the compendium of all processes that involve RNA, including transcription, processing and modification of transcripts, translation, RNA degradation and its regulation, is the central and most evolutionarily conserved part of cell physiology. A comprehensive, genome-wide census of all enzymatic and non-enzymatic protein domains involved in RNA metabolism was conducted by using sequence profile analysis and structural comparisons. Proteins related to RNA metabolism comprise from 3 to 11% of the complete protein repertoire in bacteria, archaea and eukaryotes, with the greatest fraction seen in parasitic bacteria with small genomes. Approximately one-half of protein domains involved in RNA metabolism are present in most, if not all, species from all three primary kingdoms and are traceable to the last universal common ancestor (LUCA). The principal features of LUCA’s RNA metabolism system were reconstructed by parsimony-based evolutionary analysis of all relevant groups of orthologous proteins. This reconstruction shows that LUCA possessed not only the basal translation system, but also the principal forms of RNA modification, such as methylation, pseudouridylation and thiouridylation, as well as simple mechanisms for polyadenylation and RNA degradation. Some of these ancient domains form paralogous groups whose evolution can be traced back in time beyond LUCA, towards low-specificity proteins, which probably functioned as cofactors for ribozymes within the RNA world framework. The main lineage-specific innovations of RNA metabolism systems were identified. The most notable phase of innovation in RNA metabolism coincides with the advent of eukaryotes and was brought about by the merge of the archaeal and bacterial systems via mitochondrial endosymbiosis, but also involved emergence of several new, eukaryote-specific RNA-binding domains. 
Subsequent, vast expansions of these domains mark the origin of alternative splicing in animals and probably in plants. In addition to the reconstruction of the evolutionary history of RNA metabolism, this analysis produced numerous functional predictions, e.g. of previously undetected enzymes of RNA modification.
The last universal common ancestor (LUCA) might have been either prokaryotic- or eukaryotic-like. Nevertheless, the universally distributed components rather suggest a LUCA consistent with the pre-cell theory of Kandler. The hypotheses for the origin of eukaryotes are briefly summarized. The models under which prokaryotes or their chimeras were direct ancestors of eukaryotes are criticized. It is proposed that the pre-karyote (a host entity for α-proteobacteria) was a remnant of the pre-cellular world, and was unlucky to have evolved a fusion-prohibiting cell surface, and thus could have evolved sex. The DNA damage checkpoint pathway could have represented the only pre-karyotic checkpoint control, allowing division only when DNA was completely replicated without mistakes. The fusion of two partially diploid (in S-phase blocked) pre-karyotes might have represented another repair strategy. After completing replication of both haploid sets, the DNA damage checkpoint would allow two subsequent rounds of fission. Alternatively, the pre-karyote might have possessed two membranes inherited from LUCA. Under this hypothesis, symbiotic α-proteobacterial ancestors of mitochondria might have ancestrally been selfish parasites of the pre-karyote intermembrane space, whose infection might have been analogous to infection of the Gram-negative bacterial periplasm by Bdellovibrio sp. It is suggested that the eukaryotic plasma membrane might be derived from the pre-karyote outer membrane and the nuclear/ER membrane might be derived from the pre-karyote inner membrane. Thus the nucleoplasm might be derived from pre-karyote cytoplasm and eukaryotic cytoplasm might be homologous to pre-karyote periplasm.
archaea; Bdellovibrio; endomembranes; evolution; ER; LACA; LECA; LUCA; meiosis; mitochondria; nucleus; phagocytosis; prokaryote
An evolutionary tree of key enzymes from the Complex-Iron-Sulfur-Molybdoenzyme (CISM) superfamily distinguishes “ancient” members, i.e. enzymes present already in the last universal common ancestor (LUCA) of prokaryotes, from more recently evolved subfamilies. The majority of the presented subfamilies and, as a consequence, the Molybdo-enzyme superfamily as a whole, appear to have existed in LUCA. The results are discussed with respect to the nature of bioenergetic substrates available to early life and to problems arising from the low solubility of molybdenum under conditions of the primordial Earth.
Despite progress in ancestral protein sequence reconstruction, much needs to be unraveled about the nature of the putative last common ancestral proteome that served as the prototype of all extant lifeforms. Here, we present data that indicate a steady decline (oil escape) in proteome hydrophobicity over species evolvedness (node number) evident in 272 diverse proteomes, which indicates a highly hydrophobic (oily) last common ancestor (LCA). This trend, obtained from simple considerations (free from sequence reconstruction methods), was corroborated by regression studies within homologous and orthologous protein clusters as well as phylogenetic estimates of the ancestral oil content. While indicating an inherent irreversibility in molecular evolution, oil escape also serves as a rare and universal reaction-coordinate for evolution (reinforcing Darwin's principle of Common Descent), and may prove important in matters such as (i) explaining the emergence of intrinsically disordered proteins, (ii) developing composition- and speciation-based “global” molecular clocks, and (iii) improving the statistical methods for ancestral sequence reconstruction.
Although of importance to both evolution and protein design, the manner in which the first proteome came to be, and the actual features of the earliest ancestral proteomes are both unknown. Through the analysis of diverse proteomes, we provide glimpses into the composition of the last common ancestor (LUCA) of all lifeforms, which indicate that the earliest/last common ancestor had a proteome that was highly hydrophobic/oily. Notably, the evidence presented (a) indicates that proteomes of all species ranging from bacteria to mammals appear to adhere to the same universal constraint (“oil escape”) set into motion by the last common ancestor more than 3.5 billion years ago, (b) indicates the presence of a previously untapped global (composition-level) molecular clock, and (c) strengthens the non-equilibrium/directional view of amino acid substitutions that challenges central dogmas regarding reversibility in molecular evolution.
The cystatin superfamily comprises cysteine protease inhibitors that play key regulatory roles in protein degradation processes. Although they have been the subject of many studies, little is known about their genesis, evolution and functional diversification. Our aim has been to obtain a comprehensive insight into their origin, distribution, diversity, evolution and classification in Eukaryota, Bacteria and Archaea.
We have identified in silico the full complement of the cystatin superfamily in more than 2100 prokaryotic and eukaryotic genomes. The analysis of numerous eukaryotic genomes has provided strong evidence for the emergence of this superfamily in the ancestor of eukaryotes. The progenitor of this superfamily was most probably intracellular and lacked a signal peptide and disulfide bridges, much like the extant Giardia cystatin. A primordial gene duplication produced two ancestral eukaryotic lineages, cystatins and stefins. While stefins remain encoded by a single or a small number of genes throughout the eukaryotes, the cystatins have undergone a more complex and dynamic evolution through numerous gene and domain duplications. In the cystatin superfamily we discovered twenty vertebrate-specific and three angiosperm-specific orthologous families, indicating that functional diversification has occurred only in multicellular eukaryotes. In vertebrate orthologous families, the prevailing trends were loss of the ancestral inhibitory activity and acquisition of novel functions in innate immunity. Bacterial cystatins and stefins may be emergency inhibitors that enable survival of bacteria in the host, defending them from the host's proteolytic activity.
This study challenges the current view on the classification, origin and evolution of the cystatin superfamily and provides valuable insights into their functional diversification. The findings of this comprehensive study provide guides for future structural and evolutionary studies of the cystatin superfamily as well as of other protease inhibitors and proteases.
In the Universe, oxygen is the third most widespread element, while on Earth it is the most abundant one. Moreover, oxygen is a major constituent of all biopolymers fundamental to living organisms. Besides O2, reactive oxygen species (ROS), among them hydrogen peroxide (H2O2), are also important reactants in the present aerobic metabolism. According to a widely accepted hypothesis, aerobic metabolism and many other reactions/pathways involving O2 appeared after the evolution of oxygenic photosynthesis. In this study, the hypothesis was formulated that the Last Universal Common Ancestor (LUCA) was at least able to tolerate O2 and detoxify ROS in a primordial environment. A comparative analysis was carried out of a number of the O2- and H2O2-involving metabolic reactions that occur in strict anaerobes, facultative anaerobes, and aerobes. The results indicate that LUCA most likely possessed O2- and H2O2-involving pathways, mainly reactions to remove ROS, and had, at least in part, the components of aerobic respiration. Based on this, the presence of a low, but significant, quantity of H2O2 and O2 should be taken into account in theoretical models of the early Archean atmosphere and oceans and the evolution of life. It is suggested that the early metabolism involving O2/H2O2 was a key adaptation of LUCA to already existing weakly oxic zones in Earth's primordial environment. Key Words: Hydrogen peroxide—Oxygen—Origin of life—Photosynthesis—Superoxide dismutase—Superoxide reductase. Astrobiology 12, 775–784.
Organisms represented by the root of the universal evolutionary tree were most likely complex cells with a sophisticated protein translation system and a DNA genome encoding hundreds of genes. The growth of bioinformatics data from taxonomically diverse organisms has made it possible to infer the likely properties of early life in greater detail. Here we present LUCApedia (http://eeb.princeton.edu/lucapedia), a unified framework for simultaneously evaluating multiple data sets related to the Last Universal Common Ancestor (LUCA) and its predecessors. This unification is achieved by mapping eleven such data sets onto UniProt, KEGG and BioCyc IDs. LUCApedia may be used to rapidly acquire evidence that a certain gene or set of genes is ancient, to examine the early evolution of metabolic pathways, or to test specific hypotheses related to ancient life by corroborating them against the rest of the database.
Single copy genes, universally distributed across the three domains of life and encoding mostly ancient parts of the translation machinery, are thought to be only rarely subjected to horizontal gene transfer (HGT). Indeed it has been proposed to have occurred in only a few genes and implies a rare, probably not advantageous event in which an ortholog displaces the original gene and has to function in a foreign context (orthologous gene displacement, OGD). Here, we have utilised an automatic method to identify HGT based on a conservative statistical approach capable of robustly assigning both donors and acceptors. Applied to 40 universally single copy genes we found that as many as 68 HGTs (implying OGDs) have occurred in these genes with a rate of 1.7 per family since the last universal common ancestor (LUCA). We examined a number of factors that have been claimed to be fundamental to HGT in general and tested their validity in the subset of universally distributed single copy genes. We found that differing functional constraints impact rates of OGD and the more evolutionarily distant the donor and acceptor, the less likely an OGD is to occur. Furthermore, species with larger genomes are more likely to be subjected to OGD. Most importantly, regardless of the trends above, the number of OGDs increases linearly with time, indicating a neutral, constant rate. This suggests that levels of HGT above this rate may be indicative of positively selected transfers that may allow niche adaptation or bestow other benefits to the recipient organism.
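The per-family rate quoted above follows directly from the two counts given in the abstract (68 inferred transfers across 40 gene families). A minimal sketch of that arithmetic is shown here; the function name `transfers_per_family` is hypothetical and is not taken from the study.

```python
def transfers_per_family(n_transfers: int, n_families: int) -> float:
    """Average number of inferred transfer events per gene family."""
    if n_families <= 0:
        raise ValueError("n_families must be positive")
    return n_transfers / n_families


# Counts as reported in the abstract: 68 HGTs in 40 universal single copy genes.
rate = transfers_per_family(68, 40)
print(f"{rate:.1f} transfers per family")
```

This is only an average over families; the abstract itself notes that functional constraints, donor-acceptor distance and genome size modulate where the events fall.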
Two theories for the origin of animal life cycles with planktotrophic larvae are now discussed seriously: The terminal addition theory proposes a holopelagic, planktotrophic gastraea as the ancestor of the eumetazoans with addition of benthic adult stages and retention of the planktotrophic stages as larvae, i.e. the ancestral life cycles were indirect. The intercalation theory now proposes a benthic, deposit-feeding gastraea as the bilaterian ancestor with a direct development, and with planktotrophic larvae evolving independently in numerous lineages through specializations of juveniles.
Information from the fossil record, from mapping of developmental types onto known phylogenies, from occurrence of apical organs, and from genetics gives no direct information about the ancestral eumetazoan life cycle; however, there are plenty of examples of evolution from an indirect development to direct development, and no unequivocal example of evolution in the opposite direction. Analyses of scenarios for the two types of evolution are highly informative. The evolution of the indirect spiralian life cycle with a trochophora larva from a planktotrophic gastraea is explained by the trochophora theory as a continuous series of ancestors, where each evolutionary step had an adaptational advantage. The loss of ciliated larvae in the ecdysozoans is associated with the loss of outer ciliated epithelia. A scenario for the intercalation theory shows the origin of the planktotrophic larvae of the spiralians through a series of specializations of the general ciliation of the juvenile. The early steps associated with the enhancement of swimming seem probable, but the following steps which should lead to the complicated downstream-collecting ciliary system are without any advantage, or even seem disadvantageous, until the whole structure is functional. None of the theories account for the origin of the ancestral deuterostome (ambulacrarian) life cycle.
All the available information is strongly in favor of multiple evolution of non-planktotrophic development, and only the terminal addition theory is in accordance with the Darwinian theory by explaining the evolution through continuous series of adaptational changes. This implies that the ancestor of the eumetazoans was a holopelagic, planktotrophic gastraea, and that the adult stages of cnidarians (sessile) and bilaterians (creeping) were later additions to the life cycle. It further implies that the various larval types are of considerable phylogenetic value.
Larvae; Evolution; Adaptation; Planktotrophy; Gastraea; Trochaea; Dipleurula
The root of the tree of life has been a holy grail ever since Darwin first used the tree as a metaphor for evolution. New methods seek to narrow down the location of the root by excluding it from branches of the tree of life. This is done by finding traits that must be derived, and excluding the root from the taxa those traits cover. However the two most comprehensive attempts at this strategy, performed by Cavalier-Smith and Lake et al., have excluded each other's rootings.
The indel polarizations of Lake et al. rely on high quality alignments between paralogs that diverged before the last universal common ancestor (LUCA). Therefore, sequence alignment artifacts may skew their conclusions. We have reviewed their data using protein structure information where available. Several of the conclusions are quite different when viewed in the light of structure which is conserved over longer evolutionary time scales than sequence. We argue there is no polarization that excludes the root from all Gram-negatives, and that polarizations robustly exclude the root from the Archaea.
We conclude that there is no contradiction between the polarization datasets. The combination of these datasets excludes the root from every possible position except near the Chloroflexi.
This article was reviewed by Greg Fournier (nominated by J. Peter Gogarten), Purificación López-García, and Eugene Koonin.
Following the publication of the Origin of Species in 1859, many naturalists adopted the idea that living organisms were the historical outcome of gradual transformation of lifeless matter. These views soon merged with the developments of biochemistry and cell biology and led to proposals in which the origin of protoplasm was equated with the origin of life. The heterotrophic origin of life proposed by Oparin and Haldane in the 1920s was part of this tradition, which Oparin enriched by transforming the discussion of the emergence of the first cells into a workable multidisciplinary research program.
On the other hand, the scientific trend toward understanding biological phenomena at the molecular level led authors like Troland, Muller, and others to propose that single molecules or viruses represented primordial living systems. The contrast between these opposing views on the origin of life represents not only contrasting views of the nature of life itself, but also major ideological discussions that reached a surprising intensity in the years following Stanley Miller’s seminal result which showed the ease with which organic compounds of biochemical significance could be synthesized under putative primitive conditions. In fact, during the years following the Miller experiment, attempts to understand the origin of life were strongly influenced by research on DNA replication and protein biosynthesis, and, in socio-political terms, by the atmosphere created by Cold War tensions.
The catalytic versatility of RNA molecules clearly merits a critical reappraisal of Muller’s viewpoint. However, the discovery of ribozymes does not imply that autocatalytic nucleic acid molecules ready to be used as primordial genes were floating in the primitive oceans, or that the RNA world emerged completely assembled from simple precursors present in the prebiotic soup. The evidence supporting the presence of a wide range of organic molecules on the primitive Earth, including membrane-forming compounds, suggests that the evolution of membrane-bounded molecular systems preceded cellular life on our planet, and that life is the evolutionary outcome of a process, not of a single, fortuitous event.
Research on life's origins started with naturalists following in Darwin's footsteps. It has since given us the “prebiotic soup,” the “RNA world,” and the notion that life resulted from a process, not an event.
The discovery of giant viruses with genome and physical size comparable to cellular organisms, remnants of protein translation machinery and virus-specific parasites (virophages) has raised intriguing questions about their origin. Evidence advocates for their inclusion into global phylogenomic studies and their consideration as a distinct and ancient form of life.
Here we reconstruct phylogenies describing the evolution of proteomes and protein domain structures of cellular organisms and double-stranded DNA viruses with medium-to-very-large proteomes (giant viruses). Trees of proteomes define viruses as a ‘fourth supergroup’ along with superkingdoms Archaea, Bacteria, and Eukarya. Trees of domains indicate they have evolved via massive and primordial reductive evolutionary processes. The distribution of domain structures suggests giant viruses harbor a significant number of protein domains including those with no cellular representation. The genomic and structural diversity embedded in the viral proteomes is comparable to the cellular proteomes of organisms with parasitic lifestyles. Since viral domains are widespread among cellular species, we propose that viruses mediate gene transfer between cells and crucially enhance biodiversity.
Results call for a change in the way viruses are perceived. They likely represent a distinct form of life that either predated or coexisted with the last universal common ancestor (LUCA) and constitute a very crucial part of our planet’s biosphere.
The nucleo-cytoplasmic large DNA viruses (NCLDV) constitute an apparently monophyletic group that consists of 6 families of viruses infecting a broad variety of eukaryotes. A comprehensive genome comparison and maximum-likelihood reconstruction of NCLDV evolution reveal a set of approximately 50 conserved genes that can be tentatively mapped to the genome of the common ancestor of this class of eukaryotic viruses. We address the origins and evolution of NCLDV.
Phylogenetic analysis indicates that some of the major clades of NCLDV infect diverse animals and protists, suggestive of early radiation of the NCLDV, possibly concomitant with eukaryogenesis. The core NCLDV genes seem to have originated from different sources including homologous genes of bacteriophages, bacteria and eukaryotes. These observations are compatible with a scenario of the origin of the NCLDV at an early stage of the evolution of eukaryotes through extensive mixing of genes from widely different genomes.
The common ancestor of the NCLDV probably evolved from a bacteriophage as a result of recruitment of numerous eukaryotic and some bacterial genes, and concomitant loss of the majority of phage genes except for a small core of genes coding for proteins essential for virus genome replication and virion formation.
Bacteriophage; Eukaryogenesis; Nucleo-cytoplasmic large DNA viruses, evolution; Phylogenetic analysis
Since the late 1970s, determining the phylogenetic relationships among the contemporary domains of life, the Archaea (archaebacteria), Bacteria (eubacteria), and Eucarya (eukaryotes), has been central to the study of early cellular evolution. The two salient issues surrounding the universal tree of life are whether all three domains are monophyletic (i.e., all equivalent in taxonomic rank) and where the root of the universal tree lies. Evaluation of the status of the Archaea has become key to answering these questions. This review considers our cumulative knowledge about the Archaea in relationship to the Bacteria and Eucarya. Particular attention is paid to the recent use of molecular phylogenetic approaches to reconstructing the tree of life. In this regard, the phylogenetic analyses of more than 60 proteins are reviewed and presented in the context of their participation in major biochemical pathways. Although many gene trees are incongruent, the majority do suggest a sisterhood between Archaea and Eucarya. Altering this general pattern of gene evolution are two kinds of potential interdomain gene transferrals. One horizontal gene exchange might have involved the gram-positive Bacteria and the Archaea, while the other might have occurred between proteobacteria and eukaryotes and might have been mediated by endosymbiosis.
Gene duplication is a crucial mechanism of evolutionary innovation. A substantial fraction of eukaryotic genomes consists of paralogous gene families. We assess the extent of ancestral paralogy, which dates back to the last common ancestor of all eukaryotes, and examine the origins of the ancestral paralogs and their potential roles in the emergence of the eukaryotic cell complexity. A parsimonious reconstruction of ancestral gene repertoires shows that 4137 orthologous gene sets in the last eukaryotic common ancestor (LECA) map back to 2150 orthologous sets in the hypothetical first eukaryotic common ancestor (FECA) [paralogy quotient (PQ) of 1.92]. Analogous reconstructions show significantly lower levels of paralogy in prokaryotes, 1.19 for archaea and 1.25 for bacteria. The only functional class of eukaryotic proteins with a significant excess of paralogous clusters over the mean includes molecular chaperones and proteins with related functions. Almost all genes in this category underwent multiple duplications during early eukaryotic evolution. In structural terms, the most prominent sets of paralogs are superstructure-forming proteins with repetitive domains, such as WD-40 and TPR. In addition to the true ancestral paralogs which evolved via duplication at the onset of eukaryotic evolution, numerous pseudoparalogs were detected, i.e. homologous genes that apparently were acquired by early eukaryotes via different routes, including horizontal gene transfer (HGT) from diverse bacteria. The results of this study demonstrate a major increase in the level of gene paralogy as a hallmark of the early evolution of eukaryotes.
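The paralogy quotient reported above appears to be the ratio of orthologous gene sets in the descendant ancestor (LECA) to the sets they map back to in the earlier ancestor (FECA). A minimal sketch under that assumed definition follows; the function name `paralogy_quotient` is hypothetical, and only the two counts come from the abstract.

```python
def paralogy_quotient(descendant_sets: int, ancestral_sets: int) -> float:
    """Ratio of orthologous sets in the later ancestor to the sets
    they trace back to in the earlier ancestor (assumed definition)."""
    if ancestral_sets <= 0:
        raise ValueError("ancestral_sets must be positive")
    return descendant_sets / ancestral_sets


# Counts from the abstract: 4137 LECA sets map back to 2150 FECA sets.
pq = paralogy_quotient(4137, 2150)
print(f"PQ = {pq:.2f}")
```

Under this reading, a quotient near 1 means almost no ancestral duplication, which is how the lower prokaryotic values (1.19 and 1.25) contrast with the eukaryotic one.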
All theories about the origin and evolution of membrane bound cells necessarily have to cope with the nature of the last common ancestor of cellular life. One of the most important aspects of this ancestor, whether it had a closed biological membrane or not, has recently been intensely debated. Having a consensus about it would be an important step towards an eventual (though probably still remote) synthesis of the best elements of the current multitude of cell evolution models. Here I analyse the structural and functional conservation of the few universally distributed proteins that were undoubtedly present in the last common ancestor and that carry out membrane-associated functions. These include the SecY subunit of the protein-conducting channel, the signal recognition particle, the signal recognition particle receptor, the signal peptidase, and the proton ATPase. The conserved structural and functional aspects of these proteins indicate that the last common ancestor was associated with a hydrophobic layer with two hydrophilic sides (an inside and an outside) that had a full-fledged and asymmetric protein insertion and translocation machinery and served as a permeability barrier for protons and other small molecules. It is difficult to escape the conclusion that the last common ancestor had a closed biological membrane from which all cellular membranes evolved.
Glycosylation is an important aspect of epigenetic regulation. Glycosyltransferase is a key enzyme in the biosynthesis of glycans, which glycosylates more than half of all proteins in eukaryotes and is involved in a wide range of biological processes. It has been suggested previously that homooligomerization in glycosyltransferases and other proteins might be crucial for their function. In this study, we explore functional homooligomeric states of glycosyltransferases in various organisms, trace their evolution and perform comparative analyses to find structural features which can mediate or disrupt the formation of different homooligomers. First we make a structure-based classification of the diverse superfamily of glycosyltransferases and confirm that the majority of the structures are indeed clustered into the GT-A or GT-B folds. We find that homooligomeric glycosyltransferases appear to be as ancient as monomeric glycosyltransferases and go back in evolution to the last universal common ancestor (LUCA). Moreover, we show that interface residues have significant bias to be gapped out or unaligned in the monomers implying that they might represent features crucial for oligomer formation. Structural analysis of these features reveals that the vast majority of them represent loops, terminal regions and helices indicating that these secondary structure elements mediate the formation of glycosyltransferases' homooligomers and directly contribute to the specific binding. We also observe relatively short protein regions which disrupt the homodimer interactions although such cases are rare. These results suggest that relatively small structural changes in the non-conserved regions may contribute to the formation of different functional oligomeric states and might be important in regulation of enzyme activity through homooligomerization.
Glycosyltransferase; Homodimer; Homooligomerization; Interface; Protein structural evolution
Since Haldane first noticed an excess of paternally derived mutations, it has
been considered that most mutations derive from errors during germ line
replication. Miyata et al. (1987) proposed that differences in the rate of
neutral evolution on the X, Y, and autosomes can be employed to measure the extent of
this male bias. This commonly applied method assumes replication to be the sole
source of between-chromosome variation in substitution rates. We propose a
simple test of this assumption: If true, estimates of the male bias should be
independent of which two chromosomal classes are compared. Prior evidence from
rodents suggested that this might not be true, but conclusions were limited by a
lack of rat Y-linked sequence. We therefore sequenced two rat Y-linked bacterial
artificial chromosomes and determined evolutionary rate by comparison with
mouse. To estimate rates we consider both intronic and synonymous sites.
Surprisingly, for both data sets the prediction of congruent estimates of
α is strongly rejected. Indeed, some comparisons suggest a female bias
with autosomes evolving faster than Y-linked sequence. We conclude that the
method of Miyata et al. (1987) has the potential to provide incorrect estimates.
Correcting the method requires understanding of the other causes of substitution
that might differ between chromosomal classes. One possible cause is
recombination-associated substitution bias for which we find some evidence. We
note that if, as some suggest, this association is predominantly owing to male
recombination, the high estimates of α seen in birds are to be expected
as Z chromosomes recombine in males.
male-mutation bias; male-driven evolution; mutation; recombination; introns; rodents
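The congruence test described above can be illustrated with a minimal sketch. It assumes the standard replication-only model behind the method of Miyata et al. (1987): male germ lines mutate α times faster than female ones, autosomes spend half their time in each sex, X chromosomes spend two thirds of their time in females, and Y chromosomes are always in males. The function names are hypothetical and chosen only for illustration, and the example ratios are generated from the model itself, not taken from the rat and mouse data.

```python
def expected_ratios(alpha):
    """Substitution-rate ratios predicted if replication is the only
    source of between-chromosome rate variation (female rate set to 1)."""
    autosome = (1 + alpha) / 2   # half the time in each sex
    x_chrom = (2 + alpha) / 3    # two thirds of the time in females
    y_chrom = alpha              # always in males
    return {
        "Y/A": y_chrom / autosome,
        "X/A": x_chrom / autosome,
        "Y/X": y_chrom / x_chrom,
    }


def alpha_from_ya(ratio):
    """Invert Y/A = 2*alpha / (1 + alpha)."""
    return ratio / (2 - ratio)


def alpha_from_xa(ratio):
    """Invert X/A = 2*(2 + alpha) / (3*(1 + alpha))."""
    return (4 - 3 * ratio) / (3 * ratio - 2)


def alpha_from_yx(ratio):
    """Invert Y/X = 3*alpha / (2 + alpha)."""
    return 2 * ratio / (3 - ratio)


def alpha_estimates(ratios):
    """Return one alpha estimate per pairwise chromosome comparison."""
    return {
        "Y/A": alpha_from_ya(ratios["Y/A"]),
        "X/A": alpha_from_xa(ratios["X/A"]),
        "Y/X": alpha_from_yx(ratios["Y/X"]),
    }


if __name__ == "__main__":
    # Ratios generated under the model with alpha = 2 are mutually
    # congruent: each comparison returns the same alpha.
    model_ratios = expected_ratios(2.0)
    for name, estimate in alpha_estimates(model_ratios).items():
        print(name, round(estimate, 6))
```

Under the model, all three comparisons return the same α. Observed ratios that yield clearly different estimates, or a Y/A ratio below 1 (which gives α below 1, an apparent female bias), indicate that something other than replication contributes to the rate differences between chromosomal classes.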
The ubiquity of mechanosensitive (MS) channels triggered a search for
their functional homologs in Archaea. Archaeal MS channels were found
to share a common ancestral origin with bacterial MS channels of large
and small conductance, and sequence homology with several proteins
that most likely function as MS ion channels in prokaryotic and
eukaryotic cell-walled organisms. Although bacterial and archaeal MS
channels differ in conductive and mechanosensitive properties, they
share similar gating mechanisms triggered by mechanical force
transmitted via the lipid bilayer. In this review, we suggest that MS
channels of Archaea can bridge the evolutionary gap between bacterial
and eukaryotic MS channels, and that MS channels of Bacteria, Archaea
and cell-walled Eukarya may serve similar physiological functions and
may have evolved to protect the fragile cellular membranes in these
organisms from excessive dilation and rupture upon osmotic challenge.
Arabidopsis; mechanosensitivity; phylogeny; yeast
Topoisomerases are essential enzymes that solve topological problems arising from the double-helical structure of DNA. As a consequence, one might naively expect to find homologous topoisomerases in all cellular organisms, dating back to their last common ancestor. However, as observed for other enzymes working with DNA, this is not the case. Phylogenomic analyses indicate that different sets of topoisomerases were present in the most recent common ancestors of each of the three cellular domains of life (some of them being common to two or three domains), whereas other topoisomerase families or subfamilies were acquired in a particular domain, or even a particular lineage, by horizontal gene transfer. Interestingly, two groups of viruses encode topoisomerases that are only distantly related to their cellular counterparts. To explain these observations, we suggest that topoisomerases originated in an ancestral virosphere, and that various subfamilies were later transferred independently to different ancient cellular lineages. We also propose that topoisomerases have played a critical role in the origin of modern genomes and in the emergence of the three cellular domains.
Explaining the origin of viruses remains an important challenge for evolutionary biology. Previous explanatory frameworks described viruses as founders of cellular life, as parasitic reductive products of ancient cellular organisms or as escapees of modern genomes. Each of these frameworks endows viruses with distinct molecular, cellular, dynamic and emergent properties that carry broad and important implications for many disciplines, including biology, ecology and epidemiology. In a recent genome-wide structural phylogenomic analysis, we have shown that large-to-medium-sized viruses coevolved with cellular ancestors and have chosen the evolutionary reductive route. Here we interpret these results and provide a parsimonious hypothesis for the origin of viruses that is supported by molecular data and objective evolutionary bioinformatic approaches. Results suggest two important phases in the evolution of viruses: (1) origin from primordial cells and coexistence with cellular ancestors, and (2) prolonged pressure of genome reduction and relatively late adaptation to the parasitic lifestyle once virions and diversified cellular life took over the planet. Under this evolutionary model, new viral lineages can evolve from existing cellular parasites and enhance the diversity of the world’s virosphere.
giant viruses; parasitism; phylogenomics; protein domains; reductive evolution
Protein domains represent the basic units in the evolution of proteins. Domain duplication and shuffling by recombination and fusion, followed by divergence, are the most common mechanisms in this process. Such domain fusion and recombination events are predicted to occur only once for a given multidomain architecture. However, other scenarios may be relevant in the evolution of specific proteins, such as convergent evolution of multidomain architectures. With this in mind, we study glutaredoxin (GRX) domains, because these domains of approximately one hundred amino acids are widespread in archaea, bacteria and eukaryotes and participate in fusion proteins. GRXs are responsible for the reduction of protein disulfides or glutathione-protein mixed disulfides and are involved in cellular redox regulation, although their specific roles and targets are often unclear.
In this work we analyze the distribution and evolution of GRX proteins in archaea, bacteria and eukaryotes. We study over one thousand GRX proteins, each containing at least one GRX domain, from hundreds of different organisms and trace the origin and evolution of the GRX domain within the tree of life.
Our results suggest that single-domain GRX proteins of the CGFS and CPYC classes have each evolved through duplication and divergence from one initial gene that was present in the last common ancestor of all organisms. Remarkably, we identify a case of convergent evolution in domain architecture that involves the GRX domain. Two independent recombination events joining a thioredoxin (TRX) domain to a GRX domain are likely to have occurred, which is an exception to the dominant mechanism of domain architecture evolution.