Since the reclassification of all life forms in three Domains (Archaea, Bacteria, Eukarya), the identity of their alleged forerunner (Last Universal Common Ancestor or LUCA) has been the subject of extensive controversies: progenote or already complex organism, prokaryote or protoeukaryote, thermophile or mesophile, product of a protracted progression from simple replicators to complex cells or born in the cradle of "catalytically closed" entities? We present a critical survey of the topic and suggest a scenario.
LUCA does not appear to have been a simple, primitive, hyperthermophilic prokaryote but rather a complex community of protoeukaryotes with an RNA genome, adapted to a broad range of moderate temperatures, genetically redundant, morphologically and metabolically diverse. LUCA's genetic redundancy predicts loss of paralogous gene copies in divergent lineages to be a significant source of phylogenetic anomalies, i.e. instances where a protein tree departs from the SSU-rRNA genealogy; consequently, horizontal gene transfer may not have the rampant character assumed by many. Examination of membrane lipids suggests LUCA had sn1,2 ester fatty acid lipids from which Archaea emerged from the outset as thermophilic by "thermoreduction," with a new type of membrane, composed of sn2,3 ether isoprenoid lipids; this occurred without major enzymatic reconversion. Bacteria emerged by reductive evolution from LUCA and some lineages further acquired extreme thermophily by convergent evolution. This scenario is compatible with the hypothesis that the RNA to DNA transition resulted from different viral invasions as proposed by Forterre. Beyond the controversy opposing "replication first" to "metabolism first", the predictive arguments of theories on "catalytic closure" or "compositional heredity" heavily weigh in favour of LUCA's ancestors having emerged as complex, self-replicating entities from which a genetic code arose under natural selection.
Life was born complex and the LUCA displayed that heritage. It had the "body" of a mesophilic eukaryote well before maturing by endosymbiosis into an organism adapted to an atmosphere rich in oxygen. Abundant indications suggest reductive evolution of this complex and heterogeneous entity towards the "prokaryotic" Domains Archaea and Bacteria. The word "prokaryote" should be abandoned because it is epistemologically unsound.
This article was reviewed by Anthony Poole, Patrick Forterre, and Nicolas Galtier.
Glutaminyl-tRNA synthetase and asparaginyl-tRNA synthetase evolved from glutamyl-tRNA synthetase and aspartyl-tRNA synthetase, respectively, after the split in the last universal communal ancestor (LUCA). Glutaminyl-tRNAGln and asparaginyl-tRNAAsn were likely formed in LUCA by amidation of the mischarged species, glutamyl-tRNAGln and aspartyl-tRNAAsn, by tRNA-dependent amidotransferases as is still the case in most bacteria and all known archaea. The amidotransferase GatCAB is found in both domains of life while the heterodimeric amidotransferase, GatDE, is found only in Archaea. The GatB and GatE subunits belong to a unique protein family with Pet112 that is encoded in the nuclear genomes of numerous eukaryotes. GatE was thought to have evolved from GatB after the emergence of the modern lines of descent. Our phylogenetic analysis, though, places the split between GatE and GatB prior to the phylogenetic divide between Bacteria and Archaea and indicates that Pet112 is of mitochondrial origin. In addition, GatD appears to have emerged prior to the bacterial-archaeal phylogenetic divide. Thus, while GatDE is an archaeal signature protein, it likely was present in LUCA together with GatCAB. Archaea retained both amidotransferases while Bacteria emerged with only GatCAB. The presence of GatDE has favored a unique archaeal tRNAGln that may be preventing acquisition of glutaminyl-tRNA synthetase in Archaea. Archaeal GatCAB, on the other hand, has not favored a distinct tRNAAsn, suggesting tRNAAsn recognition is not a major barrier to the retention of asparaginyl-tRNA synthetase in more Archaea.
tRNA-dependent amidotransferase; GatCAB; GatDE; Pet112; LUCA
Ribonucleotide reduction is the only de novo pathway for synthesis of deoxyribonucleotides, the building blocks of DNA. The reaction is catalysed by ribonucleotide reductases (RNRs), an ancient enzyme family comprising three classes. The classes have distinct operational constraints and are broadly distributed across organisms from all three domains, though few class I RNRs have been identified in archaeal genomes, and classes II and III likewise appear rare across eukaryotes. In this study, we examine whether this distribution is best explained by presence of all three classes in the Last Universal Common Ancestor (LUCA), or by horizontal gene transfer (HGT) of RNR genes. We also examine to what extent environmental factors may have impacted the distribution of RNR classes.
Our phylogenies show that the Last Eukaryotic Common Ancestor (LECA) possessed a class I RNR, but that the eukaryotic class I enzymes are not directly descended from class I RNRs in Archaea. Instead, our results indicate that archaeal class I RNR genes have been independently transferred from bacteria on two occasions. While LECA possessed a class I RNR, our trees indicate that this is ultimately bacterial in origin. We also find convincing evidence that eukaryotic class I RNR has been transferred to the Bacteroidetes, providing a stunning example of HGT from eukaryotes back to Bacteria. Based on our phylogenies and available genetic and genomic evidence, class II and III RNRs in eukaryotes also appear to have been transferred from Bacteria, with subsequent within-domain transfer between distantly-related eukaryotes. Under the three-domains hypothesis the RNR present in the last common ancestor of Archaea and eukaryotes appears, through a process of elimination, to have been a dimeric class II RNR, though limited sampling of eukaryotes precludes a firm conclusion as the data may be equally well accounted for by HGT.
Horizontal gene transfer has clearly played an important role in the evolution of the RNR repertoire of organisms from all three domains of life. Our results clearly show that class I RNRs have spread to Archaea and eukaryotes via transfers from the bacterial domain, indicating that class I likely evolved in the Bacteria. However, against the backdrop of ongoing transfers, it is harder to establish whether class II or III RNRs were present in the LUCA, despite the fact that ribonucleotide reduction is an essential cellular reaction and was pivotal to the transition from RNA to DNA genomes. Instead, a general pattern of ongoing horizontal transmission emerges wherein environmental and enzyme operational constraints, especially the presence or absence of oxygen, are likely to be major determinants of the RNR repertoire of genomes.
Marinomonas posidonica IVIA-Po-181T Lucas-Elío et al. 2011 belongs to the family Oceanospirillaceae within the phylum Proteobacteria. Different species of the genus Marinomonas can be readily isolated from the seagrass Posidonia oceanica. M. posidonica is among the most abundant species of the genus detected in the cultured microbiota of P. oceanica, suggesting a close relationship with this plant, which has a great ecological value in the Mediterranean Sea, covering an estimated surface of 38,000 km². Here we describe the genomic features of M. posidonica. The 3,899,940 bp long genome harbors 3,544 protein-coding genes and 107 RNA genes and is a part of the Genomic
Aerobic; Gram-negative; marine; plant-associated
Glycosylation is an important aspect of epigenetic regulation. Glycosyltransferases are key enzymes in the biosynthesis of glycans; they glycosylate more than half of all proteins in eukaryotes and are involved in a wide range of biological processes. It has been suggested previously that homooligomerization in glycosyltransferases and other proteins might be crucial for their function. In this study, we explore functional homooligomeric states of glycosyltransferases in various organisms, trace their evolution and perform comparative analyses to find structural features which can mediate or disrupt the formation of different homooligomers. First, we make a structure-based classification of the diverse superfamily of glycosyltransferases and confirm that the majority of the structures are indeed clustered into the GT-A or GT-B folds. We find that homooligomeric glycosyltransferases appear to be as ancient as monomeric glycosyltransferases and go back in evolution to the last universal common ancestor (LUCA). Moreover, we show that interface residues have a significant bias to be gapped out or unaligned in the monomers, implying that they might represent features crucial for oligomer formation. Structural analysis of these features reveals that the vast majority of them represent loops, terminal regions and helices, indicating that these secondary structure elements mediate the formation of glycosyltransferases' homooligomers and directly contribute to the specific binding. We also observe relatively short protein regions which disrupt the homodimer interactions, although such cases are rare. These results suggest that relatively small structural changes in the non-conserved regions may contribute to the formation of different functional oligomeric states and might be important in regulation of enzyme activity through homooligomerization.
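The observation that interface residues tend to be gapped out in monomers can be illustrated with a small sketch. It assumes a pairwise alignment between an oligomeric and a monomeric homolog and a set of alignment columns known to lie in the oligomer interface; the sequences, column indices and the function name `gap_fraction` are hypothetical, and no significance test is attempted here.

```python
def gap_fraction(monomer_aln: str, columns) -> float:
    """Fraction of the given alignment columns where the monomer has a gap."""
    columns = list(columns)
    if not columns:
        return 0.0
    return sum(monomer_aln[c] == "-" for c in columns) / len(columns)

# Toy aligned sequences (hypothetical, equal length, '-' marks a gap).
oligomer_aln = "MKTAYIAKQR"
monomer_aln = "MK--YIA-QR"

# Hypothetical 0-based alignment columns that form the oligomer interface.
interface_cols = {2, 3, 7, 8}
other_cols = set(range(len(oligomer_aln))) - interface_cols

interface_gapped = gap_fraction(monomer_aln, interface_cols)
background_gapped = gap_fraction(monomer_aln, other_cols)
print(interface_gapped, background_gapped)
```

A higher gap fraction at interface columns than elsewhere would be in line with the bias described above; a real analysis would need many protein pairs and a statistical test.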
Glycosyltransferase; Homodimer; Homooligomerization; Interface; Protein structural evolution
It is proposed that the pre-cellular stage of biological evolution unraveled within networks of inorganic compartments that harbored a diverse mix of virus-like genetic elements. This stage of evolution might comprise the Last Universal Cellular Ancestor (LUCA) that more appropriately could be denoted Last Universal Cellular Ancestral State (LUCAS). This scenario for the origin of cellular life recapitulates the early ideas of J. B. S. Haldane sketched in his classic 1928 essay. However, unlike in Haldane’s day, there is now considerable support for this scenario from four major lines of comparative-genomic evidence: i) lack of homology between the core components of the DNA replication systems of the two primary lines of descent of cellular life forms, archaea and bacteria, ii) distinct membrane chemistries and lack of homology between the enzymes of lipid biosynthesis in archaea and bacteria, iii) spread of several viral hallmark genes, which encode proteins with key functions in viral replication and morphogenesis, among numerous and extremely diverse groups of viruses, in contrast to their absence in cellular life forms, iv) the extant archaeal and bacterial chromosomes appear to be shaped by accretion of diverse, smaller replicons, suggesting a continuity between the hypothetical, primordial virus stage of life’s evolution and the dynamic prokaryotic world that existed ever since. Under the viral model of pre-cellular evolution, the key components of cells including the replication apparatus, membranes, and molecular complexes involved in membrane transport and translocation originated as components of virus-like entities. The two surviving types of cellular life forms, archaea and bacteria, might have emerged from the LUCAS independently, along with, probably, numerous forms now extinct.
comparative genomics; evolution of cells; evolution of viruses; origin of membranes; viral hallmark genes
In the Universe, oxygen is the third most widespread element, while on Earth it is the most abundant one. Moreover, oxygen is a major constituent of all biopolymers fundamental to living organisms. Besides O2, reactive oxygen species (ROS), among them hydrogen peroxide (H2O2), are also important reactants in the present aerobic metabolism. According to a widely accepted hypothesis, aerobic metabolism and many other reactions/pathways involving O2 appeared after the evolution of oxygenic photosynthesis. In this study, the hypothesis was formulated that the Last Universal Common Ancestor (LUCA) was at least able to tolerate O2 and detoxify ROS in a primordial environment. A comparative analysis was carried out of a number of the O2- and H2O2-involving metabolic reactions that occur in strict anaerobes, facultative anaerobes, and aerobes. The results indicate that LUCA most likely possessed O2- and H2O2-involving pathways, mainly reactions to remove ROS, and had, at least in part, the components of aerobic respiration. Based on this, the presence of a low, but significant, quantity of H2O2 and O2 should be taken into account in theoretical models of the early Archean atmosphere and oceans and the evolution of life. It is suggested that the early metabolism involving O2/H2O2 was a key adaptation of LUCA to already existing weakly oxic zones in Earth's primordial environment. Key Words: Hydrogen peroxide; Oxygen; Origin of life; Photosynthesis; Superoxide dismutase; Superoxide reductase. Astrobiology 12, 775–784.
Despite recent advances in our understanding of diverse aspects of virus evolution, particularly on the epidemiological scale, revealing the ultimate origins of viruses has proven to be a more intractable problem. Herein, I review some current ideas on the evolutionary origins of viruses and assess how well these theories accord with what we know about the evolution of contemporary viruses. I note the growing evidence for the theory that viruses arose before the last universal cellular ancestor (LUCA). This ancient origin theory is supported by the presence of capsid architectures that are conserved among diverse RNA and DNA viruses and by the strongly inverse relationship between genome size and mutation rate across all replication systems, such that pre-LUCA genomes were probably both small and highly error prone and hence RNA virus-like. I also highlight the advances that are needed to come to a better understanding of virus origins, most notably the ability to accurately infer deep evolutionary history from the phylogenetic analysis of conserved protein structures.
Domains are modules within proteins that can fold and function independently and are evolutionarily conserved. Here we compared the usage and distribution of protein domain families in the free-living proteomes of Archaea, Bacteria and Eukarya and reconstructed species phylogenies while tracing the history of domain emergence and loss in proteomes. We show that both gains and losses of domains occurred frequently during proteome evolution. The rate of domain discovery increased approximately linearly in evolutionary time. Remarkably, gains generally outnumbered losses and the gain-to-loss ratios were much higher in akaryotes compared to eukaryotes. Functional annotations of domain families revealed that both Archaea and Bacteria gained and lost metabolic capabilities during the course of evolution while Eukarya acquired a number of diverse molecular functions including those involved in extracellular processes, immunological mechanisms, and cell regulation. Results also highlighted significant contemporary sharing of informational enzymes between Archaea and Eukarya and metabolic enzymes between Bacteria and Eukarya. Finally, the analysis provided useful insights into the evolution of species. The archaeal superkingdom appeared first in evolution by gradual loss of ancestral domains, bacterial lineages were the first to gain superkingdom-specific domains, and eukaryotes (likely) originated when an expanding proto-eukaryotic stem lineage gained organelles through endosymbiosis of already diversified bacterial lineages. The evolutionary dynamics of domain families in proteomes and the increasing number of domain gains are predicted to redefine the persistence strategies of organisms in superkingdoms, influence the makeup of molecular functions, and enhance organismal complexity by the generation of new domain architectures. These dynamics highlight ongoing secondary evolutionary adaptations in akaryotic microbes, especially Archaea.
Proteins are made up of well-packed structural units referred to as domains. Domain structure in proteins is responsible for protein function and is evolutionarily conserved. Here we report global patterns of protein domain gain and loss in the three superkingdoms of life. We reconstructed phylogenetic trees using domain fold families as phylogenetic characters and retraced the history of character changes along the many branches of the tree of life. Results revealed that both domain gains and losses were frequent events in the evolution of cells. However, domain gains generally outnumbered losses. This trend was consistent in the three superkingdoms. However, the rate of domain discovery was highest in akaryotic microbes. Domain gains occurred throughout the evolutionary timeline, albeit at a non-uniform rate. Our study sheds light on the evolutionary history of living organisms and highlights important ongoing mechanisms that are responsible for secondary evolutionary adaptations in the three superkingdoms of life.
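Counting domain gains and losses along the branches of a tree can be sketched as follows. This is a minimal illustration, assuming the domain repertoire of every node (including ancestors) has already been reconstructed; the tree, the domain labels and the function name `count_gains_losses` are hypothetical, and the reconstruction method used in the study itself is not reproduced here.

```python
def count_gains_losses(edges, repertoires):
    """Sum domain gains and losses over all parent -> child branches.

    edges: iterable of (parent, child) node names.
    repertoires: dict mapping node name -> set of domain families present.
    """
    gains = losses = 0
    for parent, child in edges:
        gains += len(repertoires[child] - repertoires[parent])   # new in child
        losses += len(repertoires[parent] - repertoires[child])  # missing in child
    return gains, losses

# Toy tree and repertoires (hypothetical values for illustration only).
edges = [("root", "A"), ("root", "B"), ("B", "C")]
repertoires = {
    "root": {"d1", "d2"},
    "A": {"d1", "d2", "d3"},
    "B": {"d1"},
    "C": {"d1", "d4", "d5"},
}

gains, losses = count_gains_losses(edges, repertoires)
ratio = gains / losses if losses else float("inf")
print(gains, losses, ratio)
```

In this toy case gains outnumber losses three to one; a gain-to-loss ratio for each superkingdom could be obtained by restricting `edges` to the branches within it.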
We analyzed length differences of eukaryotic, bacterial and archaeal proteins in relation to function, conservation and environmental factors. Comparing Eukaryotes and Prokaryotes, we found that the greater length of eukaryotic proteins is pervasive over all functional categories and involves the vast majority of protein families. The magnitude of these differences suggests that the evolution of eukaryotic proteins was influenced by processes of fusion of single-function proteins into extended multi-functional and multi-domain proteins. Comparing Bacteria and Archaea, we determined that the small but significant length difference observed between their proteins results from a combination of three factors: (i) bacterial proteomes include a greater proportion than archaeal proteomes of longer proteins involved in metabolism or cellular processes, (ii) within most functional classes, protein families unique to Bacteria are generally longer than protein families unique to Archaea and (iii) within the same protein family, homologs from Bacteria tend to be longer than the corresponding homologs from Archaea. These differences are interpreted with respect to evolutionary trends and prevailing environmental conditions within the two prokaryotic groups.
Likely DNA-binding domains in archaeal proteins were analyzed using sequence profile methods and available structural information. It is shown that all archaea encode a large number of proteins containing the helix-turn-helix (HTH) DNA-binding domains whose sequences are much more similar to bacterial HTH domains than to eukaryotic ones, such as the PAIRED, POU and homeodomains. The predominant class of HTH domains in archaea is the winged-HTH domain. The number and diversity of HTH domains in archaea are comparable to those seen in bacteria. The HTH domain in archaea combines with a variety of other domains that include replication system components, such as MCM proteins, translation system components, such as the alpha-subunit of phenylalanyl-tRNA synthetase, and several metabolic enzymes. The majority of the archaeal HTH-containing proteins are predicted to be gene/operon-specific transcriptional regulators. This apparent bacterial-type mode of transcription regulation is in sharp contrast to the eukaryote-like layout of the core transcription machinery in the archaea. In addition to the predicted bacterial-type transcriptional regulators, the HTH domain is conserved in archaeal and eukaryotic core transcription factors, such as TFIIB, TFIIE-alpha and MBF1. MBF1 is the only highly conserved, classical HTH domain that is vertically inherited in all archaea and eukaryotes. In contrast, while eukaryotic TFIIB and TFIIE-alpha possess forms of the HTH domain that are divergent in sequence, their archaeal counterparts contain typical HTH domains. It is shown that, besides the HTH domain, archaea encode unexpectedly large numbers of two other predicted DNA-binding domains, namely the Arc/MetJ domain and the Zn-ribbon. The core transcription regulators in archaea and eukaryotes (TFIIB/TFB, TFIIE-alpha and MBF1) and in bacteria (the sigma factors) share no similarity beyond the presence of distinct HTH domains.
Thus HTH domains might have been independently recruited for a role in transcription regulation in the bacterial and archaeal/eukaryotic lineages. During subsequent evolution, the similarity between archaeal and bacterial gene/operon transcriptional regulators might have been established and maintained through multiple horizontal gene transfer events.
Organisms represented by the root of the universal evolutionary tree were most likely complex cells with a sophisticated protein translation system and a DNA genome encoding hundreds of genes. The growth of bioinformatics data from taxonomically diverse organisms has made it possible to infer the likely properties of early life in greater detail. Here we present LUCApedia (http://eeb.princeton.edu/lucapedia), a unified framework for simultaneously evaluating multiple data sets related to the Last Universal Common Ancestor (LUCA) and its predecessors. This unification is achieved by mapping eleven such data sets onto UniProt, KEGG and BioCyc IDs. LUCApedia may be used to rapidly acquire evidence that a certain gene or set of genes is ancient, to examine the early evolution of metabolic pathways, or to test specific hypotheses related to ancient life by corroborating them against the rest of the database.
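The idea of corroborating a gene across several data sets once they share a common identifier can be sketched in a few lines. This is an illustrative toy, not LUCApedia's actual schema or interface: the data set names, the identifiers and the function name `tally_support` are all hypothetical placeholders.

```python
from collections import defaultdict

def tally_support(datasets):
    """Map each identifier to the set of data sets that list it as ancient."""
    support = defaultdict(set)
    for name, ids in datasets.items():
        for protein_id in ids:
            support[protein_id].add(name)
    return dict(support)

# Hypothetical data sets already mapped onto one shared ID namespace.
toy_datasets = {
    "dataset_A": {"id1", "id2"},
    "dataset_B": {"id2", "id3"},
    "dataset_C": {"id2"},
}

support = tally_support(toy_datasets)

# Rank identifiers by how many independent data sets corroborate them.
ranked = sorted(support, key=lambda pid: (-len(support[pid]), pid))
for pid in ranked:
    print(pid, len(support[pid]), sorted(support[pid]))
```

Identifiers supported by more data sets would be stronger candidates for an ancient origin, under the assumption that the data sets are reasonably independent.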
Small nucleolar RNAs (snoRNAs) and microRNAs (miRNAs) are integral to a range of processes, including ribosome biogenesis and gene regulation. Some are intron encoded, and this organization may facilitate coordinated coexpression of host gene and RNA. However, snoRNAs and miRNAs are known to be mobile, so intron-RNA associations may not be evolutionarily stable. We have used genome alignments across 11 mammals plus chicken to examine positional orthology of snoRNAs and miRNAs and report that 21% of annotated snoRNAs and 11% of miRNAs are positionally conserved across mammals. Among RNAs traceable to the bird–mammal common ancestor, 98% of snoRNAs and 76% of miRNAs are intronic. Comparison of the most evolutionarily stable mammalian intronic snoRNAs with those positionally conserved among primates reveals that the former are more overrepresented among host genes involved in translation or ribosome biogenesis and are more broadly and highly expressed. This stability is likely attributable to a requirement for overlap between host gene and intronic snoRNA expression profiles, consistent with an ancestral role in ribosome biogenesis. In contrast, whereas miRNA positional conservation is comparable to that observed for snoRNAs, intronic miRNAs show no obvious association with host genes of a particular functional category, and no statistically significant differences in host gene expression are found between those traceable to mammalian or primate ancestors. Our results indicate evolutionarily stable associations of numerous intronic snoRNAs and miRNAs and their host genes, with probable continued diversification of snoRNA function from an ancestral role in ribosome biogenesis.
snoRNA; miRNA; intron; evolution
The CRISPR-Cas adaptive immunity systems that are present in most Archaea and many Bacteria function by incorporating fragments of alien genomes into specific genomic loci, transcribing the inserts and using the transcripts as guide RNAs to destroy the genome of the cognate virus or plasmid. This RNA interference-like immune response is mediated by numerous, diverse and rapidly evolving Cas (CRISPR-associated) proteins, several of which form the Cascade complex involved in the processing of CRISPR transcripts and cleavage of the target DNA. Comparative analysis of the Cas protein sequences and structures led to the classification of the CRISPR-Cas systems into three Types (I, II and III).
A detailed comparison of the available sequences and structures of Cas proteins revealed several unnoticed homologous relationships. The Repeat-Associated Mysterious Proteins (RAMPs) containing a distinct form of the RNA Recognition Motif (RRM) domain, which are major components of the CRISPR-Cas systems, were classified into three large groups, Cas5, Cas6 and Cas7. Each of these groups includes many previously uncharacterized proteins now shown to adopt the RAMP structure. Evidence is presented that large subunits contained in most of the CRISPR-Cas systems could be homologous to Cas10 proteins which contain a polymerase-like Palm domain and are predicted to be enzymatically active in Type III CRISPR-Cas systems but inactivated in Type I systems. These findings, the fact that the CRISPR polymerases, RAMPs and Cas2 all contain core RRM domains, and distinct gene arrangements in the three types of CRISPR-Cas systems together provide for a simple scenario for origin and evolution of the CRISPR-Cas machinery. Under this scenario, the CRISPR-Cas system originated in thermophilic Archaea and subsequently spread horizontally among prokaryotes.
Because of the extreme diversity of CRISPR-Cas systems, in-depth sequence and structure comparisons continue to reveal unexpected homologous relationships among Cas proteins. Unification of Cas protein families previously considered unrelated provides for improvement in the classification of CRISPR-Cas systems and a reconstruction of their evolution.
Open peer review
This article was reviewed by Malcolm White (nominated by Purificacion Lopez-Garcia), Frank Eisenhaber and Igor Zhulin. For the full reviews, see the Reviewers' Comments section.
Nucleoside diphosphate kinases (NDPKs) are evolutionarily conserved enzymes present in Bacteria, Archaea and Eukarya, with human Nme1 the most studied representative of the family and the first identified metastasis suppressor. Sponges (Porifera) are simple metazoans without tissues, closest to the common ancestor of all animals. They changed little during evolution and probably provide the best insight into the metazoan ancestor's genomic features. Recent studies show that sponges have a wide repertoire of genes, many of which are involved in diseases in more complex metazoans. The original function of those genes and the way it has evolved in the animal lineage are largely unknown. Here we report new results on the metastasis suppressor gene/protein homolog from the marine sponge Suberites domuncula, NmeGp1Sd. The purpose of this study was to investigate the properties of the sponge Group I Nme gene and protein, and compare it to its human homolog in order to elucidate the evolution of the structure and function of Nme.
We found that sponge genes coding for Group I Nme protein are intron-rich. Furthermore, we discovered that the sponge NmeGp1Sd protein has a level of kinase activity similar to that of its human homolog Nme1, does not cleave negatively supercoiled DNA and shows nonspecific DNA-binding activity. The sponge NmeGp1Sd forms a hexamer, like human Nme1 and all other eukaryotic Nme proteins. NmeGp1Sd interacts with human Nme1 in human cells and exhibits the same subcellular localization. Stable clones expressing sponge NmeGp1Sd inhibited the migratory potential of CAL 27 cells, as already reported for human Nme1, which suggests that Nme's function in migratory processes was engaged long before the composition of true tissues.
This study suggests that the ancestor of all animals possessed an NmeGp1 protein with properties and functions similar to evolutionarily recent versions of the protein, even before the appearance of true tissues and the origin of tumors and metastasis.
Gene duplication is a crucial mechanism of evolutionary innovation. A substantial fraction of eukaryotic genomes consists of paralogous gene families. We assess the extent of ancestral paralogy, which dates back to the last common ancestor of all eukaryotes, and examine the origins of the ancestral paralogs and their potential roles in the emergence of the eukaryotic cell complexity. A parsimonious reconstruction of ancestral gene repertoires shows that 4137 orthologous gene sets in the last eukaryotic common ancestor (LECA) map back to 2150 orthologous sets in the hypothetical first eukaryotic common ancestor (FECA) [paralogy quotient (PQ) of 1.92]. Analogous reconstructions show significantly lower levels of paralogy in prokaryotes, 1.19 for archaea and 1.25 for bacteria. The only functional class of eukaryotic proteins with a significant excess of paralogous clusters over the mean includes molecular chaperones and proteins with related functions. Almost all genes in this category underwent multiple duplications during early eukaryotic evolution. In structural terms, the most prominent sets of paralogs are superstructure-forming proteins with repetitive domains, such as WD-40 and TPR. In addition to the true ancestral paralogs which evolved via duplication at the onset of eukaryotic evolution, numerous pseudoparalogs were detected, i.e. homologous genes that apparently were acquired by early eukaryotes via different routes, including horizontal gene transfer (HGT) from diverse bacteria. The results of this study demonstrate a major increase in the level of gene paralogy as a hallmark of the early evolution of eukaryotes.
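The paralogy quotient quoted above follows directly from the two reconstructed counts. A minimal sketch of the arithmetic, assuming PQ is simply the number of orthologous sets in the descendant (LECA) divided by the number in the ancestor (FECA), as the bracketed figure implies; the function name `paralogy_quotient` is introduced here for illustration only.

```python
def paralogy_quotient(descendant_sets: int, ancestral_sets: int) -> float:
    """Ratio of orthologous gene sets in a descendant to those in its ancestor."""
    if ancestral_sets <= 0:
        raise ValueError("ancestral_sets must be positive")
    return descendant_sets / ancestral_sets

# Counts reported in the abstract: 4137 sets in LECA map back to 2150 in FECA.
pq_eukaryotes = paralogy_quotient(4137, 2150)
print(f"{pq_eukaryotes:.2f}")  # prints 1.92
```

The value is clearly above the 1.19 and 1.25 reported for archaea and bacteria, which is the comparison the abstract draws.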
Sequence-directed genetic interference pathways control gene expression and preserve genome integrity in all kingdoms of life. The importance of such pathways is highlighted by the extensive study of RNA interference (RNAi) and related processes in eukaryotes. In many bacteria and most archaea, clustered, regularly interspaced short palindromic repeats (CRISPRs) are involved in a more recently discovered interference pathway that protects cells from bacteriophages and conjugative plasmids. CRISPR sequences provide an adaptive, heritable record of past infections and express CRISPR RNAs — small RNAs that target invasive nucleic acids. Here, we review the mechanisms of CRISPR interference and its roles in microbial physiology and evolution. We also discuss potential applications of this novel interference pathway.
The modern ribosome was largely formed at the time of the last common ancestor, LUCA. Hence its earliest origins likely lie in the RNA world. Central to its development were RNAs that spawned the modern tRNAs and a symmetrical region deep within the large ribosomal RNA (rRNA), where the peptidyl transferase reaction occurs. To understand pre-LUCA developments, it is argued that events that are coupled in time are especially useful if one can infer a likely order in which they occurred. Using such timing events, the relative age of various proteins and individual regions within the large rRNA are inferred. An examination of the properties of modern ribosomes strongly suggests that the initial peptides made by the primitive ribosomes were likely enriched for L-amino acids, but did not completely exclude D-amino acids. This has implications for the nature of peptides made by the first ribosomes. From the perspective of ribosome origins, the immediate question regarding coding is when did it arise rather than how did the assignments evolve. The modern ribosome is very dynamic with tRNAs moving in and out and the mRNA moving relative to the ribosome. These movements may have become possible as a result of the addition of a template to hold the tRNAs. That template would subsequently become the mRNA, thereby allowing the evolution of the code and making an RNA genome useful. Finally, a highly speculative timeline of major events in ribosome history is presented and possible future directions discussed.
The ribosome evolved before the last universal common ancestor. Evidence from primary sequences, high resolution structural studies, and functional properties of various components provides significant insights into that evolutionary history, which is linked to the origins of the code and chirality.
Hfq and other Sm proteins are central in RNA metabolism, forming an evolutionarily conserved family that plays key roles in RNA processing in organisms ranging from archaea to bacteria to humans. Sm-based cellular pathways vary in scope from eukaryotic mRNA splicing to bacterial quorum sensing, with at least one step in each of these pathways being mediated by an RNA-associated molecular assembly built upon Sm proteins. Though the first structures of Sm assemblies were from archaeal systems, the functions of Sm-like archaeal proteins (SmAPs) remain murky. Our ignorance about SmAP biology, particularly vis-à-vis the eukaryotic and bacterial Sm homologs, can be partly reduced by leveraging the homology between these lineages to make phylogenetic inferences about Sm functions in archaea. Nevertheless, whether SmAPs are more eukaryotic (RNP scaffold) or bacterial (RNA chaperone) in character remains unclear. Thus, the archaeal domain of life is a missing link, and an opportunity, in Sm-based RNA biology.
Sm; Lsm; Hfq protein; Sm fold; oligomers; archaeal RNA; RNA chaperone; RNP assembly; RNP evolution
The candidate tumour suppressor gene, LUCA-15/RBM5/H37, maps to the lung cancer tumour suppressor locus 3p21.3. The LUCA-15 gene locus encodes at least four alternatively spliced transcripts which have been shown to function as regulators of apoptosis, a fact which may have major significance in tumour regulation. This review highlights recent evidence which further implicates the LUCA-15 locus in the control of apoptosis and cell proliferation, and focuses on the observations which confirm the tumour suppressor activity of this gene.
The LUCA-15 gene is located at a hotspot for deletions in a number of cancers. It is considered a putative tumour suppressor gene. A number of lines of evidence support this hypothesis and indicate LUCA-15 involvement in apoptosis. Functional, structural and mechanistic studies to clarify the function of LUCA-15 could lead to the development of new diagnostic and therapeutic approaches for human cancer.
T-lymphocytes; LUCA-15; RBM5; H37; RNA-binding proteins; cell death; cell proliferation; cell fate; cell cycle; oncology; cell biology; biochemistry and molecular biology
NAD is an indispensable redox cofactor in all organisms. Most of the genes required for NAD biosynthesis in various species are known. Ribosylnicotinamide kinase (RNK) was among the few unknown (missing) genes involved with NAD salvage and recycling pathways. Using a comparative genome analysis involving reconstruction of NAD metabolism from genomic data, we predicted and experimentally verified that bacterial RNK is encoded within the 3′ region of the nadR gene. Based on these results and previous data, the full-size multifunctional NadR protein (as in Escherichia coli) is composed of (i) an N-terminal DNA-binding domain involved in the transcriptional regulation of NAD biosynthesis, (ii) a central nicotinamide mononucleotide adenylyltransferase (NMNAT) domain, and (iii) a C-terminal RNK domain. The RNK and NMNAT enzymatic activities of recombinant NadR proteins from Salmonella enterica serovar Typhimurium and Haemophilus influenzae were quantitatively characterized. We propose a model for the complete salvage pathway from exogenous N-ribosylnicotinamide to NAD which involves the concerted action of the PnuC transporter and RNK, followed by the NMNAT activity of the NadR protein. Both the pnuC and nadR genes were proven to be essential for the growth and survival of H. influenzae, thus implicating them as potential narrow-spectrum drug targets.
As a nucleolar complex for small-subunit (SSU) ribosomal RNA processing, the SSU processome has been extensively studied, mainly in Saccharomyces cerevisiae but not in diverse organisms, leaving open the question of whether it is a ubiquitous mechanism across eukaryotes and how it evolved in the course of the evolution of eukaryotes. A genome-wide survey and identification of SSU processome components showed that the majority of all 77 yeast SSU processome proteins possess homologs in almost all of the main eukaryotic lineages, and 14 of them have homologs in archaea but few in bacteria, suggesting that the complex is ubiquitous in eukaryotes, and that its evolutionary history began with abundant protein homologs being present in archaea, after which a fairly complete form of the complex emerged in the last eukaryotic common ancestor (LECA). Phylogenetic analysis indicated that ancient gene duplication and functional divergence of the protein components of the complex occurred frequently during the evolutionary origin of the LECA from prokaryotes. We found that such duplications not only increased the complex’s components but also produced some new functional proteins involved in other nucleolar functions, such as ribosome biogenesis, and even some nonnucleolar (but nuclear) proteins participating in pre-mRNA splicing, implying that the evolutionary emergence of the subnuclear compartment, the nucleolus, had occurred in the LECA. Therefore, the LECA harbored not only complicated SSU processomes but also a nucleolus. Our analysis also revealed that gene duplication, innovation, and loss caused further divergence of the complex during the divergence of eukaryotes.
SSU processome; evolution; nucleolus; LECA; origin
Ribonuclease P (RNase P) is an ancient and essential endonuclease that catalyses the cleavage of the 5′ leader sequence from precursor tRNAs (pre-tRNAs). The enzyme is one of only two ribozymes which can be found in all kingdoms of life (Bacteria, Archaea, and Eukarya). Most forms of RNase P are ribonucleoproteins; the bacterial enzyme possesses a single catalytic RNA and one small protein. However, in archaea and eukarya the enzyme has evolved an increasingly complex protein composition, whilst retaining a structurally related RNA subunit. The reasons for this additional complexity are not currently understood. Furthermore, the eukaryotic RNase P has evolved into several different enzymes, including a nuclear activity, organellar activities, and a distinct but closely related enzyme, RNase MRP, which has different substrate specificities and is primarily involved in ribosomal RNA biogenesis. Here we examine the relationship of the bacterial and archaeal RNase P to the eukaryotic enzyme, and summarize recent progress in characterizing the archaeal enzyme. We review current information regarding the nuclear RNase P and RNase MRP enzymes in the eukaryotes, focusing on the relationship between these enzymes by examining their composition, structure and functions.
RNase P; RNase MRP; tRNA processing; rRNA processing; holoenzyme; nucleolus
Evolutionarily related multisubunit RNA polymerases from all three domains of life, Eukarya, Archaea and Bacteria, have common structural and functional properties. We have recently shown that two RNAP subunits, F/E (RPB4/7), which are conserved between eukaryotes and Archaea but have no bacterial homologues, interact with the nascent RNA chain and thereby profoundly modulate RNAP activity. Overall, F/E increases transcription processivity, but it also stimulates transcription termination in a sequence-dependent manner. In addition to RNA-binding, these two apparently opposed processes are likely to involve an allosteric mechanism of the RNAP clamp. Spt4/5 is the only known RNAP-associated transcription factor that is conserved in all three domains of life, and it stimulates elongation similarly to RNAP subunits F/E. Spt4/5 enhances processivity in a fashion that is independent of the nontemplate DNA strand, by interacting with the RNAP clamp. Whereas the molecular mechanism of Spt4/5 is universally conserved in evolution, the added functionality of F/E-like complexes emerged after the split of the bacterial and archaeo-eukaryotic lineages. Interestingly, bacteriophage-encoded antiterminator proteins could, in theory, fulfil an analogous function in the bacterial RNAP.
transcription; RNA polymerase; F/E RPB4/7; Spt4/5; evolution; archaea
Defects in the human Shwachman-Bodian-Diamond syndrome (SBDS) protein-coding gene lead to an autosomal recessive disorder characterised by bone marrow dysfunction, exocrine pancreatic insufficiency and skeletal abnormalities. This protein is highly conserved in eukaryotes and archaea but is not found in bacteria. Although genomic and biophysical studies have suggested involvement of this protein in RNA metabolism and in ribosome biogenesis, its interacting partners remain largely unknown.
We determined the crystal structure of the SBDS orthologue from Methanothermobacter thermautotrophicus (mthSBDS). This structure shows that SBDS proteins are highly flexible, with the N-terminal FYSH domain and the C-terminal ferredoxin-like domain capable of undergoing substantial rotational adjustments with respect to the central domain. Affinity chromatography identified several proteins from the large ribosomal subunit as possible interacting partners of mthSBDS. Moreover, SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiments, combined with electrophoretic mobility shift assays (EMSA), suggest that mthSBDS does not interact with RNA molecules in a sequence-specific manner.
It is suggested that functional interactions of SBDS proteins with their partners could be facilitated by rotational adjustments of the N-terminal and the C-terminal domains with respect to the central domain. Examination of the SBDS protein structure and domain movements, together with its possible interaction with large ribosomal subunit proteins, suggests that these proteins could participate in ribosome function.