1.  Splendor and misery of adaptation, or the importance of neutral null for understanding evolution 
BMC Biology  2016;14:114.
The study of any biological features, including genomic sequences, typically revolves around the question: what is this for? However, population genetic theory, combined with the data of comparative genomics, clearly indicates that such a “pan-adaptationist” approach is a fallacy. The proper question is: how has this sequence evolved? And the proper null hypothesis posits that it is a result of neutral evolution: that is, it survives by sheer chance provided that it is not deleterious enough to be efficiently purged by purifying selection. To claim adaptation, the neutral null has to be falsified. The adaptationist fallacy can be costly, inducing biologists to relentlessly seek function where there is none.
doi:10.1186/s12915-016-0338-2
PMCID: PMC5180405  PMID: 28010725
2.  Activation induced deaminase mutational signature overlaps with CpG methylation sites in follicular lymphoma and other cancers 
Scientific Reports  2016;6:38133.
Follicular lymphoma (FL) is an incurable cancer characterized by progressive severity of relapses. We analyzed sequence context specificity of mutations in the B cells from a large cohort of FL patients. We revealed a substantial excess of mutations within a novel hybrid nucleotide motif: the signature of the somatic hypermutation (SHM) enzyme, Activation Induced Deaminase (AID), which overlaps the CpG methylation site. This finding implies that in FL the SHM machinery acts at genomic sites containing methylated cytosine. We identified the prevalence of this hybrid mutational signature in many other types of human cancer, suggesting that AID-mediated, CpG-methylation dependent mutagenesis is a common feature of tumorigenesis.
doi:10.1038/srep38133
PMCID: PMC5141443  PMID: 27924834
3.  The Double-Stranded DNA Virosphere as a Modular Hierarchical Network of Gene Sharing 
mBio  2016;7(4):e00978-16.
ABSTRACT
Virus genomes are prone to extensive gene loss, gain, and exchange and share no universal genes. Therefore, in a broad-scale study of virus evolution, gene and genome network analyses can complement traditional phylogenetics. We performed an exhaustive comparative analysis of the genomes of double-stranded DNA (dsDNA) viruses by using the bipartite network approach and found a robust hierarchical modularity in the dsDNA virosphere. Bipartite networks consist of two classes of nodes, with nodes in one class, in this case genomes, being connected via nodes of the second class, in this case genes. Such a network can be partitioned into modules that combine nodes from both classes. The bipartite network of dsDNA viruses includes 19 modules that form 5 major and 3 minor supermodules. Of these modules, 11 include tailed bacteriophages, reflecting the diversity of this largest group of viruses. The module analysis quantitatively validates and refines previously proposed nontrivial evolutionary relationships. An expansive supermodule combines the large and giant viruses of the putative order “Megavirales” with diverse moderate-sized viruses and related mobile elements. All viruses in this supermodule share a distinct morphogenetic tool kit with a double jelly roll major capsid protein. Herpesviruses and tailed bacteriophages comprise another supermodule, held together by a distinct set of morphogenetic proteins centered on the HK97-like major capsid protein. Together, these two supermodules cover the great majority of currently known dsDNA viruses. We formally identify a set of 14 viral hallmark genes that comprise the hubs of the network and account for most of the intermodule connections.
IMPORTANCE
Viruses and related mobile genetic elements are the dominant biological entities on earth, but their evolution is not sufficiently understood and their classification is not adequately developed. The key reason is the characteristic high rate of virus evolution that involves not only sequence change but also extensive gene loss, gain, and exchange. Therefore, in the study of virus evolution on a large scale, traditional phylogenetic approaches have limited applicability and have to be complemented by gene and genome network analyses. We applied state-of-the-art methods of such analysis to reveal robust hierarchical modularity in the genomes of double-stranded DNA viruses. Some of the identified modules combine highly diverse viruses infecting bacteria, archaea, and eukaryotes, in support of previous hypotheses on direct evolutionary relationships between viruses from the three domains of cellular life. We formally identify a set of 14 viral hallmark genes that hold together the genomic network.
doi:10.1128/mBio.00978-16
PMCID: PMC4981718  PMID: 27486193
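The bipartite representation described in the abstract of entry 3 (genomes connected to each other through the genes they share) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the genome and gene names below are made-up toy data, the function names are hypothetical, and the module detection step of the study is not reproduced. The sketch only shows how genome-to-genome links and candidate hub genes fall out of a genome-to-gene mapping.

```python
from collections import defaultdict
from itertools import combinations

# Toy data (hypothetical): each genome maps to the gene families it encodes.
GENOME_GENES = {
    "phage_A": {"HK97_MCP", "terminase", "portal"},
    "phage_B": {"HK97_MCP", "terminase", "integrase"},
    "giant_virus_C": {"DJR_MCP", "packaging_ATPase"},
    "virophage_D": {"DJR_MCP", "packaging_ATPase", "integrase"},
}


def gene_to_genomes(genome_genes):
    """Invert the mapping: gene node -> set of genome nodes that carry it."""
    index = defaultdict(set)
    for genome, genes in genome_genes.items():
        for gene in genes:
            index[gene].add(genome)
    return dict(index)


def shared_gene_counts(genome_genes):
    """Project the bipartite graph onto genomes.

    Two genomes are linked if they share at least one gene;
    the link weight is the number of shared genes.
    """
    counts = {}
    for a, b in combinations(sorted(genome_genes), 2):
        shared = genome_genes[a] & genome_genes[b]
        if shared:
            counts[(a, b)] = len(shared)
    return counts


def hub_genes(genome_genes, min_degree=2):
    """Genes present in at least `min_degree` genomes (an illustrative cutoff)."""
    index = gene_to_genomes(genome_genes)
    return sorted(g for g, genomes in index.items() if len(genomes) >= min_degree)


if __name__ == "__main__":
    print(shared_gene_counts(GENOME_GENES))
    print(hub_genes(GENOME_GENES))
```

In this toy data set the two phages are linked by two shared genes, and the integrase gene is the only connection between the phage group and the virophage, which is the kind of intermodule link that hallmark genes provide in the real network.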
4.  ISC, a Novel Group of Bacterial and Archaeal DNA Transposons That Encode Cas9 Homologs 
Journal of Bacteriology  2016;198(5):797-807.
ABSTRACT
Bacterial genomes encode numerous homologs of Cas9, the effector protein of the type II CRISPR-Cas systems. The homology region includes the arginine-rich helix and the HNH nuclease domain that is inserted into the RuvC-like nuclease domain. These genes, however, are not linked to cas genes or CRISPR. Here, we show that Cas9 homologs represent a distinct group of nonautonomous transposons, which we denote ISC (insertion sequences Cas9-like). We identify many diverse families of full-length ISC transposons and demonstrate that their terminal sequences (particularly 3′ termini) are similar to those of IS605 superfamily transposons that are mobilized by the Y1 tyrosine transposase encoded by the TnpA gene and often also encode the TnpB protein containing the RuvC-like endonuclease domain. The terminal regions of the ISC and IS605 transposons contain palindromic structures that are likely recognized by the Y1 transposase. The transposons from these two groups are inserted either exactly in the middle or upstream of specific 4-bp target sites, without target site duplication. We also identify autonomous ISC transposons that encode TnpA-like Y1 transposases. Thus, the nonautonomous ISC transposons could be mobilized in trans either by Y1 transposases of other, autonomous ISC transposons or by Y1 transposases of the more abundant IS605 transposons. These findings imply an evolutionary scenario in which the ISC transposons evolved from IS605 family transposons, possibly via insertion of a mobile group II intron encoding the HNH domain, and Cas9 subsequently evolved via immobilization of an ISC transposon.
IMPORTANCE Cas9 endonucleases, the effectors of type II CRISPR-Cas systems, represent the new generation of genome-engineering tools. Here, we describe in detail a novel family of transposable elements that encode the likely ancestors of Cas9 and outline the evolutionary scenario connecting different varieties of these transposons and Cas9.
doi:10.1128/JB.00783-15
PMCID: PMC4810608  PMID: 26712934
5.  C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector 
Science (New York, N.Y.)  2016;353(6299):aaf5573.
The CRISPR-Cas adaptive immune system defends microbes against foreign genetic elements via DNA or RNA-DNA interference. We characterize the Class 2 type VI-A CRISPR-Cas effector C2c2 and demonstrate its RNA-guided RNase function. C2c2 from the bacterium Leptotrichia shahii provides interference against RNA phage. In vitro biochemical analysis shows that C2c2 is guided by a single crRNA and can be programmed to cleave ssRNA targets carrying complementary protospacers. In bacteria, C2c2 can be programmed to knock down specific mRNAs. Cleavage is mediated by catalytic residues in the two conserved HEPN domains, mutations in which generate catalytically inactive RNA-binding proteins. These results broaden our understanding of CRISPR-Cas systems and suggest that C2c2 can be used to develop new RNA-targeting tools.
doi:10.1126/science.aaf5573
PMCID: PMC5127784  PMID: 27256883
6.  Positive and strongly relaxed purifying selection drive the evolution of repeats in proteins 
Nature Communications  2016;7:13570.
Protein repeats are considered hotspots of protein evolution, associated with acquisition of new functions and novel phenotypic traits, including disease. Paradoxically, however, repeats are often strongly conserved through long spans of evolution. To resolve this conundrum, it is necessary to directly compare paralogous (horizontal) evolution of repeats within proteins with their orthologous (vertical) evolution through speciation. Here we develop a rigorous methodology to identify highly periodic repeats with significant sequence similarity, for which evolutionary rates and selection (dN/dS) can be estimated, and systematically characterize their evolution. We show that horizontal evolution of repeats is markedly accelerated compared with their divergence from orthologues in closely related species. This observation is universal across the diversity of life forms and implies a biphasic evolutionary regime whereby new copies experience rapid functional divergence under combined effects of strongly relaxed purifying selection and positive selection, followed by fixation and conservation of each individual repeat.
Protein repeats may be considered a paradox, being evolutionarily conserved yet also hotspots of protein evolution associated with innovation. Here, the authors use a novel method to show that new repeats undergo rapid divergence within species, but are then fixed and conserved between species.
doi:10.1038/ncomms13570
PMCID: PMC5120217  PMID: 27857066
7.  Discovery and functional characterization of diverse Class 2 CRISPR-Cas systems 
Molecular cell  2015;60(3):385-397.
Microbial CRISPR-Cas systems are divided into Class 1, with multisubunit effector complexes, and Class 2, with single protein effectors. Currently, only two Class 2 effectors, Cas9 and Cpf1, are known. We describe here three distinct Class 2 CRISPR-Cas systems. The effectors of two of the identified systems, C2c1 and C2c3, contain RuvC-like endonuclease domains distantly related to Cpf1. The third system, C2c2, contains an effector with two predicted HEPN RNase domains. Whereas production of mature CRISPR RNA (crRNA) by C2c1 depends on tracrRNA, C2c2 crRNA maturation is tracrRNA-independent. We found that C2c1 systems can mediate DNA interference in a 5’-PAM-dependent fashion analogous to Cpf1. However, unlike Cpf1, which is a single-RNA-guided nuclease, C2c1 depends on both crRNA and tracrRNA for DNA cleavage. Finally, comparative analysis indicates that Class 2 CRISPR-Cas systems evolved on multiple occasions through recombination of Class 1 adaptation modules with effector proteins acquired from distinct mobile elements.
doi:10.1016/j.molcel.2015.10.008
PMCID: PMC4660269  PMID: 26593719
CRISPR-Cas adaptive immunity; Cas9; Cpf1; crRNA; tracrRNA; PAM; RuvC-like endonuclease; HEPN domain; computational discovery pipeline; RNA-seq
8.  Prokaryotic Virus Orthologous Groups (pVOGs): a resource for comparative genomics and protein family annotation 
Nucleic Acids Research  2016;45(Database issue):D491-D498.
Viruses are the most abundant and diverse biological entities on earth, and while most of this diversity remains completely unexplored, advances in genome sequencing have provided unprecedented glimpses into the virosphere. The Prokaryotic Virus Orthologous Groups (pVOGs, formerly called Phage Orthologous Groups, POGs) resource has aided in this task over the past decade by using automated methods to keep pace with the rapid increase in genomic data. The uses of pVOGs include functional annotation of viral proteins, identification of genes and viruses in uncharacterized DNA samples, phylogenetic analysis, large-scale comparative genomics projects, and more. The pVOGs database represents a comprehensive set of orthologous gene families shared across multiple complete genomes of viruses that infect bacterial or archaeal hosts (viruses of eukaryotes will be added at a future date). The pVOGs are constructed within the Clusters of Orthologous Groups (COGs) framework that is widely used for orthology identification in prokaryotes. Since the previous release of the POGs, the size has tripled to nearly 3000 genomes and 300 000 proteins, and the number of conserved orthologous groups doubled to 9518. User-friendly webpages are available, including multiple sequence alignments and HMM profiles for each VOG. These changes provide major improvements to the pVOGs database, at a time of rapid advances in virus genomics. The pVOGs database is hosted jointly at the University of Iowa at http://dmk-brain.ecn.uiowa.edu/pVOGs and the NCBI at ftp://ftp.ncbi.nlm.nih.gov/pub/kristensen/pVOGs/home.html.
doi:10.1093/nar/gkw975
PMCID: PMC5210652  PMID: 27789703
9.  Cpf1 is a single RNA-guided endonuclease of a Class 2 CRISPR-Cas system 
Cell  2015;163(3):759-771.
The microbial adaptive immune system CRISPR mediates defense against foreign genetic elements through two classes of RNA-guided nuclease effectors. Class 1 effectors utilize multi-protein complexes, whereas Class 2 effectors rely on single-component effector proteins such as the well-characterized Cas9. Here we report characterization of Cpf1, a putative Class 2 CRISPR effector. We demonstrate that Cpf1 mediates robust DNA interference with features distinct from Cas9. Cpf1 is a single RNA-guided endonuclease lacking tracrRNA, and it utilizes a T-rich protospacer adjacent motif. Moreover, Cpf1 cleaves DNA via a staggered DNA double-stranded break. Out of 16 Cpf1-family proteins, we identified two candidate enzymes, from Acidaminococcus and Lachnospiraceae, with efficient genome editing activity in human cells. Identifying this mechanism of interference broadens our understanding of CRISPR-Cas systems and advances their genome editing applications.
doi:10.1016/j.cell.2015.09.038
PMCID: PMC4638220  PMID: 26422227
10.  ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation 
Nucleic Acids Research  2016;45(Database issue):D210-D218.
The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of ‘index’ orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html.
doi:10.1093/nar/gkw934
PMCID: PMC5210634  PMID: 28053163
11.  Casposon integration shows strong target site preference and recapitulates protospacer integration by CRISPR-Cas systems 
Nucleic Acids Research  2016;44(21):10367-10376.
Casposons are a recently discovered group of large DNA transposons present in diverse bacterial and archaeal genomes. For integration into the host chromosome, casposons employ an endonuclease that is homologous to the Cas1 protein involved in protospacer integration by the CRISPR-Cas adaptive immune system. Here we describe the site-preference of integration by the Cas1 integrase (casposase) encoded by the casposon of the archaeon Aciduliprofundum boonei. Oligonucleotide duplexes derived from the terminal inverted repeats (TIR) of the A. boonei casposon as well as mini-casposons flanked by the TIR inserted preferentially at a site reconstituting the original A. boonei target site. As in the A. boonei genome, the insertion was accompanied by a 15-bp direct target site duplication (TSD). The minimal functional target consisted of the 15-bp TSD segment and the adjacent 18-bp sequence which comprises the 3′ end of the tRNA-Pro gene corresponding to the TΨC loop. The functional casposase target site bears clear resemblance to the leader sequence-repeat junction which is the target for protospacer integration catalyzed by the Cas1–Cas2 adaptation module of CRISPR-Cas. These findings reinforce the mechanistic similarities and evolutionary connection between the casposons and the adaptation module of the prokaryotic adaptive immunity systems.
doi:10.1093/nar/gkw821
PMCID: PMC5137440  PMID: 27655632
12.  Viruses and mobile elements as drivers of evolutionary transitions 
The history of life is punctuated by evolutionary transitions which engender emergence of new levels of biological organization that involves selection acting at increasingly complex ensembles of biological entities. Major evolutionary transitions include the origin of prokaryotic and then eukaryotic cells, multicellular organisms and eusocial animals. All or nearly all cellular life forms are hosts to diverse selfish genetic elements with various levels of autonomy including plasmids, transposons and viruses. I present evidence that, at least up to and including the origin of multicellularity, evolutionary transitions are driven by the coevolution of hosts with these genetic parasites along with sharing of ‘public goods’. Selfish elements drive evolutionary transitions at two distinct levels. First, mathematical modelling of evolutionary processes, such as evolution of primitive replicator populations or unicellular organisms, indicates that only increasing organizational complexity, e.g. emergence of multicellular aggregates, can prevent the collapse of the host–parasite system under the pressure of parasites. Second, comparative genomic analysis reveals numerous cases of recruitment of genes with essential functions in cellular life forms, including those that enable evolutionary transitions.
This article is part of the themed issue ‘The major synthetic evolutionary transitions’.
doi:10.1098/rstb.2015.0442
PMCID: PMC4958936  PMID: 27431520
evolutionary transitions; mobile genetic elements; parasites; viruses; antivirus defence; host–parasite coevolution
13.  Role of mRNA structure in the control of protein folding 
Nucleic Acids Research  2016;44(22):10898-10911.
Specific structures in mRNA modulate translation rate and thus can affect protein folding. Using the protein structures from two eukaryotes and three prokaryotes, we explore the connections between the protein compactness, inferred from solvent accessibility, and mRNA structure, inferred from mRNA folding energy (ΔG). In both prokaryotes and eukaryotes, the ΔG value of the most stable 30 nucleotide segment of the mRNA (ΔGmin) strongly, positively correlates with protein solvent accessibility. Thus, mRNAs containing exceptionally stable secondary structure elements typically encode compact proteins. The correlations between ΔG and protein compactness are much more pronounced in predicted ordered parts of proteins compared to the predicted disordered parts, indicative of an important role of mRNA secondary structure elements in the control of protein folding. Additionally, ΔG correlates with the mRNA length and the evolutionary rate of synonymous positions. The correlations are partially independent and were used to construct multiple regression models which explain about half of the variance of protein solvent accessibility. These findings suggest a model in which the mRNA structure, particularly exceptionally stable RNA structural elements, acts as a gauge of protein co-translational folding by reducing ribosome speed when the nascent peptide needs time to form and optimize the core structure.
doi:10.1093/nar/gkw671
PMCID: PMC5159526  PMID: 27466388
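The ΔGmin quantity in entry 13 is defined as the folding energy of the most stable 30-nucleotide segment of an mRNA, which amounts to a sliding-window minimum. The sketch below shows only that windowing step. The folding-energy calculation itself is left as a pluggable function, because the abstract does not say which folding tool was used; `toy_energy` is a hypothetical placeholder (negative GC count) and is not a thermodynamic model.

```python
def delta_g_min(mrna, energy_fn, window=30):
    """Return the lowest (most stable) energy over all windows of length `window`.

    `energy_fn` is any callable that maps a sequence segment to a folding
    energy; in real use it would wrap an RNA folding program.
    """
    if len(mrna) < window:
        raise ValueError("sequence is shorter than the window")
    return min(
        energy_fn(mrna[i:i + window])
        for i in range(len(mrna) - window + 1)
    )


def toy_energy(segment):
    """Placeholder score for demonstration only: minus the number of G/C bases.

    This is NOT a folding energy; it merely gives the sliding window
    something deterministic to minimise.
    """
    return -float(segment.count("G") + segment.count("C"))


if __name__ == "__main__":
    # A short window is used here so the toy sequence stays readable.
    print(delta_g_min("AAAAGGGCAAAA", toy_energy, window=4))
```

Passing the energy function as an argument keeps the windowing logic independent of whichever folding software is chosen.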
14.  Horizontal gene transfer: essentiality and evolvability in prokaryotes, and roles in evolutionary transitions 
F1000Research  2016;5:F1000 Faculty Rev-1805.
The wide spread of gene exchange and loss in the prokaryotic world has prompted the concept of ‘lateral genomics’ to the point of an outright denial of the relevance of phylogenetic trees for evolution. However, the pronounced congruence of the topologies of numerous gene trees, particularly those for (nearly) universal genes, translates into the notion of a statistical tree of life (STOL), which reflects a central trend of vertical evolution. The STOL can be employed as a framework for reconstruction of the evolutionary processes in the prokaryotic world. Quantitatively, however, horizontal gene transfer (HGT) dominates microbial evolution, with the rate of gene gain and loss being comparable to the rate of point mutations and much greater than the duplication rate. Theoretical models of evolution suggest that HGT is essential for the survival of microbial populations that otherwise deteriorate due to Muller’s ratchet effect. Apparently, at least some bacteria and archaea evolved dedicated vehicles for gene transfer that evolved from selfish elements such as plasmids and viruses. Recent phylogenomic analyses suggest that episodes of massive HGT were pivotal for the emergence of major groups of organisms such as multiple archaeal phyla as well as eukaryotes. Similar analyses appear to indicate that, in addition to donating hundreds of genes to the emerging eukaryotic lineage, mitochondrial endosymbiosis severely curtailed HGT. These results shed new light on the routes of evolutionary transitions, but caution is due given the inherent uncertainty of deep phylogenies.
doi:10.12688/f1000research.8737.1
PMCID: PMC4962295  PMID: 27508073
Horizontal gene transfer; prokaryotes; evolutionary transitions; microbial evolution; statistical tree of life
15.  Virus-host arms race at the joint origin of multicellularity and programmed cell death 
Cell Cycle  2014;13(19):3083-3088.
Unicellular eukaryotes and most prokaryotes possess distinct mechanisms of programmed cell death (PCD). How could an “altruistic” trait, such as PCD, evolve in unicellular organisms? To address this question, we developed a mathematical model of the virus-host co-evolution that involves interaction between immunity, PCD and cellular aggregation. Analysis of the parameter space of this model shows that under high virus load and imperfect immunity, joint evolution of cell aggregation and PCD is the optimal evolutionary strategy. Given the abundance of viruses in diverse habitats and the wide spread of PCD in most organisms, these findings imply that multiple instances of the emergence of multicellularity and its essential attribute, PCD, could have been driven, at least in part, by the virus-host arms race.
doi:10.4161/15384101.2014.949496
PMCID: PMC4615056  PMID: 25486567
programmed cell death; host-parasite arms race; viruses; evolution of multicellularity
16.  Germline viral “fossils” guide in silico reconstruction of a mid-Cenozoic era marsupial adeno-associated virus 
Scientific Reports  2016;6:28965.
Germline endogenous viral elements (EVEs) genetically preserve viral nucleotide sequences useful to the study of viral evolution, gene mutation, and the phylogenetic relationships among host organisms. Here, we describe a lineage-specific, adeno-associated virus (AAV)-derived endogenous viral element (mAAV-EVE1) found within the germline of numerous closely related marsupial species. Molecular screening of a marsupial DNA panel indicated that mAAV-EVE1 occurs specifically within the marsupial suborder Macropodiformes (present-day kangaroos, wallabies, and related macropodoids), to the exclusion of other Diprotodontian lineages. Orthologous mAAV-EVE1 locus sequences from sixteen macropodoid species, representing a speciation history spanning an estimated 30 million years, facilitated compilation of an inferred ancestral sequence that recapitulates the genome of an ancient marsupial AAV that circulated among Australian metatherian fauna sometime during the late Eocene to early Oligocene. In silico gene reconstruction and molecular modelling indicate remarkable conservation of viral structure over a geologic timescale. Characterisation of AAV-EVE loci among disparate species affords insight into AAV evolution and, in the case of macropodoid species, may offer an additional genetic basis for assignment of phylogenetic relationships among the Macropodoidea. From an applied perspective, the identified AAV “fossils” provide novel capsid sequences for use in translational research and clinical applications.
doi:10.1038/srep28965
PMCID: PMC4932596  PMID: 27377618
17.  Are There Laws of Genome Evolution? 
PLoS Computational Biology  2011;7(8):e1002173.
Research in quantitative evolutionary genomics and systems biology led to the discovery of several universal regularities connecting genomic and molecular phenomic variables. These universals include the log-normal distribution of the evolutionary rates of orthologous genes; the power law–like distributions of paralogous family size and node degree in various biological networks; the negative correlation between a gene's sequence evolution rate and expression level; and differential scaling of functional classes of genes with genome size. The universals of genome evolution can be accounted for by simple mathematical models similar to those used in statistical physics, such as the birth-death-innovation model. These models do not explicitly incorporate selection; therefore, the observed universal regularities do not appear to be shaped by selection but rather are emergent properties of gene ensembles. Although a complete physical theory of evolutionary biology is inconceivable, the universals of genome evolution might qualify as “laws of evolutionary genomics” in the same sense that “law” is understood in modern physics.
doi:10.1371/journal.pcbi.1002173
PMCID: PMC3161903  PMID: 21901087
18.  Crystal structure of Cpf1 in complex with guide RNA and target DNA 
Cell  2016;165(4):949-962.
Cpf1 is an RNA-guided endonuclease of a type V CRISPR-Cas system that has been recently harnessed for genome editing. Here, we report the crystal structure of Acidaminococcus sp. Cpf1 (AsCpf1) in complex with the guide RNA and its target DNA, at 2.8 Å resolution. AsCpf1 adopts a bilobed architecture, with the RNA–DNA heteroduplex bound inside the central channel. The structural comparison of AsCpf1 with Cas9, a type II CRISPR-Cas nuclease, reveals both striking similarity and major differences, thereby explaining their distinct functionalities. AsCpf1 contains the RuvC domain and a putative novel nuclease domain, which are responsible for the cleavage of the non-target and target strands, respectively, and jointly generate staggered DNA double-strand breaks. AsCpf1 recognizes the 5′-TTTN-3′ protospacer adjacent motif by base and shape readout mechanisms. Our findings provide mechanistic insights into RNA-guided DNA cleavage by Cpf1, and establish a framework for rational engineering of the CRISPR-Cpf1 toolbox.
doi:10.1016/j.cell.2016.04.003
PMCID: PMC4899970  PMID: 27114038
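Entry 18 states that AsCpf1 recognizes a 5′-TTTN-3′ protospacer adjacent motif (PAM), and entry 7 describes Cpf1 interference as 5′-PAM-dependent, so candidate target sites lie immediately 3′ of each TTTN occurrence. The following sketch scans one strand of a DNA sequence for such sites. It is an illustration under stated assumptions, not a guide-design tool: the function name is hypothetical, the default protospacer length of 20 nt is an arbitrary choice for the example rather than a value from these abstracts, and the reverse strand is not searched.

```python
import re

# Lookahead so that overlapping PAMs (e.g. in "TTTTG") are all reported.
PAM_PATTERN = re.compile(r"(?=(TTT[ACGT]))")


def find_pam_sites(seq, spacer_len=20):
    """List (pam_start, pam, protospacer) for each 5'-TTTN-3' PAM on the given strand.

    The protospacer is taken as the `spacer_len` bases immediately 3' of the PAM.
    Sites too close to the end of the sequence to hold a full protospacer are skipped.
    `spacer_len` is an illustrative assumption, not a value from the source.
    """
    seq = seq.upper()
    sites = []
    for match in PAM_PATTERN.finditer(seq):
        start = match.start()
        protospacer = seq[start + 4:start + 4 + spacer_len]
        if len(protospacer) == spacer_len:
            sites.append((start, match.group(1), protospacer))
    return sites


if __name__ == "__main__":
    for site in find_pam_sites("GGTTTACCCCCGGGGG", spacer_len=5):
        print(site)
```

A fuller version would also scan the reverse complement, since a PAM on either strand can define a target.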
19.  Diversity and Evolution of Type IV pili Systems in Archaea 
Many surface structures in archaea including various types of pili and the archaellum (archaeal flagellum) are homologous to bacterial type IV pili systems (T4P). The T4P consist of multiple proteins, often with poorly conserved sequences, complicating their identification in sequenced genomes. Here we report a comprehensive census of T4P encoded in archaeal genomes using sensitive methods for protein sequence comparison. This analysis confidently identifies as T4P components about 5000 archaeal gene products, 56% of which are currently annotated as hypothetical in public databases. Combining results of this analysis with a comprehensive comparison of genomic neighborhoods of the T4P, we present models of organization of the 10 most abundant variants of archaeal T4P. In addition to the differentiation between major and minor pilins, these models include extra components, such as S-layer proteins, adhesins and other membrane and intracellular proteins. For most of these systems, dedicated major pilin families are identified including numerous stand-alone major pilin genes of the PilA family. Evidence is presented that secretion ATPases of the T4P and cognate TadC proteins can interact with different pilin sets. Modular evolution of T4P results in combinatorial variability of these systems. Potential regulatory or modulating proteins for the T4P are identified including KaiC family ATPases, vWA domain-containing proteins and the associated MoxR/GvpN ATPase, TFIIB homologs and multiple unrelated transcription regulators, some of which are associated with specific T4P. Phylogenomic analysis suggests that at least one T4P system was present in the last common ancestor of the extant archaea. Multiple cases of horizontal transfer and lineage-specific duplication of T4P loci were detected. Generally, the T4P of the archaeal TACK superphylum are more diverse and evolve notably faster than those of euryarchaea.
The abundance and enormous diversity of T4P in hyperthermophilic archaea present a major enigma. Apparently, fundamental aspects of the biology of hyperthermophiles remain to be elucidated.
doi:10.3389/fmicb.2016.00667
PMCID: PMC4858521  PMID: 27199977
type IV pili; archaea; evolution; comparative genomics; secretion ATPase
20.  Virus World as an Evolutionary Network of Viruses and Capsidless Selfish Elements 
SUMMARY
Viruses were defined as one of the two principal types of organisms in the biosphere, namely, as capsid-encoding organisms in contrast to ribosome-encoding organisms, i.e., all cellular life forms. Structurally similar, apparently homologous capsids are present in a huge variety of icosahedral viruses that infect bacteria, archaea, and eukaryotes. These findings prompted the concept of the capsid as the virus “self” that defines the identity of deep, ancient viral lineages. However, several other widespread viral “hallmark genes” encode key components of the viral replication apparatus (such as polymerases and helicases) and combine with different capsid proteins, given the inherently modular character of viral evolution. Furthermore, diverse, widespread, capsidless selfish genetic elements, such as plasmids and various types of transposons, share hallmark genes with viruses. Viruses appear to have evolved from capsidless selfish elements, and vice versa, on multiple occasions during evolution. At the earliest, precellular stage of life's evolution, capsidless genetic parasites most likely emerged first and subsequently gave rise to different classes of viruses. In this review, we develop the concept of a greater virus world which forms an evolutionary network that is held together by shared conserved genes and includes both bona fide capsid-encoding viruses and different classes of capsidless replicons. Theoretical studies indicate that selfish replicons (genetic parasites) inevitably emerge in any sufficiently complex evolving ensemble of replicators. Therefore, the key signature of the greater virus world is not the presence of a capsid but rather genetic, informational parasitism itself, i.e., various degrees of reliance on the information processing systems of the host.
doi:10.1128/MMBR.00049-13
PMCID: PMC4054253  PMID: 24847023
21.  Universal distribution of mutational effects on protein stability, uncoupling of protein robustness from sequence evolution and distinct evolutionary modes of prokaryotic and eukaryotic proteins 
Physical biology  2015;12(3):035001.
Robustness to destabilizing effects of mutations is thought of as a key factor of protein evolution. The connections between two measures of robustness, the relative core size and the computationally estimated effect of mutations on protein stability (ΔΔG), protein abundance and the selection pressure on protein-coding genes (dN/dS) were analyzed for the organisms with a large number of available protein structures including four eukaryotes, two bacteria and one archaeon. The distribution of the effects of mutations in the core on protein stability is universal and indistinguishable in eukaryotes and bacteria, centered at slightly destabilizing amino acid replacements, and with a heavy tail of more strongly destabilizing replacements. The distribution of mutational effects in the hyperthermophilic archaeon Thermococcus gammatolerans is significantly shifted toward strongly destabilizing replacements which is indicative of stronger constraints that are imposed on proteins in hyperthermophiles. The median effect of mutations is strongly, positively correlated with the relative core size, in evidence of the congruence between the two measures of protein robustness. However, both measures show only limited correlations to the expression level and selection pressure on protein-coding genes. Thus, the degree of robustness reflected in the universal distribution of mutational effects appears to be a fundamental, ancient feature of globular protein folds whereas the observed variations are largely neutral and uncoupled from short term protein evolution. A weak anticorrelation between protein core size and selection pressure is observed only for surface residues in prokaryotes but a stronger anticorrelation is observed for all residues in eukaryotic proteins. This substantial difference between proteins of prokaryotes and eukaryotes is likely to stem from the demonstrable higher compactness of prokaryotic proteins.
doi:10.1088/1478-3975/12/3/035001
PMCID: PMC4770899  PMID: 25927823
22.  Evolution of double-stranded DNA viruses of eukaryotes: from bacteriophages to transposons to giant viruses 
Diverse eukaryotes including animals and protists are hosts to a broad variety of viruses with double-stranded (ds) DNA genomes, from the largest known viruses, such as pandoraviruses and mimiviruses, to tiny polyomaviruses. Recent comparative genomic analyses have revealed many evolutionary connections between dsDNA viruses of eukaryotes, bacteriophages, transposable elements, and linear DNA plasmids. These findings provide an evolutionary scenario that derives several major groups of eukaryotic dsDNA viruses, including the proposed order “Megavirales,” adenoviruses, and virophages from a group of large virus-like transposons known as Polintons (Mavericks). The Polintons have been recently shown to encode two capsid proteins, suggesting that these elements lead a dual lifestyle with both a transposon and a viral phase and should perhaps more appropriately be named polintoviruses. Here, we describe the recently identified evolutionary relationships between bacteriophages of the family Tectiviridae, polintoviruses, adenoviruses, virophages, large and giant DNA viruses of eukaryotes of the proposed order “Megavirales,” and linear mitochondrial and cytoplasmic plasmids. We outline an evolutionary scenario under which the polintoviruses were the first group of eukaryotic dsDNA viruses that evolved from bacteriophages and became the ancestors of most large DNA viruses of eukaryotes and a variety of other selfish elements. Distinct lines of origin are detectable only for herpesviruses (from a different bacteriophage root) and polyoma/papillomaviruses (from single-stranded DNA viruses and ultimately from plasmids). Phylogenomic analysis of giant viruses provides compelling evidence of their independent origins from smaller members of the putative order “Megavirales,” refuting the speculations on the evolution of these viruses from an extinct fourth domain of cellular life.
doi:10.1111/nyas.12728
PMCID: PMC4405056  PMID: 25727355
Polintons; Megavirales; virus evolution; capsid proteins; translation
23.  The Dispersed Archaeal Eukaryome and the Complex Archaeal Ancestor of Eukaryotes 
The ancestral set of eukaryotic genes is a chimera composed of genes of archaeal and bacterial origins thanks to the endosymbiosis event that gave rise to the mitochondria and apparently antedated the last common ancestor of the extant eukaryotes. The proto-mitochondrial endosymbiont is confidently identified as an α-proteobacterium. In contrast, the archaeal ancestor of eukaryotes remains elusive, although evidence is accumulating that it could have belonged to a deep lineage within the TACK (Thaumarchaeota, Aigarchaeota, Crenarchaeota, Korarchaeota) superphylum of the Archaea. Recent surveys of archaeal genomes show that the apparent ancestors of several key functional systems of eukaryotes, the components of the archaeal “eukaryome,” such as ubiquitin signaling, RNA interference, and actin-based and tubulin-based cytoskeleton structures, are identifiable in different archaeal groups. We suggest that the archaeal ancestor of eukaryotes was a complex form, rooted deeply within the TACK superphylum, that already possessed some quintessential eukaryotic features, in particular, a cytoskeleton, and perhaps was capable of a primitive form of phagocytosis that would facilitate the engulfment of potential symbionts. This putative group of Archaea could have existed for a relatively short time before going extinct or undergoing genome streamlining, resulting in the dispersion of the eukaryome. This scenario might explain the difficulty with the identification of the archaeal ancestor of eukaryotes despite the straightforward detection of apparent ancestors to many signature eukaryotic functional systems.
The apparent ancestors of key eukaryotic features (e.g., ubiquitin signaling, RNA interference, and cytoskeletal structures) are identifiable in different Archaea. But the specific archaeal ancestor of eukaryotes remains elusive.
doi:10.1101/cshperspect.a016188
PMCID: PMC3970416  PMID: 24691961
24.  Classification of prokaryotic genetic replicators: between selfishness and altruism 
Prokaryotes harbor a variety of genetic replicators, including plasmids, viruses, and chromosomes, each having differing effects on the phenotype of the hosting cell. Here, we propose a classification for replicators of bacteria and archaea on the basis of their horizontal-transfer potential and the type of relationships (mutualistic, symbiotic, commensal, or parasitic) that they have with the host cell vehicle. Horizontal movement of replicators can be either active or passive, reflecting whether or not the replicator encodes the means to mediate its own transfer from one cell to another. Some replicators also have an infectious extracellular state, thus separating viruses from other mobile elements. From the perspective of the cell vehicle, the different types of replicators form a continuum from genuinely mutualistic to completely parasitic replicators. This classification provides a general framework for dissecting prokaryotic systems into evolutionarily meaningful components.
doi:10.1111/nyas.12696
PMCID: PMC4390439  PMID: 25703428
bacteria; archaea; prokaryotes; classification; replicators; cell vehicles
25.  Carl Woese's vision of cellular evolution and the domains of life 
RNA Biology  2014;11(3):197-204.
In a series of conceptual articles published around the millennium, Carl Woese emphasized that evolution of cells is the central problem of evolutionary biology, that the three-domain ribosomal tree of life is an essential framework for reconstructing cellular evolution, and that the evolutionary dynamics of functionally distinct cellular systems are fundamentally different, with the information processing systems “crystallizing” earlier than operational systems. The advances of evolutionary genomics over the last decade vindicate major aspects of Woese’s vision. Despite the observations of pervasive horizontal gene transfer among bacteria and archaea, the ribosomal tree of life comes across as a central statistical trend in the “forest” of phylogenetic trees of individual genes, and hence, an appropriate scaffold for evolutionary reconstruction. The evolutionary stability of information processing systems, primarily translation, becomes ever more striking with the accumulation of comparative genomic data indicating that nearly all of the few universal genes encode translation system components. Woese’s view on the fundamental distinctions between the three domains of cellular life also withstands the test of comparative genomics, although his non-acceptance of symbiogenetic scenarios for the origin of eukaryotes might not. Above all, Woese’s key prediction that understanding evolution of microbes will be the core of the new evolutionary biology appears to be materializing.
doi:10.4161/rna.27673
PMCID: PMC4008548  PMID: 24572480
Darwinian threshold; cellular evolution; domains of life; evolutionary transitions; horizontal gene transfer; progenote
