Related Articles

1.  Temporal order of evolution of DNA replication systems inferred by comparison of cellular and viral DNA polymerases 
Biology Direct  2006;1:39.
The core enzymes of the DNA replication systems show striking diversity among cellular life forms and more so among viruses. In particular, and counter-intuitively, given the central role of DNA in all cells and the mechanistic uniformity of replication, the core enzymes of the replication systems of bacteria and archaea (as well as eukaryotes) are unrelated or extremely distantly related. Viruses and plasmids, in addition, possess at least two unique DNA replication systems, namely, the protein-primed and rolling circle modalities of replication. This unexpected diversity makes the origin and evolution of DNA replication systems a particularly challenging and intriguing problem in evolutionary biology.
I propose a specific succession for the emergence of different DNA replication systems, drawing argument from the differences in their representation among viruses and other selfish replicating elements. In a striking pattern, the DNA replication systems of viruses infecting bacteria and eukaryotes are dominated by the archaeal-type B-family DNA polymerase (PolB) whereas the bacterial replicative DNA polymerase (PolC) is present only in a handful of bacteriophage genomes. There is no apparent mechanistic impediment to the involvement of the bacterial-type replication machinery in viral DNA replication. Therefore, I hypothesize that the observed, markedly unequal distribution of the replicative DNA polymerases among the known cellular and viral replication systems has a historical explanation. I propose that, among the two types of DNA replication machineries that are found in extant life forms, the archaeal-type, PolB-based system evolved first and had already given rise to a variety of diverse viruses and other selfish elements before the advent of the bacterial, PolC-based machinery. Conceivably, at that stage of evolution, the niches for DNA-viral reproduction had already been filled with viruses replicating with the help of the archaeal system, and viruses with the bacterial system never took off. I further suggest that the two other systems of DNA replication, the rolling circle mechanism and the protein-primed mechanism, which are represented in diverse selfish elements, also evolved prior to the emergence of the bacterial replication system. This hypothesis is compatible with the distinct structural affinities of PolB, which has the palm-domain fold shared with reverse transcriptases and RNA-dependent RNA polymerases, and PolC, which has a distinct, unrelated nucleotidyltransferase fold.
I propose that PolB is a descendant of polymerases that were involved in the replication of genetic elements in the RNA-protein world, prior to the emergence of DNA replication. By contrast, PolC might have evolved from an ancient non-templated polymerase, e.g., polyA polymerase. The proposed temporal succession of the evolving DNA replication systems does not depend on the specific scenario adopted for the evolution of cells and viruses, i.e., whether viruses are derived from cells or virus-like elements are thought to originate from a primordial gene pool. However, arguments are presented in favor of the latter scenario as the most parsimonious explanation of the evolution of DNA replication systems.
Comparative analysis of the diversity of genomic strategies and organizations of viruses and cellular life forms has the potential to open windows into the deep past of life's evolution, especially with regard to the origin of genome replication systems. When complemented with information on the evolution of the relevant protein folds, this comparative approach can yield credible scenarios for very early steps of evolution that otherwise appear to be out of reach.
This article was reviewed by Eric Bapteste, Patrick Forterre, and Mark Ragan.
PMCID: PMC1766352  PMID: 17176463
2.  The Biological Big Bang model for the major transitions in evolution 
Biology Direct  2007;2:21.
Major transitions in biological evolution show the same pattern of sudden emergence of diverse forms at a new level of complexity. The relationships between major groups within an emergent new class of biological entities are hard to decipher and do not seem to fit the tree pattern that, following Darwin's original proposal, remains the dominant description of biological evolution. The cases in point include the origin of complex RNA molecules and protein folds; major groups of viruses; archaea and bacteria, and the principal lineages within each of these prokaryotic domains; eukaryotic supergroups; and animal phyla. In each of these pivotal nexuses in life's history, the principal "types" seem to appear rapidly and fully equipped with the signature features of the respective new level of biological organization. No intermediate "grades" or intermediate forms between different types are detectable. Usually, this pattern is attributed to cladogenesis compressed in time, combined with the inevitable erosion of the phylogenetic signal.
I propose that most or all major evolutionary transitions that show the "explosive" pattern of emergence of new types of biological entities correspond to a boundary between two qualitatively distinct evolutionary phases. The first, inflationary phase is characterized by extremely rapid evolution driven by various processes of genetic information exchange, such as horizontal gene transfer, recombination, fusion, fission, and spread of mobile elements. These processes give rise to a vast diversity of forms from which the main classes of entities at the new level of complexity emerge independently, through a sampling process. In the second phase, evolution dramatically slows down, the respective process of genetic information exchange tapers off, and multiple lineages of the new type of entities emerge, each of them evolving in a tree-like fashion from that point on. This biphasic model of evolution incorporates the previously developed concepts of the emergence of protein folds by recombination of small structural units and origin of viruses and cells from a pre-cellular compartmentalized pool of recombining genetic elements. The model is extended to encompass other major transitions. It is proposed that bacterial and archaeal phyla emerged independently from two distinct populations of primordial cells that, originally, possessed leaky membranes, which made the cells prone to rampant gene exchange; and that the eukaryotic supergroups emerged through distinct, secondary endosymbiotic events (as opposed to the primary, mitochondrial endosymbiosis). This biphasic model of evolution is substantially analogous to the scenario of the origin of universes in the eternal inflation version of modern cosmology. Under this model, universes like ours emerge in the infinite multiverse when the eternal process of exponential expansion, known as inflation, ceases in a particular region as a result of false vacuum decay, a first order phase transition process. 
The result is the nucleation of a new universe, which is traditionally denoted Big Bang, although this scenario is radically different from the Big Bang of the traditional model of an expanding universe. Hence I denote the phase transitions at the end of each inflationary epoch in the history of life Biological Big Bangs (BBB).
A Biological Big Bang (BBB) model is proposed for the major transitions in life's evolution. According to this model, each transition is a BBB such that new classes of biological entities emerge at the end of a rapid phase of evolution (inflation) that is characterized by extensive exchange of genetic information which takes distinct forms for different BBBs. The major types of new forms emerge independently, via a sampling process, from the pool of recombining entities of the preceding generation. This process is envisaged as being qualitatively different from tree-pattern cladogenesis.
This article was reviewed by William Martin, Sergei Maslov, and Leonid Mirny.
PMCID: PMC1973067  PMID: 17708768
3.  The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate? 
Biology Direct  2006;1:22.
Ever since the discovery of 'genes in pieces' and mRNA splicing in eukaryotes, origin and evolution of spliceosomal introns have been considered within the conceptual framework of the 'introns early' versus 'introns late' debate. The 'introns early' hypothesis, which is closely linked to the so-called exon theory of gene evolution, posits that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. Under this scenario, the absence of spliceosomal introns in prokaryotes is considered to be a result of "genome streamlining". The 'introns late' hypothesis counters that spliceosomal introns emerged only in eukaryotes, and moreover, have been inserted into protein-coding genes continuously throughout the evolution of eukaryotes. Beyond the formal dilemma, the more substantial side of this debate has to do with possible roles of introns in the evolution of eukaryotes.
I argue that several lines of evidence now suggest a coherent solution to the introns-early versus introns-late debate, and the emerging picture of intron evolution integrates aspects of both views although, formally, there seems to be no support for the original version of introns-early. Firstly, there is growing evidence that spliceosomal introns evolved from group II self-splicing introns which are present, usually, in small numbers, in many bacteria, and probably, moved into the evolving eukaryotic genome from the α-proteobacterial progenitor of the mitochondria. Secondly, the concept of a primordial pool of 'virus-like' genetic elements implies that self-splicing introns are among the most ancient genetic entities. Thirdly, reconstructions of the ancestral state of eukaryotic genes suggest that the last common ancestor of extant eukaryotes had an intron-rich genome. Thus, it appears that ancestors of spliceosomal introns, indeed, have existed since the earliest stages of life's evolution, in a formal agreement with the introns-early scenario. However, there is no evidence that these ancient introns ever became widespread before the emergence of eukaryotes, hence, the central tenet of introns-early, the role of introns in early evolution of proteins, has no support. However, the demonstration that numerous introns invaded eukaryotic genes at the outset of eukaryotic evolution and that subsequent intron gain has been limited in many eukaryotic lineages implicates introns as an ancestral feature of eukaryotic genomes and refutes radical versions of introns-late. Perhaps, most importantly, I argue that the intron invasion triggered other pivotal events of eukaryogenesis, including the emergence of the spliceosome, the nucleus, the linear chromosomes, the telomerase, and the ubiquitin signaling system. This concept of eukaryogenesis, in a sense, revives some tenets of the exon hypothesis, by assigning to introns crucial roles in eukaryotic evolutionary innovation.
The scenario of the origin and evolution of introns that is best compatible with the results of comparative genomics and theoretical considerations goes as follows: self-splicing introns since the earliest stages of life's evolution – numerous spliceosomal introns invading genes of the emerging eukaryote during eukaryogenesis – subsequent lineage-specific loss and gain of introns. The intron invasion, probably, spawned by the mitochondrial endosymbiont, might have critically contributed to the emergence of the principal features of the eukaryotic cell. This scenario combines aspects of the introns-early and introns-late views.
This article was reviewed by W. Ford Doolittle, James Darnell (nominated by W. Ford Doolittle), William Martin, and Anthony Poole.
PMCID: PMC1570339  PMID: 16907971
4.  On the Origin of Cells and Viruses: Primordial Virus World Scenario 
It is proposed that the pre-cellular stage of biological evolution unraveled within networks of inorganic compartments that harbored a diverse mix of virus-like genetic elements. This stage of evolution might comprise the Last Universal Cellular Ancestor (LUCA) that more appropriately could be denoted Last Universal Cellular Ancestral State (LUCAS). This scenario for the origin of cellular life recapitulates the early ideas of J. B. S. Haldane sketched in his classic 1928 essay. However, unlike in Haldane’s day, there is now considerable support for this scenario from four major lines of comparative-genomic evidence: i) lack of homology between the core components of the DNA replication systems of the two primary lines of descent of cellular life forms, archaea and bacteria, ii) distinct membrane chemistries and lack of homology between the enzymes of lipid biosynthesis in archaea and bacteria, iii) spread of several viral hallmark genes, which encode proteins with key functions in viral replication and morphogenesis, among numerous and extremely diverse groups of viruses, in contrast to their absence in cellular life forms, and iv) the apparent shaping of the extant archaeal and bacterial chromosomes by accretion of diverse, smaller replicons, suggesting a continuity between the hypothetical, primordial virus stage of life’s evolution and the dynamic prokaryotic world that has existed ever since. Under the viral model of pre-cellular evolution, the key components of cells including the replication apparatus, membranes, and molecular complexes involved in membrane transport and translocation originated as components of virus-like entities. The two surviving types of cellular life forms, archaea and bacteria, might have emerged from the LUCAS independently, along with, probably, numerous forms now extinct.
PMCID: PMC3380365  PMID: 19845627
comparative genomics; evolution of cells; evolution of viruses; origin of membranes; viral hallmark genes
5.  Evolution of microbes and viruses: a paradigm shift in evolutionary biology? 
When Charles Darwin formulated the central principles of evolutionary biology in the Origin of Species in 1859 and the architects of the Modern Synthesis integrated these principles with population genetics almost a century later, the principal if not the sole objects of evolutionary biology were multicellular eukaryotes, primarily animals and plants. Before the advent of efficient gene sequencing, all attempts to extend evolutionary studies to bacteria were futile. Sequencing of the rRNA genes in thousands of microbes allowed the construction of the three-domain “ribosomal Tree of Life” that was widely thought to have resolved the evolutionary relationships between the cellular life forms. However, subsequent massive sequencing of numerous, complete microbial genomes revealed novel evolutionary phenomena, the most fundamental of these being: (1) pervasive horizontal gene transfer (HGT), in large part mediated by viruses and plasmids, that shapes the genomes of archaea and bacteria and calls for a radical revision (if not abandonment) of the Tree of Life concept, (2) Lamarckian-type inheritance that appears to be critical for antivirus defense and other forms of adaptation in prokaryotes, and (3) evolution of evolvability, i.e., dedicated mechanisms for evolution such as vehicles for HGT and stress-induced mutagenesis systems. In the non-cellular part of the microbial world, phylogenomics and metagenomics of viruses and related selfish genetic elements revealed enormous genetic and molecular diversity and extremely high abundance of viruses that come across as the dominant biological entities on earth. Furthermore, the perennial arms race between viruses and their hosts is one of the defining factors of evolution. Thus, microbial phylogenomics adds new dimensions to the fundamental picture of evolution even as the principle of descent with modification discovered by Darwin and the laws of population genetics remain at the core of evolutionary biology.
PMCID: PMC3440604  PMID: 22993722
Darwin; modern synthesis; comparative genomics; tree of life; horizontal gene transfer
6.  A virocentric perspective on the evolution of life 
Current opinion in virology  2013;3(5):546-557.
Viruses and/or virus-like selfish elements are associated with all cellular life forms and are the most abundant biological entities on Earth, with the number of virus particles in many environments exceeding the number of cells by one to two orders of magnitude. The genetic diversity of viruses is commensurately enormous and might substantially exceed the diversity of cellular organisms. Unlike cellular organisms with their uniform replication-expression scheme, viruses possess either RNA or DNA genomes and exploit all conceivable replication-expression strategies. Although viruses extensively exchange genes with their hosts, there exists a set of viral hallmark genes that are shared by extremely diverse groups of viruses to the exclusion of cellular life forms. Coevolution of viruses and host defense systems is a key aspect in the evolution of both viruses and cells, and viral genes are often recruited for cellular functions. Together with the fundamental inevitability of the emergence of genomic parasites in any evolving replicator system, these multiple lines of evidence reveal the central role of viruses in the entire evolution of life.
PMCID: PMC4326007  PMID: 23850169
7.  Virophages, polintons, and transpovirons: a complex evolutionary network of diverse selfish genetic elements with different reproduction strategies 
Virology Journal  2013;10:158.
Recent advances in genomics and metagenomics reveal remarkable diversity of viruses and other selfish genetic elements. In particular, giant viruses have been shown to possess their own mobilomes that include virophages, small viruses that parasitize giant viruses of the Mimiviridae family, and transpovirons, distinct linear plasmids. One of the virophages, known as the Mavirus, a parasite of the giant Cafeteria roenbergensis virus, shares several genes with large eukaryotic self-replicating transposons of the Polinton (Maverick) family, and it has been proposed that the polintons evolved from a Mavirus-like ancestor.
We performed a comprehensive phylogenomic analysis of the available genomes of virophages and traced the evolutionary connections between the virophages and other selfish genetic elements. The comparison of the gene composition and genome organization of the virophages reveals 6 conserved, core genes that are organized in partially conserved arrays. Phylogenetic analysis of those core virophage genes, for which a sufficient diversity of homologs outside the virophages was detected, including the maturation protease and the packaging ATPase, supports the monophyly of the virophages. The results of this analysis appear incompatible with the origin of polintons from a Mavirus-like agent but rather suggest that Mavirus evolved through recombination between a polinton and an unknown virus. Altogether, virophages, polintons, a distinct Tetrahymena transposable element Tlr1, transpovirons, adenoviruses, and some bacteriophages form a network of evolutionary relationships that is held together by overlapping sets of shared genes and appears to represent a distinct module in the vast total network of viruses and mobile elements.
The results of the phylogenomic analysis of the virophages and related genetic elements are compatible with the concept of network-like evolution of the virus world and emphasize multiple evolutionary connections between bona fide viruses and other classes of capsid-less mobile elements.
PMCID: PMC3671162  PMID: 23701946
8.  Modeling the Worldwide Spread of Pandemic Influenza: Baseline Case and Containment Interventions 
PLoS Medicine  2007;4(1):e13.
The highly pathogenic H5N1 avian influenza virus, which is now widespread in Southeast Asia and which diffused recently in some areas of the Balkans region and Western Europe, has raised a public alert toward the potential occurrence of a new severe influenza pandemic. Here we study the worldwide spread of a pandemic and its possible containment at a global level taking into account all available information on air travel.
Methods and Findings
We studied a metapopulation stochastic epidemic model on a global scale that considers airline travel flow data among urban areas. We provided a temporal and spatial evolution of the pandemic with a sensitivity analysis of different levels of infectiousness of the virus and initial outbreak conditions (both geographical and seasonal). For each spreading scenario we provided the timeline and the geographical impact of the pandemic in 3,100 urban areas, located in 220 different countries. We compared the baseline cases with different containment strategies, including travel restrictions and the therapeutic use of antiviral (AV) drugs. We investigated the effect of the use of AV drugs in the event that therapeutic protocols can be carried out with maximal coverage for the populations in all countries. In view of the wide diversity of AV stockpiles in different regions of the world, we also studied scenarios in which only a limited number of countries are prepared (i.e., have considerable AV supplies). In particular, we compared different plans in which, on the one hand, only prepared and wealthy countries benefit from large AV resources, with, on the other hand, cooperative containment scenarios in which countries with large AV stockpiles make a small portion of their supplies available worldwide.
We show that the inclusion of air transportation is crucial in the assessment of the occurrence probability of global outbreaks. The large-scale therapeutic usage of AV drugs in all hit countries would be able to mitigate a pandemic effect with a reproductive rate as high as 1.9 during the first year, with AV supply use sufficient to treat approximately 2% to 6% of the population, in conjunction with efficient case detection and timely drug distribution. For highly contagious viruses (i.e., a reproductive rate as high as 2.3), even the unrealistic use of supplies corresponding to the treatment of approximately 20% of the population leaves 30%–50% of the population infected. In the case of limited AV supplies and pandemics with a reproductive rate as high as 1.9, we demonstrate that the more cooperative the strategy, the more effective are the containment results in all regions of the world, including those countries that made part of their resources available for global use.
A metapopulation stochastic epidemic model for influenza shows the need to include air transportation when assessing the occurrence probability of global outbreaks. The impact of the use of antiviral drugs is also measured.
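To make the idea of a metapopulation stochastic epidemic model concrete, the following is a minimal sketch, not the model used in the study. It assumes a plain SIR scheme, daily time steps, binomial draws for infection, recovery, and travel, and only two hypothetical cities; the function names, the population sizes, the travel probabilities, and the three-day infectious period are all illustrative assumptions. The study itself uses airline flow data among 3,100 urban areas and a more detailed disease progression.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate(cities, travel, beta, gamma, days, seed=0):
    """Run a toy stochastic SIR metapopulation model in daily steps.

    cities: list of dicts with integer counts under keys "S", "I", "R"
    travel: travel[a][b] is the assumed daily per-person probability of
            moving from city a to city b
    beta, gamma: assumed daily transmission and recovery rates (R0 ~ beta / gamma)
    """
    rng = random.Random(seed)
    cities = [dict(c) for c in cities]  # copy so the caller's data is untouched
    infected_history = []
    for _ in range(days):
        # Step 1: infection and recovery inside each city.
        for c in cities:
            n = c["S"] + c["I"] + c["R"]
            p_inf = 1.0 - math.exp(-beta * c["I"] / n) if n else 0.0
            p_rec = 1.0 - math.exp(-gamma)
            new_inf = binomial(rng, c["S"], p_inf)
            new_rec = binomial(rng, c["I"], p_rec)
            c["S"] -= new_inf
            c["I"] += new_inf - new_rec
            c["R"] += new_rec
        # Step 2: travel, drawn from a snapshot of all cities and then applied.
        moves = []
        for a, row in enumerate(travel):
            for b, rate in enumerate(row):
                if a != b and rate > 0:
                    for comp in ("S", "I", "R"):
                        moves.append((a, b, comp, binomial(rng, cities[a][comp], rate)))
        for a, b, comp, k in moves:
            k = min(k, cities[a][comp])  # never move more people than are present
            cities[a][comp] -= k
            cities[b][comp] += k
        infected_history.append([c["I"] for c in cities])
    return cities, infected_history


if __name__ == "__main__":
    # Hypothetical two-city example: the outbreak starts in city 0 only.
    start = [{"S": 1990, "I": 10, "R": 0}, {"S": 1000, "I": 0, "R": 0}]
    travel = [[0.0, 0.002], [0.004, 0.0]]
    gamma = 1.0 / 3.0  # assumed mean infectious period of three days
    final, history = simulate(start, travel, beta=1.9 * gamma, gamma=gamma, days=120)
    print("final state per city:", final)
    print("peak infected per city:", [max(col) for col in zip(*history)])
```

Setting every off-diagonal entry of `travel` to zero isolates the cities, which gives a simple way to see why the travel term matters for the probability that an outbreak in one city becomes a global one.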
Editors' Summary
Seasonal outbreaks (epidemics) of influenza—a viral infection of the nose, throat, and airways—affect millions of people and kill about 500,000 individuals every year. Regular epidemics occur because flu viruses frequently make small changes in the viral proteins (antigens) recognized by the human immune system. Consequently, a person's immune-system response that combats influenza one year provides incomplete protection the next year. Occasionally, a human influenza virus appears that contains large antigenic changes. People have little immunity to such viruses (which often originate in birds or animals), so they can start a global epidemic (pandemic) that kills millions of people. Experts fear that a human influenza pandemic could be triggered by the avian H5N1 influenza virus, which is present in bird flocks around the world. So far, fewer than 300 people have caught this virus but more than 150 people have died.
Why Was This Study Done?
Avian H5N1 influenza has not yet triggered a human pandemic, because it rarely passes between people. If it does acquire this ability, it would take 6–8 months to develop a vaccine to provide protection against this new, potentially pandemic virus. Public health officials therefore need other strategies to protect people during the first few months of a pandemic. These could include international travel restrictions and the use of antiviral drugs. However, to get the most benefit from these interventions, public-health officials need to understand how influenza pandemics spread, both over time and geographically. In this study, the researchers have used detailed information on air travel to model the global spread of an emerging influenza pandemic and its containment.
What Did the Researchers Do and Find?
The researchers incorporated data on worldwide air travel and census data from urban centers near airports into a mathematical model of the spread of an influenza pandemic. They then used this model to investigate how the spread and health effects of a pandemic flu virus depend on the season in which it emerges (influenza virus thrives best in winter), where it emerges, and how infectious it is. Their model predicts, for example, that a flu virus originating in Hanoi, Vietnam, with a reproductive number (R0) of 1.1 (a measure of how many people an infectious individual infects on average) poses a very mild global threat. However, epidemics initiated by a virus with an R0 of more than 1.5 would often infect half the population in more than 100 countries. Next, the researchers used their model to show that strict travel restrictions would have little effect on pandemic evolution. More encouragingly, their model predicts that antiviral drugs would mitigate pandemics of a virus with an R0 up to 1.9 if every country had an antiviral drug stockpile sufficient to treat 5% of its population; if the R0 was 2.3 or higher, the pandemic would not be contained even if 20% of the population could be treated. Finally, the researchers considered a realistic scenario in which only a few countries possess antiviral stockpiles. In these circumstances, compared with a “selfish” strategy in which countries only use their antiviral drugs within their borders, limited worldwide sharing of antiviral drugs would slow down the spread of a flu virus with an R0 of 1.9 by more than a year and would benefit both drug donors and recipients.
What Do These Findings Mean?
Like all mathematical models, this model for the global spread of an emerging pandemic influenza virus contains many assumptions (for example, about viral behavior) that might affect the accuracy of its predictions. The model also does not consider variations in travel frequency between individuals or viral spread in rural areas. Nevertheless, the model provides the most extensive global simulation of pandemic influenza spread to date. Reassuringly, it suggests that an emerging virus with a low R0 would not pose a major public-health threat, since its attack rate would be limited and would not peak for more than a year, by which time a vaccine could be developed. Most importantly, the model suggests that cooperative sharing of antiviral drugs, which could be organized by the World Health Organization, might be the best way to deal with an emerging influenza pandemic.
Additional Information.
Please access these Web sites via the online version of this summary at
The US Centers for Disease Control and Prevention has information about influenza for patients and professionals, including key facts about avian influenza and antiviral drugs
The US National Institute of Allergy and Infectious Disease features information on seasonal, avian, and pandemic flu
The US Department of Health and Human Services provides information on pandemic flu and avian flu, including advice to travelers
World Health Organization has fact sheets on influenza and avian influenza, including advice to travelers and current pandemic flu threat
The UK Health Protection Agency has information on seasonal, avian, and pandemic influenza
The UK Department of Health has a feature article on bird flu and pandemic influenza
PMCID: PMC1779816  PMID: 17253899
9.  Selfishness, warfare, and economics; or integration, cooperation, and biology 
Darwin's theory of evolution by natural selection is not universally accepted, and its limited ability to explain the complex processes that constitute the transformation of species has been pointed out. It is necessary to discuss the explanatory power of the dominant paradigm. It is common that new discoveries bring about contradictions that are intended to be overcome by adjusting results to the dominant reductionist paradigm using all sorts of gradations and combinations that are admitted for each case. In addition to the discussion on the validity of natural selection, modern findings represent a challenge to the interpretation of the observations with the Darwinian view of competition and struggle for life as theoretical basis. New holistic interpretations are emerging related to the Net of Life, in which the interconnection of ecosystems constitutes a dynamic and self-regulating biosphere: viruses are recognized as a macroorganism with a huge collection of genes, most of them unknown, that constitute the planet's major gene pool. They play a fundamental role in evolution since their sequences are capable of integrating into the genomes in an “infective” way and become an essential part of multicellular organisms. They have content with “biological sense”, i.e., they appear as part of normal life processes and have a serious role as carrier elements of complex genetic information. Antibiotics are cell signals with main effects on general metabolism and transcription in bacterial cells and communities. The hologenome theory considers an organism and all of its associated symbiotic microbes (parasites, mutualists, synergists, amensalists) as a result of symbiopoiesis. Microbes and helminths that are normally understood as parasites are cohabitants: they have cohabited with their hosts and drive the evolution and existence of the partners. Each organism is the result of integration of complex systems.
The eukaryotic organism is the result of a combination of bacterial, viral, and eukaryotic DNA and of the interaction of its own genome with the genome of its microbiota, and their metabolisms have been intertwined (as a “superorganism”) throughout evolution. The Darwinian paradigm had its origin in the free market theories and concepts of Malthus and Spencer. Then, nature was explained on the basis of market theories, moving away from an accurate explanation of natural phenomena. It is necessary to acknowledge the limitations of the dominant dogma. These new interpretations about biological processes, molecules, roles of viruses in nature, and microbial interactions are remarkable points to be considered in order to construct a solid theory adjusted to the facts and with fewer speculations and tortuous semantic traps.
PMCID: PMC3417387  PMID: 22919645
Darwinism; natural selection; evolution; paradigm; virus; hologenome; autopoiesis
10.  On the origin of the translation system and the genetic code in the RNA world by means of natural selection, exaptation, and subfunctionalization 
Biology Direct  2007;2:14.
The origin of the translation system is, arguably, the central and the hardest problem in the study of the origin of life, and one of the hardest in all evolutionary biology. The problem has a clear catch-22 aspect: high translation fidelity can hardly be achieved without a complex, highly evolved set of RNAs and proteins but an elaborate protein machinery could not evolve without an accurate translation system. The origin of the genetic code and whether it evolved on the basis of a stereochemical correspondence between amino acids and their cognate codons (or anticodons), through selectional optimization of the code vocabulary, as a "frozen accident" or via a combination of all these routes is another wide open problem despite extensive theoretical and experimental studies. Here we combine the results of comparative genomics of translation system components, data on interaction of amino acids with their cognate codons and anticodons, and data on catalytic activities of ribozymes to develop conceptual models for the origins of the translation system and the genetic code.
Our main guide in constructing the models is the Darwinian Continuity Principle whereby a scenario for the evolution of a complex system must consist of plausible elementary steps, each conferring a distinct advantage on the evolving ensemble of genetic elements. Evolution of the translation system is envisaged to occur in a compartmentalized ensemble of replicating, co-selected RNA segments, i.e., in an RNA World containing ribozymes with versatile activities. Since evolution has no foresight, the translation system could not evolve in the RNA World as the result of selection for protein synthesis and must have been a by-product of evolution driven by selection for another function, i.e., the translation system evolved via the exaptation route. It is proposed that the evolutionary process that eventually led to the emergence of translation started with the selection for ribozymes binding abiogenic amino acids that stimulated ribozyme-catalyzed reactions. The proposed scenario for the evolution of translation consists of the following steps: binding of amino acids to a ribozyme resulting in an enhancement of its catalytic activity; evolution of the amino-acid-stimulated ribozyme into a peptide ligase (predecessor of the large ribosomal subunit) yielding, initially, a unique peptide activating the original ribozyme and, possibly, other ribozymes in the ensemble; evolution of self-charging proto-tRNAs that were selected, initially, for accumulation of amino acids, and subsequently, for delivery of amino acids to the peptide ligase; joining of the peptide ligase with a distinct RNA molecule (predecessor of the small ribosomal subunit) carrying a built-in template for more efficient, complementary binding of charged proto-tRNAs; evolution of the ability of the peptide ligase to assemble peptides using exogenous RNAs as template for complementary binding of charged proto-tRNAs, yielding peptides with the potential to activate different ribozymes; evolution of the translocation function of the protoribosome leading to the production of increasingly longer peptides (the first proteins), i.e., the origin of translation. The specifics of the recognition of amino acids by proto-tRNAs and the origin of the genetic code depend on whether or not there is a physical affinity between amino acids and their cognate codons or anticodons, a problem that remains unresolved.
We describe a stepwise model for the origin of the translation system in the ancient RNA world such that each step confers a distinct advantage onto an ensemble of co-evolving genetic elements. Under this scenario, the primary cause for the emergence of translation was the ability of amino acids and peptides to stimulate reactions catalyzed by ribozymes. Thus, the translation system might have evolved as the result of selection for ribozymes capable of, initially, efficient amino acid binding, and subsequently, synthesis of increasingly versatile peptides. Several aspects of this scenario are amenable to experimental testing.
This article was reviewed by Rob Knight, Doron Lancet, Alexander Mankin (nominated by Arcady Mushegian), and Arcady Mushegian.
PMCID: PMC1894784  PMID: 17540026
11.  Dual Host-Virus Arms Races Shape an Essential Housekeeping Protein 
PLoS Biology  2013;11(5):e1001571.
Relentless selective pressures exerted by viruses trigger arms race dynamics that shape the evolution of even critical host genes like those involved in iron homeostasis.
Transferrin Receptor (TfR1) is the cell-surface receptor that regulates iron uptake into cells, a process that is fundamental to life. However, TfR1 also facilitates the cellular entry of multiple mammalian viruses. We use evolutionary and functional analyses of TfR1 in the rodent clade, where two families of viruses bind this receptor, to mechanistically dissect how essential housekeeping genes like TFR1 successfully balance the opposing selective pressures exerted by host and virus. We find that while the sequence of rodent TfR1 is generally conserved, a small set of TfR1 residue positions has evolved rapidly over the speciation of rodents. Remarkably, all of these residues correspond to the two virus binding surfaces of TfR1. We show that naturally occurring mutations at these positions block virus entry while simultaneously preserving iron-uptake functionalities, both in rodent and human TfR1. Thus, by constantly replacing the amino acids encoded at just a few residue positions, TFR1 divorces adaptation to ever-changing viruses from preservation of key cellular functions. These dynamics have driven genetic divergence at the TFR1 locus that now enforces species-specific barriers to virus transmission, limiting both the cross-species and zoonotic transmission of these viruses.
Author Summary
Genetic differences between mammalian species dictate the patterns of viral infection observed in nature. They also define how viruses must evolve in order to infect new mammalian hosts, giving rise to new and sometimes pandemic diseases. Because viruses must enter cells before they can replicate, new diseases often emerge when existing viruses evolve the ability to bind to the cell-surface receptor of a new species. At the same time, host cell receptors also evolve to counteract virus attacks. This back-and-forth evolution between virus and host can lead to an arms race that shapes the sequences of the proteins involved. In wild rodent populations, the retrovirus MMTV and New World arenaviruses both exploit Transferrin Receptor 1 (TfR1) to enter the cells of their hosts. Here we show that the physical interactions between these viruses and TfR1 have triggered evolutionary arms race dynamics that have directly modified the sequence of TfR1 and at least one of the viruses involved. Computational evolutionary analysis allowed us to identify specific residues in TfR1 that define patterns of viral infection in nature. The approach presented here can theoretically be applied to the study of any virus, through analysis of host genes known to be key to controlling viral infection. As such, this approach can expand our understanding of how viruses emerge from wildlife reservoirs, and how they drive the evolution of host genes.
PMCID: PMC3665890  PMID: 23723737
12.  Molecular characterization of the evolution of phagosomes 
First large-scale comparative proteomics/phosphoproteomics study characterizing some of the key steps that contributed to the remodeling of phagosomes that occurred during evolution. Comparison of profiling analyses of isolated phagosomes from three distant organisms (Dictyostelium, Drosophila, and mouse) revealed a protein core that defines a potential ‘ancient’ phagosome and a set of 50 proteins that emerged while adaptive immunity was already well established. Gene duplication events of mouse phagosome paralogs occurred mostly in Bilateria and Euteleostomi, coinciding with the emergence of innate and adaptive immunity, and thus, provided the functional innovations needed for the establishment of these two crucial evolutionary steps of the immune system. Phosphoproteomics of isolated phagosomes from the same three distant species indicate that the phagosome phosphoproteome has been extensively modified during evolution. Still, some phosphosites have been maintained for >1.2 billion years, and thus, highlight their particular significance in the regulation of key phagosomal functions.
Phagocytosis is the process by which multiple cell types internalize large particulate material from the external milieu. The functional properties of phagosomes are acquired through a complex maturation process, referred to as phagolysosome biogenesis. This pathway involves a series of rapid interactions with organelles of the endocytic apparatus, enabling the gradual transformation of newly formed phagosomes into phagolysosomes in which proteolytic degradation occurs. The degradative environment encountered in the phagosome lumen has enabled the use of phagocytosis as a predation mechanism for feeding (phagotrophy) in amoeba, whereas multicellular organisms utilize this process as a defense mechanism to kill microbes and, in jawed vertebrates (fish), initiate a sustained immune response.
High-throughput proteomics profiling of isolated phagosomes has been tremendously helpful for the molecular comprehension of this organelle. This approach is achieved by feeding low buoyancy latex beads to phagocytic cells, enabling the subsequent isolation of latex bead-containing phagosomes, away from all the other cell organelles, by a single isopycnic centrifugation in a sucrose gradient. In order to characterize some of the key steps that contributed to the remodeling of phagosomes during evolution, we isolated this organelle from three distant organisms that use phagocytosis for different purposes: the amoeba Dictyostelium discoideum, the fruit fly Drosophila melanogaster, and the mouse (Mus musculus). We then performed detailed proteomics and phosphoproteomics analyses with unparalleled protein coverage for this organelle (two- to four-fold enhancements in identified proteins).
In order to establish the origin of the mouse phagosome proteome, we performed comparative analyses among 39 taxa including plants/algae, unicellular organisms, fungi, and more complex animal multicellular organisms. These genomic comparisons indicated that a large proportion of the mouse phagosome proteome is of ancient origin (73.1% of the proteome is conserved in eukaryotic organisms) (Figure 2A). This stresses the fact that phagocytosis is a very ancient process, as shown by its possible involvement in the emergence of eukaryotic cells (eukaryogenesis). Indeed, we identified close to 300 mouse phagosome proteins also present on Drosophila and Dictyostelium phagosomes, defining a potential ‘ancient’ core of proteins from which the immune functions of phagosomes likely evolved. Around 16.7% of the mouse phagosome proteins appeared in organisms that use phagocytosis for innate immunity (Bilateria to Chordata), whereas 10.2% appeared in Euteleostomi or Tetrapoda where phagosomes have an important function in linking the killing of microorganisms with the development of a specific sustained immune response following antigen recognition. The phagosome is made of molecules taken from a variety of sources within the cell, including the cytoplasm, the cytoskeleton and membrane organelles. Despite the evolution and diversification of these various cellular systems, the mammalian phagosome proteome is made preferentially of ancient proteins (Figure 2B). Comparison of functional annotation during evolution highlighted the emergence of specific phagosomal functions at various steps during evolution (Figure 2C). Some of these proteins and their point of origin during evolution are highlighted in Figure 2D. Strikingly, we identified in Tetrapods a set of 50 proteins that arose while adaptive immunity was already well established in teleosts (fish), indicating that the phagocytic system is still evolving.
Our study highlights the fact that the functional properties of phagosomes emerged by the remodeling of ancient molecules, the addition of novel components, and the duplication of existing proteins (paralogs) leading to the formation of molecular machines of mixed origin. Gene duplication is a process that contributed continuously to the complexification of the mouse proteome during evolution. In sharp contrast, paralog analysis indicated that the phagosome proteome was mainly reorganized through two periods of gene duplication, in Bilateria and Euteleostomi, coinciding with the emergence of innate immunity (at the split between Metazoa and Bilateria) and adaptive immunity (in jawed fish), respectively. These results strongly suggest that selective constraints may have favored the maintenance of phagosome paralogs to ensure the establishment of novel functions associated with this organelle at these two crucial evolutionary steps of the immune system.
The emergence of genes associated to the MHC locus in mammals that appeared originally in the genome of jawed fishes, contributed to the development of complex molecular mechanisms linking innate (our immune system that defends the host from infection in a non-specific manner) and adaptive immunity (the part of the immune system triggered specifically after antigen recognition). Several of the genes of this locus encode proteins known to have important functions in antigen presentation, such as subunits of the immunoproteasome (LMP2 and LMP7), MHC class I and class II molecules, as well as tapasin and the transporter associated with antigen processing (TAP1 and TAP2), involved in the transport and loading of peptides on MHC class I molecules (Figure 6). In addition to their ability to present peptides on MHC class II molecules, phagosomes of vertebrates have been shown to be competent for the presentation of exogenous peptides on MHC class I molecules, a process referred to as cross-presentation. From a functional point of view, the involvement of phagosomes in antigen cross-presentation is the outcome of the successful integration of a wide range of multimolecular components that emerged throughout evolution (Figure 6). The trimming of exogenous proteins into small peptides that can be loaded on MHC class I molecules is inherited from the phagotrophic properties of unicellular organisms, where internalized bacteria are degraded into basic molecules and used as a source of nutrients. Ancient processes have therefore been co-opted (the use of an existing biological structure or feature for a new function) for new functionalities. A summarizing model of the various steps that enabled phagosome antigen presentation is presented in Figure 6. 
This model highlights the fact that although antigen presentation is unique to evolutionarily recent phagosomes (starting in jawed fishes about 450 million years ago), it uses and integrates molecular machines composed of proteins that emerged throughout evolution.
In summary, we present here the first large-scale comparative proteomics/phosphoproteomics study characterizing some of the key steps that contributed to the remodeling of phagosomes during evolution. Functional properties of this organelle emerged by the remodeling of ancient molecules, the addition of novel components, the extensive adaptation of protein phosphorylation sites and the duplication of existing proteins leading to the formation of molecular machines of mixed origin.
Amoeba use phagocytosis to internalize bacteria as a source of nutrients, whereas multicellular organisms utilize this process as a defense mechanism to kill microbes and, in vertebrates, initiate a sustained immune response. By using a large-scale approach to identify and compare the proteome and phosphoproteome of phagosomes isolated from distant organisms, and by comparative analysis over 39 taxa, we identified an ‘ancient’ core of phagosomal proteins around which the immune functions of this organelle have likely organized. Our data indicate that a larger proportion of the phagosome proteome, compared with the whole cell proteome, has been acquired through gene duplication at a period coinciding with the emergence of innate and adaptive immunity. Our study also characterizes in detail the acquisition of novel proteins and the significant remodeling of the phagosome phosphoproteome that contributed to modify the core constituents of this organelle in evolution. Our work thus provides the first thorough analysis of the changes that enabled the transformation of the phagosome from a phagotrophic compartment into an organelle fully competent for antigen presentation.
PMCID: PMC2990642  PMID: 20959821
evolution; immunity; phosphoproteomics; phylogeny; proteomics
13.  Viral evolution 
Mobile Genetic Elements  2012;2(5):247-252.
Explaining the origin of viruses remains an important challenge for evolutionary biology. Previous explanatory frameworks described viruses as founders of cellular life, as parasitic reductive products of ancient cellular organisms or as escapees of modern genomes. Each of these frameworks endows viruses with distinct molecular, cellular, dynamic and emergent properties that carry broad and important implications for many disciplines, including biology, ecology and epidemiology. In a recent genome-wide structural phylogenomic analysis, we have shown that large-to-medium-sized viruses coevolved with cellular ancestors and have chosen the evolutionary reductive route. Here we interpret these results and provide a parsimonious hypothesis for the origin of viruses that is supported by molecular data and objective evolutionary bioinformatic approaches. Results suggest two important phases in the evolution of viruses: (1) origin from primordial cells and coexistence with cellular ancestors, and (2) prolonged pressure of genome reduction and relatively late adaptation to the parasitic lifestyle once virions and diversified cellular life took over the planet. Under this evolutionary model, new viral lineages can evolve from existing cellular parasites and enhance the diversity of the world’s virosphere.
PMCID: PMC3575434  PMID: 23550145
giant viruses; parasitism; phylogenomics; protein domains; reductive evolution
14.  Origin and evolution of spliceosomal introns 
Biology Direct  2012;7:11.
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. 
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
PMCID: PMC3488318  PMID: 22507701
Intron sliding; Intron gain; Intron loss; Spliceosome; Splicing signals; Evolution of exon/intron structure; Alternative splicing; Phylogenetic trees; Mobile domains; Eukaryotic ancestor
15.  The fundamental units, processes and patterns of evolution, and the Tree of Life conundrum 
Biology Direct  2009;4:33.
The elucidation of the dominant role of horizontal gene transfer (HGT) in the evolution of prokaryotes led to a severe crisis of the Tree of Life (TOL) concept and intense debates on this subject.
Prompted by the crisis of the TOL, we attempt to define the primary units and the fundamental patterns and processes of evolution. We posit that replication of the genetic material is the singular fundamental biological process and that replication with an error rate below a certain threshold both enables and necessitates evolution by drift and selection. Starting from this proposition, we outline a general concept of evolution that consists of three major precepts.
1. The primary agency of evolution consists of Fundamental Units of Evolution (FUEs), that is, units of genetic material that possess a substantial degree of evolutionary independence. The FUEs include both bona fide selfish elements such as viruses, viroids, transposons, and plasmids, which encode some of the information required for their own replication, and regular genes that possess quasi-independence owing to their distinct selective value that provides for their transfer between ensembles of FUEs (genomes) and preferential replication along with the rest of the recipient genome.
2. The history of replication of a genetic element without recombination is isomorphously represented by a directed tree graph (an arborescence, in the graph theory language). Recombination within a FUE is common between very closely related sequences where homologous recombination is feasible but becomes negligible for longer evolutionary distances. In contrast, shuffling of FUEs occurs at all evolutionary distances. Thus, a tree is a natural representation of the evolution of an individual FUE on the macro scale, but not of an ensemble of FUEs such as a genome.
3. The history of life is properly represented by the "forest" of evolutionary trees for individual FUEs (Forest of Life, or FOL). Search for trends and patterns in the FOL is a productive direction of study that leads to the delineation of ensembles of FUEs that evolve coherently for a certain time span owing to a shared history of vertical inheritance or horizontal gene transfer; these ensembles are commonly known as genomes, taxa, or clades, depending on the level of analysis. A small set of genes (the universal genetic core of life) might show a (mostly) coherent evolutionary trend that transcends the entire history of cellular life forms. However, it might not be useful to denote this trend "the tree of life", or organismal, or species tree because neither organisms nor species are fundamental units of life.
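The tree claim in precept 2 can be made concrete with a small check. The following is a minimal sketch, not part of the article: it assumes a replication history is given as hypothetical (parent, child) pairs and tests whether those pairs form an arborescence, that is, a directed graph with a single root in which every other node has exactly one parent and is reachable from the root. The function name and the toy lineages are illustrative assumptions.

```python
from collections import defaultdict


def is_arborescence(edges):
    """Return True if directed (parent, child) edges form a rooted directed tree."""
    nodes = set()
    parents = defaultdict(list)
    children = defaultdict(list)
    for parent, child in edges:
        nodes.update((parent, child))
        parents[child].append(parent)
        children[parent].append(child)

    # Exactly one node may lack a parent: the root.
    roots = [n for n in nodes if not parents[n]]
    if len(roots) != 1:
        return False

    # A node with two parents (e.g. a recombinant) breaks the tree property.
    if any(len(parents[n]) > 1 for n in nodes):
        return False

    # Every node must be reachable from the root.
    seen = set()
    stack = [roots[0]]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children[node])
    return seen == nodes


# Hypothetical clonal history: each copy descends from a single template.
clonal = [("ancestor", "copy1"), ("ancestor", "copy2"), ("copy1", "copy3")]

# Same history, but copy3 also inherits from copy2 (a recombination event).
recombinant = clonal + [("copy2", "copy3")]

print(is_arborescence(clonal))
print(is_arborescence(recombinant))
```

In this toy example the clonal history passes the check, while the history with a recombinant copy fails because one element has two parents, which is why an ensemble of shuffled elements calls for a network or a forest rather than a single tree.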
A logical analysis of the units and processes of biological evolution suggests that the natural fundamental unit of evolution is a FUE, that is, a genetic element with an independent evolutionary history. Evolution of a FUE on the macro scale is naturally represented by a tree. Only the full compendium of trees for individual FUEs (the FOL) is an adequate depiction of the evolution of life. Coherent evolution of FUEs over extended evolutionary intervals is a crucial aspect of the history of life but a "species" or "organismal" tree is not a fundamental concept.
This article was reviewed by Valerian Dolja, W. Ford Doolittle, Nicholas Galtier, and William Martin.
PMCID: PMC2761301  PMID: 19788730
16.  Evolution and ecology of influenza A viruses. 
Microbiological Reviews  1992;56(1):152-179.
In this review we examine the hypothesis that aquatic birds are the primordial source of all influenza viruses in other species and study the ecological features that permit the perpetuation of influenza viruses in aquatic avian species. Phylogenetic analysis of the nucleotide sequence of influenza A virus RNA segments coding for the spike proteins (HA, NA, and M2) and the internal proteins (PB2, PB1, PA, NP, M, and NS) from a wide range of hosts, geographical regions, and influenza A virus subtypes support the following conclusions. (i) Two partly overlapping reservoirs of influenza A viruses exist in migrating waterfowl and shorebirds throughout the world. These species harbor influenza viruses of all the known HA and NA subtypes. (ii) Influenza viruses have evolved into a number of host-specific lineages that are exemplified by the NP gene and include equine Prague/56, recent equine strains, classical swine and human strains, H13 gull strains, and all other avian strains. Other genes show similar patterns, but with extensive evidence of genetic reassortment. Geographical as well as host-specific lineages are evident. (iii) All of the influenza A viruses of mammalian sources originated from the avian gene pool, and it is possible that influenza B viruses also arose from the same source. (iv) The different virus lineages are predominantly host specific, but there are periodic exchanges of influenza virus genes or whole viruses between species, giving rise to pandemics of disease in humans, lower animals, and birds. (v) The influenza viruses currently circulating in humans and pigs in North America originated by transmission of all genes from the avian reservoir prior to the 1918 Spanish influenza pandemic; some of the genes have subsequently been replaced by others from the influenza gene pool in birds. (vi) The influenza virus gene pool in aquatic birds of the world is probably perpetuated by low-level transmission within that species throughout the year. 
(vii) There is evidence that most new human pandemic strains and variants have originated in southern China. (viii) There is speculation that pigs may serve as the intermediate host in genetic exchange between influenza viruses in avian species and humans, but experimental evidence is lacking. (ix) Once the ecological properties of influenza viruses are understood, it may be possible to interdict the introduction of new influenza viruses into humans.
PMCID: PMC372859  PMID: 1579108
17.  Influenza in Migratory Birds and Evidence of Limited Intercontinental Virus Exchange 
PLoS Pathogens  2007;3(11):e167.
Migratory waterfowl of the world are the natural reservoirs of influenza viruses of all known subtypes. However, it is unknown whether these waterfowl perpetuate highly pathogenic (HP) H5 and H7 avian influenza viruses. Here we report influenza virus surveillance from 2001 to 2006 in wild ducks in Alberta, Canada, and in shorebirds and gulls at Delaware Bay (New Jersey), United States, and examine the frequency of exchange of influenza viruses between the Eurasian and American virus clades, or superfamilies. Influenza viruses belonging to each of the subtypes H1 through H13 and N1 through N9 were detected in these waterfowl, but H14 and H15 were not found. Viruses of the HP Asian H5N1 subtypes were not detected, and serologic studies in adult mallard ducks provided no evidence of their circulation. The recently described H16 subtype of influenza viruses was detected in American shorebirds and gulls but not in ducks. We also found an unusual cluster of H7N3 influenza viruses in shorebirds and gulls that was able to replicate well in chickens and kill chicken embryos. Genetic analysis of 6,767 avian influenza gene segments and 248 complete avian influenza viruses supported the notion that the exchange of entire influenza viruses between the Eurasian and American clades does not occur frequently. Overall, the available evidence does not support the perpetuation of HP H5N1 influenza in migratory birds and suggests that the introduction of HP Asian H5N1 to the Americas by migratory birds is likely to be a rare event.
Author Summary
Influenza surveillance in wild migratory birds has been done at two sites in North America: 1) in Alberta, Canada, for the past 31 years, and 2) along Delaware Bay, United States, for the past 22 years. These studies support the concept that wild migratory birds are the reservoirs of all influenza A viruses and that the influenza viruses in the world can be divided into two distinct superfamilies, one in Eurasia and the other in the Americas. From time to time these viruses spread to domestic poultry and to humans and cause pandemics of disease. Many investigators have expanded these studies particularly in Europe, Asia, and the Americas. The emergence of highly pathogenic H5N1 in Asia a decade ago and the continuing evolution and spread of these H5N1 viruses to the whole of Eurasia is a continuing problem for veterinary and human public health. The available evidence from Eurasia is that migratory birds can be infected and may be involved in local spread of the highly pathogenic H5N1 virus. The question addressed in the present study is why the highly pathogenic H5N1 influenza virus has not yet reached the Americas despite the overlap in migratory bird pathways, particularly in Alaska. Genomic analysis of influenza viruses from our repository failed to provide evidence of influenza viruses with their whole genome originating from Eurasia. However, we found occasional influenza viruses from North America with single or multiple genes that originated in Eurasia. Our interpretation is that while influenza viruses do exchange between the two hemispheres, this is a rare occurrence. Regardless, enhanced surveillance should be continued in the Americas in case this rare event occurs.
PMCID: PMC2065878  PMID: 17997603
18.  Viral Proteins Acquired from a Host Converge to Simplified Domain Architectures 
PLoS Computational Biology  2012;8(2):e1002364.
The infection cycle of viruses creates many opportunities for the exchange of genetic material with the host. Many viruses integrate their sequences into the genome of their host for replication. These processes may lead to the virus acquisition of host sequences. Such sequences are prone to accumulation of mutations and deletions. However, in rare instances, sequences acquired from a host become beneficial for the virus. We searched for unexpected sequence similarity among the 900,000 viral proteins and all proteins from cellular organisms. Here, we focus on viruses that infect metazoa. The high-conservation analysis yielded 187 instances of highly similar viral-host sequences. Only a small number of them represent viruses that hijacked host sequences. The low-conservation sequence analysis utilizes the Pfam family collection. About 5% of the 12,000 statistical models archived in Pfam are composed of viral-metazoan proteins. In about half of Pfam families, we provide indirect support for the directionality from the host to the virus. The other families are either wrongly annotated or reflect an extensive sequence exchange between the viruses and their hosts. In about 75% of cross-taxa Pfam families, the viral proteins are significantly shorter than their metazoan counterparts. The tendency for shorter viral proteins relative to their related host proteins accounts for the acquisition of only a fragment of the host gene, the elimination of an internal domain and shortening of the linkers between domains. We conclude that, along viral evolution, the host-originated sequences accommodate simplified domain compositions. We postulate that the trimmed proteins act by interfering with the fundamental function of the host including intracellular signaling, post-translational modification, protein-protein interaction networks and cellular trafficking. We compiled a collection of hijacked protein sequences. 
These sequences are attractive targets for manipulation of viral infection.
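The per-family length comparison described in this abstract can be illustrated with a short sketch. It is only an illustration under stated assumptions: the abstract does not specify the statistical test used, so a plain comparison of medians is swapped in here, and the family names and protein lengths are made-up toy values, not data from the study.

```python
from statistics import median


def fraction_viral_shorter(families):
    """Fraction of families whose median viral protein length is below the host median.

    `families` maps a family name to (viral_lengths, host_lengths), both lists of
    protein lengths in amino acids. A median comparison is an assumed stand-in
    for whatever significance test the original analysis used.
    """
    shorter = sum(
        1 for viral, host in families.values() if median(viral) < median(host)
    )
    return shorter / len(families)


# Hypothetical toy input: lengths are invented for illustration only.
toy_families = {
    "family_A": ([120, 135], [410, 390, 450]),
    "family_B": ([300, 310], [280, 290]),
    "family_C": ([520], [480, 500]),
    "family_D": ([95, 110, 100], [220, 260]),
}

print(fraction_viral_shorter(toy_families))
```

A real analysis would replace the median comparison with a proper significance test per family and would read lengths from the annotated Pfam family members rather than from a hand-written dictionary.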
Author Summary
Many studies focused on the exchange of genetic material between viruses and cellular hosts. The diversity of viruses argues that, along the evolutionary history, viruses have shaped the host genomes. While most viruses have many opportunities to exchange genetic material with their hosts, tracing such events is challenging as the origin of the sequences is masked by the high mutation rate of many viruses. On the other end, for completing a successful infection cycle the viruses must cope with the cell machinery for entry, replication and translation while hiding from the host immune system. We collected evidence for instances of viral protein sequences that were most probably “stolen” from the hosts. Additionally, a shared ancestry with metazoa is associated with 670 Pfam domain families. For half of these families, the origin of the viral proteins from its host is supported. For about 75% of the cross virus-metazoa families, the viral proteins are significantly shorter than their counterpart host proteins. Most of these cross-taxa viral proteins are single domain proteins and proteins with a simple domain composition relative to the proteins of their hosts. These viral proteins provide insights on the overlooked intimacy of viruses and their multicellular hosts.
PMCID: PMC3271019  PMID: 22319434
19.  Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup along with superkingdoms Archaea, Bacteria and Eukarya 
The discovery of giant viruses with genome and physical size comparable to cellular organisms, remnants of protein translation machinery and virus-specific parasites (virophages) has raised intriguing questions about their origin. Evidence advocates for their inclusion into global phylogenomic studies and their consideration as a distinct and ancient form of life.
Here we reconstruct phylogenies describing the evolution of proteomes and protein domain structures of cellular organisms and double-stranded DNA viruses with medium-to-very-large proteomes (giant viruses). Trees of proteomes define viruses as a ‘fourth supergroup’ along with superkingdoms Archaea, Bacteria, and Eukarya. Trees of domains indicate they have evolved via massive and primordial reductive evolutionary processes. The distribution of domain structures suggests giant viruses harbor a significant number of protein domains including those with no cellular representation. The genomic and structural diversity embedded in the viral proteomes is comparable to the cellular proteomes of organisms with parasitic lifestyles. Since viral domains are widespread among cellular species, we propose that viruses mediate gene transfer between cells and crucially enhance biodiversity.
Results call for a change in the way viruses are perceived. They likely represent a distinct form of life that either predated or coexisted with the last universal common ancestor (LUCA) and constitute a very crucial part of our planet’s biosphere.
PMCID: PMC3570343  PMID: 22920653
20.  On the Origin of DNA Genomes: Evolution of the Division of Labor between Template and Catalyst in Model Replicator Systems 
PLoS Computational Biology  2011;7(3):e1002024.
The division of labor between template and catalyst is a fundamental property of all living systems: DNA stores genetic information whereas proteins function as catalysts. The RNA world hypothesis, however, posits that, at the earlier stages of evolution, RNA acted as both template and catalyst. Why would such division of labor evolve in the RNA world? We investigated the evolution of DNA-like molecules, i.e. molecules that can function only as template, in minimal computational models of RNA replicator systems. In the models, RNA can function as both template-directed polymerase and template, whereas DNA can function only as template. Two classes of models were explored. In the surface models, replicators are attached to surfaces with finite diffusion. In the compartment models, replicators are compartmentalized by vesicle-like boundaries. Both models displayed the evolution of DNA and the ensuing division of labor between templates and catalysts. In the surface model, DNA provides the advantage of greater resistance against parasitic templates. However, this advantage is at least partially offset by the disadvantage of slower multiplication due to the increased complexity of the replication cycle. In the compartment model, DNA can significantly delay the intra-compartment evolution of RNA towards catalytic deterioration. These results are explained in terms of the trade-off between template and catalyst that is inherent in RNA-only replication cycles: DNA releases RNA from this trade-off by making it unnecessary for RNA to serve as template and so rendering the system more resistant against evolving parasitism. Our analysis of these simple models suggests that the lack of catalytic activity in DNA by itself can generate a sufficient selective advantage for RNA replicator systems to produce DNA. Given the widespread notion that DNA evolved owing to its superior chemical properties as a template, this study offers a novel insight into the evolutionary origin of DNA.
Author Summary
At the core of all biological systems lies the division of labor between the storage of genetic information and its phenotypic implementation, in other words, the functional differentiation between templates (DNA) and catalysts (proteins). This fundamental property of life is believed to have been absent at the earliest stages of evolution. The RNA world hypothesis, the most realistic current scenario for the origin of life, posits that, in primordial replicating systems, RNA functioned both as template and as catalyst. How would such division of labor emerge through Darwinian evolution? We investigated the evolution of DNA-like molecules in minimal computational models of RNA replicator systems. Two models were considered: one where molecules are adsorbed on surfaces and another one where molecules are compartmentalized by dividing cellular boundaries. Both models exhibit the evolution of DNA and the ensuing division of labor, revealing the simple governing principle of these processes: DNA releases RNA from the trade-off between template and catalyst that is inevitable in the RNA world and thereby enhances the system's resistance against parasitic templates. Hence, this study offers a novel insight into the evolutionary origin of the division of labor between templates and catalysts in the RNA world.
PMCID: PMC3063752  PMID: 21455287
21.  Early Mesozoic Coexistence of Amniotes and Hepadnaviridae 
PLoS Genetics  2014;10(12):e1004559.
Hepadnaviridae are double-stranded DNA viruses that infect some species of birds and mammals. This includes humans, where hepatitis B viruses (HBVs) are prevalent pathogens in considerable parts of the global population. Recently, endogenized sequences of HBVs (eHBVs) have been discovered in bird genomes where they constitute direct evidence for the coexistence of these viruses and their hosts from the late Mesozoic until present. Nevertheless, virtually nothing is known about the ancient host range of this virus family in other animals. Here we report the first eHBVs from crocodilian, snake, and turtle genomes, including a turtle eHBV that endogenized >207 million years ago. This genomic “fossil” is >125 million years older than the oldest avian eHBV and provides the first direct evidence that Hepadnaviridae already existed during the Early Mesozoic. This implies that the Mesozoic fossil record of HBV infection spans three of the five major groups of land vertebrates, namely birds, crocodilians, and turtles. We show that the deep phylogenetic relationships of HBVs are largely congruent with the deep phylogeny of their amniote hosts, which suggests an ancient amniote–HBV coexistence and codivergence, at least since the Early Mesozoic. Notably, the organization of overlapping genes as well as the structure of elements involved in viral replication has remained highly conserved among HBVs along that time span, except for the presence of the X gene. We provide multiple lines of evidence that the tumor-promoting X protein of mammalian HBVs lacks a homolog in all other hepadnaviruses and propose a novel scenario for the emergence of X via segmental duplication and overprinting of pre-existing reading frames in the ancestor of mammalian HBVs. Our study reveals an unforeseen host range of prehistoric HBVs and provides novel insights into the genome evolution of hepadnaviruses throughout their long-lasting association with amniote hosts.
Author Summary
Viruses are not known to leave physical fossil traces, which makes our understanding of their evolutionary prehistory crucially dependent on the detection of endogenous viruses. Ancient endogenous viruses, also known as paleoviruses, are relics of viral genomes or fragments thereof that once infiltrated their host's germline and then remained as molecular “fossils” within the host genome. The massive genome sequencing of recent years has unearthed vast numbers of paleoviruses from various animal genomes, including the first endogenous hepatitis B viruses (eHBVs) in bird genomes. We screened genomes of land vertebrates (amniotes) for the presence of paleoviruses and identified ancient eHBVs in the recently sequenced genomes of crocodilians, snakes, and turtles. We report an eHBV that is >207 million years old, making it the oldest endogenous virus currently known. Furthermore, our results provide direct evidence that the Hepadnaviridae virus family infected birds, crocodilians and turtles during the Mesozoic Era, and suggest a long-lasting coexistence of these viruses and their amniote hosts at least since the Early Mesozoic. We challenge previous views on the origin of the oncogenic X gene and provide an evolutionary explanation as to why only mammalian hepatitis B infection leads to hepatocellular carcinoma.
PMCID: PMC4263362  PMID: 25501991
22.  Ecology and evolution of viruses infecting uncultivated SUP05 bacteria as revealed by single-cell- and meta-genomics 
eLife  2014;3:e03125.
Viruses modulate microbial communities and alter ecosystem functions. However, due to cultivation bottlenecks, specific virus–host interaction dynamics remain cryptic. In this study, we examined 127 single-cell amplified genomes (SAGs) from uncultivated SUP05 bacteria isolated from a model marine oxygen minimum zone (OMZ) to identify 69 viral contigs representing five new genera within dsDNA Caudovirales and ssDNA Microviridae. Infection frequencies suggest that ∼1/3 of SUP05 bacteria is viral-infected, with higher infection frequency where oxygen-deficiency was most severe. Observed Microviridae clonality suggests recovery of bloom-terminating viruses, while systematic co-infection between dsDNA and ssDNA viruses posits previously unrecognized cooperation modes. Analyses of 186 microbial and viral metagenomes revealed that SUP05 viruses persisted for years, but remained endemic to the OMZ. Finally, identification of virus-encoded dissimilatory sulfite reductase suggests SUP05 viruses reprogram their host's energy metabolism. Together, these results demonstrate closely coupled SUP05 virus–host co-evolutionary dynamics with the potential to modulate biogeochemical cycling in climate-critical and expanding OMZs.
eLife digest
Microorganisms help to drive a number of processes that recycle energy and nutrients, including elements such as carbon, nitrogen, and sulfur, around the Earth's ecosystems. Viruses that infect microbes can also affect these cycles by killing and breaking open microbial cells, or by reprogramming the cell's metabolism. However, as there are many different species of microbes and viruses —the vast majority of which cannot easily be grown in the laboratory— little is known about most virus–host interactions in natural ecosystems, especially in the oceans.
In the world's oceans, the concentration of oxygen dissolved in the water changes in different regions and at different depths. ‘Oxygen minimum zones’ occur globally throughout the oceans at depths of 200–1000 meters, and climate change is causing these zones to expand and intensify. Although a lack of oxygen is sometimes considered detrimental to living organisms, oxygen minimum zones appear to be rich with microbial life that is adapted to thrive under oxygen-starved conditions.
Sulfur-oxidizing bacteria are one of the most abundant groups of microbes in these oxygen minimum zones, and several of these bacteria are known to influence the recycling of chemical substances. Now, Roux et al. introduce a new method to identify viruses that infect the microbes in this environment, including those microbes that cannot be grown in the laboratory and which have previously remained largely unexplored.
The genomes of 127 individual bacterial cells, collected from an oxygen minimum zone in western Canada, were examined. Roux et al. estimate that about a third of the sulfur-oxidizing bacterial cells are infected by at least one virus, but often multiple viruses infected the same bacterium. Five new genera (groups of one or more species) of viruses were also discovered and found to infect these bacteria. Looking for these new viral sequences in the DNA of this oxygen minimum zone's microbial community revealed that these newly discovered viruses persist in this region over several years. It also revealed that these viruses appear to only be found within the oxygen minimum zone. Roux et al. uncovered that these viruses carry genes that could manipulate how an infected bacterium processes sulfur-containing compounds; this is similar to previous observations showing that other viruses also influence cellular processes (such as photosynthesis) in infected bacteria. As such, these newly discovered viruses might also influence the recycling of chemical elements within oxygen minimum zones.
Together, Roux et al.'s findings provide an unprecedented look into a wild virus community using a method that can be generalized to uncover viruses in a data type that is quickly becoming more widespread: single cell genomes. This effort to understand virus–host interactions by looking in the genomes of individual cells now sets the stage for future efforts aimed to uncover the impact of viruses on bacteria in other environments across the globe.
PMCID: PMC4164917  PMID: 25171894
SUP05; bacteriophages; viruses; single cell genomics; oxygen minimum zone; viral dark matter; other
23.  Gene flow and biological conflict systems in the origin and evolution of eukaryotes 
The endosymbiotic origin of eukaryotes brought together two disparate genomes in the cell. Additionally, eukaryotic natural history has included other endosymbiotic events, phagotrophic consumption of organisms, and intimate interactions with viruses and endoparasites. These phenomena facilitated large-scale lateral gene transfer and biological conflicts. We synthesize information from nearly two decades of genomics to illustrate how the interplay between lateral gene transfer and biological conflicts has impacted the emergence of new adaptations in eukaryotes. Using apicomplexans as example, we illustrate how lateral transfer from animals has contributed to unique parasite-host interfaces comprised of adhesion- and O-linked glycosylation-related domains. Adaptations, emerging due to intense selection for diversity in the molecular participants in organismal and genomic conflicts, being dispersed by lateral transfer, were subsequently exapted for eukaryote-specific innovations. We illustrate this using examples relating to eukaryotic chromatin, RNAi and RNA-processing systems, signaling pathways, apoptosis and immunity. We highlight the major contributions from catalytic domains of bacterial toxin systems to the origin of signaling enzymes (e.g., ADP-ribosylation and small molecule messenger synthesis), mutagenic enzymes for immune receptor diversification and RNA-processing. Similarly, we discuss contributions of bacterial antibiotic/siderophore synthesis systems and intra-genomic and intra-cellular selfish elements (e.g., restriction-modification, mobile elements and lysogenic phages) in the emergence of chromatin remodeling/modifying enzymes and RNA-based regulation. We develop the concept that biological conflict systems served as evolutionary “nurseries” for innovations in the protein world, which were delivered to eukaryotes via lateral gene flow to spur key evolutionary innovations all the way from nucleogenesis to lineage-specific adaptations.
PMCID: PMC3417536  PMID: 22919680
antibiotics; biological conflict; endosymbiosis; immunity proteins; restriction-modification; RNAi; selfish elements; toxins
24.  A Gene Transfer Agent and a Dynamic Repertoire of Secretion Systems Hold the Keys to the Explosive Radiation of the Emerging Pathogen Bartonella 
PLoS Genetics  2013;9(3):e1003393.
Gene transfer agents (GTAs) randomly transfer short fragments of a bacterial genome. A novel putative GTA was recently discovered in the mouse-infecting bacterium Bartonella grahamii. Although GTAs are widespread in phylogenetically diverse bacteria, their role in evolution is largely unknown. Here, we present a comparative analysis of 16 Bartonella genomes ranging from 1.4 to 2.6 Mb in size, including six novel genomes from Bartonella isolated from a cow, two moose, two dogs, and a kangaroo. A phylogenetic tree inferred from 428 orthologous core genes indicates that the deadly human pathogen B. bacilliformis is related to the ruminant-adapted clade, rather than being the earliest diverging species in the genus as previously thought. A gene flux analysis identified 12 genes for a GTA and a phage-derived origin of replication as the most conserved innovations. These are located in a region of a few hundred kb that also contains 8 insertions of gene clusters for type III, IV, and V secretion systems, and genes for putatively secreted molecules such as cholera-like toxins. The phylogenies indicate a recent transfer of seven genes in the virB gene cluster for a type IV secretion system from a cat-adapted B. henselae to a dog-adapted B. vinsonii strain. We show that the B. henselae GTA is functional and can transfer genes in vitro. We suggest that the maintenance of the GTA is driven by selection to increase the likelihood of horizontal gene transfer and argue that this process is beneficial at the population level, by facilitating adaptive evolution of the host-adaptation systems and thereby expansion of the host range size. The process counters gene loss and forces all cells to contribute to the production of the GTA and the secreted molecules. The results advance our understanding of the role that GTAs play for the evolution of bacterial genomes.
Author Summary
Viruses are selfish genetic elements that replicate and transfer their own DNA, often killing the host cell in the process. Unlike viruses, gene transfer agents (GTAs) transfer random pieces of the bacterial genome rather than their own DNA. GTAs are widespread in bacterial genomes, but it is not known whether they are beneficial to the bacterium. In this study, we have used the emerging pathogen Bartonella as our model to study the evolution of GTAs. We sequenced the genomes of six isolates of Bartonella, including two new strains isolated from wild moose in Sweden. Using a comparative genomics approach, we searched for innovations in the last common ancestor that could help explain the explosive radiation of the genus. Surprisingly, we found that a gene cluster for a GTA and a phage-derived origin of replication was the most conserved innovation, indicative of strong selective constraints. We argue that the reason for the remarkable stability of the GTA is that it provides a mechanism to duplicate and recombine genes for secretion systems. This leads to adaptability to a broad range of hosts.
PMCID: PMC3610622  PMID: 23555299
25.  The not so universal tree of life or the place of viruses in the living world 
Darwin provided a great unifying theory for biology; its visual expression is the universal tree of life. The tree concept is challenged by the occurrence of horizontal gene transfer and—as summarized in this review—by the omission of viruses. Microbial ecologists have demonstrated that viruses are the most numerous biological entities on earth, outnumbering cells by a factor of 10. Viral genomics have revealed an unexpected size and distinctness of the viral DNA sequence space. Comparative genomics has shown elements of vertical evolution in some groups of viruses. Furthermore, structural biology has demonstrated links between viruses infecting the three domains of life pointing to a very ancient origin of viruses. However, presently viruses do not find a place on the universal tree of life, which is thus only a tree of cellular life. In view of the polythetic nature of current life definitions, viruses cannot be dismissed as non-living material. On earth we have therefore at least two large DNA sequence spaces, one represented by capsid-encoding viruses and another by ribosome-encoding cells. Despite their probable distinct evolutionary origin, both spheres were and are connected by intensive two-way gene transfers.
PMCID: PMC2873004  PMID: 19571246
universal tree; viruses; phages
