26.  “Megavirales”, a proposed new order for eukaryotic nucleocytoplasmic large DNA viruses 
Archives of virology  2013;158(12):2517-2521.
The nucleocytoplasmic large DNA viruses (NCLDVs) comprise a monophyletic group of viruses that infect animals and diverse unicellular eukaryotes. The NCLDV group includes the families Poxviridae, Asfarviridae, Iridoviridae, Ascoviridae, Phycodnaviridae, Mimiviridae and the proposed family “Marseilleviridae”. The family Mimiviridae includes the largest known viruses, with genomes in excess of one megabase, whereas the genome size in the other NCLDV families varies from 100 to 400 kilobase pairs. Most of the NCLDVs replicate in the cytoplasm of infected cells, within so-called virus factories. The NCLDVs share a common ancient origin, as demonstrated by evolutionary reconstructions that trace approximately 50 genes encoding key proteins involved in viral replication and virion formation to the last common ancestor of all these viruses. Taken together, these characteristics lead us to propose assigning an official taxonomic rank to the NCLDVs as the order “Megavirales”, in reference to the large size of the virions and genomes of these viruses.
PMCID: PMC4066373  PMID: 23812617
27.  Evolution of eukaryotic single-stranded DNA viruses of the Bidnaviridae family from genes of four other groups of widely different viruses 
Scientific Reports  2014;4:5347.
Single-stranded (ss)DNA viruses are extremely widespread, infect diverse hosts from all three domains of life and include important pathogens. Most ssDNA viruses possess small genomes that replicate by the rolling-circle-like mechanism initiated by a distinct virus-encoded endonuclease. However, viruses of the family Bidnaviridae, instead of the endonuclease, encode a protein-primed type B DNA polymerase (PolB) and hence break this pattern. We investigated the provenance of all bidnavirus genes and uncover an unexpected turbulent evolutionary history of these unique viruses. Our analysis strongly suggests that bidnaviruses evolved from a parvovirus ancestor from which they inherit a jelly-roll capsid protein and a superfamily 3 helicase. The radiation of bidnaviruses from parvoviruses was probably triggered by integration of the ancestral parvovirus genome into a large virus-derived DNA transposon of the Polinton (polintovirus) family resulting in the acquisition of the polintovirus PolB gene along with terminal inverted repeats. Bidnavirus genes for a receptor-binding protein and a potential novel antiviral defense modulator are derived from dsRNA viruses (Reoviridae) and dsDNA viruses (Baculoviridae), respectively. The unusual evolutionary history of bidnaviruses emphasizes the key role of horizontal gene transfer, sometimes between viruses with completely different genomes but occupying the same niche, in the emergence of new viral types.
PMCID: PMC4061559  PMID: 24939392
28.  Casposons: a new superfamily of self-synthesizing DNA transposons at the origin of prokaryotic CRISPR-Cas immunity 
BMC Biology  2014;12:36.
Diverse transposable elements are abundant in genomes of cellular organisms from all three domains of life. Although transposons are often regarded as junk DNA, a growing body of evidence indicates that they are behind some of the major evolutionary innovations. With the growth in the number and diversity of sequenced genomes, previously unnoticed mobile elements continue to be discovered.
We describe a new superfamily of archaeal and bacterial mobile elements which we denote casposons because they encode Cas1 endonuclease, a key enzyme of the CRISPR-Cas adaptive immunity systems of archaea and bacteria. The casposons share several features with self-synthesizing eukaryotic DNA transposons of the Polinton/Maverick class, including terminal inverted repeats and genes for B family DNA polymerases. However, unlike any other known mobile elements, the casposons are predicted to rely on Cas1 for integration and excision, via a mechanism similar to the integration of new spacers into CRISPR loci. We identify three distinct families of casposons that differ in their gene repertoires and evolutionary provenance of the DNA polymerases. Deep branching of the casposon-encoded endonuclease in the Cas1 phylogeny suggests that casposons played a pivotal role in the emergence of CRISPR-Cas immunity.
The casposons are a novel superfamily of mobile elements, the first family of putative self-synthesizing transposons discovered in prokaryotes. The likely contribution of casposons to the evolution of CRISPR-Cas parallels the involvement of the RAG1 transposase in vertebrate immunoglobulin gene rearrangement, suggesting that recruitment of endonucleases from mobile elements as ready-made tools for genome manipulation is a general route of evolution of adaptive immunity.
PMCID: PMC4046053  PMID: 24884953
Mobile genetic elements; CRISPR-Cas system; Adaptive immunity; Transposons; Archaea; DNA polymerases
29.  Origin and Evolution of Eukaryotic Large Nucleo-Cytoplasmic DNA Viruses 
Intervirology  2010;53(5):284-292.
The nucleo-cytoplasmic large DNA viruses (NCLDV) constitute an apparently monophyletic group that consists of 6 families of viruses infecting a broad variety of eukaryotes. A comprehensive genome comparison and maximum-likelihood reconstruction of NCLDV evolution reveal a set of approximately 50 conserved genes that can be tentatively mapped to the genome of the common ancestor of this class of eukaryotic viruses. We address the origins and evolution of NCLDV.
Phylogenetic analysis indicates that some of the major clades of NCLDV infect diverse animals and protists, suggestive of early radiation of the NCLDV, possibly concomitant with eukaryogenesis. The core NCLDV genes seem to have originated from different sources including homologous genes of bacteriophages, bacteria and eukaryotes. These observations are compatible with a scenario of the origin of the NCLDV at an early stage of the evolution of eukaryotes through extensive mixing of genes from widely different genomes.
The common ancestor of the NCLDV probably evolved from a bacteriophage as a result of recruitment of numerous eukaryotic and some bacterial genes, and concomitant loss of the majority of phage genes except for a small core of genes coding for proteins essential for virus genome replication and virion formation.
PMCID: PMC2895762  PMID: 20551680
Bacteriophage; Eukaryogenesis; Nucleo-cytoplasmic large DNA viruses, evolution; Phylogenetic analysis
30.  Universal Pacemaker of Genome Evolution in Animals and Fungi and Variation of Evolutionary Rates in Diverse Organisms 
Genome Biology and Evolution  2014;6(6):1268-1278.
Gene evolution is traditionally considered within the framework of the molecular clock (MC) model whereby each gene is characterized by an approximately constant rate of evolution. Recent comparative analysis of numerous phylogenies of prokaryotic genes has shown that a different model of evolution, denoted the Universal PaceMaker (UPM), which postulates conservation of relative, rather than absolute evolutionary rates, yields a better fit to the phylogenetic data. Here, we show that the UPM model is a better fit than the MC for genome wide sets of phylogenetic trees from six species of Drosophila and nine species of yeast, with extremely high statistical significance. Unlike the prokaryotic phylogenies that include distant organisms and multiple horizontal gene transfers, these are simple data sets that cover groups of closely related organisms and consist of gene trees with the same topology as the species tree. The results indicate that both lineage-specific and gene-specific rates are important in genome evolution but the lineage-specific contribution is greater. Similar to the MC, the gene evolution rates under the UPM are strongly overdispersed, approximately 2-fold compared with the expectation from sampling error alone. However, we show that neither Drosophila nor yeast genes form distinct clusters in the tree space. Thus, the gene-specific deviations from the UPM, although substantial, are uncorrelated and most likely depend on selective factors that are largely unique to individual genes. Thus, the UPM appears to be a key feature of genome evolution across the history of cellular life.
PMCID: PMC4079209  PMID: 24812293
molecular clock; genome evolution; phylogenetic trees; relative evolution rates
31.  Evolution at protein ends: major contribution of alternative transcription initiation and termination to the transcriptome and proteome diversity in mammals 
Nucleic Acids Research  2014;42(11):7132-7144.
Alternative splicing (AS), alternative transcription initiation (ATI) and alternative transcription termination (ATT) create the extraordinary complexity of transcriptomes and make key contributions to the structural and functional diversity of mammalian proteomes. Analysis of mammalian genomic and transcriptomic data shows that contrary to the traditional view, the joint contribution of ATI and ATT to the transcriptome and proteome diversity is quantitatively greater than the contribution of AS. Although the mean numbers of protein-coding constitutive and alternative nucleotides in gene loci are nearly identical, their distribution along the transcripts is highly non-uniform. On average, coding exons in the variable 5′ and 3′ transcript ends that are created by ATI and ATT contain approximately four times more alternative nucleotides than core protein-coding regions that diversify exclusively via AS. Short upstream exons that encompass alternative 5′-untranslated regions and N-termini of proteins evolve under strong nucleotide-level selection whereas in 3′-terminal exons that encode protein C-termini, protein-level selection is significantly stronger. The groups of genes that are subject to ATI and ATT show major differences in biological roles, expression and selection patterns.
PMCID: PMC4066770  PMID: 24792168
32.  Conservation of major and minor jelly-roll capsid proteins in Polinton (Maverick) transposons suggests that they are bona fide viruses 
Biology Direct  2014;9:6.
This article was reviewed by Lakshminarayan M. Iyer and I. King Jordan. For complete reviews, see the Reviewers’ Reports section.
Polintons (also known as Mavericks) and Tlr elements of Tetrahymena thermophila represent two families of large DNA transposons widespread in eukaryotes. Here, we show that both Polintons and Tlr elements encode two key virion proteins, the major capsid protein with the double jelly-roll fold and the minor capsid protein, known as the penton, with the single jelly-roll topology. This observation along with the previously noted conservation of the genes for viral genome packaging ATPase and adenovirus-like protease strongly suggests that Polintons and Tlr elements combine features of bona fide viruses and transposons. We propose the name ‘Polintoviruses’ to denote these putative viruses that could have played a central role in the evolution of several groups of DNA viruses of eukaryotes.
PMCID: PMC4028283  PMID: 24773695
Polintons; Mavericks; Transposable elements; Double jelly-roll fold; Capsid proteins; Virus evolution
33.  Open Questions on the Origin of Life at Anoxic Geothermal Fields 
We have recently reconstructed the ‘hatcheries’ of the first cells by combining geochemical analysis with phylogenomic scrutiny of the inorganic ion requirements of universal components of modern cells (Mulkidjanian et al.: Origin of first cells at terrestrial, anoxic geothermal fields. Proc Natl Acad Sci USA 2012, 109:E821–830). These ubiquitous, and by inference primordial, proteins and functional systems show affinity to and functional requirement for K+, Zn2+, Mn2+, and phosphate. Thus, protocells must have evolved in habitats with a high K+/Na+ ratio and relatively high concentrations of Zn, Mn and phosphorous compounds. Geochemical reconstruction shows that the ionic composition conducive to the origin of cells could not have existed in marine settings but is compatible with emissions of vapor-dominated zones of inland geothermal systems. Under anoxic, CO2-dominated atmosphere, the ionic composition of pools of cool, condensed vapor at anoxic geothermal fields would resemble the internal milieu of modern cells. Such pools would be lined with porous silicate minerals mixed with metal sulfides and enriched in K+ ions and phosphorous compounds.
Here we address some questions that have appeared in print after the publication of our anoxic geothermal field scenario. We argue that anoxic geothermal fields, which were identified as likely cradles of life by using a top-down approach and phylogenomics analysis as a tool, could provide geochemical conditions similar to those which were suggested as most conducive for the emergence of life by the chemists who pursue the complementary bottom-up strategy.
PMCID: PMC3997052  PMID: 23132762
34.  CRISPR-Cas: an adaptive immunity system in prokaryotes 
Most of the archaea and numerous bacteria possess an elaborate system of adaptive immunity to mobile genetic elements known as the CRISPR (clustered regularly interspaced short palindromic repeats)-associated system (CRISPR-Cas), which consists of arrays of short repeats interspersed with unique DNA spacers and adjacent operons encompassing CRISPR-associated (cas) genes with predicted and, in some cases, experimentally validated nuclease, helicase, and polymerase activities. The system functions by integrating fragments of alien DNA between the repeats and employing their transcripts to degrade the DNA of the respective invading elements via an RNA interference-like mechanism. The CRISPR-Cas system is a case of apparent Lamarckian inheritance.
PMCID: PMC2884157  PMID: 20556198
35.  Classification and evolution of type II CRISPR-Cas systems 
Nucleic Acids Research  2014;42(10):6091-6105.
The CRISPR-Cas systems of archaeal and bacterial adaptive immunity are classified into three types that differ by the repertoires of CRISPR-associated (cas) genes, the organization of cas operons and the structure of repeats in the CRISPR arrays. The simplest among the CRISPR-Cas systems is type II in which the endonuclease activities required for the interference with foreign deoxyribonucleic acid (DNA) are concentrated in a single multidomain protein, Cas9, and are guided by a co-processed dual-tracrRNA:crRNA molecule. This compact enzymatic machinery and readily programmable site-specific DNA targeting make type II systems top candidates for a new generation of powerful tools for genomic engineering. Here we report an updated census of CRISPR-Cas systems in bacterial and archaeal genomes. Type II systems are the rarest, missing in archaea, and represented in ∼5% of bacterial genomes, with an over-representation among pathogens and commensals. Phylogenomic analysis suggests that at least three cas genes, cas1, cas2 and cas4, and the CRISPR repeats of the type II-B system were acquired via recombination with a type I CRISPR-Cas locus. Distant homologs of Cas9 were identified among proteins encoded by diverse transposons, suggesting that type II CRISPR-Cas evolved via recombination of mobile nuclease genes with type I loci.
PMCID: PMC4041416  PMID: 24728998
36.  Archaeal Ubiquitin-Like Proteins: Functional Versatility and Putative Ancestral Involvement in tRNA Modification Revealed by Comparative Genomic Analysis 
Archaea  2010;2010:710303.
The recent discovery of protein modification by SAMPs, ubiquitin-like (Ubl) proteins from the archaeon Haloferax volcanii, prompted a comprehensive comparative-genomic analysis of archaeal Ubl protein genes and the genes for enzymes thought to be functionally associated with Ubl proteins. This analysis showed that most archaea encode members of two major groups of Ubl proteins with the β-grasp fold, the ThiS and MoaD families, and indicated that the ThiS family genes are rarely linked to genes for thiamine or Mo/W cofactor metabolism enzymes but instead are most often associated with genes for enzymes of tRNA modification. Therefore it is hypothesized that the ancestral function of the archaeal Ubl proteins is sulfur insertion into modified nucleotides in tRNAs, an activity analogous to that of the URM1 protein in eukaryotes. Together with additional, previously described genomic associations, these findings indicate that systems for protein quality control operating at different levels, including tRNA modification that controls translation fidelity, protein ubiquitination that regulates protein degradation, and, possibly, mRNA degradation by the exosome, are functionally and evolutionarily linked.
PMCID: PMC2948915  PMID: 20936112
37.  Intron-Dominated Genomes of Early Ancestors of Eukaryotes 
Journal of Heredity  2009;100(5):618-623.
Evolutionary reconstructions using maximum likelihood methods point to unexpectedly high densities of introns in protein-coding genes of ancestral eukaryotic forms including the last common ancestor of all extant eukaryotes. Combined with the evidence of the origin of spliceosomal introns from invading Group II self-splicing introns, these results suggest that early ancestral eukaryotic genomes consisted of up to 80% sequences derived from Group II introns, a much greater contribution of introns than that seen in any extant genome. An organism with such an unusual genome architecture could survive only under conditions of a severe population bottleneck.
PMCID: PMC2877545  PMID: 19617525
effective population size; endosymbiosis; group II self-splicing introns; origin of eukaryotes; spliceosomal introns
38.  The Role of Energy in the Emergence of Biology from Chemistry 
Any scenario of the transition from chemistry to biology should include an “energy module” because life can exist only when supported by energy flow(s). We addressed the problem of primordial energetics by combining physico-chemical considerations with phylogenomic analysis. We propose that the first replicators could use abiotically formed, exceptionally photostable activated nucleotides both as building blocks and as the main energy source. Nucleoside triphosphates could replace cyclic nucleotides as the principal energy-rich compounds at the stage of the first cells, presumably because the metal chelates of nucleoside triphosphates penetrated membranes much better than the respective metal complexes of nucleoside monophosphates. The ability to exploit natural energy flows for biogenic production of energy-rich molecules could evolve only gradually, after the emergence of sophisticated enzymes and ion-tight membranes. We argue that, in the course of evolution, sodium-dependent membrane energetics preceded the proton-based energetics which evolved independently in bacteria and archaea.
PMCID: PMC3974900  PMID: 23100130
39.  EREM: Parameter Estimation and Ancestral Reconstruction by Expectation-Maximization Algorithm for a Probabilistic Model of Genomic Binary Characters Evolution 
Advances in Bioinformatics  2010;2010:167408.
Evolutionary binary characters are features of species or genes, indicating the absence (value zero) or presence (value one) of some property. Examples include eukaryotic gene architecture (the presence or absence of an intron in a particular locus), gene content, and morphological characters. In many studies, the acquisition of such binary characters is assumed to represent a rare evolutionary event, and consequently, their evolution is analyzed using various flavors of parsimony. However, when gain and loss of the character are not rare enough, a probabilistic analysis becomes essential. Here, we present a comprehensive probabilistic model to describe the evolution of binary characters on a bifurcating phylogenetic tree. A fast software tool, EREM, is provided, using maximum likelihood to estimate the parameters of the model and to reconstruct ancestral states (presence and absence in internal nodes) and events (gain and loss events along branches).
PMCID: PMC2866244  PMID: 20467467
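The kind of likelihood that a maximum-likelihood tool for binary characters maximizes can be sketched as follows. This is a minimal illustration of a generic two-state gain/loss Markov model evaluated with Felsenstein's pruning algorithm, not EREM's actual parameterization or code. The function names, the nested-list tree encoding, the single shared `gain` and `loss` rates, and the stationary prior at the root are all assumptions made for the example.

```python
import math


def transition_probs(gain, loss, t):
    """Return P[i][j] = P(state j at branch end | state i at branch start).

    States: 0 = character absent, 1 = character present.
    Assumes a continuous-time two-state Markov chain with constant rates.
    """
    total = gain + loss
    pi1 = gain / total          # stationary probability of presence
    pi0 = loss / total          # stationary probability of absence
    decay = math.exp(-total * t)
    return [
        [pi0 + pi1 * decay, pi1 * (1.0 - decay)],
        [pi0 * (1.0 - decay), pi1 + pi0 * decay],
    ]


def partial_likelihoods(node, observed, gain, loss):
    """Felsenstein pruning: P(observed leaves below node | node state s).

    A leaf is a name (str); an internal node is a list of
    (child, branch_length) pairs. This encoding is hypothetical.
    """
    if isinstance(node, str):
        state = observed[node]
        return [1.0 if state == 0 else 0.0, 1.0 if state == 1 else 0.0]
    result = [1.0, 1.0]
    for child, t in node:
        p = transition_probs(gain, loss, t)
        below = partial_likelihoods(child, observed, gain, loss)
        for s in (0, 1):
            result[s] *= p[s][0] * below[0] + p[s][1] * below[1]
    return result


def tree_likelihood(root, observed, gain, loss):
    """Likelihood of one presence/absence pattern (stationary root prior assumed)."""
    pi1 = gain / (gain + loss)
    at_root = partial_likelihoods(root, observed, gain, loss)
    return (1.0 - pi1) * at_root[0] + pi1 * at_root[1]


# Example: three species, character present in A and B, absent in C.
example_tree = [("A", 0.1), ([("B", 0.2), ("C", 0.3)], 0.4)]
print(tree_likelihood(example_tree, {"A": 1, "B": 1, "C": 0}, gain=0.7, loss=1.3))
```

An estimation procedure would search for the `gain` and `loss` values that maximize the product of such likelihoods over all characters; ancestral reconstruction reuses the same partial likelihoods at internal nodes.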
40.  Horizontal Gene Transfer Can Rescue Prokaryotes from Muller’s Ratchet: Benefit of DNA from Dead Cells and Population Subdivision 
G3: Genes|Genomes|Genetics  2013;4(2):325-339.
Horizontal gene transfer (HGT) is a major factor in the evolution of prokaryotes. An intriguing question is whether HGT is maintained during evolution of prokaryotes owing to its adaptive value or is a byproduct of selection driven by other factors such as consumption of extracellular DNA (eDNA) as a nutrient. One hypothesis posits that HGT can restore genes inactivated by mutations and thereby prevent stochastic, irreversible deterioration of genomes in finite populations known as Muller’s ratchet. To examine this hypothesis, we developed a population genetic model of prokaryotes undergoing HGT via homologous recombination. Analysis of this model indicates that HGT can prevent the operation of Muller’s ratchet even when the source of transferred genes is eDNA that comes from dead cells and on average carries more deleterious mutations than the DNA of recipient live cells. Moreover, if HGT is sufficiently frequent and eDNA diffusion sufficiently rapid, a subdivided population is shown to be more resistant to Muller’s ratchet than an undivided population of an equal overall size. Thus, to maintain genomic information in the face of Muller’s ratchet, it is more advantageous to partition individuals into multiple subpopulations and let them “cross-reference” each other’s genetic information through HGT than to collect all individuals in one population and thereby maximize the efficacy of natural selection. Taken together, the results suggest that HGT could be an important condition for the long-term maintenance of genomic information in prokaryotes through the prevention of Muller’s ratchet.
PMCID: PMC3931566  PMID: 24347631
environmental DNA; evolution of transformation; competence; structured population; soil bacteria
Methods in molecular biology (Clifton, N.J.)  2012;856:10.1007/978-1-61779-585-5_3.
Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance (SD) method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the application of these methods to the analysis of the FOL and the results obtained with them. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a ‘species tree’.
PMCID: PMC3842619  PMID: 22399455
Forest of life; tree of life; phylogenomic methods; tree comparison; map of quartets
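The idea of comparing trees through their splits, and of down-weighting poorly supported splits, can be sketched as follows. This is a minimal illustration, not the published BSD formula: the unweighted function is a plain normalized split distance, and the weighting scheme in `weighted_split_distance` is a hypothetical choice made for the example. The dictionary-of-splits representation and all function names are likewise assumptions.

```python
def canonical(side, taxa):
    """Represent a split by one fixed side so that equal splits compare equal.

    A split partitions the taxa into two groups; the smaller group is kept
    (ties are broken alphabetically).
    """
    side = frozenset(side)
    other = frozenset(taxa) - side
    return min(side, other, key=lambda s: (len(s), sorted(s)))


def split_distance(supports_a, supports_b):
    """Fraction of splits that are found in only one of the two trees.

    Each argument maps a canonical split to its bootstrap support;
    the supports are ignored here.
    """
    a, b = set(supports_a), set(supports_b)
    total = len(a) + len(b)
    return len(a ^ b) / total if total else 0.0


def weighted_split_distance(supports_a, supports_b):
    """Illustrative bootstrap-weighted variant (NOT the published BSD formula).

    Each split counts in proportion to its bootstrap support, so conflicts
    that involve weakly supported splits contribute little to the distance.
    """
    unshared = set(supports_a) ^ set(supports_b)
    conflict = sum(supports_a.get(s, 0.0) + supports_b.get(s, 0.0) for s in unshared)
    total = sum(supports_a.values()) + sum(supports_b.values())
    return conflict / total if total else 0.0


# Example: two five-taxon trees that agree on the split AB | CDE
# and disagree on a second, weakly supported split.
taxa = "ABCDE"
tree_a = {canonical("AB", taxa): 0.9, canonical("ABC", taxa): 0.3}
tree_b = {canonical("AB", taxa): 0.8, canonical("ABD", taxa): 0.2}
print(split_distance(tree_a, tree_b))
print(weighted_split_distance(tree_a, tree_b))
```

In this example the weighted distance is smaller than the unweighted one because the only conflicting splits carry low support, which is the robustness argument the abstract makes for taking bootstrap values into account.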
42.  Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems 
Nucleic Acids Research  2013;42(4):2577-2590.
The CRISPR-Cas-derived RNA-guided Cas9 endonuclease is the key element of an emerging promising technology for genome engineering in a broad range of cells and organisms. The DNA-targeting mechanism of the type II CRISPR-Cas system involves maturation of tracrRNA:crRNA duplex (dual-RNA), which directs Cas9 to cleave invading DNA in a sequence-specific manner, dependent on the presence of a Protospacer Adjacent Motif (PAM) on the target. We show that evolution of dual-RNA and Cas9 in bacteria produced remarkable sequence diversity. We selected eight representatives of phylogenetically defined type II CRISPR-Cas groups to analyze possible coevolution of Cas9 and dual-RNA. We demonstrate that these two components are interchangeable only between closely related type II systems when the PAM sequence is adjusted to the investigated Cas9 protein. Comparison of the taxonomy of bacterial species that harbor type II CRISPR-Cas systems with the Cas9 phylogeny corroborates horizontal transfer of the CRISPR-Cas loci. The reported collection of dual-RNA:Cas9 with associated PAMs expands the possibilities for multiplex genome editing and could provide means to improve the specificity of the RNA-programmable Cas9 tool.
PMCID: PMC3936727  PMID: 24270795
43.  Pandoraviruses are highly derived phycodnaviruses 
Biology Direct  2013;8:25.
The recently discovered Pandoraviruses are by far the largest viruses known, with their 2 megabase genomes exceeding in size the genomes of numerous bacteria and archaea. Pandoraviruses show a distant relationship with other nucleocytoplasmic large DNA viruses (NCLDV) of eukaryotes, lack some of the NCLDV core genes and in particular do not appear to be specifically related to the other, better characterized family of giant viruses, the Mimiviridae. Here we report phylogenetic analysis of 6 core NCLDV genes that confidently places Pandoraviruses within the family Phycodnaviridae, with an apparent specific affinity with Coccolithoviruses. We conclude that, despite their many unusual characteristics, Pandoraviruses are highly derived phycodnaviruses. These findings imply that giant viruses have independently evolved from smaller NCLDV on at least two occasions.
This article was reviewed by Patrick Forterre and Lakshminarayan Iyer. For the full reviews, see the Reviewers’ reports section.
PMCID: PMC3924356  PMID: 24148757
44.  Planctomycetes and eukaryotes: a case of analogy not homology 
Planctomycetes, Verrucomicrobia and Chlamydia are prokaryotic phyla that are sometimes grouped together as the PVC superphylum of eubacteria. Some PVC species possess interesting attributes, in particular, internal membranes that superficially resemble eukaryotic endomembranes. Some biologists now claim that PVC bacteria are nucleus-bearing prokaryotes and that they are evolutionary intermediates in the transition from prokaryote to eukaryote. PVC prokaryotes do not possess a nucleus and are not intermediates in the prokaryote-to-eukaryote transition. All of the PVC traits that are currently cited as evidence for aspiring eukaryoticity are either analogous (the result of convergent evolution), not homologous, to eukaryotic traits; or else they are the result of lateral gene transfers. Here we summarize the evidence that shows why most of the purported similarities between the PVC bacteria and eukaryotes are analogous and the rest are a consequence of lateral gene acquisition.
PMCID: PMC3795523  PMID: 21858844
45.  Differences in DNA methylation between human neuronal and glial cells are concentrated in enhancers and non-CpG sites 
Nucleic Acids Research  2013;42(1):109-127.
We applied Illumina Human Methylation450K array to perform a genomic-scale single-site resolution DNA methylation analysis in neuronal and nonneuronal (primarily glial) nuclei separated from the orbitofrontal cortex of postmortem human brain. The findings were validated using enhanced reduced representation bisulfite sequencing. We identified thousands of sites differentially methylated (DM) between neuronal and nonneuronal cells. The DM sites were depleted within CpG-island–containing promoters but enriched in predicted enhancers. Classification of the DM sites into those undermethylated in neurons (neuronal type) and those undermethylated in nonneuronal cells (glial type), combined with findings of others that methylation within control elements typically negatively correlates with gene expression, yielded large sets of predicted neuron-specific and non–neuron-specific genes. These sets of predicted genes were in excellent agreement with the available direct measurements of gene expression in human and mouse. We also found a distinct set of DNA methylation patterns that were unique for neuronal cells. In particular, neuronal-type differential methylation was overrepresented in CpG island shores, enriched within gene bodies but not in intergenic regions, and preferentially harbored binding motifs for a distinct set of transcription factors, including neuron-specific activity-dependent factors. Finally, non-CpG methylation was substantially more prevalent in neurons than in nonneuronal cells.
PMCID: PMC3874157  PMID: 24057217
46.  Evolution of gene fusions: horizontal transfer versus independent events 
Genome Biology  2002;3(5):research0024.1-research0024.13.
Gene fusions can be used as tools for functional prediction and also as evolutionary markers. Fused genes often show a scattered phyletic distribution, which suggests a role for processes other than vertical inheritance in their evolution.
The evolutionary history of gene fusions was studied by phylogenetic analysis of the domains in the fused proteins and the orthologous domains that form stand-alone proteins. Clustering of fusion components from phylogenetically distant species was construed as evidence of dissemination of the fused genes by horizontal transfer. Of the 51 examined gene fusions that are represented in at least two of the three primary kingdoms (Bacteria, Archaea and Eukaryota), 31 were most probably disseminated by cross-kingdom horizontal gene transfer, whereas 14 appeared to have evolved independently in different kingdoms and two were probably inherited from the common ancestor of modern life forms. On many occasions, the evolutionary scenario also involves one or more secondary fissions of the fusion gene. For approximately half of the fusions, stand-alone forms of the fusion components are encoded by juxtaposed genes, which are known or predicted to belong to the same operon in some of the prokaryotic genomes. This indicates that evolution of gene fusions often, if not always, involves an intermediate stage, during which the future fusion components exist as juxtaposed and co-regulated, but still distinct, genes within operons.
These findings suggest a major role for horizontal transfer of gene fusions in the evolution of protein-domain architectures, but also indicate that independent fusions of the same pair of domains in distant species are not uncommon, which suggests positive selection for the multidomain architectures.
PMCID: PMC115226  PMID: 12049665
47.  Quod erat demonstrandum? The mystery of experimental validation of apparently erroneous computational analyses of protein sequences 
Genome Biology  2001;2(12):research0051.1-research0051.11.
Computational predictions are critical for directing the experimental study of protein functions. Therefore it is paradoxical when an apparently erroneous computational prediction seems to be supported by experiment.
We analyzed six cases where application of novel or conventional computational methods for protein sequence and structure analysis led to non-trivial predictions that were subsequently supported by direct experiments. We show that, on all six occasions, the original prediction was unjustified, and in at least three cases, an alternative, well-supported computational prediction, incompatible with the original one, could be derived. The most unusual cases involved the identification of an archaeal cysteinyl-tRNA synthetase, a dihydropteroate synthase and a thymidylate synthase, for which experimental verifications of apparently erroneous computational predictions were reported. Using sequence-profile analysis, multiple alignment and secondary-structure prediction, we have identified the unique archaeal 'cysteinyl-tRNA synthetase' as a homolog of extracellular polygalactosaminidases, and the 'dihydropteroate synthase' as a member of the β-lactamase-like superfamily of metal-dependent hydrolases.
In each of the analyzed cases, the original computational predictions could be refuted and, in some instances, alternative strongly supported predictions were obtained. The nature of the experimental evidence that appears to support these predictions remains an open question. Some of these experiments might signify discovery of extremely unusual forms of the respective enzymes, whereas the results of others could be due to artifacts.
PMCID: PMC64836  PMID: 11790254
48.  Two C or not two C: recurrent disruption of Zn-ribbons, gene duplication, lineage-specific gene loss, and horizontal gene transfer in evolution of bacterial ribosomal proteins 
Genome Biology  2001;2(9):research0033.1-research0033.14.
Ribosomal proteins are encoded in all genomes of cellular life forms and are, generally, well conserved during evolution. In prokaryotes, the genes for most ribosomal proteins are clustered in several highly conserved operons, which ensures efficient co-regulation of their expression. Duplications of ribosomal-protein genes are infrequent, and given their coordinated expression and functioning, it is generally assumed that ribosomal-protein genes are unlikely to undergo horizontal transfer. However, with the accumulation of numerous complete genome sequences of prokaryotes, several paralogous pairs of ribosomal protein genes have been identified. Here we analyze all such cases and attempt to reconstruct the evolutionary history of these ribosomal proteins.
Complete bacterial genomes were searched for duplications of ribosomal proteins. Ribosomal proteins L36, L33, L31, S14 are each duplicated in several bacterial genomes and ribosomal proteins L11, L28, L7/L12, S1, S15, S18 are so far duplicated in only one genome each. Sequence analysis of the four ribosomal proteins, for which paralogs were detected in several genomes, two of the ribosomal proteins duplicated in one genome (L28 and S18), and the ribosomal protein L32 showed that each of them comes in two distinct versions. One form contains a predicted metal-binding Zn-ribbon that consists of four conserved cysteines (in some cases replaced by histidines), whereas, in the second form, these metal-chelating residues are completely or partially replaced. Typically, genomes containing paralogous genes for these ribosomal proteins encode both versions, designated C+ and C-, respectively. Analysis of phylogenetic trees for these seven ribosomal proteins, combined with comparison of genomic contexts for the respective genes, indicates that in most, if not all cases, their evolution involved a duplication of the ancestral C+ form early in bacterial evolution, with subsequent alternative loss of the C+ and C- forms in different lineages. Additionally, evidence was obtained for a role of horizontal gene transfer in the evolution of these ribosomal proteins, with multiple cases of gene displacement 'in situ', that is, without a change of the gene order in the recipient genome.
A more complex picture of evolution of bacterial ribosomal proteins than previously suspected is emerging from these results, with major contributions of lineage-specific gene loss and horizontal gene transfer. The recurrent theme of emergence and disruption of Zn-ribbons in bacterial ribosomal proteins awaits a functional interpretation.
PMCID: PMC56895  PMID: 11574053
49.  Parabolic replicator dynamics and the principle of minimum Tsallis information gain 
Biology Direct  2013;8:19.
Non-linear, parabolic (sub-exponential) and hyperbolic (super-exponential) models of prebiological evolution of molecular replicators have been proposed and extensively studied. The parabolic models appear to be the most realistic approximations of real-life replicator systems due primarily to product inhibition. Unlike the more traditional exponential models, the distribution of individual frequencies in an evolving parabolic population is not described by the Maximum Entropy (MaxEnt) Principle in its traditional form, whereby the distribution with the maximum Shannon entropy is chosen among all the distributions that are possible under the given constraints. We sought to identify a more general form of the MaxEnt principle that would be applicable to parabolic growth.
We consider a model of a population that reproduces according to the parabolic growth law and show that the frequencies of individuals in the population minimize the Tsallis relative entropy (non-additive information gain) at each time moment. Next, we consider a model of a parabolically growing population that maintains a constant total size and provide an “implicit” solution for this system. We show that in this case, the frequencies of the individuals in the population also minimize the Tsallis information gain at each moment of the “internal time” of the population.
The results of this analysis show that the general MaxEnt principle is the underlying law for the evolution of a broad class of replicator systems including not only exponential but also parabolic and hyperbolic systems. The choice of the appropriate entropy (information) function depends on the growth dynamics of a particular class of systems. The Tsallis entropy is non-additive for independent subsystems, i.e. the information on the subsystems is insufficient to describe the system as a whole. In the context of prebiotic evolution, this “non-reductionist” nature of parabolic replicator systems might reflect the importance of group selection and competition between ensembles of cooperating replicators.
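The quantities named in this abstract can be illustrated with a minimal sketch. It assumes the standard textbook forms, not necessarily the article's exact formulation: parabolic growth as dx/dt = k * x**p with 0 <= p < 1 (p = 0.5 is a common choice), and the Tsallis relative entropy as D_q(f || g) = (sum_i f_i**q * g_i**(1-q) - 1) / (q - 1). The function names, the rate constants and the value of q are hypothetical and chosen only for illustration; the sketch does not reproduce the article's minimization result.

```python
def parabolic_growth(x0, k, t, p=0.5):
    """Closed-form solution of dx/dt = k * x**p for 0 <= p < 1 (assumed model)."""
    return (x0 ** (1.0 - p) + (1.0 - p) * k * t) ** (1.0 / (1.0 - p))


def frequencies(sizes):
    """Normalize absolute population sizes to frequencies that sum to 1."""
    total = sum(sizes)
    return [s / total for s in sizes]


def tsallis_relative_entropy(f, g, q):
    """Tsallis relative entropy D_q(f || g) for q != 1; assumes every g_i > 0."""
    if q == 1:
        raise ValueError("q = 1 is the Kullback-Leibler limit; use q != 1")
    s = sum(fi ** q * gi ** (1.0 - q) for fi, gi in zip(f, g))
    return (s - 1.0) / (q - 1.0)


if __name__ == "__main__":
    # Hypothetical replicators: equal starting sizes, different rate constants.
    x0 = [1.0, 1.0, 1.0]
    k = [1.0, 2.0, 3.0]
    start = frequencies(x0)
    later = frequencies([parabolic_growth(x, r, t=5.0) for x, r in zip(x0, k)])
    # Information gain of the later distribution relative to the initial one.
    print(later, tsallis_relative_entropy(later, start, q=0.5))
```

Because growth is sub-exponential, slower replicators are not driven to zero frequency in finite time in this sketch; the index q is left as a free parameter here because its relation to the growth exponent is not stated in the abstract.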
This article was reviewed by Viswanadham Sridhara (nominated by Claus Wilke), Purushottam Dixit (nominated by Sergei Maslov), and Nick Grishin. For the complete reviews, see the Reviewers’ Reports section.
PMCID: PMC3765284  PMID: 23937956
Replicator equation; Parabolic growth; Tsallis entropy; Non-extensive statistical mechanics; MaxEnt principle
