Ciliates are an ancient and diverse group of microbial eukaryotes that have emerged as powerful models for RNA-mediated epigenetic inheritance. They possess extensive sets of both tiny and long noncoding RNAs that, together with a suite of proteins that includes transposases, orchestrate a broad cascade of genome rearrangements during somatic nuclear development. This Review emphasizes three important themes: the remarkable role of RNA in shaping genome structure, recent discoveries that unify many deeply diverged ciliate genetic systems, and a surprising evolutionary “sign change” in the role of small RNAs between major species groups.
Organisms represented by the root of the universal evolutionary tree were most likely complex cells with a sophisticated protein translation system and a DNA genome encoding hundreds of genes. The growth of bioinformatics data from taxonomically diverse organisms has made it possible to infer the likely properties of early life in greater detail. Here we present LUCApedia (http://eeb.princeton.edu/lucapedia), a unified framework for simultaneously evaluating multiple data sets related to the Last Universal Common Ancestor (LUCA) and its predecessors. This unification is achieved by mapping eleven such data sets onto UniProt, KEGG and BioCyc IDs. LUCApedia may be used to rapidly acquire evidence that a certain gene or set of genes is ancient, to examine the early evolution of metabolic pathways, or to test specific hypotheses related to ancient life by corroborating them against the rest of the database.
Genome duality in ciliated protozoa offers a unique system to showcase their epigenome as a model of inheritance. In Oxytricha, the somatic genome is responsible for vegetative growth, while the germline contributes DNA to the next sexual generation. Somatic nuclear development removes all transposons and other so-called “junk DNA”, which comprise ~95% of the germline. We demonstrate that Piwi-interacting small RNAs (piRNAs) from the maternal nucleus can specify genomic regions for retention in this process. Oxytricha piRNAs map primarily to the somatic genome, representing the ~5% of the germline that is retained. Furthermore, injection of synthetic piRNAs corresponding to normally-deleted regions leads to their retention in later generations. Our findings highlight small RNAs (sRNAs) as powerful transgenerational carriers of epigenetic information for genome programming.
Several independent lines of evidence suggest that the modern genetic system was preceded by the ‘RNA world’ in which RNA genes encoded RNA catalysts. Current gaps in our conceptual framework of early genetic systems make it difficult to imagine how a stable RNA genome may have functioned and how the transition to a DNA genome could have taken place. Here we use the single-celled ciliate, Oxytricha, as an analog to some of the genetic and genomic traits that may have been present in organisms before and during the establishment of a DNA genome. Oxytricha and its close relatives have a unique genome architecture involving two differentiated nuclei, one of which encodes the genome on small, linear nanochromosomes. While its unique genomic characteristics are relatively modern, some physiological processes related to the genomes and nuclei of Oxytricha may exemplify primitive states of the developing genetic system.
Ciliated protists rearrange their genomes dramatically during nuclear development via chromosome fragmentation and DNA deletion to produce a trimmer and highly reorganized somatic genome. The deleted portion of the genome includes potentially active transposons or transposon-like sequences that reside in the germline. Three independent studies recently showed that transposase proteins of the DDE/DDD superfamily are indispensable for DNA processing in three distantly related ciliates. In the spirotrich Oxytricha trifallax, high copy-number germline-limited transposons mediate their own excision from the somatic genome but also contribute to programmed genome rearrangement through a remarkable transposon mutualism with the host. By contrast, the genomes of two oligohymenophorean ciliates, Tetrahymena thermophila and Paramecium tetraurelia, encode homologous PiggyBac-like transposases as single-copy genes in both their germline and somatic genomes. These domesticated transposases are essential for deletion of thousands of different internal sequences in these species. This review contrasts the events underlying somatic genome reduction in three different ciliates and considers their evolutionary origins and the relationships among their distinct mechanisms for genome remodeling.
Oxytricha trifallax — an established model organism for studying genome rearrangements, chromosome structure, scrambled genes, RNA-mediated epigenetic inheritance, and other phenomena — has been the subject of a nomenclature controversy for several years. Originally isolated as a sibling species of O. fallax, O. trifallax was reclassified in 1999 as Sterkiella histriomuscorum, a previously identified species, based on morphological similarity. The proper identification of O. trifallax is crucial to resolve in order to prevent confusion in both the comparative genomics and the general scientific communities. We analyzed nine conserved nuclear gene sequences between the two given species and several related ciliates. Phylogenetic analyses suggest that O. trifallax and a bona fide S. histriomuscorum have accumulated significant evolutionary divergence from each other relative to other ciliates such that they should be unequivocally classified as separate species. We also describe the original isolation of O. trifallax, including its comparison to O. fallax, and we provide criteria to identify future isolates of O. trifallax.
Oxytricha fallax; Oxytricha trifallax; Sterkiella histriomuscorum; ciliate; spirotrich; hypotrich; evolution; phylogeny; concatenated tree
Interchromosomal chimeric RNA molecules are often transcription products from genomic rearrangement in cancerous cells. Here we report the computational detection of an interchromosomal RNA fusion between ZC3HAV1L and CHMP1A from RNA-seq data of normal human mammary epithelial cells, and experimental confirmation of the chimeric transcript in multiple human cells and tissues. Our experimental characterization also detected three variants of the ZC3HAV1L-CHMP1A chimeric RNA, suggesting that these genes are involved in complex splicing. The fusion sequence at the novel exon-exon boundary and the absence of a corresponding DNA rearrangement suggest that this chimeric RNA is likely produced by trans-splicing in human cells.
This article was reviewed by Rory Johnson (nominated by Fyodor Kondrashov); Gal Avital and Itai Yanai
Chimeric transcripts; RNA fusion; trans-splicing; Genome rearrangement
RNA, normally thought of as a conduit in gene expression, has a novel mode of action in ciliated protozoa. Maternal RNA templates provide both an organizing guide for DNA rearrangements and a template that can transport somatic mutations to the next generation. This opportunity for RNA-mediated genome rearrangement and DNA repair is profound in the ciliate Oxytricha, which deletes 95% of its germline genome during development in a process that severely fragments its chromosomes and then sorts and reorders the hundreds of thousands of pieces remaining. Oxytricha’s somatic nuclear genome is therefore an epigenome formed through RNA templates and signals arising from the previous generation. Furthermore, this mechanism of RNA-mediated epigenetic inheritance can function across multiple generations, and the discovery of maternal template RNA molecules has revealed new biological roles for RNA and has hinted at the power of RNA molecules to sculpt genomic information in cells.
noncoding RNA; maternal inheritance; Lamarckian inheritance; scrambled genes; ciliates; Oxytricha
Correlations between genome composition (in terms of GC content) and usage of particular codons and amino acids have been widely reported, but poorly explained. We show here that a simple model of processes acting at the nucleotide level explains codon usage across a large sample of species (311 bacteria, 28 archaea and 257 eukaryotes). The model quantitatively predicts responses (slope and intercept of the regression line on genome GC content) of individual codons and amino acids to genome composition.
Codons respond to genome composition on the basis of their GC content relative to their synonyms (explaining 71-87% of the variance in response among the different codons, depending on measure). Amino-acid responses are determined by the mean GC content of their codons (explaining 71-79% of the variance). Similar trends hold for genes within a genome. Position-dependent selection for error minimization explains why individual bases respond differently to directional mutation pressure.
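The two predictors named above can be made concrete with a short sketch. It computes, for the standard genetic code, each codon's GC count relative to the mean of its synonyms and each amino acid's mean codon GC fraction. The function names and the choice of measure (a simple difference from the synonym mean) are illustrative assumptions; the abstract does not specify the exact measures used in the study.

```python
from statistics import mean

# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
GENETIC_CODE = dict(zip(CODONS, AMINO_ACIDS))


def gc_count(codon):
    """Number of G or C bases in a codon (0 to 3)."""
    return sum(base in "GC" for base in codon)


def synonyms(codon):
    """All codons (including the input) that encode the same amino acid."""
    aa = GENETIC_CODE[codon]
    return [c for c in CODONS if GENETIC_CODE[c] == aa]


def relative_gc(codon):
    """Assumed measure: codon GC count minus the mean GC count of its synonyms."""
    return gc_count(codon) - mean(gc_count(c) for c in synonyms(codon))


def amino_acid_mean_gc(aa):
    """Mean GC fraction across all codons of an amino acid (one-letter code)."""
    codons = [c for c in CODONS if GENETIC_CODE[c] == aa]
    return mean(gc_count(c) for c in codons) / 3


if __name__ == "__main__":
    for codon in ("AAA", "AAG", "ATG"):
        print(codon, GENETIC_CODE[codon], relative_gc(codon))
    for aa in ("K", "F", "G"):
        print(aa, round(amino_acid_mean_gc(aa), 3))
```

Under this sketch, a codon with a positive relative GC value would be expected to increase in frequency as genome GC content rises, and an amino acid with a high mean codon GC fraction would be expected to do the same. Fitting the actual slopes and intercepts would require per-genome codon counts, which are not reproduced here.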
Our model suggests that GC content drives codon usage (rather than the converse). It unifies a large body of empirical evidence concerning relationships between GC content and amino-acid or codon usage in disparate systems. The relationship between GC content and codon and amino-acid usage is ahistorical; it is replicated independently in the three domains of living organisms, reinforcing the idea that genes and genomes at mutation/selection equilibrium reproduce a unique relationship between nucleic acid and protein composition. Thus, the model may be useful in predicting amino-acid or nucleotide sequences in poorly characterized taxa.
In a process similar to exon splicing, ciliates use DNA splicing to produce a new somatic macronuclear genome from their germline micronuclear genome after sexual reproduction. This extra layer of DNA rearrangement permits novel mechanisms to create genetic complexity during both evolution and development. Here we describe a chimeric macronuclear chromosome in Oxytricha trifallax constructed from two smaller macronuclear chromosomes. To determine how the chimera was generated, we cloned and sequenced the corresponding germline loci. The chimera derives from a novel locus in the micronucleus that arose by partial duplication of the loci for the two smaller chromosomes. This suggests that an exon shuffling-like process, which we call MDS shuffling, enables ciliates to generate novel genetic material and gene products using different combinations of genomic DNA segments.
Ciliate; Duplication; Gene shuffling; Chimeric chromosome; Micronucleus; Oxytricha
Despite comprising much of the eukaryotic genome, few transposons are active, and they usually confer no benefit to the host. Through an exaggerated process of genome rearrangement, Oxytricha trifallax destroys 95% of its germline genome during development. This includes the elimination of all transposon DNA. We show that germline-limited transposase genes play key roles in this process of genome-wide DNA excision, which suggests that transposases function in large eukaryotic genomes containing thousands of active transposons. We show that transposase gene expression occurs during germline-soma differentiation and that silencing of transposase by RNA interference leads to abnormal DNA rearrangement in the offspring. This study suggests a new important role in Oxytricha for this large portion of genomic DNA that was previously thought of as junk.
Cytosine methylation of DNA is conserved across eukaryotes and plays important functional roles regulating gene expression during differentiation and development in animals, plants and fungi. Hydroxymethylation was recently identified as another epigenetic modification marking genes important for pluripotency in embryonic stem cells.
Here we describe de novo cytosine methylation and hydroxymethylation in the ciliate Oxytricha trifallax. These DNA modifications occur only during nuclear development and programmed genome rearrangement. We detect methylcytosine and hydroxymethylcytosine directly by high-resolution nano-flow UPLC mass spectrometry, and indirectly by immunofluorescence, methyl-DNA immunoprecipitation and bisulfite sequencing. We describe these modifications in three classes of eliminated DNA: germline-limited transposons and satellite repeats, aberrant DNA rearrangements, and DNA from the parental genome undergoing degradation. Methylation and hydroxymethylation generally occur on the same sequence elements, modifying cytosines in all sequence contexts. We show that the DNA methyltransferase-inhibiting drugs azacitidine and decitabine induce demethylation of both somatic and germline sequence elements during genome rearrangements, with consequent elevated levels of germline-limited repetitive elements in exconjugant cells.
These data strongly support a functional link between cytosine DNA methylation/hydroxymethylation and DNA elimination. We identify a motif strongly enriched in methylated/hydroxymethylated regions, and we propose that this motif recruits DNA modification machinery to specific chromosomes in the parental macronucleus. No recognizable methyltransferase enzyme has yet been described in O. trifallax, raising the possibility that it might employ a novel cytosine methylation machinery to mark DNA sequences for elimination during genome rearrangements.
epigenetics; DNA degradation; heterochromatin; methyltransferase; 5-Aza-2'-deoxycytidine; 5-azacytidine; azacitidine; decitabine
The Oxytricha trifallax mitochondrial genome contains the largest sequenced ciliate mitochondrial chromosome (∼70 kb) plus a ∼5-kb linear plasmid bearing mitochondrial telomeres. We identify two new ciliate split genes (rps3 and nad2) as well as four new mitochondrial genes (ribosomal small subunit protein genes: rps- 2, 7, 8, 10), previously undetected in ciliates due to their extreme divergence. The increased size of the Oxytricha mitochondrial genome relative to other ciliates is primarily a consequence of terminal expansions, rather than the retention of ancestral mitochondrial genes. Successive segmental duplications, visible in one of the two Oxytricha mitochondrial subterminal regions, appear to have contributed to the genome expansion. Consistent with pseudogene formation and decay, the subtermini possess shorter, more loosely packed open reading frames than the remainder of the genome. The mitochondrial plasmid shares a 251-bp region with 82% identity to the mitochondrial chromosome, suggesting that it most likely integrated into the chromosome at least once. This region on the chromosome is also close to the end of the most terminal member of a series of duplications, hinting at a possible association between the plasmid and the duplications. The presence of mitochondrial telomeres on the mitochondrial plasmid suggests that such plasmids may be a vehicle for lateral transfer of telomeric sequences between mitochondrial genomes. We conjecture that the extreme divergence observed in ciliate mitochondrial genomes may be due, in part, to repeated invasions by relatively error-prone DNA polymerase-bearing mobile elements.
split genes; segmental duplication; genome expansion; linear mitochondrial plasmid; mobile elements; extreme mitochondrial divergences
We took advantage of the unusual genomic organization of the ciliate Oxytricha trifallax to screen for eukaryotic non-coding RNA (ncRNA) genes. Ciliates have two types of nuclei: a germ line micronucleus that is usually transcriptionally inactive, and a somatic macronucleus that contains a reduced, fragmented and rearranged genome that expresses all genes required for growth and asexual reproduction. In some ciliates including Oxytricha, the macronuclear genome is particularly extreme, consisting of thousands of tiny ‘nanochromosomes’, each of which usually contains only a single gene. Because the organism itself identifies and isolates most of its genes on single-gene nanochromosomes, nanochromosome structure could facilitate the discovery of unusual genes or gene classes, such as ncRNA genes. Using a draft Oxytricha genome assembly and a custom-written protein-coding genefinding program, we identified a subset of nanochromosomes that lack any detectable protein-coding gene, thereby strongly enriching for nanochromosomes that carry ncRNA genes. We found only a small proportion of non-coding nanochromosomes, suggesting that Oxytricha has few independent ncRNA genes besides homologs of already known RNAs. Other than new members of known ncRNA classes including C/D and H/ACA snoRNAs, our screen identified one new family of small RNA genes, named the Arisong RNAs, which share some of the features of small nuclear RNAs.
2009 marks not only the 200th anniversary of Darwin's birth but also publication of the first scientific evolutionary theory, Lamarck's Philosophie Zoologique. While Lamarck embraced the notion of the inheritance of acquired characters, he did not invent it. New phenomena discovered recently offer molecular pathways for the transmission of several acquired characters. Ciliates have long provided model systems to study phenomena that bypass traditional modes of inheritance. RNA, normally thought of as a conduit in gene expression, displays a novel mode of action in ciliated protozoa. For example, maternal RNA templates provide both an organizing guide for DNA rearrangements in Oxytricha and a template that can transmit spontaneous mutations that may arise during somatic growth to the next generation, providing two such mechanisms of so-called Lamarckian inheritance. This suggests that the somatic ciliate genome is really an "epigenome", formed through templates and signals arising from the previous generation. This review will discuss these new biological roles for RNA, including noncoding "template" RNA molecules. The evolutionary consequences of viable mechanisms in ciliates to transmit acquired characters may create an additional store of heritable variation that contributes to the cosmopolitan success of this diverse lineage of microbial eukaryotes.
Genome-wide DNA rearrangements occur in many eukaryotes but are most exaggerated in ciliates, making them ideal model systems for epigenetic phenomena. During development of the somatic macronucleus, Oxytricha trifallax destroys 95% of its germ line, severely fragmenting its chromosomes, and then unscrambles hundreds of thousands of remaining fragments by permutation or inversion. Here we demonstrate that DNA or RNA templates can orchestrate these genome rearrangements in Oxytricha, supporting an epigenetic model for sequence-dependent comparison between germline and somatic genomes. A complete RNA cache of the maternal somatic genome may be available at a specific stage during development to provide a template for correct and precise DNA rearrangement. We show the existence of maternal RNA templates that could guide DNA assembly, and that disruption of specific RNA molecules disables rearrangement of the corresponding gene. Injection of artificial templates reprogrammes the DNA rearrangement pathway, suggesting that RNA molecules guide genome rearrangement.
We present BLAST on Orthologous groups (BLASTO), a modified BLAST tool for searching orthologous group data. It treats each orthologous group as a unit and outputs a ranked list of orthologous groups instead of single sequences. By filtering out redundancy and putative paralogs, sequence comparisons to orthologous groups, instead of to single sequences in the database, can improve both functional prediction and phylogenetic inference. BLASTO computes the significance score of each orthologous group based on the individual BLAST hits in the orthologous group, using the number of taxa in the group as an optional weight. This allows users to control the species diversity of the orthologous groups. BLASTO incorporates the best-known multispecies ortholog databases, including NCBI Clusters of Orthologous Group, NCBI euKaryotic Orthologous Group database, OrthoMCL, MultiParanoid and TIGR Eukaryotic Gene Orthologues database, and offers a useful platform to integrate orthology information into functional inference and evolutionary studies of individual sequences. BLASTO is accessible online at http://oxytricha.princeton.edu/BlastO.
The somatic DNA molecules of spirotrichous ciliates are present as linear chromosomes containing mostly single-gene coding sequences with short 5' and 3' flanking regions. Only a few conserved motifs have been found in the flanking DNA. Motifs that may play roles in promoting and/or regulating transcription have not been consistently detected. Moreover, comparing subtelomeric regions of 1,356 end-sequenced somatic chromosomes failed to identify more putatively conserved motifs.
We sequenced and compared DNA and RNA versions of the DNA polymerase α (pol α) gene from nine diverged spirotrichous ciliates. We identified a G-C rich motif aaTACCGC(G/C/T) upstream from transcription start sites in all nine pol α orthologs. Furthermore, we consistently found likely polyadenylation signals, similar to the eukaryotic consensus AAUAAA, within 35 nt upstream of the polyadenylation sites. Numbers of introns differed among orthologs, suggesting independent gain or loss of some introns during the evolution of this gene. Finally, we discuss the occurrence of short direct repeats flanking some introns in the DNA pol α genes. These introns flanked by direct repeats resemble a class of DNA sequences called internal eliminated sequences (IES) that are deleted from ciliate chromosomes during development.
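A scan for the two sequence features described above might look like the following sketch. It assumes that the lowercase "aa" in aaTACCGC(G/C/T) marks less strictly conserved positions, so it searches for the core TACCGC(G/C/T) and reports separately whether "AA" precedes it. It also assumes an exact match to the DNA form of the consensus (AATAAA) for the polyadenylation signal, although the abstract describes the observed signals only as similar to that consensus. The function names and the test sequences are hypothetical and synthetic.

```python
import re

# Core of the upstream motif; [GCT] encodes the (G/C/T) position.
CORE_MOTIF = re.compile(r"TACCGC[GCT]")


def find_upstream_motif(seq):
    """Return (start, matched_core, preceded_by_AA) for each core motif hit."""
    seq = seq.upper()
    hits = []
    for match in CORE_MOTIF.finditer(seq):
        start = match.start()
        preceded_by_aa = seq[max(0, start - 2):start] == "AA"
        hits.append((start, match.group(), preceded_by_aa))
    return hits


def polya_signals(seq, polya_site, window=35, signal="AATAAA"):
    """Return start positions of `signal` lying entirely within `window` nt
    upstream of `polya_site` (a 0-based index into `seq`)."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    region_start = max(0, polya_site - window)
    region = seq[region_start:polya_site]
    last_start = len(region) - len(signal)
    return [region_start + i
            for i in range(last_start + 1)
            if region.startswith(signal, i)]


if __name__ == "__main__":
    # Synthetic sequences for illustration only, not real pol alpha data.
    print(find_upstream_motif("GGAATACCGCTGG"))
    toy_3prime = "CCCC" + "AATAAA" + "G" * 20
    print(polya_signals(toy_3prime, polya_site=len(toy_3prime)))
```

Relaxing the exact-match requirement, for example by allowing one mismatch to AATAAA, would be a natural extension for divergent ciliate sequences, but the tolerance to use is not stated in the abstract.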
Our results suggest that conserved motifs are present at both 5' and 3' untranscribed regions of the DNA pol α genes in nine spirotrichous ciliates. We also show that several independent gains and losses of introns in the DNA pol α genes have occurred in the spirotrichous ciliate lineage. Finally, our statistical results suggest that proven introns might also function in an IES removal pathway. This could strengthen a recent hypothesis that introns evolve into IESs, explaining the scarcity of introns in spirotrichs. Alternatively, the analysis suggests that ciliates might occasionally use intron splicing to correct, at the RNA level, failures in IES excision during developmental DNA elimination.
This article was reviewed by Dr. Alexei Fedorov (referred by Dr. Manyuan Long), Dr. Martin A. Huynen and Dr. John M. Logsdon.
We present a bioinformatic web server (SWAKK) for detecting amino acid sites or regions of a protein under positive selection. It estimates the ratio of non-synonymous to synonymous substitution rates (KA/KS) between a pair of protein-coding DNA sequences, by sliding a 3D window, or sphere, across one reference structure. The program displays the results on the 3D protein structure. In addition, for comparison or when a reference structure is unavailable, the server can also perform a sliding window analysis on the primary sequence. The SWAKK web server is available at .
It has long been thought that the stop codon in a gene is followed by another stop codon that acts as a backup if the real one is read through by a near-cognate tRNA. The existence of such 'tandem stop codons', however, remains elusive.
Here we show that a statistical excess of stop codons has evolved at the third codon downstream of the real stop codon UAA in yeasts. Comparative analysis indicates that stop codons at this location are considerably more conserved than sense codons, suggesting that these tandem stop codons are maintained by selection. We evaluated the influence of expression levels of genes and other biological factors on the distribution of tandem stop codons. Our results suggest that expression level is an important factor influencing the presence of tandem stop codons.
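The kind of tally behind such a statement can be sketched as follows: for a set of 3' sequences that begin immediately after the annotated stop codon, compute the fraction of in-frame stop codons at each downstream codon position. The numbering convention (position 1 is the codon directly after the real stop), the function names, and the toy sequences are assumptions for illustration; restricting the input to genes ending in UAA and testing for a statistical excess are left out.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def downstream_codon(utr, k):
    """Return the k-th in-frame codon after the real stop (k is 1-based),
    or None if the sequence is too short."""
    start = 3 * (k - 1)
    codon = utr[start:start + 3].upper().replace("U", "T")
    return codon if len(codon) == 3 else None


def stop_fraction_by_position(utrs, max_k=6):
    """Map each downstream codon position to the fraction of sequences
    that carry a stop codon there (None if no sequence reaches it)."""
    fractions = {}
    for k in range(1, max_k + 1):
        codons = [c for c in (downstream_codon(u, k) for u in utrs) if c]
        if codons:
            fractions[k] = sum(c in STOP_CODONS for c in codons) / len(codons)
        else:
            fractions[k] = None
    return fractions


if __name__ == "__main__":
    # Synthetic downstream sequences, for illustration only.
    toy_utrs = ["AAACCCTAAGGG", "GGGTTTTGACCC", "ATGATGATG"]
    for position, fraction in stop_fraction_by_position(toy_utrs, max_k=4).items():
        print(position, fraction)
```

An excess at a given position would then be judged against the stop-codon frequency expected from the base composition of the same downstream regions, a comparison this sketch does not attempt.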
Our study demonstrates the existence of tandem stop codons, which represent one of many meaningful genomic features that are driven by relatively weak selective forces.