A draft sequence of the compact genome of the sea squirt Ciona intestinalis illuminates how chordates originated and how vertebrate developmental innovations evolved.
A draft sequence of the compact genome of the sea squirt Ciona intestinalis, a non-vertebrate chordate that diverged very early from other chordates, including vertebrates, illuminates how chordates originated and how vertebrate developmental innovations evolved.
Achievement of transposon-mediated germline transgenesis in a basal chordate, Ciona intestinalis, is discussed. A Tc1/mariner superfamily transposon, Minos, has excision and transposition activities in Ciona. Minos enables the creation of stable transgenic lines, enhancer detection, and insertional mutagenesis.
The vitamin D receptor (VDR) and pregnane X receptor (PXR) are nuclear hormone receptors of the NR1I subfamily that show contrasting patterns of cross-species variation. VDR and PXR are thought to have arisen from duplication of an ancestral gene, evident now as a single gene in the genome of the chordate invertebrate Ciona intestinalis (sea squirt). VDR genes have been detected in a wide range of vertebrates including jawless fish. To date, PXR genes have not been found in cartilaginous fish. In this study, the ligand selectivities of VDRs were compared in detail across a range of vertebrate species and compared with those of the Ciona VDR/PXR. In addition, several assays were used to search for evidence of PXR-mediated hepatic effects in three model non-mammalian species: sea lamprey (Petromyzon marinus), zebrafish (Danio rerio), and African clawed frog (Xenopus laevis).
Human, mouse, frog, zebrafish, and lamprey VDRs were found to have similar ligand selectivities for vitamin D derivatives. In contrast, using cultured primary hepatocytes, only zebrafish showed evidence of PXR-mediated induction of enzyme expression, with increases in testosterone 6β-hydroxylation activity (a measure of cytochrome P450 3A activity in other species) and flurbiprofen 4-hydroxylation activity (a measure of cytochrome P450 2C activity) following exposure to known PXR activators. A separate assay in vivo using zebrafish demonstrated increased hepatic transcription of another PXR target, multidrug resistance gene (ABCB5), following injection of the major zebrafish bile salt, 5α-cyprinol 27-sulfate. The PXR target function, testosterone hydroxylation, was detected in frog and sea lamprey primary hepatocytes, but was not inducible in these two species by a wide range of compounds that activate PXR in other animals. Analysis of the sea lamprey draft genome also did not show evidence of a PXR gene.
Our results show tight conservation of ligand selectivity of VDRs across vertebrate species from Agnatha to mammals. Using a functional approach, we demonstrate classic PXR-mediated effects in zebrafish, but not in sea lamprey or African clawed frog liver cells. Using a genomic approach, we failed to find evidence of a PXR gene in lamprey, suggesting that VDR may be the original NR1I gene.
The primitive chordate Ciona intestinalis has emerged as a significant model system for the study of heart development. The Ciona embryo employs a conserved heart gene network in the context of extremely low cell numbers and reduced genetic redundancy. Here, I review recent studies on the molecular genetics of Ciona cardiogenesis as well as classic work on heart anatomy and physiology. I also discuss the potential of employing Ciona to decipher a comprehensive chordate gene network and to determine how this network controls heart morphogenesis.
Heart development; Organogenesis; Gene regulation; Chordate evolution
The notochord is a defining feature of the chordate clade, and invertebrate chordates, such as tunicates, are uniquely suited for studies of this structure. Here we used a well-characterized set of 50 notochord genes known to be targets of the notochord-specific Brachyury transcription factor in one tunicate, Ciona intestinalis (Class Ascidiacea), to begin determining whether the same genetic toolkit is employed to build the notochord in another tunicate, Oikopleura dioica (Class Larvacea). We identified Oikopleura orthologs of the Ciona notochord genes, as well as lineage-specific duplicates for which we determined the phylogenetic relationships with related genes from other chordates, and we analyzed their expression patterns in Oikopleura embryos.
Of the 50 Ciona notochord genes that were used as a reference, only 26 had clearly identifiable orthologs in Oikopleura. Two of these conserved genes appeared to have undergone Oikopleura- and/or tunicate-specific duplications, and one was present in three copies in Oikopleura, thus bringing the number of genes to test to 30. We were able to clone and test 28 of these genes. Thirteen of the 28 Oikopleura orthologs of Ciona notochord genes showed clear expression in all or in part of the Oikopleura notochord, seven were diffusely expressed throughout the tail, six were expressed in tissues other than the notochord, while two probes did not provide a detectable signal at any of the stages analyzed. One of the notochord genes identified, Oikopleura netrin, was found to be unevenly expressed in notochord cells, in a pattern reminiscent of that previously observed for one of the Oikopleura Hox genes.
A surprisingly high number of Ciona notochord genes do not have apparent counterparts in Oikopleura, and only a fraction of the evolutionarily conserved genes show clear notochord expression. This suggests that Ciona and Oikopleura, despite the morphological similarities of their notochords, have developed rather divergent sets of notochord genes after their split from a common tunicate ancestor. This study demonstrates that comparisons between divergent tunicates can lead to insights into the basic complement of genes sufficient for notochord development, and elucidate the constraints that control its composition.
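The gene counts reported for the Ciona/Oikopleura comparison can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names are illustrative, and the breakdown of extra copies (one per duplicated gene, two for the gene present in three copies) is an assumption chosen because it reproduces the reported total of 30.

```python
# Counts taken from the abstract above.
reference_genes = 50      # Ciona notochord genes used as the reference set
orthologs_found = 26      # with clearly identifiable Oikopleura orthologs

# Assumption: each of the two duplicated genes contributes one extra copy,
# and the gene present in three copies contributes two extra copies.
extra_copies = 2 * 1 + 2
genes_to_test = orthologs_found + extra_copies

# Expression categories reported for the genes that were cloned and tested.
expression_counts = {
    "notochord (all or part)": 13,
    "diffuse throughout tail": 7,
    "other tissues": 6,
    "no detectable signal": 2,
}
genes_tested = sum(expression_counts.values())

ortholog_fraction = orthologs_found / reference_genes
notochord_fraction = expression_counts["notochord (all or part)"] / genes_tested

print(f"genes to test: {genes_to_test}, genes tested: {genes_tested}")
print(f"orthologs identified: {ortholog_fraction:.0%}")
print(f"tested genes with notochord expression: {notochord_fraction:.0%}")
```

Under these counts, roughly half of the reference genes have an identifiable ortholog, and fewer than half of the tested orthologs show clear notochord expression, which is the basis for the conclusion about divergent notochord gene sets.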
To gain insight into the evolutionary features of the huntingtin (htt) gene in Chordata, we have sequenced and characterized the full-length htt mRNA in the ascidian Ciona intestinalis, a basal chordate emerging as a new invertebrate model organism. Moreover, taking advantage of the availability of genomic and EST sequences, the htt gene structure of a number of chordate species, including the congeneric ascidian Ciona savignyi, and the vertebrates Xenopus and Gallus was reconstructed.
The C. intestinalis htt transcript exhibits some peculiar features, such as spliced leader trans-splicing in the 98 nt-long 5' untranslated region (UTR), an alternative splicing in the coding region, eight alternative polyadenylation sites, and no similarity of either the 5' or the 3' UTR to its counterpart in the congeneric C. savignyi. The predicted protein is 2946 amino acids long, shorter than its vertebrate homologs, and lacks the polyQ and the polyP stretches found in the N-terminal regions of mammalian homologs. The exon-intron organization of the htt gene is almost identical among vertebrates, and significantly conserved between Ciona and vertebrates, allowing us to hypothesize an ancestral chordate gene consisting of at least 40 coding exons.
During chordate diversification, events of gain/loss, sliding, phase changes, and expansion of introns occurred in both vertebrate and ascidian lineages predominantly in the 5'-half of the htt gene, where there is also evidence of lineage-specific evolutionary dynamics in vertebrates. On the contrary, the 3'-half of the gene is highly conserved in all chordates at the level of both gene structure and protein sequence. Between the two Ciona species, a fast evolutionary rate and/or an early divergence time is suggested by the absence of significant similarity between UTRs, protein divergence comparable to that observed between mammals and fishes, and different distribution of repetitive elements.
To reconstruct a minimum complement of notochord genes evolutionarily conserved across chordates, we scanned the Ciona intestinalis genome using the sequences of 182 genes reported to be expressed in the notochord of different vertebrates and identified 139 candidate notochord genes. For 66 of these Ciona genes expression data were already available, hence we analyzed the expression of the remaining 73 genes and found notochord expression for 20. The predicted products of the newly identified notochord genes range from the transcription factors Ci-XBPa and Ci-miER1 to extracellular matrix proteins. We examined the expression of the newly identified notochord genes in embryos ectopically expressing Ciona Brachyury (Ci-Bra) and in embryos expressing a repressor form of this transcription factor in the notochord, and we found that while a subset of the genes examined are clearly responsive to Ci-Bra, other genes are not affected by alterations in its levels. We provide a first description of notochord genes that are not evidently influenced by the ectopic expression of Ci-Bra and we propose alternative regulatory mechanisms that might control their transcription.
Ciona; ascidian; notochord; chordate; evolution; Brachyury
The genomes of many marine invertebrates, including the purple sea urchin and the solitary ascidians Ciona intestinalis and Ciona savignyi, show exceptionally high levels of heterozygosity, implying that these populations are highly polymorphic. Analysis of the C. savignyi genome found little evidence to support an elevated mutation rate, but rather points to a large population size contributing to the polymorphism level. In the present study, the relative genetic polymorphism levels in sampled populations of ten different ascidian species were determined using a similarity index generated by AFLP analysis. The goal was to determine the range of polymorphism within the populations of different species, and to uncover factors that may contribute to the high level of polymorphism. We observe that, surprisingly, the levels of polymorphism within these species show a negative correlation with the reported age of invasive populations, and that closely related species show substantially different levels of genetic polymorphism. These findings show exceptions to the assumptions that invasive species start with a low level of genetic polymorphism that increases over time and that closely related species have similar levels of genetic polymorphism.
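The abstract does not say which similarity index was computed from the AFLP data. As a minimal sketch, assuming the widely used Dice coefficient on band presence/absence profiles, a within-population similarity score could be computed as follows. The function names and the toy profiles are hypothetical and not taken from the study.

```python
from itertools import combinations


def dice_similarity(a, b):
    """Dice coefficient for two equal-length 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    # Two profiles with no bands at all are treated as identical.
    return 2 * shared / total if total else 1.0


def mean_pairwise_similarity(profiles):
    """Average Dice similarity over all pairs of individuals."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("need at least two individuals")
    return sum(dice_similarity(a, b) for a, b in pairs) / len(pairs)


# Toy data: three individuals scored for five AFLP bands (1 = band present).
toy_profiles = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0],
]
population_similarity = mean_pairwise_similarity(toy_profiles)
print(f"mean pairwise similarity: {population_similarity:.3f}")
```

A lower mean similarity corresponds to a more polymorphic population, so species can be ranked by this score even when absolute heterozygosity is unknown.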
The past few years have seen a vast increase in the amount of genomic data available for a growing number of taxa, including sets of full length cDNA clones and cis-regulatory sequences. Large scale cross-species comparisons of protein function and cis-regulatory sequences may help to understand the emergence of specific traits during evolution.
To facilitate such comparisons, we developed a Gateway compatible vector set, which can be used to systematically dissect cis-regulatory sequences, and overexpress wild type or tagged proteins in a variety of chordate systems. It was developed and first characterised in the embryos of the ascidian Ciona intestinalis, in which large scale analyses are easier to perform than in vertebrates, owing to the very efficient embryo electroporation protocol available in this organism. Its use was then extended to fish embryos and cultured mammalian cells.
This versatile vector set opens the way to the mid- to large-scale comparative analyses of protein function and cis-regulatory sequences across chordate evolution. A complete user manual is provided as supplemental material.
The Ensembl project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and European hedgehog; the fish genomes of stickleback and medaka and the second example of the genomes of the sea squirt (Ciona savignyi) and the mosquito (Aedes aegypti). Some of the major features added during the year include the first complete gene sets for genomes with low-sequence coverage, the introduction of new strain variation data and the introduction of new orthology/paralog annotations based on gene trees.
Genome comparisons are behind the powerful new annotation methods being developed to find all human genes, as well as genes from other genomes. Genomes are now frequently being studied in pairs to provide cross-comparison datasets. This 'Noah's Ark' approach often reveals unsuspected genes and may support the deletion of false-positive predictions. Joining mouse and human as the cross-comparison dataset for the first two mammals are: two Drosophila species, D. melanogaster and D. pseudoobscura; two sea squirts, Ciona intestinalis and Ciona savignyi; four yeast (Saccharomyces) species; two nematodes, Caenorhabditis elegans and Caenorhabditis briggsae; and two pufferfish (Takifugu rubripes and Tetraodon nigroviridis). Even genomes like yeast and C. elegans, which have been known for more than five years, are now being significantly improved. Methods developed for yeast or nematodes will now be applied to mouse and human, and soon to additional mammals such as rat and dog, to identify all the mammalian protein-coding genes. Current large disparities between human Unigene predictions (127,835 genes) and gene-scanning methods (45,000 genes) still need to be resolved. This will be the challenge during the next few years.
human genome; mouse genome; Caenorhabditis elegans genome; Caenorhabditis briggsae genome; Saccharomyces genomes; comparative genomics; gene discovery; gene-prediction algorithms
A novel method for prediction of miRs from deep sequencing data. Its utility is demonstrated when applied to Ciona data.
MicroRNAs (miRs) have been broadly implicated in animal development and disease. We developed a novel computational strategy for the systematic, whole-genome identification of miRs from high throughput sequencing information. This method, miRTRAP, incorporates the mechanisms of miR biogenesis and includes additional criteria regarding the prevalence and quality of small RNAs arising from the antisense strand and neighboring loci. This program was applied to the simple chordate Ciona intestinalis and identified nearly 400 putative miR loci.
This study reports the first collection of validated microRNA genes in the sea squirt, Ciona intestinalis. MicroRNAs are processed from hairpin precursors to ~22 nucleotide RNAs that base pair to target mRNAs and inhibit expression. As a member of the subphylum Urochordata (Tunicata) whose larval form has a notochord, the sea squirt is situated at the emergence of vertebrates, and therefore may provide information about the evolution of molecular regulators of early development.
In this study, computational methods were used to predict 14 microRNA gene families in Ciona intestinalis. The microRNA prediction algorithm utilizes configurable microRNA sequence conservation and stem-loop specificity parameters, grouping by miRNA family, and phylogenetic conservation to the related species, Ciona savignyi. The expression for 8, out of 9 attempted, of the putative microRNAs in the adult tissue of Ciona intestinalis was validated by Northern blot analyses. Additionally, a target prediction algorithm was implemented, which identified a high confidence list of 240 potential target genes. Over half of the predicted targets can be grouped into the gene ontology categories of metabolism, transport, regulation of transcription, and cell signaling.
The computational techniques implemented in this study can be applied to other organisms and serve to increase the understanding of the origins of non-coding RNAs, embryological and cellular developmental pathways, and the mechanisms for microRNA-controlled gene regulatory networks.
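The abstract mentions a target prediction algorithm without describing it. One common ingredient of such algorithms is seed matching, in which the mRNA is scanned for the reverse complement of microRNA nucleotides 2 to 8. The sketch below illustrates that generic idea only; it is not the method used in the study, and the function names and the toy sequences are assumptions made for illustration.

```python
# RNA complement table used to build the site a seed would pair with.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=2, end=8):
    """Return the mRNA site complementary to the miRNA seed (1-based, inclusive)."""
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of every seed match in the UTR."""
    site = seed_site(mirna)
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)
    return positions


# Toy sequences (illustrative only, not data from the study).
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAACUACCUCAGGGCUACCUCUU"

print(seed_site(toy_mirna))
print(find_seed_matches(toy_mirna, toy_utr))
```

Real predictors typically add further filters, such as conservation in a related species (here, Ciona savignyi would be the natural choice) and thermodynamic scoring, to reduce false positives from short seed matches.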
Vertebrate embryos exploit the mutual inhibition between the RA and FGF signalling pathways to coordinate the proliferative elongation of the main body axis with the progressive patterning and differentiation of its neuroectodermal and paraxial mesodermal structures. The evolutionary history of this patterning system is still poorly understood. Here, we investigate the role played by the RA and FGF/MAPK signals during the development of the tail structures in the tunicate Ciona intestinalis, an invertebrate chordate belonging to the sister clade of vertebrates, in which the prototypical chordate body plan is established through very derived morphogenetic processes. Ciona embryos consist of few cells and develop according to a fixed lineage; elongation of the tail occurs largely by rearrangement of postmitotic cells; mesoderm segmentation and somitogenesis are absent. We show that in the Ciona embryo, the antagonism of the RA and FGF/MAPK signals is required to control the anteroposterior patterning of the tail epidermis. We also demonstrate that the RA, FGF/MAPK and canonical Wnt pathways control the anteroposterior patterning of the tail peripheral nervous system, and reveal the existence of distinct subpopulations of caudal epidermal neurons with different responsiveness to the RA, FGF/MAPK and canonical Wnt signals. Our data provide the first demonstration that the use of the antagonism between the RA and FGF signals to pattern the main body axis predates the emergence of vertebrates and highlight the evolutionary plasticity of this patterning strategy, showing that in different chordates it can be used to pattern different tissues within the same homologous body region.
The pregnane X receptor (PXR) shows the highest degree of cross-species sequence diversity of any of the vertebrate nuclear hormone receptors. In this study, we determined the pharmacophores for activation of human, mouse, rat, rabbit, chicken, and zebrafish PXRs, using a common set of sixteen ligands. In addition, we compared in detail the selectivity of human and zebrafish PXRs for steroidal compounds and xenobiotics. The ligand activation properties of the Western clawed frog (Xenopus tropicalis) PXR and that of a putative vitamin D receptor (VDR)/PXR cloned in this study from the chordate invertebrate sea squirt (Ciona intestinalis) were also investigated.
Using a common set of ligands, human, mouse, and rat PXRs share structurally similar pharmacophores consisting of hydrophobic features and widely spaced excluded volumes indicative of large binding pockets. Zebrafish PXR has the most sterically constrained pharmacophore of the PXRs analyzed, suggesting a smaller ligand-binding pocket than the other PXRs. Chicken PXR possesses a symmetrical pharmacophore with four hydrophobes, a hydrogen bond acceptor, as well as excluded volumes. Comparison of human and zebrafish PXRs for a wide range of possible activators revealed that zebrafish PXR is activated by a subset of human PXR agonists. The Ciona VDR/PXR showed low sequence identity to vertebrate VDRs and PXRs in the ligand-binding domain and was preferentially activated by planar xenobiotics including 6-formylindolo-[3,2-b]carbazole. Lastly, the Western clawed frog (Xenopus tropicalis) PXR was insensitive to vitamins and steroidal compounds and was activated only by benzoates.
In contrast to other nuclear hormone receptors, PXRs show significant differences in ligand specificity across species. By pharmacophore analysis, certain PXRs share similar features such as human, mouse, and rat PXRs, suggesting overlap of function and perhaps common evolutionary forces. The Western clawed frog PXR, like that described for African clawed frog PXRs, has diverged considerably in ligand selectivity from fish, bird, and mammalian PXRs.
Integrins are a functionally significant family of metazoan cell surface adhesion receptors. The receptors are dimers composed of an alpha and a beta chain. Vertebrate genomes encode an expanded set of integrin alpha and beta chains in comparison with protostomes such as Drosophila or the nematode worm. The publication of the genome of a basal chordate, Ciona intestinalis, provides a unique opportunity to gain further insight into how and when the expanded integrin supergene family found in vertebrates evolved.
The Ciona genome encodes eleven α and five β chain genes that are highly homologous to their vertebrate homologues. Eight of the α chains contain an A-domain that lacks the short alpha helical region present in the collagen-binding vertebrate alpha chains. Phylogenetic analyses indicate the eight A-domain containing α chains cluster to form an ascidian-specific clade that is related to, but distinct from, the vertebrate A-domain clade. Two Ciona α chains cluster in the laminin-binding clade and the remaining chain clusters in the clade that binds the RGD tripeptide sequence. Of the five Ciona β chains, three form an ascidian-specific clade, one clusters in the vertebrate β1 clade and the remaining Ciona chain is the orthologue of the vertebrate β4 chain.
The Ciona repertoire of integrin genes provides new insight into the basic set of these receptors available at the beginning of vertebrate evolution. The ascidian and vertebrate α chain A-domain clades originated from a common precursor but radiated separately in each lineage. It would appear that the acquisition of collagen binding capabilities occurred in the chordate lineage after the divergence of ascidians.
Synapsins are neuronal phosphoproteins involved in several functions correlated with both neurotransmitter release and synaptogenesis. The comprehension of the basal role of the synapsin family is hampered in vertebrates by the existence of multiple synapsin genes. Therefore, studying homologous genes in basal chordates, devoid of genome duplication, could help to achieve a better understanding of the complex functions of these proteins.
In this study we report the cloning and characterization of the Ciona intestinalis and amphioxus Branchiostoma floridae synapsin transcripts and the definition of their gene structure using available C. intestinalis and B. floridae genomic sequences. We demonstrate the occurrence, in both model organisms, of a single member of the synapsin gene family. Full-length synapsin genes were identified in the recently sequenced genomes of phylogenetically diverse metazoans. Comparative genome analysis reveals extensive conservation of the SYN locus in several metazoans. Moreover, developmental expression studies underline that synapsin is a neuronal-specific marker in basal chordates and is expressed in several cell types of PNS and in many, if not all, CNS neurons.
Our study demonstrates that synapsin genes are metazoan genes present in a single copy per genome, except for vertebrates. Moreover, we hypothesize that, during the evolution of synapsin proteins, new domains were added at different stages, probably to cope with the increased complexity in the nervous system organization. Finally, we demonstrate that protochordate synapsin is restricted to the post-mitotic phase of CNS development and thereby is a good marker of postmitotic neurons.
Although spliced leader (SL) trans-splicing in the chordates was discovered in the tunicate Ciona intestinalis, there has been no genomic overview analysis of the extent of trans-splicing or the make-up of the trans-spliced and non-trans-spliced gene populations of this model organism. Here we report such an analysis for Ciona based on the oligo-capping full-length cDNA approach. We randomly sampled 2078 5′-full-length ESTs representing 668 genes, or 4.2% of the entire genome. Our results indicate that Ciona contains a single major SL, which is efficiently trans-spliced to mRNAs transcribed from a specific set of genes representing ∼50% of the total number of expressed genes, and that individual trans-spliced mRNA species are, on average, 2–3-fold less abundant than non-trans-spliced mRNA species. Our results also identify a relationship between trans-splicing status and gene functional classification; ribosomal protein genes fall predominantly into the non-trans-spliced category. In addition, our data provide the first evidence for the occurrence of polycistronic transcription in Ciona. An interesting feature of the Ciona polycistronic transcription units is that the great majority entirely lack intercistronic sequences.
Molecular chaperones play crucial roles in various aspects of the biogenesis and maintenance of proteins in the cell. The heat shock protein 70 (HSP70) chaperone system, in which HSP70 proteins act as chaperones, is one of the major molecular chaperone systems conserved among a variety of organisms. To shed light on the evolutionary history of the constituents of the chordate HSP70 chaperone system and to identify all of the components of the HSP70 chaperone system in ascidians, we carried out a comprehensive survey for HSP70s and their cochaperones in the genome of Ciona intestinalis. We characterized all members of the Ciona HSP70 superfamily, J-proteins, BAG family, and some other types of cochaperones. The Ciona genome contains 8 members of the HSP70 superfamily, all of which have human and protostome counterparts. Members of the STCH subfamily of the HSP70 family and members of the HSPA14 subfamily of the HSP110 family are conserved between humans and protostomes but were not found in Ciona. The Ciona genome encodes 36 J-proteins, 32 of which belong to groups conserved in humans and protostomes. Three proteins seem to be unique to Ciona. J-proteins of the RBJ group are conserved between humans and Ciona but were not found in protostomes, whereas J-proteins of the DNAJC14, ZCSL3, FLJ13236, and C21orf55 groups are conserved between humans and protostomes but were not found in Ciona. J-proteins of the sacsin group seem to be specific to vertebrates. There is also a J-like protein without a conserved HPD tripeptide motif in the Ciona genome. The Ciona genome encodes 3 types of BAG family proteins, all of which have human and protostome counterparts (BAG1, BAG3, and BAT3). The BAG2 group is conserved between humans and protostomes but was not found in Ciona, and the BAG4 and BAG5 groups seem to be specific to vertebrates. Members of the SIL1, UBQLN, UBADC1, TIMM44, GRPEL, and Magmas groups, which are conserved between humans and protostomes, were also found in Ciona.
No Ciona member was retrieved for the HSPBP1 group, which is conserved between humans and protostomes. For several groups of the HSP70 superfamily, J-proteins, and other types of cochaperones, multiple members in humans are represented by a single counterpart in Ciona. These results show that genes of the HSP70 chaperone system can be distinguished into groups that are shared by vertebrates, Ciona, and protostomes, ones shared by vertebrates and protostomes, ones shared by vertebrates and Ciona, and ones specific to vertebrates, Ciona, or protostomes. These results also demonstrate that the components of the HSP70 chaperone system in Ciona are similar to but simpler than those in humans and suggest that changes of the genome in the lineage leading to humans after the separation from that leading to Ciona increased the number and diversity of members of the HSP70 chaperone system. Changes of the genome in the lineage leading to Ciona also seem to have made the HSP70 chaperone system in this species slightly simpler than that in the common ancestor of humans and Ciona.
In silico and experimental approaches have been used to identify the non-long terminal repeat retrotransposons of the urochordate Ciona intestinalis providing valuable data for understanding the evolution of early chordate genomes.
Non-long terminal repeat (non-LTR) retrotransposons have contributed to shaping the structure and function of genomes. In silico and experimental approaches have been used to identify the non-LTR elements of the urochordate Ciona intestinalis. Knowledge of the types and abundance of non-LTR elements in urochordates is a key step in understanding their contribution to the structure and function of vertebrate genomes.
Consensus elements phylogenetically related to the I, LINE1, LINE2, LOA and R2 elements of the 14 eukaryotic non-LTR clades are described from C. intestinalis. The ascidian elements showed conservation of both the reverse transcriptase coding sequence and the overall structural organization seen in each clade. The apurinic/apyrimidinic endonuclease and nucleic-acid-binding domains encoded upstream of the reverse transcriptase, and the RNase H and the restriction enzyme-like endonuclease motifs encoded downstream of the reverse transcriptase were identified in the corresponding Ciona families.
The genome of C. intestinalis harbors representatives of at least five clades of non-LTR retrotransposons. The copy number per haploid genome of each element is low, less than 100, far below the values reported for vertebrate counterparts but within the range for protostomes. Genomic and sequence analysis shows that the ascidian non-LTR elements are unmethylated and flanked by genomic segments with a gene density lower than average for the genome. The analysis provides valuable data for understanding the evolution of early chordate genomes and enlarges the view on the distribution of the non-LTR retrotransposons in eukaryotes.
Fractional DNA methylation in sea squirts evolved to global DNA methylation in fish. The impact of global DNA methylation is reflected by more CpG depletions and/or more A/T to G/C changes at CpG flanking positions due to context-dependent mutations of methylated CpG sites.
Methods and Findings
In this report, we demonstrate that the sea squirt genes have undergone more CpG to TpG/CpA substitutions than the fish orthologs using homologous fragments from orthologous genes among Ciona intestinalis, Ciona savignyi, fugu and zebrafish. To avoid premature stop codons, the TGA sites derived from CGA were largely converted to TGG in sea squirt genes. By contrast, a significant increment of GC content at CpG flanking positions was shown in fish genes. The positively selected A/T to G/C substitutions, in combination with the CpG to TpG/CpA substitutions, are the sources of the extremely low CpG observed/expected ratios in vertebrates. The nonsynonymous substitutions caused by the GC content increase have resulted in frequent amino acid replacements in directions that were not noticed previously.
The increased GC content at CpG flanking positions can reduce CpG loss in fish genes and attenuate the impact of DNA methylation on CpG-containing codons, probably accounting for evolution towards vertebrates.
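The CpG observed/expected ratio referred to above is conventionally computed as the number of CpG dinucleotides multiplied by the sequence length and divided by the product of the C and G counts. The sketch below assumes that standard definition; the function name and the toy sequences are illustrative and not taken from the study.

```python
def cpg_obs_exp(seq):
    """CpG observed/expected ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c_count == 0 or g_count == 0:
        return 0.0  # no CpG is possible without both C and G
    return cpg_count * length / (c_count * g_count)


# Toy sequences with the same base composition but different CpG content.
cpg_rich = "ACGTACGT"      # two CpG sites
cpg_depleted = "TGTGCACA"  # CpG replaced by TpG/CpA, as after deamination

print(cpg_obs_exp(cpg_rich))
print(cpg_obs_exp(cpg_depleted))
```

Because the two toy sequences share the same C and G counts, the difference in the ratio reflects only the loss of CpG dinucleotides, which is the signature that methylation-driven CpG to TpG/CpA substitution leaves in a genome.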
Phylogenomics has revealed the existence of fast-evolving animal phyla in which the amino acid substitution rate, averaged across many proteins, is consistently higher than in other lineages. The reasons for such differences in proteome-wide evolutionary rates are still unknown, largely because only a handful of species offer within-species genomic data from which molecular evolutionary processes can be deduced. In this study, we use next-generation sequencing technologies and individual whole-transcriptome sequencing to gather extensive polymorphism sequence data sets from Ciona intestinalis. Ciona is probably the best-characterized member of the fast-evolving Urochordata group (tunicates), which was recently identified as the sister group of the slow-evolving vertebrates. We introduce and validate a maximum-likelihood framework for single-nucleotide polymorphism and genotype calling, based on high-throughput short-read typing. We report that the C. intestinalis proteome is characterized by a high level of within-species diversity, efficient purifying selection, and a substantial percentage of adaptive amino acid substitutions. We conclude that the increased rate of amino acid sequence evolution in tunicates, when compared with vertebrates, is the consequence of both a 2–6 times higher per-year mutation rate and prevalent adaptive evolution.
substitution rate; population size; mutation rate; next-generation sequencing; transcriptome
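As a rough illustration of what maximum-likelihood genotype calling from short reads involves, the following sketch uses a generic textbook binomial model for a diploid, biallelic site. It is not the framework introduced in the study; the function names, the genotype labels, and the fixed sequencing error rate of 0.01 are all assumptions made for illustration.

```python
import math


def genotype_log_likelihoods(n_ref: int, n_alt: int, err: float = 0.01) -> dict:
    """Log-likelihood of the read counts under each diploid genotype.

    Assumed model (illustrative only): each read independently shows the
    reference allele with probability 1 - err (genotype RR), 0.5 (RA) or
    err (AA). The binomial coefficient is the same for all genotypes and
    is therefore omitted.
    """
    p_ref = {"RR": 1.0 - err, "RA": 0.5, "AA": err}
    return {
        g: n_ref * math.log(p) + n_alt * math.log(1.0 - p)
        for g, p in p_ref.items()
    }


def call_genotype(n_ref: int, n_alt: int, err: float = 0.01) -> str:
    """Return the genotype with the highest likelihood."""
    ll = genotype_log_likelihoods(n_ref, n_alt, err)
    return max(ll, key=ll.get)


if __name__ == "__main__":
    # Hypothetical read counts (reference reads, alternative reads).
    for counts in [(20, 0), (10, 10), (0, 15)]:
        print(counts, call_genotype(*counts))
```

A fuller treatment would also have to handle features specific to transcriptome data, such as unequal expression of the two alleles, which this sketch ignores.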
The Ensembl project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased from 4 to 19, with the addition of the mammalian genomes of Rhesus macaque and Opossum, the chordate genome of Ciona intestinalis and the import and integration of the yeast genome. The year has also seen extensive improvements to both data analysis and presentation, with the introduction of a redesigned website, the addition of RNA gene and regulatory annotation and substantial improvements to the integration of human genome variation data.
An improved assembly of the Ciona intestinalis genome reveals that it contains non-canonical introns and that about 20% of Ciona genes reside in operons.
The draft genome sequence of the ascidian Ciona intestinalis, along with associated gene models, has been a valuable research resource. However, recently accumulated expressed sequence tag (EST)/cDNA data have revealed numerous inconsistencies with the gene models due in part to intrinsic limitations in gene prediction programs and in part to the fragmented nature of the assembly.
We have prepared a less-fragmented assembly on the basis of scaffold-joining guided by paired-end EST and bacterial artificial chromosome (BAC) sequences, and BAC chromosomal in situ hybridization data. The new assembly (115.2 Mb) is similar in length to the initial assembly (116.7 Mb) but contains 1,272 (approximately 50%) fewer scaffolds. The largest scaffold in the new assembly incorporates 95 initial-assembly scaffolds. In conjunction with the new assembly, we have prepared a greatly improved global gene model set strictly correlated with the extensive currently available EST data. The total gene number (15,254) is similar to that of the initial set (15,582), but the new set includes 3,330 models at genomic sites where none were present in the initial set, and 1,779 models that represent fusions of multiple previously incomplete models. In approximately half of the models, the 5'-ends were precisely mapped using 5'-full-length ESTs, an important refinement even in otherwise unchanged models.
Using these new resources, we identify a population of non-canonical (non-GT-AG) introns and also find that approximately 20% of Ciona genes reside in operons and that operons contain a high proportion of single-exon genes. Thus, the present dataset provides an opportunity to analyze the Ciona genome much more precisely than ever.
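The distinction between canonical and non-canonical introns can be sketched as a check on the terminal dinucleotides of each intron. The sketch below follows the definition given above (non-canonical means non-GT-AG) and assumes the intron sequence is supplied on the sense strand from its first to its last base; the function name intron_class is hypothetical and this is not the authors' pipeline.

```python
def intron_class(intron_seq: str) -> str:
    """Classify an intron by its first two and last two bases.

    Assumes the sequence is given 5' to 3' on the sense strand and
    spans the whole intron (donor dinucleotide first, acceptor last).
    """
    s = intron_seq.upper()
    if len(s) < 4:
        raise ValueError("intron must be at least 4 nt long")
    donor, acceptor = s[:2], s[-2:]
    if (donor, acceptor) == ("GT", "AG"):
        return "canonical"
    return "non-canonical"


if __name__ == "__main__":
    # Toy sequences only, not taken from the Ciona assembly.
    print(intron_class("GTAAGTTTTCAG"))  # GT ... AG
    print(intron_class("GCAAGTTTTCAG"))  # GC ... AG
```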
In conventionally expressed eukaryotic genes, transcription start sites (TSSs) can be identified by mapping the mature mRNA 5′-terminal sequence onto the genome. However, this approach is not applicable to genes that undergo pre-mRNA 5′-leader trans-splicing (SL trans-splicing) because the original 5′-segment of the primary transcript is replaced by the spliced leader sequence during the trans-splicing reaction and is discarded. Thus, TSS mapping for trans-spliced genes requires different approaches. We describe two such approaches and show that they generate precisely agreeing results for an SL trans-spliced gene encoding the muscle protein troponin I in the ascidian tunicate chordate Ciona intestinalis. One method is based on experimental deletion of trans-splice acceptor sites and the other is based on high-throughput mRNA 5′-RACE sequence analysis of natural RNA populations in order to detect minor transcripts containing the pre-mRNA’s original 5′-end. Both methods identified a single major troponin I TSS located ∼460 nt upstream of the trans-splice acceptor site. Further experimental analysis identified a functionally important TATA element 31 nt upstream of the start site. The two methods employed have complementary strengths and are broadly applicable to mapping promoters/TSSs for trans-spliced genes in tunicates and in trans-splicing organisms from other phyla.