Xenarthra (armadillos, sloths, and anteaters) constitutes one of the four major clades of placental mammals. Despite their phylogenetic distinctiveness in mammals, a reference phylogeny is still lacking for the 31 described species. Here we used Illumina shotgun sequencing to assemble 33 new complete mitochondrial genomes, establishing Xenarthra as the first major placental clade to be fully sequenced at the species level for mitogenomes. The resulting data set allowed the reconstruction of a robust phylogenetic framework and timescale that are consistent with previous studies conducted at the genus level using nuclear genes. Incorporating the full species diversity of extant xenarthrans points to a number of inconsistencies in xenarthran systematics and species definition. We propose to split armadillos into two distinct families Dasypodidae (dasypodines) and Chlamyphoridae (euphractines, chlamyphorines, and tolypeutines) to better reflect their ancient divergence, estimated around 42 Ma. Species delimitation within long-nosed armadillos (genus Dasypus) appeared more complex than anticipated, with the discovery of a divergent lineage in French Guiana. Diversification analyses showed Xenarthra to be an ancient clade with a constant diversification rate through time with a species turnover driven by high but constant extinction. We also detected a significant negative correlation between speciation rate and past temperature fluctuations with an increase in speciation rate corresponding to the general cooling observed during the last 15 My. Biogeographic reconstructions identified the tropical rainforest biome of Amazonia and the Guiana Shield as the cradle of xenarthran evolutionary history with subsequent dispersions into more open and dry habitats.
mammals; Xenarthra; shotgun Illumina sequencing; molecular phylogenetics; mitochondrial genomes; molecular dating
Ascidians belong to the tunicates, the sister group of vertebrates and are recognized model organisms in the field of embryonic development, regeneration and stem cells. ANISEED is the main information system in the field of ascidian developmental biology. This article reports the development of the system since its initial publication in 2010. Over the past five years, we refactored the system from an initial custom schema to an extended version of the Chado schema and redesigned all user and back end interfaces. This new architecture was used to improve and enrich the description of Ciona intestinalis embryonic development, based on an improved genome assembly and gene model set, refined functional gene annotation, and anatomical ontologies, and a new collection of full ORF cDNAs. The genomes of nine ascidian species have been sequenced since the release of the C. intestinalis genome. In ANISEED 2015, all nine new ascidian species can be explored via dedicated genome browsers, and searched by Blast. In addition, ANISEED provides full functional gene annotation, anatomical ontologies and some gene expression data for the six species with highest quality genomes. ANISEED is publicly available at: http://www.aniseed.cnrs.fr.
Ameloblastin (AMBN) is a phosphorylated, proline/glutamine-rich protein secreted during enamel formation. Previous studies have revealed that this enamel matrix protein was present early in vertebrate evolution and certainly plays important roles during enamel formation although its precise functions remain unclear. We performed evolutionary analyses of AMBN in order to (i) identify residues and motifs important for the protein function, (ii) predict mutations responsible for genetic diseases, and (iii) understand its molecular evolution in mammals.
In silico searches retrieved 56 complete sequences in public databases that were aligned and analyzed computationally. We showed that AMBN is globally evolving under moderate purifying selection in mammals and contains a strong phylogenetic signal. In addition, our analyses revealed codons evolving under significant positive selection. Evidence for positive selection acting on AMBN was observed in catarrhine primates and the aye-aye. We also found that (i) an additional translation initiation site was recruited in the ancestral placental AMBN, (ii) a short exon was duplicated several times in various species including catarrhine primates, and (iii) several polyadenylation sites are present.
AMBN possesses many positions that have been subjected to strong selective pressure for 200 million years. These positions correspond to several cleavage sites and hydroxylated, O-glycosylated, and phosphorylated residues. We predict that substitutions at these conserved positions could be responsible for enamel disorders. Some motifs that were previously identified as potentially important functionally were confirmed, and we found two new, highly conserved motifs, the function of which should be tested in the near future. This study illustrates the power of evolutionary analyses for characterizing the functional constraints acting on proteins whose structure is not yet characterized.
Electronic supplementary material
The online version of this article (doi:10.1186/s12862-015-0431-0) contains supplementary material, which is available to authorized users.
Ameloblastin; Evolution; Enamel; Mammals; Purifying selection; Positive selection; Phylomedicine
Genomes of animals as different as sponges and humans show conservation of global architecture. Here we show that multiple genomic features including transposon diversity, developmental gene repertoire, physical gene order, and intron-exon organization are shattered in the tunicate Oikopleura, belonging to the sister group of vertebrates and retaining chordate morphology. Ancestral architecture of animal genomes can be deeply modified and may therefore be largely nonadaptive. This rapidly evolving animal lineage thus offers unique perspectives on the level of genome plasticity. It also illuminates issues as fundamental as the mechanisms of intron gain.
Ascidians or sea squirts form a diverse group within chordates, which includes a few thousand members of marine sessile filter-feeding animals. Their mitochondrial genomes are characterized by particularly high evolutionary rates and rampant gene rearrangements. This extreme variability complicates standard polymerase chain reaction (PCR)-based techniques for molecular characterization studies, and consequently only a few complete ascidian mitochondrial genome sequences are available. Using the standard PCR and Sanger sequencing approach, we produced the mitochondrial genome of Ascidiella aspersa only after a great effort. In contrast, we produced five additional mitogenomes (Botrylloides aff. leachii, Halocynthia spinosa, Polycarpa mytiligera, Pyura gangelion, and Rhodosoma turcicum) with a novel strategy, consisting of sequencing the pooled total DNA samples of these five species using one Illumina HiSeq 2000 flow cell lane. Each mitogenome was efficiently assembled in a single contig using de novo transcriptome assembly, as de novo genome assembly generally performed poorly for this task. Each of the six new mitogenomes presents a different and novel gene order, showing that no syntenic block has been conserved at the ordinal level (in Stolidobranchia and in Phlebobranchia). Phylogenetic analyses support the paraphyly of both Ascidiacea and Phlebobranchia, with Thaliacea nested inside Phlebobranchia, although the deepest nodes of the Phlebobranchia–Thaliacea clade are not well resolved. The strategy described here thus provides a cost-effective approach to obtain complete mitogenomes characterized by a highly plastic gene order and a fast nucleotide/amino acid substitution rate.
Tunicates; Ascidians; mitochondrial genome; mitogenomics; next-generation sequencing; Illumina; gene order; rearrangements; phylogeny; mixture models; genome assembly
SAMHD1 has recently been identified as an HIV-1 restriction factor operating in myeloid cells. As a countermeasure, the Vpx accessory protein from HIV-2 and certain lineages of SIV has evolved to antagonize SAMHD1 by inducing its ubiquitin-proteasome-dependent degradation. Here, we show that SAMHD1 experienced strong positive selection episodes during primate evolution that occurred in the Catarrhini ancestral branch prior to the separation between hominoids (gibbons and great apes) and Old World monkeys. The identification of SAMHD1 residues under positive selection led to mapping the Vpx-interaction domain of SAMHD1 to its C-terminal region. Importantly, we found that while SAMHD1 restriction activity toward HIV-1 is evolutionarily maintained, antagonism of SAMHD1 by Vpx is species-specific. The distinct evolutionary signature of SAMHD1 sheds light on the development of its antiviral specificity.
The morphological peculiarities of turtles have, for a long time, impeded their accurate placement in the phylogeny of amniotes. Molecular data used to address this major evolutionary question have so far been limited to a handful of markers and/or taxa. These studies have supported conflicting topologies, positioning turtles as either the sister group to all other reptiles, to lepidosaurs (tuatara, lizards and snakes), to archosaurs (birds and crocodiles), or to crocodilians. Genome-scale data have been shown to be useful in resolving other debated phylogenies, but no such adequate dataset is yet available for amniotes.
In this study, we used next-generation sequencing to obtain seven new transcriptomes from the blood, liver, or jaws of four turtles, a caiman, a lizard, and a lungfish. We used a phylogenomic dataset based on 248 nuclear genes (187,026 nucleotide sites) for 16 vertebrate taxa to resolve the origins of turtles. Maximum likelihood and Bayesian concatenation analyses and species tree approaches performed under the most realistic models of the nucleotide and amino acid substitution processes unambiguously support turtles as a sister group to birds and crocodiles. The use of more simplistic models of nucleotide substitution for both concatenation and species tree reconstruction methods leads to the artefactual grouping of turtles and crocodiles, most likely because of substitution saturation at third codon positions. Relaxed molecular clock methods estimate the divergence between turtles and archosaurs around 255 million years ago. The most recent common ancestor of living turtles, corresponding to the split between Pleurodira and Cryptodira, is estimated to have occurred around 157 million years ago, in the Upper Jurassic period. This is a more recent estimate than previously reported, and questions the interpretation of controversial Lower Jurassic fossils as being part of the extant turtles radiation.
These results provide a phylogenetic framework and timescale with which to interpret the evolution of the peculiar morphological, developmental, and molecular features of turtles within the amniotes.
When simple sequence repeats are integrated into functional genes, they can potentially act as evolutionary ‘tuning knobs’, supplying abundant genetic variation with minimal risk of pleiotropic deleterious effects. The genetic basis of variation in facial shape and length represents a possible example of this phenomenon. Runt-related transcription factor 2 (RUNX2), which is involved in osteoblast differentiation, contains a functionally-important tandem repeat of glutamine and alanine amino acids. The ratio of glutamines to alanines (the QA ratio) in this protein seemingly influences the regulation of bone development. Notably, in domestic breeds of dog, and in carnivorans in general, the ratio of glutamines to alanines is strongly correlated with facial length.
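As a rough illustration of the QA ratio defined above, the sketch below counts glutamines and alanines in the longest uninterrupted poly-Q/poly-A tract of a protein sequence. The function name `qa_ratio`, the regular-expression definition of the tract, and the toy sequence are assumptions made for this example; they are not the measurement protocol of the study, and real RUNX2 repeats may contain interruptions that this simple pattern would not handle.

```python
import re


def qa_ratio(protein: str) -> float:
    """Return glutamine count / alanine count for the longest Q-run + A-run tract.

    Assumption: the repeat is an uninterrupted run of Q immediately
    followed by an uninterrupted run of A.
    """
    # Collect every run of Q directly followed by a run of A.
    tracts = list(re.finditer(r"(Q+)(A+)", protein.upper()))
    if not tracts:
        raise ValueError("no Q/A tandem repeat found")
    # Keep the longest tract, measured over both runs together.
    longest = max(tracts, key=lambda m: len(m.group(0)))
    return len(longest.group(1)) / len(longest.group(2))


# Toy sequence (not a real RUNX2 sequence): 18 glutamines then 12 alanines.
toy = "MSDV" + "Q" * 18 + "A" * 12 + "LPGR"
print(qa_ratio(toy))  # 1.5
```

A ratio computed this way for each species could then be paired with a facial length measurement before any phylogenetically controlled test of correlation.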
In this study we examine whether this correlation holds true across placental mammals, particularly those mammals for which facial length is highly variable and related to adaptive behavior and lifestyle (e.g., primates, afrotherians, xenarthrans). We obtained relative facial length measurements and RUNX2 sequences for 41 mammalian species representing 12 orders. Using both a phylogenetic generalized least squares model and a recently-developed Bayesian comparative method, we tested for a correlation between genetic and morphometric data while controlling for phylogeny, evolutionary rates, and divergence times. Non-carnivoran taxa generally had substantially lower glutamine-alanine ratios than carnivorans (primates and xenarthrans with means of 1.34 and 1.25, respectively, compared to a mean of 3.1 for carnivorans), and we found no correlation between RUNX2 sequence and face length across placental mammals.
Results of our diverse comparative phylogenetic analyses indicate that QA ratio does not consistently correlate with face length across the 41 mammalian taxa considered. Thus, although RUNX2 might function as a ‘tuning knob’ modifying face length in carnivorans, this relationship is not conserved across mammals in general.
Mammalian evolution; Prognathism; Molecular evolution; Primates; Afrotheria; Xenarthra; Morphology
Until now the most efficient solution to align nucleotide sequences containing open reading frames was to use indirect procedures that align amino acid translation before reporting the inferred gap positions at the codon level. There are two important pitfalls with this approach. Firstly, any premature stop codon impedes using such a strategy. Secondly, each sequence is translated with the same reading frame from beginning to end, so that the presence of a single additional nucleotide leads to both aberrant translation and alignment.
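The indirect procedure and its two pitfalls can be sketched in a few lines. The example below translates a coding sequence in a single reading frame using the standard genetic code, then threads the codons back onto an already aligned protein. The function names `translate` and `thread_codons` and the toy sequences are hypothetical, and the amino acid alignment step itself is assumed to be done by an external aligner.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate in the first reading frame from beginning to end.

    An incomplete trailing codon is dropped; unknown codons become 'X'.
    """
    cds = cds.upper()
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X")
                   for i in range(0, len(cds) - 2, 3))


def thread_codons(aligned_protein: str, cds: str) -> str:
    """Report the gaps of an aligned protein at the codon level."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    out, k = [], 0
    for residue in aligned_protein:
        if residue == "-":
            out.append("---")      # one amino acid gap = three nucleotide gaps
        else:
            out.append(codons[k])  # next codon of the unaligned sequence
            k += 1
    return "".join(out)


print(translate("ATGGCTAAAGGT"))               # MAKG
print(thread_codons("MA-KG", "ATGGCTAAAGGT"))  # ATGGCT---AAAGGT
# One extra nucleotide (C after ATGGC) shifts the frame: a premature stop
# appears and every downstream residue is mistranslated.
print(translate("ATGGCCTAAAGGT"))              # MA*R
```

The last call shows both pitfalls at once: the single inserted nucleotide produces a stop codon and an aberrant downstream translation, so the protein-level alignment can no longer be trusted.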
We present an algorithm that has the same space and time complexity as the classical Needleman-Wunsch algorithm while accommodating sequencing errors and other biological deviations from the coding frame. The resulting pairwise coding sequence alignment method was extended to a multiple sequence alignment (MSA) algorithm implemented in a program called MACSE (Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons). MACSE is the first automatic solution to align protein-coding gene datasets containing non-functional sequences (pseudogenes) without disrupting the underlying codon structure. It has also proved useful in detecting undocumented frameshifts in public database sequences and in aligning next-generation sequencing reads/contigs against a reference coding sequence.
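For reference, the classical Needleman-Wunsch recurrence used as the complexity baseline can be sketched as follows. This is the textbook global alignment score with a linear gap penalty, not the frameshift-aware MACSE algorithm; the function name `nw_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1,
             gap: int = -1) -> int:
    """Global alignment score by classical Needleman-Wunsch dynamic programming.

    Time is O(len(a) * len(b)). Only two rows are kept because this sketch
    returns the score alone; recovering the alignment itself would require
    storing the full matrix for traceback.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b (gaps only).
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(nw_score("ACGT", "ACGT"))  # 4
print(nw_score("ACGT", "ACT"))   # 2
```

A coding-aware variant would, in principle, add moves and penalties for frameshifts and stop codons to this recurrence; the details of how MACSE does so are not given in the abstract and are not reproduced here.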
MACSE is distributed as an open-source executable Java file with freely available source code and can be used via a web interface at: http://mbb.univ-montp2.fr/macse.
Tunicates represent a key metazoan group as the sister-group of vertebrates within chordates. The six complete mitochondrial genomes available so far for tunicates have revealed distinctive features. Extensive gene rearrangements and particularly high evolutionary rates have been evidenced with regard to other chordates. This peculiar evolutionary dynamics has hampered the reconstruction of tunicate phylogenetic relationships within chordates based on mitogenomic data.
In order to further understand the atypical evolutionary dynamics of the mitochondrial genome of tunicates, we determined the complete sequence of the solitary ascidian Herdmania momus. This genome from a stolidobranch ascidian presents the typical tunicate gene content with 13 protein-coding genes, 2 rRNAs and 24 tRNAs which are all encoded on the same strand. However, it also presents a novel gene arrangement, highlighting the extreme plasticity of gene order observed in tunicate mitochondrial genomes. Probabilistic phylogenetic inferences were conducted on the concatenation of the 13 mitochondrial protein-coding genes from representatives of major metazoan phyla. We show that whereas standard homogeneous amino acid models support an artefactual sister position of tunicates relative to all other bilaterians, the CAT and CAT+BP site- and time-heterogeneous mixture models place tunicates as the sister-group of vertebrates within monophyletic chordates. Moreover, the reference phylogeny indicates that tunicate mitochondrial genomes have experienced a drastic acceleration in their evolutionary rate that equally affects protein-coding and ribosomal-RNA genes.
This is the first mitogenomic study supporting the new chordate phylogeny revealed by recent phylogenomic analyses. It illustrates the beneficial effects of an increased taxon sampling coupled with the use of more realistic amino acid substitution models for the reconstruction of animal phylogeny.
Tunicates have been recently revealed to be the closest living relatives of vertebrates. Yet, with more than 2500 described species, details of their evolutionary history are still obscure. From a molecular point of view, tunicate phylogenetic relationships have been mostly studied based on analyses of 18S rRNA sequences, which indicate several major clades at odds with the traditional class-level arrangements. Nonetheless, substantial uncertainty remains about the phylogenetic relationships and taxonomic status of key groups such as the Aplousobranchia, Appendicularia, and Thaliacea.
Thirty new complete 18S rRNA sequences were acquired from previously unsampled tunicate species, with special focus on groups presenting high evolutionary rate. The updated 18S rRNA dataset has been aligned with respect to the constraint on homology imposed by the rRNA secondary structure. A probabilistic framework of phylogenetic reconstruction was adopted to accommodate the particular evolutionary dynamics of this ribosomal marker. Detailed Bayesian analyses were conducted under the non-parametric CAT mixture model accounting for site-specific heterogeneity of the evolutionary process, and under RNA-specific doublet models accommodating the occurrence of compensatory substitutions in stem regions. Our results support the division of tunicates into three major clades: 1) Phlebobranchia + Thaliacea + Aplousobranchia, 2) Appendicularia, and 3) Stolidobranchia, but the position of Appendicularia could not be firmly resolved. Our study additionally reveals that most Aplousobranchia evolve at extremely high rates involving changes in secondary structure of their 18S rRNA, with the exception of the family Clavelinidae, which appears to be slowly evolving. This extreme rate heterogeneity precluded resolving with certainty the exact phylogenetic placement of Aplousobranchia. Finally, the best fitting secondary-structure and CAT-mixture models suggest a sister-group relationship between Salpida and Pyrosomatida within Thaliacea.
An updated phylogenetic framework for tunicates is provided based on phylogenetic analyses using the most realistic evolutionary models currently available for ribosomal molecules and an unprecedented taxonomic sampling. Detailed analyses of the 18S rRNA gene allowed a clear definition of the major tunicate groups and revealed contrasting evolutionary dynamics among major lineages. The resolving power of this gene nevertheless appears limited within the clades composed of Phlebobranchia + Thaliacea + Aplousobranchia and Pyuridae + Styelidae, which were delineated as spots of low resolution. These limitations underline the need to develop new nuclear markers in order to further resolve the phylogeny of this keystone group in chordate evolution.
Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees.
We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database.
PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: and the source code can be downloaded from: .
Molecular sequence data have become the standard in modern day phylogenetics. In particular, several long-standing questions of mammalian evolutionary history have been recently resolved thanks to the use of molecular characters. Yet, most studies have focused on only a handful of standard markers. The availability of an ever increasing number of whole genome sequences is a gold mine for modern systematics. Genomic data now provide the opportunity to select new markers that are potentially relevant for further resolving branches of the mammalian phylogenetic tree at various taxonomic levels.
The EnsEMBL database was used to determine a set of orthologous genes from 12 available complete mammalian genomes. As targets for possible amplification and sequencing in additional taxa, more than 3,000 exons of length > 400 bp have been selected, among which 118, 368, 608, and 674 are respectively retrieved for 12, 11, 10, and 9 species. A bioinformatic pipeline has been developed to provide evolutionary descriptors for these candidate markers in order to assess their potential phylogenetic utility. The resulting OrthoMaM (Orthologous Mammalian Markers) database can be queried and alignments can be downloaded through a dedicated web interface.
The importance of marker choice in phylogenetic studies has long been stressed. Our database centered on complete genome information now makes it possible to select promising markers for a given phylogenetic question or systematic framework by querying a number of evolutionary descriptors. The usefulness of the database is illustrated with two biological examples. First, two potentially useful markers were identified for rodent systematics based on relevant evolutionary parameters and sequenced in additional species. Second, a complete, gapless 94 kb supermatrix of 118 orthologous exons was assembled for 12 mammals. Phylogenetic analyses using probabilistic methods unambiguously supported the new placental phylogeny by retrieving the monophyly of Glires, Euarchontoglires, Laurasiatheria, and Boreoeutheria. Muroid rodents thus do not represent a basal placental lineage, as was mistakenly reasserted in some recent phylogenomic analyses based on fewer taxa. We expect the OrthoMaM database to be useful for further resolving the phylogenetic tree of placental mammals and for better understanding the evolutionary dynamics of their genomes, i.e., the forces that shaped coding sequences in terms of selective constraints.
Probabilistic methods have progressively supplanted the Maximum Parsimony (MP) method for inferring phylogenetic trees. One of the major reasons for this shift was that MP is much more sensitive to the Long Branch Attraction (LBA) artefact than is Maximum Likelihood (ML). However, recent work by Kolaczkowski and Thornton suggested, on the basis of simulations, that MP is less sensitive than ML to tree reconstruction artefacts generated by heterotachy, a phenomenon that corresponds to shifts in site-specific evolutionary rates over time. These results led these authors to recommend that the results of ML and MP analyses should be both reported and interpreted with the same caution. This specific conclusion revived the debate on the choice of the most accurate phylogenetic method for analysing real data in which various types of heterogeneities occur. However, variation of evolutionary rates across species was not explicitly incorporated in the original study of Kolaczkowski and Thornton, and in most of the subsequent heterotachous simulations published to date, where all terminal branch lengths were kept equal, an assumption that is biologically unrealistic.
In this report, we performed more realistic simulations to evaluate the relative performance of MP and ML methods when two kinds of heterogeneities are considered: (i) within-site rate variation (heterotachy), and (ii) rate variation across lineages. Using a protocol similar to that of Kolaczkowski and Thornton to generate heterotachous datasets, we found that heterotachy, which constitutes a serious violation of existing models, decreases the accuracy of ML whatever the level of rate variation across lineages. In contrast, the accuracy of MP can either increase or decrease when the level of heterotachy increases, depending on the relative branch lengths. This result demonstrates that MP is not insensitive to heterotachy, contrary to the report of Kolaczkowski and Thornton. Finally, in the case of LBA (i.e. when two non-sister lineages evolved faster than the others), ML outperforms MP over a wide range of conditions, except for unrealistic levels of heterotachy.
For realistic combinations of both heterotachy and variation of evolutionary rates across lineages, ML is always more accurate than MP. Therefore, ML should be preferred over MP for analysing real data, all the more so since parametric methods also allow one to handle other types of biological heterogeneities much better, such as among-site rate variation. The confounding effects of heterotachy on tree reconstruction methods do exist, but can be circumvented by the development of mixture models in a probabilistic framework, as proposed by Kolaczkowski and Thornton themselves.
Comparative genomic data among organisms allow the reconstruction of their phylogenies and evolutionary time scales. Molecular timings have been recently used to suggest that global environmental change has shaped the evolutionary history of diverse terrestrial organisms. Living xenarthrans (armadillos, anteaters and sloths) constitute an ideal model for studying the influence of past environmental changes on species diversification. Indeed, extant xenarthran species are relicts from an evolutionary radiation enhanced by their isolation in South America during the Tertiary era, a period for which major climate variations and tectonic events are relatively well documented.
We applied a Bayesian approach to three nuclear genes in order to relax the molecular clock assumption while accounting for differences in evolutionary dynamics among genes and incorporating paleontological uncertainties. We obtained a molecular time scale for the evolution of extant xenarthrans and other placental mammals. Divergence time estimates provide substantial evidence for contemporaneous diversification events among independent xenarthran lineages. This correlated pattern of diversification might relate to major environmental changes that occurred in South America during the Cenozoic.
The observed synchronicity between planetary and biological events suggests that global change played a crucial role in shaping the evolutionary history of extant xenarthrans. Our findings open ways to test this hypothesis further in other South American mammalian endemics like hystricognath rodents, platyrrhine primates, and didelphid marsupials.
Mammals; Xenarthrans; Evolution; Palaeontology; Phylogeny; Relaxed molecular clock; Bayesian dating; Global change; Tertiary; South America
Nothofagus (southern beech), with an 80-million-year-old fossil record, has become iconic as a plant genus whose ancient Gondwanan relationships reach back into the Cretaceous era. Closely associated with Wegener's theory of “Kontinentaldrift”, Nothofagus has been regarded as the “key genus in plant biogeography”. This paradigm has the New Zealand species as passengers on a Moa's Ark that rafted away from other landmasses following the breakup of Gondwana. An alternative explanation for the current transoceanic distribution of species seems almost inconceivable given that Nothofagus seeds are generally thought to be poorly suited for dispersal across large distances or oceans. Here we test the Moa's Ark hypothesis using relaxed molecular clock methods in the analysis of a 7.2-kb fragment of the chloroplast genome. Our analyses provide the first unequivocal molecular clock evidence that, whilst some Nothofagus transoceanic distributions are consistent with vicariance, trans-Tasman Sea distributions can only be explained by long-distance dispersal. Thus, our analyses support the interpretation of an absence of Lophozonia and Fuscospora pollen types in the New Zealand Cretaceous fossil record as evidence for Tertiary dispersals of Nothofagus to New Zealand. Our findings contradict those from recent cladistic analyses of biogeographic data that have concluded transoceanic Nothofagus distributions can only be explained by vicariance events and subsequent extinction. They indicate that the biogeographic history of Nothofagus is more complex than envisaged under opposing polarised views expressed in the ongoing controversy over the relevance of dispersal and vicariance for explaining plant biodiversity. They provide motivation and justification for developing more complex hypotheses that seek to explain the origins of Southern Hemisphere biota.
A phylogenetic analysis of Nothofagus species provides evidence for their transoceanic dispersal during the Tertiary, and helps resolve the debate about the origins of plant biodiversity in the Southern Hemisphere.
A phylogenetic study of army ants has completely changed our view of their evolutionary history, including the origin of the curious and fatal phenomenon known as circular mill formation.