While a unique origin of the euarthropods is well established, relationships between the four euarthropod classes—chelicerates, myriapods, crustaceans and hexapods—are less clear. Unsolved questions include the position of myriapods, the monophyletic origin of chelicerates, and the validity of the close relationship of euarthropods to tardigrades and onychophorans. Morphology predicts that myriapods, insects and crustaceans form a monophyletic group, the Mandibulata, which has been contradicted by many molecular studies that support an alternative Myriochelata hypothesis (Myriapoda plus Chelicerata). Because of the conflicting insights from published molecular datasets, evidence from nuclear-coding genes needs corroboration from independent data to define the relationships among major nodes in the euarthropod tree. Here, we address this issue by analysing two independent molecular datasets: a phylogenomic dataset of 198 protein-coding genes including new sequences for myriapods, and novel microRNA complements sampled from all major arthropod lineages. Our phylogenomic analyses strongly support Mandibulata, and show that Myriochelata is a tree-reconstruction artefact caused by saturation and long-branch attraction. The analysis of the microRNA dataset corroborates the Mandibulata, showing that the microRNAs miR-965 and miR-282 are present and expressed in all mandibulate species sampled, but not in the chelicerates. Mandibulata is further supported by the phylogenetic analysis of a comprehensive morphological dataset covering living and fossil arthropods, and including recently proposed, putative apomorphies of Myriochelata. 
Our phylogenomic analyses also provide strong support for the inclusion of pycnogonids in a monophyletic Chelicerata, a paraphyletic Cycloneuralia, and a common origin of Arthropoda (tardigrades, onychophorans and arthropods), suggesting that previous phylogenies grouping tardigrades and nematodes may also have been subject to tree-reconstruction artefacts.
With over 1 million living species described and a rich 520 Myr fossil record, arthropods are the most species-rich clade of animals on Earth, accounting for nearly 80 per cent of animal biodiversity. Four main euarthropod sub-phyla are recognized: Hexapoda (including insects); Crustacea (lobsters, water fleas and others); Myriapoda (e.g. millipedes and centipedes); and Chelicerata (including arachnids, horseshoe crabs and possibly sea spiders). After many years of debate, a consensus has emerged that these four classes (or sub-phyla) form a monophyletic group called the Euarthropoda [2,3]. The relationships between the four euarthropod groups remain disputed, however, as is the validity of their close relationship to tardigrades (water bears) and onychophorans (velvet worms) in a more inclusive clade called Arthropoda (named Panarthropoda by Nielsen).
Within the Euarthropoda, the main point of disagreement concerns the position of the myriapods, which were long thought to be most closely related to the hexapods. Myriapods and hexapods notably share a distinctive head composed of five segments distinguished by their unique appendages: the antennal, intercalary (appendage-less), mandibular, and usually two pairs of maxillae (the second being the insect labium). Molecular data, however, have shown crustaceans, which differ in having a second antennal rather than an intercalary segment, to be the closest sister group of hexapods in a clade named Pancrustacea or Tetraconata [6,7]. When compared with chelicerates, the detailed similarities of the arrangement of head segments and associated appendages in Pancrustacea and myriapods strongly support their sister group relationship within a wider clade that has been named the Mandibulata in recognition of the similarity of their biting mouthparts (see the electronic supplementary material). Considering the complex shared features of myriapod and pancrustacean head morphology, it is surprising that the majority of published molecular phylogenetic analyses do not support the Mandibulata, instead placing the myriapods as the sister group of the chelicerates in an assemblage that has been named the Myriochelata or Paradoxopoda [8,9]. Molecular support for Myriochelata was initially obtained using large and small subunit rRNAs and later Hox genes, mitochondrial protein-coding sequences and combined datasets of both nuclear and mitochondrial genes. Myriochelata was also supported by several phylogenomic analyses [12–15]. However, recently, a dataset of 62 nuclear protein-coding genes found support for Mandibulata. Regier et al. did not identify the factors underpinning the difference between their new results and those of previously published phylogenies that supported Myriochelata.
Consequently, and in light of the varying results from these molecular samples, the Mandibulata versus Myriochelata controversy remains an open question.
Uncertainty in deep arthropod phylogeny has recently been reinforced as Mayer & Whitington proposed various putative synapomorphies of the Myriochelata, including a revised character polarity for the well-studied neuro-developmental pattern, and the mechanism of dorsoventral patterning. Here, debate surrounds the ancestral conditions, specifically whether nervous tissue forms from immigration of single cells or of clusters of cells, and whether or not the neuroectoderm invaginates in each developing segment.
In a similar conflict between molecules and morphology, arthropods share features including segmentation and appendages with tardigrades and onychophorans, yet a close relationship between these three phyla has not been clearly supported by molecular analyses. A close relationship between onychophorans and euarthropods is widely accepted, but affinities of tardigrades are less clear, to the extent that they have been linked with nematodes in several phylogenomic studies [13–15]. Recently, a mitogenomic study of the Ecdysozoa supported a monophyletic origin of these three groups, although support is model-dependent.
There are two explanations for the discrepancies between different molecular datasets and between molecules and morphology. First, morphology may mislead—mandibles might have evolved independently in pancrustaceans and myriapods or been lost in chelicerates; similarly, segmentation and legs may have appeared separately in arthropods, onychophorans and tardigrades. The second explanation is that some molecular data may be affected by errors—either stochastic (unlikely with phylogenomic scale datasets) or systematic such as compositional bias or long-branch attraction (LBA) [20–22]. The possibility of systematic error is suggested by some datasets being equivocal regarding myriapod [7,9,19,23,24] or tardigrade affinities [12,19].
To resolve the phylogenetic relationships of the arthropods and their ecdysozoan outgroups, we present analyses of three independent datasets. The first is a phylogenomic dataset of 198 protein-coding genes, which includes new data from the pivotal myriapods. The second is a novel set of arthropod microRNAs (miRNAs), small non-coding regulatory genes implicated in the control of cellular differentiation and homeostasis. The third is a comprehensive dataset of 393 morphological characters, including the recently proposed morphological homologies of Myriochelata and recent gene expression data alongside new and traditional characters supporting the Mandibulata.
In addition, we have explored the nature of the conflict between molecular datasets supporting alternative arthropod phylogenies by assaying the potential effects of systematic error on our phylogenomic dataset using an experimental approach coupling targeted taxon-sampling, the use of alternative models of molecular evolution, and the analyses of subsets of slowly evolving sites extracted from our full dataset.
Detailed descriptions of the methods used to generate novel expressed sequence tag and miRNA datasets, to assemble and align sets of orthologous genes, and to carry out phylogenetic analyses of the phylogenomic and morphological datasets are available in the electronic supplementary material.
To elucidate the phylogenetic position of myriapods and the discrepancy between recent analyses [12,16], we first analysed a phylogenomic dataset of 198 genes (corresponding to 40 100 reliably aligned amino acid positions) from 30 taxa (see figure 1). The dataset contains new sequences from the centipede Strigamia maritima. Bayesian analysis using the CAT + Γ model in the software package Phylobayes supports monophyly of Mandibulata with a posterior probability (PP) of 0.92 and a non-parametric bootstrap support (BS) value of 79 per cent. A Bayesian analysis using an even larger sampling of 59 taxa and the mixed CAT-general time reversible (GTR) + Γ model corroborates these findings (see the electronic supplementary material, figures S1 and S2). Furthermore, our analysis supports the monophyly of Chelicerata (Pycnogonida plus Arachnida), a close relationship between Branchiopoda and Hexapoda, monophyly of Arthropoda (Euarthropoda, Tardigrada and Onychophora), and a paraphyletic origin of the Cycloneuralia (Nematoda more closely related to Arthropoda than to Scalidophora). These relationships are further addressed in §3e.
Our results are in accordance with those of Regier et al., but in contradiction of other phylogenomic studies [12,13,15]. We therefore explored whether systematic errors, in particular LBA, could have caused the discrepancy between our results and those of studies supporting Myriochelata. In this context, one notable aspect of the tree in figure 1 is the different branch lengths seen in various taxonomic groups. Pancrustacea have long branches in comparison to Myriapoda and Chelicerata, suggesting that in previous studies the fast evolving Pancrustacea could have been attracted towards the distant outgroup, resulting in the clustering of slowly evolving Myriapoda and Chelicerata owing to LBA. Because systematic errors, particularly LBA, become more apparent when the substitution model is unable to handle multiple substitutions correctly, we first asked how models such as Whelan and Goldman (WAG) + F + Γ and GTR + Γ, which assume homogeneity of the substitution process, fit our data. We find that WAG + F + Γ and GTR + Γ fit the data significantly less well than the heterogeneous CAT + Γ model (see the electronic supplementary material), and that this reduced fit is matched by reduction in support for Mandibulata over Myriochelata (figure 2a and electronic supplementary material, figure S3a).
We next explored the possible effects of LBA using a strategy of different taxon sampling. Logically, if Myriochelata is the result of an LBA artefact, exaggerating this source of error by using long-branched or evolutionarily distant outgroups will result in more support for this artefactual clade. Conversely, the use of the shortest branched outgroups should reduce the effects of LBA and result in lower support for Myriochelata. Both of these predictions are supported: when we use either the most phylogenetically distant outgroup (Lophotrochozoa, figure 2b and electronic supplementary material, figure S3b) or the fastest evolving ecdysozoan outgroup (Nematoda, figure 2c and electronic supplementary material, figure S3c), support for Mandibulata decreases and support for the artefactual group of slow evolving Myriapoda and Chelicerata (Myriochelata, in grey) increases. Equally, removal of these distant outgroups and their replacement with shorter branched taxa (e.g. Onychophora and Priapulida) results in increased support for Mandibulata over Myriochelata (figure 2d and electronic supplementary material, figure S3d). We also performed a bootstrap analysis (under CAT + Γ) excluding the fast evolving nematodes and tardigrades, which found 90 per cent support for Mandibulata. Notably, both Lophotrochozoa and Nematoda contain species with divergent amino acid composition (see the electronic supplementary material, table S1), supporting our inference that they represent less suitable outgroups.
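The outgroup-sampling design described above can be sketched in a few lines. This is only an illustration of how the alternative taxon sets might be assembled before tree reconstruction; the taxon labels, the placeholder sequences and the helper `subsample` are hypothetical and are not taken from the study.

```python
from typing import Dict, Set

# Hypothetical taxon labels and placeholder sequences (not real data).
GROUP_OF: Dict[str, str] = {
    "hexapod_1": "Euarthropoda",
    "crustacean_1": "Euarthropoda",
    "myriapod_1": "Euarthropoda",
    "chelicerate_1": "Euarthropoda",
    "onychophoran_1": "Onychophora",
    "priapulid_1": "Priapulida",
    "nematode_1": "Nematoda",
    "lophotrochozoan_1": "Lophotrochozoa",
}
ALIGNMENT: Dict[str, str] = {taxon: "MKLV" for taxon in GROUP_OF}

# Three outgroup schemes mirroring the text: distant, fast evolving, short branched.
OUTGROUP_SCHEMES: Dict[str, Set[str]] = {
    "distant": {"Lophotrochozoa"},
    "fast_evolving": {"Nematoda"},
    "short_branched": {"Onychophora", "Priapulida"},
}


def subsample(alignment: Dict[str, str], group_of: Dict[str, str],
              outgroups: Set[str], ingroup: str = "Euarthropoda") -> Dict[str, str]:
    """Keep the ingroup taxa plus only the taxa belonging to the chosen outgroups."""
    keep = outgroups | {ingroup}
    return {taxon: seq for taxon, seq in alignment.items() if group_of[taxon] in keep}


# One reduced alignment per scheme; each would then be analysed separately.
datasets = {name: subsample(ALIGNMENT, GROUP_OF, groups)
            for name, groups in OUTGROUP_SCHEMES.items()}
```

Each reduced alignment would be passed to the same tree-reconstruction procedure, so that any change in support can be attributed to the choice of outgroup alone.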
Using our phylogenomic dataset, we have shown that conditions which reduce LBA result in the highest support for Mandibulata, whereas conditions that increase LBA result in increased support for Myriochelata, implying the artefactual nature of the latter. We replicated these findings using the set of 150 genes of Dunn et al., hereafter ‘Dunn’. Reanalysis of a dataset using their original taxon sampling (of 16 ecdysozoans) resulted in strong support for Myriochelata (figure 3a and electronic supplementary material, figure S4a) in accordance with their original analysis. To test whether the difference between our phylogeny (which supports Mandibulata) and that of Dunn (which favoured Myriochelata) is owing to taxonomic sampling, we expanded their taxonomic representation to include all of our 30 taxa. Under these conditions, modest support for Mandibulata is obtained using the CAT + Γ model, while support for Myriochelata decreases under WAG + F + Γ and GTR + Γ (figure 3b and electronic supplementary material, figure S4b). However, when we remove fast evolving outgroups the support for Mandibulata increases significantly (figure 3c and electronic supplementary material, figure S4c). Removal of fast evolving characters (see the electronic supplementary material, figure S5a) also results in support for Mandibulata instead of Myriochelata. Notably, even with identical taxonomic sampling our 198 gene set provides more support for Mandibulata than do the 150 genes of Dunn et al. (compare figures 2c and 3c). The difference may be partly explained by our dataset being larger and more complete (40 100 positions, 69% complete versus 18 829 positions, 61% complete), but also by the lower substitutional saturation of our genes (see the electronic supplementary material, figure S5b).
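The idea of removing fast evolving characters can be illustrated with a minimal sketch. The procedure actually used is described in the electronic supplementary material; the code below instead assumes a crude proxy for site rate (the number of distinct amino acids observed in a column) and a hypothetical toy alignment, purely to show how columns might be ranked and the slowest fraction retained.

```python
from typing import Dict, List


def column_variability(alignment: Dict[str, str]) -> List[int]:
    """Count distinct residues per column, ignoring gaps and missing data."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    ignored = {"-", "?", "X"}
    return [len({s[i] for s in seqs} - ignored) for i in range(length)]


def keep_slowest_sites(alignment: Dict[str, str], fraction: float) -> Dict[str, str]:
    """Return the `fraction` least variable columns, preserving column order."""
    scores = column_variability(alignment)
    n_keep = int(round(fraction * len(scores)))
    # sorted() is stable, so equally variable columns keep their original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    kept = sorted(ranked[:n_keep])
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


# Hypothetical toy alignment (not real data).
toy = {
    "taxon_A": "MKLVA",
    "taxon_B": "MKIVS",
    "taxon_C": "MRLVT",
    "taxon_D": "MKLV-",
}
slow = keep_slowest_sites(toy, 0.6)  # keeps the three least variable columns
```

A real analysis would estimate site rates under an explicit model on a reference tree rather than counting states, but the filtering step itself has the same shape.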
A useful way to test between the competing Mandibulata and Myriochelata phylogenetic hypotheses is to use an independent data source. We therefore explored the miRNA complements of key arthropod taxa using a combination of genomic sequence searches coupled with the generation and analysis of multiple small-RNA libraries. Novel miRNAs appear to have accumulated in animal genomes through time, and, although short, they show a level of sequence conservation exceeding that of ribosomal DNA, making it relatively easy to identify these novel miRNAs in descendant taxa. The apparent rarity of loss of miRNAs within evolutionary lineages coupled with the low likelihood of convergent evolution makes miRNAs a valuable class of rare genomic characters in phylogenetics.
One miRNA, miR-965, had previously been found only in Pancrustacea and had been shown to be absent from the genome of the chelicerate Ixodes scapularis. Importantly, we found reads of the mature miR-965 in the small RNA libraries of both myriapods (Glomeris marginata and Scutigera coleoptrata), and also in the genome of the centipede S. maritima (figure 4). Screening our miRNA libraries also showed that in addition to being absent from the genomic sequence of the tick (I. scapularis), miR-965 could not be detected in the xiphosuran Limulus polyphemus or in the arachnid Acanthoscurria chacoana. Consequently, this distribution supports miR-965 as a genomic apomorphy (a rare genomic change) of the Mandibulata (figure 4). The same distribution holds for a second miRNA, miR-282, which we have found only in insects, crustaceans and the centipedes Strigamia and Scutigera. miR-282 was not found in the Glomeris small RNA library; this may be because miR-282 is expressed at low levels in all Mandibulata sampled and because the total number of reads and the sequencing depth were relatively low in the Glomeris miRNA library.
In addition, upon screening the L. polyphemus and A. chacoana small-RNA libraries, we identified a novel chelicerate miRNA (Arthropod-Novel-1) that is not present in the Mandibulata, but is present in the genome of the tick I. scapularis (figure 4), and we thus suggest this miRNA to be a new genomic apomorphy for the Euchelicerata (Xiphosura and Arachnida). We have also identified a novel myriapod-specific miRNA (Arthropod-Novel-2) in the small-RNA libraries of G. marginata and S. coleoptrata, and in the genome of S. maritima, but not in the libraries or genomes of any other non-myriapod taxon analysed (figure 4). Further myriapod-specific molecular synapomorphies have recently been described.
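The logic of reading miRNA distributions as clade-specific characters can be sketched as follows. The detections are transcribed from the text above; the two pancrustacean entries are placeholders, because the insect and crustacean species are not named here, and the function `smallest_enclosing_clade` is a hypothetical helper rather than part of any published pipeline. The clade definitions cover only the taxa mentioned in this section.

```python
from typing import Dict, Optional, Set

# Placeholder labels: the insect and crustacean species are not named in the text.
PANCRUSTACEA = {"insect (placeholder)", "crustacean (placeholder)"}
MYRIAPODA = {"Glomeris marginata", "Scutigera coleoptrata", "Strigamia maritima"}
EUCHELICERATA = {"Ixodes scapularis", "Limulus polyphemus", "Acanthoscurria chacoana"}

CLADES: Dict[str, Set[str]] = {
    "Myriapoda": MYRIAPODA,
    "Euchelicerata": EUCHELICERATA,
    "Mandibulata": PANCRUSTACEA | MYRIAPODA,
    "Myriochelata": MYRIAPODA | EUCHELICERATA,
}

# Taxa in which each miRNA was detected (library reads or genome), as reported above.
DETECTED: Dict[str, Set[str]] = {
    "miR-965": PANCRUSTACEA | MYRIAPODA,
    "miR-282": PANCRUSTACEA | {"Strigamia maritima", "Scutigera coleoptrata"},
    "Arthropod-Novel-1": set(EUCHELICERATA),
    "Arthropod-Novel-2": set(MYRIAPODA),
}


def smallest_enclosing_clade(detected: Set[str],
                             clades: Dict[str, Set[str]]) -> Optional[str]:
    """Name of the smallest candidate clade containing every detection, if any."""
    candidates = [name for name, members in clades.items()
                  if detected and detected <= members]
    if not candidates:
        return None
    return min(candidates, key=lambda name: len(clades[name]))


support = {mirna: smallest_enclosing_clade(taxa, CLADES)
           for mirna, taxa in DETECTED.items()}
```

Under these assumptions no miRNA has Myriochelata as its smallest enclosing clade, which is the pattern the text describes. Absence from a small-RNA library (as for miR-282 in Glomeris) is treated here as non-detection rather than as evidence of loss.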
We assembled a large matrix of morphological data, which provides a third independent line of evidence in support of Mandibulata. While a number of possible morphological apomorphies of Myriochelata have recently been identified, inclusion of these characters in a cladistic analysis of 393 morphological characters still results in overall support for Mandibulata (Bremer support = 5) rather than Myriochelata, with or without the inclusion of fossil taxa (see figure 5 and electronic supplementary material). The Palaeozoic fossil taxa Tanazios, Martinssonia, and Trilobita (Olenoides) are resolved progressively more stemward relative to the mandibulate crown group. Although support values for the deep nodes in the mandibulate stem- and crown groups are weak when the fossils are included (Bremer values mostly 1 and jackknife frequencies mostly less than 50%), support for the mandibulate crown-group is increased when the analysis is confined to extant taxa because support is concentrated at a single node rather than broken up at a series of nodes along the stem lineage.
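Bremer support for a clade is the number of extra steps required in the shortest tree that lacks the clade, relative to the most parsimonious tree. The sketch below illustrates the calculation with Fitch parsimony on two fixed topologies and five invented binary characters; the matrix and taxon labels are hypothetical and bear no relation to the 393-character dataset. Comparing only two trees gives a true Bremer value only if the alternative happens to be the shortest tree lacking the clade, which a real analysis would establish by a constrained search.

```python
from typing import Dict, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]

# Two rooted topologies for an outgroup plus four hypothetical terminals.
MANDIBULATA_TREE: Tree = ("out", ("Che", ("Myr", ("Cru", "Hex"))))
MYRIOCHELATA_TREE: Tree = ("out", (("Che", "Myr"), ("Cru", "Hex")))

# Invented binary characters (one column per character); not real data.
MATRIX: Dict[str, str] = {
    "out": "00000",
    "Che": "00100",
    "Myr": "11101",
    "Cru": "11011",
    "Hex": "11011",
}


def fitch(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Fitch's algorithm: return (state set at this node, steps below it)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    # Empty intersection: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree: Tree, matrix: Dict[str, str]) -> int:
    """Total parsimony length of `tree` summed over all characters."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


best = tree_length(MANDIBULATA_TREE, MATRIX)
alternative = tree_length(MYRIOCHELATA_TREE, MATRIX)
extra_steps = alternative - best  # Bremer-style decay value for this toy case
```

In this toy case three characters favour the Mandibulata topology, one favours Myriochelata and one is neutral, so the alternative tree is two steps longer.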
Morphological support for Mandibulata includes complex similarities of head structure and specifically of their mandibles, arrangements of midline neuropils in the brain, correspondences in cell numbers and specialized cell types in the ommatidia, similar sternal buds in the stomodeal region, and specific arrangements of serotonin-reactive neurons in the nerve cord (see the electronic supplementary material for a detailed compilation of morphological and developmental genetic characters).
Most of our phylogenomic analyses support the monophyly of Arthropoda (euarthropods, tardigrades, onychophorans), either using our gene sampling (figure 1) or that of Dunn (figure 3b). The position of tardigrades is more unstable, varying from being sister to the onychophorans (figure 1 using CAT + Γ model) to being sister to a group of arthropods plus onychophorans (see the electronic supplementary material, figure S2 using the CAT + GTR model). Whereas the CAT + Γ model supports Arthropoda consistently, site-homogeneous WAG + F + Γ and GTR + Γ models tend to group tardigrades with nematodes (dotted arrows in the electronic supplementary material, figures S3 and S4). Our interpretation is that site-homogeneous models, which fit our data less well than the CAT model (see §2), are unable to overcome the effect of systematic errors responsible for the grouping of fast evolving nematodes and tardigrades.
All our phylogenomic analyses support a monophyletic origin of the chelicerates in which pycnogonids are sister to a group of arachnids plus Xiphosura. This finding is significant in light of recent debates over the position of the Pycnogonida, which some studies find to be the sister group to all other arthropods, a hypothesis known as Cormogonida [23,32,33]. The possibility that systematic/stochastic errors were affecting the affinity of pycnogonids in previous studies is highlighted by their position being parameter-dependent in other studies [16,24,34].
Finally, all our phylogenomic analyses support a paraphyletic origin of the Cycloneuralia, with the Scalidophora (priapulids and kinorhynchs) sister to a group of nematodes plus arthropods. This is in accordance with ribosomal markers, but in contrast to previous phylogenomic studies [12,13], which instead supported monophyly of Cycloneuralia (Nematodoida + Scalidophora). Notably, when updating the gene selection of Dunn et al. to our larger taxon sampling, a paraphyletic origin of the Cycloneuralia is recovered. Ultimately, the relationships of Nematodoida, Scalidophora and Arthropoda remain uncertain.
Arguably the strongest evidence of phylogenetic accuracy is the congruence of independent lines of evidence supporting the same tree topology [22,35]. In order to test current hypotheses of arthropod evolution, we have analysed three independent lines of evidence: a phylogenomic dataset of 198 genes, a new miRNA dataset and a large morphological dataset. All three datasets unambiguously support the monophyly of Mandibulata.
We have examined the possibility that previous molecular phylogenies supporting Myriochelata might have been affected by systematic error. The robustness of the result from our phylogenomic dataset is supported by experiments designed to reduce the effects of such errors: increased taxon sampling, exclusion of outgroups with the longest branches, removal of the fastest evolving positions and the use of better-fitting evolutionary models systematically increase support for Mandibulata over Myriochelata.
The presence of miR-965 and miR-282 in Pancrustacea and in two groups of Myriapoda also represents compelling evidence in support of Mandibulata. These two miRNAs are absent from both arachnids and horseshoe crabs as well as from all other ecdysozoans for which the miRNA complement is known (nematodes and priapulid worms). As it is implausible for these miRNAs to have been independently acquired in the different mandibulate lineages, we conclude that they constitute rare genomic changes supporting Mandibulata. In light of the congruence of these novel miRNA autapomorphies with the other lines of evidence presented here (phylogenomics and morphology) and with the complementary findings of Regier et al., we conclude that the most tenable position of the Myriapoda is as the sister group of the Pancrustacea within a monophyletic Mandibulata.
Our phylogenomic analyses suggest that studies which have grouped tardigrades with nematodes may have been similarly affected by LBA. When analysed using the CAT model, which has been shown to help in overcoming systematic errors, both our dataset and that of Dunn et al. group Tardigrada with Euarthropoda and Onychophora in a monophyletic Arthropoda clade. Tardigrada are a sister group of the Onychophora in these trees, a topology which finds no support from a morphological point of view, but is in accordance with mitochondrial markers. Furthermore, if the paraphyletic nature of the Cycloneuralia is correct, as supported by our phylogenomic analyses, this would suggest that the ancestral Ecdysozoa was cycloneuralian-like, possessing a circumpharyngeal brain and an introvert.
The Mandibulata, which includes insects, is by far the largest clade of animals on Earth, but the origin of this successful bodyplan in terms of the evolution of its development remains obscure. The picture from palaeontology is, however, somewhat clearer. Cambrian fossils that have been identified as a grade of stem-group mandibulates indicate a crustacean-like habitus for basal members of the Mandibulata and may shed light on how the mandible common to these groups evolved. The limb on the third cephalic segment (the mandible homologue) in Cambrian stem-group mandibulates such as Martinssonia displays a stronger development of a movable, setose process at the limb base (‘proximal endite’) than that on the adjacent limbs. The more elaborated proximal endite used for food manipulation is viewed as a precursor to the fully differentiated coxal chewing surface in the mandibulate crown group. Further studies of fossils and embryos in the light of what we suggest is a reliable phylogeny of arthropod classes should clarify the evolution of the mandibulate bodyplan, and consequently how anatomical novelties may have promoted their hugely successful radiation.
We thank Michael Akam and Ariel Chipman for the Strigamia maritima cDNA library, Baylor College of Medicine Human Genome Sequencing Center for access to the Strigamia genome data and Alexandros Stamatakis for a RAxML version implementing the GTR model for amino acids. D.P. and L.C. are supported by a Science Foundation Ireland RFP grant to D.P. (08/RFP/EOB1595), and by a Science Foundation Ireland Short Term Travel Fellowship to D.P. O.R.S. was supported by the Zoonet Marie Curie Training Network to M.J.T. and by a Science Foundation Ireland RFP grant (08/RFP/EOB1595) to D.P. S.J.L. is supported by an IRCSET fellowship. K.J.P. is supported by the National Science Foundation. H.P. gratefully acknowledges the financial support provided by NSERC, the Canadian Research Chair Program and the Université de Montréal, and the Réseau Québecois de Calcul de Haute Performance for computational resources.