In recent years, avian systematics has been characterized by a diminished reliance on morphological cladistics of modern taxa, intensive palaeornithological research stimulated by new discoveries and an inundation by analyses based on DNA sequences. Unfortunately, in contrast to significant insights into basal origins, the broad picture of neornithine phylogeny remains largely unresolved. Morphological studies have emphasized characters of use in palaeontological contexts. Molecular studies, following disillusionment with the pioneering, but non-cladistic, work of Sibley and Ahlquist, have differed markedly from each other and from morphological works in both methods and findings. Consequently, at the turn of the millennium, points of robust agreement among schools concerning higher-order neornithine phylogeny have been limited to the two basalmost and several mid-level, primary groups. This paper describes a phylogenetic (cladistic) analysis of 150 taxa of Neornithes, including exemplars from all non-passeriform families, and subordinal representatives of Passeriformes. Thirty-five outgroup taxa encompassing Crocodylia, predominantly theropod Dinosauria, and selected Mesozoic birds were used to root the trees. Based on study of specimens and the literature, 2954 morphological characters were defined and have been described in a companion work; approximately one-third of these were multistate (i.e. comprised at least three states), and states within more than one-half of the multistate characters were ordered for analysis. Complete heuristic searches using 10 000 random-addition replicates recovered a total solution set of 97 well-resolved, most-parsimonious trees (MPTs). The set of MPTs was confirmed by an expanded heuristic search based on 10 000 random-addition replicates and a full ratchet-augmented exploration to ascertain global optima. A strict consensus tree of MPTs included only six trichotomies, i.e. nodes differing topologically among MPTs. 
Bootstrap percentages (based on 10 000 replicates) and ratchet-minimized support (Bremer) indices indicated most nodes to be robust. Several fossil Neornithes (e.g. Dinornithiformes, Aepyornithiformes) were placed within the ingroup a posteriori either through unconstrained, heuristic searches based on the complete matrix augmented by these taxa separately or using backbone-constraints. Analysis confirmed the topology among outgroup Theropoda and achieved robust resolution at virtually all levels of the Neornithes. Findings included monophyly of the palaeognathous birds, comprising the sister taxa Tinamiformes and ratites, respectively, and the Anseriformes and Galliformes as monophyletic sister-groups, together forming the sister-group to other Neornithes exclusive of the Palaeognathae (Neoaves). Noteworthy inferences include: (i) the sister-group to remaining Neoaves comprises a diversity of marine and wading birds; (ii) Podicipedidae are the sister-group of Gaviidae, and not closely related to the Phoenicopteridae, as recently suggested; (iii) the traditional Pelecaniformes, including the shoebill (Balaeniceps rex) as sister-taxon to other members, are monophyletic; (iv) traditional Ciconiiformes are monophyletic; (v) Strigiformes and Falconiformes are sister-groups; (vi) Cathartidae is the sister-group of the remaining Falconiformes; (vii) Ralliformes (Rallidae and Heliornithidae) are the sister-group to the monophyletic Charadriiformes, with the traditionally composed Gruiformes and Turniciformes (Turnicidae and Mesitornithidae) sequentially paraphyletic to the entire foregoing clade; (viii) Opisthocomus hoazin is the sister-taxon to the Cuculiformes (including the Musophagidae); (ix) traditional Caprimulgiformes are monophyletic and the sister-group of the Apodiformes; (x) Trogoniformes are the sister-group of Coliiformes; (xi) Coraciiformes, Piciformes and Passeriformes are mutually monophyletic and closely related; and (xii) the Galbulae are retained 
within the Piciformes. Unresolved portions of the Neornithes (nodes having more than one most-parsimonious solution) comprised three parts of the tree: (a) several interfamilial nodes within the Charadriiformes; (b) a trichotomy comprising the (i) Psittaciformes, (ii) Columbiformes and (iii) Trogonomorphae (Trogoniformes, Coliiformes) + Passerimorphae (Coraciiformes, Piciformes, Passeriformes); and (c) a trichotomy comprising the Coraciiformes, Piciformes and Passeriformes. The remaining polytomies were among outgroups, although several of the highest-order nodes were only marginally supported; however, the majority of nodes were resolved and met or surpassed conventional standards of support. Quantitative comparisons with alternative hypotheses, examination of highly supportive and diagnostic characters for higher taxa, correspondences with prior studies, complementarity and philosophical differences with palaeontological phylogenetics, promises and challenges of palaeogeography and calibration of evolutionary rates of birds, and classes of promising evidence and future directions of study are reviewed. Homology, as applied to avian examples of apparent homologues, is considered in terms of recent theory, and a revised annotated classification of higher-order taxa of Neornithes and other closely related Theropoda is proposed. © 2007 The Linnean Society of London, Zoological Journal of the Linnean Society, 2007, 149, 1–95.
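The abstract above notes that states within many multistate characters were ordered for analysis. As a minimal sketch of what that choice means for tree length, the following Python computes the parsimony length of a single character on a fixed rooted tree under an unordered cost (any change costs one step) and an ordered cost (a change costs the difference between state numbers). The tree, taxon names and states are hypothetical, and the Sankoff-style dynamic programme is a generic illustration, not the software used in the study.

```python
INF = float("inf")

def sankoff(tree, states, n_states, cost):
    """Return, for each possible state at this node, the minimum number
    of steps needed in the subtree below it.

    tree   : nested tuples with taxon names (str) as leaves
    states : dict mapping taxon name -> observed state (int)
    cost   : function (from_state, to_state) -> step cost
    """
    if isinstance(tree, str):
        # A leaf can only take its observed state.
        return [0 if s == states[tree] else INF for s in range(n_states)]
    child_tables = [sankoff(c, states, n_states, cost) for c in tree]
    return [
        sum(min(cost(s, t) + table[t] for t in range(n_states))
            for table in child_tables)
        for s in range(n_states)
    ]

def tree_length(tree, states, n_states, cost):
    """Minimum number of steps for one character on the given tree."""
    return min(sankoff(tree, states, n_states, cost))

def unordered(a, b):
    # Any state change counts as a single step.
    return 0 if a == b else 1

def ordered(a, b):
    # Changes must pass through intermediate states (0 -> 1 -> 2).
    return abs(a - b)

# Hypothetical three-state character scored for four taxa.
tree = (("A", "B"), ("C", "D"))
states = {"A": 0, "B": 0, "C": 2, "D": 2}

unordered_length = tree_length(tree, states, 3, unordered)
ordered_length = tree_length(tree, states, 3, ordered)
print(unordered_length, ordered_length)
```

In this toy case the single change from state 0 to state 2 costs one step when unordered and two steps when ordered, which is why ordering can alter which trees are most parsimonious.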
Aves; cladistics; classification; convergence; homology; morphology; ontogeny; palaeontology; phylogenetics; Neornithes; taxonomy
Brachiopod and phoronid phylogeny is inferred from SSU rDNA sequences of 28 articulate and nine inarticulate brachiopods, three phoronids, two ectoprocts and various outgroups, using gene trees reconstructed by weighted parsimony, distance and maximum likelihood methods. Of these sequences, 33 from brachiopods, two from phoronids and one each from an ectoproct and a priapulan are newly determined. The brachiopod sequences belong to 31 different genera and thus survey about 10% of extant genus-level diversity. Sequences determined in different laboratories and those from closely related taxa agree well, but evidence is presented suggesting that one published phoronid sequence (GenBank accession UO12648) is a brachiopod-phoronid chimaera, and this sequence is excluded from the analyses. The chiton, Acanthopleura, is identified as the phenetically proximal outgroup; other selected outgroups were chosen to allow comparison with recent, non-molecular analyses of brachiopod phylogeny. The different outgroups and methods of phylogenetic reconstruction lead to similar results, with differences mainly in the resolution of weakly supported ancient and recent nodes, including the divergence of inarticulate brachiopod sub-phyla, the position of the rhynchonellids in relation to long- and short-looped articulate brachiopod clades and the relationships of some articulate brachiopod genera and species. Attention is drawn to the problem presented by nodes that are strongly supported by non-molecular evidence but receive only low bootstrap resampling support. Overall, the gene trees agree with morphology-based brachiopod taxonomy, but novel relationships are tentatively suggested for thecideidine and megathyrid brachiopods. Articulate brachiopods are found to be monophyletic in all reconstructions, but monophyly of inarticulate brachiopods and the possible inclusion of phoronids in the inarticulate brachiopod clade are less strongly established. 
Phoronids are clearly excluded from a sister-group relationship with articulate brachiopods, this proposed relationship being due to the rejected, chimaeric sequence (GenBank UO12648). Lineage relative rate tests show no heterogeneity of evolutionary rate among articulate brachiopod sequences, but indicate that inarticulate brachiopod plus phoronid sequences evolve somewhat more slowly. Both brachiopods and phoronids evolve slowly by comparison with other invertebrates. A number of palaeontologically dated times of earliest appearance are used to make upper and lower estimates of the global rate of brachiopod SSU rDNA evolution, and these estimates are used to infer the likely divergence times of other nodes in the gene tree. There is reasonable agreement between most inferred molecular and palaeontological ages. The estimated rates of SSU rDNA sequence evolution suggest that the last common ancestor of brachiopods, chitons and other protostome invertebrates (Lophotrochozoa and Ecdysozoa) lived deep in Precambrian time. Results of this first DNA-based, taxonomically representative analysis of brachiopod phylogeny are in broad agreement with current morphology-based classification and systematics and are largely consistent with the hypothesis that brachiopod shell ontogeny and morphology are a good guide to phylogeny.
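The rate calibration described above can be illustrated with a short sketch. It assumes a strict molecular clock, so that the pairwise distance d between two lineages that diverged t million years ago satisfies d = 2rt. The distances and ages below are invented for illustration only and are not values from the study.

```python
def rate_from_calibration(distance, age_myr):
    """Substitutions per site per Myr per lineage, assuming d = 2 * r * t."""
    return distance / (2.0 * age_myr)

def divergence_time(distance, rate):
    """Divergence time in Myr implied by a pairwise distance and a rate."""
    return distance / (2.0 * rate)

def time_bounds(distance, calibrations):
    """Lower and upper age estimates for a node, using the fastest and
    slowest rates obtained from a list of (distance, age_myr) calibrations."""
    rates = [rate_from_calibration(d, t) for d, t in calibrations]
    youngest = divergence_time(distance, max(rates))
    oldest = divergence_time(distance, min(rates))
    return youngest, oldest

# Hypothetical calibration points: (pairwise distance, fossil age in Myr).
calibrations = [(0.02, 100.0), (0.03, 200.0)]

# Hypothetical uncalibrated node with pairwise distance 0.06.
low, high = time_bounds(0.06, calibrations)
print(f"estimated age between {low:.0f} and {high:.0f} Myr")
```

Using the fastest and slowest calibrated rates gives a bracket rather than a point estimate, which mirrors the upper and lower estimates mentioned in the abstract.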
The evolutionary relationships of closely related species have long been of interest to biologists since these species experienced different evolutionary processes in a relatively short period of time. Comparison of phylogenies inferred from DNA sequences with differing inheritance patterns, such as mitochondrial, autosomal, and X and Y chromosomal loci, can provide more comprehensive inferences of the evolutionary histories of species. Gibbons, especially the genus Hylobates, are particularly intriguing as they consist of multiple closely related species which emerged rapidly and live in close geographic proximity. Our current understanding of relationships among Hylobates species is largely based on data from the maternally-inherited mitochondrial DNAs (mtDNAs).
To infer the paternal histories of gibbon taxa, we sequenced multiple Y chromosomal loci from 26 gibbons representing 10 species. As expected, we find levels of sequence variation some five times lower than observed for the mitochondrial genome (mtgenome). Although our Y chromosome phylogenetic tree shows relatively low resolution compared to the mtgenome tree, our results are consistent with the monophyly of gibbon genera suggested by the mtgenome tree. Comparing divergence dates and branching patterns between the mtgenome and Y chromosome trees, we found that: 1) inferred divergence estimates were more recent for the Y chromosome than for the mtgenome; and 2) H. lar and H. pileatus are each monophyletic in the mtgenome phylogeny, but one H. pileatus individual falls within the H. lar Y chromosome clade.
Based on the ~6.4 kb of Y chromosomal DNA sequence data generated for each of the 26 individuals in this study, we provide molecular inferences on gibbon and particularly on Hylobates evolution complementary to those from mtDNA data. Overall, our results illustrate the utility of comparative studies of loci with different inheritance patterns for investigating potential sex specific processes on the evolutionary histories of closely related taxa, and emphasize the need for further sampling of gibbons of known provenance.
Y chromosome phylogeny; Phylogenetic relationships; Divergence times; Mitochondrial genome; Gene flow
In a 1997 seminal paper, W. Maddison proposed minimizing deep coalescences, or MDC, as an optimization criterion for inferring the species tree from a set of incongruent gene trees, assuming the incongruence is exclusively due to lineage sorting. In a subsequent paper, Maddison and Knowles provided and implemented a search heuristic for optimizing the MDC criterion, given a set of gene trees. However, the heuristic is not guaranteed to compute optimal solutions, and its hill-climbing search makes it slow in practice.
In this paper, we provide two exact solutions to the problem of inferring the species tree from a set of gene trees under the MDC criterion. In other words, our solutions are guaranteed to find the tree that minimizes the total number of deep coalescences from a set of gene trees. One solution is based on a novel integer linear programming (ILP) formulation, and another is based on a simple dynamic programming (DP) approach. Powerful ILP solvers, such as CPLEX, make the first solution appealing, particularly for very large-scale instances of the problem, whereas the DP-based solution eliminates dependence on proprietary tools, and its simplicity makes it easy to integrate with other genomic events that may cause gene tree incongruence.
Using the exact solutions, we analyze a data set of 106 loci from eight yeast species, a data set of 268 loci from eight Apicomplexan species, and several simulated data sets. We show that the MDC criterion provides very accurate estimates of the species tree topologies, and that our solutions are very fast, thus allowing for the accurate analysis of genome-scale data sets. Further, the efficiency of the solutions allows for quick exploration of sub-optimal solutions, which is important for a parsimony-based criterion such as MDC, as we show. We show that searching for the species tree in the compatibility graph of the clusters induced by the gene trees may be sufficient in practice, a finding that helps ameliorate the computational requirements of optimization solutions. Further, we study the statistical consistency and convergence rate of the MDC criterion, as well as its optimality in inferring the species tree. Finally, we show how our solutions can be used to identify potential horizontal gene transfer events that may have caused some of the incongruence in the data, thus augmenting Maddison's original framework. We have implemented our solutions in the PhyloNet software package, which is freely available at: http://bioinfo.cs.rice.edu/phylonet.
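To make the MDC criterion concrete, here is a minimal sketch of how a candidate species tree can be scored against a set of gene trees. It assumes rooted trees written as nested tuples, one sampled allele per species, and the usual "extra lineages" count: for each species-tree branch, the number of gene lineages entering that branch minus one. It scores given candidates by brute force and is not the ILP or DP formulation described in the abstract; all tree and taxon names are hypothetical.

```python
def leaf_set(tree):
    """Set of taxon names below a node (nested tuples, str leaves)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def lineages_in(gene_tree, cluster):
    """Number of maximal gene-tree clades whose leaves all lie in cluster.
    This is the number of gene lineages leaving the species-tree branch
    that sits above `cluster`."""
    if leaf_set(gene_tree) <= cluster:
        return 1
    if isinstance(gene_tree, str):
        return 0
    return sum(lineages_in(child, cluster) for child in gene_tree)

def species_clusters(species_tree):
    """Yield the leaf set below every node of the species tree."""
    yield leaf_set(species_tree)
    if not isinstance(species_tree, str):
        for child in species_tree:
            yield from species_clusters(child)

def extra_lineages(species_tree, gene_tree):
    """Total extra lineages (deep coalescences) for one gene tree."""
    return sum(lineages_in(gene_tree, cluster) - 1
               for cluster in species_clusters(species_tree))

def mdc_score(species_tree, gene_trees):
    """Sum of extra lineages over all gene trees."""
    return sum(extra_lineages(species_tree, g) for g in gene_trees)

# Hypothetical gene trees for four species.
gene_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
candidates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
best = min(candidates, key=lambda s: mdc_score(s, gene_trees))
print(best, mdc_score(best, gene_trees))
```

Enumerating candidate species trees this way scales poorly, which is the motivation for exact formulations and for restricting the search to clusters induced by the gene trees.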
Inferring the evolutionary history of a set of species, known as the species tree, is a task of utmost significance in biology and beyond. The traditional approach to accomplishing this task from molecular sequences entails sequencing a gene in the set of species under consideration, reconstructing the gene's evolutionary history, and declaring it to be the species tree. However, recent analyses of multiple gene data sets, made available thanks to advances in sequencing technologies, have indicated that gene trees in the same group of species may disagree with each other, as well as with the species tree. Therefore, the development of methods for inferring the species tree despite such disagreements is imperative.
In this paper, we propose such a method, which seeks the tree that minimizes the amount of disagreement between the input set of gene trees and the inferred one. We have implemented our method and studied its performance, in terms of accuracy and computational efficiency, on two biological data sets and a large number of simulated data sets. Our analyses, of both the biological and synthetic data sets, indicate high accuracy of the method, as well as computationally efficient solutions in practice. Hence, our method makes a good candidate for inferring accurate species trees, despite gene tree disagreements, at a genomic scale.
The spider mite sub-family Tetranychinae includes many agricultural pests. The internal transcribed spacer (ITS) region of nuclear ribosomal RNA genes and the cytochrome c oxidase subunit I (COI) gene of mitochondrial DNA have been used for species identification and phylogenetic reconstruction within the sub-family Tetranychinae, although they have not always been successful. The 18S and 28S rRNA genes should be more suitable for resolving higher levels of phylogeny, such as tribes or genera of Tetranychinae because these genes evolve more slowly and are made up of conserved regions and divergent domains. Therefore, we used both the 18S (1,825–1,901 bp) and 28S (the 5′ end of 646–743 bp) rRNA genes to infer phylogenetic relationships within the sub-family Tetranychinae with a focus on the tribe Tetranychini. Then, we compared the phylogenetic tree of the 18S and 28S genes with that of the mitochondrial COI gene (618 bp). As observed in previous studies, our phylogeny based on the COI gene was not resolved because of the low bootstrap values for most nodes of the tree. On the other hand, our phylogenetic tree of the 18S and 28S genes revealed several well-supported clades within the sub-family Tetranychinae. The 18S and 28S phylogenetic trees suggest that the tribes Bryobiini, Petrobiini and Eurytetranychini are monophyletic and that the tribe Tetranychini is polyphyletic. At the genus level, six genera for which more than two species were sampled appear to be monophyletic, while four genera (Oligonychus, Tetranychus, Schizotetranychus and Eotetranychus) appear to be polyphyletic. The topology presented here does not fully agree with the current morphology-based taxonomy, so that the diagnostic morphological characters of Tetranychinae need to be reconsidered.
Army ants are dominant invertebrate predators in tropical and subtropical terrestrial ecosystems. Their close relatives within the dorylomorph group of ants are also highly specialized predators, although much less is known about their biology. We analyzed molecular data generated from 11 nuclear genes to infer a phylogeny for the major dorylomorph lineages, and incorporated fossil evidence to infer divergence times under a relaxed molecular clock.
Because our results indicate that one subfamily and several genera of dorylomorphs are non-monophyletic, we propose to subsume the six previous dorylomorph subfamilies into a single subfamily, Dorylinae. We find the monophyly of Dorylinae to be strongly supported and estimate the crown age of the group at 87 (74–101) million years. Our phylogenetic analyses provide only weak support for army ant monophyly and also call into question a previous hypothesis that army ants underwent a fundamental split into New World and Old World lineages. Outside the army ants, our phylogeny reveals for the first time many old, distinct lineages in the Dorylinae. The genus Cerapachys is shown to be non-monophyletic and comprised of multiple lineages scattered across the Dorylinae tree. We recover, with strong support, novel relationships among these Cerapachys-like clades and other doryline genera, but divergences in the deepest parts of the tree are not well resolved. We find the genus Sphinctomyrmex, characterized by distinctive abdominal constrictions, to consist of two separate lineages with convergent morphologies, one inhabiting the Old World and the other the New World tropics.
While we obtain good resolution in many parts of the Dorylinae phylogeny, relationships deep in the tree remain unresolved, with major lineages joining each other in various ways depending upon the analytical method employed, but always with short internodes. This may be indicative of rapid radiation in the early history of the Dorylinae, but additional molecular data and more complete species sampling are needed for confirmation. Our phylogeny now provides a basic framework for comparative biological analyses, but much additional study on the behavior and morphology of doryline species is needed, especially investigations directed at the non-army ant taxa.
Phylogenies often contain both well-supported and poorly supported nodes. Determining how much additional data might be required to eventually recover most or all nodes with high support is an important pragmatic goal, and simulations have been used to examine this question. Most simulations have been based on few empirical loci, and suggest that well-supported phylogenies can be determined with a very modest amount of data. Here we report the results of an empirical phylogenetic analysis of all 10 genera and 25 of 48 species of the New World pond turtles (family Emydidae) based on one mitochondrial (1070 base pairs) and seven nuclear loci (5961 base pairs), and a more biologically realistic simulation analysis incorporating variation among gene trees, aimed at determining how much more data might be necessary to recover weakly supported nodes with strong support.
Our mitochondrial-based phylogeny was well resolved, and congruent with some previous mitochondrial results. For example, all genera, and all species except Pseudemys concinna, P. peninsularis, and Terrapene carolina were monophyletic with strong support from at least one analytical method. The Emydinae was recovered as monophyletic, but the Deirochelyinae was not. Based on nuclear data, all genera were monophyletic with strong support except Trachemys, and all species except Graptemys pseudogeographica, P. concinna, T. carolina, and T. coahuila were monophyletic, generally with strong support. However, the branches subtending most genera were relatively short, and intergeneric relationships within subfamilies were mostly unsupported.
Our simulations showed that relatively high bootstrap support values (i.e. ≥ 70) for all nodes were reached in all datasets, but an increase in data did not necessarily equate to an increase in support values. However, simulations based on a single empirical locus reached higher overall levels of support with less data than did the simulations that were based on all seven empirical nuclear loci, and symmetric tree distances were much lower for single versus multiple gene simulation analyses.
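The symmetric tree distance mentioned above is commonly computed as the number of groupings found in one tree but not the other (the Robinson-Foulds distance). The sketch below is a simplified rooted version that compares clades in trees written as nested tuples; unrooted implementations compare bipartitions instead. The example trees are hypothetical.

```python
def clades(tree):
    """Return the set of clades (as frozensets of taxon names) defined by
    the internal nodes of a rooted tree given as nested tuples."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found

def symmetric_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))

# Hypothetical true tree and two estimated trees.
true_tree = (("A", "B"), ("C", "D"))
estimate_1 = (("A", "B"), ("C", "D"))
estimate_2 = (("A", "C"), ("B", "D"))

print(symmetric_distance(true_tree, estimate_1))
print(symmetric_distance(true_tree, estimate_2))
```

A distance of zero means the two topologies agree on every clade; larger values indicate more conflicting groupings.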
Our empirical results provide new insights into the phylogenetics of the Emydidae, but the short branches recovered deep in the tree also indicate the need for additional work on this clade to recover all intergeneric relationships with confidence and to delimit species for some problematic groups. Our simulation results suggest that moderate (in the few-to-tens of kb range) amounts of data are necessary to recover most emydid relationships with high support values. They also suggest that previous simulations that do not incorporate among-gene tree topological variance probably underestimate the amount of data needed to recover well supported phylogenies.
Despite considerable progress in systematics, a comprehensive scenario of the evolution of phenotypic characters in the mega-diverse Holometabola based on a solid phylogenetic hypothesis was still missing. We addressed this issue by de novo sequencing transcriptome libraries of representatives of all orders of holometabolan insects (13 species in total) and by using a previously published extensive morphological dataset. We tested competing phylogenetic hypotheses by analyzing various specifically designed sets of amino acid sequence data, using maximum likelihood (ML) based tree inference and Four-cluster Likelihood Mapping (FcLM). By maximum parsimony-based mapping of the morphological data on the phylogenetic relationships we traced evolutionary transformations at the phenotypic level and reconstructed the groundplan of Holometabola and of selected subgroups.
In our analysis of the amino acid sequence data of 1,343 single-copy orthologous genes, Hymenoptera are placed as sister group to all remaining holometabolan orders, i.e., to a clade Aparaglossata, comprising two monophyletic subunits Mecopterida (Amphiesmenoptera + Antliophora) and Neuropteroidea (Neuropterida + Coleopterida). The monophyly of Coleopterida (Coleoptera and Strepsiptera) remains ambiguous in the analyses of the transcriptome data, but appears likely based on the morphological data. Highly supported relationships within Neuropterida and Antliophora are Raphidioptera + (Neuroptera + monophyletic Megaloptera), and Diptera + (Siphonaptera + Mecoptera). ML tree inference and FcLM yielded largely congruent results. However, FcLM, which was applied here for the first time to large phylogenomic supermatrices, displayed additional signal in the datasets that was not identified in the ML trees.
Our phylogenetic results imply that an orthognathous larva belongs to the groundplan of Holometabola, with compound eyes and well-developed thoracic legs, externally feeding on plants or fungi. Ancestral larvae of Aparaglossata were prognathous, equipped with single larval eyes (stemmata), and possibly agile and predacious. Ancestral holometabolan adults likely resembled in their morphology the groundplan of adult neopteran insects. Within Aparaglossata, the adult’s flight apparatus and ovipositor underwent strong modifications. We show that the combination of well-resolved phylogenies obtained by phylogenomic analyses and well-documented extensive morphological datasets is an appropriate basis for reconstructing complex morphological transformations and for the inference of evolutionary histories.
The phylogeny of the flycatcher genus Anairetes was previously inferred using short fragments of mitochondrial DNA and parsimony and distance-based methods. The resulting topology spurred taxonomic revision and influenced understanding of Andean biogeography. More than a decade later, we revisit the phylogeny of Anairetes tit-tyrants using more mtDNA characters, seven unlinked loci (3 mitochondrial genes, 6 nuclear loci), more closely related outgroup taxa, partitioned Bayesian analyses, and two coalescent species-tree approaches (Bayesian estimation of species trees, BEST; Bayesian evolutionary analysis by sampling trees, *BEAST). Of these improvements in data and analyses, the fourfold increase in mtDNA characters was both necessary and sufficient to incur a major shift in the topology and near-complete resolution. The species-tree analyses, while theoretically preferable to concatenation or single gene approaches, yielded topologies that were compatible with mtDNA but with weaker statistical resolution at nodes. The previous results that had led to taxonomic and biogeographic reappraisal were refuted, and the current results support the resurrection of the genus Uromyias as the sister clade to Anairetes. The sister relationship between these two genera corresponds to an ecological dichotomy between a depauperate humid cloudforest clade and a diverse dry-tolerant clade that has diversified along the latitudinal axis of the Andes. The species-tree results and the concatenation results each reaffirm the primacy of mtDNA to provide phylogenetic signal for avian phylogenies at the species and subspecies level. This is due in part to the abundance of informative characters in mtDNA, and in part to its lower effective population size that causes it to more faithfully track the species tree.
Effective population size; Phylogenetic methods; Emergent signal; Species-tree methods; Anairetes; Haploid specification
The circumscription of the avian superfamily Sylvioidea is a matter of long ongoing debate. While the overall inclusiveness has now been mostly agreed on and 20 families recognised, the phylogenetic relationships among the families are largely unknown. We here present a phylogenetic hypothesis for Sylvioidea based on one mitochondrial and six nuclear markers, in total ~6.3 kbp, for 79 ingroup species representing all currently recognised families and some species with uncertain affinities, making this the most comprehensive analysis of this taxon.
The resolution, especially of the deeper nodes, is much improved compared to previous studies. However, many relationships among families remain uncertain and are in need of verification. Most families themselves are very well supported based on the total data set and also by indels. Our data do not support the inclusion of Hylia in Cettiidae, but do not strongly reject a close relationship with Cettiidae either. The genera Scotocerca and Erythrocercus are closely related to Cettiidae, but separated by relatively long internodes. The families Paridae, Remizidae and Stenostiridae clustered among the outgroup taxa and not within Sylvioidea.
Although the phylogenetic position of Hylia is uncertain, we tentatively support the recognition of the family Hyliidae Bannerman, 1923 for this genus and Pholidornis. We propose new family names for the genera Scotocerca and Erythrocercus, Scotocercidae and Erythrocercidae, respectively, rather than including these in Cettiidae, and we formally propose the name Macrosphenidae, which has been in informal use for some time. We recommend that Paridae, Remizidae and Stenostiridae are not included in Sylvioidea. We also briefly discuss the problems of providing a morphological diagnosis when proposing a new family-group name (or genus-group name) based on a clade.
Phylogeny; Passerines; Taxonomic revision; International Code of Zoological Nomenclature
Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree–species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and species-level events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene tree–species tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllősi et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes we find that joint likelihood-based inference results in a striking reduction in apparent phylogenetic discord, with 24%, 59%, and 46% reductions, respectively, in the mean numbers of duplications, transfers, and losses per gene family. 
The open source implementation of ALE is available from https://github.com/ssolo/ALE.git. [amalgamation; gene tree reconciliation; gene tree reconstruction; lateral gene transfer; phylogeny.]
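The idea of amalgamating clades observed in a sample of gene trees can be illustrated with conditional clade frequencies. The sketch below is a simplified, rooted-tree illustration and makes no claim to reproduce the ALE implementation: it counts how often each observed clade is split in a particular way across a sample, then scores any tree as the product of those conditional frequencies. It covers only the sequence-based amalgamation step, not the reconciliation model, and the sample trees are hypothetical.

```python
from collections import Counter

def leaf_set(tree):
    """Set of taxon names below a node (nested tuples, str leaves)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def splits(tree):
    """Yield (clade, split) for each internal node of a rooted binary tree,
    where split is the unordered pair of child clades."""
    if isinstance(tree, str):
        return
    left, right = tree
    yield leaf_set(tree), frozenset([leaf_set(left), leaf_set(right)])
    yield from splits(left)
    yield from splits(right)

def count_tables(sample):
    """Count clade occurrences and (clade, split) occurrences in a sample."""
    clade_counts, split_counts = Counter(), Counter()
    for tree in sample:
        for clade, split in splits(tree):
            clade_counts[clade] += 1
            split_counts[(clade, split)] += 1
    return clade_counts, split_counts

def conditional_clade_probability(tree, clade_counts, split_counts):
    """Product over internal nodes of freq(split | clade); zero if the tree
    uses a clade or split never seen in the sample."""
    prob = 1.0
    for clade, split in splits(tree):
        if clade_counts[clade] == 0:
            return 0.0
        prob *= split_counts[(clade, split)] / clade_counts[clade]
    return prob

# Hypothetical sample of two gene trees that disagree in two places.
sample = [
    ((("A", "B"), "C"), (("D", "E"), "F")),
    ((("A", "C"), "B"), (("D", "F"), "E")),
]
clade_counts, split_counts = count_tables(sample)

# A tree absent from the sample but built entirely from observed clades.
amalgamated = ((("A", "B"), "C"), (("D", "F"), "E"))
p_amalgamated = conditional_clade_probability(amalgamated, clade_counts, split_counts)
print(p_amalgamated)
```

The amalgamated tree receives non-zero probability even though it never appears in the sample, which is what lets a small sample of trees stand in for a much larger set of candidate topologies.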
This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree–species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution.
Algorithm; amalgamation; Bayesian inference; birth–death model; coalescent; dynamic programming; gene duplication; gene loss; gene transfer; gene tree; hybridization; maximum likelihood; phylogenetics; species tree
The hippolytid genus Lysmata is characterized by simultaneous hermaphroditism, a very rare sexual system among Decapoda. Specialized cleaning behavior is reported in a few pair-living species; these life history traits vary within the genus. Unfortunately, the systematics of Lysmata and the Hippolytidae itself are in contention, making it difficult to examine these taxa for trends in life history traits. A phylogeny of Lysmata and related taxa is needed, to clarify their evolutionary relationships and the origin of their unique sexual pattern. In this study, we present a molecular phylogenetic analysis among species of Lysmata, related genera, and several putative hippolytids. The analysis is based upon DNA sequences of two genes, 16S mtDNA and nuclear 28S rRNA. Phylogenetic trees were estimated using Bayesian Inference, Maximum Likelihood, and Maximum Parsimony.
Phylogenetic analysis of 29 species of Lysmata, eight genera of Hippolytidae and two genera of Barbouriidae based on single-gene (16S, 28S) and combined-gene (16S+28S) approaches indicates that three groups of Lysmata differentiate according to antennular morphology: (1) Lysmata, having a multi-segmented accessory branch, (2) Hippolysmata (prior to Chace 1972), with a one-segmented accessory branch, and (3) a third group of Lysmata outliers, with a one-segmented unguiform accessory branch and close affinity to the genera Exhippolysmata and Lysmatella. The monophyly of the clade bearing a multi-segmented accessory branch is robust. Within the short accessory branch clade, species with specialized cleaning behaviors form a monophyletic clade; however, the integrity of the clade was sensitive to alignment criteria. Other hippolytid and barbouriid genera used in the analysis are basal to these three groups, including one displaying simultaneous hermaphroditism (Parhippolyte). The two barbouriid species occur in a separate clade, but among hippolytid taxa.
The data support the historical morphological division of Lysmata into clades based on accessory branch morphology. The position of the "cleaner" shrimps indicates that specialized cleaning behavior is a derived trait. The topologies of the cladograms support the monophyly of the barbouriids, but do not support their elevation to familial status. Taxa ancestral to the genus Lysmata display simultaneous hermaphroditism, suggesting that this life history trait evolved outside the genus Lysmata.
Bushbabies (Galagidae) are among the most morphologically cryptic of all primates and their diversity and relationships are some of the most longstanding problems in primatology. Our knowledge of galagid evolutionary history has been limited by a lack of appropriate molecular data and a paucity of fossils. Most phylogenetic studies have produced conflicting results for many clades, and even the relationships among genera remain uncertain. To clarify galagid evolutionary history, we assembled the largest molecular dataset for galagos to date by sequencing 27 independent loci. We inferred phylogenetic relationships using concatenated maximum-likelihood and Bayesian analyses, and also coalescent-based species tree methods to account for gene tree heterogeneity due to incomplete lineage sorting.
The genus Euoticus was identified as sister taxon to the rest of the galagids and the genus Galagoides was not recovered as monophyletic, suggesting that a new generic name for the Zanzibar complex is required. Despite the amount of genetic data collected in this study, the monophyly of the family Lorisidae remained poorly supported, probably due to the short internode between the Lorisidae/Galagidae split and the origin of the African and Asian lorisid clades. One major result was the relatively old origin for the most recent common ancestor of all living galagids soon after the Eocene-Oligocene boundary.
Using a multilocus approach, our results suggest an early origin for the crown Galagidae, soon after the Eocene-Oligocene boundary, making Euoticus one of the oldest lineages within extant Primates. This result also implies that one – or possibly more – stem radiations diverged in the Late Eocene and persisted for several million years alongside members of the crown group.
Concatenation; Species tree; Divergence times; Nuclear DNA; Eocene-Oligocene boundary; Strepsirhini; Lorisoidea
Metrics of phylogenetic tree reliability, such as parametric bootstrap percentages or Bayesian posterior probabilities, represent internal measures of the topological reproducibility of a phylogenetic tree, while the recently introduced aLRT (approximate likelihood ratio test) assesses the likelihood that a branch exists on a maximum-likelihood tree. Although those values are often equated with phylogenetic tree accuracy, they do not necessarily estimate how well a reconstructed phylogeny represents cladistic relationships that actually exist in nature. The authors have therefore attempted to quantify how well bootstrap percentages, posterior probabilities, and aLRT measures reflect the probability that a deduced phylogenetic clade is present in a known phylogeny. The authors simulated the evolution of bacterial genes of varying lengths under biologically realistic conditions, and reconstructed those known phylogenies using both maximum likelihood and Bayesian methods. Then, they measured how frequently clades in the reconstructed trees exhibiting particular bootstrap percentages, aLRT values, or posterior probabilities were found in the true trees. The authors have observed that none of these values correlate with the probability that a given clade is present in the known phylogeny. The major conclusion is that none of the measures provide any information about the likelihood that an individual clade actually exists. It is also found that the mean of all clade support values on a tree closely reflects the average proportion of all clades that have been assigned correctly, and is thus a good representation of the overall accuracy of a phylogenetic tree.
The construction of phylogenetic trees, which depict past relationships between groups of DNA or protein sequences, has valuable application in many fields of study, most commonly evolutionary and population biology. Before drawing conclusions from phylogenetic trees, it is important to assess how accurate those reconstructions are. This is typically accomplished by examining measures of “clade credibility” (such as bootstrap or posterior probability values), which represent how reproducible relationships are within the tree based on the parameters of the phylogenetic analysis. However, such measures do not necessarily reflect how likely inferred relationships are to have actually occurred in nature. Therefore, using simulated data where relationships are known, we have determined how well several measures of clade credibility correlate with the likelihood that a deduced phylogenetic grouping actually exists in reality. Surprisingly, we found no such correlation, and that the inferred relationships were correctly assigned about as often in cases where clade credibility values were very low as where they were high. This finding suggests that current measures of phylogenetic tree reliability are not useful in predicting whether specific inferred relationships have actually occurred.
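The comparison described above, between clade support values and the presence of those clades in a known tree, can be sketched as a simple binning exercise. The following is a minimal illustration under stated assumptions, not the authors' procedure: the function name `accuracy_by_support`, the bin edges, and the six toy clades are all invented for this example and carry no empirical weight.

```python
def accuracy_by_support(clades, edges=(0.0, 0.5, 0.7, 0.9, 1.0)):
    """Bin clades by support and report the fraction found in the true tree.

    clades: iterable of (support in [0, 1], present_in_true_tree) pairs.
    Returns one (lower, upper, n_clades, fraction_or_None) tuple per bin.
    """
    bins = [[0, 0] for _ in range(len(edges) - 1)]  # [n clades, n present]
    for support, present in clades:
        for i in range(len(bins)):
            # The last bin is closed on the right so support == 1.0 is counted.
            below_upper = support < edges[i + 1] or i == len(bins) - 1
            if edges[i] <= support and below_upper:
                bins[i][0] += 1
                bins[i][1] += int(present)
                break
    return [
        (edges[i], edges[i + 1], n, (hits / n if n else None))
        for i, (n, hits) in enumerate(bins)
    ]

# Invented toy data: (support value, clade present in the simulated true tree).
toy = [(0.95, True), (0.92, False), (0.75, True),
       (0.60, True), (0.30, False), (0.40, True)]

table = accuracy_by_support(toy)
mean_support = sum(s for s, _ in toy) / len(toy)
overall_accuracy = sum(p for _, p in toy) / len(toy)

for row in table:
    print(row)
print(mean_support, overall_accuracy)
```

The per-bin fractions address the question of whether an individual support value predicts that a clade is real, whereas the last two quantities correspond to the tree-wide comparison of mean support with the overall proportion of correct clades.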
All characters and trait systems in an organism share a common evolutionary history that can be estimated using phylogenetic methods. However, differential rates of change and the evolutionary mechanisms driving those rates result in pervasive phylogenetic conflict. These drivers need to be uncovered because mismatches between evolutionary processes and phylogenetic models can lead to high confidence in incorrect hypotheses. Incongruence between phylogenies derived from morphological versus molecular analyses, and between trees based on different subsets of molecular sequences has become pervasive as datasets have expanded rapidly in both characters and species. For more than a decade, evolutionary relationships among members of the New World bat family Phyllostomidae inferred from morphological and molecular data have been in conflict. Here, we develop and apply methods to minimize systematic biases, uncover the biological mechanisms underlying phylogenetic conflict, and outline data requirements for future phylogenomic and morphological data collection. We introduce new morphological data for phyllostomids and outgroups and expand previous molecular analyses to eliminate methodological sources of phylogenetic conflict such as taxonomic sampling, sparse character sampling, or use of different algorithms to estimate the phylogeny. We also evaluate the impact of biological sources of conflict: saturation in morphological changes and molecular substitutions, and other processes that result in incongruent trees, including convergent morphological and molecular evolution. Methodological sources of incongruence play some role in generating phylogenetic conflict, and are relatively easy to eliminate by matching taxa, collecting more characters, and applying the same algorithms to optimize phylogeny. 
The evolutionary patterns uncovered are consistent with multiple biological sources of conflict, including saturation in morphological and molecular changes, adaptive morphological convergence among nectar-feeding lineages, and incongruent gene trees. Applying methods to account for nucleotide sequence saturation reduces, but does not completely eliminate, phylogenetic conflict. We ruled out paralogy, lateral gene transfer, and poor taxon sampling and outgroup choices among the processes leading to incongruent gene trees in phyllostomid bats. Uncovering and countering the possible effects of introgression and lineage sorting of ancestral polymorphism on gene trees will require great leaps in genomic and allelic sequencing in this species-rich mammalian family. We also found evidence for adaptive molecular evolution leading to convergence in mitochondrial proteins among nectar-feeding lineages. In conclusion, the biological processes that generate phylogenetic conflict are ubiquitous, and overcoming incongruence requires better models and more data than have been collected even in well-studied organisms such as phyllostomid bats.
adaptive convergence; incongruence; gene trees; partitioned likelihood support; phylogeny; Phyllostomidae; saturation; species trees
The fern genus Dryopteris (Dryopteridaceae) is among the most common and species-rich fern genera in temperate forests in the northern hemisphere, containing 225–300 species worldwide. The circumscription of Dryopteris has been controversial and various related genera have, over time, been included in and excluded from Dryopteris. The infrageneric phylogeny has largely remained unclear, and the placement of the majority of the supraspecific taxa of Dryopteris has never been tested using molecular data.
In this study, DNA sequences of four plastid loci (rbcL gene, rps4-trnS spacer, trnL intron, trnL-F spacer) were used to reconstruct the phylogeny of Dryopteris. A total of 122 accessions were sampled in our analysis, representing 100 species of the expanded Dryopteris including Acrophorus, Acrorumohra, Diacalpe, Dryopsis, Nothoperanema, and Peranema. All four subgenera and 19 sections currently recognized in Dryopteris s.s. are included. One species each of Arachniodes, Leptorumohra, and Lithostegia of Dryopteridaceae are used as outgroups. Our study confirms the paraphyly of Dryopteris and provides the first strong molecular evidence for the monophyly of Acrophorus, Diacalpe, Dryopsis, Nothoperanema, and Peranema. However, all these monophyletic groups together with the paraphyletic Acrorumohra are suggested to be merged into Dryopteris based on both molecular and morphological evidence. Our analysis identified 13 well-supported monophyletic groups. Each of the 13 clades is additionally supported by morphological synapomorphies and is inferred to represent a major evolutionary lineage in Dryopteris. In contrast, monophyly of the four subgenera and 15 out of 19 sections currently recognized in Dryopteris s.s. is not supported by plastid data.
The genera Acrophorus, Acrorumohra, Diacalpe, Dryopsis, Nothoperanema, and Peranema should all be merged into Dryopteris. Most species of these genera share a short rhizome and catadromic arrangement of frond segments, unlike the sister genus of Dryopteris s.l., Arachniodes, which has anadromic arrangement of frond segments. The non-monophyly of 19 of the 21 supraspecific taxa (sections, subgenera) in Dryopteris strongly suggests that the current taxonomy of this genus is in need of revision. The disagreement between the previous taxonomy and molecular results in Dryopteris may be due partly to interspecific hybridization and polyploidization. More morphological studies and molecular data, especially from the nuclear genome, are needed to thoroughly elucidate the evolutionary history of Dryopteris. The 13 well-supported clades identified based on our data represent 13 major evolutionary lineages in Dryopteris that are also supported by morphological synapomorphies.
Debate regarding the monophyly and relationships of the avian order Pelecaniformes represents a classic example of discord between morphological and molecular estimates of phylogeny. This lack of consensus hampers interpretation of the group's fossil record, which has major implications for understanding patterns of character evolution (e.g., the evolution of wing-propelled diving) and temporal diversification (e.g., the origins of modern families). Relationships of the Pelecaniformes were inferred through parsimony analyses of an osteological dataset encompassing 59 taxa and 464 characters. The relationships of the Plotopteridae, an extinct family of wing-propelled divers, and several other fossil pelecaniforms (Limnofregata, Prophaethon, Lithoptila, ?Borvocarbo stoeffelensis) were also assessed. The antiquity of these taxa and their purported status as stem members of extant families makes them valuable for studies of higher-level avian diversification.
Pelecaniform monophyly is not recovered, with Phaethontidae recovered as distantly related to all other pelecaniforms, which are supported as a monophyletic Steganopodes. Some anatomical partitions of the dataset possess different phylogenetic signals, and partitioned analyses reveal that these discrepancies are localized outside of Steganopodes, and primarily due to a few labile taxa. The Plotopteridae are recovered as the sister taxon to Phalacrocoracoidea, and the relationships of other fossil pelecaniforms representing key calibration points are well supported, including Limnofregata (sister taxon to Fregatidae), Prophaethon and Lithoptila (successive sister taxa to Phaethontidae), and ?Borvocarbo stoeffelensis (sister taxon to Phalacrocoracidae). These relationships are invariant when ‘backbone’ constraints based on recent avian phylogenies are imposed.
Relationships of extant pelecaniforms inferred from morphology are more congruent with molecular phylogenies than previously assumed, though notable conflicts remain. The phylogenetic position of the Plotopteridae implies that wing-propelled diving evolved independently in plotopterids and penguins, representing a remarkable case of convergent evolution. Despite robust support for the placement of fossil taxa representing key calibration points, the successive outgroup relationships of several “stem fossil + crown family” clades are variable and poorly supported across recent studies of avian phylogeny. Thus, the impact these fossils have on inferred patterns of temporal diversification depends heavily on the resolution of deep nodes in avian phylogeny.
Non-parametric bootstrapping is a widely used statistical procedure for assessing confidence of model parameters based on the empirical distribution of the observed data and, as such, it has become a common method for assessing tree confidence in phylogenetics. Traditional non-parametric bootstrapping does not weigh each tree inferred from resampled (i.e., pseudo-replicated) sequences. Hence, the quality of these trees is not taken into account when computing bootstrap scores associated with the clades of the original phylogeny. As a consequence, traditionally, the trees with different bootstrap support or those providing a different fit to the corresponding pseudo-replicated sequences (the fit quality can be expressed through the LS, ML or parsimony score) contribute in the same way to the computation of the bootstrap support of the original phylogeny.
In this article, we discuss the idea of applying weighted bootstrapping to phylogenetic reconstruction by weighting each phylogeny inferred from resampled sequences. Tree weights can be based either on the least-squares (LS) tree estimate or on the average secondary bootstrap score (SBS) associated with each resampled tree. Secondary bootstrapping consists of the estimation of bootstrap scores of the trees inferred from resampled data. The LS and SBS-based bootstrapping procedures were designed to take into account the quality of each "pseudo-replicated" phylogeny in the final tree estimation. A simulation study was carried out to evaluate the performances of the five weighting strategies which are as follows: LS and SBS-based bootstrapping, LS and SBS-based bootstrapping with data normalization and the traditional unweighted bootstrapping.
The simulations conducted with two real data sets and the five weighting strategies suggest that the SBS-based bootstrapping with the data normalization usually exhibits larger bootstrap scores and a higher robustness compared to the four other competing strategies, including the traditional bootstrapping. The high robustness of the normalized SBS could be particularly useful in situations where observed sequences have been affected by noise or have undergone massive insertion or deletion events. The results provided by the four other strategies were very similar regardless of the noise level, thus also demonstrating the stability of the traditional bootstrapping method.
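The difference between unweighted and weighted bootstrap support can be sketched in a few lines. This is an illustration under assumptions, not the procedure evaluated above: how LS or SBS scores are converted into tree weights is not specified in the text, so the weights are treated here as given inputs, and the function name `weighted_support` and the toy replicates are hypothetical.

```python
def weighted_support(clade, replicate_clades, weights=None):
    """Share of total replicate weight carried by trees that contain `clade`.

    replicate_clades: one set of clades (frozensets of taxa) per bootstrap tree.
    weights: one non-negative weight per replicate; None gives every replicate
             the same weight, which reduces to the traditional bootstrap score.
    """
    if weights is None:
        weights = [1.0] * len(replicate_clades)
    if len(weights) != len(replicate_clades):
        raise ValueError("need exactly one weight per replicate tree")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    hit = sum(w for clades, w in zip(replicate_clades, weights) if clade in clades)
    return hit / total

ab = frozenset({"A", "B"})
ac = frozenset({"A", "C"})

# Four invented bootstrap replicates; three of them contain clade {A, B}.
replicates = [{ab}, {ab}, {ac}, {ab}]

# Assumed weights, e.g. derived from some measure of replicate tree quality.
quality = [4, 3, 2, 1]

traditional = weighted_support(ab, replicates)
weighted = weighted_support(ab, replicates, quality)
print(traditional, weighted)
```

In this toy case the replicate lacking the clade carries little weight, so the weighted score exceeds the traditional one; with a different assignment of weights the opposite could equally occur.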
Reconciling traditional classifications, morphology, and the phylogenetic relationships of brown-spored agaric mushrooms has proven difficult in many groups, due to extensive convergence in morphological features. Here, we address the monophyly of the Bolbitiaceae, a family with over 700 described species, and examine the higher-level relationships within the family using a newly constructed multilocus dataset (ITS, nrLSU rDNA and EF1-alpha). We tested whether the fast-evolving Internal Transcribed Spacer (ITS) sequences can be accurately aligned across the family, by comparing the outcome of two iterative alignment refining approaches (one automated and one manual) and various indel-treatment strategies. We used PRANK to align sequences in both cases. Our results suggest that, although PRANK successfully avoids overmatching of gapped sites (previously referred to as alignment overmatching), it infers an unrealistically high number of indel events with natively generated guide trees. This 'alignment undermatching' could be avoided by using more rigorous (e.g. ML) guide trees. The trees inferred in this study support the monophyly of the core Bolbitiaceae, with the exclusion of Panaeolus, Agrocybe, and some of the genera formerly placed in the family. Bolbitius and Conocybe were found to be monophyletic; however, Pholiotina and Galerella require redefinition. The phylogeny revealed that stipe coverage type is a poor predictor of phylogenetic relationships, indicating the need for a revision of the intrageneric relationships within Conocybe.
The genus Algansea is one of the most representative freshwater fish groups in central Mexico due to its wide geographic distribution and unusual level of endemicity. Despite the small number of species, this genus has had an unsettled taxonomic history due to high levels of intraspecific morphological variation. Moreover, several phylogenetic hypotheses among congeners have been proposed but have had the following shortcomings: the use of homoplasious morphological characters, the use of character codification and polarisation methods that lacked objectivity, and incomplete taxonomic sampling. In this study, a phylogenetic analysis among species of Algansea is presented. This analysis is based upon two molecular markers, the mitochondrial gene cytochrome b and the first intron of the ribosomal protein S7 gene.
Bayesian analysis based on a combined matrix (cytochrome b and first intron S7) showed that Algansea is a monophyletic group and that Agosia chrysogaster is the sister group. Divergence times dated the origin of the genus around 16.6 MYA, with subsequent cladogenetic events occurring between 6.4 and 2.8 MYA. When mapped onto the molecular phylogenetic hypothesis, the character states of three morphological characters did not support previous hypotheses on the evolution of morphological traits in the genus Algansea, whereas the character states of the remaining six characters partially corroborated those hypotheses.
Monophyly of the genus Algansea was corroborated in this study. Tree topology shows the genus consists of three main lineages: Central-Eastern, Western, and Southern clades. However, the relationships among these clades remained unresolved. Congruence found between the available geological and climatic history and the divergence times made it possible to infer the biogeographical history of Algansea, which suggested that vicariance events were responsible for the evolutionary history of the genus. Interestingly, this pattern was shared with other members of the freshwater fish fauna of central Mexico. In addition, molecular data also show that some morphological traits alleged to represent synapomorphies in previous studies were actually homoplasies. Other traits were corroborated as synapomorphies, particularly in those species of a subgroup corresponding with the Central-Eastern clade within Algansea; this corroboration is interpreted as a result of evolutionary adaptations.
The protistan phylum Apicomplexa contains many important pathogens and is the subject of intense genome sequencing efforts. Based upon the genome sequences from seven apicomplexan species and a ciliate outgroup, we identified 268 single-copy genes suitable for phylogenetic inference. Both concatenation and consensus approaches inferred the same species tree topology. This topology is consistent with most prior conceptions of apicomplexan evolution based upon ultrastructural and developmental characters, that is, the piroplasm genera Theileria and Babesia form the sister group to the Plasmodium species, the coccidian genera Eimeria and Toxoplasma are monophyletic and are the sister group to the Plasmodium species and piroplasm genera, and Cryptosporidium forms the sister group to the above mentioned with the ciliate Tetrahymena as the outgroup. The level of incongruence among gene trees appears to be high at first glance; only 19% of the genes support the species tree, and a total of 48 different gene-tree topologies are observed. Detailed investigations suggest that the low signal-to-noise ratio in many genes may be the main source of incongruence. The probability of being consistent with the species tree increases as a function of the minimum bootstrap support observed at tree nodes for a given gene tree. Moreover, gene sequences that generate high bootstrap support are robust to the changes in alignment parameters or phylogenetic method used. However, caution should be taken in that some genes can infer a “wrong” tree with strong support because of paralogy, model violations, or other causes. The importance of examining multiple, unlinked genes that possess a strong phylogenetic signal cannot be overstated.
Apicomplexa; genome scale; phylogeny; bootstrap; long-branch attraction; taxon sampling
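The relationship described above, between the minimum bootstrap support on a gene tree and its agreement with the species tree, can be examined by filtering genes at increasing support cutoffs. The sketch below assumes each gene has already been summarized as a pair (minimum bootstrap across nodes, whether its topology matches the species tree); the function name `agreement_by_threshold`, the cutoffs, and the eight toy genes are invented for illustration and are not values from the study.

```python
def agreement_by_threshold(genes, thresholds=(0, 50, 70, 90)):
    """For each cutoff, keep genes whose minimum bootstrap is >= cutoff and
    report (cutoff, genes kept, fraction matching the species tree)."""
    rows = []
    for cutoff in thresholds:
        kept = [match for support, match in genes if support >= cutoff]
        fraction = sum(kept) / len(kept) if kept else None
        rows.append((cutoff, len(kept), fraction))
    return rows

# Invented toy data: (minimum bootstrap %, gene tree matches species tree).
toy_genes = [(95, True), (91, True), (88, True), (72, True),
             (65, False), (40, False), (35, False), (20, False)]

rows = agreement_by_threshold(toy_genes)
for cutoff, n_kept, fraction in rows:
    print(cutoff, n_kept, fraction)
```

A filter of this kind trades the number of retained genes against the strength of their signal, and it offers no protection against the strongly supported but incorrect trees that the text attributes to paralogy or model violations.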
The order Gruiformes, for which even familial composition remains controversial, is perhaps the least well understood avian order from a phylogenetic perspective. The history of the systematics of the order is presented, and the ecological and biogeographic characteristics of its members are summarized. Using cladistic techniques, phylogenetic relationships among fossil and modern genera of the Gruiformes were estimated based on 381 primarily osteological characters; relationships among modern species of Grues (Psophiidae, Aramidae, Gruidae, Heliornithidae and Rallidae) were assessed based on these characters augmented by 189 characters of the definitive integument. A strict consensus tree for 20,000 shortest trees compiled for the matrix of gruiform genera (length = 967, CI = 0.517) revealed a number of nodes common to the solution set, many of which were robust to bootstrapping and had substantial support (Bremer) indices. Robust nodes included those supporting: a sister relationship between the Pedionomidae and Turnicidae; monophyly of the Gruiformes exclusive of the Pedionomidae and Turnicidae; a sister relationship between the Cariamidae and Phorusrhacoidea; a sister relationship between a clade comprising Eurypyga and Messelornis and one comprising Rhynochetos and Aptornis; monophyly of the Grues (Psophiidae, Aramidae, Gruidae, Heliornithidae and Rallidae); monophyly of a clade (Gruoidea) comprising (in order of increasingly close relationship) Psophia, Aramus, Balearica and other Gruidae, with monophyly of each member in this series confirmed; a sister relationship between the Heliornithidae and Rallidae; and monophyly of the Rallidae exclusive of Himantornis. Autapomorphic divergence was comparatively high for Pedionomus, Eurypyga, Psophia, Himantornis and Fulica; extreme autapomorphy, much of which is unique for the order, characterized the extinct, flightless Aptornis. 
In the species-level analysis of modern Grues, special efforts were made to limit the analytical impacts of homoplasy related to flightlessness in a number of rallid lineages. A strict consensus tree of 20,000 shortest trees compiled (length = 1232, CI = 0.463) confirmed the interfamilial relationships resolved in the ordinal analysis and established a number of other, variably supported groups within the Rallidae. Groupings within the Rallidae included: monophyly of Rallidae exclusive of Himantornis and a clade comprising Porphyrio (including Notornis) and Porphyrula; a poorly resolved, basal group of genera including Gymnocrex, Habroptila, Eulabeornis, Aramides, Canirallus and Mentocrex; an intermediate grade comprising Anurolimnas, Amaurolimnas, and Rougetius; monophyly of two major subdivisions of remaining rallids, one comprising Rallina (paraphyletic), Rallicula, and Sarothrura, and the other comprising the apparently paraphyletic 'long-billed' rails (e.g. Pardirallus, Cyanolimnas, Rallus, Gallirallus and Cabalus) and a variably resolved clade comprising 'crakes' (e.g. Atlantisia, Laterallus and Porzana), waterhens (Amaurornis), moorhens (Gallinula and allied genera) and coots (Fulica). Relationships among 'crakes' remain poorly resolved; Laterallus may be paraphyletic, and Porzana is evidently polyphyletic and poses substantial challenges for reconciliation with current taxonomy. Relationships among the species of waterhens, moorhens and coots, however, were comparatively well resolved, and exhaustive, fine-scale analyses of several genera (Grus, Porphyrio, Aramides, Rallus, Laterallus and Fulica) and species complexes (Porphyrio porphyrio-group, Gallirallus philippensis-group and Fulica americana-group) revealed additional topological likelihoods. Many nodes shared by a majority of the shortest trees under equal weighting were common to all shortest trees found following one or two iterations of successive weighting of characters.
Provisional placements of selected subfossil rallids (e.g. Diaphorapteryx, Aphanapteryx and Capellirallus) were based on separate heuristic searches using the strict consensus tree for modern rallids as a backbone constraint. These analyses were considered with respect to assessments of robustness, homoplasy related to flightlessness, challenges and importance of fossils in cladistic analysis, previously published studies and biogeography, and an annotated phylogenetic classification of the Gruiformes is proposed.
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.
Species trees depict how species split and diverge. Within the branches of a species tree, gene trees, which depict the evolutionary histories of different genomic regions in the species, grow. Evolutionary analyses of the genomes of closely related organisms have highlighted the phenomenon that gene trees may disagree with each other as well as with the species tree that contains them due to deep coalescence. Furthermore, for several groups of organisms, hybridization plays an important role in their evolution and diversification. This evolutionary event also results in gene tree incongruence and gives rise to a species phylogeny that is a network. Thus, inferring the evolutionary histories of groups of organisms where hybridization is known, or suspected, to play an evolutionary role requires dealing simultaneously with hybridization and other sources of gene tree incongruence. Currently, no methods exist for doing this with general scenarios of hybridization. In this paper, we propose the first method for this task and demonstrate its performance. We revisit the analysis of a set of yeast species and another of Drosophila species, and show that evolutionary histories involving hybridization have higher support than the strictly diverging evolutionary histories estimated when not incorporating hybridization in the analysis.
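The interplay of deep coalescence and hybridization can be made concrete in the smallest possible case. The sketch below is not the method proposed above, which handles general networks; it assumes three species, one sampled lineage per species, and a single hybrid species B, so that the B lineage follows exactly one of its two parents and the network probability is a mixture of two three-taxon trees. The tree probabilities use the standard multispecies coalescent result for three taxa, with the internal branch length t in coalescent units. The function names and the parameters `gamma`, `t_ab` and `t_bc` are hypothetical.

```python
import math

TOPOLOGIES = ("((A,B),C)", "((A,C),B)", "((B,C),A)")

def tree_probs(matching, t):
    """Gene tree topology probabilities on a three-taxon species tree.

    matching: the topology that agrees with the species tree.
    t: internal branch length in coalescent units (t >= 0).
    The matching topology has probability 1 - (2/3) exp(-t); each of the two
    discordant topologies has probability (1/3) exp(-t).
    """
    discordant = math.exp(-t) / 3.0
    return {
        topo: (1.0 - 2.0 * discordant) if topo == matching else discordant
        for topo in TOPOLOGIES
    }

def network_probs(gamma, t_ab, t_bc):
    """Mixture for a network in which species B is a hybrid.

    gamma: probability that the B lineage traces back through the parent
           related to A (assumed inheritance probability).
    t_ab, t_bc: internal branch lengths of the two parental trees.
    """
    via_a = tree_probs("((A,B),C)", t_ab)
    via_c = tree_probs("((B,C),A)", t_bc)
    return {
        topo: gamma * via_a[topo] + (1.0 - gamma) * via_c[topo]
        for topo in TOPOLOGIES
    }

strict_tree = tree_probs("((A,B),C)", 1.0)
with_hybrid = network_probs(0.7, 1.0, 1.0)
print(strict_tree)
print(with_hybrid)
```

Under a strictly diverging tree the two discordant topologies are expected at equal frequency, whereas the mixture makes them unequal; an asymmetry of this kind is what allows gene tree topologies to carry information about hybridization in this simplified setting.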
North American Agalinis Raf. species represent a taxonomically challenging group and there have been extensive historical revisions at the species, section, and subsection levels of classification. The genus contains many rare species, including the federally listed endangered species Agalinis acuta. In addition to evaluating the degree to which historical classifications at the section and subsection levels are supported by molecular data sampled from 79 individuals representing 29 Agalinis species, we assessed the monophyly of 27 species by sampling multiple individuals representing different populations of those species. Twenty-one of these species are of conservation concern in at least some part of their range.
Phylogenetic relationships estimated using maximum likelihood analyses of seven chloroplast DNA loci (aligned length = 11 076 base pairs (bp)) and the nuclear ribosomal DNA ITS (internal transcribed spacer) locus (733 bp) indicated no support for the historically recognized sections except for Section Erectae. Our results suggest that North American members of the genus comprise six major lineages; however, we were not able to resolve branching order among many of these lineages. Monophyly of 24 of the 29 sampled species was supported based on significant branch lengths of, and high bootstrap support for, subtending branches. However, there was no statistical support for the monophyly of A. acuta with respect to Agalinis tenella and Agalinis decemloba. Although most species were supported, deeper relationships among many species remain ambiguous.
The North American Agalinis species sampled form a well-supported, monophyletic group within the family Orobanchaceae relative to the outgroups sampled. Most hypotheses regarding section- and subsection-level relationships based on morphology were not supported and taxonomic revisions are warranted. Lack of support for monophyly of Agalinis acuta leaves the important question regarding its taxonomic status unanswered. Lack of resolution is potentially due to incomplete lineage sorting of ancestral polymorphisms among recently diverged species; however, the gene regions examined did distinguish among almost all other species in the genus. Due to the important policy implications of this finding, we are further evaluating the evolutionary distinctiveness of A. acuta using morphological data and loci with higher mutation rates.