
Results 1-16 (16)

1.  Phylogeny of the cycads based on multiple single-copy nuclear genes: congruence of concatenated parsimony, likelihood and species tree inference methods 
Annals of Botany  2013;112(7):1263-1278.
Background and aims
Despite a recent new classification, a stable phylogeny for the cycads has been elusive, particularly regarding resolution of Bowenia, Stangeria and Dioon. In this study, five single-copy nuclear genes (SCNGs) are applied to the phylogeny of the order Cycadales. The specific aim is to evaluate several gene tree–species tree reconciliation approaches for developing an accurate phylogeny of the order, to contrast them with concatenated parsimony analysis and to resolve the erstwhile problematic phylogenetic position of these three genera.
DNA sequences of five SCNGs were obtained for 20 cycad species representing all ten genera of Cycadales. These were analysed with parsimony, maximum likelihood (ML) and three Bayesian methods of gene tree–species tree reconciliation, using Cycas as the outgroup. A calibrated date estimation was developed with Bayesian methods, and biogeographic analysis was also conducted.
Key Results
Concatenated parsimony, ML and three species tree inference methods resolve exactly the same tree topology with high support at most nodes. Dioon and Bowenia are the first and second branches of Cycadales after Cycas, respectively, followed by an encephalartoid clade (Macrozamia–Lepidozamia–Encephalartos), which is sister to a zamioid clade, of which Ceratozamia is the first branch, and in which Stangeria is sister to Microcycas and Zamia.
A single, well-supported phylogenetic hypothesis of the generic relationships of the Cycadales is presented. However, massive extinction events inferred from the fossil record that eliminated broader ancestral distributions within Zamiaceae compromise accurate optimization of ancestral biogeographical areas for that hypothesis. While major lineages of Cycadales are ancient, crown ages of all modern genera are no older than 12 million years, supporting a recent hypothesis of mostly Miocene radiations. This phylogeny can contribute to an accurate infrafamilial classification of Zamiaceae.
PMCID: PMC3806525  PMID: 23997230
Biogeography; Cycadales; Bowenia; Stangeria; Dioon; gymnosperms; molecular systematics
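The genus-level topology described in the key results of entry 1 can be written compactly in Newick notation. The sketch below is an illustration only, not the authors' tree file: the string `CYCAD_NEWICK` and the helpers `parse_newick` and `leaves` are hypothetical names, and because the abstract does not give the branching order inside the encephalartoid clade, that clade is left as an unresolved three-way split.

```python
# Genus-level topology as described in the abstract, written in Newick.
# Assumption: the abstract does not state the branching order inside the
# encephalartoid clade, so it is left as an unresolved three-way split.
CYCAD_NEWICK = (
    "(Cycas,(Dioon,(Bowenia,("
    "(Macrozamia,Lepidozamia,Encephalartos),"
    "(Ceratozamia,(Stangeria,(Microcycas,Zamia)))"
    "))));"
)


def parse_newick(text):
    """Parse a label-only Newick string into nested tuples of names."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
            return tuple(children)
        # Leaf: read the label up to the next structural character.
        start = pos
        while text[pos] not in ",();":
            pos += 1
        return text[start:pos]

    return node()


def leaves(tree):
    """Return the leaf labels of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


tree = parse_newick(CYCAD_NEWICK)
genera = leaves(tree)
print(len(genera), "genera:", ", ".join(genera))
```

Reading the nested tuples from the outside in reproduces the branching order in the abstract: Cycas as outgroup, then Dioon, then Bowenia, then the encephalartoid clade sister to the zamioid clade.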
2.  Data access for the 1,000 Plants (1KP) project 
GigaScience  2014;3:17.
The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets.
PMCID: PMC4306014  PMID: 25625010
Viridiplantae; Biodiversity; Transcriptomes; Phylogenomics; Interactions; Pathways
3.  Between Two Fern Genomes 
GigaScience  2014;3:15.
Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves.
PMCID: PMC4199785  PMID: 25324969
Azolla; Ceratopteris; Comparative analyses; Ferns; Genomics; Land plants; Monilophytes
4.  Transcriptome-Mining for Single-Copy Nuclear Markers in Ferns 
PLoS ONE  2013;8(10):e76957.
Molecular phylogenetic investigations have revolutionized our understanding of the evolutionary history of ferns—the second-most species-rich major group of vascular plants, and the sister clade to seed plants. The general absence of genomic resources available for this important group of plants, however, has resulted in the strong dependence of these studies on plastid data; nuclear or mitochondrial data have been rarely used. In this study, we utilize transcriptome data to design primers for nuclear markers for use in studies of fern evolutionary biology, and demonstrate the utility of these markers across the largest order of ferns, the Polypodiales.
Principal Findings
We present 20 novel single-copy nuclear regions, across 10 distinct protein-coding genes: ApPEFP_C, cryptochrome 2, cryptochrome 4, DET1, gapCpSh, IBR3, pgiC, SQD1, TPLATE, and transducin. These loci, individually and in combination, show strong resolving power across the Polypodiales phylogeny, and are readily amplified and sequenced from our genomic DNA test set (from 15 diploid Polypodiales species). For each region, we also present transcriptome alignments of the focal locus and related paralogs—curated broadly across ferns—that will allow researchers to develop their own primer sets for fern taxa outside of the Polypodiales. Analyses of sequence data generated from our genomic DNA test set reveal strong effects of partitioning schemes on support levels and, to a much lesser extent, on topology. A model partitioned by codon position is strongly favored, and analyses of the combined data yield a Polypodiales phylogeny that is well-supported and consistent with earlier studies of this group.
The 20 single-copy regions presented here more than triple the single-copy nuclear regions available for use in ferns. They provide a much-needed opportunity to assess plastid-derived hypotheses of relationships within the ferns, and increase our capacity to explore aspects of fern evolution previously unavailable to scientific investigation.
PMCID: PMC3792871  PMID: 24116189
5.  Evolution of the Class IV HD-Zip Gene Family in Streptophytes 
Molecular Biology and Evolution  2013;30(10):2347-2365.
Class IV homeodomain leucine zipper (C4HDZ) genes are plant-specific transcription factors that, based on phenotypes in Arabidopsis thaliana, play an important role in epidermal development. In this study, we sampled all major extant lineages and their closest algal relatives for C4HDZ homologs. Phylogenetic analyses result in a gene tree that mirrors land plant evolution, with evidence for gene duplications in many lineages but minimal evidence for gene losses. Our analysis suggests that an ancestral C4HDZ gene originated in an algal ancestor of land plants and that a single ancestral gene was present in the last common ancestor of land plants. Independent gene duplications are evident within several lineages including mosses, lycophytes, euphyllophytes, seed plants, and, most notably, angiosperms. In recently evolved angiosperm paralogs, we find evidence of pseudogenization via mutations in both coding and regulatory sequences. The increasing complexity of the C4HDZ gene family through the diversification of land plants correlates with increasing complexity in epidermal characters.
PMCID: PMC3773374  PMID: 23894141
gene family evolution; gene duplication; transcription factor; homeodomain leucine zipper
6.  Next-generation phenomics for the Tree of Life 
PLoS Currents  2013;5:ecurrents.tol.085c713acafc8711b2ff7010a4b03733.
The phenotype represents a critical interface between the genome and the environment in which organisms live and evolve. Phenotypic characters also are a rich source of biodiversity data for tree building, and they enable scientists to reconstruct the evolutionary history of organisms, including most fossil taxa, for which genetic data are unavailable. Therefore, phenotypic data are necessary for building a comprehensive Tree of Life. In contrast to molecular sequencing, which has become faster and cheaper through recent technological advances, phenotypic data collection often remains prohibitively slow and expensive. The next-generation phenomics project is a collaborative, multidisciplinary effort to leverage advances in image analysis, crowdsourcing, and natural language processing to develop and implement novel approaches for discovering and scoring the phenome, the collection of phenotypic characters for a species. This research represents a new approach to data collection that has the potential to transform phylogenetics research and to enable rapid advances in constructing the Tree of Life. Our goal is to assemble large phenomic datasets built using new methods and to provide the public and scientific community with tools for phenomic data assembly that will enable rapid and automated study of phenotypes across the Tree of Life.
PMCID: PMC3697239  PMID: 23827969
7.  The Plant Ontology as a Tool for Comparative Plant Anatomy and Genomic Analyses 
Plant and Cell Physiology  2012;54(2):e1.
The Plant Ontology (PO) is a publicly available, collaborative effort to develop and maintain a controlled, structured vocabulary (‘ontology’) of terms to describe plant anatomy, morphology and the stages of plant development. The goals of the PO are to link (annotate) gene expression and phenotype data to plant structures and stages of plant development, using the data model adopted by the Gene Ontology. From its original design covering only rice, maize and Arabidopsis, the scope of the PO has been expanded to include all green plants. The PO was the first multispecies anatomy ontology developed for the annotation of genes and phenotypes. Also, to our knowledge, it was one of the first biological ontologies that provides translations (via synonyms) in non-English languages such as Japanese and Spanish. As of Release #18 (July 2012), there are about 2.2 million annotations linking PO terms to >110,000 unique data objects representing genes or gene models, proteins, RNAs, germplasm and quantitative trait loci (QTLs) from 22 plant species. In this paper, we focus on the plant anatomical entity branch of the PO, describing the organizing principles, resources available to users and examples of how the PO is integrated into other plant genomics databases and web portals. We also provide two examples of comparative analyses, demonstrating how the ontology structure and PO-annotated data can be used to discover the patterns of expression of the LEAFY (LFY) and terpene synthase (TPS) gene homologs.
PMCID: PMC3583023  PMID: 23220694
Bioinformatics; Comparative genomics; Genome annotation; Ontology; Plant anatomy; Terpene synthase
8.  Ontologies as integrative tools for plant science 
American Journal of Botany  2012;99(8):1263-1275.
Premise of the study
Bio-ontologies are essential tools for accessing and analyzing the rapidly growing pool of plant genomic and phenomic data. Ontologies provide structured vocabularies to support consistent aggregation of data and a semantic framework for automated analyses and reasoning. They are a key component of the semantic web.
This paper provides background on what bio-ontologies are, why they are relevant to botany, and the principles of ontology development. It includes an overview of ontologies and related resources that are relevant to plant science, with a detailed description of the Plant Ontology (PO). We discuss the challenges of building an ontology that covers all green plants (Viridiplantae).
Key results
Ontologies can advance plant science in four key areas: (1) comparative genetics, genomics, phenomics, and development; (2) taxonomy and systematics; (3) semantic applications; and (4) education.
Bio-ontologies offer a flexible framework for comparative plant biology, based on common botanical understanding. As genomic and phenomic data become available for more species, we anticipate that the annotation of data with ontology terms will become less centralized, while at the same time, the need for cross-species queries will become more common, causing more researchers in plant science to turn to ontologies.
PMCID: PMC3492881  PMID: 22847540
bio-ontologies; genome annotation; OBO Foundry; phenomics; plant anatomy; plant genomics; Plant Ontology; plant systematics; semantic web
9.  A genome triplication associated with early diversification of the core eudicots 
Genome Biology  2012;13(1):R3.
Although it is agreed that a major polyploidy event, gamma, occurred within the eudicots, the phylogenetic placement of the event remains unclear.
To determine when this polyploidization occurred relative to speciation events in angiosperm history, we employed a phylogenomic approach to investigate the timing of gene set duplications located on syntenic gamma blocks. We populated 769 putative gene families with large sets of homologs obtained from public transcriptomes of basal angiosperms, magnoliids, asterids, and more than 91.8 gigabases of new next-generation transcriptome sequences of non-grass monocots and basal eudicots. The overwhelming majority (95%) of well-resolved gamma duplications was placed before the separation of rosids and asterids and after the split of monocots and eudicots, providing strong evidence that the gamma polyploidy event occurred early in eudicot evolution. Further, the majority of gene duplications was placed after the divergence of the Ranunculales and core eudicots, indicating that the gamma event appears to be restricted to core eudicots. Molecular dating estimates indicate that the duplication events were intensely concentrated around 117 million years ago.
The rapid radiation of core eudicot lineages that gave rise to nearly 75% of angiosperm species appears to have occurred coincidentally or shortly following the gamma triplication event. Reconciliation of gene trees with a species phylogeny can elucidate the timing of major events in genome evolution, even when genome sequences are only available for a subset of species represented in the gene trees. Comprehensive transcriptome datasets are valuable complements to genome sequences for high-resolution phylogenomic analysis.
PMCID: PMC3334584  PMID: 22280555
10.  Taking the First Steps towards a Standard for Reporting on Phylogenies: Minimal Information about a Phylogenetic Analysis (MIAPA) 
In the eight years since phylogenomics was introduced as the intersection of genomics and phylogenetics, the field has provided fundamental insights into gene function, genome history and organismal relationships. The utility of phylogenomics is growing with the increase in the number and diversity of taxa for which whole genome and large transcriptome sequence sets are being generated. We assert that the synergy between genomic and phylogenetic perspectives in comparative biology would be enhanced by the development and refinement of minimal reporting standards for phylogenetic analyses. Encouraged by the development of the Minimum Information About a Microarray Experiment (MIAME) standard, we propose a similar roadmap for the development of a Minimal Information About a Phylogenetic Analysis (MIAPA) standard. Key in the successful development and implementation of such a standard will be broad participation by developers of phylogenetic analysis software, phylogenetic database developers, practitioners of phylogenomics, and journal editors.
PMCID: PMC3167193  PMID: 16901231
11.  Are substitution rates and RNA editing correlated? 
RNA editing is a post-transcriptional process that, in seed plants, involves a cytosine to uracil change in messenger RNA, causing the translated protein to differ from that predicted by the DNA sequence. RNA editing occurs extensively in plant mitochondria, but large differences in editing frequencies are found in some groups. The underlying processes responsible for the distribution of edited sites are largely unknown, but gene function, substitution rate, and gene conversion have been proposed to influence editing frequencies.
We studied five mitochondrial genes in the monocot order Alismatales, all showing marked differences in editing frequencies among taxa. A general tendency to lose edited sites was observed in all taxa, but this tendency was particularly strong in two clades, with most of the edited sites lost in parallel in two different areas of the phylogeny. This pattern is observed in at least four of the five genes analyzed. Except in the groups that show an unusually low editing frequency, the rate of C-to-T changes in edited sites was not significantly higher than in non-edited 3rd codon positions. This may indicate that selection is not actively removing edited sites in nine of the 12 families of the core Alismatales. In all genes but ccmB, a significant correlation was found between frequency of change in edited sites and synonymous substitution rate. In general, taxa with higher substitution rates tend to have fewer edited sites, as indicated by the phylogenetically independent correlation analyses. The elimination of edited sites in groups that lack or have reduced levels of editing could be a result of gene conversion involving a cDNA copy (retroprocessing). If so, this phenomenon could be relatively common in the Alismatales, and may have affected some groups recurrently. Indirect evidence of retroprocessing without a necessary correlation with substitution rate was found mostly in families Alismataceae and Hydrocharitaceae (e.g., groups that suffered a rapid elimination of all their edited sites, without a change in substitution rate).
The effects of substitution rate, selection, and/or gene conversion on the dynamics of edited sites in plant mitochondria remain poorly understood. Although we found an inverse correlation between substitution rate and editing frequency, this correlation is partially obscured by gene retroprocessing in lineages that have lost most of their edited sites. The presence of processed paralogs in plant mitochondria deserves further study, since most evidence of their occurrence is circumstantial.
PMCID: PMC2989974  PMID: 21070620
12.  Using Phylogenomic Patterns and Gene Ontology to Identify Proteins of Importance in Plant Evolution 
We use measures of congruence on a combined expressed sequence tag genome phylogeny to identify proteins that have potential significance in the evolution of seed plants. Relevant proteins are identified based on the direction of partitioned branch and hidden support on the hypothesis obtained on a 16-species tree, constructed from 2,557 concatenated orthologous genes. We provide a general method for detecting genes or groups of genes that may be under selection in directions that are in agreement with the phylogenetic pattern. Gene partitioning methods and estimates of the degree and direction of support of individual gene partitions to the overall data set are used. Using this approach, we correlate positive branch support of specific genes for key branches in the seed plant phylogeny. In addition to basic metabolic functions, such as photosynthesis or hormones, genes involved in posttranscriptional regulation by small RNAs were significantly overrepresented in key nodes of the phylogeny of seed plants. Two genes in our matrix are of critical importance as they are involved in RNA-dependent regulation, essential during embryo and leaf development. These are Argonaute and the RNA-dependent RNA polymerase 6 found to be overrepresented in the angiosperm clade. We use these genes as examples of our phylogenomics approach and show that identifying partitions or genes in this way provides a platform to explain some of the more interesting organismal differences among species, and in particular, in the evolution of plants.
PMCID: PMC2997538  PMID: 20624728
phylogenomics; orthologs; partition metrics; gene ontology; micro-RNAs; small interfering RNAs
13.  Comparative Ovule and Megagametophyte Development in Hydatellaceae and Water Lilies Reveal a Mosaic of Features Among the Earliest Angiosperms 
Annals of Botany  2008;101(7):941-956.
Background and Aims
The embryo sac, nucellus and integuments of the early-divergent angiosperms Hydatellaceae and other Nymphaeales are compared with those of other seed plants, in order to evaluate the evolutionary origin of these characters in the angiosperms.
Using light microscopy, ovule and embryo sac development are described in five (of 12) species of Trithuria, the sole genus of Hydatellaceae, and compared with those of Cabombaceae and Nymphaeaceae.
Key Results
The ovule of Trithuria is bitegmic and tenuinucellate, rather than bitegmic and crassinucellate as in most other Nymphaeales. The seed is operculate and possesses a perisperm that develops precociously, which are both key features of Nymphaeales. However, in the Indian species T. konkanensis, perisperm is relatively poorly developed by the time of fertilization. Perisperm cells in Trithuria become multinucleate during development, a feature observed also in other Nymphaeales. The outer integument is semi-annular (‘hood-shaped’), as in Cabombaceae and some Nymphaeaceae, in contrast to the annular (‘cap-shaped’) outer integument of some other Nymphaeaceae (e.g. Barclaya) and Amborella. The megagametophyte in Trithuria is monosporic and four-nucleate; at the two-nucleate stage both nuclei occur in the micropylar domain. Double megagametophytes were frequently observed, probably developed from different megaspores of the same tetrad. Indirect, but strong evidence is presented for apomictic embryo development in T. filamentosa.
Most features of the ovule and embryo sac of Trithuria are consistent with a close relationship with other Nymphaeales, especially Cabombaceae. The frequent occurrence of double megagametophytes in the same ovule indicates a high degree of developmental flexibility, and could provide a clue to the evolutionary origin of the Polygonum-type of angiosperm embryo sac.
PMCID: PMC2710223  PMID: 18378513
Embryo sac; megagametophyte; ovule; Hydatellaceae; Trithuria
14.  ESTimating plant phylogeny: lessons from partitioning 
While Expressed Sequence Tags (ESTs) have proven a viable and efficient way to sample genomes, particularly those for which whole-genome sequencing is impractical, phylogenetic analysis using ESTs remains difficult. Sequencing errors and orthology determination are the major problems when using ESTs as a source of characters for systematics. Here we develop methods to incorporate EST sequence information in a simultaneous analysis framework to address controversial phylogenetic questions regarding the relationships among the major groups of seed plants. We use an automated, phylogenetically derived approach to orthology determination called OrthologID to generate a phylogeny based on 43 process partitions, many of which are derived from ESTs, and examine several measures of support to assess the utility of EST data for phylogenies.
A maximum parsimony (MP) analysis resulted in a single tree with relatively high support at all nodes in the tree despite rampant conflict among trees generated from the separate analysis of individual partitions. In a comparison of broader-scale groupings based on cellular compartment (i.e., chloroplast, mitochondrial or nuclear) or function, only the nuclear partition tree (based largely on EST data) was found to be topologically identical to the tree based on the simultaneous analysis of all data. Despite topological conflict among the broader-scale groupings examined, only the tree based on morphological data showed statistically significant differences.
Based on the amount of character support contributed by EST data which make up a majority of the nuclear data set, and the lack of conflict of the nuclear data set with the simultaneous analysis tree, we conclude that the inclusion of EST data does provide a viable and efficient approach to address phylogenetic questions within a parsimony framework on a genomic scale, if problems of orthology determination and potential sequencing errors can be overcome. In addition, approaches that examine conflict and support in a simultaneous analysis framework allow for a more precise understanding of the evolutionary history of individual process partitions and may be a novel way to understand functional aspects of different kinds of cellular classes of gene products.
PMCID: PMC1564041  PMID: 16776834
15.  EST analysis in Ginkgo biloba: an assessment of conserved developmental regulators and gymnosperm specific genes 
BMC Genomics  2005;6:143.
Ginkgo biloba L. is the only surviving member of one of the oldest living seed plant groups with medicinal, spiritual and horticultural importance worldwide. As an evolutionary relic, it displays many characters found in the early, extinct seed plants and extant cycads. To establish a molecular base to understand the evolution of seeds and pollen, we created a cDNA library and EST dataset from the reproductive structures of male (microsporangiate), female (megasporangiate), and vegetative organs (leaves) of Ginkgo biloba.
RNA from newly emerged male and female reproductive organs and immature leaves was used to create three distinct cDNA libraries from which 6,434 ESTs were generated. These 6,434 ESTs from Ginkgo biloba were clustered into 3,830 unigenes. A comparison of our Ginkgo unigene set against the fully annotated genomes of rice and Arabidopsis, and all available ESTs in GenBank revealed that 256 Ginkgo unigenes match only genes among the gymnosperms and non-seed plants – many with multiple matches to genes in non-angiosperm plants. Conversely, another group of unigenes in Ginkgo had highly significant homology to transcription factors in angiosperms involved in development, including MADS box genes as well as post-transcriptional regulators. Several of the conserved developmental genes found in Ginkgo had top BLAST homology to cycad genes. We also note here the presence of ESTs in G. biloba similar to genes that to date have only been found in gymnosperms and an additional 22 Ginkgo genes common only to genes from cycads.
Our analysis of an EST dataset from G. biloba revealed genes potentially unique to gymnosperms. Many of these genes showed homology to fully sequenced clones from our cycad EST dataset found in common only with gymnosperms. Other Ginkgo ESTs are similar to developmental regulators in higher plants. This work sets the stage for future studies on Ginkgo to better understand seed and pollen evolution, and to resolve the ambiguous phylogenetic relationship of G. biloba among the gymnosperms.
PMCID: PMC1285361  PMID: 16225698
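The clustering figures reported in entry 15 imply a modest level of redundancy in the Ginkgo EST set. The short calculation below is a sketch using only the three counts stated in the abstract; the variable names are illustrative, and no other data from the study are assumed.

```python
# Counts taken directly from the abstract of entry 15.
ests = 6434                  # ESTs generated from three cDNA libraries
unigenes = 3830              # clusters (unigenes) after assembly
non_angiosperm_only = 256    # unigenes matching only gymnosperms / non-seed plants

# Average number of ESTs collapsed into each unigene.
ests_per_unigene = ests / unigenes

# Share of unigenes with matches only outside the angiosperms.
non_angiosperm_share = non_angiosperm_only / unigenes

print(f"{ests_per_unigene:.2f} ESTs per unigene")
print(f"{non_angiosperm_share:.1%} of unigenes match only non-angiosperm genes")
```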
16.  Expressed sequence tag analysis in Cycas, the most primitive living seed plant 
Genome Biology  2003;4(12):R78.
Cycads are ancient seed plants (living fossils) with origins in the Paleozoic. Cycads are sometimes considered a 'missing link' as they exhibit characteristics intermediate between vascular non-seed plants and the more derived seed plants. Cycads have also been implicated as the source of 'Guam's dementia', possibly due to the production of S(+)-beta-methyl-alpha, beta-diaminopropionic acid (BMAA), which is an agonist of animal glutamate receptors.
A total of 4,200 expressed sequence tags (ESTs) were created from Cycas rumphii and clustered into 2,458 contigs, of which 1,764 had low-stringency BLAST similarity to other plant genes. Among those cycad contigs with similarity to plant genes, 1,718 cycad 'hits' are to angiosperms, 1,310 match genes in gymnosperms and 734 match lower (non-seed) plants. Forty-six contigs were found that matched only genes in lower plants and gymnosperms. Upon obtaining the complete sequence from the clones of 37/46 contigs, 14 still matched only gymnosperms. Among those cycad contigs common to higher plants, ESTs were discovered that correspond to those involved in development and signaling in present-day flowering plants. We purified a cycad EST for a glutamate receptor (GLR)-like gene, as well as ESTs potentially involved in the synthesis of the GLR agonist BMAA.
Analysis of cycad ESTs has uncovered conserved and potentially novel genes. Furthermore, the presence of a glutamate receptor agonist, as well as a glutamate receptor-like gene in cycads, supports the hypothesis that such neuroactive plant products are not merely herbivore deterrents but may also serve a role in plant signaling.
PMCID: PMC329417  PMID: 14659015
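The BLAST tallies in entry 16 are easier to compare as proportions. The sketch below uses only the counts given in the abstract; the dictionary keys and variable names are illustrative. Note that the three taxonomic categories overlap (a contig can match genes in more than one group), so the shares are not expected to sum to 100%.

```python
# Counts taken directly from the abstract of entry 16.
contigs = 2458           # contigs clustered from 4,200 Cycas rumphii ESTs
with_plant_hit = 1764    # contigs with low-stringency BLAST similarity to plant genes

# Overlapping categories: one contig may match several groups.
hits_by_group = {
    "angiosperms": 1718,
    "gymnosperms": 1310,
    "non-seed plants": 734,
}

# Fraction of all contigs that resemble any known plant gene.
hit_rate = with_plant_hit / contigs

# Fraction of the matching contigs that hit each taxonomic group.
share_by_group = {
    group: count / with_plant_hit for group, count in hits_by_group.items()
}

print(f"{hit_rate:.1%} of contigs have a plant BLAST match")
for group, share in share_by_group.items():
    print(f"{share:.1%} of matching contigs hit {group}")
```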
