Results 1-7 (7)
 

1.  The taxonomic name resolution service: an online tool for automated standardization of plant names 
BMC Bioinformatics  2013;14:16.
Background
The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.
Results
The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.
Conclusions
We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/.
doi:10.1186/1471-2105-14-16
PMCID: PMC3554605  PMID: 23324024
Biodiversity informatics; Database integration; Taxonomy; Plants
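The fuzzy matching and synonym conversion described in this abstract can be illustrated with a minimal sketch. It is not the TNRS implementation: the `REFERENCE` table, the `resolve_name` function and the 0.85 threshold are hypothetical, and the standard-library `difflib.SequenceMatcher` stands in for whatever matching algorithm the service actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical reference taxonomy: matched name -> accepted name.
# A real service would query a database such as Tropicos instead.
REFERENCE = {
    "Quercus alba": "Quercus alba",
    "Quercus rubra": "Quercus rubra",
    "Acer saccharum": "Acer saccharum",
    "Acer saccharophorum": "Acer saccharum",  # treated as a synonym for illustration
}


def resolve_name(raw, min_score=0.85):
    """Return (accepted_name, matched_name, score), or None if nothing is close enough."""
    # Normalize whitespace and case before comparing.
    cleaned = " ".join(raw.split()).lower()
    best_name, best_score = None, 0.0
    for candidate in REFERENCE:
        score = SequenceMatcher(None, cleaned, candidate.lower()).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    if best_name is None or best_score < min_score:
        return None
    # Convert the matched name (possibly a synonym) to its accepted name.
    return REFERENCE[best_name], best_name, best_score
```

Under these assumptions, a misspelling such as `"Quercus albba"` resolves to `Quercus alba`, a listed synonym resolves to its accepted name, and a name absent from the table returns `None` rather than a forced match.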
2.  A Single Origin for Nymphalid Butterfly Eyespots Followed by Widespread Loss of Associated Gene Expression 
PLoS Genetics  2012;8(8):e1002893.
Understanding how novel complex traits originate involves investigating the time of origin of the trait, as well as the origin of its underlying gene regulatory network in a broad comparative phylogenetic framework. The eyespot of nymphalid butterflies has served as an example of a novel complex trait, as multiple genes are expressed during eyespot development. Yet the origins of eyespots remain unknown. Using a dataset of more than 400 images of butterflies with a known phylogeny and gene expression data for five eyespot-associated genes from over twenty species, we tested origin hypotheses for both eyespots and eyespot-associated genes. We show that eyespots evolved once within the family Nymphalidae, approximately 90 million years ago, concurrent with expression of at least three genes associated with early eyespot development. We also show multiple losses of expression of most genes from this early three-gene cluster, without corresponding losses of eyespots. We propose that complex traits, such as eyespots, may have originated via co-option of a large pre-existing complex gene regulatory network that was subsequently streamlined of genes not required to fulfill its novel developmental function.
Author Summary
Butterfly eyespots play an essential role in natural and sexual selection, yet the evolutionary origins of eyespots and of their underlying gene regulatory network remain unknown. By scoring phenotypes and wing expression of five genes in 399 and 21 nymphalid species, respectively, we tested when eyespots and expression of their associated genes evolved. We found that the origin of eyespots was concurrent with the origin of the gene expression patterns, approximately 90 million years ago. Following this event, many genes expressed in eyespot development were lost in some lineages without a corresponding loss of eyespots, indicating substantial evolution in the cluster of genes associated with eyespots. This finding suggests that complex traits such as butterfly eyespots may initially evolve by re-deploying pre-existing gene regulatory networks, which are subsequently trimmed of genes that are unnecessary in the novel context.
doi:10.1371/journal.pgen.1002893
PMCID: PMC3420954  PMID: 22916033
3.  The iPlant Collaborative: Cyberinfrastructure for Plant Biology 
The iPlant Collaborative (iPlant) is a project funded by the United States National Science Foundation (NSF) that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanity's projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets to the iPlant cyberinfrastructure, enabling plant scientists to perform complex analyses on large datasets without the need to master the command line or high-performance computational services.
doi:10.3389/fpls.2011.00034
PMCID: PMC3355756  PMID: 22645531
cyberinfrastructure; bioinformatics; plant biology; computational biology
4.  Meeting Report from the Second “Minimum Information for Biological and Biomedical Investigations” (MIBBI) workshop 
Standards in Genomic Sciences  2010;3(3):259-266.
This report summarizes the proceedings of the second workshop of the ‘Minimum Information for Biological and Biomedical Investigations’ (MIBBI) consortium held on Dec 1-2, 2010 in Rüdesheim, Germany through the sponsorship of the Beilstein-Institute. MIBBI is an umbrella organization uniting communities developing Minimum Information (MI) checklists to standardize the description of data sets, the workflows by which they were generated and the scientific context for the work. This workshop brought together representatives of more than twenty communities to present the status of their MI checklists and plans for future development. Shared challenges and solutions were identified and the role of MIBBI in MI checklist development was discussed. The meeting featured some thirty presentations, wide-ranging discussions and breakout groups. The top outcomes of the two-day workshop as defined by the participants were: 1) the chance to share best practices and to identify areas of synergy; 2) defining a series of tasks for updating the MIBBI Portal; 3) reemphasizing the need to maintain independent MI checklists for various communities while leveraging common terms and workflow elements contained in multiple checklists; and 4) revision of the concept of the MIBBI Foundry to focus on the creation of a core set of MIBBI modules intended for reuse by individual MI checklist projects while maintaining the integrity of each MI project. Further information about MIBBI and its range of activities can be found at http://mibbi.org/.
doi:10.4056/sigs.147362
PMCID: PMC3035314  PMID: 21304730
5.  The genomic response of skeletal muscle to methylprednisolone using microarrays: tailoring data mining to the structure of the pharmacogenomic time series 
Pharmacogenomics  2004;5(5):525-552.
High-throughput data collection using gene microarrays has great potential as a method for addressing the pharmacogenomics of complex biological systems. Similarly, mechanism-based pharmacokinetic/pharmacodynamic modeling provides a tool for formulating quantitative testable hypotheses concerning the responses of complex biological systems. As the response of such systems to drugs generally entails cascades of molecular events in time, a time series design provides the best approach to capturing the full scope of drug effects. A major problem in using microarrays for high-throughput data collection is sorting through the massive amount of data in order to identify probe sets and genes of interest. Due to its inherent redundancy, a rich time series containing many time points and multiple samples per time point allows for the use of less stringent criteria of expression, expression change and data quality for initial filtering of unwanted probe sets. The remaining probe sets can then become the focus of more intense scrutiny by other methods, including temporal clustering, functional clustering and pharmacokinetic/pharmacodynamic modeling, which provide additional ways of identifying the probes and genes of pharmacological interest.
doi:10.1517/14622416.5.5.525
PMCID: PMC2607486  PMID: 15212590
corticosteroids; data mining; expression profiling; gene chips; methylprednisolone; microarrays; modeling; pharmacodynamics; skeletal muscle; time series
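The initial filtering step described in this abstract can be sketched as follows. This is an illustration only, not the authors' procedure: the function name `keep_probe_set`, the input layout (time point mapped to replicate intensities) and both thresholds are assumptions chosen for the example.

```python
from statistics import mean


def keep_probe_set(series, min_signal=100.0, min_fold=1.5):
    """Decide whether one probe set passes a lenient initial filter.

    series maps each time point to a list of replicate intensities.
    The thresholds are hypothetical and deliberately permissive.
    """
    # Average the replicates at each time point.
    means = [mean(reps) for reps in series.values() if reps]
    if not means:
        return False
    # Expression criterion: the mean signal reaches the floor at some time point.
    expressed = max(means) >= min_signal
    # Change criterion: the mean signal varies enough across time points.
    changed = min(means) > 0 and max(means) / min(means) >= min_fold
    return expressed and changed
```

Probe sets that pass a filter of this kind would then go on to the closer scrutiny the abstract mentions, such as temporal clustering.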
6.  The 2006 NESCent Phyloinformatics Hackathon: A Field Report 
In December 2006, a group of 26 software developers from some of the most widely used life science programming toolkits and phylogenetic software projects converged on Durham, North Carolina, for a Phyloinformatics Hackathon, an intense five-day collaborative software coding event sponsored by the National Evolutionary Synthesis Center (NESCent). The goal was to help researchers integrate multiple phylogenetic software tools into automated workflows. Participants addressed deficiencies in interoperability between programs by implementing “glue code” and improving support for phylogenetic data exchange standards (particularly NEXUS) across the toolkits. The work was guided by use-cases compiled in advance by both developers and users, and the code was documented as it was developed. The resulting software is freely available for both users and developers through incorporation into the distributions of several widely used open-source toolkits. We explain the motivation for the hackathon, describe how it was organized, and discuss some of the outcomes and lessons learned. We conclude that hackathons are an effective mode of solving problems in software interoperability and usability, and are underutilized in scientific software development.
PMCID: PMC2684128
phylogenetics; phyloinformatics; open source software; analysis workflow
7.  Fast Structural Search in Phylogenetic Databases 
As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies 𝒟, find trees in 𝒟 that are sufficiently close to P. The “closeness” is a measure of the topological relationships in P that are found to be the same or similar in a tree D in 𝒟. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.
PMCID: PMC2658875  PMID: 19325851
Structural pattern matching; structural search and retrieval; tree search strategies; phylogenetic trees
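The general idea of a structural search with a filtering step can be sketched as below. This is not the similarity measure or the filter from the paper: as a stand-in, closeness is taken to be the fraction of the pattern's clades (leaf sets under internal nodes) that reappear in a database tree once that tree is restricted to the pattern's leaves. Rooted trees are assumed to be nested tuples of leaf-name strings, and all function names are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return the leaf set of every internal node (the root included)."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clades(child)
    return found


def closeness(pattern, tree):
    """Fraction of the pattern's clades found in the tree restricted to the pattern's leaves."""
    p_leaves = leaves(pattern)
    p_clades = clades(pattern)
    restricted = {c & p_leaves for c in clades(tree) if len(c & p_leaves) >= 2}
    return len(p_clades & restricted) / len(p_clades) if p_clades else 0.0


def search(pattern, database, threshold=0.75):
    """Return (name, score) pairs for trees at or above the threshold, best first."""
    p_leaves = leaves(pattern)
    hits = []
    for name, tree in database.items():
        # Filter: skip trees that do not contain every leaf of the pattern.
        if not p_leaves <= leaves(tree):
            continue
        score = closeness(pattern, tree)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: -hit[1])
```

The filter rejects candidates before the more expensive clade comparison runs. Because the root clade always matches once the filter passes, scores in this sketch are biased slightly upward.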
