
Results 1-3 (3)

1.  The taxonomic name resolution service: an online tool for automated standardization of plant names 
BMC Bioinformatics  2013;14:16.
The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.
The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.
We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at and as a RESTful web service and application programming interface. Source code is available at
PMCID: PMC3554605  PMID: 23324024
Biodiversity informatics; Database integration; Taxonomy; Plants
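The abstract above describes correcting misspelled names by fuzzy matching and converting synonyms to accepted names. The following is a minimal sketch of that general idea only, not the TNRS implementation: it uses Python's standard `difflib` in place of the TNRS matching engine, and the reference list `ACCEPTED`, the synonym table `SYNONYMS`, the function `resolve_name` and the `cutoff` value are all hypothetical, chosen for illustration.

```python
from difflib import get_close_matches

# Hypothetical miniature reference taxonomy (illustration only; not Tropicos data).
ACCEPTED = ["Quercus alba", "Quercus rubra", "Acer saccharum", "Pinus strobus"]
# Hypothetical synonym table: synonym -> accepted name.
SYNONYMS = {"Acer saccharophorum": "Acer saccharum"}


def resolve_name(raw, cutoff=0.8):
    """Return (matched_name, accepted_name), or (None, None) if nothing is close enough."""
    parts = raw.strip().split()
    if not parts:
        return None, None
    # Normalize case: capitalized genus, lowercase remaining words.
    cleaned = " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])
    # Fuzzy-match against accepted names and known synonyms together.
    candidates = ACCEPTED + list(SYNONYMS)
    hits = get_close_matches(cleaned, candidates, n=1, cutoff=cutoff)
    if not hits:
        return None, None
    matched = hits[0]
    # Map a matched synonym to its accepted name; accepted names map to themselves.
    return matched, SYNONYMS.get(matched, matched)
```

Calling `resolve_name("quercus albba")` would match the misspelling to "Quercus alba", while `resolve_name("Acer sacharophorum")` would match the synonym and report "Acer saccharum" as the accepted name. A production service would additionally parse authorities, ranks and annotations, which this sketch does not attempt.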
2.  Applications of Natural Language Processing in Biodiversity Science 
Advances in Bioinformatics  2012;2012:391574.
Centuries of biological knowledge are contained in the massive body of scientific literature, written for human readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life-wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science.
PMCID: PMC3364545  PMID: 22685456
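One task named in the abstract above is identifying taxonomic names in text. As a rough illustration of a dictionary-assisted approach, and not a reconstruction of any tool reviewed in the paper, the sketch below finds candidate binomials with a regular expression and keeps only those whose genus appears in a lexicon. The lexicon `KNOWN_GENERA`, the pattern `CANDIDATE` and the function `find_taxon_names` are hypothetical.

```python
import re

# Hypothetical genus lexicon; a real system would load a full name list.
KNOWN_GENERA = {"Quercus", "Acer", "Drosophila"}

# Candidate binomial: a capitalized word followed by a lowercase word.
CANDIDATE = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]{2,})\b")


def find_taxon_names(text):
    """Return candidate binomials whose genus is in the lexicon, in order of appearance."""
    return [
        f"{genus} {epithet}"
        for genus, epithet in CANDIDATE.findall(text)
        if genus in KNOWN_GENERA
    ]
```

The lexicon filter is what rejects ordinary phrases such as "The samples", which the pattern alone would accept. Abbreviated genera ("Q. alba"), authorities and infraspecific ranks are outside the scope of this sketch.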
3.  Site-specific mutagenesis of Drosophila proliferating cell nuclear antigen enhances its effects on calf thymus DNA polymerase δ 
BMC Biochemistry  2004;5:13.
We and others have shown four distinct and presumably related effects of mammalian proliferating cell nuclear antigen (PCNA) on DNA synthesis catalyzed by mammalian DNA polymerase δ (pol δ). In the presence of homologous PCNA, pol δ exhibits 1) increased absolute activity; 2) increased processivity of DNA synthesis; 3) stable binding of synthetic oligonucleotide template-primers (t1/2 of the pol δ•PCNA•template-primer complex ≥2.5 h); and 4) enhanced synthesis of DNA opposite and beyond template base lesions. This last effect is potentially mutagenic in vivo. Biochemical studies performed in parallel with in vivo genetic analyses would represent an extremely powerful approach to further investigate both DNA replication and repair in eukaryotes.
Drosophila PCNA, although highly similar in structure to mammalian PCNA (e.g., it is >70% identical to human PCNA in amino acid sequence), can only substitute poorly for either calf thymus or human PCNA (~10% as well) in affecting calf thymus pol δ. However, by mutating one or only a few amino acids in the region of Drosophila PCNA thought to interact with pol δ, all four effects can be enhanced dramatically.
Our results therefore suggest that all four of the above effects depend at least in part on the PCNA-pol δ interaction. Moreover, unlike mammals, Drosophila offers the potential for immediate in vivo genetic analyses. Although it has proven difficult to obtain sufficient amounts of homologous pol δ for parallel in vitro biochemical studies, by altering Drosophila PCNA using site-directed mutagenesis as suggested by our results, in vitro biochemical studies may now be performed using human and/or calf thymus pol δ preparations.
PMCID: PMC515284  PMID: 15310391
