AMIA Summits Transl Sci Proc. 2010; 2010: 56–60.
Published online 2010 March 1.
PMCID: PMC3041533
Analysis of False Positive Errors of an Acute Respiratory Infection Text Classifier due to Contextual Features
Brett R. South, MS,1,2,5 Shuying Shen, MStat,1,2,5 Wendy W. Chapman, PhD,3 Sylvain Delisle, MD, MBA,4 Matthew H. Samore, MD,1,2,5 and Adi V. Gundlapalli, MD, PhD, MS,1,2,5
1VA Salt Lake City Health Care System
2Department of Internal Medicine, University of Utah School of Medicine
3Department of Biomedical Informatics, University of Pittsburgh
4VA Maryland Health Care System and University of Maryland School of Medicine
5Department of Biomedical Informatics, University of Utah School of Medicine
Abstract
Text classifiers have been used for biosurveillance tasks to identify patients with diseases or conditions of interest. When compared to a clinical reference standard of 280 cases of Acute Respiratory Infection (ARI), a text classifier consisting of simple rules and NegEx plus string matching for specific concepts of interest produced 569 (4%) false positive (FP) cases. Using instance-level manual annotation, we estimate the prevalence of contextual attributes and error types leading to FP cases. Errors were due to (1) deletion errors from abbreviations, spelling mistakes, and missing synonyms (57%); (2) insertion errors from templated document structures such as check boxes and lists of signs and symptoms (36%); and (3) substitution errors from irrelevant concepts and alternate meanings for the same word (6%). We demonstrate that specific concept attributes contribute to false positive cases. These results will inform modifications and adaptations to improve text classifier performance.
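To make the kind of classifier described above concrete, the following is a minimal sketch of string matching combined with a simplified NegEx-style negation check. It is not the classifier evaluated in the paper. The concept list, trigger list, termination terms, and five-token window are all assumptions chosen for illustration, and the real NegEx algorithm uses a much larger set of trigger and termination phrases.

```python
import re

# Hypothetical lexicons for illustration only; not the lists used in the paper.
CONCEPTS = ["cough", "fever", "sore throat"]
NEGATION_TRIGGERS = ["no", "denies", "without", "negative for"]
TERMINATORS = {"but", "however"}  # assumed terms that end a negation scope
WINDOW = 5  # assumed: a trigger negates concepts within the next 5 tokens


def find_concepts(sentence):
    """Return (concept, negated) pairs found in one sentence."""
    text = sentence.lower()
    results = []
    for concept in CONCEPTS:
        pattern = r"\b" + re.escape(concept) + r"\b"
        for match in re.finditer(pattern, text):
            # Tokens that precede the matched concept in this sentence.
            tokens = re.findall(r"[a-z]+", text[:match.start()])
            # Drop everything up to and including the last termination term.
            for i in range(len(tokens) - 1, -1, -1):
                if tokens[i] in TERMINATORS:
                    tokens = tokens[i + 1:]
                    break
            window_text = " ".join(tokens[-WINDOW:])
            negated = any(
                re.search(r"\b" + re.escape(trigger) + r"\b", window_text)
                for trigger in NEGATION_TRIGGERS
            )
            results.append((concept, negated))
    return results


def classify(document):
    """Flag a document if any sentence has a non-negated concept mention."""
    sentences = re.split(r"[.\n]+", document)
    return any(
        not negated
        for sentence in sentences
        for _, negated in find_concepts(sentence)
    )
```

Under these assumptions, `classify("No fever. No cough.")` returns `False`, while an unchecked template such as `"Checklist: [ ] fever [ ] cough"` is flagged as positive. The second case shows how a templated structure like a check box list can lead a matcher of this kind to a false positive, which is the insertion error category described in the abstract.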
Articles from AMIA Summits on Translational Science Proceedings are provided here courtesy of the American Medical Informatics Association