Results 1-25 (25)
1.  BoB, a best-of-breed automated text de-identification system for VHA clinical documents 
Objective
De-identification allows faster and more collaborative clinical research while protecting patient confidentiality. Clinical narrative de-identification is a tedious process that can be alleviated by automated natural language processing methods. The goal of this research is the development of an automated text de-identification system for Veterans Health Administration (VHA) clinical documents.
Materials and methods
We devised a novel stepwise hybrid approach designed to improve the current strategies used for text de-identification. The proposed system is based on a previous study on the best de-identification methods for VHA documents. This best-of-breed automated clinical text de-identification system (aka BoB) tackles the problem as two separate tasks: (1) maximize patient confidentiality by redacting as much protected health information (PHI) as possible; and (2) leave de-identified documents in a usable state preserving as much clinical information as possible.
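For illustration only, here is a generic two-stage sketch of the idea described above: a high-recall first pass that flags candidate PHI, followed by a precision-oriented filter that protects clinical information. The patterns and the eponym whitelist are made up for the example; this is not BoB's actual pipeline.

```python
# Generic two-stage de-identification sketch (not BoB): stage 1 favors recall
# by flagging every candidate PHI span; stage 2 trims false positives so that
# non-PHI clinical information is preserved in the redacted note.
import re

def stage1_candidates(text):
    """High-recall pass with a few illustrative PHI patterns."""
    spans = []
    for phi_type, pattern in [
        ("ID",   r"\b\d{3}-\d{2}-\d{4}\b"),                     # SSN-like number
        ("NAME", r"\b(?:Dr|Mr|Ms|Mrs)\.?\s+[A-Z][a-z]+\b"),     # titled names
        ("DATE", r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),               # numeric dates
    ]:
        spans += [(m.start(), m.end(), phi_type) for m in re.finditer(pattern, text)]
    return spans

CLINICAL_WHITELIST = {"Parkinson", "Alzheimer"}   # hypothetical eponym list

def stage2_filter(text, spans):
    """Precision pass: keep clinical eponyms that merely look like names."""
    return [s for s in spans
            if not any(w in text[s[0]:s[1]] for w in CLINICAL_WHITELIST)]

def redact(text):
    for start, end, phi_type in sorted(stage2_filter(text, stage1_candidates(text)),
                                       reverse=True):
        text = text[:start] + f"[{phi_type}]" + text[end:]
    return text

print(redact("Seen by Dr. Smith on 03/04/2011 for Parkinson disease."))
# -> Seen by [NAME] on [DATE] for Parkinson disease.
```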
Results
We evaluated BoB with a manually annotated corpus of a variety of VHA clinical notes, as well as with the 2006 i2b2 de-identification challenge corpus. We present evaluations at the instance- and token-level, with detailed results for BoB's main components. Moreover, an existing text de-identification system was also included in our evaluation.
Discussion
BoB's design efficiently takes advantage of the methods implemented in its pipeline, resulting in high sensitivity values (especially for sensitive PHI categories) and a limited number of false positives.
Conclusions
Our system successfully addressed VHA clinical document de-identification, and its hybrid stepwise design demonstrates robustness and efficiency, prioritizing patient confidentiality while leaving most clinical information intact.
doi:10.1136/amiajnl-2012-001020
PMCID: PMC3555325  PMID: 22947391
2.  Using Natural Language Processing on the Free Text of Clinical Documents to Screen for Evidence of Homelessness Among US Veterans 
Information retrieval algorithms based on natural language processing (NLP) of the free text of medical records have been used to find documents of interest from databases. Homelessness is a high-priority non-medical diagnosis that is noted in electronic medical records of Veterans in Veterans Affairs (VA) facilities. Using a human-reviewed reference standard corpus of clinical documents of Veterans with evidence of homelessness and those without, an open-source NLP tool (Automated Retrieval Console v2.0, ARC) was trained to classify documents. The best-performing model based on document-level workflow performed well on a test set (Precision 94%, Recall 97%, F-Measure 96%). Processing of a naïve set of 10,000 randomly selected documents from the VA using this best-performing model yielded 463 documents flagged as positive, indicating a 4.7% prevalence of homelessness. Human review noted a precision of 70% for these flags, resulting in an adjusted prevalence of homelessness of 3.3%, which matches current VA estimates. Further refinements are underway to improve the performance. We demonstrate an effective and rapid lifecycle of using an off-the-shelf NLP tool for screening targets of interest from medical records.
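As a quick check of the arithmetic behind the adjusted prevalence (numbers taken from the abstract; the small differences from the reported 4.7% and 3.3% presumably reflect rounding or the exact denominator used):

```python
# Back-of-the-envelope check of the adjusted prevalence described above.
flagged, total_docs = 463, 10_000
precision = 0.70                                   # from human review of the flags

raw_prevalence = flagged / total_docs              # ~0.046 (reported as ~4.7%)
adjusted_prevalence = raw_prevalence * precision   # ~0.032 (reported as 3.3%)
print(f"raw: {raw_prevalence:.1%}, adjusted: {adjusted_prevalence:.1%}")
```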
PMCID: PMC3900197  PMID: 24551356
3.  Automated extraction of ejection fraction for quality measurement using regular expressions in Unstructured Information Management Architecture (UIMA) for heart failure 
Objectives
Left ventricular ejection fraction (EF) is a key component of heart failure quality measures used within the Department of Veterans Affairs (VA). Our goals were to build a natural language processing system to extract the EF from free-text echocardiogram reports to automate measurement reporting and to validate the accuracy of the system using a comparison reference standard developed through human review. This project was a Translational Use Case Project within the VA Consortium for Healthcare Informatics.
Materials and methods
We created a set of regular expressions and rules to capture the EF using a random sample of 765 echocardiograms from seven VA medical centers. The documents were randomly assigned to two sets: a set of 275 used for training and a second set of 490 used for testing and validation. To establish the reference standard, two independent reviewers annotated all documents in both sets; a third reviewer adjudicated disagreements.
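For illustration, here are regular expressions of the kind one might write for this task; they are a minimal sketch (with made-up cue words and a crude gap allowance), not the study's actual rule set.

```python
# Illustrative sketch only: pull ejection fraction (EF) values out of
# free-text echocardiogram report sentences, allowing a short gap between
# the EF cue and the number (e.g. "EF is estimated at 35%").
import re

EF_RANGE = re.compile(
    r"\b(?:ejection fraction|LVEF|EF)\b\D{0,20}?(\d{1,2})\s*(?:-|to)\s*(\d{1,2})\s*%",
    re.IGNORECASE)
EF_SINGLE = re.compile(
    r"\b(?:ejection fraction|LVEF|EF)\b\D{0,20}?(\d{1,2})\s*%",
    re.IGNORECASE)

def extract_ef(report):
    """Return the EF value(s) mentioned in a report sentence, or None."""
    m = EF_RANGE.search(report) or EF_SINGLE.search(report)
    return [int(g) for g in m.groups()] if m else None

print(extract_ef("Left ventricular ejection fraction is estimated at 35%."))  # [35]
print(extract_ef("LVEF 50-55% with preserved wall motion."))                  # [50, 55]
```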
Results
System test results for document-level classification of EF of <40% had a sensitivity (recall) of 98.41%, a specificity of 100%, a positive predictive value (precision) of 100%, and an F measure of 99.2%. System test results at the concept level had a sensitivity of 88.9% (95% CI 87.7% to 90.0%), a positive predictive value of 95% (95% CI 94.2% to 95.9%), and an F measure of 91.9% (95% CI 91.2% to 92.7%).
Discussion
An EF value of <40% can be accurately identified in VA echocardiogram reports.
Conclusions
An automated information extraction system can be used to accurately extract EF for quality measurement.
doi:10.1136/amiajnl-2011-000535
PMCID: PMC3422820  PMID: 22437073
Natural language processing (NLP); heart failure; left ventricular ejection fraction (EF); Improving healthcare workflow and process efficiency; applied informatics; Improving government and community policy relevant to informatics and health quality; process modeling and hypothesis generation; Informatics; Enhancing the conduct of biological/clinical research and trials; applications that link biomedical knowledge from diverse primary sources (includes automated indexing); visualization of data and knowledge; uncertain reasoning and decision theory; languages and computational methods; statistical analysis of large datasets; advanced algorithms; discovery and text and data mining methods; other methods of information extraction; automated learning; human-computer interaction and human-centered computing; cognitive study (including experiments emphasizing verbal protocol analysis and usability); knowledge representations; knowledge acquisition and knowledge management; delivering health information and knowledge to the public; processing and display; analysis; image representation; controlled terminologies and vocabularies; ontologies; knowledge bases; ejection fraction; machine learning; simulation of complex systems (at all levels: molecules to work groups to organizations); developing/using clinical decision support (other than diagnostic) and guideline systems; detecting disease outbreaks and biological threats
4.  Evaluating the state of the art in coreference resolution for electronic medical records 
Background
The fifth i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records conducted a systematic review on resolution of noun phrase coreference in medical records. Informatics for Integrating Biology and the Bedside (i2b2) and the Veterans Affairs (VA) Consortium for Healthcare Informatics Research (CHIR) partnered to organize the coreference challenge. They provided the research community with two corpora of medical records for the development and evaluation of coreference resolution systems. These corpora contained various record types (ie, discharge summaries, pathology reports) from multiple institutions.
Methods
The coreference challenge provided the community with two annotated ground truth corpora and evaluated systems on coreference resolution in two ways: first, it evaluated systems for their ability to identify mentions of concepts and to link together those mentions. Second, it evaluated the ability of the systems to link together ground truth mentions that refer to the same entity. Twenty teams representing 29 organizations and nine countries participated in the coreference challenge.
Results
The teams' system submissions showed that machine-learning and rule-based approaches worked best when augmented with external knowledge sources and coreference clues extracted from document structure. The systems performed better in coreference resolution when provided with ground truth mentions. Overall, the systems struggled in solving coreference resolution for cases that required domain knowledge.
doi:10.1136/amiajnl-2011-000784
PMCID: PMC3422835  PMID: 22366294
5.  Extracting Surveillance Data from Templated Sections of an Electronic Medical Note: Challenges and Opportunities 
Objective
To highlight the importance of templates in extracting surveillance data from the free text of electronic medical records using natural language processing (NLP) techniques.
Introduction
The mainstay of recording patient data is the free text of the electronic medical record (EMR). While the chief complaint and history of presenting illness are stated in the patient’s ‘own words’, the rest of the electronic note is written in the provider’s words. Providers often use boilerplate templates from EMR pull-downs to document information on the patient in the form of checklists, check boxes, yes/no answers, and free text responses to questions. When these templates are used for recording symptoms, demographic information, or medical, social, or travel history, they represent an important source of surveillance data [1]. There is a dearth of literature on the use of natural language processing to extract data from templates in the EMR.
Methods
A corpus of 1000 free text medical notes from the VA integrated electronic medical record (CPRS) was reviewed to identify commonly used templates. Of these, 500 were enriched for the surveillance domain of interest for this project (homelessness). The other 500 were randomly sampled from a large corpus of electronic notes. An NLP algorithm was developed to extract concepts related to our target surveillance domain. A manual review of the notes was performed by three human reviewers to generate a document-level reference standard that classified this set of documents as either demonstrating evidence of homelessness (H) or not (NH). A rule-based NLP algorithm was developed that used a combination of key word searches and negation based on an extensive lexicon of terms developed for this purpose. A random sample of 50 documents each of H and NH documents were reviewed after each iteration of the NLP algorithm to determine the false positive rate of the extracted concepts.
Results
The corpus consisted of 48% H and 52% NH documents as determined by human review. The NLP algorithm successfully extracted concepts from these documents. The H set had an average of 8 concepts related to homelessness per document (median 8, range 1 to 34). The NH set had an average of 2 concepts (median 1, range 1 to 13). Thirteen template patterns were identified in this set of documents. The three most common were check boxes with square brackets, Yes/No, and free text answer after a question. Several positively and negatively asserted concepts were noted to be in the responses to templated questions such as “Are you currently homeless: Yes or No”; “How many times have you been homeless in the past 3 years: (free text response)”; “Have you ever been in jail? [Y] or [N]”; “Are you in need of substance abuse services? Yes or No”. Human review of a random sample of documents at the concept level indicated that the NLP algorithm generated 28% false positives among the H documents when extracting concepts related to homelessness with templates ignored. When the algorithm was refined to include templates, the false positive rate declined to 22%. For the NH documents, the corresponding false positive rates were 56% and 21%.
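As an illustration of why honoring the templated response matters, the sketch below asserts or negates a concept based on the recorded answer rather than the question text alone. The patterns and the True/False/None convention are made up for the example; this is not the study's algorithm.

```python
# Illustrative sketch only: interpret two of the templated answer patterns
# described above (question-with-Yes/No and square-bracket check boxes).
import re

YES_NO = re.compile(r"(?P<question>[^:\n?]*homeless[^:\n?]*)[:?]\s*(?P<answer>yes|no)\b",
                    re.IGNORECASE)
CHECKBOX = re.compile(r"\[(?P<mark>[xy ])\]\s*(?P<item>[^\[\n]*homeless[^\[\n]*)",
                      re.IGNORECASE)

def assert_homelessness(text):
    """Return True/False if a templated response answers the question, else None."""
    m = YES_NO.search(text)
    if m:
        return m.group("answer").lower() == "yes"
    m = CHECKBOX.search(text)
    if m:
        return m.group("mark").strip() != ""
    return None

print(assert_homelessness("Are you currently homeless: No"))       # False
print(assert_homelessness("[x] Veteran reports being homeless"))   # True
```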
Conclusions
To our knowledge, this is one of the first attempts to address the problem of information extraction from templates or templated sections of the EMR. A key challenge of templates is that they will most likely lead to poor performance of NLP algorithms and cause bottlenecks in processing if they are not considered. Acknowledging the presence of templates and refining NLP algorithms to handle them improves information extraction from free text medical notes, thus creating an opportunity for improved surveillance using the EMR. Algorithms will likely need to be customized to the electronic medical record and the surveillance domain of interest. A more detailed analysis of the templated sections is underway.
PMCID: PMC3692923
natural language processing; surveillance; templates; VA
6.  How Much Does Automatic Text De-Identification Impact Clinical Problems, Tests, and Treatments?  
Clinical text de-identification can potentially overlap with clinical information such as medical problems or treatments, thereby causing this information to be lost. In this study, we focused on the analysis of the overlap between the 2010 i2b2 NLP challenge concept annotations and the PHI annotations of our best-of-breed clinical text de-identification application. Overall, 0.81% of the annotations overlapped exactly, and 1.78% partly overlapped.
PMCID: PMC3845794  PMID: 24303260
7.  “Sitting on Pins and Needles”: Characterization of Symptom Descriptions in Clinical Notes 
Patients report their symptoms and subjective experiences in their own words. These expressions may be clinically meaningful yet are difficult to capture using automated methods. We annotated subjective symptom expressions in 750 clinical notes from the Veterans Affairs EHR. Within each document, subjective symptom expressions were compared to mentions of symptoms in clinical terms and to the assigned ICD-9-CM codes for the encounter. A total of 543 subjective symptom expressions were identified, of which 66.5% were categorized as mental/behavioral experiences and 33.5% somatic experiences. Only two subjective expressions were coded using ICD-9-CM. Subjective expressions were restated in semantically related clinical terms in 246 (45.3%) instances. Nearly one third (31%) of subjective expressions were not coded or restated in standard terminology. The results highlight the diversity of symptom descriptions and the opportunities to further develop natural language processing to extract symptom expressions that are unobtainable by other automated methods.
PMCID: PMC3845746  PMID: 24303238
8.  Generalizability and Comparison of Automatic Clinical Text De-Identification Methods and Resources 
In this paper, we present an evaluation of the hybrid best-of-breed automated VHA (Veterans Health Administration) clinical text de-identification system, nicknamed BoB, developed within the VHA Consortium for Healthcare Informatics Research. We also evaluate two available machine learning-based text de-identification systems: MIST and HIDE. Two different clinical corpora were used for this evaluation: a manually annotated VHA corpus, and the 2006 i2b2 de-identification challenge corpus. These experiments focus on the generalizability and portability of the classification models across different document sources. BoB demonstrated good recall (92.6%), satisfactorily prioritizing patient privacy, and also achieved competitive precision (83.6%) for preserving subsequent document interpretability. MIST and HIDE reached very competitive results, in most cases with high precision (92.6% and 93.6%), although recall was sometimes lower than desired for the most sensitive PHI categories.
PMCID: PMC3540471  PMID: 23304289
9.  The Relationship Between Structural Characteristics of 2010 Challenge Documents and Ratings of Document Quality 
Quality of clinical narratives has a direct impact on the perceived usefulness of these documents. With the advent of electronic documentation, the quality of clinical documents has been debated. Electronic documentation is supported by features to enhance efficiency, including copy/paste, templates, multi-level headings, and inserted objects. The impact of these features on perceived document quality has been difficult to assess in real settings as compared to simulations. This study used electronic notes from the 2010 i2b2/VA Challenge to explore the impact of text characteristics on general perception of document quality. We administered a validated instrument to assess document quality, focusing on two dimensions: informativeness and readability. Text characteristics were collected from both subjective ratings and quantitative summaries. The results suggested that common clinical elements such as templates, headings, and inserted objects had a strong positive association with document quality. Understanding this relationship may prove useful in future EHR design and informatics research.
PMCID: PMC3540529  PMID: 23304359
10.  2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text 
The 2010 i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records presented three tasks: a concept extraction task focused on the extraction of medical concepts from patient reports; an assertion classification task focused on assigning assertion types for medical problem concepts; and a relation classification task focused on assigning relation types that hold between medical problems, tests, and treatments. i2b2 and the VA provided an annotated reference standard corpus for the three tasks. Using this reference standard, 22 systems were developed for concept extraction, 21 for assertion classification, and 16 for relation classification.
These systems showed that machine learning approaches could be augmented with rule-based systems to determine concepts, assertions, and relations. Depending on the task, the rule-based systems can either provide input for machine learning or post-process the output of machine learning. Ensembles of classifiers, information from unlabeled data, and external knowledge sources can help when the training data are inadequate.
doi:10.1136/amiajnl-2011-000203
PMCID: PMC3168320  PMID: 21685143
Information storage and retrieval (text and images); discovery and text and data mining methods; Other methods of information extraction; Natural-language processing; Automated learning; visualization of data and knowledge; uncertain reasoning and decision theory; languages and computational methods; statistical analysis of large datasets; advanced algorithms; human-computer interaction and human-centered computing; NLP; machine learning; Informatics
11.  Evaluating current automatic de-identification methods with Veteran’s health administration clinical documents 
Background
The increased use and adoption of Electronic Health Records (EHR) causes a tremendous growth in digital information useful for clinicians, researchers and many other operational purposes. However, this information is rich in Protected Health Information (PHI), which severely restricts its access and possible uses. A number of investigators have developed methods for automatically de-identifying EHR documents by removing PHI, as specified in the Health Insurance Portability and Accountability Act “Safe Harbor” method.
This study focuses on the evaluation of existing automated text de-identification methods and tools, as applied to Veterans Health Administration (VHA) clinical documents, to assess which methods perform better with each category of PHI found in our clinical notes, and to determine when new methods are needed to improve performance.
Methods
We installed and evaluated five text de-identification systems “out-of-the-box” using a corpus of VHA clinical documents. The systems based on machine learning methods were trained with the 2006 i2b2 de-identification corpora and evaluated with our VHA corpus, and also evaluated with a ten-fold cross-validation experiment using our VHA corpus. We counted exact, partial, and fully contained matches with reference annotations, considering each PHI type separately, or only one unique ‘PHI’ category. Performance of the systems was assessed using recall (equivalent to sensitivity) and precision (equivalent to positive predictive value) metrics, as well as the F2-measure.
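For reference, the F2-measure used in this evaluation is the β = 2 case of the general Fβ score, which weights recall (sensitivity) more heavily than precision; the formula is standard rather than specific to this study:

$$F_\beta = \frac{(1+\beta^2)\,P\,R}{\beta^2 P + R}, \qquad F_2 = \frac{5\,P\,R}{4\,P + R}$$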
Results
Overall, systems based on rules and pattern matching achieved better recall, and precision was always better with systems based on machine learning approaches. The highest “out-of-the-box” F2-measure was 67% for partial matches; the best precision and recall were 95% and 78%, respectively. Finally, the ten-fold cross validation experiment allowed for an increase of the F2-measure to 79% with partial matches.
Conclusions
The “out-of-the-box” evaluation of text de-identification systems provided us with compelling insight into the best methods for de-identification of VHA clinical documents. The error analysis demonstrated an important need for customization to PHI formats specific to VHA documents. This study informed the planning and development of a “best-of-breed” automatic de-identification application for VHA clinical text.
doi:10.1186/1471-2288-12-109
PMCID: PMC3445850  PMID: 22839356
Confidentiality, patient data privacy [MeSH F04.096.544.335.240]; Natural language processing [L01.224.065.580]; Health insurance portability and accountability act [N03.219.521.576.343.349]; De-identification; Anonymization; Electronic health records [E05.318.308.940.968.625.500]; United States department of veterans affairs [I01.409.137.500.700]
12.  Sentiment Analysis of Suicide Notes: A Shared Task 
Biomedical Informatics Insights  2012;5(Suppl 1):3-16.
This paper reports on a shared task involving the assignment of emotions to suicide notes. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in a corpus of fully anonymized clinical text and annotated suicide notes. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
PMCID: PMC3299408  PMID: 22419877
Sentiment analysis; suicide; suicide notes; natural language processing; computational linguistics; shared task; challenge 2011
13.  Automatically Detecting Medications and the Reason for their Prescription in Clinical Narrative Text Documents 
An important proportion of the information about the medications a patient is taking is mentioned only in narrative text in the electronic health record. Automated information extraction can make this information accessible for decision-support, research, or any other automated processing. In the context of the “i2b2 medication extraction challenge,” we have developed a new NLP application called Textractor to automatically extract medications and details about them (e.g., dosage, frequency, reason for their prescription). This application and its evaluation with part of the reference standard for this “challenge” are presented here, along with an analysis of the development of this reference standard. During this evaluation, Textractor reached a system-level overall F1-measure, the reference metric for this challenge, of about 77% for exact matches. The best performance was measured with medication routes (F1-measure 86.4%), and the worst with prescription reasons (F1-measure 29%). These results are consistent with the agreement observed between human annotators when developing the reference standard, and with other published research.
PMCID: PMC3238676  PMID: 20841823
Pharmaceutical Preparations; Drug Prescriptions; Natural Language Processing; Program Evaluation; Knowledge Bases
14.  Textractor: a hybrid system for medications and reason for their prescription extraction from clinical text documents 
Objective
To describe a new medication information extraction system—Textractor—developed for the ‘i2b2 medication extraction challenge’. The development, functionalities, and official evaluation of the system are detailed.
Design
Textractor is based on the Apache Unstructured Information Management Architecture (UIMA) framework, and uses methods that are a hybrid between machine learning and pattern matching. Two modules in the system are based on machine learning algorithms, while other modules use regular expressions, rules, and dictionaries, and one module embeds MetaMap Transfer.
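As a rough illustration of the pattern-matching side of such a hybrid system, the sketch below parses a prescription line into drug, dose, route, and frequency. The vocabularies and patterns are made up for the example; they are not Textractor's actual modules.

```python
# Illustrative sketch only: parse simple prescription lines such as
# "Lisinopril 10 mg po daily" into medication attributes.
import re

MED_LINE = re.compile(
    r"(?P<drug>[A-Za-z]+)\s+"
    r"(?P<dose>\d+(?:\.\d+)?\s*(?:mg|mcg|g|units?))\s*"
    r"(?P<route>po|iv|im|sq|topical)?\s*"
    r"(?P<freq>daily|bid|tid|qid|q\d+h|prn)?",
    re.IGNORECASE)

def parse_med(line):
    m = MED_LINE.search(line)
    return {k: v for k, v in m.groupdict().items() if v} if m else None

print(parse_med("Lisinopril 10 mg po daily"))
# -> {'drug': 'Lisinopril', 'dose': '10 mg', 'route': 'po', 'freq': 'daily'}
```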
Measurements
The official evaluation was based on a reference standard of 251 discharge summaries annotated by all teams participating in the challenge. The metrics used were recall, precision, and the F1-measure. They were calculated with exact and inexact matches, and were averaged at the level of systems and documents.
Results
The reference metric for this challenge, the system-level overall F1-measure, reached about 77% for exact matches, with a recall of 72% and a precision of 83%. Performance was the best with route information (F1-measure about 86%), and was good for dosage and frequency information, with F1-measures of about 82–85%. Results were not as good for durations, with F1-measures of 36–39%, and for reasons, with F1-measures of 24–27%.
Conclusion
The official evaluation of Textractor for the i2b2 medication extraction challenge demonstrated satisfactory performance. This system was among the 10 best performing systems in this challenge.
doi:10.1136/jamia.2010.004028
PMCID: PMC2995680  PMID: 20819864
15.  Correction: Combining Free Text and Structured Electronic Medical Record Entries to Detect Acute Respiratory Infections 
PLoS ONE  2011;6(1):10.1371/annotation/8a0d067d-5bf7-4e2e-b7d2-56444560d66d.
doi:10.1371/annotation/8a0d067d-5bf7-4e2e-b7d2-56444560d66d
PMCID: PMC3021490
16.  Qualitative Analysis of Workflow Modifications Used to Generate the Reference Standard for the 2010 i2b2/VA Challenge 
AMIA Annual Symposium Proceedings  2011;2011:1243-1251.
The Department of Veterans Affairs (VA) and the Informatics for Integrating Biology and the Bedside (i2b2) team partnered to generate the reference standard for the 2010 i2b2/VA challenge task on concept extraction, assertion classification, and relation classification. The purpose of this paper is to report an in-depth qualitative analysis of the experience and perceptions of human annotators for these tasks. Transcripts of semi-structured interviews were analyzed using qualitative methods to identify key constructs and themes related to these annotation tasks. Interventions were embedded within these tasks, using pre-annotation of clinical concepts and a modified annotation workflow. From the human perspective, annotation tasks involve an inherent conflict between bias, accuracy, and efficiency. This analysis deepens understanding of the biases, complexities, and variations in the annotation process that may affect annotation task reliability and reference standard validity, and its findings are generalizable to other similar large-scale clinical corpus annotation projects.
PMCID: PMC3243132  PMID: 22195185
17.  Combining Free Text and Structured Electronic Medical Record Entries to Detect Acute Respiratory Infections 
PLoS ONE  2010;5(10):e13377.
Background
The electronic medical record (EMR) contains a rich source of information that could be harnessed for epidemic surveillance. We asked if structured EMR data could be coupled with computerized processing of free-text clinical entries to enhance detection of acute respiratory infections (ARI).
Methodology
A manual review of EMR records related to 15,377 outpatient visits uncovered 280 reference cases of ARI. We used logistic regression with backward elimination to determine which among candidate structured EMR parameters (diagnostic codes, vital signs and orders for tests, imaging and medications) contributed to the detection of those reference cases. We also developed a computerized free-text search to identify clinical notes documenting at least two non-negated ARI symptoms. We then used heuristics to build case-detection algorithms that best combined the retained structured EMR parameters with the results of the text analysis.
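A minimal sketch of the kind of combined case-detection rule just described, requiring both structured evidence and at least two non-negated symptom mentions in the note text; the field names, code list, symptom lexicon, and negation window are all hypothetical simplifications, not the study's actual algorithm.

```python
# Illustrative sketch only: combine structured EMR parameters with a crude
# free-text search for at least two non-negated ARI symptoms.
ARI_CODES = {"465.9", "466.0", "486"}          # example ICD-9 codes
ARI_SYMPTOMS = {"cough", "fever", "sore throat", "congestion", "myalgia"}
NEGATIONS = ("no ", "denies", "without")

def text_symptom_count(note):
    note = note.lower()
    count = 0
    for symptom in ARI_SYMPTOMS:
        idx = note.find(symptom)
        if idx == -1:
            continue
        window = note[max(0, idx - 20):idx]     # crude pre-mention negation window
        if not any(neg in window for neg in NEGATIONS):
            count += 1
    return count

def detect_ari(visit):
    structured = (bool(ARI_CODES & set(visit.get("icd9", [])))
                  or visit.get("temp_c", 0) >= 38.0
                  or visit.get("cough_rx", False))
    return structured and text_symptom_count(visit.get("note", "")) >= 2

print(detect_ari({"icd9": ["465.9"], "temp_c": 37.2,
                  "note": "Reports cough and sore throat, denies fever."}))   # True
```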
Principal Findings
An adjusted grouping of diagnostic codes identified reference ARI patients with a sensitivity of 79%, a specificity of 96% and a positive predictive value (PPV) of 32%. Of the 21 additional structured clinical parameters considered, two contributed significantly to ARI detection: new prescriptions for cough remedies and elevations in body temperature to at least 38°C. Together with the diagnostic codes, these parameters increased detection sensitivity to 87%, but specificity and PPV declined to 95% and 25%, respectively. Adding text analysis increased sensitivity to 99%, but PPV dropped further to 14%. Algorithms that required satisfying both a query of structured EMR parameters as well as text analysis disclosed PPVs of 52–68% and retained sensitivities of 69–73%.
Conclusion
Structured EMR parameters and free-text analyses can be combined into algorithms that can detect ARI cases with new levels of sensitivity or precision. These results highlight potential paths by which repurposed EMR information could facilitate the discovery of epidemics before they cause mass casualties.
doi:10.1371/journal.pone.0013377
PMCID: PMC2954790  PMID: 20976281
18.  Automatic de-identification of textual documents in the electronic health record: a review of recent research 
Background
In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Institutional Review Board to use data for research purposes, but these requirements can be waived if data are de-identified. For clinical data to be considered de-identified, the HIPAA "Safe Harbor" technique requires 18 data elements (called PHI: Protected Health Information) to be removed. The de-identification of narrative text documents is often realized manually, and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here.
Methods
This review focuses on recently published research (after 1995), and includes relevant publications from bibliographic queries in PubMed, conference proceedings, the ACM Digital Library, and interesting publications referenced in already included papers.
Results
The literature search returned more than 200 publications. The majority focused only on structured data de-identification instead of narrative text, on image de-identification, or described manual de-identification, and were therefore excluded. Finally, 18 publications describing automated text de-identification were selected for detailed analysis of the architecture and methods used, the types of PHI detected and removed, the external resources used, and the types of clinical documents targeted. All text de-identification systems aimed to identify and remove person names, and many included other types of PHI. Most systems used only one or two specific clinical document types, and were mostly based on two different groups of methodologies: pattern matching and machine learning. Many systems combined both approaches for different types of PHI, but the majority relied only on pattern matching, rules, and dictionaries.
Conclusions
In general, methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize. Methods based on machine learning tend to perform better, especially with PHI that is not mentioned in the dictionaries used. Finally, the issues of anonymization, sufficient performance, and "over-scrubbing" are discussed in this publication.
doi:10.1186/1471-2288-10-70
PMCID: PMC2923159  PMID: 20678228
19.  Analysis of False Positive Errors of an Acute Respiratory Infection Text Classifier due to Contextual Features 
Text classifiers have been used for biosurveillance tasks to identify patients with diseases or conditions of interest. When compared to a clinical reference standard of 280 cases of Acute Respiratory Infection (ARI), a text classifier consisting of simple rules and NegEx plus string matching for specific concepts of interest produced 569 (4%) false positive (FP) cases. Using instance-level manual annotation, we estimate the prevalence of contextual attributes and error types leading to FP cases. Errors were due to (1) deletion errors from abbreviations, spelling mistakes, and missing synonyms (57%); (2) insertion errors from templated document structures such as check boxes and lists of signs and symptoms (36%); and (3) substitution errors from irrelevant concepts and alternate meanings for the same word (6%). We demonstrate that specific concept attributes contribute to false positive cases. These results will inform modifications and adaptations to improve text classifier performance.
PMCID: PMC3041533  PMID: 21347150
20.  Natural Language Processing for Lines and Devices in Portable Chest X-Rays 
Radiology reports are unstructured free text documents that describe abnormalities in patients that are visible via imaging modalities such as X-ray. The number of imaging examinations performed in clinical care is enormous, and mining large repositories of radiology reports connected with clinical data such as patient outcomes could enable epidemiological studies, such as correlating the frequency of infections to the presence or length of time medical devices are present in patients. We developed a natural language processing (NLP) system to recognize device mentions in radiology reports and information about their state (insertion or removal) to enable epidemiological research. We tested our system using a reference standard of reports that were annotated to indicate this information. Our system performed with high accuracy (recall and precision of 97% and 99% for device mentions and 91–96% for device insertion status). Our methods are generalizable to other types of radiology reports as well as to other information extraction tasks and could provide the foundation for tools that enable epidemiological research exploration based on mining radiology reports.
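As a rough illustration of pairing a device mention with nearby wording that signals its state, here is a generic sketch; the device and cue lists are made up for the example, and this is not the system evaluated above.

```python
# Illustrative sketch only: find a device mention within a few words of an
# insertion or removal cue and report the inferred state.
import re

DEVICES = r"(?:PICC line|central venous catheter|endotracheal tube|chest tube|pacemaker)"
STATE_CUES = {
    "insertion": r"(?:placed|inserted|positioned|new)",
    "removal":   r"(?:removed|discontinued|withdrawn)",
}

def device_states(report):
    findings = []
    for state, cue in STATE_CUES.items():
        pattern = rf"{DEVICES}\W+(?:\w+\W+){{0,4}}{cue}|{cue}\W+(?:\w+\W+){{0,4}}{DEVICES}"
        for m in re.finditer(pattern, report, flags=re.IGNORECASE):
            findings.append((m.group(0), state))
    return findings

print(device_states("A new right-sided PICC line has been placed; the chest tube was removed."))
# -> [('new right-sided PICC line', 'insertion'), ('chest tube was removed', 'removal')]
```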
PMCID: PMC3041297  PMID: 21347067
21.  Developing a manually annotated clinical document corpus to identify phenotypic information for inflammatory bowel disease 
BMC Bioinformatics  2009;10(Suppl 9):S12.
Background
Natural Language Processing (NLP) systems can be used for specific Information Extraction (IE) tasks such as extracting phenotypic data from the electronic medical record (EMR). These data are useful for translational research and are often found only in free text clinical notes. A key required step for IE is the manual annotation of clinical corpora and the creation of a reference standard for (1) training and validation tasks and (2) to focus and clarify NLP system requirements. These tasks are time consuming, expensive, and require considerable effort on the part of human reviewers.
Methods
Using a set of clinical documents from the VA EMR for a particular use case of interest we identify specific challenges and present several opportunities for annotation tasks. We demonstrate specific methods using an open source annotation tool, a customized annotation schema, and a corpus of clinical documents for patients known to have a diagnosis of Inflammatory Bowel Disease (IBD). We report clinician annotator agreement at the document, concept, and concept attribute level. We estimate concept yield in terms of annotated concepts within specific note sections and document types.
Results
Annotator agreement at the document level for documents that contained concepts of interest for IBD using estimated Kappa statistic (95% CI) was very high at 0.87 (0.82, 0.93). At the concept level, F-measure ranged from 0.61 to 0.83. However, agreement varied greatly at the specific concept attribute level. For this particular use case (IBD), clinical documents producing the highest concept yield per document included GI clinic notes and primary care notes. Within the various types of notes, the highest concept yield was in sections representing patient assessment and history of presenting illness. Ancillary service documents and family history and plan note sections produced the lowest concept yield.
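For reference, the kappa statistic cited above corrects observed agreement for agreement expected by chance; the formula below is the standard definition rather than a detail from the paper:

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

where p_o is the observed proportion of agreement between annotators and p_e is the proportion of agreement expected by chance.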
Conclusion
Challenges include defining and building appropriate annotation schemas, adequately training clinician annotators, and determining the appropriate level of information to be annotated. Opportunities include narrowing the focus of information extraction to use case specific note types and sections, especially in cases where NLP systems will be used to extract information from large repositories of electronic clinical note documents.
doi:10.1186/1471-2105-10-S9-S12
PMCID: PMC2745683  PMID: 19761566
22.  Developing a Manually Annotated Clinical Document Corpus to Identify Phenotypic Information for Inflammatory Bowel Disease 
Background
Natural Language Processing (NLP) systems can be used for specific Information Extraction (IE) tasks such as extracting phenotypic data from the electronic medical record (EMR). These data are useful for translational research and are often found only in free text clinical notes. A key required step for IE is the manual annotation of clinical corpora and the creation of a reference standard for (1) training and validation tasks and (2) to focus and clarify NLP system requirements. These tasks are time consuming, expensive, and require considerable effort on the part of human reviewers.
Methods
Using a set of clinical documents from the VA EMR for a particular use case of interest we identify specific challenges and present several opportunities for annotation tasks. We demonstrate specific methods using an open source annotation tool, a customized annotation schema, and a corpus of clinical documents for patients known to have a diagnosis of Inflammatory Bowel Disease (IBD). We report clinician annotator agreement at the document, concept, and concept attribute level. We estimate concept yield in terms of annotated concepts within specific note sections and document types.
Results
Annotator agreement at the document level for documents that contained concepts of interest for IBD using estimated Kappa statistic (95% CI) was very high at 0.87 (0.82, 0.93). At the concept level, F-measure ranged from 0.61 to 0.83. However, agreement varied greatly at the specific concept attribute level. For this particular use case (IBD), clinical documents producing the highest concept yield per document included GI clinic notes and primary care notes. Within the various types of notes, the highest concept yield was in sections representing patient assessment and history of presenting illness. Ancillary service documents and family history and plan note sections produced the lowest concept yield.
Conclusions
Challenges include defining and building appropriate annotation schemas, adequately training clinician annotators, and determining the appropriate level of information to be annotated. Opportunities include narrowing the focus of information extraction to use case specific note types and sections, especially in cases where NLP systems will be used to extract information from large repositories of electronic clinical note documents.
PMCID: PMC3041557  PMID: 21347157
23.  Inductive Creation of an Annotation Schema and a Reference Standard for De-identification of VA Electronic Clinical Notes 
Accessing both structured and unstructured clinical data is a high priority for research efforts. However, HIPAA requires that data meet or exceed a de-identification standard to assure that protected health information (PHI) is removed. This is a particularly difficult problem in the case of unstructured clinical free text, and natural language processing (NLP) systems can be trained to automatically de-identify clinical text. Moreover, manual human annotation of clinical note documents for the purpose of building reference standards to evaluate NLP systems is a costly and time-consuming process. Annotation schemas must be created that can be used to build reliable and valid reference standards to evaluate NLP systems for the de-identification task. We describe the inductive creation of an annotation schema and subsequent reference standard. We also provide estimates of the accuracy of human annotators for this particular task.
PMCID: PMC2815367  PMID: 20351891
24.  Application of Natural Language Processing to VA Electronic Health Records to Identify Phenotypic Characteristics for Clinical and Research Purposes 
Informatics tools to extract and analyze clinical information on patients have lagged behind data-mining developments in bioinformatics. While the analysis of an individual’s partial or complete genotype is nearly a reality, the phenotypic characteristics that accompany the genotype are not well known and largely inaccessible in free-text patient health records. As the adoption of electronic medical records increases, there exists an urgent need to extract pertinent phenotypic information and make it available to clinicians and researchers. This usually requires the data to be in a structured format that is both searchable and amenable to computation. Using inflammatory bowel disease as an example, this study demonstrates the utility of a natural language processing system (MedLEE) in mining clinical notes in the paperless VA Health Care System. This adaptation of MedLEE is useful for identifying patients with specific clinical conditions, those at risk for such conditions, and those with symptoms suggestive of them.
PMCID: PMC3041527  PMID: 21347124
25.  Optimizing A Syndromic Surveillance Text Classifier for Influenza-like Illness: Does Document Source Matter? 
Syndromic surveillance systems that incorporate electronic free-text data have primarily focused on extracting concepts of interest from chief complaint text, emergency department visit notes, and nurse triage notes. Due to availability and access, there has been limited work in the area of surveilling the full text of all electronic note documents compared with more specific document sources. This study provides an evaluation of the performance of a text classifier for detection of influenza-like illness (ILI) by document sources that are commonly used for biosurveillance by comparing them to routine visit notes, and a full electronic note corpus approach. Evaluating the performance of an automated text classifier for syndromic surveillance by source document will inform decisions regarding electronic textual data sources for potential use by automated biosurveillance systems. Even when a full electronic medical record is available, commonly available surveillance source documents provide acceptable statistical performance for automated ILI surveillance.
PMCID: PMC2655960  PMID: 18999051
