Results 1-19 (19)
 

1.  Generalizability and Comparison of Automatic Clinical Text De-Identification Methods and Resources 
In this paper, we present an evaluation of the hybrid best-of-breed automated VHA (Veterans Health Administration) clinical text de-identification system, nicknamed BoB, developed within the VHA Consortium for Healthcare Informatics Research. We also evaluate two available machine learning-based text de-identification systems: MIST and HIDE. Two different clinical corpora were used for this evaluation: a manually annotated VHA corpus, and the 2006 i2b2 de-identification challenge corpus. These experiments focus on the generalizability and portability of the classification models across different document sources. BoB demonstrated good recall (92.6%), satisfactorily prioritizing patient privacy, and also achieved competitive precision (83.6%) for preserving subsequent document interpretability. MIST and HIDE reached very competitive results, in most cases with high precision (92.6% and 93.6%), although recall was sometimes lower than desired for the most sensitive PHI categories.
PMCID: PMC3540471  PMID: 23304289
2.  The Relationship Between Structural Characteristics of 2010 Challenge Documents and Ratings of Document Quality 
The quality of clinical narratives has a direct impact on the perceived usefulness of these documents. With the advent of electronic documentation, the quality of clinical documents has been debated. Electronic documentation is supported by features to enhance efficiency, including copy/paste, templates, multi-level headings, and inserted objects. The impact of these features on perceived document quality has been difficult to assess in real settings as compared to simulations. This study used electronic notes from the 2010 i2b2/VA Challenge to explore the impact of text characteristics on general perception of document quality. We administered a validated instrument to assess document quality, focusing on two dimensions, informativeness and readability. Text characteristics were collected from both subjective ratings and quantitative summary. The results suggested that common clinical elements such as templates, headings and inserted objects had a strong positive association with document quality. Understanding these relationships may prove useful in future EHR design and informatics research.
PMCID: PMC3540529  PMID: 23304359
3.  2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text 
The 2010 i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records presented three tasks: a concept extraction task focused on the extraction of medical concepts from patient reports; an assertion classification task focused on assigning assertion types for medical problem concepts; and a relation classification task focused on assigning relation types that hold between medical problems, tests, and treatments. i2b2 and the VA provided an annotated reference standard corpus for the three tasks. Using this reference standard, 22 systems were developed for concept extraction, 21 for assertion classification, and 16 for relation classification.
These systems showed that machine learning approaches could be augmented with rule-based systems to determine concepts, assertions, and relations. Depending on the task, the rule-based systems can either provide input for machine learning or post-process the output of machine learning. Ensembles of classifiers, information from unlabeled data, and external knowledge sources can help when the training data are inadequate.
doi:10.1136/amiajnl-2011-000203
PMCID: PMC3168320  PMID: 21685143
Information storage and retrieval (text and images); discovery, and text and data mining methods; other methods of information extraction; natural-language processing; automated learning; visualization of data and knowledge; uncertain reasoning and decision theory; languages, and computational methods; statistical analysis of large datasets; advanced algorithms; human-computer interaction and human-centered computing; NLP; machine learning; Informatics
4.  Evaluating current automatic de-identification methods with Veteran’s health administration clinical documents 
Background
The increased use and adoption of Electronic Health Records (EHR) has caused tremendous growth in digital information useful to clinicians and researchers and for many operational purposes. However, this information is rich in Protected Health Information (PHI), which severely restricts its access and possible uses. A number of investigators have developed methods for automatically de-identifying EHR documents by removing PHI, as specified in the Health Insurance Portability and Accountability Act “Safe Harbor” method.
This study focuses on the evaluation of existing automated text de-identification methods and tools, as applied to Veterans Health Administration (VHA) clinical documents, to assess which methods perform better with each category of PHI found in our clinical notes; and when new methods are needed to improve performance.
Methods
We installed and evaluated five text de-identification systems “out-of-the-box” using a corpus of VHA clinical documents. The systems based on machine learning methods were trained with the 2006 i2b2 de-identification corpora and evaluated with our VHA corpus, and also evaluated with a ten-fold cross-validation experiment using our VHA corpus. We counted exact, partial, and fully contained matches with reference annotations, considering each PHI type separately, or only one unique ‘PHI’ category. Performance of the systems was assessed using recall (equivalent to sensitivity) and precision (equivalent to positive predictive value) metrics, as well as the F2-measure.
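The matching and scoring scheme described above can be sketched as follows. This is a minimal illustration, not the study's evaluation code: the `Span` representation, the function names, the greedy one-to-one pairing, and in particular the reading of "fully contained" as "the prediction covers the whole reference annotation" are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def match_type(pred: Span, ref: Span) -> str:
    """Classify how one predicted span relates to one reference span."""
    if pred == ref:
        return "exact"
    # Assumed definition: the prediction covers the entire reference span.
    if pred[0] <= ref[0] and pred[1] >= ref[1]:
        return "fully_contained"
    # Any other overlap counts as a partial match.
    if pred[0] < ref[1] and ref[0] < pred[1]:
        return "partial"
    return "none"


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta measure; beta=2 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def score(preds: List[Span], refs: List[Span],
          accepted: Sequence[str] = ("exact",)) -> Tuple[float, float, float]:
    """Return (precision, recall, F2) with greedy one-to-one span matching."""
    matched = set()
    tp = 0
    for p in preds:
        for i, r in enumerate(refs):
            if i not in matched and match_type(p, r) in accepted:
                matched.add(i)
                tp += 1
                break
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(refs) if refs else 0.0
    return precision, recall, f_beta(precision, recall)


# Hypothetical spans, for illustration only.
refs = [(0, 10), (20, 30), (40, 50)]
preds = [(0, 10), (18, 32), (45, 55), (60, 70)]
strict = score(preds, refs)
lenient = score(preds, refs, accepted=("exact", "fully_contained", "partial"))
```

Accepting looser match types raises both precision and recall on the same predictions, which is why results are reported separately for each match type.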
Results
Overall, systems based on rules and pattern matching achieved better recall, and precision was always better with systems based on machine learning approaches. The highest “out-of-the-box” F2-measure was 67% for partial matches; the best precision and recall were 95% and 78%, respectively. Finally, the ten-fold cross validation experiment allowed for an increase of the F2-measure to 79% with partial matches.
Conclusions
The “out-of-the-box” evaluation of text de-identification systems provided us with compelling insight about the best methods for de-identification of VHA clinical documents. The error analysis demonstrated an important need for customization to PHI formats specific to VHA documents. This study informed the planning and development of a “best-of-breed” automatic de-identification application for VHA clinical text.
doi:10.1186/1471-2288-12-109
PMCID: PMC3445850  PMID: 22839356
Confidentiality, patient data privacy [MeSH F04.096.544.335.240]; Natural language processing [L01.224.065.580]; Health insurance portability and accountability act [N03.219.521.576.343.349]; De-identification; Anonymization; Electronic health records [E05.318.308.940.968.625.500]; United States department of veterans affairs [I01.409.137.500.700]
5.  Sentiment Analysis of Suicide Notes: A Shared Task 
Biomedical informatics insights  2012;5(Suppl 1):3-16.
This paper reports on a shared task involving the assignment of emotions to suicide notes. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in a corpus of fully anonymized clinical text and annotated suicide notes. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
PMCID: PMC3299408  PMID: 22419877
Sentiment analysis; suicide; suicide notes; natural language processing; computational linguistics; shared task; challenge 2011
6.  Sentiment Analysis of Suicide Notes: A Shared Task 
Biomedical Informatics Insights  2012;5(Suppl. 1):3-16.
This paper reports on a shared task involving the assignment of emotions to suicide notes. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in a corpus of fully anonymized clinical text and annotated suicide notes. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
doi:10.4137/BII.S9042
PMCID: PMC3299408  PMID: 22419877
Sentiment analysis; suicide; suicide notes; natural language processing; computational linguistics; shared task; challenge 2011
7.  Automatically Detecting Medications and the Reason for their Prescription in Clinical Narrative Text Documents 
An important proportion of the information about the medications a patient is taking is mentioned only in narrative text in the electronic health record. Automated information extraction can make this information accessible for decision-support, research, or any other automated processing. In the context of the “i2b2 medication extraction challenge,” we have developed a new NLP application called Textractor to automatically extract medications and details about them (e.g., dosage, frequency, reason for their prescription). This application and its evaluation with part of the reference standard for this “challenge” are presented here, along with an analysis of the development of this reference standard. During this evaluation, Textractor reached a system-level overall F1-measure, the reference metric for this challenge, of about 77% for exact matches. The best performance was measured with medication routes (F1-measure 86.4%), and the worst with prescription reasons (F1-measure 29%). These results are consistent with the agreement observed between human annotators when developing the reference standard, and with other published research.
PMCID: PMC3238676  PMID: 20841823
Pharmaceutical Preparations; Drug Prescriptions; Natural Language Processing; Program Evaluation; Knowledge Bases
8.  Textractor: a hybrid system for medications and reason for their prescription extraction from clinical text documents 
Objective
To describe a new medication information extraction system—Textractor—developed for the ‘i2b2 medication extraction challenge’. The development, functionalities, and official evaluation of the system are detailed.
Design
Textractor is based on the Apache Unstructured Information Management Architecture (UIMA) framework, and uses methods that are a hybrid between machine learning and pattern matching. Two modules in the system are based on machine learning algorithms, while other modules use regular expressions, rules, and dictionaries, and one module embeds MetaMap Transfer.
Measurements
The official evaluation was based on a reference standard of 251 discharge summaries annotated by all teams participating in the challenge. The metrics used were recall, precision, and the F1-measure. They were calculated with exact and inexact matches, and were averaged at the level of systems and documents.
Results
The reference metric for this challenge, the system-level overall F1-measure, reached about 77% for exact matches, with a recall of 72% and a precision of 83%. Performance was the best with route information (F1-measure about 86%), and was good for dosage and frequency information, with F1-measures of about 82–85%. Results were not as good for durations, with F1-measures of 36–39%, and for reasons, with F1-measures of 24–27%.
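As a quick check on the figures above, the F1-measure is the harmonic mean of precision and recall, and a recall of 72% with a precision of 83% does give roughly 77%. The sketch below also contrasts two ways of averaging over documents. Mapping "system-level" to pooled counts and "document-level" to a mean of per-document scores is an assumption made for illustration, and the per-document counts are hypothetical.

```python
from typing import List, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)


def pooled_f1(docs: List[Counts]) -> float:
    """Assumed 'system-level' score: pool counts over all documents first."""
    tp = sum(d[0] for d in docs)
    fp = sum(d[1] for d in docs)
    fn = sum(d[2] for d in docs)
    return prf(tp, fp, fn)[2]


def mean_document_f1(docs: List[Counts]) -> float:
    """Assumed 'document-level' score: average the per-document F1 values."""
    return sum(prf(*d)[2] for d in docs) / len(docs)


# Reported precision and recall from the abstract.
reported_f1 = f1_from(0.83, 0.72)

# Hypothetical per-document counts, for illustration only.
docs = [(8, 2, 2), (1, 1, 3)]
system_score = pooled_f1(docs)
document_score = mean_document_f1(docs)
```

Pooling lets long documents with many annotations dominate the score, while averaging per document gives every document equal weight, so the two numbers can differ noticeably.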
Conclusion
The official evaluation of Textractor for the i2b2 medication extraction challenge demonstrated satisfactory performance. This system was among the 10 best performing systems in this challenge.
doi:10.1136/jamia.2010.004028
PMCID: PMC2995680  PMID: 20819864
9.  Correction: Combining Free Text and Structured Electronic Medical Record Entries to Detect Acute Respiratory Infections 
PLoS ONE  2011;6(1):10.1371/annotation/8a0d067d-5bf7-4e2e-b7d2-56444560d66d.
doi:10.1371/annotation/8a0d067d-5bf7-4e2e-b7d2-56444560d66d
PMCID: PMC3021490
10.  Qualitative Analysis of Workflow Modifications Used to Generate the Reference Standard for the 2010 i2b2/VA Challenge 
AMIA Annual Symposium Proceedings  2011;2011:1243-1251.
The Department of Veterans Affairs (VA) and the Informatics for Integrating Biology and the Bedside (i2b2) team partnered to generate the reference standard for the 2010 i2b2/VA challenge task on concept extraction, assertion classification, and relation classification. The purpose of this paper is to report an in-depth qualitative analysis of the experience and perceptions of human annotators for these tasks. Transcripts of semi-structured interviews were analyzed using qualitative methods to identify key constructs and themes related to these annotation tasks. Interventions were embedded with these tasks using pre-annotation of clinical concepts and a modified annotation workflow. From the human perspective, annotation tasks involve an inherent conflict between bias, accuracy, and efficiency. This analysis deepens understanding of the biases, complexities and impact of variations in the annotation process that may affect annotation task reliability and reference standard validity that are generalizable for other similar large-scale clinical corpus annotation projects.
PMCID: PMC3243132  PMID: 22195185
11.  Combining Free Text and Structured Electronic Medical Record Entries to Detect Acute Respiratory Infections 
PLoS ONE  2010;5(10):e13377.
Background
The electronic medical record (EMR) contains a rich source of information that could be harnessed for epidemic surveillance. We asked if structured EMR data could be coupled with computerized processing of free-text clinical entries to enhance detection of acute respiratory infections (ARI).
Methodology
A manual review of EMR records related to 15,377 outpatient visits uncovered 280 reference cases of ARI. We used logistic regression with backward elimination to determine which among candidate structured EMR parameters (diagnostic codes, vital signs and orders for tests, imaging and medications) contributed to the detection of those reference cases. We also developed a computerized free-text search to identify clinical notes documenting at least two non-negated ARI symptoms. We then used heuristics to build case-detection algorithms that best combined the retained structured EMR parameters with the results of the text analysis.
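The idea of flagging notes that document at least two non-negated symptoms can be illustrated with a deliberately simple sketch. It is not the study's search algorithm: the symptom list, the negation cues, and the five-token look-back window are all hypothetical choices made for this example.

```python
import re
from typing import Set

# Hypothetical vocabularies, for illustration only.
SYMPTOMS = ("cough", "fever", "sore throat", "congestion")
NEGATION_CUES = ("no", "not", "denies", "without")
WINDOW = 5  # number of tokens before a symptom that are checked for a cue


def non_negated_symptoms(note: str) -> Set[str]:
    """Return the symptoms mentioned without a preceding negation cue."""
    found = set()
    # Treat each sentence separately so a cue cannot negate across sentences.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for symptom in SYMPTOMS:
            words = symptom.split()
            n = len(words)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] == words:
                    preceding = tokens[max(0, i - WINDOW):i]
                    if not any(cue in preceding for cue in NEGATION_CUES):
                        found.add(symptom)
    return found


def flags_ari(note: str) -> bool:
    """Flag a note that documents at least two non-negated symptoms."""
    return len(non_negated_symptoms(note)) >= 2
```

A fixed look-back window is crude: it misses negations that follow the symptom and can wrongly negate symptoms after a long intervening phrase, which is one reason dedicated negation algorithms exist.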
Principal Findings
An adjusted grouping of diagnostic codes identified reference ARI patients with a sensitivity of 79%, a specificity of 96% and a positive predictive value (PPV) of 32%. Of the 21 additional structured clinical parameters considered, two contributed significantly to ARI detection: new prescriptions for cough remedies and elevations in body temperature to at least 38°C. Together with the diagnostic codes, these parameters increased detection sensitivity to 87%, but specificity and PPV declined to 95% and 25%, respectively. Adding text analysis increased sensitivity to 99%, but PPV dropped further to 14%. Algorithms that required satisfying both a query of structured EMR parameters as well as text analysis disclosed PPVs of 52–68% and retained sensitivities of 69–73%.
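The three measures reported above all derive from a 2x2 confusion matrix. In the sketch below, only the totals (15,377 visits, 280 reference cases) come from the abstract. The split into true and false positives and negatives is illustrative, chosen to land near the reported rounded percentages, and is not the study's actual table.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, and positive predictive value from counts."""
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases that are detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly left alone
        "ppv": tp / (tp + fp),           # share of alerts that are true cases
    }


# Illustrative counts (assumed): 280 cases among 15,377 visits.
counts = {"tp": 221, "fn": 59, "fp": 470, "tn": 14627}
metrics = diagnostic_metrics(**counts)
prevalence = (counts["tp"] + counts["fn"]) / sum(counts.values())
```

With a prevalence of about 1.8%, even a specificity near 97% leaves more false alerts than true ones, which is why PPV stays low and falls further as sensitivity is pushed up.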
Conclusion
Structured EMR parameters and free-text analyses can be combined into algorithms that can detect ARI cases with new levels of sensitivity or precision. These results highlight potential paths by which repurposed EMR information could facilitate the discovery of epidemics before they cause mass casualties.
doi:10.1371/journal.pone.0013377
PMCID: PMC2954790  PMID: 20976281
12.  Automatic de-identification of textual documents in the electronic health record: a review of recent research 
Background
In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Institutional Review Board to use data for research purposes, but these requirements can be waived if data is de-identified. For clinical data to be considered de-identified, the HIPAA "Safe Harbor" technique requires 18 data elements (called PHI: Protected Health Information) to be removed. The de-identification of narrative text documents is often realized manually, and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here.
Methods
This review focuses on recently published research (after 1995), and includes relevant publications from bibliographic queries in PubMed, conference proceedings, the ACM Digital Library, and interesting publications referenced in already included papers.
Results
The literature search returned more than 200 publications. The majority focused only on structured data de-identification instead of narrative text, on image de-identification, or described manual de-identification, and were therefore excluded. Finally, 18 publications describing automated text de-identification were selected for detailed analysis of the architecture and methods used, the types of PHI detected and removed, the external resources used, and the types of clinical documents targeted. All text de-identification systems aimed to identify and remove person names, and many included other types of PHI. Most systems used only one or two specific clinical document types, and were mostly based on two different groups of methodologies: pattern matching and machine learning. Many systems combined both approaches for different types of PHI, but the majority relied only on pattern matching, rules, and dictionaries.
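The pattern-matching and dictionary approach described above can be sketched in a few lines. This is a toy illustration, not any of the reviewed systems: the regular expressions, the tag names, and the tiny name dictionary are all hypothetical, and real systems cover far more PHI types and formats.

```python
import re

# Hypothetical patterns for a few PHI types, applied in order.
PATTERNS = [
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
]

# Hypothetical dictionary of known surnames (lowercase).
NAME_DICTIONARY = {"smith", "jones"}

WORD = re.compile(r"\b[A-Za-z]+\b")


def deidentify(text: str) -> str:
    """Replace matched PHI with bracketed category tags."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)

    def replace_name(match: "re.Match[str]") -> str:
        word = match.group(0)
        return "[NAME]" if word.lower() in NAME_DICTIONARY else word

    return WORD.sub(replace_name, text)


example = "Mr. Smith seen on 3/14/2009, call 801-555-0123, SSN 123-45-6789."
scrubbed = deidentify(example)
```

The sketch also shows the weakness noted in the review: a name missing from the dictionary, or a date written in an unanticipated format, passes through untouched.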
Conclusions
In general, methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize. Methods based on machine learning tend to perform better, especially with PHI that is not mentioned in the dictionaries used. Finally, the issues of anonymization, sufficient performance, and "over-scrubbing" are discussed in this publication.
doi:10.1186/1471-2288-10-70
PMCID: PMC2923159  PMID: 20678228
13.  Analysis of False Positive Errors of an Acute Respiratory Infection Text Classifier due to Contextual Features 
Text classifiers have been used for biosurveillance tasks to identify patients with diseases or conditions of interest. When compared to a clinical reference standard of 280 cases of Acute Respiratory Infection (ARI), a text classifier consisting of simple rules and NegEx plus string matching for specific concepts of interest produced 569 (4%) false positive (FP) cases. Using instance level manual annotation we estimate the prevalence of contextual attributes and error types leading to FP cases. Errors were due to (1) deletion errors from abbreviations, spelling mistakes and missing synonyms (57%); (2) insertion errors from templated document structures such as check boxes, and lists of signs and symptoms (36%); and (3) substitution errors from irrelevant concepts and alternate meanings for the same word (6%). We demonstrate that specific concept attributes contribute to false positive cases. These results will inform modifications and adaptations to improve text classifier performance.
PMCID: PMC3041533  PMID: 21347150
14.  Natural Language Processing for Lines and Devices in Portable Chest X-Rays 
Radiology reports are unstructured free text documents that describe abnormalities in patients that are visible via imaging modalities such as X-ray. The number of imaging examinations performed in clinical care is enormous, and mining large repositories of radiology reports connected with clinical data such as patient outcomes could enable epidemiological studies, such as correlating the frequency of infections to the presence or length of time medical devices are present in patients. We developed a natural language processing (NLP) system to recognize device mentions in radiology reports and information about their state (insertion or removal) to enable epidemiological research. We tested our system using a reference standard of reports that were annotated to indicate this information. Our system performed with high accuracy (recall and precision of 97% and 99% for device mentions and 91–96% for device insertion status). Our methods are generalizable to other types of radiology reports as well as to other information extraction tasks and could provide the foundation for tools that enable epidemiological research exploration based on mining radiology reports.
PMCID: PMC3041297  PMID: 21347067
15.  Developing a manually annotated clinical document corpus to identify phenotypic information for inflammatory bowel disease 
BMC Bioinformatics  2009;10(Suppl 9):S12.
Background
Natural Language Processing (NLP) systems can be used for specific Information Extraction (IE) tasks such as extracting phenotypic data from the electronic medical record (EMR). These data are useful for translational research and are often found only in free text clinical notes. A key required step for IE is the manual annotation of clinical corpora and the creation of a reference standard for (1) training and validation tasks and (2) to focus and clarify NLP system requirements. These tasks are time consuming, expensive, and require considerable effort on the part of human reviewers.
Methods
Using a set of clinical documents from the VA EMR for a particular use case of interest we identify specific challenges and present several opportunities for annotation tasks. We demonstrate specific methods using an open source annotation tool, a customized annotation schema, and a corpus of clinical documents for patients known to have a diagnosis of Inflammatory Bowel Disease (IBD). We report clinician annotator agreement at the document, concept, and concept attribute level. We estimate concept yield in terms of annotated concepts within specific note sections and document types.
Results
Annotator agreement at the document level for documents that contained concepts of interest for IBD using estimated Kappa statistic (95% CI) was very high at 0.87 (0.82, 0.93). At the concept level, F-measure ranged from 0.61 to 0.83. However, agreement varied greatly at the specific concept attribute level. For this particular use case (IBD), clinical documents producing the highest concept yield per document included GI clinic notes and primary care notes. Within the various types of notes, the highest concept yield was in sections representing patient assessment and history of presenting illness. Ancillary service documents and family history and plan note sections produced the lowest concept yield.
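For readers unfamiliar with the agreement statistic, the sketch below computes Cohen's kappa for two annotators. The abstract does not say which kappa estimator was used, so treating it as Cohen's kappa is an assumption, and the labels are hypothetical document-level judgments (1 = contains a concept of interest).

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled independently at their own rates.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical document-level labels from two annotators.
annotator_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 8 of 10 documents, but half of that agreement is expected by chance, so kappa is 0.6 rather than 0.8.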
Conclusion
Challenges include defining and building appropriate annotation schemas, adequately training clinician annotators, and determining the appropriate level of information to be annotated. Opportunities include narrowing the focus of information extraction to use case specific note types and sections, especially in cases where NLP systems will be used to extract information from large repositories of electronic clinical note documents.
doi:10.1186/1471-2105-10-S9-S12
PMCID: PMC2745683  PMID: 19761566
16.  Developing a Manually Annotated Clinical Document Corpus to Identify Phenotypic Information for Inflammatory Bowel Disease 
Background
Natural Language Processing (NLP) systems can be used for specific Information Extraction (IE) tasks such as extracting phenotypic data from the electronic medical record (EMR). These data are useful for translational research and are often found only in free text clinical notes. A key required step for IE is the manual annotation of clinical corpora and the creation of a reference standard for (1) training and validation tasks and (2) to focus and clarify NLP system requirements. These tasks are time consuming, expensive, and require considerable effort on the part of human reviewers.
Methods
Using a set of clinical documents from the VA EMR for a particular use case of interest we identify specific challenges and present several opportunities for annotation tasks. We demonstrate specific methods using an open source annotation tool, a customized annotation schema, and a corpus of clinical documents for patients known to have a diagnosis of Inflammatory Bowel Disease (IBD). We report clinician annotator agreement at the document, concept, and concept attribute level. We estimate concept yield in terms of annotated concepts within specific note sections and document types.
Results
Annotator agreement at the document level for documents that contained concepts of interest for IBD using estimated Kappa statistic (95% CI) was very high at 0.87 (0.82, 0.93). At the concept level, F-measure ranged from 0.61 to 0.83. However, agreement varied greatly at the specific concept attribute level. For this particular use case (IBD), clinical documents producing the highest concept yield per document included GI clinic notes and primary care notes. Within the various types of notes, the highest concept yield was in sections representing patient assessment and history of presenting illness. Ancillary service documents and family history and plan note sections produced the lowest concept yield.
Conclusions
Challenges include defining and building appropriate annotation schemas, adequately training clinician annotators, and determining the appropriate level of information to be annotated. Opportunities include narrowing the focus of information extraction to use case specific note types and sections, especially in cases where NLP systems will be used to extract information from large repositories of electronic clinical note documents.
PMCID: PMC3041557  PMID: 21347157
17.  Inductive Creation of an Annotation Schema and a Reference Standard for De-identification of VA Electronic Clinical Notes 
Accessing both structured and unstructured clinical data is a high priority for research efforts. However, HIPAA requires that data meet or exceed a de-identification standard to assure that protected health information (PHI) is removed. This is a particularly difficult problem in the case of unstructured clinical free text, and natural language processing (NLP) systems can be trained to automatically de-identify clinical text. Moreover, manual human annotation of clinical note documents for the purpose of building reference standards to evaluate NLP systems is a costly and time-consuming process. Annotation schemas must be created that can be used to build reliable and valid reference standards to evaluate NLP systems for the de-identification task. We describe the inductive creation of an annotation schema and subsequent reference standard. We also provide estimates of the accuracy of human annotators for this particular task.
PMCID: PMC2815367  PMID: 20351891
18.  Application of Natural Language Processing to VA Electronic Health Records to Identify Phenotypic Characteristics for Clinical and Research Purposes 
Informatics tools to extract and analyze clinical information on patients have lagged behind data-mining developments in bioinformatics. While the analysis of an individual’s partial or complete genotype is nearly a reality, the phenotypic characteristics that accompany the genotype are not well known and largely inaccessible in free-text patient health records. As the adoption of electronic medical records increases, there exists an urgent need to extract pertinent phenotypic information and make that available to clinicians and researchers. This usually requires the data to be in a structured format that is both searchable and amenable to computation. Using inflammatory bowel disease as an example, this study demonstrates the utility of a natural language processing system (MedLEE) in mining clinical notes in the paperless VA Health Care System. This adaptation of MedLEE is useful for identifying patients with specific clinical conditions, those at risk for or those with symptoms suggestive of those conditions.
PMCID: PMC3041527  PMID: 21347124
19.  Optimizing A Syndromic Surveillance Text Classifier for Influenza-like Illness: Does Document Source Matter? 
Syndromic surveillance systems that incorporate electronic free-text data have primarily focused on extracting concepts of interest from chief complaint text, emergency department visit notes, and nurse triage notes. Due to availability and access, there has been limited work in the area of surveilling the full text of all electronic note documents compared with more specific document sources. This study provides an evaluation of the performance of a text classifier for detection of influenza-like illness (ILI) by document sources that are commonly used for biosurveillance by comparing them to routine visit notes, and a full electronic note corpus approach. Evaluating the performance of an automated text classifier for syndromic surveillance by source document will inform decisions regarding electronic textual data sources for potential use by automated biosurveillance systems. Even when a full electronic medical record is available, commonly available surveillance source documents provide acceptable statistical performance for automated ILI surveillance.
PMCID: PMC2655960  PMID: 18999051
