
Results 1-9 (9)

author:("Uzuner, oslem")
1.  Annotating Temporal Information in Clinical Narratives 
Journal of biomedical informatics  2013;46(0):10.1016/j.jbi.2013.07.004.
Temporal information in clinical narratives plays an important role in patients’ diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality.
PMCID: PMC3855581  PMID: 23872518
Natural Language Processing; Temporal Reasoning; Medical Informatics; Corpus Building; Annotation
2.  Evaluating temporal relations in clinical text: 2012 i2b2 Challenge 
The Sixth Informatics for Integrating Biology and the Bedside (i2b2) Natural Language Processing Challenge for Clinical Records focused on the temporal relations in clinical narratives. The organizers provided the research community with a corpus of discharge summaries annotated with temporal information, to be used for the development and evaluation of temporal reasoning systems. Eighteen teams from around the world participated in the challenge. During the workshop, participating teams presented comprehensive reviews and analyses of their systems, and outlined future research directions suggested by the challenge contributions.
The challenge evaluated systems on information extraction tasks that targeted: (1) clinically significant events, including both clinical concepts such as problems, tests, treatments, and clinical departments, and events relevant to the patient's clinical timeline, such as admissions and transfers between departments; (2) temporal expressions, referring to the date, time, duration, or frequency phrases in the clinical text, whose extracted values had to be normalized to an ISO specification standard; and (3) temporal relations between the clinical events and temporal expressions. Participants determined pairs of events and temporal expressions that exhibited a temporal relation, and identified the temporal relation between them.
For event detection, statistical machine learning (ML) methods consistently showed superior performance. While ML and rule-based methods seemed to detect temporal expressions equally well, the best systems overwhelmingly adopted a rule-based approach for value normalization. For temporal relation classification, the systems using hybrid approaches that combined ML and heuristic-based methods produced the best results.
PMCID: PMC3756273  PMID: 23564629
clinical language processing; shared task challenges; temporal reasoning; natural language processing; medical language processing
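The abstract above notes that the best systems used rule-based approaches to normalize temporal expression values to an ISO standard. As a rough illustration only, the following Python sketch shows what such rules might look like for a few surface patterns. The function names (`normalize_date`, `normalize_duration`), the supported patterns, and the choice of ISO 8601 output strings are assumptions made for this example; they are not taken from any challenge system or from the i2b2 annotation guidelines.

```python
import re
from datetime import date

# Hypothetical lookup table: month name -> month number.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Hypothetical mapping from duration units to ISO 8601 designators.
DURATION_UNITS = {"day": "D", "week": "W", "month": "M", "year": "Y"}


def _iso(year, month, day):
    """Return YYYY-MM-DD, or None if the calendar date is invalid."""
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return None


def normalize_date(text):
    """Normalize a few date formats to an ISO 8601 date string.

    Handles 'MM/DD/YYYY' and 'Month D, YYYY'; returns None otherwise.
    """
    text = text.strip().lower()
    match = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", text)
    if match:
        month, day, year = (int(g) for g in match.groups())
        return _iso(year, month, day)
    match = re.fullmatch(r"([a-z]+)\s+(\d{1,2}),?\s+(\d{4})", text)
    if match and match.group(1) in MONTHS:
        return _iso(int(match.group(3)), MONTHS[match.group(1)],
                    int(match.group(2)))
    return None


def normalize_duration(text):
    """Normalize '<n> <unit>(s)' to an ISO 8601 duration such as 'P3W'."""
    match = re.fullmatch(r"(\d+)\s+(day|week|month|year)s?",
                         text.strip().lower())
    if not match:
        return None
    return f"P{int(match.group(1))}{DURATION_UNITS[match.group(2)]}"
```

For instance, `normalize_date("March 5, 2012")` and `normalize_date("03/05/2012")` both yield `"2012-03-05"` under the US month-first assumption made here, and `normalize_duration("3 weeks")` yields `"P3W"`. Real clinical text would also need relative expressions ("two days prior to admission") and frequencies, which this sketch does not attempt.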
3.  Temporal reasoning over clinical text: the state of the art 
To provide an overview of the problem of temporal reasoning over clinical text and to summarize the state of the art in clinical natural language processing for this task.
Target audience
This overview targets medical informatics researchers who are unfamiliar with the problems and applications of temporal reasoning over clinical text.
We review the major applications of text-based temporal reasoning, describe the challenges for software systems handling temporal information in clinical text, and give an overview of the state of the art. Finally, we present some perspectives on future research directions that emerged during the recent community-wide challenge on text-based temporal reasoning in the clinical domain.
PMCID: PMC3756277  PMID: 23676245
Temporal reasoning; Natural language processing; Medical language processing
4.  Evaluating the state of the art in coreference resolution for electronic medical records 
The fifth i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records conducted a systematic review on resolution of noun phrase coreference in medical records. Informatics for Integrating Biology and the Bedside (i2b2) and the Veterans Affairs (VA) Consortium for Healthcare Informatics Research (CHIR) partnered to organize the coreference challenge. They provided the research community with two corpora of medical records for the development and evaluation of the coreference resolution systems. These corpora contained various record types (ie, discharge summaries, pathology reports) from multiple institutions.
The coreference challenge provided the community with two annotated ground truth corpora and evaluated systems on coreference resolution in two ways: first, it evaluated systems for their ability to identify mentions of concepts and to link together those mentions. Second, it evaluated the ability of the systems to link together ground truth mentions that refer to the same entity. Twenty teams representing 29 organizations and nine countries participated in the coreference challenge.
The teams' system submissions showed that machine-learning and rule-based approaches worked best when augmented with external knowledge sources and coreference clues extracted from document structure. The systems performed better in coreference resolution when provided with ground truth mentions. Overall, the systems struggled in solving coreference resolution for cases that required domain knowledge.
PMCID: PMC3422835  PMID: 22366294
5.  Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions 
PMCID: PMC3168329  PMID: 21846785
NLP; other methods of information extraction; natural-language processing; monitoring the health of populations; knowledge representation; knowledge bases
6.  Sentiment Analysis of Suicide Notes: A Shared Task 
Biomedical informatics insights  2012;5(Suppl 1):3-16.
This paper reports on a shared task involving the assignment of emotions to suicide notes. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in a corpus of fully anonymized clinical text and annotated suicide notes. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
PMCID: PMC3299408  PMID: 22419877
Sentiment analysis; suicide; suicide notes; natural language processing; computational linguistics; shared task; challenge 2011
7.  Semantic Relations for Problem-Oriented Medical Records 
We describe semantic relation (SR) classification on medical discharge summaries. We focus on relations targeted to the creation of problem-oriented records. Thus, we define relations that involve the medical problems of patients.
Methods and Materials
We represent patients’ medical problems with their diseases and symptoms. We study the relations of patients’ problems with each other and with concepts that are identified as tests and treatments. We present an SR classifier that studies a corpus of patient records one sentence at a time. For all pairs of concepts that appear in a sentence, this SR classifier determines the relations between them. In doing so, the SR classifier takes advantage of surface, lexical, and syntactic features and uses these features as input to a support vector machine. We apply our SR classifier to two sets of medical discharge summaries, one obtained from the Beth Israel-Deaconess Medical Center (BIDMC), Boston, MA and the other from Partners Healthcare, Boston, MA.
On the BIDMC corpus, our SR classifier achieves micro-averaged F-measures that range from 74% to 95% on the various relation types. On the Partners corpus, the micro-averaged F-measures on the various relation types range from 68% to 91%. Our experiments show that lexical features (in particular, tokens that occur between candidate concepts, which we refer to as inter-concept tokens) are very informative for relation classification in medical discharge summaries. Using only the inter-concept tokens in the corpus, our SR classifier can recognize 84% of the relations in the BIDMC corpus and 72% of the relations in the Partners corpus.
These results are promising for semantic indexing of medical records. They imply that we can take advantage of lexical patterns in discharge summaries for relation classification at a sentence level.
PMCID: PMC2948592  PMID: 20646918
Lexical context; support vector machines; relation classification for the problem-oriented record; medical language processing
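The abstract above relies on two ideas that a small example may make concrete: the inter-concept tokens used as lexical features, and the micro-averaged F-measure used for evaluation. The Python sketch below is illustrative only. The function names, the token-span representation, and the relation labels in the usage note are hypothetical, and the micro-average shown is the common definition that pools counts across classes, which may differ in detail from the paper's exact evaluation setup.

```python
def inter_concept_tokens(tokens, span_a, span_b):
    """Return the tokens lying strictly between two concept spans.

    Each span is a (start, end) pair of token indices, end exclusive.
    The spans may be given in either order; they are assumed not to overlap.
    """
    first, second = sorted([span_a, span_b])
    return tokens[first[1]:second[0]]


def micro_f_measure(counts):
    """Micro-averaged F-measure from per-class (tp, fp, fn) counts.

    Counts are pooled over all classes before computing precision
    and recall, so frequent classes weigh more than rare ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

As a usage sketch, for the tokens of "the pneumonia was treated with antibiotics" with concept spans `(1, 2)` and `(5, 6)`, `inter_concept_tokens` returns `["was", "treated", "with"]`. Such token sequences could then be encoded as features for a support vector machine. Pooled counts such as `{"treats": (8, 2, 2), "causes": (2, 0, 2)}` (made-up labels and numbers) give 10 true positives, 2 false positives, and 4 false negatives, for a micro-averaged F-measure of 20/26.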
8.  Qualitative Analysis of Workflow Modifications Used to Generate the Reference Standard for the 2010 i2b2/VA Challenge 
AMIA Annual Symposium Proceedings  2011;2011:1243-1251.
The Department of Veterans Affairs (VA) and the Informatics for Integrating Biology and the Bedside (i2b2) team partnered to generate the reference standard for the 2010 i2b2/VA challenge task on concept extraction, assertion classification, and relation classification. The purpose of this paper is to report an in-depth qualitative analysis of the experience and perceptions of human annotators for these tasks. Transcripts of semi-structured interviews were analyzed using qualitative methods to identify key constructs and themes related to these annotation tasks. Interventions were embedded with these tasks using pre-annotation of clinical concepts and a modified annotation workflow. From the human perspective, annotation tasks involve an inherent conflict between bias, accuracy, and efficiency. This analysis deepens understanding of the biases, complexities, and impact of variations in the annotation process that may affect annotation task reliability and reference standard validity, and that are generalizable to other similar large-scale clinical corpus annotation projects.
PMCID: PMC3243132  PMID: 22195185
9.  Syntactically-Informed Semantic Category Recognizer for Discharge Summaries 
Semantic category recognition (SCR) contributes to document understanding. Most approaches to SCR fail to make use of syntax. We hypothesize that syntax, if represented appropriately, can improve SCR. We present a statistical semantic category (SC) recognizer trained with syntactic and lexical contextual clues, as well as ontological information from UMLS, to identify eight semantic categories in discharge summaries. Some of our categories, e.g., test results and findings, include complex entries that span multiple phrases. We achieve classification F-measures above 90% for most categories and show that syntactic context is important for SCR.
PMCID: PMC1839398  PMID: 17238434
