Related Articles

1.  Parsing clinical text: how good are the state-of-the-art parsers? 
Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain.
In this study, we investigated the performance of three state-of-the-art parsers (the Stanford parser, the Bikel parser, and the Charniak parser) using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which was developed from pathology notes and clinical notes and contains 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then, we re-trained the parsers using the clinical Treebanks and evaluated their performance using 10-fold cross validation. Finally, we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank.
Our results showed that the original parsers achieved lower performance on clinical text (Bracketing F-measure in the range of 66.6%-70.3%) than on general English text. After retraining on the clinical Treebanks, all parsers achieved better performance under 10-fold cross validation; the Stanford parser performed best, reaching a Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus. When the clinical Treebanks were combined with the Penn Treebank, the Charniak parser achieved the highest Bracketing F-measure of the three parsers on progress notes (73.53%), and the Stanford parser reached the highest F-measure on the MiPACQ corpus (84.15%).
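The Bracketing F-measure reported above scores a parse by the labeled constituent spans it shares with the gold-standard tree. The following is a minimal Python sketch of that idea, not the evaluation tool used in the study. It assumes trees are nested tuples of the form (label, child, ...), skips part-of-speech nodes, and uses an invented example sentence and invented trees.

```python
from collections import Counter


def brackets(tree, start=0):
    """Collect labeled spans (label, start, end) from a nested-tuple tree.

    Returns the spans as a Counter plus the index just past the last word.
    Part-of-speech nodes (a label over a single word) are not counted.
    """
    label, *children = tree
    spans = Counter()
    pos = start
    for child in children:
        if isinstance(child, str):      # a word: advance by one token
            pos += 1
        else:                           # a subtree: recurse
            child_spans, pos = brackets(child, pos)
            spans += child_spans
    is_pos_node = len(children) == 1 and isinstance(children[0], str)
    if not is_pos_node:
        spans[(label, start, pos)] += 1
    return spans, pos


def bracketing_f1(gold, predicted):
    """Return (precision, recall, F1) over labeled brackets."""
    gold_spans, _ = brackets(gold)
    pred_spans, _ = brackets(predicted)
    matched = sum((gold_spans & pred_spans).values())
    precision = matched / sum(pred_spans.values())
    recall = matched / sum(gold_spans.values())
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Hypothetical gold and predicted parses of "patient denies chest pain".
GOLD = ("S", ("NP", ("NN", "patient")),
        ("VP", ("VBZ", "denies"),
         ("NP", ("NN", "chest"), ("NN", "pain"))))
PREDICTED = ("S", ("NP", ("NN", "patient")),
             ("VP", ("VBZ", "denies"),
              ("NP", ("NN", "chest")), ("NP", ("NN", "pain"))))

precision, recall, f1 = bracketing_f1(GOLD, PREDICTED)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case the predicted tree splits "chest pain" into two noun phrases, so three of its five brackets match the four gold brackets.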
Our study demonstrates that re-training using clinical Treebanks is critical for improving general English parsers' performance on clinical text, and combining clinical and open domain corpora might achieve optimal performance for parsing clinical text.
PMCID: PMC4460747  PMID: 26045009
Medical language processing; natural language processing; parsing; clinical text; NLP
2.  Benchmarking natural-language parsers for biological applications using dependency graphs 
BMC Bioinformatics  2007;8:24.
Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria.
Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations.
Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques.
PMCID: PMC1797812  PMID: 17254351
3.  Large Scale Application of Neural Network Based Semantic Role Labeling for Automated Relation Extraction from Biomedical Texts 
PLoS ONE  2009;4(7):e6393.
To reduce the increasing amount of time spent on literature search in the life sciences, several methods for automated knowledge extraction have been developed. Co-occurrence based approaches can deal with large text corpora like MEDLINE in an acceptable time but are not able to extract any specific type of semantic relation. Semantic relation extraction methods based on syntax trees, on the other hand, are computationally expensive and the interpretation of the generated trees is difficult. Several natural language processing (NLP) approaches for the biomedical domain exist focusing specifically on the detection of a limited set of relation types. For systems biology, generic approaches for the detection of a multitude of relation types which in addition are able to process large text corpora are needed but the number of systems meeting both requirements is very limited. We introduce the use of SENNA (“Semantic Extraction using a Neural Network Architecture”), a fast and accurate neural network based Semantic Role Labeling (SRL) program, for the large scale extraction of semantic relations from the biomedical literature. A comparison of processing times of SENNA and other SRL systems or syntactical parsers used in the biomedical domain revealed that SENNA is the fastest Proposition Bank (PropBank) conforming SRL program currently available. 89 million biomedical sentences were tagged with SENNA on a 100 node cluster within three days. The accuracy of the presented relation extraction approach was evaluated on two test sets of annotated sentences resulting in precision/recall values of 0.71/0.43. We show that the accuracy as well as processing speed of the proposed semantic relation extraction approach is sufficient for its large scale application on biomedical text. The proposed approach is highly generalizable regarding the supported relation types and appears to be especially suited for general-purpose, broad-scale text mining systems. 
The presented approach bridges the gap between fast, cooccurrence-based approaches lacking semantic relations and highly specialized and computationally demanding NLP approaches.
PMCID: PMC2712690  PMID: 19636432
4.  Discriminative and informative features for biomolecular text mining with ensemble feature selection 
Bioinformatics  2010;26(18):i554-i560.
Motivation: In the field of biomolecular text mining, black box behavior of machine learning systems currently limits understanding of the true nature of the predictions. However, feature selection (FS) is capable of identifying the most relevant features in any supervised learning setting, providing insight into the specific properties of the classification algorithm. This allows us to build more accurate classifiers while at the same time bridging the gap between the black box behavior and the end-user who has to interpret the results.
Results: We show that our FS methodology successfully discards a large fraction of machine-generated features, improving classification performance of state-of-the-art text mining algorithms. Furthermore, we illustrate how FS can be applied to gain understanding in the predictions of a framework for biomolecular event extraction from text. We include numerous examples of highly discriminative features that model either biological reality or common linguistic constructs. Finally, we discuss a number of insights from our FS analyses that will provide the opportunity to considerably improve upon current text mining tools.
Availability: The FS algorithms and classifiers are available in Java-ML ( The datasets are publicly available from the BioNLP'09 Shared Task web site (
PMCID: PMC2935429  PMID: 20823321
5.  Syntactic Dependency Parsers for Biomedical-NLP 
Syntactic parsers have made a leap in accuracy and speed in recent years. The high-order structural information provided by dependency parsers is useful for a variety of NLP applications. We present a biomedical model for the EasyFirst parser, a fast and accurate parser for creating Stanford Dependencies. We evaluate the biomedical-domain models of EasyFirst and Clear-Parser on a number of task-oriented metrics. Both parsers provide state-of-the-art speed and an accuracy of over 89% on Genia. We show that Clear-Parser excels at tasks relating to negation identification, while EasyFirst excels at tasks relating to Named Entities and is more robust to changes in domain.
PMCID: PMC3540535  PMID: 23304280
6.  Ontology design patterns to disambiguate relations between genes and gene products in GENIA 
Journal of Biomedical Semantics  2011;2(Suppl 5):S1.
Annotated reference corpora play an important role in biomedical information extraction. A semantic annotation of the natural language texts in these reference corpora using formal ontologies is challenging due to the inherent ambiguity of natural language. The provision of formal definitions and axioms for semantic annotations offers the means for ensuring consistency as well as enables the development of verifiable annotation guidelines. Consistent semantic annotations facilitate the automatic discovery of new information through deductive inferences.
We provide a formal characterization of the relations used in the recent GENIA corpus annotations. For this purpose, we both select existing axiom systems based on the desired properties of the relations within the domain and develop new axioms for several relations. To apply this ontology of relations to the semantic annotation of text corpora, we implement two ontology design patterns. In addition, we provide a software application to convert annotated GENIA abstracts into OWL ontologies by combining both the ontology of relations and the design patterns. As a result, the GENIA abstracts become available as OWL ontologies and are amenable for automated verification, deductive inferences and other knowledge-based applications.
Documentation, implementation and examples are available from
PMCID: PMC3239299  PMID: 22166341
7.  Event extraction for DNA methylation 
Journal of Biomedical Semantics  2011;2(Suppl 5):S2.
We consider the task of automatically extracting DNA methylation events from the biomedical domain literature. DNA methylation is a key mechanism of epigenetic control of gene expression and implicated in many cancers, but there has been little study of automatic information extraction for DNA methylation.
We present an annotation scheme for DNA methylation following the representation of the BioNLP shared task on event extraction, select a set of 200 abstracts including a representative sample of all PubMed citations relevant to DNA methylation, and introduce manual annotation for this corpus marking nearly 3000 gene/protein mentions and 1500 DNA methylation and demethylation events. We retrain a state-of-the-art event extraction system on the corpus and find that automatic extraction of DNA methylation events, the methylated genes, and their methylation sites can be performed at 78% precision and 76% recall.
Our results demonstrate that reliable extraction methods for DNA methylation events can be created through corpus annotation and straightforward retraining of a general event extraction system. The introduced resources are freely available for use in research from the GENIA project homepage
PMCID: PMC3239302  PMID: 22166595
8.  PPInterFinder—a mining tool for extracting causal relations on human proteins from literature 
One of the most common and challenging problems in biomedical text mining is to mine protein–protein interactions (PPIs) from MEDLINE abstracts and full-text research articles, because PPIs play a major role in understanding various biological processes and the impact of proteins in diseases. We implemented PPInterFinder, a web-based text mining tool to extract human PPIs from biomedical literature. PPInterFinder uses relation keyword co-occurrences with protein names to extract information on PPIs from MEDLINE abstracts and consists of three phases. First, it identifies the relation keyword using a parser with Tregex and a relation keyword dictionary. Next, it automatically identifies the candidate PPI pairs with a set of rules related to PPI recognition. Finally, it extracts the relations by matching the sentence with a set of 11 specific patterns based on the syntactic nature of the PPI pair. We find that PPInterFinder is capable of predicting PPIs with an accuracy of 66.05% on the AIMED corpus and outperforms most of the existing systems.
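The keyword co-occurrence idea behind the three phases can be illustrated with a toy Python sketch. This is not PPInterFinder's implementation: the keyword and protein dictionaries are hypothetical, no parser or Tregex is involved, and a single "protein, keyword, protein" pattern stands in for the 11 patterns, which the abstract does not list.

```python
import re
from itertools import combinations

# Hypothetical dictionaries, for illustration only.
RELATION_KEYWORDS = {"binds", "interacts", "activates", "phosphorylates"}
PROTEIN_NAMES = {"BRCA1", "RAD51", "TP53", "MDM2"}


def candidate_ppi(sentence):
    """Return (protein, keyword, protein) triples found in one sentence.

    A triple is emitted when a relation keyword occurs between two
    protein mentions (one assumed pattern out of many possible ones).
    """
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    proteins = [(i, t) for i, t in enumerate(tokens) if t in PROTEIN_NAMES]
    keywords = [(i, t.lower()) for i, t in enumerate(tokens)
                if t.lower() in RELATION_KEYWORDS]
    triples = []
    for (i, first), (j, second) in combinations(proteins, 2):
        for k, keyword in keywords:
            if i < k < j:               # keyword sits between the pair
                triples.append((first, keyword, second))
    return triples


print(candidate_ppi("MDM2 binds TP53 in vitro."))
```

A sentence that mentions two proteins without a relation keyword between them yields no candidate, which is the filtering effect the co-occurrence step is meant to have.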
Database URL:
PMCID: PMC3548331  PMID: 23325628
9.  Extracting noun phrases for all of MEDLINE. 
A natural language parser that could extract noun phrases for all medical texts would be of great utility in analyzing content for information retrieval. We discuss the extraction of noun phrases from MEDLINE, using a general parser not tuned specifically for any medical domain. The noun phrase extractor is made up of three modules: tokenization; part-of-speech tagging; noun phrase identification. Using our program, we extracted noun phrases from the entire MEDLINE collection, encompassing 9.3 million abstracts. Over 270 million noun phrases were generated, of which 45 million were unique. The quality of these phrases was evaluated by examining all phrases from a sample collection of abstracts. The precision and recall of the phrases from our general parser compared favorably with those from three other parsers we had previously evaluated. We are continuing to improve our parser and evaluate our claim that a generic parser can effectively extract all the different phrases across the entire medical literature.
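The three-module design described above (tokenization, part-of-speech tagging, noun phrase identification) can be sketched in a few lines of Python. This is an illustrative toy, not the authors' program: the lexicon is hypothetical, unknown words are assumed to be nouns, and a noun phrase is taken to be a run of adjectives and nouns that ends in a noun.

```python
import re

# Hypothetical mini-lexicon; a real tagger would be statistical or much larger.
LEXICON = {
    "the": "DT", "a": "DT",
    "chronic": "JJ", "renal": "JJ",
    "patient": "NN", "failure": "NN",
    "developed": "VBD",
}


def tokenize(text):
    """Module 1: split text into lowercase word tokens."""
    return re.findall(r"[A-Za-z]+", text.lower())


def tag(tokens):
    """Module 2: look up a tag per token (unknown words default to NN)."""
    return [(token, LEXICON.get(token, "NN")) for token in tokens]


def noun_phrases(tagged):
    """Module 3: emit runs of JJ/NN tokens that end in a noun."""
    phrases, run = [], []
    for word, pos in tagged + [("", "END")]:    # sentinel flushes the last run
        if pos in ("JJ", "NN"):
            run.append((word, pos))
            continue
        while run and run[-1][1] != "NN":       # drop trailing adjectives
            run.pop()
        if run:
            phrases.append(" ".join(w for w, _ in run))
        run = []
    return phrases


sentence = "The patient developed chronic renal failure."
print(noun_phrases(tag(tokenize(sentence))))
```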
PMCID: PMC2232564  PMID: 10566444
10.  Classifying protein-protein interaction articles using word and syntactic features 
BMC Bioinformatics  2011;12(Suppl 8):S9.
Identifying protein-protein interactions (PPIs) from literature is an important step in mining the function of individual proteins as well as their biological network. Since it is known that PPIs have distinctive patterns in text, machine learning approaches have been successfully applied to mine these patterns. However, the complex nature of PPI description makes the extraction process difficult.
Our approach utilizes both word and syntactic features to effectively capture PPI patterns from biomedical literature. The proposed method automatically identifies gene names by a Priority Model, then extracts grammar relations using a dependency parser. A large margin classifier with Huber loss function learns from the extracted features, and unknown articles are predicted using this data-driven model. For the BioCreative III ACT evaluation, our official runs were ranked in top positions by obtaining maximum 89.15% accuracy, 61.42% F1 score, 0.55306 MCC score, and 67.98% AUC iP/R score.
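Of the metrics listed above, the MCC (Matthews correlation coefficient) is probably the least familiar. A minimal Python sketch of its standard definition from a binary confusion matrix follows; the counts in the usage line are invented and are not the BioCreative III results.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction). Returns 0.0 when undefined.
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for an article classifier.
print(round(mcc(tp=40, tn=45, fp=5, fn=10), 3))
```

Unlike accuracy, the MCC stays low when a classifier simply favors the majority class, which is one reason it is reported alongside accuracy for imbalanced article-filtering tasks.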
Even though problems still remain, utilizing syntactic information for article-level filtering helps improve PPI ranking performance. The proposed system is a revision of previously developed algorithms in our group for the ACT evaluation. Our approach is valuable in showing how to use grammatical relations for PPI article filtering, in particular, with a limited training corpus. While current performance is far from satisfactory as an annotation tool, it is already useful for a PPI article search engine since users are mainly focused on highly-ranked results.
PMCID: PMC3269944  PMID: 22151252
11.  Improved Identification of Noun Phrases in Clinical Radiology Reports Using a High-Performance Statistical Natural Language Parser Augmented with the UMLS Specialist Lexicon 
Objective: The aim of this study was to develop and evaluate a method of extracting noun phrases with full phrase structures from a set of clinical radiology reports using natural language processing (NLP) and to investigate the effects of using the UMLS® Specialist Lexicon to improve noun phrase identification within clinical radiology documents.
Design: The noun phrase identification (NPI) module is composed of a sentence boundary detector, a statistical natural language parser trained on a nonmedical domain, and a noun phrase (NP) tagger. The NPI module processed a set of 100 XML-represented clinical radiology reports in Health Level 7 (HL7)® Clinical Document Architecture (CDA)–compatible format. Computed output was compared with manual markups made by four physicians and one author for maximal (longest) NP and those made by one author for base (simple) NP, respectively. An extended lexicon of biomedical terms was created from the UMLS Specialist Lexicon and used to improve NPI performance.
Results: The test set was 50 randomly selected reports. The sentence boundary detector achieved 99.0% precision and 98.6% recall. The overall maximal NPI precision and recall were 78.9% and 81.5% before using the UMLS Specialist Lexicon and 82.1% and 84.6% after. The overall base NPI precision and recall were 88.2% and 86.8% before using the UMLS Specialist Lexicon and 93.1% and 92.6% after, reducing false-positives by 31.1% and false-negatives by 34.3%.
Conclusion: The sentence boundary detector performs excellently. After the adaptation using the UMLS Specialist Lexicon, the statistical parser's NPI performance on radiology reports increased to levels comparable to the parser's native performance in its newswire training domain and to that reported by other researchers in the general nonmedical domain.
PMCID: PMC1090458  PMID: 15684131
12.  Syntactic parsing of clinical text: guideline and corpus development with handling ill-formed sentences 
To develop, evaluate, and share: (1) syntactic parsing guidelines for clinical text, with a new approach to handling ill-formed sentences; and (2) a clinical Treebank annotated according to the guidelines. To document the process and findings for readers with similar interest.
Using random samples from a shared natural language processing challenge dataset, we developed a handbook of domain-customized syntactic parsing guidelines based on iterative annotation and adjudication between two institutions. Special considerations were incorporated into the guidelines for handling ill-formed sentences, which are common in clinical text. Intra- and inter-annotator agreement rates were used to evaluate consistency in following the guidelines. Quantitative and qualitative properties of the annotated Treebank, as well as its use to retrain a statistical parser, were reported.
A supplement to the Penn Treebank II guidelines was developed for annotating clinical sentences. After three iterations of annotation and adjudication on 450 sentences, the annotators reached an F-measure agreement rate of 0.930 (while intra-annotator rate was 0.948) on a final independent set. A total of 1100 sentences from progress notes were annotated that demonstrated domain-specific linguistic features. A statistical parser retrained with combined general English (mainly news text) annotations and our annotations achieved an accuracy of 0.811 (higher than models trained purely with either general or clinical sentences alone). Both the guidelines and syntactic annotations are made available at
We developed guidelines for parsing clinical text and annotated a corpus accordingly. The high intra- and inter-annotator agreement rates showed decent consistency in following the guidelines. The corpus was shown to be useful in retraining a statistical parser that achieved moderate accuracy.
PMCID: PMC3822122  PMID: 23907286
natural language processing; syntactic parsing; annotation guidelines; corpus development
13.  Structure before Meaning: Sentence Processing, Plausibility, and Subcategorization 
PLoS ONE  2013;8(10):e76326.
Natural language processing is a fast and automatized process. A crucial part of this process is parsing, the online incremental construction of a syntactic structure. The aim of this study was to test whether a wh-filler extracted from an embedded clause is initially attached as the object of the matrix verb with subsequent reanalysis, and if so, whether the plausibility of such an attachment has an effect on reaction time. Finally, we wanted to examine whether subcategorization plays a role. We used a method called G-Maze to measure response time in a self-paced reading design. The experiments confirmed that there is early attachment of fillers to the matrix verb. When this attachment is implausible, the off-line acceptability of the whole sentence is significantly reduced. The on-line results showed that G-Maze was highly suited for this type of experiment. In accordance with our predictions, the results suggest that the parser ignores (or has no access to information about) implausibility and attaches fillers as soon as possible to the matrix verb. However, the results also show that the parser uses the subcategorization frame of the matrix verb. In short, the parser ignores semantic information and allows implausible attachments but adheres to information about which type of object a verb can take, ensuring that the parser does not make impossible attachments. We argue that the evidence supports a syntactic parser informed by syntactic cues, rather than one guided by semantic cues or one that is blind, or completely autonomous.
PMCID: PMC3792136  PMID: 24116101
14.  ChemicalTagger: A tool for semantic text-mining in chemistry 
The primary method for scientific communication is in the form of published scientific articles and theses, which use natural language combined with domain-specific terminology. As such, they contain free-flowing unstructured text. Given the usefulness of data extraction from unstructured literature, we aim to show how this can be achieved for the discipline of chemistry. The highly formulaic style of writing most chemists adopt makes their contributions well suited to high-throughput Natural Language Processing (NLP) approaches.
We have developed the ChemicalTagger parser as a medium-depth, phrase-based semantic NLP tool for the language of chemical experiments. Tagging is based on a modular architecture and uses a combination of OSCAR, domain-specific regex and English taggers to identify parts-of-speech. The ANTLR grammar is used to structure this into tree-based phrases. Using a metric that allows for overlapping annotations, we achieved machine-annotator agreements of 88.9% for phrase recognition and 91.9% for phrase-type identification (Action names).
It is possible to parse chemical experimental text using rule-based techniques in conjunction with a formal grammar parser. ChemicalTagger has been deployed for over 10,000 patents and has identified solvents from their linguistic context with >99.5% precision.
PMCID: PMC3117806  PMID: 21575201
15.  Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications 
We aim to build and evaluate an open-source natural language processing system for information extraction from electronic medical record clinical free-text. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at The cTAKES builds on existing open-source technologies—the Unstructured Information Management Architecture framework and OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact and 0.824 for overlapping spans, and accuracy for concept mapping, negation, and status attributes for exact and overlapping spans of 0.957, 0.943, 0.859, and 0.580, 0.939, and 0.839, respectively. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free-text.
PMCID: PMC2995668  PMID: 20819853
16.  NLP techniques associated with the OpenGALEN ontology for semi-automatic textual extraction of medical knowledge: abstracting and mapping equivalent linguistic and logical constructs. 
This research project presents methodological and theoretical issues related to the inter-relationship between linguistic and conceptual semantics, analysing the results obtained by the application of an NLP parser to a set of radiology reports. Our objective is to define a technique for associating linguistic methods with domain-specific ontologies for semi-automatic extraction of intermediate representation (IR) information formats and medical ontological knowledge from clinical texts. We have applied the Edinburgh LTG natural language parser to 2810 clinical narratives describing radiology procedures. In a second step, we have used medical expertise and ontology formalism for identification of semantic structures and abstraction of IR schemas related to the processed texts. These IR schemas are an association of linguistic and conceptual knowledge, based on their semantic contents. This methodology aims to contribute to the elaboration of models relating linguistic and logical constructs based on empirical data analysis. Advances in this field might lead to the development of computational techniques for automatic enrichment of medical ontologies from real clinical environments, using descriptive knowledge implicit in large text corpora.
PMCID: PMC2244064  PMID: 11079848
17.  Applying Semantic-based Probabilistic Context-Free Grammar to Medical Language Processing – A Preliminary Study on Parsing Medication Sentences 
Journal of biomedical informatics  2011;44(6):1068-1075.
Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance on reducing parsing ambiguity. Using the existing data set from 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context Free Grammar) for parsing medication sentences and manually created a Treebank of 4,564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (probabilistic Context Free Grammar) for parsing medication sentences. Our evaluation using a 10-fold cross validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser.
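The way production-rule probabilities resolve ambiguity can be shown with a small Python sketch: each candidate parse is scored by the product of the probabilities of the rules it uses, and the highest-scoring parse is kept. The grammar symbols, the probabilities, and the example sentence below are all hypothetical and are not taken from the study's grammar or Treebank.

```python
# Hypothetical semantic grammar: (left-hand side, right-hand side) -> probability.
RULE_PROB = {
    ("MED", ("DRUG", "DOSE", "FREQ")): 0.7,
    ("MED", ("DRUG", "DOSE")): 0.2,
    ("MED", ("DRUG", "DOSE", "UNIT", "FREQ")): 0.1,
    ("DOSE", ("NUM", "UNIT")): 0.8,
    ("DOSE", ("NUM",)): 0.2,
    ("DRUG", ("lasix",)): 1.0,
    ("NUM", ("40",)): 1.0,
    ("UNIT", ("mg",)): 1.0,
    ("FREQ", ("bid",)): 1.0,
}


def tree_probability(tree, rule_prob=RULE_PROB):
    """Probability of a parse tree: product of its rule probabilities.

    Trees are nested tuples (label, child, ...); words are plain strings.
    A rule missing from the grammar gives the tree probability 0.
    """
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    probability = rule_prob.get((label, rhs), 0.0)
    for child in children:
        if not isinstance(child, str):
            probability *= tree_probability(child, rule_prob)
    return probability


# Two competing parses of the made-up sentence "lasix 40 mg bid".
PARSE_A = ("MED", ("DRUG", "lasix"),
           ("DOSE", ("NUM", "40"), ("UNIT", "mg")), ("FREQ", "bid"))
PARSE_B = ("MED", ("DRUG", "lasix"),
           ("DOSE", ("NUM", "40")), ("UNIT", "mg"), ("FREQ", "bid"))

best = max([PARSE_A, PARSE_B], key=tree_probability)
print(best is PARSE_A, tree_probability(PARSE_A), tree_probability(PARSE_B))
```

A plain CFG would accept both parses and leave the choice open; with probabilities attached, the parse that groups "40 mg" as one dose is preferred in this toy grammar.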
PMCID: PMC3226929  PMID: 21856440
natural language processing; parsing; probabilistic context free grammar; sublanguage grammar
18.  Disambiguating the species of biomedical named entities using natural language parsers 
Bioinformatics  2010;26(5):661-667.
Motivation: Text mining technologies have been shown to reduce the laborious work involved in organizing the vast amount of information hidden in the literature. One challenge in text mining is linking ambiguous word forms to unambiguous biological concepts. This article reports on a comprehensive study on resolving the ambiguity in mentions of biomedical named entities with respect to model organisms and presents an array of approaches, with focus on methods utilizing natural language parsers.
Results: We build a corpus for organism disambiguation where every occurrence of protein/gene entity is manually tagged with a species ID, and evaluate a number of methods on it. Promising results are obtained by training a machine learning model on syntactic parse trees, which is then used to decide whether an entity belongs to the model organism denoted by a neighbouring species-indicating word (e.g. yeast). The parser-based approaches are also compared with a supervised classification method and results indicate that the former are a more favorable choice when domain portability is of concern. The best overall performance is obtained by combining the strengths of syntactic features and supervised classification.
Availability: The corpus and demo are available at, and the software is freely available as U-Compare components (Kano et al., 2009): NaCTeM Species Word Detector and NaCTeM Species Disambiguator. U-Compare is available at
PMCID: PMC2828111  PMID: 20053840
19.  NLR-parser: rapid annotation of plant NLR complements 
Bioinformatics  2015;31(10):1665-1667.
Motivation: The repetitive nature of plant disease resistance genes encoding nucleotide-binding leucine-rich repeat (NLR) proteins hampers their prediction with standard gene annotation software. Motif alignment and search tool (MAST) has previously been reported as a tool to support annotation of NLR-encoding genes. However, the decision whether a motif combination represents an NLR protein was entirely manual.
Results: The NLR-parser pipeline is designed to use the MAST output from six-frame translated amino acid sequences and filters for predefined biologically curated motif compositions. Input reads can be derived from, for example, raw long-read sequencing data or contigs and scaffolds coming from plant genome projects. The output is a tab-separated file with information on start and frame of the first NLR specific motif, whether the identified sequence is a TNL or CNL, potentially full or fragmented. In addition, the output of the NB-ARC domain sequence can directly be used for phylogenetic analyses. In comparison to other prediction software, the highly complex NB-ARC domain is described in detail using several individual motifs.
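The six-frame translation that produces the amino acid sequences fed to MAST can be sketched in Python as follows. This is a generic illustration using the standard genetic code, not code from the NLR-parser tool (which is a Java program); the frame names and the input sequence are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(dna):
    """Return translations for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse_complement[offset:])
    return frames


print(six_frame_translation("ATGGCCTAA"))
```

Translating all six frames means a motif search does not need to know in advance on which strand or in which reading frame a gene lies, which suits raw reads and unannotated contigs.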
Availability and implementation: The NLR-parser tool can be downloaded from Git-Hub ( It requires a valid Java installation as well as MAST as part of the MEME Suite. The tool is run from the command line.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4426836  PMID: 25586514
20.  Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers 
To evaluate factors affecting performance of influenza detection, including accuracy of natural language processing (NLP), discriminative ability of Bayesian network (BN) classifiers, and feature selection.
We derived a testing dataset of 124 influenza patients and 87 non-influenza (shigellosis) patients. To assess NLP finding-extraction performance, we measured the overall accuracy, recall, and precision of Topaz and MedLEE parsers for 31 influenza-related findings against a reference standard established by three physician reviewers. To elucidate the relative contribution of NLP and BN classifier to classification performance, we compared the discriminative ability of nine combinations of finding-extraction methods (expert, Topaz, and MedLEE) and classifiers (one human-parameterized BN and two machine-parameterized BNs). To assess the effects of feature selection, we conducted secondary analyses of discriminative ability using the most influential findings defined by their likelihood ratios.
The overall accuracy of Topaz was significantly better than MedLEE (with post-processing) (0.78 vs 0.71, p<0.0001). Classifiers using human-annotated findings were superior to classifiers using Topaz/MedLEE-extracted findings (average area under the receiver operating characteristic (AUROC): 0.75 vs 0.68, p=0.0113), and machine-parameterized classifiers were superior to the human-parameterized classifier (average AUROC: 0.73 vs 0.66, p=0.0059). The classifiers using the 17 ‘most influential’ findings were more accurate than classifiers using all 31 subject-matter expert-identified findings (average AUROC: 0.76>0.70, p<0.05).
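The AUROC values above can be read as the probability that a randomly chosen influenza case receives a higher classifier score than a randomly chosen non-influenza case. A minimal Python sketch of that pairwise definition follows; the scores are invented and this is not the study's evaluation code.

```python
def auroc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win.
    """
    wins = sum((p > n) + 0.5 * (p == n)
               for p in positive_scores
               for n in negative_scores)
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier outputs for three cases and two controls.
print(auroc([0.9, 0.8, 0.4], [0.7, 0.3]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.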
Using a three-component evaluation method we demonstrated how one could elucidate the relative contributions of components under an integrated framework. To improve classification performance, this study encourages researchers to improve NLP accuracy, use a machine-parameterized classifier, and apply feature selection methods.
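The finding-extraction measures named above (overall accuracy, recall, and precision against a reference standard) can be illustrated with a minimal sketch. It assumes each finding is reduced to a present/absent label per report; the function name and the toy labels are hypothetical and are not taken from the study.

```python
def finding_extraction_metrics(predicted, reference):
    """Return accuracy, recall, and precision for paired binary labels.

    `predicted` holds the parser's present/absent calls and `reference`
    holds the reviewers' calls for the same findings (hypothetical layout).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # found, truly present
    tn = sum(1 for p, r in pairs if not p and not r)  # not found, truly absent
    fp = sum(1 for p, r in pairs if p and not r)      # found, but absent
    fn = sum(1 for p, r in pairs if not p and r)      # missed

    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Toy example with made-up labels, not study data.
parser_calls = [True, True, False, False, True]
reviewer_calls = [True, False, False, True, True]
metrics = finding_extraction_metrics(parser_calls, reviewer_calls)
```

Accuracy counts agreement on both present and absent findings, whereas recall and precision look only at the findings the reference standard or the parser marked as present.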
PMCID: PMC4147621  PMID: 24406261
21.  Detecting modification of biomedical events using a deep parsing approach 
This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. analysis of IkappaBalpha phosphorylation, where it is not specified whether phosphorylation did or did not occur) or negated (e.g. inhibition of IkappaBalpha phosphorylation, where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser.
To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm.
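The shallow bag-of-words features described above can be sketched as follows. This is an illustration only: the function name, the lowercasing step, and the choice of trigger position are assumptions, and the authors' actual feature extraction may differ.

```python
from collections import Counter


def context_window_features(tokens, trigger_index, window=3):
    """Bag-of-words counts for tokens within `window` positions of the trigger.

    The trigger word itself is excluded; tokens are lowercased, which is an
    assumption of this sketch rather than something the abstract states.
    """
    if not 0 <= trigger_index < len(tokens):
        raise IndexError("trigger_index is outside the token list")

    left = tokens[max(0, trigger_index - window):trigger_index]
    right = tokens[trigger_index + 1:trigger_index + 1 + window]
    return Counter(tok.lower() for tok in left + right)


# Phrase taken from the abstract; treating "phosphorylation" as the
# trigger word is a hypothetical choice for illustration.
tokens = "inhibition of IkappaBalpha phosphorylation".split()
features = context_window_features(tokens, trigger_index=3, window=3)
```

A window of 3 or 4 tokens keeps the feature space small while still capturing nearby cues such as "inhibition of" that signal negation or speculation.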
Over the test data, our methods produce approximately a 4% absolute increase in F-score for detection of event modification compared to a baseline based only on the shallow bag-of-words features.
Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification.
PMCID: PMC3339397  PMID: 22595089
22.  Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies 
Journal of biomedical informatics  2011;44(5):805-814.
Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures. With either a rule-based or corpus-based approach, the grammar engineering process requires substantial time and knowledge from experts, and does not always yield a semantically transferable grammar. To reduce the human effort and to promote semantic transferability, we propose an automated method for deriving a probabilistic grammar based on a training corpus consisting of concept strings and semantic classes from the Unified Medical Language System (UMLS), a comprehensive terminology resource widely used by the community. The grammar is designed to specify noun phrases only, due to the nominal nature of the majority of biomedical terminological concepts. Evaluated on manually parsed clinical notes, the derived grammar achieved a recall of 0.644, precision of 0.737, and average cross-bracketing of 0.61, which demonstrated better performance than a control grammar with the semantic information removed. Error analysis revealed shortcomings that could be addressed to improve performance. The results indicated the feasibility of an approach which automatically incorporates terminology semantics in the building of an operational grammar. Although the current performance of the unsupervised solution does not adequately replace manual engineering, we believe that once the performance issues are addressed, it could serve as an aid in a semi-supervised solution.
PMCID: PMC3172402  PMID: 21549857
Natural language processing; Biomedical terminology; Semantic grammar; Probabilistic parsing
23.  Proton Pump Inhibitors and Hospitalization with Hypomagnesemia: A Population-Based Case-Control Study 
PLoS Medicine  2014;11(9):e1001736.
David Juurlink and colleagues evaluated the risk of hospitalization with hypomagnesemia among patients taking proton pump inhibitors.
Please see later in the article for the Editors' Summary
Some evidence suggests that proton pump inhibitors (PPIs) are an under-appreciated risk factor for hypomagnesemia. Whether hospitalization with hypomagnesemia is associated with use of PPIs is unknown.
Methods and Findings
We conducted a population-based case-control study of multiple health care databases in Ontario, Canada, from April 2002 to March 2012. Patients who were enrolled as cases were Ontarians aged 66 years or older hospitalized with hypomagnesemia. For each individual enrolled as a case, we identified up to four individuals as controls matched on age, sex, kidney disease, and use of various diuretic classes. Exposure to PPIs was categorized according to the most proximate prescription prior to the index date as current (within 90 days), recent (within 91 to 180 days), or remote (within 181 to 365 days). We used conditional logistic regression to estimate the odds ratio for the association of outpatient PPI use and hospitalization with hypomagnesemia. To test the specificity of our findings we examined use of histamine H2 receptor antagonists, drugs with no causal link to hypomagnesemia. We studied 366 patients hospitalized with hypomagnesemia and 1,464 matched controls. Current PPI use was associated with a 43% increased risk of hypomagnesemia (adjusted odds ratio, 1.43; 95% CI 1.06–1.93). In a stratified analysis, the risk was particularly increased among patients receiving diuretics (adjusted odds ratio, 1.73; 95% CI 1.11–2.70) and not significant among patients not receiving diuretics (adjusted odds ratio, 1.25; 95% CI 0.81–1.91). We estimate that one excess hospitalization with hypomagnesemia will occur among 76,591 outpatients treated with a PPI for 90 days. Hospitalization with hypomagnesemia was not associated with the use of histamine H2 receptor antagonists (adjusted odds ratio 1.06; 95% CI 0.54–2.06). Limitations of this study include a lack of access to serum magnesium levels, uncertainty regarding diagnostic coding of hypomagnesemia, and generalizability of our findings to younger patients.
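The exposure windows described above (current, recent, remote) can be expressed as a small classification rule. The sketch below is illustrative only: the function name, the "none" label for patients with no prescription in the prior 365 days, and the exclusion of prescriptions dated on the index date itself are assumptions not stated in the abstract.

```python
from datetime import date


def classify_ppi_exposure(index_date, prescription_dates):
    """Classify exposure from the most proximate prescription before the index date."""
    # Assumption: only prescriptions strictly before the index date count.
    prior = [d for d in prescription_dates if d < index_date]
    if not prior:
        return "none"

    days_since = (index_date - max(prior)).days
    if days_since <= 90:
        return "current"
    if days_since <= 180:
        return "recent"
    if days_since <= 365:
        return "remote"
    return "none"  # hypothetical label for anything older than 365 days


# Made-up dates for illustration, not study data.
index = date(2010, 6, 30)
category = classify_ppi_exposure(index, [date(2009, 9, 1), date(2010, 2, 1)])
```

Because only the most recent prior prescription is considered, a patient with both an old and a newer prescription falls into the category of the newer one.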
PPIs are associated with a small increased risk of hospitalization with hypomagnesemia among patients also receiving diuretics. Physicians should be aware of this association, particularly for patients with hypomagnesemia.
Editors' Summary
To extract nutrients from food, we rely on a multi-stage process called digestion. A crucial stage in digestion occurs in the stomach where gastric juice, a mixture of mainly hydrochloric acid and the enzyme pepsin, breaks down the proteins present in food. We could not digest food without gastric juice, but the acid it contains, which is made by glands in the stomach, can damage the lining of the digestive system and cause symptoms of indigestion (dyspepsia), stomach (peptic) ulcers, and gastroesophageal reflux disease (GERD), a condition in which acid from the stomach leaks back up the esophagus (gullet). Acid-related disorders are often treated with proton pump inhibitors (PPIs), a class of drugs that reduces acid production in the stomach. Omeprazole, lansoprazole, and other PPIs are among the most widely prescribed drugs in the world. In 2010, 147 million prescriptions for PPIs were dispensed in the US alone.
Why Was This Study Done?
Like all drugs, PPIs have some unwanted side effects. They sometimes cause diarrhea, for example, and their long-term use is associated with fractures. In addition, long-term PPI use may be a risk factor for hypomagnesemia, a condition in which the magnesium level in the blood is abnormally low. If severe, hypomagnesemia can lead to life-threatening heart arrhythmias and seizures (fits). Magnesium levels are controlled by absorption of magnesium by the intestines and excretion of magnesium by the kidneys. It is thought that PPI-related hypomagnesemia involves inhibition of magnesium absorption. Given the widespread use of PPIs, it is important to know whether PPIs are a risk factor for hypomagnesemia in routine clinical practice. In this population-based case-control study, the researchers ask whether hospitalization with hypomagnesemia is associated with the use of PPIs. A case-control study compares the characteristics of individuals with a specific condition with those of matched controls without the condition.
What Did the Researchers Do and Find?
The researchers identified everyone aged 66 years or older who received a diagnosis of hypomagnesemia following hospital admission in Ontario over a 10 year period (366 cases) by searching a large database of hospital admissions. They identified up to four control patients from the general population who were matched with these case patients on age, sex, kidney disease, and the use of various diuretic classes (diuretic use is also associated with hypomagnesemia), and obtained data on PPI use by all patients from a database that records the prescription drugs dispensed to elderly Ontario residents. The researchers then used statistical methods to look for associations between current PPI use (a prescription within 90 days of the index date) and hospitalization with hypomagnesemia. After allowing for other characteristics that increase the risk of hypomagnesemia (including other illnesses), current PPI use was associated with a 43% increased risk of hypomagnesemia. Among patients receiving diuretics, PPI use increased the risk of hypomagnesemia by 73% whereas among patients not receiving diuretics, PPI use did not significantly increase the risk of hypomagnesemia. Finally, the researchers calculated that 76,591 individuals would need to be treated with a PPI as an outpatient for 90 days to result in one additional hospitalization with hypomagnesemia.
What Do These Findings Mean?
These findings show that, among elderly individuals, current (but not previous) outpatient use of PPIs is associated with an increased risk of detection of hypomagnesemia during hospitalization, particularly among patients also taking diuretics. Because this study only considered elderly patients, these findings may not apply to younger patients. Moreover, the accuracy of these findings may be affected by the validity of the hospital coding for hypomagnesemia in the database used to identify cases. Importantly, given the large number of patients that need to take PPIs to result in one additional hospitalization with hypomagnesemia, these findings should not discourage clinicians from prescribing PPIs to appropriate patients nor should they lead to calls for routine screening of magnesium levels in patients taking PPIs. Rather, these findings highlight the need for clinicians to be aware of the association between PPI use and the risk of hypomagnesemia and to reassess ongoing therapy in patients who develop hypomagnesemia while taking PPIs.
Additional Information
Please access these websites via the online version of this summary at
The UK National Health Service Choices website provides information about symptoms, causes, and treatment of indigestion, heartburn and gastroesophageal reflux disease, and stomach ulcers; a “Behind the Headlines” article from 2010 discusses an editorial about the possible over-use of PPIs
MedlinePlus provides links to information about indigestion, stomach ulcer, and gastroesophageal reflux disease (in English and Spanish); the MedlinePlus encyclopedia has pages on proton pump inhibitors (in English and Spanish) and on hypomagnesemia (in English and Spanish)
A US Food and Drug Administration warning about the possible association between proton pump inhibitors and hypomagnesemia is available
Wikipedia pages on proton pump inhibitors and on hypomagnesemia are also available (note that Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
PMCID: PMC4181956  PMID: 25268962
24.  Using Medical Text Extraction, Reasoning and Mapping System (MTERMS) to Process Medication Information in Outpatient Clinical Notes 
AMIA Annual Symposium Proceedings  2011;2011:1639-1648.
Clinical information is often coded using different terminologies, and therefore is not interoperable. Our goal is to develop a general natural language processing (NLP) system, called Medical Text Extraction, Reasoning and Mapping System (MTERMS), which encodes clinical text using different terminologies and simultaneously establishes dynamic mappings between them. MTERMS applies a modular pipeline approach, flowing from a preprocessor through a semantic tagger, terminology mapper, and context analyzer to a parser, to structure input clinical notes. Evaluators manually reviewed 30 free-text and 10 structured outpatient clinical notes and compared them with the MTERMS output. MTERMS achieved an overall F-measure of 90.6 and 94.0 for free-text and structured notes, respectively, for medication and temporal information. The local medication terminology had 83.0% coverage compared to RxNorm’s 98.0% coverage for free-text notes. Of the mappings between the terminologies, 61.6% are exact matches. Capture of duration was significantly improved (91.7% vs. 52.5%) over systems in the third i2b2 challenge.
PMCID: PMC3243163  PMID: 22195230
25.  Comparative analysis of five protein-protein interaction corpora 
BMC Bioinformatics  2008;9(Suppl 3):S6.
Growing interest in the application of natural language processing methods to biomedical text has led to an increasing number of corpora and methods targeting protein-protein interaction (PPI) extraction. However, there is no general consensus regarding PPI annotation and consequently resources are largely incompatible and methods are difficult to evaluate.
We present the first comparative evaluation of the diverse PPI corpora, performing quantitative evaluation using two separate information extraction methods as well as detailed statistical and qualitative analyses of their properties. For the evaluation, we unify the corpus PPI annotations to a shared level of information, consisting of undirected, untyped binary interactions of non-static types with no identification of the words specifying the interaction, no negations, and no interaction certainty.
We find that the F-score performance of a state-of-the-art PPI extraction method varies on average 19 percentage units and in some cases over 30 percentage units between the different evaluated corpora. The differences stemming from the choice of corpus can thus be substantially larger than differences between the performance of PPI extraction methods, which suggests definite limits on the ability to compare methods evaluated on different resources. We analyse a number of potential sources for these differences and identify factors explaining approximately half of the variance. We further suggest ways in which the difficulty of the PPI extraction tasks codified by different corpora can be determined to advance comparability. Our analysis also identifies points of agreement and disagreement in PPI corpus annotation that are rarely explicitly stated by the authors of the corpora.
Our comparative analysis uncovers key similarities and differences between the diverse PPI corpora, thus taking an important step towards standardization. In the course of this study we have made a major practical contribution by converting the corpora into a shared format. The conversion software is freely available.
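The unification step described in this abstract, reducing each corpus to undirected, untyped binary interactions, can be illustrated with a short sketch. The dictionary keys ("arg1", "arg2", "type", "negated") and the function name are hypothetical, and dropping negated interactions is only one possible reading of "no negations"; this is not the authors' conversion software.

```python
def unify_interactions(annotations, drop_negated=True):
    """Reduce annotations to a set of undirected, untyped protein pairs.

    Each annotation is a dict with hypothetical keys "arg1", "arg2",
    and optionally "type" and "negated".
    """
    unified = set()
    for ann in annotations:
        # Assumption: "no negations" is read as discarding negated interactions.
        if drop_negated and ann.get("negated", False):
            continue
        # Sorting the pair discards direction; the "type" key is ignored.
        pair = tuple(sorted((ann["arg1"], ann["arg2"])))
        unified.add(pair)
    return unified


# Made-up annotations for illustration.
corpus = [
    {"arg1": "A", "arg2": "B", "type": "binds"},
    {"arg1": "B", "arg2": "A", "type": "phosphorylates"},
    {"arg1": "C", "arg2": "D", "type": "binds", "negated": True},
]
pairs = unify_interactions(corpus)
```

Storing each pair as a sorted tuple in a set means that the same two proteins annotated with different directions or interaction types collapse into a single entry, which is what makes corpora with different annotation schemes comparable at this level.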
PMCID: PMC2349296  PMID: 18426551
