In clinical practice guidelines (CPGs), medical information is stored in narrative form. A large part of this information occurs in negated form. Detecting negation in CPGs is an important task, since it helps medical personnel to identify symptoms and diseases that are absent as well as treatment actions that should not be carried out. We developed algorithms capable of Negation Detection in this kind of medical document. Based on our results, we are convinced that involving syntactical methods can improve Negation Detection, not only in medical writing but also in arbitrary narrative texts.
Negation is an important part of human communication. It can be used to invert concepts and to express rejection of opinions. Negation is a universal concept across languages and is very important in the medical field. Detecting negations in natural language is a difficult task, but it is easier in the medical domain: medical language is much more restricted than general narrative language; a physician will not make extensive use of stylistic elements such as double negation when writing reports or patient histories.
In the medical domain, Negation Detection is currently applied only to very simple texts (e.g., radiology reports). In our work, we focus primarily on the more complex text type of clinical practice guidelines (CPGs). These are “systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances”. In CPGs, negation is crucial not only for facts that do not apply (e.g., patient has no pain), but also for actions that should not be carried out (e.g., do not take this drug). In contrast to simpler texts, CPGs require algorithms that deal with the syntax of the English language (e.g., tenses in active and passive voice, parts-of-speech).
In the following section we give an overview of existing methods of Negation Detection. In the main part of the work we describe and evaluate an approach to Negation Detection that uses syntactical methods tailored to the special characteristics of CPGs.
Besides general work on negation in natural language, we discuss relevant work addressing Negation Detection in medical language.
NegEx is a simple algorithm for detecting negated findings and diseases in radiology reports. Negation triggers are divided into those whose negated concepts precede them and those whose negated concepts follow them. After the concepts have been replaced by UMLS terms, the negated ones are detected.
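The distinction between the two trigger types can be illustrated with a minimal Python sketch. It is not the NegEx implementation: the trigger lists, the five-token window, and the function name `negated_concepts` are assumptions chosen for illustration, and concepts are assumed to have been reduced to single placeholder tokens beforehand.

```python
# Illustrative trigger lists (assumptions, not the NegEx trigger set).
PRE_TRIGGERS = {"no", "without", "denies"}   # trigger precedes the negated concept
POST_TRIGGERS = {"absent", "excluded"}       # trigger follows the negated concept

WINDOW = 5  # hypothetical scope limit in tokens


def negated_concepts(tokens, concepts):
    """Return the concept tokens that fall within the scope of a trigger.

    tokens   -- list of lower-cased tokens of one sentence
    concepts -- set of tokens that stand for already identified concepts
    """
    negated = set()
    for i, tok in enumerate(tokens):
        if tok in PRE_TRIGGERS:
            scope = tokens[i + 1 : i + 1 + WINDOW]   # look to the right
        elif tok in POST_TRIGGERS:
            scope = tokens[max(0, i - WINDOW) : i]   # look to the left
        else:
            continue
        negated.update(t for t in scope if t in concepts)
    return negated
```

With these assumed lists, “patient has no fever” marks `fever` as negated through a preceding trigger, while “pneumonia was excluded” marks `pneumonia` through a succeeding one.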
In the work of Mutalik et al., UMLS concepts are also identified in a first step. Then, a lexical scanner using regular expressions is applied to detect triggers and classify them as preceding or succeeding triggers. With this information, a parser annotates the original concepts with the negation information, which yields the output of the NegFinder algorithm.
Elkin et al. use an algorithm with a rule base to decide which medical concepts are negated in clinical documents. Here, stop words (e.g., “other than”) determine the scope of a negation trigger.
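How a stop word can bound the scope of a trigger may be sketched as follows. This is an illustration only, not the rule base of Elkin et al.: apart from “other than”, which is mentioned above, the trigger list, the additional stop word, and the function name `negation_scope` are assumptions.

```python
TRIGGERS = {"no"}  # assumed trigger list
# "other than" is taken from the text; "but" is an assumed additional stop word.
STOP_PHRASES = [("other", "than"), ("but",)]


def negation_scope(tokens):
    """Return the tokens after the first trigger, cut off at the first stop phrase."""
    for i, tok in enumerate(tokens):
        if tok in TRIGGERS:
            start = i + 1
            break
    else:
        return []  # no trigger in the sentence

    end = len(tokens)
    for j in range(start, len(tokens)):
        # Compare the tokens at position j with every stop phrase.
        if any(tuple(tokens[j : j + len(p)]) == p for p in STOP_PHRASES):
            end = j
            break
    return tokens[start:end]
```

For the token sequence of “no complaints other than mild headache”, the scope ends before “other than”, so only “complaints” is treated as negated.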
Patrick et al. use SNOMED CT to identify negated concepts. They identify both pre-coordinated phrases (e.g., “no headache”, SNOMED CT concept id 162298006) and concepts explicitly asserted as negative by negation phrases. To identify the latter, rule-based algorithms similar to those of earlier work were implemented.
Huang and Lowe recently developed a hybrid approach combining regular expression matching with grammatical parsing to detect negations in clinical radiology reports.
Aronow and Feng have developed a method for Negation Detection that is applied to document classification. They determine the scope of negation triggers by means of conjunctive phrases: all phrases connected by such conjunctions are regarded as negated.
CPGs differ from the medical reports and discharge summaries on which the algorithms presented above operate. In CPGs the language is not as restricted as it is in these other documents. CPGs are closer to prose, which complicates the development of simple algorithms. Still, they are not as complicated as free text, since sophisticated stylistic elements such as double negation are not used. In the following section we explain our approach, called NegHunter.
Our strategy in developing NegHunter was to classify negations in CPGs according to identified negation types. The reason is that negations within CPGs vary strongly from one another. Because it is difficult to keep the number of negation triggers within manageable limits, we used a syntactical approach. This means that grammatical elements of the English language are used to decide whether a phrase is negated or not. For this purpose, the tenses in both active and passive voice as well as parts-of-speech are used.
The starting point of the NegHunter algorithm is the detection of negation triggers. Whereas other algorithms use a relatively large number of different negation triggers, NegHunter manages with a rather small number. The reason is the way NegHunter handles the different negation types: we selected a number of universal triggers and classified their behaviour in narrative texts.
We identified five negation classes in our study of CPGs in the literature: (1) adverbial negation, (2) intra-phrase triggered negation, (3) prepositional negation, (4) adjective negation, and (5) verb negation. In the following, we will discuss these five classes in detail.
“Guideline developers do not recommend .” (active voice)
“ is not recommended.” (passive voice)
“Evidence obtained from at least one well-designed study without .”
“Patients with good performance status, … and the absence .”
“Recommendation indicates at least fair evidence that is ineffective or that harm outweighs benefit.”
“ on final patient outcomes was also lacking.”
In some cases, the algorithms described above do not tag the entire negated information. For instance, prepositional phrases (which are not themselves negated) that appear after a negated phrase need to be handled separately. We address this problem by tagging all prepositional phrases that follow a negated phrase. This ensures that no information concerning the negation is lost. The following sentence shows an output with two prepositional phrases following an intra-phrase triggered negation:
“Requires availability of well conducted clinical studies but no .” 3
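The extension to trailing prepositional phrases can be sketched in Python. The chunk representation (a list of `(phrase_type, text)` pairs with the hypothetical type labels `"NP"`, `"PP"`, and `"VP"`) and the function name `extend_with_prepositional_phrases` are assumptions for illustration; they reproduce neither the MMTx output format nor the NegHunter implementation.

```python
def extend_with_prepositional_phrases(chunks, negated_index):
    """Return the index of the negated chunk plus all PPs that directly follow it.

    chunks        -- list of (phrase_type, text) pairs in sentence order
    negated_index -- index of the chunk already tagged as negated
    """
    tagged = [negated_index]
    for i in range(negated_index + 1, len(chunks)):
        phrase_type, _ = chunks[i]
        if phrase_type != "PP":
            break  # stop at the first chunk that is not a prepositional phrase
        tagged.append(i)
    return tagged
```

For an invented chunk sequence such as `[("NP", "no improvement"), ("PP", "of symptoms"), ("PP", "after treatment"), ("VP", "was observed")]` with the first chunk negated, the function returns the indices of the noun phrase and of both prepositional phrases, and stops at the verb phrase.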
For evaluation purposes we used a Java implementation of our algorithms. To obtain the syntactical information from the guideline documents that our algorithms require, we used the MetaMap Transfer (MMTx) program. MetaMap is “a program […] to map biomedical text to the [UMLS] Metathesaurus or, equivalently, to discover concepts referred to in text”. MMTx makes this program available to researchers in an adaptable form. Besides the concept assignment, it also provides syntactical information such as part-of-speech. We implemented our algorithm as a Java library that can also be incorporated into other programs and applications. In the following we describe the evaluation process.
We used a set of 18 CPGs from the medical speciality of oncology for our development. Of these 18 practice guidelines, we used four as a training set for analysing the negations that occur. Using these documents we classified the negations and developed our algorithms. We used the remaining 14 CPGs for the evaluation.
We manually rated the sentences of all oncological CPGs to establish a “gold standard” against which the computerized algorithms could be compared. We processed 558 sentences containing 615 negated concepts and tagged both negation triggers and negated concepts. At first glance, it may be surprising that the number of negated concepts is only slightly higher than the number of sentences containing negations. The reason is that many sentences contain a trigger that does not target a phrase in the same sentence (e.g., “None available.”). We do not provide detection across sentence borders, because the results are unpredictable and such cases are conceptually not addressed by our methods.
For our evaluation, we processed the 14 guidelines with NegHunter. Afterwards, the output was read manually to detect errors. We classified the taggings as true positives (TP), false positives (FP), false negatives (FN), and partially correct (PC), with the latter counted at only 50 %.
To quantify the performance we used the statistical measures of recall and precision. Recall is the ratio of correctly found phrases to all relevant phrases according to the gold standard. Precision is the ratio of correctly detected phrases to all phrases found by the system. Table 1 shows a detailed listing of the performance of our implementation.
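The two measures, with partially correct taggings weighted at 50 %, can be written as a short Python sketch. How PC taggings enter the denominators is not spelled out in the text, so the formulas below are an assumption: each PC tagging counts as half a hit in the numerator and as one full item among both the found and the relevant phrases. The function name `precision_recall` is hypothetical, and the counts in the usage note are invented for illustration; they are not results of the evaluation.

```python
def precision_recall(tp, fp, fn, pc, pc_weight=0.5):
    """Compute precision and recall with partial credit for PC taggings.

    Assumed scoring: a PC tagging contributes pc_weight to the hits
    and 1 to both denominators.
    """
    hits = tp + pc_weight * pc
    found = tp + fp + pc       # everything the system tagged
    relevant = tp + fn + pc    # everything in the gold standard
    precision = hits / found if found else 0.0
    recall = hits / relevant if relevant else 0.0
    return precision, recall
```

With the invented counts `precision_recall(8, 1, 2, 2)`, the hits amount to 9, giving a precision of 9/11 and a recall of 0.75 under these assumptions.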
NegHunter shows its strength in handling intra-phrase triggered negation, prepositional negation, and adjective negation. This is due to the simple structure of these negations. In the case of intra-phrase triggered and prepositional negation, the negated phrase usually follows immediately after the trigger, so it is nearly impossible to miss it.
The behaviour of adverbial negation and verb negation is much more complex. Here, a phrase related to a trigger may occur at the opposite end of the sentence. In such a case it is very difficult to identify this phrase, as NegHunter searches only a range of three preceding or succeeding phrases when detecting the negated concepts.
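The range limit described above can be illustrated with a small Python sketch. Only the limit of three phrases comes from the text; the list-of-phrases representation, the `is_candidate` predicate, and the function name `find_negated_phrase` are hypothetical.

```python
MAX_DISTANCE = 3  # range of phrases inspected on either side of the trigger


def find_negated_phrase(phrases, trigger_index, is_candidate, direction):
    """Return the index of the nearest candidate phrase within MAX_DISTANCE, or None.

    phrases       -- list of phrases in sentence order
    trigger_index -- position of the phrase containing the trigger
    is_candidate  -- predicate deciding whether a phrase can be negated
    direction     -- +1 to search succeeding phrases, -1 for preceding ones
    """
    for step in range(1, MAX_DISTANCE + 1):
        i = trigger_index + direction * step
        if i < 0 or i >= len(phrases):
            break  # reached the sentence border
        if is_candidate(phrases[i]):
            return i
    return None
```

A phrase four or more positions away from the trigger is never reached and the function returns `None`, which corresponds to the failure case described above.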
Another problem is caused by MMTx itself. In some cases the part-of-speech is assigned incorrectly, which in turn causes errors. For example, in the sentence
“… and interpreting studies that were not otherwise covered in existing syntheses or guidelines.”
MMTx recognises the noun phrase “studies” as a verb phrase, whereas the verb phrase “interpreting” is recognised as a noun. This leads to a false tagging and produces both an FP and an FN.
With the algorithms presented, negated information occurring in CPGs can be detected on a syntactical level using grammatical information of the English language such as tenses and parts-of-speech. This forms a basis for subsequent processing on a semantic level. Such further processing will be necessary because, for instance, the co-occurrence of a negation trigger and a concept representing a symptom or disease does not necessarily imply the absence of this symptom or disease. Compare the following example:
“We did not treat the infection.”
“We did not detect an infection.”
where the first sentence does not indicate the absence of an infection, but the absence of its treatment. Nevertheless, NegHunter can support an automated structuring of the information in order to, for instance, decide which therapies or drug regimens are best applied in patients with certain diseases and which are not recommended. This helps to sort out the treatment options and supports medical personnel as well as patients in their decision-making.
Additionally, NegHunter's negation classification allows users to extend the trigger set themselves. To do so, each new trigger needs to be assigned a negation class; NegHunter then applies its rule base to the new trigger. This makes NegHunter extensible and maintainable, and portable to other document types.
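The extension mechanism can be pictured as a registry that maps each trigger to one of the five negation classes. The sketch below is written in Python for brevity (the implementation described in this work is a Java library) and is an assumption throughout: the class and method names are hypothetical, and the assignment of the sample triggers to classes is illustrative rather than taken from NegHunter's trigger set.

```python
from enum import Enum


class NegationClass(Enum):
    ADVERBIAL = "adverbial negation"
    INTRA_PHRASE = "intra-phrase triggered negation"
    PREPOSITIONAL = "prepositional negation"
    ADJECTIVE = "adjective negation"
    VERB = "verb negation"


class TriggerRegistry:
    """Maps trigger words to a negation class so class-level rules can be reused."""

    def __init__(self):
        # Sample entries; their class assignment is an illustrative assumption.
        self._triggers = {
            "not": NegationClass.ADVERBIAL,
            "no": NegationClass.INTRA_PHRASE,
            "ineffective": NegationClass.ADJECTIVE,
        }

    def add_trigger(self, trigger, negation_class):
        """Register a new trigger; it must be assigned one of the negation classes."""
        if not isinstance(negation_class, NegationClass):
            raise TypeError("a trigger must be assigned a NegationClass")
        self._triggers[trigger.lower()] = negation_class

    def classify(self, token):
        """Return the negation class of a token, or None if it is not a trigger."""
        return self._triggers.get(token.lower())
```

In this design a new trigger needs no rules of its own: once `add_trigger("never", NegationClass.ADVERBIAL)` has been called, any rule written for the adverbial class would apply to “never” as well.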
This work is supported by “Fonds zur Förderung der wissenschaftlichen Forschung FWF” (Austrian Science Fund), grant L290-N04.
2 Negation triggers are underlined; signalize negated phrases.
3 signalize prepositional information.