Results 1-25 (66220)


Related Articles

10.  Model-based semantic dictionaries for medical language understanding. 
Semantic dictionaries are emerging as a major cornerstone towards achieving sound natural language understanding. Indeed, they constitute the main bridge between words and conceptual entities that reflect their meanings. Nowadays, more and more wide-coverage lexical dictionaries are electronically available in the public domain. However, associating a semantic content with lexical entries is not a straightforward task as it is subordinate to the existence of a fine-grained concept model of the treated domain. This paper presents the benefits and pitfalls in building and maintaining multilingual dictionaries, the semantics of which is directly established on an existing concept model. Concrete cases, handled through the GALEN-IN-USE project, illustrate the use of such semantic dictionaries for the analysis and generation of multilingual surgical procedures.
PMCID: PMC2232654  PMID: 10566333
11.  A Medical Data Dictionary for Decision Support Applications 
Building and maintaining clinically-based medical information systems is a complex task. Advances in database management technologies, including the concept of a data dictionary, have helped support this process. A medical data dictionary is described with discussions on the role of data dictionaries within a medical information system and the conceptual model supported by the AT&T CareComm (TM) data dictionary. Additionally, a high level overview of features that we deemed important to support a medical information system with integrated decision support capabilities is presented.
PMCID: PMC2245107
12.  Creating a medical English-Swedish dictionary using interactive word alignment 
Background
This paper reports on a parallel collection of rubrics from the medical terminology systems ICD-10, ICF, MeSH, NCSP and KSH97-P and its use for semi-automatic creation of an English-Swedish dictionary of medical terminology. The methods presented are relevant for many West European language pairs other than English-Swedish.
Methods
The medical terminology systems were collected in electronic format in both English and Swedish and the rubrics were extracted in parallel language pairs. Initially, interactive word alignment was used to create training data from a sample. Then the training data were utilised in automatic word alignment in order to generate candidate term pairs. The last step was manual verification of the term pair candidates.
Results
A dictionary of 31,000 verified entries has been created in less than three man weeks, thus with considerably less time and effort needed compared to a manual approach, and without compromising quality. As a side effect of our work we found 40 different translation problems in the terminology systems and these results indicate the power of the method for finding inconsistencies in terminology translations. We also report on some factors that may contribute to making the process of dictionary creation with similar tools even more expedient. Finally, the contribution is discussed in relation to other ongoing efforts in constructing medical lexicons for non-English languages.
Conclusion
In three man weeks we were able to produce a medical English-Swedish dictionary consisting of 31,000 entries and also found hidden translation errors in the utilized medical terminology systems.
doi:10.1186/1472-6947-6-35
PMCID: PMC1624822  PMID: 17034649
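To make the "generate candidate term pairs, then verify manually" step concrete, here is a minimal sketch in Python. It is not the interactive word alignment tool used in the paper: it swaps in a plain Dice co-occurrence score over parallel rubrics, and the function name `dice_candidates`, the `min_score` parameter, and the sample rubrics are all hypothetical.

```python
from collections import Counter
from itertools import product


def dice_candidates(rubric_pairs, min_score=0.5):
    """Score word pairs from parallel rubrics with the Dice coefficient.

    Assumption: co-occurrence counting stands in for real word alignment.
    rubric_pairs is a list of (english_rubric, swedish_rubric) strings.
    """
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src, tgt in rubric_pairs:
        src_words = set(src.lower().split())
        tgt_words = set(tgt.lower().split())
        src_freq.update(src_words)
        tgt_freq.update(tgt_words)
        # Every source word co-occurs with every target word in the pair.
        joint.update(product(src_words, tgt_words))

    candidates = []
    for (s, t), n in joint.items():
        score = 2 * n / (src_freq[s] + tgt_freq[t])
        if score >= min_score:
            candidates.append((s, t, score))
    # Best candidates first; these would still go to a human for verification.
    return sorted(candidates, key=lambda c: (-c[2], c[0], c[1]))


# Hypothetical sample rubrics, for illustration only.
sample = [
    ("acute bronchitis", "akut bronkit"),
    ("chronic bronchitis", "kronisk bronkit"),
    ("acute sinusitis", "akut sinuit"),
]
pairs = dice_candidates(sample, min_score=0.9)
```

With a high threshold only word pairs that always occur together survive; lowering `min_score` trades precision for recall, which is why a manual verification pass follows in the workflow described above.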
13.  Comparison of American medical dictionaries. 
Although American medical dictionaries are a valuable part of any medical library collection, the attributes of each of the four major dictionaries are often unknown and the reference material contained in each unused. The medical librarian should be aware of the differences and values of each dictionary and try to have at least one edition of each available to library users in order to maintain an adequate reference collection.
PMCID: PMC199491  PMID: 354707
14.  Identification of Misspelled Words without a Comprehensive Dictionary Using Prevalence Analysis 
Misspellings are common in medical documents and can be an obstacle to information retrieval. We evaluated an algorithm to identify misspelled words through analysis of their prevalence in a representative body of text.
We evaluated the algorithm’s accuracy in identifying misspellings of 200 anti-hypertensive medication names among 2,000 potentially misspelled words randomly selected from narrative medical documents. Prevalence ratios (the frequency of the potentially misspelled word divided by the frequency of the non-misspelled word) in physician notes were computed by the software for each of the words. The software results were compared to the manual assessment by an independent reviewer.
Area under the ROC curve for identification of misspelled words was 0.96. Sensitivity, specificity, and positive predictive value were 99.25%, 89.72% and 82.9% for the prevalence ratio threshold (0.32768) with the highest F-measure (0.903). Prevalence analysis can be used to identify and correct misspellings with high accuracy.
PMCID: PMC2813663  PMID: 18693937
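The prevalence ratio defined in the abstract above can be sketched in a few lines of Python. The ratio and the 0.32768 threshold come from the abstract; the decision direction (a ratio below the threshold flags a misspelling), the function names, and the sample counts are assumptions made for illustration. The last function recomputes the F-measure from the reported sensitivity and positive predictive value as a consistency check.

```python
from collections import Counter


def prevalence_ratio(candidate, reference, counts):
    """Frequency of the candidate word divided by that of the reference word."""
    ref_count = counts[reference]
    if ref_count == 0:
        return None  # ratio undefined when the reference never occurs
    return counts[candidate] / ref_count


def flag_misspellings(pairs, counts, threshold=0.32768):
    """Return candidates whose ratio falls below the threshold.

    Assumption: a rare variant of a common word is treated as a misspelling;
    the abstract does not state the direction of the comparison.
    """
    flagged = []
    for candidate, reference in pairs:
        ratio = prevalence_ratio(candidate, reference, counts)
        if ratio is not None and ratio < threshold:
            flagged.append(candidate)
    return flagged


def f_measure(precision, recall):
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical word counts from a body of physician notes.
counts = Counter({"lisinopril": 100, "lisinipril": 3,
                  "metoprolol": 50, "metoprolal": 40})
flagged = flag_misspellings(
    [("lisinipril", "lisinopril"), ("metoprolal", "metoprolol")], counts
)
reported_f = f_measure(precision=0.829, recall=0.9925)
```

Plugging in the reported positive predictive value (82.9%) and sensitivity (99.25%) gives a value that rounds to the F-measure of 0.903 quoted in the abstract.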
15.  The T.M.R. Data Dictionary: A Management Tool for Data Base Design 
In January 1981, a dictionary-driven ambulatory care information system known as TMR (The Medical Record) was installed at a large private medical group practice in Los Angeles. TMR's data dictionary has enabled the medical group to adapt the software to meet changing user needs largely without programming support. For top management, the dictionary is also a tool for navigating through the system's complexity and assuring the integrity of management goals.
PMCID: PMC2578624
16.  A dictionary server for supplying context sensitive medical knowledge. 
The Giessen Data Dictionary Server (GDDS), developed at Giessen University Hospital, integrates clinical systems with on-line, context sensitive medical knowledge to help with making medical decisions. By "context" we mean the clinical information that is being presented at the moment the information need is occurring. The dictionary server makes use of a semantic network supported by a medical data dictionary to link terms from clinical applications to their proper information sources. It has been designed to analyze the network structure itself instead of knowing the layout of the semantic net in advance. This enables us to map appropriate information sources to various clinical applications, such as nursing documentation, drug prescription and cancer follow up systems. This paper describes the function of the dictionary server and shows how the knowledge stored in the semantic network is used in the dictionary service.
PMCID: PMC2243816  PMID: 11079978
17.  Unsupervised Method for Automatic Construction of a Disease Dictionary from a Large Free Text Collection 
Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting contextual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35–88%) over available, manually created disease terminologies.
PMCID: PMC2656087  PMID: 18999169
18.  A Feature Dictionary for a Multi-Domain Medical Knowledge Base 
Because different terminology is used by physicians of different specialties in different locations to refer to the same feature (signs, symptoms, test results), it is essential that our knowledge development tools provide a means to access a common pool of terms. This paper discusses the design of an online medical dictionary that provides a solution to this problem for developers of multi-domain knowledge bases for MEDAS (Medical Emergency Decision Assistance System). This Feature Dictionary supports phrase equivalents for features, feature interactions, feature classifications, and translations to the binary features generated by the expert during knowledge creation. It is also used in the conversion of a domain knowledge to the database used by the MEDAS inference diagnostic sessions. The Feature Dictionary also provides capabilities for complex queries across multiple domains using the supported relations. The Feature Dictionary supports three methods for feature representation: 1) binary, 2) continuous valued, and 3) derived.
PMCID: PMC2245233
19.  Multilingual Biomedical Dictionary 
We present a unique technique to create a multilingual biomedical dictionary, based on a methodology called Morpho-Semantic indexing. Our approach closes a gap caused by the absence of freely available multilingual medical dictionaries and the lack of accuracy of non-medical electronic translation tools. We first explain the underlying technology, followed by a description of the dictionary interface, which makes use of a multilingual subword thesaurus and of statistical information from a domain-specific, multilingual corpus.
PMCID: PMC1560551  PMID: 16779220
20.  Interchanging Lexical Information for a Multilingual Dictionary 
Objective
To facilitate the interchange of lexical information for multiple languages in the medical domain. To pave the way for the emergence of a generally available truly multilingual electronic dictionary in the medical domain.
Methods
An interchange format has to be neutral relative to the target languages. It has to be consistent with current needs of lexicon authors, present and future. An active interaction between six potential authors aimed to determine a common denominator striking the right balance between richness of content and ease of use for lexicon providers.
Results
A simple list of relevant attributes has been established and published. The format has the potential for collecting relevant parts of a future multilingual dictionary. An XML version is available.
Conclusion
This effort makes feasible the exchange of lexical information between research groups. Interchange files are made available in a public repository. This procedure opens the door to a true multilingual dictionary, in the awareness that the exchange of lexical information is (only) a necessary first step, before structuring the corresponding entries in different languages.
PMCID: PMC1560452  PMID: 16778996
21.  Portability issues for a structured clinical vocabulary: mapping from Yale to the Columbia medical entities dictionary. 
OBJECTIVE: To examine the issues involved in mapping an existing structured controlled vocabulary, the Medical Entities Dictionary (MED) developed at Columbia University, to an institutional vocabulary, the laboratory and pharmacy vocabularies of the Yale New Haven Medical Center. DESIGN: 200 Yale pharmacy terms and 200 Yale laboratory terms were randomly selected from database files containing all of the Yale laboratory and pharmacy terms. These 400 terms were then mapped to the MED in three phases: mapping terms, mapping relationships between terms, and mapping attributes that modify terms. RESULTS: 73% of the Yale pharmacy terms mapped to MED terms. 49% of the Yale laboratory terms mapped to MED terms. After certain obsolete and otherwise inappropriate laboratory terms were eliminated, the latter rate improved to 59%. 23% of the unmatched Yale laboratory terms failed to match because of differences in granularity with MED terms. The Yale and MED pharmacy terms share 12 of 30 distinct attributes. The Yale and MED laboratory terms share 14 of 23 distinct attributes. CONCLUSION: The mapping of an institutional vocabulary to a structured controlled vocabulary requires that the mapping be performed at the level of terms, relationships, and attributes. The mapping process revealed the importance of standardization of local vocabulary subsets, standardization of attribute representation, and term granularity.
PMCID: PMC116288  PMID: 8750391
22.  Creating a medical dictionary using word alignment: The influence of sources and resources 
Background
Automatic word alignment of parallel texts with the same content in different languages is among other things used to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH and how the terminology systems and type of resources influence the quality.
Methods
We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which parts of the terminology systems we used to generate the resources, which parts we word aligned, and which types of resources we used in the alignment process, to explore the influence that the different terminology systems and resources have on recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary.
Results
The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. The verified English-Swedish dictionary contains 24,000 term pairs in base forms.
Conclusion
More resources give better results in the automatic word alignment, but some resources only give small improvements. The most important type of resource is training and the most general resources were generated from ICD-10.
doi:10.1186/1472-6947-7-37
PMCID: PMC2267171  PMID: 18036221
23.  Creating an Online Dictionary of Abbreviations from MEDLINE 
Objective. The growth of the biomedical literature presents special challenges for both human readers and automatic algorithms. One such challenge derives from the common and uncontrolled use of abbreviations in the literature. Each additional abbreviation increases the effective size of the vocabulary for a field. Therefore, to create an automatically generated and maintained lexicon of abbreviations, we have developed an algorithm to match abbreviations in text with their expansions.
Design. Our method uses a statistical learning algorithm, logistic regression, to score abbreviation expansions based on their resemblance to a training set of human-annotated abbreviations. We applied it to Medstract, a corpus of MEDLINE abstracts in which abbreviations and their expansions have been manually annotated. We then ran the algorithm on all abstracts in MEDLINE, creating a dictionary of biomedical abbreviations. To test the coverage of the database, we used an independently created list of abbreviations from the China Medical Tribune.
Measurements. We measured the recall and precision of the algorithm in identifying abbreviations from the Medstract corpus. We also measured the recall when searching for abbreviations from the China Medical Tribune against the database.
Results. On the Medstract corpus, our algorithm achieves up to 83% recall at 80% precision. Applying the algorithm to all of MEDLINE yielded a database of 781,632 high-scoring abbreviations. Of all the abbreviations in the list from the China Medical Tribune, 88% were in the database.
Conclusion. We have developed an algorithm to identify abbreviations from text. We are making this available as a public abbreviation server at http://abbreviation.stanford.edu/.
doi:10.1197/jamia.M1139
PMCID: PMC349378  PMID: 12386112
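As a rough illustration of matching abbreviations in text with their expansions, the Python sketch below extracts "long form (ABBR)" candidates and computes one simple feature. It is not the paper's logistic regression scorer: the regular expression, the function names, and the single initial-letter feature are assumptions, and a trained model would combine several such features into a score.

```python
import re


def find_candidates(text):
    """Return (abbreviation, preceding_words) for each 'words (ABBR)' pattern.

    Assumption: the expansion uses at most one word per abbreviation letter.
    """
    candidates = []
    for match in re.finditer(r"\(([A-Z]{2,10})\)", text):
        abbr = match.group(1)
        words_before = text[:match.start()].split()
        candidates.append((abbr, words_before[-len(abbr):]))
    return candidates


def initials_match(abbr, words):
    """Feature: does each word start with the corresponding abbreviation letter?"""
    return len(words) == len(abbr) and all(
        word[0].upper() == letter for word, letter in zip(words, abbr)
    )


# Hypothetical sentence, for illustration only.
sentence = ("Patients with chronic obstructive pulmonary disease (COPD) "
            "were enrolled.")
found = find_candidates(sentence)
```

A feature like `initials_match` is deliberately strict; expansions that skip function words or take several letters from one word would need additional features, which is the kind of variation a statistical scorer is meant to absorb.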
24.  Extending the MEDAS Feature Dictionary to Support Access to Radiological Images 
This paper discusses a method of adding a library of radiological images to MEDAS (the Medical Emergency Decision Assistance System). This library is interfaced with the MEDAS Feature Dictionary [1, 2], a dictionary containing terminology for MEDAS knowledge bases. The connections between the radiological images and the terms in the dictionary are used in two ways: 1) To retrieve the images with free text queries. 2) To help in the evaluation of radiological findings during the diagnostic cycle of MEDAS. We plan to use this library as a tool for training students and residents in understanding imaging and its role in diagnostics. This will require construction of a control set of images.
PMCID: PMC2245598
25.  An Entity-Relationship Model for a European Machine-Dictionary of Medicine 
Dictionaries, thesauri, and nomenclatures are among the conventional tools for the systematic organization of terms and concepts of medicine. Computer support provides new functions for them, mainly because it allows different views and approaches, until now treated as alternatives, to co-exist in the same system. We analyze and model the general (language independent) features of both linguistic-terminological aspects and conceptual aspects of an integrated terminological data base of medicine.
The special language of medicine is peculiar with respect to common language, particularly because of its high rate of synonyms and phrasal terms. We examine this peculiarity, analyzing the relationship between medical terms and underlying concepts. We build a continuous scale from the ’free’ text used by health operators in a given document to more and more regular and abstract forms (spelling variants, true synonyms, contextual variants, equivalent terms in different languages, morphosyntactical representatives, concepts).
PMCID: PMC2245517
