Results 1-25 (132260)

Related Articles

10.  Data dictionaries at Giessen University Hospital: past--present--future. 
The concept of maintaining a medical data dictionary as a HIS core component has been fundamental to all HIS development phases since the mid-eighties at Giessen University Hospital. Influenced by an early experimental installation of the HELP hospital information system and its PTXT data dictionary, we kept this approach through a number of development cycles of our own hospital information system. While our first data dictionary implementation (GMDD) was still very close to the PTXT structure (polyhierarchical design with an eight-level hierarchy), the second-generation dictionary (MDD-GIPHARM) was designed using a more flexible semantic network model. GMDD was a mainframe development (realized on Tandem Computers) based on the Tandem Nonstop SQL RDBMS. The major clinical applications established on top of the GMDD were laboratory results review, diagnosis documentation and physician discharge summaries. The MDD-GIPHARM development was initiated on a PC basis as the core of a rheumatology departmental system using MS-Access and then further enhanced within a research project to build knowledge-based functions for drug therapy. A first set of such functions based on MDD-GIPHARM has been in routine use since 1996. Our current focus is to enhance MDD-GIPHARM towards an application-independent vocabulary server (GDDS), which may be used for a variety of applications within the intranet of Giessen University Hospital. In this paper the evolutionary development of those data dictionary concepts at Giessen University Hospital is illustrated and compared with international activities in the last decade.
PMCID: PMC2232233  PMID: 9929344
11.  Coding of adverse events of suicidality in clinical study reports of duloxetine for the treatment of major depressive disorder: descriptive study 
Objective To assess the effects of coding and coding conventions on summaries and tabulations of adverse events data on suicidality within clinical study reports.
Design Systematic electronic search for adverse events of suicidality in tables, narratives, and listings of adverse events in individual patients within clinical study reports. Where possible, for each event we extracted the original term reported by the investigator, the term as coded by the medical coding dictionary, medical coding dictionary used, and the patient’s trial identification number. Using the patient’s trial identification number, we attempted to reconcile data on the same event between the different formats for presenting data on adverse events within the clinical study report.
Setting 9 randomised placebo controlled trials of duloxetine for major depressive disorder submitted to the European Medicines Agency for marketing approval.
Data sources Clinical study reports obtained from the EMA in 2011.
Results Six trials used the medical coding dictionary COSTART (Coding Symbols for a Thesaurus of Adverse Reaction Terms) and three used MedDRA (Medical Dictionary for Regulatory Activities). Suicides were clearly identifiable in all formats of adverse event data in clinical study reports. Suicide attempts presented in tables included both definitive and provisional diagnoses. Suicidal ideation and preparatory behaviour were obscured in some tables owing to the lack of specificity of the medical coding dictionary, especially COSTART. Furthermore, we found one event of suicidal ideation described in narrative text that was absent from tables and adverse event listings of individual patients. The reason for this is unclear, but may be due to the coding conventions used.
Conclusion Data on adverse events in tables in clinical study reports may not accurately represent the underlying patient data because of the medical dictionaries and coding conventions used. In clinical study reports, the listings of adverse events for individual patients and narratives of adverse events can provide additional information, including original investigator reported adverse event terms, which can enable a more accurate estimate of harms.
PMCID: PMC4045315  PMID: 24899651
12.  Using the LOINC Semantic Structure to Integrate Community-based Survey Items into a Concept-based Enterprise Data Dictionary to Support Comparative Effectiveness Research 
In designing informatics infrastructure to support comparative effectiveness research (CER), it is necessary to implement approaches for integrating heterogeneous data sources such as clinical data typically stored in clinical data warehouses and those that are normally stored in separate research databases. One strategy to support this integration is the use of a concept-oriented data dictionary with a set of semantic terminology models. The aim of this paper is to illustrate the use of the semantic structure of Clinical LOINC (Logical Observation Identifiers, Names, and Codes) in integrating community-based survey items into the Medical Entities Dictionary (MED) to support the integration of survey data with clinical data for CER studies.
PMCID: PMC3799173  PMID: 24199059
13.  On A Nonlinear Generalization of Sparse Coding and Dictionary Learning 
Existing dictionary learning algorithms are based on the assumption that the data are vectors in a Euclidean vector space ℝ^d, and the dictionary is learned from the training data using the vector space structure of ℝ^d and its Euclidean L2-metric. However, in many applications, features and data often originate from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis.
PMCID: PMC3796141  PMID: 24129583
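For orientation, the Euclidean special case mentioned in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the standard ISTA iteration for the L1-regularised least-squares problem, which is not necessarily the solver used in the paper, and the names `soft_threshold` and `sparse_code` are hypothetical. The Riemannian generalisation described in the abstract is not shown.

```python
def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal map of t * L1 norm)."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def sparse_code(x, atoms, lam=0.1, step=0.5, iters=200):
    """ISTA for min_a 0.5 * ||x - sum_j a[j] * atoms[j]||^2 + lam * ||a||_1.

    atoms is a list of dictionary atoms, each a list with the same length as x.
    Assumption: step is at most 1 / (largest eigenvalue of D^T D), where D has
    the atoms as columns; otherwise the iteration may not converge.
    """
    a = [0.0] * len(atoms)
    for _ in range(iters):
        # Current reconstruction D a and residual r = D a - x.
        recon = [sum(a[j] * atoms[j][i] for j in range(len(atoms)))
                 for i in range(len(x))]
        r = [ri - xi for ri, xi in zip(recon, x)]
        # Gradient of the quadratic term: D^T r.
        grad = [sum(atom[i] * r[i] for i in range(len(x))) for atom in atoms]
        # Gradient step followed by soft-thresholding.
        a = soft_threshold([aj - step * gj for aj, gj in zip(a, grad)],
                           step * lam)
    return a


# Toy usage with an orthonormal two-atom dictionary (hypothetical data).
code = sparse_code([1.0, 0.05], [[1.0, 0.0], [0.0, 1.0]], lam=0.1)
```

With an orthonormal dictionary the minimiser is simply the soft-thresholded coefficient vector, so the small second coefficient in the toy input is driven to zero while the large one is shrunk by `lam`.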
14.  Evaluation of Controlled Vocabulary Resources for Development of a Consumer Entry Vocabulary for Diabetes 
Digital information technology can facilitate informed decision making by individuals regarding their personal health care. The digital divide separates those who do and those who do not have access to or otherwise make use of digital information. To close the digital divide, health care communications research must address a fundamental issue, the consumer vocabulary problem: consumers of health care, at least those who are laypersons, are not always familiar with the professional vocabulary and concepts used by providers of health care and by providers of health care information, and, conversely, health care and health care information providers are not always familiar with the vocabulary and concepts used by consumers. One way to address this problem is to develop a consumer entry vocabulary for health care communications.
To evaluate the potential of controlled vocabulary resources for supporting the development of consumer entry vocabulary for diabetes.
We used folk medical terms from the Dictionary of American Regional English project to create extended versions of 3 controlled vocabulary resources: the Unified Medical Language System Metathesaurus, the Eurodicautom of the European Commission's Translation Service, and the European Commission Glossary of popular and technical medical terms. We extracted consumer terms from consumer-authored materials, and physician terms from physician-authored materials. We used our extended versions of the vocabulary resources to link diabetes-related terms used by health care consumers to synonymous, nearly-synonymous, or closely-related terms used by family physicians. We also examined whether retrieval of diabetes-related World Wide Web information sites maintained by nonprofit health care professional organizations, academic organizations, or governmental organizations can be improved by substituting a physician term for its related consumer term in the query.
The Dictionary of American Regional English extension of the Metathesaurus provided coverage, either direct or indirect, of approximately 23% of the natural language consumer-term-physician-term pairs. The Dictionary of American Regional English extension of the Eurodicautom provided coverage for 16% of the term pairs. Both the Metathesaurus and the Eurodicautom indirectly related more terms than they directly related. A high percentage of covered term pairs, with more indirectly covered pairs than directly covered pairs, might be one way to make the most out of expensive controlled vocabulary resources. We compared retrieval of diabetes-related Web information sites using the physician terms to retrieval using related consumer terms. We based the comparison on retrieval of sites maintained by non-profit healthcare professional organizations, academic organizations, or governmental organizations. The number of such sites in the first 20 results from a search was increased by substituting a physician term for its related consumer term in the query. This suggests that the Dictionary of American Regional English extensions of the Metathesaurus and Eurodicautom may be used to provide useful links from natural language consumer terms to natural language physician terms.
The Dictionary of American Regional English extensions of the Metathesaurus and Eurodicautom should be investigated further for support of consumer entry vocabulary for diabetes.
PMCID: PMC1761907  PMID: 11720966
Communication barriers; vocabulary, controlled; public health
15.  A dictionary server for supplying context sensitive medical knowledge. 
The Giessen Data Dictionary Server (GDDS), developed at Giessen University Hospital, integrates clinical systems with on-line, context sensitive medical knowledge to help with making medical decisions. By "context" we mean the clinical information that is being presented at the moment the information need is occurring. The dictionary server makes use of a semantic network supported by a medical data dictionary to link terms from clinical applications to their proper information sources. It has been designed to analyze the network structure itself instead of knowing the layout of the semantic net in advance. This enables us to map appropriate information sources to various clinical applications, such as nursing documentation, drug prescription and cancer follow up systems. This paper describes the function of the dictionary server and shows how the knowledge stored in the semantic network is used in the dictionary service.
PMCID: PMC2243816  PMID: 11079978
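The idea in the abstract above, of analysing the network structure at run time rather than hard-coding its layout, can be illustrated with a generic graph traversal. The sketch below is an assumption-laden illustration, not the GDDS implementation: the adjacency-list representation, the node names, and the function `find_sources` are all hypothetical.

```python
from collections import deque


def find_sources(network, start, source_nodes):
    """Breadth-first search from a term to every reachable information source.

    network maps a node to the list of nodes it links to; source_nodes is the
    set of nodes that represent information sources. Nothing about the depth
    or shape of the network is assumed in advance.
    """
    seen, queue, found = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        if node in source_nodes:
            found.append(node)
        for neighbor in network.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return found


# Hypothetical semantic net: a drug term linked to broader classes and sources.
network = {
    "amoxicillin": ["penicillins"],
    "penicillins": ["antibiotics", "drug_monograph"],
    "antibiotics": ["prescribing_guideline"],
}
sources = find_sources(network, "amoxicillin",
                       {"drug_monograph", "prescribing_guideline"})
```

Because the search follows whatever links exist, new concept levels or sources can be added to the network without changing the lookup code.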
16.  Continuous Speech Recognition for Clinicians 
The current generation of continuous speech recognition systems claims to offer high accuracy (greater than 95 percent) speech recognition at natural speech rates (150 words per minute) on low-cost (under $2000) platforms. This paper presents a state-of-the-technology summary, along with insights the authors have gained through testing one such product extensively and other products superficially.
The authors have identified a number of issues that are important in managing accuracy and usability. First, for efficient recognition users must start with a dictionary containing the phonetic spellings of all words they anticipate using. The authors dictated 50 discharge summaries using one inexpensive internal medicine dictionary ($30) and found that they needed to add an additional 400 terms to get recognition rates of 98 percent. However, if they used either of two more expensive and extensive commercial medical vocabularies ($349 and $695), they did not need to add terms to get a 98 percent recognition rate. Second, users must speak clearly and continuously, distinctly pronouncing all syllables. Users must also correct errors as they occur, because accuracy improves with error correction by at least 5 percent over two weeks. Users may find it difficult to train the system to recognize certain terms, regardless of the amount of training, and appropriate substitutions must be created. For example, the authors had to substitute “twice a day” for “bid” when using the less expensive dictionary, but not when using the other two dictionaries. From trials they conducted in settings ranging from an emergency room to hospital wards and clinicians' offices, they learned that ambient noise has minimal effect. Finally, they found that a minimal “usable” hardware configuration (which keeps up with dictation) comprises a 300-MHz Pentium processor with 128 MB of RAM and a “speech quality” sound card (e.g., SoundBlaster, $99). Anything less powerful will result in the system lagging behind the speaking rate.
The authors obtained 97 percent accuracy with just 30 minutes of training when using the latest edition of one of the speech recognition systems supplemented by a commercial medical dictionary. This technology has advanced considerably in recent years and is now a serious contender to replace some or all of the increasingly expensive alternative methods of dictation with human transcription.
PMCID: PMC61360  PMID: 10332653
17.  Low-Dose X-ray CT Reconstruction via Dictionary Learning 
IEEE transactions on medical imaging  2012;31(9):1682-1697.
Although diagnostic medical imaging provides enormous benefits in the early detection and accurate diagnosis of various diseases, there are growing concerns about the potential side effects of radiation-induced genetic, cancerous and other diseases. How to reduce radiation dose while maintaining the diagnostic performance is a major challenge in the computed tomography (CT) field. Inspired by the compressive sensing theory, the sparse constraint in terms of total variation (TV) minimization has already led to promising results for low-dose CT reconstruction. Compared to the discrete gradient transform used in the TV method, dictionary learning is proven to be an effective way for sparse representation. On the other hand, it is important to consider the statistical property of projection data in the low-dose CT case. Recently, we have developed a dictionary learning based approach for low-dose X-ray CT. In this paper, we present this method in detail and evaluate it in experiments. In our method, the sparse constraint in terms of a redundant dictionary is incorporated into an objective function in a statistical iterative reconstruction framework. The dictionary can be either predetermined before an image reconstruction task or adaptively defined during the reconstruction process. An alternating minimization scheme is developed to minimize the objective function. Our approach is evaluated with low-dose X-ray projections collected in animal and human CT studies, and the improvement associated with dictionary learning is quantified relative to filtered backprojection and TV-based reconstructions. The results show that the proposed approach might produce better images with lower noise and more detailed structural features in our selected cases. However, there is no proof that this is true for all kinds of structures.
PMCID: PMC3777547  PMID: 22542666
Compressive sensing (CS); computed tomography (CT); dictionary learning; low-dose CT; sparse representation; statistical iterative reconstruction
18.  Creating a medical dictionary using word alignment: The influence of sources and resources 
Automatic word alignment of parallel texts with the same content in different languages is among other things used to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH and how the terminology systems and type of resources influence the quality.
We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which parts of the terminology systems we used to generate the resources, which parts we word aligned, and which types of resources we used in the alignment process, in order to explore the influence that the different terminology systems and resources have on recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary.
The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. The verified English-Swedish dictionary contains 24,000 term pairs in base forms.
More resources give better results in the automatic word alignment, but some resources only give small improvements. The most important type of resource is training and the most general resources were generated from ICD-10.
PMCID: PMC2267171  PMID: 18036221
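The abstract above evaluates alignment configurations by recall and precision. As a reminder of how these two measures are computed for candidate term pairs against a manually verified set, here is a minimal sketch; the function name `precision_recall` and the toy English-Swedish pairs are hypothetical and are not taken from the study.

```python
def precision_recall(candidates, gold):
    """Precision and recall of candidate term pairs against verified pairs.

    precision = correct candidates / all candidates
    recall    = correct candidates / all verified (gold) pairs
    """
    candidates, gold = set(candidates), set(gold)
    correct = candidates & gold
    precision = len(correct) / len(candidates) if candidates else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Toy example: one of three proposed pairs is wrong, one gold pair is missed.
proposed = [("lung", "lunga"), ("liver", "njure"), ("skin", "hud")]
verified = [("lung", "lunga"), ("liver", "lever"),
            ("kidney", "njure"), ("skin", "hud")]
precision, recall = precision_recall(proposed, verified)
```

In this toy case two of the three proposed pairs are correct and two of the four verified pairs are found, which shows how a configuration can trade one measure against the other.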
19.  American Medical Literary Firsts, 1700-1820, in the Countway Library 
A combination of two major collections, those of the Harvard Medical and the Boston Medical Libraries, took place at a formal dedication of the Francis A. Countway Library of Medicine in May of 1965. As a result of this unification the historian may now consult many significant primary source materials in one research collection in Boston. Many of these works are early imprints of medical Americana. This paper discusses twenty-four imprints, from 1700 to 1820, which were firsts of their kind in American medical literature. Presented are the first American medical publication, book, pharmacopeia, mortality statistics, anatomical illustrations, transactions of a medical society, medical journal, textbook of obstetrics, medical dictionary, textbook of medicine, and official pharmacopeia. Also discussed are the first works in this country on smallpox inoculation, scarlet fever, pleurisy, public health, medical education, medical ethics, the history of medicine, surgery, epidemiology, smallpox vaccination, dentistry, meningitis, and psychiatry. An exhibit of these items in the Countway Library is planned for the spring and summer of 1966.
PMCID: PMC198371  PMID: 5901363
20.  Creating a medical English-Swedish dictionary using interactive word alignment 
This paper reports on a parallel collection of rubrics from the medical terminology systems ICD-10, ICF, MeSH, NCSP and KSH97-P and its use for semi-automatic creation of an English-Swedish dictionary of medical terminology. The methods presented are relevant for many other West European language pairs than English-Swedish.
The medical terminology systems were collected in electronic format in both English and Swedish and the rubrics were extracted in parallel language pairs. Initially, interactive word alignment was used to create training data from a sample. Then the training data were utilised in automatic word alignment in order to generate candidate term pairs. The last step was manual verification of the term pair candidates.
A dictionary of 31,000 verified entries has been created in less than three man weeks, thus with considerably less time and effort needed compared to a manual approach, and without compromising quality. As a side effect of our work we found 40 different translation problems in the terminology systems and these results indicate the power of the method for finding inconsistencies in terminology translations. We also report on some factors that may contribute to making the process of dictionary creation with similar tools even more expedient. Finally, the contribution is discussed in relation to other ongoing efforts in constructing medical lexicons for non-English languages.
In three man weeks we were able to produce a medical English-Swedish dictionary consisting of 31,000 entries and also found hidden translation errors in the utilized medical terminology systems.
PMCID: PMC1624822  PMID: 17034649
21.  MiDas: Automatic Extraction of a Common Domain of Discourse in Sleep Medicine for Multi-center Data Integration 
AMIA Annual Symposium Proceedings  2011;2011:1196-1205.
Clinical studies often use data dictionaries with controlled sets of terms to facilitate data collection, limited interoperability and sharing at a local site. Multi-center retrospective clinical studies require that these data dictionaries, originating from individual participating centers, be harmonized in preparation for the integration of the corresponding clinical research data. Domain ontologies are often used to facilitate multi-center data integration by modeling terms from data dictionaries in a logic-based language, but interoperability among domain ontologies (using automated techniques) is an unresolved issue. Although many upper-level reference ontologies have been proposed to address this challenge, our experience in integrating multi-center sleep medicine data highlights the need for an upper level ontology that models a common set of terms at multiple-levels of abstraction, which is not covered by the existing upper-level ontologies. We introduce a methodology underpinned by a Minimal Domain of Discourse (MiDas) algorithm to automatically extract a minimal common domain of discourse (upper-domain ontology) from an existing domain ontology. Using the Multi-Modality, Multi-Resource Environment for Physiological and Clinical Research (Physio-MIMI) multi-center project in sleep medicine as a use case, we demonstrate the use of MiDas in extracting a minimal domain of discourse for sleep medicine, from Physio-MIMI’s Sleep Domain Ontology (SDO). We then extend the resulting domain of discourse with terms from the data dictionary of the Sleep Heart and Health Study (SHHS) to validate MiDas. To illustrate the wider applicability of MiDas, we automatically extract the respective domains of discourse from 6 sample domain ontologies from the National Center for Biomedical Ontologies (NCBO) and the OBO Foundry.
PMCID: PMC3243207  PMID: 22195180
22.  Model-based semantic dictionaries for medical language understanding. 
Semantic dictionaries are emerging as a major cornerstone towards achieving sound natural language understanding. Indeed, they constitute the main bridge between words and conceptual entities that reflect their meanings. Nowadays, more and more wide-coverage lexical dictionaries are electronically available in the public domain. However, associating a semantic content with lexical entries is not a straightforward task as it is subordinate to the existence of a fine-grained concept model of the treated domain. This paper presents the benefits and pitfalls in building and maintaining multilingual dictionaries, the semantics of which is directly established on an existing concept model. Concrete cases, handled through the GALEN-IN-USE project, illustrate the use of such semantic dictionaries for the analysis and generation of multilingual surgical procedures.
PMCID: PMC2232654  PMID: 10566333
23.  A Medical Data Dictionary for Decision Support Applications 
Building and maintaining clinically-based medical information systems is a complex task. Advances in database management technologies, including the concept of a data dictionary, have helped support this process. A medical data dictionary is described with discussions on the role of data dictionaries within a medical information system and the conceptual model supported by the AT&T CareComm (TM) data dictionary. Additionally, a high level overview of features that we deemed important to support a medical information system with integrated decision support capabilities is presented.
PMCID: PMC2245107
24.  Categorization of free-text problem lists: an effective method of capturing clinical data. 
Problem lists assist in organizing patient information in computer based medical records. However, in order to use problem lists for billing, research, decision support and standardization, a categorization of the problems entered is required. We describe the problem list component of our computerized patient record, the On-line Medical Record (OMR), which combines a free-text entry mechanism with a categorization scheme, using a dictionary containing 846 terms. All 118,040 problems entered during the system's six years of use have been analyzed. In total, 477 clinicians have entered a mean +/- S.D. of 238 +/- 604 problems into 22,311 patient records. The average number of problems in each patient's file was 5.1 +/- 3.9. Comments were typed for 80,281 (68%) of the problems, ranging in length from 1 to 2456 characters, with a mean length of 98 +/- 110 characters. Half the problems were entered on the day of the encounter with the patient. Overall, 66% of all problems were categorized in relation to terms from the problem dictionary. Lexical analysis of all problem names showed that 80% could be mapped to Meta 1.4, Snomed 3.0 or a pre-release version of Read 3.0. We conclude that a problem list entry scheme combining free-text entry and optional categorization using a dictionary can result in a high proportion of problems being categorized as desired. Improvement of the system by elimination of unused dictionary terms and addition of 1000 terms identified by the lexical analysis is likely to result in even higher categorization rates.
PMCID: PMC2579126  PMID: 8563314
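To make the notion of a categorization rate concrete, the sketch below maps free-text problem names to dictionary terms by exact match after simple normalisation and reports the fraction that could be mapped. This is an assumption for illustration only: the abstract does not describe the matching rules of the OMR or of the lexical analysis, and the names `normalize` and `categorize` as well as the sample terms are hypothetical.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())


def categorize(problems, dictionary_terms):
    """Map each free-text problem to a dictionary term, or None if unmatched.

    Returns the list of (problem, term) pairs and the categorization rate,
    i.e. the fraction of problems that matched a dictionary term.
    """
    index = {normalize(term): term for term in dictionary_terms}
    pairs = [(p, index.get(normalize(p))) for p in problems]
    matched = sum(1 for _, term in pairs if term is not None)
    rate = matched / len(pairs) if pairs else 0.0
    return pairs, rate


# Toy usage with a two-term dictionary (hypothetical entries).
pairs, rate = categorize(
    ["Hypertension", "diabetes  mellitus.", "sore elbow"],
    ["hypertension", "Diabetes Mellitus"],
)
```

Entries that fail to match, such as the third problem here, are the kind of candidates a lexical analysis could surface as terms worth adding to the dictionary.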
25.  A Feature Dictionary for a Multi-Domain Medical Knowledge Base 
Because different terminology is used by physicians of different specialties in different locations to refer to the same feature (signs, symptoms, test results), it is essential that our knowledge development tools provide a means to access a common pool of terms. This paper discusses the design of an online medical dictionary that provides a solution to this problem for developers of multi-domain knowledge bases for MEDAS (Medical Emergency Decision Assistance System). This Feature Dictionary supports phrase equivalents for features, feature interactions, feature classifications, and translations to the binary features generated by the expert during knowledge creation. It is also used in the conversion of a domain knowledge to the database used by the MEDAS inference diagnostic sessions. The Feature Dictionary also provides capabilities for complex queries across multiple domains using the supported relations. The Feature Dictionary supports three methods for feature representation: 1) binary, 2) continuous valued, and 3) derived.
PMCID: PMC2245233
