Proc Annu Symp Comput Appl Med Care. 1993: 685–689.
PMCID: PMC2850662

Words or concepts: the features of indexing units and their optimal use in information retrieval.


Words or concepts: which is the better choice for indexing the contents of documents? The answer depends on the retrieval method used. This paper studies the effects of using canonical concepts versus document words in different retrieval systems, with a test collection of MEDLINE documents. In our tests, for a retrieval system that uses no human knowledge, words yielded better retrieval results, while concepts suffered from a vocabulary difference between the canonical expressions of concepts and the non-canonical words in queries or documents. For a system that relies on the UMLS synonym set to map queries or documents to canonical concepts, the retrieval results were slightly better than without the synonyms, but still worse than those of the systems using words. For the systems that automatically "learn" empirical connections between words and concepts from examples in the test collection, the vocabulary problem was effectively solved, and the results of using concepts were competitive with or better than those of using words.
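The vocabulary problem described above can be illustrated with a toy sketch. It is not the systems evaluated in the paper: the synonym table is a hypothetical stand-in for the UMLS synonym set (not real UMLS data), the function names are invented for illustration, and the co-occurrence counting is only a simple substitute for whatever learning method the paper uses.

```python
from collections import Counter, defaultdict

# Hypothetical synonym table (NOT real UMLS data): surface phrase -> canonical concept.
SYNONYMS = {
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "high blood pressure": "hypertension",
    "hypertension": "hypertension",
}

def words(text):
    """Index a text by its lower-cased words."""
    return set(text.lower().split())

def concepts_via_synonyms(text):
    """Map a text to canonical concepts using the synonym table only."""
    lowered = text.lower()
    return {concept for phrase, concept in SYNONYMS.items() if phrase in lowered}

def overlap(a, b):
    """Number of indexing units shared by two sets."""
    return len(a & b)

def learn_word_concept_links(examples):
    """Count how often each word co-occurs with each assigned concept.

    `examples` is a list of (text, set_of_concepts) pairs, an assumed
    training format for this sketch.
    """
    links = defaultdict(Counter)
    for text, assigned in examples:
        for w in words(text):
            for c in assigned:
                links[w][c] += 1
    return links

def concepts_via_learning(text, links, min_count=2):
    """Map a text to concepts whose summed co-occurrence count reaches min_count."""
    scores = Counter()
    for w in words(text):
        scores.update(links.get(w, {}))
    return {c for c, n in scores.items() if n >= min_count}

if __name__ == "__main__":
    doc_text = "aspirin after heart attack"
    doc_concepts = {"myocardial infarction"}  # canonical index term for the document
    query = "heart attack treatment"

    # Word indexing: query and document share surface words.
    print(overlap(words(query), words(doc_text)))
    # Concept indexing without any mapping: query words never equal the canonical term.
    print(overlap(words(query), doc_concepts))
    # Concept indexing with a synonym table: the mismatch is bridged when the phrase is listed.
    print(overlap(concepts_via_synonyms(query), doc_concepts))

    # Learned links can cover wording that the synonym table does not list.
    examples = [
        ("acute coronary occlusion", {"myocardial infarction"}),
        ("coronary thrombosis case", {"myocardial infarction"}),
        ("elevated blood pressure", {"hypertension"}),
    ]
    links = learn_word_concept_links(examples)
    print(concepts_via_synonyms("coronary event"))
    print(concepts_via_learning("coronary event", links))
```

In this sketch the synonym table helps only when the query phrase happens to be listed, whereas links counted from examples can reach a concept from wording the table never anticipated. The `min_count` threshold is an arbitrary illustrative choice.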



Articles from Proceedings of the Annual Symposium on Computer Application in Medical Care are provided here courtesy of American Medical Informatics Association