Australas Med J. 2012; 5(9): 489–496.
Published online 2012 September 30. doi:  10.4066/AMJ.2012.1378
PMCID: PMC3477791

Automated medical literature retrieval



The constantly growing publication rate of medical research articles puts increasing pressure on medical specialists who need to be aware of the recent developments in their field. The literature retrieval systems currently in use allow researchers to find specific papers; however, the search task is still repetitive and time-consuming.


In this paper we describe a system that retrieves medical publications by automatically generating queries based on data from an electronic patient record. This allows the doctor to focus on medical issues and provide an improved service to the patient, with higher confidence that it is underpinned by current research.


Our research prototype automatically generates query terms based on the patient record and adds weight factors for each term. Currently the patient’s age is taken into account with a fuzzy logic derived weight, and terms describing blood-related anomalies are derived from recent blood test results. Conditionally selected homonyms are used for query expansion.

The query retrieves matching records from a local index of PubMed publications and displays results in descending relevance for the given patient. Recent publications are clearly highlighted for instant recognition by the researcher.


Nine medical specialists from the Royal Adelaide Hospital evaluated the system and submitted pre-trial and post-trial questionnaires. Throughout the study we received positive feedback: doctors felt the support provided by the prototype was useful and said they would like to use it in their daily routine.


By supporting the time-consuming task of query formulation and iterative modification as well as by presenting the search results in order of relevance for the specific patient, literature retrieval becomes part of the daily workflow of busy professionals.

Keywords: Literature retrieval

What this study adds:

  1. Whilst many literature retrieval tools exist on the market, none provide an automated workflow to minimise the search effort.
  2. Our approach automatically finds relevant publications without much interaction by the doctor.
  3. Medical specialists are made aware of relevant literature even during patient consultation without taking attention away from the patient.


The publication rate of medical research articles is constantly increasing. This puts pressure on medical specialists who need to be aware of the outcome of recent studies in their field, recent discoveries and new technologies in order to provide the optimal diagnosis and treatment for their patients.1,2

The search tools currently available to medical specialists allow them to instantly search all publications; however, the retrieval process is time-consuming and repetitious.3 The most widely used medical publication system, PubMed,4 provided by the United States National Library of Medicine, is based on Boolean retrieval. It therefore only shows publications exactly matching the search terms entered into the system. Those search terms can be combined as Boolean expressions using the logical operators AND, OR and NOT.

Specifying only a few search terms often results in hundreds of thousands of matching publications. However, adding a few more search terms quickly leads to an empty result set. Doctors therefore tend to under-specify the query to avoid missing out on important articles. They also have to manually iterate through various combinations of related search terms and their homonyms to find all relevant articles. Whilst PubMed offers a range of features to assist the power user, including identification of related citations and automated query expansion, the search process is still an extremely time-consuming and repetitive task. Medical practitioners working in a clinical setting need to be able to focus on the patient, rather than on retrieval technology.
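
The trade-off between under-specified and over-specified Boolean queries can be illustrated with a toy matcher. This is only a sketch and not PubMed's implementation: the three documents and the helper `boolean_and` are invented for illustration, and matching is done by simple substring search.

```python
# Toy Boolean retrieval: a document matches only if it contains ALL query terms.
# The documents are invented for illustration; this is not PubMed's engine.
DOCS = {
    1: "cystic fibrosis related diabetes in adolescents",
    2: "cystic fibrosis and hyponatremia in adult patients",
    3: "airway clearance techniques in cystic fibrosis",
}


def boolean_and(terms, docs=DOCS):
    """Return the sorted ids of documents containing every term (case-insensitive)."""
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if all(term.lower() in text.lower() for term in terms)
    )


# One term matches everything, two terms narrow the result,
# and a third term already empties the result set.
broad = boolean_and(["cystic fibrosis"])
narrow = boolean_and(["cystic fibrosis", "diabetes"])
over_specified = boolean_and(["cystic fibrosis", "diabetes", "hyponatremia"])
```

With a ranked, weighted query the third term would lower or raise a document's position instead of excluding it outright.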

The system we developed allows medical staff to retrieve the publications most relevant for an actual patient without the need to specify long queries describing the patient’s current condition. Our main contribution is the automatic generation of a query including the use of term weights based on data in the patient record. We developed a prototype based on records of patients diagnosed with Cystic Fibrosis, a disease that is managed by a multidisciplinary team.

Cystic Fibrosis

Cystic fibrosis (CF) is the most common autosomal recessive disease resulting in premature death in Caucasian populations. In Australia the disease occurs in 1 of every 2800 live births, and the number of people with CF in Australia is approximately 3000.5 The disease is due to a mutation of the Cystic Fibrosis Transmembrane conductance Regulator (CFTR), which is responsible for controlling chloride (and therefore sodium and water) transport across cells.6

It is a multisystem disease with the respiratory tract being the most severely affected and respiratory failure being the common cause of death if lung transplantation is not available. The gastrointestinal tract, including the pancreas and liver, is commonly involved, with resultant malabsorption. Ongoing pancreatic disease may also result in diabetes. Liver involvement may result in the development of liver damage with cirrhosis and portal hypertension. Abnormal function of the CFTR results in loss of sodium and chloride in the sweat and may lead to serum electrolyte abnormalities. In males with CF, congenital absence of the vas deferens occurs in 95% of patients, rendering them infertile.

The important management strategies in the treatment of CF are:7

  • management by a multidisciplinary team;
  • treating respiratory tract infections;
  • maintaining a high caloric, high fat diet;
  • airway clearance, i.e. ensuring the airways are free of mucous plugs and sputum.

The multisystem nature of the disease and the multidisciplinary treatment strategies mean that any search engine must be able to cover many aspects of medicine from basic genetics to therapeutics, physiotherapy and dietetics. It is also important that the results from literature searches be clinically relevant, current and in a format not requiring further searches through hundreds of pages. This was the challenge of this project. Blood electrolytes and glucose levels are often abnormal in patients with CF. We thought that searching on these biochemical indices would be an appropriate starting point to test the search engine logic.

Materials and Methods

The following section describes the data used and the components of our system.

The corpus of publications indexed is the subset of all CF-related publications accessible through PubMed. As the system has been designed to be used by a multidisciplinary team of medical researchers sharing care for CF patients, articles describing studies are of high relevance. For another audience in different applications it might be more suitable to limit the corpus to review papers or even clinical practice guidelines.

Search engine

Our tool is based on the probabilistic search engine PADRE,8 which supports the use of weights for search terms. PADRE considers a query term with a term weight of 2 to be twice as important as a query term with the default weight of 1. Query terms and synonyms are therefore weighted in order to distinguish between more and less relevant search terms. Although some probabilistic search engines allow the manual weighting of query terms, such weights are usually exploited only internally, for example for relevance feedback, since exposing them for manually entered queries would make the user interface too complex. We will show example query components in PADRE syntax with alternative terms being encapsulated in square brackets, e.g. ["search term"^weight "alternative term"^alt weight].
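
To make the bracketed notation concrete, the following sketch renders weighted terms as strings in the syntax shown above. The helper `format_component` is hypothetical and only builds text; it does not call PADRE, and the exact syntax PADRE accepts is assumed from the examples in this paper.

```python
def format_component(weighted_terms):
    """Render (term, weight) pairs in the bracketed syntax used in the text.

    A single term is written as "term"^weight; several alternatives are
    wrapped in square brackets. Hypothetical helper, not part of PADRE.
    """
    parts = [f'"{term}"^{weight:g}' for term, weight in weighted_terms]
    if len(parts) == 1:
        return parts[0]
    return "[" + " ".join(parts) + "]"


# A term weighted 2 counts twice as much as one with the default weight 1.
single = format_component([("search term", 2)])
alternatives = format_component([("search term", 2), ("alternative term", 1)])
```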

The fundamental text-matching component of the PADRE ranking function is a slightly modified version of BM25.9 Page 4 of reference 8 gives the precise formula. The BM25 score for a document, given a multi-term query, is the weighted sum of the scores due to each query term. Individual term scores are calculated using a tf.idf variant in which term frequencies are document-length normalised (parameterised) and subjected to a (parameterised) saturation function. Individual term scores are weighted (multiplied) by q_t, the weight calculated by our tool for term t.
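
The role of the term weight q_t can be sketched with textbook BM25. The code below is an assumption-laden illustration, not PADRE's ranking function: it uses the common BM25 form with the frequently used defaults k1 = 1.2 and b = 0.75, whereas PADRE uses a slightly modified variant whose details are given in reference 8. The tiny tokenised corpus is invented.

```python
import math


def bm25_score(query_weights, doc_tokens, corpus, k1=1.2, b=0.75):
    """Score one document as the weighted sum of per-term BM25 scores.

    query_weights maps each query term t to its weight q_t.
    Textbook BM25 is used here; PADRE's modified variant may differ.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term, q_t in query_weights.items():
        df = sum(1 for d in corpus if term in d)  # document frequency
        if df == 0:
            continue
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        tf = doc_tokens.count(term)
        # Document-length normalisation feeding a saturating tf component.
        norm = k1 * (1 - b + b * len(doc_tokens) / avg_len)
        score += q_t * idf * tf * (k1 + 1) / (tf + norm)
    return score


# Invented, pre-tokenised mini corpus.
corpus = [
    ["cystic", "fibrosis", "glucose"],
    ["cystic", "fibrosis", "sodium"],
    ["asthma", "glucose", "glucose"],
]
base = bm25_score({"glucose": 1.0}, corpus[0], corpus)
doubled = bm25_score({"glucose": 2.0}, corpus[0], corpus)
no_match = bm25_score({"glucose": 1.0}, corpus[1], corpus)
```

Because q_t is a plain multiplier, doubling a term's weight doubles that term's contribution to the document score, which is the behaviour described for PADRE term weights.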

Some studies show that certain domain experts are reluctant to move away from Boolean retrieval systems.10 However, rather than just aiming for 100% recall, this tool has functions that simplify the retrieval process, which allow it to be embedded in a doctor’s daily workflow.

Automatic query generation

The first step in taking a load off the doctor is to relieve them of the task of manually entering search terms describing the patient’s situation.

This covers the first aspect of an Evidence-based medicine (EBM)11 related search strategy described as the acronym PICO, where the doctor expresses clinical questions in four steps:

  • Patient: Who/what is the patient/problem being addressed?
  • Intervention: What is the intended intervention?
  • Comparison: What is the intervention compared to?
  • Outcome: What are the outcomes?

The query is automatically generated based on the electronic patient record. The automatic query generation is currently based on diagnoses, the age of the patient and the results of recent blood tests.

Simple search terms: Diagnoses have already been entered manually into the patient record by one of the doctors in previous consultations. We extract those underlying diagnoses from the patient record and weight them with 1, as they would be if entered manually as search terms.

Example query component: "Cystic Fibrosis"^1

Fuzzy mapped search terms: PubMed uses Medical Subject Headings (MeSH) terms to encode the age of the patients described within a publication. Those age-related MeSH terms are defined crisply, and a PubMed search for a certain age group would not necessarily return publications mentioning patients that are only one year older. We compensate for this disadvantage of indexing using crisp MeSH terms. The age of the patient is mapped to age-specific search terms and weights are calculated based on fuzzy membership functions describing each of those terms. Table 1 lists the age-related MeSH terms and their crisp definition, as well as the fuzzy definition we expanded it to by adding the sides of the trapezoids. Notice that we only generated fuzzy membership functions for the age ranges of the patients currently in the database. Most importantly, we extended the age range for the term Adult, which is defined within MeSH as 19 to 44 years. We do not believe that this is the common understanding of the term Adult. To reflect this, we de facto removed the upper boundaries for Adult and Aged by setting them to the artificially high age of 200 years, which keeps the algorithm simple whilst still effectively removing one flank.

Figure 1 shows two overlapping trapezoidal membership functions for Adolescent and Young Adult and how search term weights are generated for a given age of 20 years. Using a crisp definition as in Table 1, only publications labeled with the MeSH term Young Adult would be returned. Our system, however, also retrieves publications indexed with the MeSH term Adolescent, but ranks those results lower due to the reduced term weight.

Figure 1
Calculating term weights for age-related MeSH terms using fuzzy membership functions for a 20 year old patient. The fuzzy membership function for Adolescent is drawn in red (dash-dotted line), the one for Young Adult in green (dashed line).
Table 1:
Age-related MeSH terms and their crisp definition compared to our fuzzy definition, for age ranges of patients in the database

The weight for an age-related search term is computed using Formula 1 for trapezoidal fuzzy membership functions, using the values a, b, c, d defined in Table 1.

Formula 1:
Weights based on fuzzy membership functions.

Example query component: ["Young Adult"^1 "Adolescent"^0.5]
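
The age weighting can be sketched with the standard trapezoidal membership function over the four values a, b, c, d. The values of Table 1 are not reproduced here, so the parameters below are hypothetical: the plateaus [b, c] follow the crisp MeSH ranges for Adolescent (13 to 18 years) and Young Adult (19 to 24 years), while the flanks a and d are assumptions chosen so that a 20 year old patient obtains the weights of the example query component above.

```python
def trapezoid(x, a, b, c, d):
    """Standard trapezoidal membership: 1 on [b, c], linear flanks, 0 elsewhere."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)  # rising flank
    if c < x < d:
        return (d - x) / (d - c)  # falling flank
    return 0.0


# HYPOTHETICAL (a, b, c, d) values: the flanks are illustrative assumptions,
# not the values of Table 1.
AGE_TERMS = {
    "Adolescent": (11, 13, 18, 22),
    "Young Adult": (17, 19, 24, 28),
}


def age_weights(age, terms=AGE_TERMS):
    """Map a patient's age to the MeSH age terms with non-zero weight."""
    weights = {name: trapezoid(age, *params) for name, params in terms.items()}
    return {name: w for name, w in weights.items() if w > 0}


weights_at_20 = age_weights(20)
weights_at_15 = age_weights(15)
```

Setting d to an artificially high value such as 200, as described for Adult and Aged, makes the falling flank unreachable for any realistic age without changing the function.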

Extremeness of blood test results: The results from recent blood tests are compared to the minimum and maximum values provided by the lab for each measurement. The ranges provided by the labs alone did not prove sufficient: for some blood tests, such as glucose, the mean value of the cohort of patients is already higher than the maximum defined by the lab. This is because all patients in the database are CF patients, and CF patients tend to develop CF-related diabetes.

Our tool therefore also takes into account the cohort of patients in the database. The variability in the blood tests is shown in Figure 2, a parallel coordinates plot.12 Each line shows the most recent blood test values for one patient in the database. On the four vertical parallel axes labelled calcium, glucose, potassium and sodium, the data points are scaled to plot the lowest recorded value at the bottom and the highest recorded value at the top of the axes. This visualisation was used to examine the variation in combinations of expressions across the patient cohort. The plot has been produced using the statistical environment R and a package called iPlots. The iPlots package allows the user to select a certain line or group of lines to see the combination of expressions. In the screenshot, a group of patients sharing a similarly low sodium level has been selected, causing the lines to be drawn in red.

Figure 2
A Parallel Coordinates Plot showing the variety of combinations of blood test results in the cohort of patients.

This type of plot clearly shows that even though the five patients are low in sodium, the values of other blood test types are widely distributed. This variability in distribution shows that every patient’s data is likely to generate a unique query even without taking the other query components into account.

Definition 1: Let p(x) denote the percentage of observations of the metabolite that are lower than the observation x for the patient of interest. (Therefore x is the p(x)th percentile of the empirical distribution of the metabolite across all patients.) We decided to increase the weight of search terms according to the empirically defined exponential weighting given in Formula 2.

Formula 2
Percentile-based weights.

According to Formula 2, a test result at the 50th percentile (the median) will receive a weighting of 0.1. This small weight will effectively result in the search term becoming insignificant. Outlying values, however, will receive higher weights; for example, a test result at the 5th (or 95th) percentile will receive a weighting of 8.6 and will therefore rank publications containing the search term much higher.
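
The percentile-based weighting can be sketched as follows. The function `percentile_below` follows Definition 1 directly. The weighting function, however, is only an assumed stand-in and not the paper's Formula 2: it is a symmetric exponential in the distance from the median, with its two constants fitted to the two values quoted above (0.1 at the 50th percentile and 8.6 at the 5th or 95th percentile). The cohort values are invented.

```python
import math


def percentile_below(x, cohort_values):
    """p(x): percentage of cohort observations strictly lower than x (Definition 1)."""
    lower = sum(1 for v in cohort_values if v < x)
    return 100.0 * lower / len(cohort_values)


# ASSUMED form, NOT the paper's Formula 2:
#   w(p) = W_MEDIAN * exp(K * |p - 50|)
# W_MEDIAN and K are fitted to the two weights quoted in the text.
W_MEDIAN = 0.1
K = math.log(8.6 / W_MEDIAN) / 45.0


def percentile_weight(p):
    """Weight grows exponentially as the percentile moves away from the median."""
    return W_MEDIAN * math.exp(K * abs(p - 50.0))


# Invented cohort of ten most-recent values for one blood test.
cohort = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
p = percentile_below(5, cohort)
w_median = percentile_weight(50)
w_low = percentile_weight(5)
w_high = percentile_weight(95)
```

Using the cohort's empirical distribution rather than the lab's reference range means that a value typical for CF patients, such as a mildly raised glucose level, contributes little to the query.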

Introduction of homonyms and conditional homonyms: We manually compiled a list of homonyms to allow the specialist to fine-tune the system for their needs.

If the patient’s test result is outside the range provided by the test lab, we also add the appropriate terms describing this condition. A high or low glucose level would therefore trigger the alternative search term Hyperglycemia or Hypoglycemia respectively to be added. The weight for this alternative search term is still determined by the percentile of the value.

Example query component: ["Glucose"^0.2 "blood sugar"^0.2 "Hyperglycemia"^0.2]
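
The conditional expansion described above can be sketched for the glucose case. The helper `glucose_component` and the lab range passed to it are hypothetical; the weight is supplied by the caller, since in the system it is derived from the percentile of the value.

```python
def glucose_component(value, lab_min, lab_max, weight):
    """Build the weighted alternatives for a glucose result.

    The condition term is added only when the value lies outside the lab range;
    all alternatives share the percentile-derived weight passed in.
    """
    terms = [("Glucose", weight), ("blood sugar", weight)]
    if value > lab_max:
        terms.append(("Hyperglycemia", weight))
    elif value < lab_min:
        terms.append(("Hypoglycemia", weight))
    return "[" + " ".join(f'"{t}"^{w:g}' for t, w in terms) + "]"


# Hypothetical lab range (3.5 to 7.5) and values, for illustration only.
high = glucose_component(9.0, 3.5, 7.5, 0.2)
normal = glucose_component(5.0, 3.5, 7.5, 0.2)
low = glucose_component(2.9, 3.5, 7.5, 0.2)
```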

User interface design and result presentation

The design of the user interface was mainly guided by the following five design goals: easy to learn, simple to use, fast, comprehensible and visually appealing.

Figure 3
Screen shot of the medical literature retrieval tool developed.


System Behaviour

The first version of the prototype used PubMed directly as a backend and composed queries in PubMed syntax.

However, this did not prove beneficial for various reasons:

  • It took several seconds to retrieve the results through the PubMed API.
  • PubMed does not allow the use of term weights.
  • The results are not ordered by relevance, but e.g. by date of publication.
  • A high number of query terms led to empty result sets.

We did not measure exact query execution times for the two systems for comparison, as the other three problems listed above are of a qualitative nature and were prohibitive in themselves.

User study

We approached medical staff providing ongoing medical treatment for CF patients at the Royal Adelaide Hospital to evaluate the system and asked them for feedback. Nine participants were recruited and contributed their feedback. Although this small number of participants does not allow a quantitative analysis of the results, it is offset by our access to real patient data and by the fact that the participants are professionals who actually treat CF patients and are therefore potential future users of the system.

Pre-trial questionnaire: A pre-trial questionnaire, handed out before introducing our system, asked participants to describe their current literature retrieval procedure. Many participants indicated that they only have very limited time for literature searches and that it usually takes many iterations of modified queries to find what they need. Most of them indicated concerns about missing out on important results not retrieved by their current systems.

Post-trial questionnaire: After using the system to freely explore its behaviour by searching for results for various patients, participants were asked to provide feedback via post-trial questionnaires. The questions were of an open nature and explicitly asked for detailed answers. The feedback gathered was generally very positive. Most participants felt that they had to spend less time searching whilst being more confident that they had not missed important results. The most surprising comment, made by several participants, was that they wanted to use the system even without the case-based feature. This indicates that they found the system superior to their current literature retrieval tool and wanted to use it for general medical literature retrieval.

Discussion and Conclusions

The integration of this tool in the electronic patient record system will help medical practitioners to regularly review patients’ cases with respect to any new literature available. Furthermore, it will allow them to improve confidence in their diagnoses and treatment decisions, based on the knowledge that they are aware of recent progress in the field. The doctor can simply focus on the patient rather than on search engine peculiarities.

Future work might include a way for the researcher to specify the origin of publications as search preferences. Additional resources such as clinical trials could be added to the searchable collection.

Furthermore, the homonyms could be resolved through a medical ontology or thesaurus; however this activity was beyond the scope of this prototype.


We want to thank Alec Zwart from CSIRO CMIS for statistical advice, Andreas Lehrbaum for technical support, the participants in the evaluation for their time and efforts, and the ANU for funding the trip to Adelaide to conduct the evaluation and the trip to Perth to participate in AIH2011.



Not commissioned. Externally peer reviewed.


The authors declare that they have no competing interests


The Australian National University funded the trip to Adelaide to conduct the evaluation and to Perth to participate in AIH2011.


Australian National University Human Ethics Protocol Number 2009/438

Please cite this paper as: Krumpholz, A. Hawking, D. Jones, R. Gedeon, T. Greville, H. Automated medical literature retrieval. AMJ 2012, 5, 9, 489-496. http//


1. Greenes RA, editor. Clinical Decision Support: The Road Ahead. Burlington: Academic Press; 2006.
2. Fraser AG, Dunstan FD. On the impossibility of being expert. BMJ. 2010 Dec;341:c6815. doi: 10.1136/bmj.c6815.
3. Krumpholz AH, Hawking D, Gedeon T. Your personal, virtual librarian. In: Kreuzberger G, Lunzer A, Kaschek R, editors. Interdisciplinary Advances in Adaptive and Intelligent Assistant Systems: Concepts, Techniques, Applications and Use. Hershey: IGI Global; 2011.
4. US National Library of Medicine (National Institutes of Health). PubMed. [cited Feb 2012]. Available from:
5. Bell SC, Bye PTP, Cooper PJ, Martin AJ, McKay KO, Robinson PJ, et al. Cystic fibrosis in Australia, 2009: results from a data registry. Med J Aust. 2011;195(7):396–400.
6. Riordan JR, Rommens JM, Kerem B-S, Alon N, Rozmahel R, Grzelczak Z, et al. Identification of the cystic fibrosis gene: Cloning and characterization of complementary DNA. Science. 1989 Sep 8;245(4922):1066–1073.
7. UK Cystic Fibrosis Trust. Standards for the Clinical Care of Children and Adults with Cystic Fibrosis in the UK. 2011 Dec; [cited Apr 2012]. Available from: publications/consensusdoc/CF_Trust_Standards_of _Care_2011_(website_Apr_12).pdf.
8. Hawking D, Bailey P, Craswell N. Efficient and flexible search using text and metadata. Tech. rep., CSIRO Mathematical and Information Sciences; 2000. [cited Feb 2012]. Available from:
9. Robertson SE, Walker S, Hancock-Beaulieu M, Gatford M. Okapi at TREC-3. In: Proceedings of TREC-3. NIST special publication 500-225; Nov 1994. pp. 109–126.
10. Joho H, Azzopardi LA, Vanderbauwhede W. A survey of patent users: an analysis of tasks, behavior, search functionality and system requirements. In: Proceedings of the third symposium on Information interaction in context (IIiX ’10). New York, NY, USA: ACM; 2010. pp. 13–24.
11. Guyatt G, Cairns J, Churchill D, Cook D, Haynes B, Hirsh J, et al. Evidence-based medicine: A new approach to teaching the practice of medicine. JAMA. 1992 Nov;268(17):2420–2425.
12. Chen JX, Wang S. Data Visualization: Parallel coordinates and dimension reduction. Computing in Science & Engineering. 2001 Sep;3(5):110–113.
