askMEDLINE uses a multi-round search strategy. In the first round, the parser ignores punctuation marks and deletes words found on a "stop-word" list. The stop-word list includes PubMed stop words and other words that we found, by experience, to be detrimental to the search. The parser, a PHP script, then sends the modified query to PubMed through Entrez E-Utilities. The Extensible Markup Language (XML) file returned by E-Utilities indicates the category of each term in the query. Terms marked as "All Fields" are neither Medical Subject Headings (MeSH) terms nor MeSH Subheadings. These terms are checked against a "MeSH Backup vocabulary." The backup vocabulary includes words other than MeSH terms, such as MeSH descriptors, that are classified as "other eligible entries." If an "All Fields" word is in the backup vocabulary, it remains in the query; if it is not, it is deleted. The remaining terms are sent back to PubMed, again through E-Utilities. Human and English language limits are always applied. If the retrieval count after the first round is between 1 and 50,000 journal articles, the first 20 results are displayed in the user's browser and the search process terminates. Any further searching is left to the user.
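The first-round filtering described above can be sketched roughly as follows. This is a minimal illustration in Python, not the original PHP parser: the stop-word list, the backup vocabulary, and all function names are hypothetical, and the (term, category) pairs are assumed to have already been extracted from the E-Utilities XML. Joining the terms with AND and expressing the limits as `humans[MeSH Terms]` and `english[Language]` are also assumptions about how the final query might be formed.

```python
import string
from urllib.parse import urlencode

# Hypothetical lists, for illustration only; the real lists are much larger.
STOP_WORDS = {"a", "the", "is", "in", "of", "for", "with", "does"}
BACKUP_VOCABULARY = {"stroke"}

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Map every punctuation character to a space so that it is ignored.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def remove_stop_words(question, stop_words=STOP_WORDS):
    """Ignore punctuation, then delete words found on the stop-word list."""
    words = question.translate(_PUNCT_TO_SPACE).lower().split()
    return [w for w in words if w not in stop_words]


def round_one_terms(categorized_terms, backup_vocabulary=BACKUP_VOCABULARY):
    """Keep MeSH terms and subheadings; keep an "All Fields" term only if
    it appears in the backup vocabulary."""
    return [term for term, field in categorized_terms
            if field != "All Fields" or term in backup_vocabulary]


def build_esearch_url(terms, retmax=20):
    """Build an ESearch URL with the human and English limits appended
    (assumed here to be plain AND-ed field-tagged terms)."""
    query = " AND ".join(list(terms) + ["humans[MeSH Terms]", "english[Language]"])
    return ESEARCH_URL + "?" + urlencode({"db": "pubmed", "term": query, "retmax": retmax})


words = remove_stop_words("Does aspirin prevent stroke in the elderly?")
# Categories as they might be reported by E-Utilities (made up for this example).
categorized = [("aspirin", "MeSH Terms"), ("prevent", "All Fields"),
               ("stroke", "All Fields"), ("elderly", "MeSH Terms")]
kept = round_one_terms(categorized)
url = build_esearch_url(kept)
```

In this example "prevent" is dropped because it is an "All Fields" word absent from the hypothetical backup vocabulary, while "stroke" is kept because it is present.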
The search may proceed to Round 2 under two conditions: 1) If no journal articles are found in the first round, a result that could signify that the search was too narrow (i.e., too many terms or too many filters), the "All Fields" words are deleted from the query, even those found in the backup vocabulary. Only MeSH terms and Subheadings remain (Round 2A). 2) If the first-round retrieval count is larger than 50,000 articles (an indication that the search was too broad), the "All Fields" words removed during the first round (words not found in the backup vocabulary) are put back into the query (Round 2B). Round 2B searches therefore contain all the MeSH terms (or MeSH Subheadings) and "All Fields" words in the original question. The updated query from either 2A or 2B is once again sent to Entrez E-Utilities, and the retrieved journal articles are sent to the user.
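The choice between Round 2A and Round 2B can be expressed as a small decision function. The sketch below is illustrative only: the function name and the three-way split of terms are hypothetical, and it does not preserve the original word order of the question.

```python
MAX_COUNT = 50_000  # upper bound on an acceptable retrieval count


def round_two_terms(first_round_count, mesh_terms, backup_terms, removed_terms):
    """Return the Round 2 term list, or None if Round 1 already succeeded.

    mesh_terms    -- MeSH terms and subheadings
    backup_terms  -- "All Fields" words found in the backup vocabulary
    removed_terms -- "All Fields" words deleted during Round 1
    """
    if first_round_count == 0:
        # Round 2A: search too narrow, so keep only MeSH terms and subheadings.
        return list(mesh_terms)
    if first_round_count > MAX_COUNT:
        # Round 2B: search too broad, so restore the words removed in Round 1.
        return list(mesh_terms) + list(backup_terms) + list(removed_terms)
    # 1 to 50,000 results: the search terminated after Round 1.
    return None


round_2a = round_two_terms(0, ["aspirin"], ["stroke"], ["prevent"])
round_2b = round_two_terms(60_000, ["aspirin"], ["stroke"], ["prevent"])
no_round_2 = round_two_terms(120, ["aspirin"], ["stroke"], ["prevent"])
```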
Similarly, if the count returned from the second round is in the range of 1 to 50,000, the search process terminates. If the second-round count is still 0 (denoting that the search is still too narrow), the query is checked against another list of "No-Go Terms," terms whose removal could result in a successful search. Common MeSH abbreviations, acronyms, and words like "method," "affect," and "lead" are examples of terms on the list. New terms are continuously added to this list as they are encountered. The modified third-round query is once again sent to E-Utilities, and the retrieved journal articles are sent to the user. A result of 1 to 50,000 citations terminates the process and displays the first 20 articles.
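A minimal sketch of the third-round step, assuming the query is held as a list of terms. Only the three example words named in the text are included in the list here; the function names and the case-insensitive comparison are assumptions made for illustration.

```python
# Example entries taken from the text; the real list is longer and keeps growing.
NO_GO_TERMS = {"method", "affect", "lead"}


def is_terminal(count, max_count=50_000):
    """A count from 1 to 50,000 ends the search at any round."""
    return 1 <= count <= max_count


def round_three_terms(terms, no_go_terms=NO_GO_TERMS):
    """Drop any term on the No-Go list (compared case-insensitively here)."""
    return [t for t in terms if t.lower() not in no_go_terms]


second_round_count = 0
terms = ["lead", "poisoning", "children"]
if not is_terminal(second_round_count):
    terms = round_three_terms(terms)
```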
If askMEDLINE retrieves only one to four journal articles, a search for related articles is automatically run on the top two articles. All of the original articles (one to four) and the first 25 related articles of the top two are retrieved. As in the previous steps, the first 20 are displayed in the browser. On every search results page, a link allows the user to intervene manually and modify the search through the PICO interface. Links to related articles, full-text articles, and abstracts are also shown.
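The related-articles fallback might look like the sketch below. Several details are assumptions rather than facts from the text: `fetch_related` is a hypothetical callable standing in for a related-articles lookup, "the first 25 related articles" is read as 25 per seed article, and duplicates are skipped when the lists are merged.

```python
def expand_sparse_results(article_ids, fetch_related, per_seed=25):
    """If only one to four articles were found, append related articles
    of the top two; otherwise return the results unchanged."""
    if not 1 <= len(article_ids) <= 4:
        return list(article_ids)
    pool = list(article_ids)
    for seed in article_ids[:2]:
        for related in fetch_related(seed)[:per_seed]:
            if related not in pool:  # de-duplication is an assumption
                pool.append(related)
    return pool


def fake_related(article_id):
    """Stand-in for a real related-articles lookup; returns made-up IDs."""
    return [f"{article_id}-rel{i}" for i in range(30)]


pool = expand_sparse_results(["A", "B", "C"], fake_related)
displayed = pool[:20]  # the first 20 are shown in the browser
```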
Since November 2002, the British Medical Journal (BMJ) has published a POEM (Patient-Oriented Evidence that Matters) in every issue. [3] POEMs are provided to BMJ by InfoRetriever (http://www.infopoems.com). askMEDLINE was evaluated by its accuracy in retrieving an article cited as a reference in a POEM (the "gold standard"). [3] Every POEM has a question with a cited reference that is relevant to the question. We entered every POEM question into askMEDLINE and, for comparison, into Entrez, the integrated, text-based search and retrieval tool for PubMed. New critically appraised topics (CATs) from the University of Michigan Department of Pediatrics Evidence-Based Pediatrics Web site were also used. [4] Unlike BMJ POEMs, some questions in CATs had more than one cited reference.
The initial search result was examined to determine if the reference cited in a POEM or CAT was among those retrieved. Subsequent steps were taken if it was not: 1) If the initial search retrieved journal citations, but not the specific articles cited in a POEM or CAT, the titles and abstracts were scanned to find out if they were relevant (deemed to answer the question). If they were, related articles were retrieved and again evaluated to determine if they matched the cited reference. 2) If no journal articles were retrieved, the question was rephrased, then searched again. Retrievals were again examined for the cited articles and for relevance to the clinical question. Overall efficiency was determined by the accuracy in retrieving a cited article and, for retrievals that did not match the cited references, by the relevance of the citations retrieved.