|Home | About | Journals | Submit | Contact Us | Français|
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Summary: MiSearch is an adaptive biomedical literature search tool that ranks citations based on a statistical model for the likelihood that a user will choose to view them. Citation selections are automatically acquired during browsing and used to dynamically update a likelihood model that includes authorship, journal and PubMed indexing information. The user can optionally elect to include or exclude specific features and vary the importance of timeliness in the ranking.
Supplementary information: Supplementary data are available at Bioinformatics online.
With the rapidly increasing volume of publications in the biomedical literature, finding relevant work is an ever more difficult challenge. General solutions to the literature search problem are difficult because biomedical science is very diverse; the articles most relevant to one reader may not be relevant to another. Relevance feedback is a well-established technique to improve performance in information retrieval (Rocchio, 1971; Salton, 1971; Salton and Buckley, 1990). Feedback may be acquired explicitly by asking users to rate retrieval results. However, many users find this task burdensome. Even for widely deployed search engines such as Excite, where relevance feedback is available and effective, it is rarely used (Spink et al., 2000). An alternative is to acquire feedback implicitly by observing user behavior (Kelly and Teevan, 2003).
MiSearch is an adaptive literature search tool using implicit relevance feedback that helps users rapidly find PubMed citations relevant to their specific interests. MiSearch automatically saves information on citations a reader has viewed during search and browsing, and uses this information to build a statistical profile describing the readers’ choices. This profile is used to rank the results of future searches, placing those articles that this reader is most likely to view at the top of the list. In effect, MiSearch is using query expansion with probabilistic weighting of terms derived from the implicitly defined relevant document set. Using this implicit feedback approach is effective and improves the relevance ranking of bibliographic search results.
The NCBI Entrez search tool is widely used and alternative interfaces have been developed allowing users to manually vary the weight of different features in determining relevance (Muin et al., 2005) and to reformulate and refine Boolean queries (Bernstam, 2001; Ding et al., 2006), but unlike MiSearch, these tools do not adapt to user behavior.
MiSearch records the users search history and the history of documents selected for viewing using an HTTP redirect mechanism. Four domains are considered: authors (Au), journal (Jl), MeSH terms (Me) and substance names (Sn) indexed by NLM (Nelson et al., 2004). Each domain is described using a statistical profile of term use. The frequency fu(t) of term t occurring in citations that user, u, has selected for viewing is defined as
where Nu(t) is the count of citations indexed with term t that were viewed by the user,Nu is the total number of citations viewed by the user and fP(t) is the absolute frequency with which papers indexed with term t occur in the entire PubMed database. The pseudo counts smooth behavior when the profile has few citations and avoid division by zero if a specific term does not occur in the citations selected by a user. If no feedback is available for the user (Nu = 0), then fu(t) is fP(t). When the user has viewed many articles, fu(t) asymptotically approaches Nu(t/Nu).
MiSearch uses the PubMed eUtils interface to query the PubMed database and ranks citations based on a log likelihood score, S,
where SD are log likelihood scores for each domain and α(T − T0) is term weighting the timeliness of an article. T is the date of publication for a citation, T0 is a reference date (January 1, 2000) and α is an adjustable factor that allows the user to vary the weight given to timeliness in ranking citations.
The score SD for domain D is calculated for each citation as a log likelihood ratio that the term t associated with this citation occur in citations viewed by the user, fu, relative to their frequency in all of PubMed, fP
A positive score indicates that a citation is more likely to be viewed by the user, and a negative score indicates that a citation is less likely to be selected for viewing.
MiSearch is implemented in two components, a PHP script running on an Apache web server that generates forms, dispatches search requests to NCBI Entrez and communicates with a local citation datbase, and a relational database server that stores both the PubMed corpus and user search histories. Ranking is implemented as an SQL stored procedure. Users can label profiles with a string and can define several different profiles for different search tasks.
Figure 1 illustrates the effect of adaptive re-ranking of citation searches based on a profile created from the publication of one author (AWL). Dr Lee's research focuses on signal transduction downstream of the CSF-1 receptor (CSF1R). Using a profile based on viewing Dr Lee's publications (top panel), MiSearch ranks two of Dr Lee's publications and a third recent and highly relevant publication at the highest relevance in a PubMed search for ‘CSF1R’. In contrast, without adaptive ranking (lower panel), the publications are ranked in reverse chronological order with only one moderately relevant publication highly ranked and that citation is in a journal that Dr Lee does not frequently read.
Because MiSearch ranks citations using a statistical profile, the user does not need to explicitly specify the ranking criteria. MiSearch thus complements Boolean search strategies. In Boolean searches, a relevant article may be missed if the user specifies an overly restrictive Boolean filter and the citation uses a synonym for a term not specified in the search query. Using MiSearch, a broader Boolean query can be performed. The MiSearch relevance ranking places the citations most likely to be of interest to this user at the top of the list and avoids the need to view large numbers of citations. Further, a reader may not be aware that all the citations they are viewing contain a common term such as reference to a chemical substance. The MiSearch statistical profile will automatically capture this information and rank other citations, mentioning this term more highly.
Optionally, the user can request that MiSearch use the results of the query itself to construct the profile. This results in a ranking where the citations sharing features with the largest number of other citations in the result set are ranked highly. In this view, citations that are most central to the topic rank highly while citations peripheral to the topic rank lower on the list. For example, in a search about a gene, citations where the gene is the major focus of the paper will be at the top of the query profile ranking while citations that only mention the gene in passing will rank lower on the list. Query profile mode is invoked by using ‘query’ or ‘username query’ as the username.
To assess the effectiveness of implicit relevance feedback, we use a cross validation approach. A training profile is constructed by sampling from the citations selected as relevant for viewing by a user. The test set consists of the remaining citations selected by this user. Typical results are shown in Figure 2. Increasing the number of citations in the training set progressively improves the ranking of the test citations. In Supplementary Data, we compare the performance of MiSearch to relevance feedback using the Entrez ‘related articles’/feature. The improved results in cross validation demonstrate that users are consistent in the articles that they select for viewing and that these selections are an effective implicit source of relevance feedback yielding improved biomedical literature search performance.
Automated collection of implicit relevant feedback information gathered using a click-through mechanism improve bibliographic search performance. Users find this interface intuitive and easy to use. Relevance feedback is applied by simply rerunning a query periodically during normal browsing. We find that response time is a critical factor in user acceptance of a relevance feedback system. While more sophisticated algorithms for classification and ranking based on relevance feedback have been proposed, the likelihood ratios used in MiSearch are effective and easily implemented within an RDBMS. This avoids the need to move large data sets in and out of the database server and improves user response time.
We encountered a number of issues in implementing MiSearch. Optimizing the performance of the relevance feedback system to work with small numbers of events is important. In a typical biomedical literature search task, users often view fewer than a dozen articles.
Many author names are not unique. In the MiSearch formulation, such author names are not resolved, but are expected to occur with higher frequency in the reference corpus and thus provide less information in ranking articles.
Documents vary greatly in the number of authors, MeSH terms and substance names applied to them. It is thus necessary to rank articles based on variable number of terms in these domains. We attempt to avoid bias in the formulation of the scores and by using pseudo counts where zero term counts give zero scores.
The MiSearch ranking is necessarily dependent on the NLM indexing processing. We are developing ways to base retrieval on automatically scored name, substance and MeSH headings, so that we can process documents such as web pages or journal articles that are not indexed by NLM.
Response time is an issue, particularly with very large result sets. The major performance bottleneck is that the system needs to calculate usage frequencies for every term appearing in every document in the result set. This is done on the fly so that rankings reflect the user's most recent search and retrieval behavior, but the reference term frequencies are pre-computed for all of PubMed. This is a compromise. For the task of ranking documents a user is likely to select, the reference corpus would ideally be the collection of documents that the users decided not to view among the citations that their queries had retrieve from Entrez. Implementing this would, however, be computationally intensive.
We thank Dr Angel W. Lee for her advice and comments during the course of preparing this manuscript. This work was supported in part by grants R01 LM008106 and U54 DA021519 from NIH.
Conflict of Interest: none declared.