Implicit relevance feedback collected automatically through a click-through mechanism improves bibliographic search performance. Users find this interface intuitive and easy to use. Relevance feedback is applied by simply rerunning a query periodically during normal browsing. We find that response time is a critical factor in user acceptance of a relevance feedback system. While more sophisticated algorithms for classification and ranking based on relevance feedback have been proposed, the likelihood ratios used in MiSearch are effective and easily implemented within an RDBMS. This avoids the need to move large data sets in and out of the database server and improves user response time.
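The idea of computing likelihood-ratio rankings inside the database can be sketched as follows. This is a minimal illustration using SQLite, not the MiSearch schema: the table names, the example terms, and the registered `ln` function are all assumptions, but the shape of the query shows how per-document scores can be aggregated server-side without exporting the data.

```python
import sqlite3
from math import log

# Illustrative schema (assumed, not MiSearch's actual tables):
# terms attached to each document, one user's click counts per term,
# and reference term frequencies precomputed over the whole corpus.
conn = sqlite3.connect(":memory:")
conn.create_function("ln", 1, log)  # expose natural log to SQL
conn.executescript("""
CREATE TABLE doc_terms(doc_id INTEGER, term TEXT);
CREATE TABLE user_counts(term TEXT PRIMARY KEY, clicks INTEGER);
CREATE TABLE ref_freq(term TEXT PRIMARY KEY, freq REAL);
""")
conn.executemany("INSERT INTO doc_terms VALUES (?, ?)",
                 [(1, "apoptosis"), (1, "p53"), (2, "zebrafish")])
conn.executemany("INSERT INTO user_counts VALUES (?, ?)",
                 [("apoptosis", 3), ("p53", 1)])
conn.executemany("INSERT INTO ref_freq VALUES (?, ?)",
                 [("apoptosis", 0.01), ("p53", 0.02), ("zebrafish", 0.005)])

user_total = 4  # total clicks recorded for this user

# Score each document by summing log likelihood ratios of its terms:
# (user click frequency of term) / (reference frequency of term).
rows = conn.execute("""
    SELECT d.doc_id,
           SUM(ln((1.0 * u.clicks / ?) / r.freq)) AS score
    FROM doc_terms d
    JOIN user_counts u ON u.term = d.term
    JOIN ref_freq r ON r.term = d.term
    GROUP BY d.doc_id
    ORDER BY score DESC
""", (user_total,)).fetchall()
```

Because the join keeps only terms the user has clicked, document 2 (whose terms the user never selected) simply drops out of the ranking; the whole computation stays inside the database engine.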
We encountered a number of issues in implementing MiSearch. Optimizing the performance of the relevance feedback system to work with small numbers of events is important. In a typical biomedical literature search task, users often view fewer than a dozen articles.
Many author names are not unique. In the MiSearch formulation, such author names are not resolved, but are expected to occur with higher frequency in the reference corpus and thus provide less information in ranking articles.
Documents vary greatly in the number of authors, MeSH terms and substance names applied to them. It is thus necessary to rank articles based on a variable number of terms in these domains. We attempt to avoid bias in the formulation of the scores by using pseudo-counts, so that zero term counts give zero scores.
The MiSearch ranking is necessarily dependent on the NLM indexing process. We are developing ways to base retrieval on automatically scored name, substance and MeSH headings, so that we can process documents such as web pages or journal articles that are not indexed by NLM.
Response time is an issue, particularly with very large result sets. The major performance bottleneck is that the system needs to calculate usage frequencies for every term appearing in every document in the result set. This is done on the fly so that rankings reflect the user's most recent search and retrieval behavior, but the reference term frequencies are pre-computed for all of PubMed. This is a compromise. For the task of ranking documents a user is likely to select, the reference corpus would ideally be the collection of documents that the user decided not to view among the citations that their queries had retrieved from Entrez. Implementing this would, however, be computationally intensive.
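The cost of the ideal reference corpus can be made concrete with a small sketch. Here the per-user reference distribution would be recomputed from the retrieved-but-not-viewed citations on every query, whereas the compromise taken above is a single corpus-wide table computed once; the function and data layout are illustrative assumptions.

```python
from collections import Counter

def ideal_ref_freqs(unviewed_docs):
    """Reference term frequencies over the citations a user retrieved
    but chose not to view (the ideal reference corpus described in the
    text). Each document is given as a list of its indexed terms.
    Recomputing this per user, per query, is the computationally
    expensive alternative to a single pre-computed PubMed-wide table.
    """
    counts = Counter(term for doc in unviewed_docs for term in doc)
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()}
```

Even this toy version must touch every term of every unviewed citation; at PubMed result-set scales that scan dominates response time, which is why the pre-computed global frequencies are used instead.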