Related Articles

1.  Willow: a uniform search interface. 
The objective of the Willow Project is to develop a uniform search interface that allows a diverse community of users to retrieve information from heterogeneous network-based information resources. Willow separates the user interface from the database management or information retrieval system. It provides a graphic user interface to a variety of information resources residing on diverse hosts, and using different search engines and idiomatic query languages through network-based client-server and Transmission Control Protocol/Internet Protocol (TCP/IP) protocols. It is based on a "database driver" model, which allows new database hosts to be added without altering Willow itself. Willow employs a multimedia extension mechanism to launch external viewers to handle data in almost any form. Drivers are currently available for a local BRS/SEARCH system and the Z39.50 protocol. Students, faculty, clinicians, and researchers at the University of Washington are currently offered 30 local and remote databases via Willow. They conduct more than 250,000 sessions a month in libraries, medical centers and clinics, laboratories, and offices, and from home. The Massachusetts Institute of Technology is implementing Willow as its uniform search interface to Z39.50 hosts.
PMCID: PMC116285  PMID: 8750388
2.  CDAPubMed: a browser extension to retrieve EHR-based biomedical literature 
Background
Over the last few decades, the ever-increasing output of scientific publications has led to new challenges to keep up to date with the literature. In the biomedical area, this growth has introduced new requirements for professionals, e.g., physicians, who have to locate the exact papers that they need for their clinical and research work amongst a huge number of publications. Against this backdrop, novel information retrieval methods are even more necessary. While web search engines are widespread in many areas, facilitating access to all kinds of information, additional tools are required to automatically link information retrieved from these engines to specific biomedical applications. In the case of clinical environments, this also means considering aspects such as patient data security and confidentiality or structured contents, e.g., electronic health records (EHRs). In this scenario, we have developed a new tool to facilitate query building to retrieve scientific literature related to EHRs.
Results
We have developed CDAPubMed, an open-source web browser extension to integrate EHR features in biomedical literature retrieval approaches. Clinical users can use CDAPubMed to: (i) load patient clinical documents, i.e., EHRs based on the Health Level 7-Clinical Document Architecture Standard (HL7-CDA), (ii) identify relevant terms for scientific literature search in these documents, i.e., Medical Subject Headings (MeSH), automatically driven by the CDAPubMed configuration, which advanced users can optimize to adapt to each specific situation, and (iii) generate and launch literature search queries to a major search engine, i.e., PubMed, to retrieve citations related to the EHR under examination.
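Step (iii) above, turning a set of MeSH terms into a focused PubMed query, can be illustrated with a minimal sketch. This is not CDAPubMed's actual code: the function name `build_pubmed_query` and the example terms are hypothetical, and the sketch simply assumes that terms are combined conjunctively using PubMed's `[MeSH Terms]` field tag.

```python
def build_pubmed_query(mesh_terms):
    """Combine MeSH terms into one conjunctive PubMed query string.

    Hypothetical helper for illustration; each term is quoted and
    restricted to the MeSH field, then all terms are joined with AND.
    """
    # Drop empty entries and surrounding whitespace before quoting.
    cleaned = [t.strip() for t in mesh_terms if t and t.strip()]
    return " AND ".join(f'"{t}"[MeSH Terms]' for t in cleaned)


# Example terms are assumed, standing in for features found in an EHR.
query = build_pubmed_query(["Breast Neoplasms", "Female", "Middle Aged"])
print(query)
```

Each added term narrows the result set, which is consistent with the reduction in retrieved citations that the authors report when patient features are added to a query.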
Conclusions
CDAPubMed is a platform-independent tool designed to facilitate literature searching using keywords contained in specific EHRs. CDAPubMed is visually integrated, as an extension of a widespread web browser, within the standard PubMed interface. It has been tested on a public dataset of HL7-CDA documents, returning significantly fewer citations since queries are focused on characteristics identified within the EHR. For instance, compared with more than 200,000 citations retrieved by breast neoplasm, fewer than ten citations were retrieved when ten patient features were added using CDAPubMed. This is an open source tool that can be freely used for non-profit purposes and integrated with other existing systems.
doi:10.1186/1472-6947-12-29
PMCID: PMC3366875  PMID: 22480327
3.  Understanding PubMed® user search behavior through log analysis 
This article reports on a detailed investigation of PubMed users’ needs and behavior as a step toward improving biomedical information retrieval. PubMed is providing free service to researchers with access to more than 19 million citations for biomedical articles from MEDLINE and life science journals. It is accessed by millions of users each day. Efficient search tools are crucial for biomedical researchers to keep abreast of the biomedical literature relating to their own research. This study provides insight into PubMed users’ needs and their behavior. This investigation was conducted through the analysis of one month of log data, consisting of more than 23 million user sessions and more than 58 million user queries. Multiple aspects of users’ interactions with PubMed are characterized in detail with evidence from these logs. Despite having many features in common with general Web searches, biomedical information searches have unique characteristics that are made evident in this study. PubMed users are more persistent in seeking information and they reformulate queries often. The three most frequent types of search are search by author name, search by gene/protein, and search by disease. Use of abbreviation in queries is very frequent. Factors such as result set size influence users’ decisions. Analysis of characteristics such as these plays a critical role in identifying users’ information needs and their search habits. In turn, such an analysis also provides useful insight for improving biomedical information retrieval.
Database URL: http://www.ncbi.nlm.nih.gov/PubMed
doi:10.1093/database/bap018
PMCID: PMC2797455  PMID: 20157491
4.  Electronic Biomedical Literature Search for Budding Researcher 
Searching for specific, well-defined literature related to the subject of interest is the foremost step in research. Only when we are familiar with the topic can we frame an appropriate research question, which is the basis for the study objectives and hypothesis. The Internet provides quick access to an overabundance of medical literature, in the form of primary, secondary and tertiary literature. It is accessible through journals, databases, dictionaries, textbooks, indexes, and e-journals, thereby allowing access to more varied, individualised, and systematic educational opportunities. A web search engine is a tool designed to search for information on the World Wide Web, which may be in the form of web pages, images, information, and other types of files. Search engines for internet-based searches of medical literature include Google, Google Scholar, Scirus, Yahoo, etc., and databases include MEDLINE, PubMed, MEDLARS, etc. Several web libraries (National Library of Medicine, Cochrane, Web of Science, Medical Matrix, Emory Libraries) have been developed as meta-sites, providing useful links to health resources globally. A researcher must keep in mind the strengths and limitations of a particular search engine or database while searching for a particular type of data. Knowledge of the types of literature, the levels of evidence, and the features of each search engine (availability, user interface, ease of access, reputable content, and period of time covered) allows their optimal use and maximal utility in the field of medicine. Literature search is a dynamic and interactive process; there is no one way to conduct a search and there are many variables involved. A systematic literature search that uses the available electronic resources effectively is more likely to produce quality research.
doi:10.7860/JCDR/2013/6348.3399
PMCID: PMC3809676  PMID: 24179937
Research; Steps in literature search; Search engine; Literature review; Level of evidence
5.  BioModels.net Web Services, a free and integrated toolkit for computational modelling software 
Briefings in Bioinformatics  2009;11(3):270-277.
Exchanging and sharing scientific results are essential for researchers in the field of computational modelling. BioModels.net defines agreed-upon standards for model curation. A fundamental one, MIRIAM (Minimum Information Requested in the Annotation of Models), standardises the annotation and curation process of quantitative models in biology. To support this standard, MIRIAM Resources maintains a set of standard data types for annotating models, and provides services for manipulating these annotations. Furthermore, BioModels.net creates controlled vocabularies, such as SBO (Systems Biology Ontology), which strictly indexes, defines and links terms used in Systems Biology. Finally, BioModels Database provides a free, centralised, publicly accessible database for storing, searching and retrieving curated and annotated computational models. Each resource provides a web interface to submit, search, retrieve and display its data. In addition, the BioModels.net team provides a set of Web Services which allows the community to programmatically access the resources. A user is then able to perform remote queries, such as retrieving a model and resolving all its MIRIAM Annotations, as well as getting the details about the associated SBO terms. These web services use established standards. Communications rely on SOAP (Simple Object Access Protocol) messages and the available queries are described in a WSDL (Web Services Description Language) file. Several libraries are provided in order to simplify the development of client software. BioModels.net Web Services take researchers one step closer to simulating and understanding the entirety of a biological system, by allowing them to retrieve biological models in their own tools, combine queries in workflows and efficiently analyse models.
doi:10.1093/bib/bbp056
PMCID: PMC2913671  PMID: 19939940
BioModels.net; Systems Biology; modelling; Web Services; annotation; ontology
6.  Statins in the Treatment of Chronic Heart Failure: A Systematic Review 
PLoS Medicine  2006;3(8):e333.
Background
The efficacy of statin therapy in patients with established chronic heart failure (CHF) is a subject of much debate.
Methods and Findings
We conducted three systematic literature searches to assess the evidence supporting the prescription of statins in CHF. First, we investigated the participation of CHF patients in randomized placebo-controlled clinical trials designed to evaluate the efficacy of statins in reducing major cardiovascular events and mortality. Second, we assessed the association between serum cholesterol and outcome in CHF. Finally, we evaluated the ability of statin treatment to modify surrogate endpoint parameters in CHF.
Using validated search strategies, we systematically searched PubMed for our three queries. In addition, we searched the reference lists from eligible studies, used the “see related articles” feature for key publications in PubMed, consulted the Cochrane Library, and searched the ISI Web of Knowledge for papers citing key publications.
Search 1 resulted in the retrieval of 47 placebo-controlled clinical statin trials involving more than 100,000 patients. CHF patients had, however, been systematically excluded from these trials. Search 2 resulted in the retrieval of eight studies assessing the relationship between cholesterol levels and outcome in CHF patients. Lower serum cholesterol was consistently associated with increased mortality. Search 3 resulted in the retrieval of 18 studies on the efficacy of statin treatment in CHF. On the whole, these studies reported favorable outcomes for almost all surrogate endpoints.
Conclusions
Since CHF patients have been systematically excluded from randomized, controlled clinical cholesterol-lowering trials, the effect of statin therapy in these patients remains to be established. Currently, two large, randomized, placebo-controlled statin trials are under way to evaluate the efficacy of statin treatment in terms of reducing clinical endpoints in CHF patients in particular.
A systematic review found that patients with heart failure have been excluded from randomised controlled trials on the use of statins. Evidence from other studies on the effectiveness of statins for patients with heart failure is weak and conflicting.
Editors' Summary
Background.
When medical researchers test a drug—or some other treatment—for a particular medical condition, they often decide not to include in their study anyone who has, in addition to the disease they are interested in, certain other health problems. This is because including patients with two or more conditions can complicate the analysis of the results and make it hard to reach firm conclusions. However, excluding patients in this way can result in uncertainty as to whether treatments are effective for anyone who suffers from the disease in question, or just for people like those who took part in the research.
A great deal of research has been conducted with drugs known as statins, which lower cholesterol levels in the blood. (A raised level of cholesterol is known to be a major risk factor for cardiovascular disease, which causes heart attacks and strokes.) As a result of this research, statins have been accepted as effective and safe. They are now, in consequence, among the most commonly prescribed medicines. Heart failure, however, is not the same thing as a heart attack. It is the name given to the condition where the muscles of the heart have become weakened, most often as a result of aging, and the heart becomes gradually less efficient at pumping blood around the body. (Some people with heart failure live for many years, but 70% of those with the condition die within ten years.) It is common for people with cardiovascular disease also to have heart failure. Nevertheless, some researchers who have studied the effects of statins have made the decision not to include in their studies any patients with cardiovascular disease who, in addition, have heart failure.
Why Was This Study Done?
The researchers in this study were aware that patients with heart failure have often been excluded from statin trials. They felt it was important to assess the available evidence supporting the prescription of statins for such patients. Specifically, they wanted to find out the following: how often have patients with heart failure been included in statin trials, what evidence is available as to whether it is beneficial for patients with heart failure to have low cholesterol, and what evidence is there that prescribing statins helps these patients?
What Did the Researchers Do and Find?
They did not do any new work involving patients. Instead, they did a very thorough search for all relevant studies of good quality that had already been published and they reviewed the results. “Randomized clinical trials” (RCTs) are the most reliable type of medical research. The researchers found there had been 47 such trials (involving over 100,000 patients) on the use of statins for treating cardiovascular disease, but all these trials had excluded heart failure patients. They found eight studies (which were not RCTs) looking at cholesterol levels and heart failure. These studies found, perhaps surprisingly, that death rates were higher in those patients with heart failure who had low cholesterol. However, they also found 18 studies (again not RCTs) on the use of statins in patients with heart failure. These 18 studies seemed to suggest that statins were of benefit to the patients who received them.
What Do These Findings Mean?
The evidence for or against prescribing statins for people with heart failure is limited, conflicting, and unclear. Further research involving RCTs is necessary. (Two such trials are known to be in progress.)
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0030333.
General information about statins is available from the Web site of Patient UK
The American Heart Association Web site is a good source of information about all types of heart disease, including heart attacks and heart failure
For a definition of randomized controlled trials see Wikipedia, a free online encyclopedia that anyone can edit
More detailed information about the quality of evidence from medical research may be found in the James Lind Library
doi:10.1371/journal.pmed.0030333
PMCID: PMC1551909  PMID: 16933967
7.  A Study on Pubmed Search Tag Usage Pattern: Association Rule Mining of a Full-day Pubmed Query Log 
Background
The practice of evidence-based medicine requires efficient biomedical literature search such as PubMed/MEDLINE. Retrieval performance relies highly on the efficient use of search field tags. The purpose of this study was to analyze PubMed log data in order to understand the usage pattern of search tags by the end user in PubMed/MEDLINE search.
Methods
A PubMed query log file was obtained from the National Library of Medicine containing anonymous user identification, timestamp, and query text. Inconsistent records were removed from the dataset and the search tags were extracted from the query texts. A total of 2,917,159 queries were selected for this study issued by a total of 613,061 users. The analysis of frequent co-occurrences and usage patterns of the search tags was conducted using an association mining algorithm.
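The tag-extraction step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes search tags appear in square brackets after a term (as in `asthma[mh]`), and the names `extract_tags` and `tag_usage` as well as the sample queries are hypothetical.

```python
import re
from collections import Counter

# Assumption: a search tag is any bracketed suffix such as [au] or [mh].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_tags(query):
    """Return the lower-cased search tags found in one query string."""
    return [m.strip().lower() for m in TAG_PATTERN.findall(query)]


def tag_usage(queries):
    """Summarize how often tags are used across a list of queries."""
    per_query = [extract_tags(q) for q in queries]
    n = len(queries)
    tagged = sum(1 for tags in per_query if tags)
    multi = sum(1 for tags in per_query if len(tags) >= 2)
    return {
        "tagged_share": tagged / n if n else 0.0,
        "multi_tag_share": multi / n if n else 0.0,
        "tag_counts": Counter(t for tags in per_query for t in tags),
    }


# Made-up sample queries, not taken from the study's log file.
sample = ["smith j[au]", "asthma[mh] AND 2008[dp]", "heart failure statins", "p53"]
print(tag_usage(sample))
```

The per-query tag lists produced by `extract_tags` are the kind of input an association mining algorithm would consume; the mining step itself is not shown here.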
Results
The percentage of search tag usage was low (11.38% of the total queries) and only 2.95% of queries contained two or more tags. Three out of four users used no search tag and about two-thirds of them issued fewer than four queries. Among the queries containing at least one tagged search term, the average number of search tags was almost half of the number of total search terms. Navigational search tags are more frequently used than informational search tags. While no strong association was observed between informational and navigational tags, six (out of 19) informational tags and six (out of 29) navigational tags showed strong associations in PubMed searches.
Conclusions
The low percentage of search tag usage implies that PubMed/MEDLINE users do not utilize the features of PubMed/MEDLINE widely, are not aware of such features, or depend solely on the high-recall-focused query translation performed by PubMed’s Automatic Term Mapping. Users need further education and interactive search applications to use search tags effectively and fulfill their biomedical information needs from PubMed/MEDLINE.
doi:10.1186/1472-6947-13-8
PMCID: PMC3552776  PMID: 23302604
8.  A Search Engine to Access PubMed Monolingual Subsets: Proof of Concept and Evaluation in French 
Background
PubMed contains numerous articles in languages other than English. However, existing solutions to access these articles in the language in which they were written remain unconvincing.
Objective
The aim of this study was to propose a practical search engine, called Multilingual PubMed, which will permit access to a PubMed subset in 1 language and to evaluate the precision and coverage for the French version (Multilingual PubMed-French).
Methods
To create this tool, translations of MeSH were enriched (eg, adding synonyms and translations in French) and integrated into a terminology portal. PubMed subsets in several European languages were also added to our database using a dedicated parser. The response time for the generic semantic search engine was evaluated for simple queries. BabelMeSH, Multilingual PubMed-French, and 3 different PubMed strategies were compared by searching for literature in French. Precision and coverage were measured for 20 randomly selected queries. The results were evaluated as relevant to title and abstract, the evaluator being blind to search strategy.
Results
More than 650,000 PubMed citations in French were integrated into the Multilingual PubMed-French information system. The response times were all below the threshold defined for usability (2 seconds). Two search strategies (Multilingual PubMed-French and 1 PubMed strategy) showed high precision (0.93 and 0.97, respectively), but coverage was 4 times higher for Multilingual PubMed-French.
Conclusions
It is now possible to freely access biomedical literature using a practical search tool in French. This tool will be of particular interest for health professionals and other end users who do not read or query sufficiently in English. The information system is theoretically well suited to expand the approach to other European languages, such as German, Spanish, Norwegian, and Portuguese.
doi:10.2196/jmir.3836
PMCID: PMC4275477  PMID: 25448528
databases, bibliographic; French language; information storage and retrieval; PubMed; user-computer interface; search engine
9.  BLAST+: architecture and applications 
BMC Bioinformatics  2009;10:421.
Background
Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications.
Results
We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site.
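The idea of breaking a long query into chunks can be sketched as below. This is not the BLAST+ implementation: the function name `split_query`, the chunk size, and the use of a fixed overlap between neighbouring chunks are all assumptions made for illustration, since the abstract does not state how chunk boundaries are chosen.

```python
def split_query(sequence, chunk_size, overlap):
    """Split a sequence into (offset, chunk) pairs of at most chunk_size.

    Hypothetical sketch: neighbouring chunks share `overlap` residues so
    that a match spanning a boundary is still fully contained in a chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not sequence:
        return []
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, sequence[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the sequence.
        if start + chunk_size >= len(sequence):
            break
        start += step
    return chunks


# Toy sequence and parameters, chosen only to keep the output short.
for offset, chunk in split_query("ACGTACGTAC", chunk_size=4, overlap=1):
    print(offset, chunk)
```

Keeping the offset with each chunk lets hit coordinates be mapped back to positions in the original query.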
Conclusion
The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.
doi:10.1186/1471-2105-10-421
PMCID: PMC2803857  PMID: 20003500
10.  SLIM: an alternative Web interface for MEDLINE/PubMed searches – a preliminary study 
Background
With the rapid growth of medical information and the pervasiveness of the Internet, online search and retrieval systems have become indispensable tools in medicine. The progress of Web technologies can provide expert searching capabilities to non-expert information seekers. The objective of the project is to create an alternative search interface for MEDLINE/PubMed searches using JavaScript slider bars. SLIM, or Slider Interface for MEDLINE/PubMed searches, was developed with PHP and JavaScript. Interactive slider bars in the search form controlled search parameters such as limits, filters and MeSH terminologies. Connections to PubMed were done using the Entrez Programming Utilities (E-Utilities). Custom scripts were created to mimic the automatic term mapping process of Entrez. Page generation times for both local and remote connections were recorded.
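A connection to PubMed through the E-Utilities, as described above, amounts to requesting a URL with the search parameters encoded in it. The sketch below is written in Python rather than the PHP and JavaScript used by SLIM, and it is not SLIM's code: the function name `build_esearch_url` and the example values are hypothetical. It only constructs an ESearch request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Base address of the ESearch utility in the NCBI E-Utilities.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=20, mindate=None, maxdate=None):
    """Build an ESearch URL for PubMed from a term and optional limits.

    Hypothetical helper: a slider in a search form could feed values
    such as retmax or a publication date range into these parameters.
    """
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    if mindate and maxdate:
        # Restrict results by publication date when both bounds are given.
        params.update({"datetype": "pdat", "mindate": mindate, "maxdate": maxdate})
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Example values are assumed for illustration.
print(build_esearch_url("asthma[mh] AND child", retmax=5,
                        mindate="2000", maxdate="2005"))
```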
Results
Alpha testing by developers showed SLIM to be functionally stable. Page generation times to simulate loading times were recorded the first week of alpha and beta testing. Average page generation times for the index page, previews and searches were 2.94 milliseconds, 0.63 seconds and 3.84 seconds, respectively. Eighteen physicians from the US, Australia and the Philippines participated in the beta testing and provided feedback through an online survey. Most users found the search interface user-friendly and easy to use. Information on MeSH terms and the ability to instantly hide and display abstracts were identified as distinctive features.
Conclusion
SLIM can be an interactive time-saving tool for online medical literature research that improves user control and capability to instantly refine and refocus search strategies. With continued development and by integrating search limits, methodology filters, MeSH terms and levels of evidence, SLIM may be useful in the practice of evidence-based medicine.
doi:10.1186/1472-6947-5-37
PMCID: PMC1318459  PMID: 16321145
11.  A Day in the Life of PubMed: Analysis of a Typical Day’s Query Log 
Objective
To characterize PubMed usage over a typical day and compare it to previous studies of user behavior on Web search engines.
Design
We performed a lexical and semantic analysis of 2,689,166 queries issued on PubMed over 24 consecutive hours on a typical day.
Measurements
We measured the number of queries, number of distinct users, queries per user, terms per query, common terms, Boolean operator use, common phrases, result set size, MeSH categories, used semantic measurements to group queries into sessions, and studied the addition and removal of terms from consecutive queries to gauge search strategies.
Results
The size of the result sets from a sample of queries showed a bimodal distribution, with peaks at approximately 3 and 100 results, suggesting that a large group of queries was tightly focused and another was broad. Like Web search engine sessions, most PubMed sessions consisted of a single query. However, PubMed queries contained more terms.
Conclusion
PubMed’s usage profile should be considered when educating users, building user interfaces, and developing future biomedical information retrieval systems.
doi:10.1197/jamia.M2191
PMCID: PMC2213463  PMID: 17213501
12.  How Complementary and Alternative Medicine Practitioners Use PubMed 
Background
PubMed is the largest bibliographic index in the life sciences. It is freely available online and is used by professionals and the public to learn more about medical research. While primarily intended to serve researchers, PubMed provides an array of tools and services that can help a wider readership in the location, comprehension, evaluation, and utilization of medical research.
Objective
This study sought to establish the potential contributions made by a range of PubMed tools and services to the use of the database by complementary and alternative medicine practitioners.
Methods
In this study, 10 chiropractors, 7 registered massage therapists, and a homeopath (N = 18), 11 with prior research training and 7 without, were taken through a 2-hour introductory session with PubMed. The 10 PubMed tools and services considered in this study can be divided into three functions: (1) information retrieval (Boolean Search, Limits, Related Articles, Author Links, MeSH), (2) information access (Publisher Link, LinkOut, Bookshelf ), and (3) information management (History, Send To, Email Alert). Participants were introduced to between six and 10 of these tools and services. The participants were asked to provide feedback on the value of each tool or service in terms of their information needs, which was ranked as positive, positive with emphasis, negative, or indifferent.
Results
The participants in this study expressed an interest in the three types of PubMed tools and services (information retrieval, access, and management), with less well-regarded tools including MeSH Database and Bookshelf. In terms of their comprehension of the research, the tools and services led the participants to reflect on their understanding as well as their critical reading and use of the research. There was universal support among the participants for greater access to complete articles, beyond the approximately 15% that are currently open access. The abstracts provided by PubMed were felt to be necessary in selecting literature to read but entirely inadequate for both evaluating and learning from the research. Thus, the restrictions and fees the participants faced in accessing full-text articles were points of frustration.
Conclusions
The study found strong indications of PubMed’s potential value in the professional development of these complementary and alternative medicine practitioners in terms of engaging with and understanding research. It provides support for the various initiatives intended to increase access, including a recommendation that the National Library of Medicine tap into the published research that is being archived by authors in institutional archives and through other websites.
doi:10.2196/jmir.9.2.e19
PMCID: PMC1913941  PMID: 17613489
PubMed; research dissemination; complementary and alternative medicine; open access; professional development; information retrieval; information management; literacy
13.  Google Scholar is not enough to be used alone for systematic reviews 
Background: Google Scholar (GS) has been noted for its ability to search broadly for important references in the literature. Gehanno et al. recently examined GS in their study: ‘Is Google scholar enough to be used alone for systematic reviews?’ In this paper, we revisit this important question, and some of Gehanno et al.’s other findings in evaluating the academic search engine.
Methods: The authors searched for a recent systematic review (SR) of comparable size to run search tests similar to those in Gehanno et al. We selected Chou et al. (2013) contacting the authors for a list of publications they found in their SR on social media in health. We queried GS for each of those 506 titles (in quotes ""), one by one. When GS failed to retrieve a paper, or produced too many results, we used the allintitle: command to find papers with the same title.
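The two query forms used in this procedure, an exact-phrase search followed by an `allintitle:` fallback, can be written out as a small sketch. The helper name `scholar_queries` and the example title are hypothetical, and the sketch only builds the query strings; it does not submit them to Google Scholar.

```python
def scholar_queries(title):
    """Return (exact_phrase_query, allintitle_query) for one paper title.

    Hypothetical helper: whitespace is collapsed and embedded double
    quotes are removed so the phrase query stays well-formed.
    """
    cleaned = " ".join(title.replace('"', "").split())
    return f'"{cleaned}"', f"allintitle: {cleaned}"


# Made-up title used only to show the two query forms.
phrase, fallback = scholar_queries("Social  media use in health communication")
print(phrase)
print(fallback)
```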
Results: Google Scholar produced records for ~95% of the papers cited by Chou et al. (n=476/506). A few of the 30 papers that were not in GS were later retrieved via PubMed and even regular Google Search. But due to its different structure, we could not run searches in GS that were originally performed by Chou et al. in PubMed, Web of Science, Scopus and PsycINFO®. Identifying 506 papers in GS was an inefficient process, especially for papers using similar search terms.
Conclusions: Has Google Scholar improved enough to be used alone in searching for systematic reviews? No. GS’ constantly-changing content, algorithms and database structure make it a poor choice for systematic reviews. Looking for papers when you know their titles is a far different issue from discovering them initially. Further research is needed to determine when and how (and for what purposes) GS can be used alone. Google should provide details about GS’ database coverage and improve its interface (e.g., with semantic search filters, stored searching, etc.). Perhaps then it will be an appropriate choice for systematic reviews.
doi:10.5210/ojphi.v5i2.4623
PMCID: PMC3733758  PMID: 23923099
MeSH Keywords: Google Scholar; information retrieval; PubMed; searching; systematic reviews
14.  NEMO: Extraction and normalization of organization names from PubMed affiliation strings 
Background. We are witnessing an exponential increase in biomedical research citations in PubMed. However, translating biomedical discoveries into practical treatments is estimated to take around 17 years, according to the 2000 Yearbook of Medical Informatics, and much information is lost during this transition. Pharmaceutical companies spend huge sums to identify opinion leaders and centers of excellence. Conventional methods such as literature search, survey, observation, self-identification, expert opinion, and sociometry not only need much human effort, but are also noncomprehensive. Such huge delays and costs can be reduced by “connecting those who produce the knowledge with those who apply it”. A humble step in this direction is large scale discovery of persons and organizations involved in specific areas of research. This can be achieved by automatically extracting and disambiguating author names and affiliation strings retrieved through Medical Subject Heading (MeSH) terms and other keywords associated with articles in PubMed. In this study, we propose NEMO (Normalization Engine for Matching Organizations), a system for extracting organization names from the affiliation strings provided in PubMed abstracts, building a thesaurus (list of synonyms) of organization names, and subsequently normalizing them to a canonical organization name using the thesaurus.
Results: We used a parsing process that involves multi-layered rule matching with multiple dictionaries. The normalization process involves clustering based on weighted local sequence alignment metrics to address synonymy at word level, and local learning based on finding connected components to address synonymy. The graphical user interface and Java client library of NEMO are available at http://lnxnemo.sourceforge.net.
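The connected-components idea mentioned above can be illustrated with a minimal sketch. It is not NEMO's code: the function name and the example name pairs are hypothetical, and it assumes that pairs of organization-name variants have already been judged similar by some earlier step. Variants linked by any chain of pairs end up in the same group.

```python
from collections import defaultdict


def connected_components(pairs):
    """Group items that are linked by any chain of pairs (union-find)."""
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for item in list(parent):
        groups[find(item)].add(item)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical similarity links between affiliation name variants.
links = [
    ("Univ. of Washington", "University of Washington"),
    ("University of Washington", "UW"),
    ("MIT", "Massachusetts Institute of Technology"),
]
for group in connected_components(links):
    print(group)
```

Each resulting group could then be mapped to one canonical name, for example the most frequent variant; that choice is an assumption of this sketch rather than something stated in the abstract.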
Conclusion: NEMO was developed to associate each biomedical paper and its authors with a unique organization name and the geopolitical location of that organization. This system provides more accurate information about organizations than the raw affiliation strings provided in PubMed abstracts. It can be used for: a) bimodal social network analysis that evaluates the research relationships between individual researchers and their institutions; b) improving author name disambiguation; c) augmenting the National Library of Medicine (NLM)’s Medical Articles Record System (MARS) to correct errors due to OCR on affiliation strings that are in small fonts; and d) improving PubMed citation indexing strategies (authority control) based on normalized organization name and country.
PMCID: PMC2990275  PMID: 20922666
15.  Design and Implementation of a Portal for the Medical Equipment Market: MEDICOM 
Background
The MEDICOM (Medical Products Electronic Commerce) Portal provides the electronic means for medical-equipment manufacturers to communicate online with their customers while supporting the Purchasing Process and Post Market Surveillance. The Portal offers a powerful Internet-based search tool for finding medical products and manufacturers. Its main advantage is the fast, reliable and up-to-date retrieval of information while eliminating all unrelated content that a general-purpose search engine would retrieve. The Universal Medical Device Nomenclature System (UMDNS) registers all products. The Portal accepts end-user requests and generates a list of results containing text descriptions of devices, UMDNS attribute values, and links to manufacturer Web pages and online catalogues for access to more-detailed information. Device short descriptions are provided by the corresponding manufacturer. The Portal offers technical support for integration of the manufacturers' Web sites with itself. The network of the Portal and the connected manufacturers' sites is called the MEDICOM system.
Objective
To establish an environment hosting all the interactions of consumers (health care organizations and professionals) and providers (manufacturers, distributors, and resellers of medical devices).
Methods
The Portal provides the end-user interface, implements system management, and supports database compatibility. The Portal hosts information about the whole MEDICOM system (Common Database) and summarized descriptions of medical devices (Short Description Database); the manufacturers' servers present extended descriptions. The Portal provides end-user profiling and registration, an efficient product-searching mechanism, bulletin boards, links to on-line libraries and standards, on-line information for the MEDICOM system, and special messages or advertisements from manufacturers. Platform independence and interoperability characterize the system design. Relational Database Management Systems are used for the system's databases. The end-user interface is implemented using HTML, Javascript, Java applets, and XML documents. Communication between the Portal and the manufacturers' servers is implemented using a CORBA interface. Remote administration of the Portal is enabled by dynamically-generated HTML interfaces based on XML documents.
A representative group of users evaluated the system. The aim of the evaluation was validation of the usability of all of MEDICOM's functionality. The evaluation procedure was based on ISO/IEC 9126 Information technology - Software product evaluation - Quality characteristics and guidelines for their use.
Results
The overall user evaluation of the MEDICOM system was very positive. The MEDICOM system was characterized as an innovative concept that brings significant added value to medical-equipment commerce.
Conclusions
The eventual benefits of the MEDICOM system are (a) establishment of a worldwide-accessible marketplace between manufacturers and health care professionals that provides up-to-date and high-quality product information in an easy and friendly way and (b) enhancement of the efficiency of marketing procedures and after-sales support.
doi:10.2196/jmir.3.4.e32
PMCID: PMC1761912  PMID: 11772547
Electronic commerce, medical devices, equipment and supplies, Internet, CORBA, XML, RDBMS
16.  SeqHound: biological sequence and structure database as a platform for bioinformatics research 
BMC Bioinformatics  2002;3:32.
Background
SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment.
Results
SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries.
Conclusions
The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.
doi:10.1186/1471-2105-3-32
PMCID: PMC138791  PMID: 12401134
sequence database; structure database; local database resource
17.  G-Bean: an ontology-graph based web tool for biomedical literature retrieval 
BMC Bioinformatics  2014;15(Suppl 12):S1.
Background
Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph-based biomedical search engine, to search biomedical articles in the MEDLINE database more efficiently.
Methods
G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in the National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on the user's search intention: after the user selects any article from the existing search results, G-Bean analyzes the user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles.
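As an illustration of the Personalized PageRank step in (2), the following is a minimal power-iteration sketch on a toy concept graph. It is not G-Bean's implementation: the graph, the concept names, the damping factor of 0.85 and the iteration count are all assumptions made for the example, and every neighbour is assumed to appear as a key of the graph.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Score nodes by a random walk that restarts at the seed concepts.

    graph maps each node to a list of neighbours; seeds must be nodes
    of the graph (for example, the concepts found in the user's query).
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if neighbours:
                share = damping * rank[n] / len(neighbours)
                for m in neighbours:
                    nxt[m] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for s in nodes:
                    nxt[s] += damping * rank[n] * restart[s]
        rank = nxt
    return rank


# Toy concept graph; the names are placeholders, not UMLS content.
toy_graph = {
    "asthma": ["bronchial disease"],
    "bronchial disease": ["asthma", "lung disease"],
    "lung disease": ["bronchial disease"],
    "fracture": [],
}
scores = personalized_pagerank(toy_graph, seeds={"asthma"})
for concept in sorted(scores, key=scores.get, reverse=True):
    print(concept, round(scores[concept], 3))
```

Concepts that cannot be reached from the seeds keep a score of zero, which is what makes the ranking specific to the query rather than to the graph as a whole.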
Results
Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean query statement automatically from the natural language query strings. G-Bean is available at http://bioinformatics.clemson.edu/G-Bean/index.php.
Conclusions
G-Bean addresses PubMed's limitations with ontology-graph based query expansion, automatic document indexing, and user search intention discovery. It shows significant advantages in finding relevant articles from the MEDLINE database to meet the information need of the user.
doi:10.1186/1471-2105-15-S12-S1
PMCID: PMC4243180  PMID: 25474588
18.  GO2PUB: Querying PubMed with semantic expansion of gene ontology terms 
Background
With the development of high throughput methods of gene analyses, there is a growing need for mining tools to retrieve relevant articles in PubMed. As PubMed grows, literature searches become more complex and time-consuming. Automated search tools with good precision and recall are necessary. We developed GO2PUB to automatically enrich PubMed queries with gene names, symbols and synonyms annotated by a GO term of interest or one of its descendants.
Results
GO2PUB enriches PubMed queries based on selected GO terms and keywords. It processes the result and displays the PMID, title, authors, abstract and bibliographic references of the articles. Gene names, symbols and synonyms that have been generated as extra keywords from the GO terms are also highlighted. GO2PUB is based on a semantic expansion of PubMed queries using the semantic inheritance between terms through the GO graph. Two experts manually assessed the relevance of GO2PUB, GoPubMed and PubMed on three queries about lipid metabolism. Experts’ agreement was high (kappa = 0.88). GO2PUB returned 69% of the relevant articles, GoPubMed 40% and PubMed 29%. GO2PUB and GoPubMed have 17% of their results in common, corresponding to 24% of the total number of relevant results. Of the articles returned by more than one tool, 70% were relevant. Of the relevant articles, 36% were returned only by GO2PUB, 17% only by GoPubMed and 14% only by PubMed. To determine whether these results can be generalized, we generated twenty queries based on random GO terms with a granularity similar to those of the first three queries and compared the proportions of GO2PUB and GoPubMed results. These were 77% and 40%, respectively, for the first queries and 70% and 38% for the random queries. The two experts also assessed the relevance of seven of the twenty queries (the three related to lipid metabolism and four related to other domains). Expert agreement was high (0.93 and 0.8). The performance of GO2PUB and GoPubMed was similar to that on the first queries.
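The semantic expansion described above can be sketched as follows. This is not GO2PUB's code: the term identifiers, gene names, function names and the exact query syntax are hypothetical placeholders. The sketch collects a term and all of its descendants in a parent-to-children graph, gathers the genes annotated by any of them, and adds those genes to the keyword query.

```python
def descendants(children, term):
    """Return the term together with all of its descendants.

    children maps a term to the list of its direct child terms.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, []))
    return seen


def expand_query(keyword, term, children, annotations):
    """Add genes annotated by the term or its descendants to the query."""
    genes = sorted(
        {g for t in descendants(children, term) for g in annotations.get(t, [])}
    )
    if not genes:
        return keyword
    gene_clause = " OR ".join(f'"{g}"' for g in genes)
    return f"({keyword}) AND ({gene_clause})"


# Hypothetical term hierarchy and gene annotations.
toy_children = {"T1": ["T2", "T3"], "T2": ["T4"]}
toy_annotations = {"T1": ["GENE_A"], "T4": ["GENE_B"], "T5": ["GENE_C"]}
print(expand_query("liver", "T1", toy_children, toy_annotations))
```

In this toy case GENE_B is picked up only because T4 is a descendant of T1, which is the kind of article-retrieving keyword a query on the parent term alone would miss.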
Conclusions
We demonstrated that the use of genes annotated by either GO terms of interest or a descendant of these GO terms yields some relevant articles ignored by other tools. The comparison of GO2PUB, based on semantic expansion, with GoPubMed, based on text mining techniques, showed that both tools are complementary. The analysis of the randomly-generated queries suggests that the results obtained about lipid metabolism can be generalized to other biological processes. GO2PUB is available at http://go2pub.genouest.org.
doi:10.1186/2041-1480-3-7
PMCID: PMC3599846  PMID: 22958570
Gene ontology; Semantic expansion; Query enrichment; PubMed
19.  Improving accuracy for identifying related PubMed queries by an integrated approach 
Journal of biomedical informatics  2008;42(5):831-838.
PubMed is the most widely used tool for searching biomedical literature online. As with many other online search tools, a user often types a series of multiple related queries before retrieving satisfactory results to fulfill a single information need. Meanwhile, it is also a common phenomenon to see a user type queries on unrelated topics in a single session. In order to study PubMed users’ search strategies, it is necessary to be able to automatically separate unrelated queries and group together related queries. Here, we report a novel approach combining both lexical and contextual analyses for segmenting PubMed query sessions and identifying related queries and compare its performance with the previous approach based solely on concept mapping.
We experimented with our integrated approach on sample data consisting of 1,539 pairs of consecutive user queries in 351 user sessions. The prediction results of 1,396 pairs agreed with the gold-standard annotations, achieving an overall accuracy of 90.7%. This demonstrates that our approach is significantly better than the previously published method. By applying this approach to a one day query log of PubMed, we found that a significant proportion of information needs involved more than one PubMed query, and that most of the consecutive queries for the same information need are lexically related. Finally, the proposed PubMed distance is shown to be an accurate and meaningful measure for determining the contextual similarity between biological terms. The integrated approach can play a critical role in handling real-world PubMed query log data as is demonstrated in our experiments.
doi:10.1016/j.jbi.2008.12.006
PMCID: PMC2764279  PMID: 19162232
PubMed Distance; Related Query; PubMed Query Log; Session Segmentation; Lexical Similarity; Contextual Similarity
20.  Integration of open access literature into the RCSB Protein Data Bank using BioLit 
BMC Bioinformatics  2010;11:220.
Background
Biological data have traditionally been stored and made publicly available through a variety of on-line databases, whereas biological knowledge has traditionally been found in the printed literature. With journals now on-line and providing an increasing amount of open access content, often free of copyright restriction, this distinction between database and literature is blurring. To exploit this opportunity we present the integration of open access literature with the RCSB Protein Data Bank (PDB).
Results
BioLit provides an enhanced view of articles with markup of semantic data and links to biological databases, based on the content of the article. For example, words matching existing biological ontologies are highlighted and database identifiers are linked to their database of origin. Among other functions, it identifies PDB IDs that are mentioned in the open access literature, by parsing the full text for all research articles in PubMed Central (PMC) and exposing the results as simple XML Web Services. Here, we integrate BioLit results with the RCSB PDB website by using these services to find PDB IDs that are mentioned in research articles and subsequently retrieving abstract, figures, and text excerpts for those articles. A new RCSB PDB literature view permits browsing through the figures and abstracts of the articles that mention a given structure. The BioLit Web Services that are providing the underlying data are publicly accessible. A Java client library is provided that supports querying these services.
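How BioLit recognizes PDB IDs is not described here, so the following is only a rough sketch of one way to spot candidates in article text. It assumes the classic four-character identifier format (a digit from 1 to 9 followed by three letters or digits); the function name, the rule of skipping all-digit tokens and the optional whitelist are assumptions of the example.

```python
import re

# Classic four-character form: one digit 1-9, then three alphanumerics.
CANDIDATE = re.compile(r"\b[1-9][A-Za-z0-9]{3}\b")


def find_pdb_id_candidates(text, known_ids=None):
    """Return candidate PDB IDs in order of first appearance, upper-cased.

    known_ids, if given, is a set of valid IDs used to discard false
    positives such as ordinals ("10th") that happen to fit the pattern.
    """
    found = []
    for match in CANDIDATE.findall(text):
        pdb_id = match.upper()
        if not any(ch.isalpha() for ch in pdb_id):
            continue  # skip all-digit tokens such as years
        if known_ids is not None and pdb_id not in known_ids:
            continue
        if pdb_id not in found:
            found.append(pdb_id)
    return found


sample = "Hemoglobin (4HHB) was compared with entry 1abc in 2010; see 4hhb."
print(find_pdb_id_candidates(sample))
print(find_pdb_id_candidates(sample, known_ids={"4HHB"}))
```

A pattern this loose produces false positives, so checking candidates against a list of real identifiers, as the optional argument does, would be needed in practice.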
Conclusions
The integration between literature and websites, as demonstrated here with the RCSB PDB, provides a broader view for how a given structure has been analyzed and used. This approach detects the mention of a PDB structure even if it is not formally cited in the paper. Other structures related through the same literature references can also be identified, possibly providing new scientific insight. To our knowledge this is the first time that database and literature have been integrated in this way and it speaks to the opportunities afforded by open and free access to both database and literature content.
doi:10.1186/1471-2105-11-220
PMCID: PMC2880030  PMID: 20429930
21.  Development and evaluation of evidence-based nursing (EBN) filters and related databases* 
Objectives: Difficulties encountered in the retrieval of evidence-based nursing (EBN) literature and recognition of terminology, research focus, and design differences between evidence-based medicine and nursing led to the realization that nursing needs its own filter strategies for evidence-based practice. This article describes the development and evaluation of filters that facilitate evidence-based nursing searches.
Methods: An inductive, multistep methodology was employed. A sleep search strategy was developed for uniform application to all filters for filter development and evaluation purposes. An EBN matrix was next developed as a framework to illustrate conceptually the placement of nursing-sensitive filters along two axes: horizontally, an adapted nursing process, and vertically, levels of evidence. Nursing diagnosis, patient outcomes, and primary data filters were developed recursively. Through an interface with the PubMed search engine, the EBN matrix filters were inserted into a database that executes filter searches, retrieves citations, and stores and updates retrieved citations sets hourly. For evaluation purposes, the filters were subjected to sensitivity and specificity analyses and retrieval set comparisons. Once the evaluation was complete, hyperlinks providing access to any one or a combination of completed filters to the EBN matrix were created. Subject searches on any topic may be applied to the filters, which interface with PubMed.
Results: Sensitivity and specificity for the combined nursing diagnosis and primary data filter were 64% and 99%, respectively; for the patient outcomes filter, the results were 75% and 71%, respectively. Comparisons were made between the EBN matrix filters (nursing diagnosis and primary data) and PubMed's Clinical Queries (diagnosis and sensitivity) filters. Additional comparisons examined publication types and indexing differences. Review articles accounted for the majority of the publication type differences, because “review” was accepted by the CQ but was “NOT'd” by the EBN filter. Indexing comparisons revealed that although the term “nursing diagnosis” is in Medical Subject Headings (MeSH), the nursing diagnoses themselves (e.g., sleep deprivation, disturbed sleep pattern) are not indexed as nursing diagnoses. As a result, abstracts deemed to be appropriate nursing diagnosis by the EBN filter were not accepted by the CQ diagnosis filter.
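For readers less familiar with the two measures reported above, this short sketch shows how they are computed from a filter's retrieval counts. The counts are invented solely so that the output matches the 64% and 99% figures; they are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Share of relevant articles that the filter retrieved."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of non-relevant articles that the filter excluded."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 relevant and 100 non-relevant articles.
print(f"sensitivity = {sensitivity(true_pos=64, false_neg=36):.0%}")
print(f"specificity = {specificity(true_neg=99, false_pos=1):.0%}")
```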
Conclusions: The EBN filter capture of desired articles may be enhanced by further refinement to achieve a greater degree of filter sensitivity. Retrieval set comparisons revealed publication type differences and indexing issues. The EBN matrix filter “NOT'd” out “review,” while the CQ filter did not. Indexing issues were identified that explained the retrieval of articles deemed appropriate by the EBN filter matrix but not included in the CQ retrieval. These results have MeSH definition and indexing implications as well as implications for clinical decision support in nursing practice.
PMCID: PMC545129  PMID: 15685282
22.  Rethinking information delivery: using a natural language processing application for point-of-care data discovery*† 
Objective:
This paper examines the use of Semantic MEDLINE, a natural language processing application enhanced with a statistical algorithm known as Combo, as a potential decision support tool for clinicians. Semantic MEDLINE summarizes text in PubMed citations, transforming it into compact declarations that are filtered according to a user's information need and can be displayed in a graphic interface. Integration of the Combo algorithm enables Semantic MEDLINE to deliver information salient to many diverse needs.
Methods:
The authors selected three disease topics and crafted PubMed search queries to retrieve citations addressing the prevention of these diseases. They then processed the citations with Semantic MEDLINE, with the Combo algorithm enhancement. To evaluate the results, they constructed a reference standard for each disease topic consisting of preventive interventions recommended by a commercial decision support tool.
Results:
Semantic MEDLINE with Combo produced an average recall of 79% in primary and secondary analyses, an average precision of 45%, and a final average F-score of 0.57.
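The relation between the three figures above can be checked with the usual balanced F-score, the harmonic mean of precision and recall. This is a sketch under an assumption: the abstract does not say whether the F-score was computed from the averaged precision and recall or averaged across topics, so the agreement below may be partly coincidental.

```python
def f_score(precision, recall, beta=1.0):
    """F-measure; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Average precision 45% and average recall 79%, as reported above.
print(round(f_score(precision=0.45, recall=0.79), 2))  # prints 0.57
```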
Conclusion:
This new approach to point-of-care information delivery holds promise as a decision support tool for clinicians. Health sciences libraries could implement such technologies to deliver tailored information to their users.
doi:10.3163/1536-5050.100.2.009
PMCID: PMC3324802  PMID: 22514507
23.  LitMiner: integration of library services within a bio-informatics application 
Background
This paper examines how the adoption of a subject-specific library service has changed the way in which its users interact with a digital library. The LitMiner text-analysis application was developed to enable biologists to explore gene relationships in the published literature. The application features a suite of interfaces that enable users to search PubMed as well as local databases, to view document abstracts, to filter terms, to select gene name aliases, and to visualize the co-occurrences of genes in the literature. At each of these stages, LitMiner offers the functionality of a digital library. Documents that are accessible online are identified by an icon. Users can also order documents from their institution's library collection from within the application. In so doing, LitMiner aims to integrate digital library services into the research process of its users.
Methods
Case study
Results
This integration of digital library services into the research process of biologists results in increased access to the published literature.
Conclusion
In order to make better use of their collections, digital libraries should customize their services to suit the research needs of their patrons.
doi:10.1186/1742-5581-3-11
PMCID: PMC1626482  PMID: 17052341
24.  Analysis of queries sent to PubMed at the point of care: Observation of search behaviour in a medical teaching hospital 
Background
The use of PubMed to answer daily medical care questions is limited because it is challenging to retrieve a small set of relevant articles and time is restricted. Knowing what aspects of queries are likely to retrieve relevant articles can increase the effectiveness of PubMed searches. The objectives of our study were to identify queries that are likely to retrieve relevant articles by relating PubMed search techniques and tools to the number of articles retrieved and the selection of articles for further reading.
Methods
This was a prospective observational study of queries regarding patient-related problems sent to PubMed by residents and internists in internal medicine working in an Academic Medical Centre. We analyzed queries, search results, query tools (Mesh, Limits, wildcards, operators), selection of abstract and full-text for further reading, using a portal that mimics PubMed.
Results
PubMed was used to solve 1121 patient-related problems, resulting in 3205 distinct queries. Abstracts were viewed in 999 (31%) of these queries, and in 126 (39%) of 321 queries using query tools. The average term count per query was 2.5. Abstracts were selected in more than 40% of queries using four or five terms, increasing to 63% if the use of four or five terms yielded 2–161 articles.
Conclusion
Queries sent to PubMed by physicians at our hospital during daily medical care contain fewer than three terms. Queries using four to five terms, retrieving fewer than 161 article titles, are most likely to result in abstract viewing. PubMed search tools are used infrequently by our population and are less effective than the use of four or five terms. Methods to facilitate the formulation of precise queries, using more relevant terms, should be the focus of education and research.
doi:10.1186/1472-6947-8-42
PMCID: PMC2567311  PMID: 18816391
25.  Concept-based query expansion for retrieving gene related publications from MEDLINE 
BMC Bioinformatics  2010;11:212.
Background
Advances in biotechnology and in high-throughput methods for gene analysis have contributed to an exponential increase in the number of scientific publications in these fields of study. While much of the data and results described in these articles are entered and annotated in the various existing biomedical databases, the scientific literature is still the major source of information. There is, therefore, a growing need for text mining and information retrieval tools to help researchers find the relevant articles for their study. To tackle this, several tools have been proposed to provide alternative solutions for specific user requests.
Results
This paper presents QuExT, a new PubMed-based document retrieval and prioritization tool that, from a given list of genes, searches for the most relevant results from the literature. QuExT follows a concept-oriented query expansion methodology to find documents containing concepts related to the genes in the user input, such as protein and pathway names. The retrieved documents are ranked according to user-definable weights assigned to each concept class. By changing these weights, users can modify the ranking of the results in order to focus on documents dealing with a specific concept. The method's performance was evaluated using data from the 2004 TREC genomics track, producing a mean average precision of 0.425, with an average of 4.8 and 31.3 relevant documents within the top 10 and 100 retrieved abstracts, respectively.
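The user-adjustable ranking described above can be pictured with a small sketch. It is not QuExT's scoring function: the concept classes, the per-document counts and the linear weighted sum are assumptions chosen to show how changing the weights reorders the same set of documents.

```python
def rank_documents(doc_concepts, weights):
    """Order document IDs by the weighted sum of their concept counts.

    doc_concepts maps a document ID to {concept_class: match_count};
    weights maps a concept class to its user-chosen weight.
    """
    def score(counts):
        return sum(weights.get(cls, 0.0) * n for cls, n in counts.items())

    return sorted(doc_concepts, key=lambda d: score(doc_concepts[d]), reverse=True)


# Hypothetical counts of concept matches per retrieved document.
docs = {
    "doc1": {"gene": 2, "pathway": 0},
    "doc2": {"gene": 1, "pathway": 3},
}
print(rank_documents(docs, {"gene": 1.0, "pathway": 0.2}))  # gene-focused
print(rank_documents(docs, {"gene": 0.2, "pathway": 1.0}))  # pathway-focused
```

With gene matches weighted most heavily doc1 comes first, and with pathway matches weighted most heavily doc2 does, without rerunning the retrieval step.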
Conclusions
QuExT implements a concept-based query expansion scheme that leverages gene-related information available on a variety of biological resources. The main advantage of the system is to give the user control over the ranking of the results by means of a simple weighting scheme. Using this approach, researchers can effortlessly explore the literature regarding a group of genes and focus on the different aspects relating to these genes.
doi:10.1186/1471-2105-11-212
PMCID: PMC2873540  PMID: 20426836
