Related Articles

1.  Willow: a uniform search interface. 
The objective of the Willow Project is to develop a uniform search interface that allows a diverse community of users to retrieve information from heterogeneous network-based information resources. Willow separates the user interface from the database management or information retrieval system. It provides a graphical user interface to a variety of information resources that reside on diverse hosts and use different search engines and idiomatic query languages, reached through network-based client-server and Transmission Control Protocol/Internet Protocol (TCP/IP) protocols. It is based on a "database driver" model, which allows new database hosts to be added without altering Willow itself. Willow employs a multimedia extension mechanism to launch external viewers to handle data in almost any form. Drivers are currently available for a local BRS/SEARCH system and the Z39.50 protocol. Students, faculty, clinicians, and researchers at the University of Washington are currently offered 30 local and remote databases via Willow. They conduct more than 250,000 sessions a month in libraries, medical centers and clinics, laboratories, and offices, and from home. The Massachusetts Institute of Technology is implementing Willow as its uniform search interface to Z39.50 hosts.
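The "database driver" idea in this abstract can be illustrated with a minimal sketch. It is not Willow's code: the class names `DatabaseDriver`, `InMemoryDriver`, and `SearchInterface` are hypothetical, and the toy driver stands in for a real host such as a Z39.50 server. The point is only that the front end depends on a driver contract, so a new host is added by registering a new driver.

```python
from abc import ABC, abstractmethod


class DatabaseDriver(ABC):
    """Hypothetical driver contract: one subclass per back-end search system."""

    @abstractmethod
    def search(self, query: str) -> list[str]:
        """Translate a generic query into the host's idiom and return records."""


class InMemoryDriver(DatabaseDriver):
    """Toy driver (an assumption) standing in for a real remote host."""

    def __init__(self, records: list[str]) -> None:
        self._records = records

    def search(self, query: str) -> list[str]:
        # Case-insensitive substring match over the stored records.
        needle = query.lower()
        return [r for r in self._records if needle in r.lower()]


class SearchInterface:
    """Front end that knows only the driver contract, not any specific host."""

    def __init__(self) -> None:
        self._drivers: dict[str, DatabaseDriver] = {}

    def register(self, name: str, driver: DatabaseDriver) -> None:
        # Adding a host means registering a driver; this class stays unchanged.
        self._drivers[name] = driver

    def search(self, database: str, query: str) -> list[str]:
        return self._drivers[database].search(query)


ui = SearchInterface()
ui.register("demo", InMemoryDriver(["Heart failure review", "Statin trial"]))
results = ui.search("demo", "statin")
print(results)
```

Supporting a further host in this sketch would mean writing one more `DatabaseDriver` subclass and calling `register`, which mirrors the abstract's claim that hosts can be added without altering the interface itself.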
PMCID: PMC116285  PMID: 8750388
2.  CDAPubMed: a browser extension to retrieve EHR-based biomedical literature 
Over the last few decades, the ever-increasing output of scientific publications has led to new challenges to keep up to date with the literature. In the biomedical area, this growth has introduced new requirements for professionals, e.g., physicians, who have to locate the exact papers that they need for their clinical and research work amongst a huge number of publications. Against this backdrop, novel information retrieval methods are even more necessary. While web search engines are widespread in many areas, facilitating access to all kinds of information, additional tools are required to automatically link information retrieved from these engines to specific biomedical applications. In the case of clinical environments, this also means considering aspects such as patient data security and confidentiality or structured contents, e.g., electronic health records (EHRs). In this scenario, we have developed a new tool to facilitate query building to retrieve scientific literature related to EHRs.
We have developed CDAPubMed, an open-source web browser extension to integrate EHR features in biomedical literature retrieval approaches. Clinical users can use CDAPubMed to: (i) load patient clinical documents, i.e., EHRs based on the Health Level 7-Clinical Document Architecture Standard (HL7-CDA), (ii) identify relevant terms for scientific literature search in these documents, i.e., Medical Subject Headings (MeSH), automatically driven by the CDAPubMed configuration, which advanced users can optimize to adapt to each specific situation, and (iii) generate and launch literature search queries to a major search engine, i.e., PubMed, to retrieve citations related to the EHR under examination.
CDAPubMed is a platform-independent tool designed to facilitate literature searching using keywords contained in specific EHRs. CDAPubMed is visually integrated, as an extension of a widespread web browser, within the standard PubMed interface. It has been tested on a public dataset of HL7-CDA documents, returning significantly fewer citations since queries are focused on characteristics identified within the EHR. For instance, compared with more than 200,000 citations retrieved by breast neoplasm, fewer than ten citations were retrieved when ten patient features were added using CDAPubMed. This is an open source tool that can be freely used for non-profit purposes and integrated with other existing systems.
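The narrowing effect described above (far fewer citations once patient features are added) follows from combining terms conjunctively. The sketch below is an illustration only, not CDAPubMed's implementation: the function name `build_pubmed_query` and the example headings are assumptions, and it relies only on PubMed's standard `[MeSH Terms]` field tag and `AND` operator.

```python
def build_pubmed_query(mesh_terms):
    """Join MeSH headings into a conjunctive PubMed query string."""
    # Quote each heading and tag it so PubMed searches the MeSH field.
    clauses = [f'"{term}"[MeSH Terms]' for term in mesh_terms]
    # AND requires every heading to match, so each extra term narrows the result.
    return " AND ".join(clauses)


# Example headings chosen for illustration; they are not taken from the paper.
query = build_pubmed_query(["Breast Neoplasms", "Female", "Middle Aged"])
print(query)
```

Each additional heading extracted from a clinical document adds another `AND` clause, which is why a query built from several patient features returns a much smaller citation set than the disease term alone.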
PMCID: PMC3366875  PMID: 22480327
3. Web Services, a free and integrated toolkit for computational modelling software 
Briefings in Bioinformatics  2009;11(3):270-277.
Exchanging and sharing scientific results are essential for researchers in the field of computational modelling. defines agreed-upon standards for model curation. A fundamental one, MIRIAM (Minimum Information Requested in the Annotation of Models), standardises the annotation and curation process of quantitative models in biology. To support this standard, MIRIAM Resources maintains a set of standard data types for annotating models, and provides services for manipulating these annotations. Furthermore, creates controlled vocabularies, such as SBO (Systems Biology Ontology) which strictly indexes, defines and links terms used in Systems Biology. Finally, BioModels Database provides a free, centralised, publicly accessible database for storing, searching and retrieving curated and annotated computational models. Each resource provides a web interface to submit, search, retrieve and display its data. In addition, the team provides a set of Web Services which allows the community to programmatically access the resources. A user is then able to perform remote queries, such as retrieving a model and resolving all its MIRIAM Annotations, as well as getting the details about the associated SBO terms. These web services use established standards. Communications rely on SOAP (Simple Object Access Protocol) messages and the available queries are described in a WSDL (Web Services Description Language) file. Several libraries are provided in order to simplify the development of client software. Web Services make one step further for the researchers to simulate and understand the entirety of a biological system, by allowing them to retrieve biological models in their own tool, combine queries in workflows and efficiently analyse models.
PMCID: PMC2913671  PMID: 19939940
Systems Biology; modelling; Web Services; annotation; ontology
4.  Design and Implementation of a Portal for the Medical Equipment Market: MEDICOM 
The MEDICOM (Medical Products Electronic Commerce) Portal provides the electronic means for medical-equipment manufacturers to communicate online with their customers while supporting the Purchasing Process and Post Market Surveillance. The Portal offers a powerful Internet-based search tool for finding medical products and manufacturers. Its main advantage is the fast, reliable and up-to-date retrieval of information while eliminating all unrelated content that a general-purpose search engine would retrieve. The Universal Medical Device Nomenclature System (UMDNS) registers all products. The Portal accepts end-user requests and generates a list of results containing text descriptions of devices, UMDNS attribute values, and links to manufacturer Web pages and online catalogues for access to more-detailed information. Device short descriptions are provided by the corresponding manufacturer. The Portal offers technical support for integration of the manufacturers' Web sites with itself. The network of the Portal and the connected manufacturers' sites is called the MEDICOM system.
To establish an environment hosting all the interactions of consumers (health care organizations and professionals) and providers (manufacturers, distributors, and resellers of medical devices).
The Portal provides the end-user interface, implements system management, and supports database compatibility. The Portal hosts information about the whole MEDICOM system (Common Database) and summarized descriptions of medical devices (Short Description Database); the manufacturers' servers present extended descriptions. The Portal provides end-user profiling and registration, an efficient product-searching mechanism, bulletin boards, links to on-line libraries and standards, on-line information for the MEDICOM system, and special messages or advertisements from manufacturers. Platform independence and interoperability characterize the system design. Relational Database Management Systems are used for the system's databases. The end-user interface is implemented using HTML, Javascript, Java applets, and XML documents. Communication between the Portal and the manufacturers' servers is implemented using a CORBA interface. Remote administration of the Portal is enabled by dynamically-generated HTML interfaces based on XML documents.
A representative group of users evaluated the system. The aim of the evaluation was validation of the usability of all of MEDICOM's functionality. The evaluation procedure was based on ISO/IEC 9126 Information technology - Software product evaluation - Quality characteristics and guidelines for their use.
The overall user evaluation of the MEDICOM system was very positive. The MEDICOM system was characterized as an innovative concept that brings significant added value to medical-equipment commerce.
The eventual benefits of the MEDICOM system are (a) establishment of a worldwide-accessible marketplace between manufacturers and health care professionals that provides up-to-date and high-quality product information in an easy and friendly way and (b) enhancement of the efficiency of marketing procedures and after-sales support.
PMCID: PMC1761912  PMID: 11772547
Electronic commerce, medical devices, equipment and supplies, Internet, CORBA, XML, RDBMS
5.  BLAST+: architecture and applications 
BMC Bioinformatics  2009;10:421.
Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications.
We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site.
The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.
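The idea of breaking a long query into chunks can be sketched as follows. This is a generic overlapping-window split, not the BLAST+ implementation: the function name `split_query`, the chunk size, and the overlap are all assumptions chosen for illustration. The overlap is there so that a match spanning a chunk boundary still falls entirely inside at least one chunk, provided the match is no longer than the overlap.

```python
def split_query(sequence, chunk_size, overlap):
    """Return (offset, chunk) pairs covering the sequence with overlapping windows."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        # Record the offset so hits can be mapped back to query coordinates.
        chunks.append((start, sequence[start:start + chunk_size]))
        if start + chunk_size >= len(sequence):
            break  # this window already reaches the end of the sequence
        start += step
    return chunks


chunks = split_query("ACGTACGTAC", chunk_size=4, overlap=1)
print(chunks)
```

Keeping the offset with each chunk lets results found in a chunk be translated back to positions in the original query.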
PMCID: PMC2803857  PMID: 20003500
6.  Understanding PubMed® user search behavior through log analysis 
This article reports on a detailed investigation of PubMed users’ needs and behavior as a step toward improving biomedical information retrieval. PubMed is providing free service to researchers with access to more than 19 million citations for biomedical articles from MEDLINE and life science journals. It is accessed by millions of users each day. Efficient search tools are crucial for biomedical researchers to keep abreast of the biomedical literature relating to their own research. This study provides insight into PubMed users’ needs and their behavior. This investigation was conducted through the analysis of one month of log data, consisting of more than 23 million user sessions and more than 58 million user queries. Multiple aspects of users’ interactions with PubMed are characterized in detail with evidence from these logs. Despite having many features in common with general Web searches, biomedical information searches have unique characteristics that are made evident in this study. PubMed users are more persistent in seeking information and they reformulate queries often. The three most frequent types of search are search by author name, search by gene/protein, and search by disease. Use of abbreviation in queries is very frequent. Factors such as result set size influence users’ decisions. Analysis of characteristics such as these plays a critical role in identifying users’ information needs and their search habits. In turn, such an analysis also provides useful insight for improving biomedical information retrieval.
PMCID: PMC2797455  PMID: 20157491
7.  Electronic Biomedical Literature Search for Budding Researcher 
Searching for specific, well-defined literature on the subject of interest is the foremost step in research. Familiarity with the topic makes it possible to frame an appropriate research question, which in turn is the basis for the study objectives and hypothesis. The Internet provides quick access to an overabundance of medical literature in the form of primary, secondary and tertiary sources. It is accessible through journals, databases, dictionaries, textbooks, indexes, and e-journals, thereby allowing access to more varied, individualised, and systematic educational opportunities. A web search engine is a tool designed to search for information on the World Wide Web, which may be in the form of web pages, images, information, and other types of files. Search engines for internet-based search of medical literature include Google, Google Scholar, Scirus, Yahoo search engine, etc., and databases include MEDLINE, PubMed, MEDLARS, etc. Several web libraries (National Library of Medicine, Cochrane, Web of Science, Medical Matrix, Emory Libraries) have been developed as meta-sites, providing useful links to health resources globally. A researcher must keep in mind the strengths and limitations of a particular search engine or database while searching for a particular type of data. Knowledge of the types of literature and levels of evidence, and of search engine features such as availability, user interface, ease of access, reputable content, and period of time covered, allows their optimal use and maximal utility in the field of medicine. Literature search is a dynamic and interactive process; there is no one way to conduct a search and there are many variables involved. It is suggested that a systematic literature search that uses available electronic resources effectively is more likely to produce quality research.
PMCID: PMC3809676  PMID: 24179937
Research; Steps in literature search; Search engine; Literature review; Level of evidence
8.  Accessing Biomedical Literature in the Current Information Landscape 
i. Summary
Biomedical and life sciences literature is unique because of its exponentially increasing volume and interdisciplinary nature. Biomedical literature access is essential for several types of users including biomedical researchers, clinicians, database curators, and bibliometricians. In the past few decades, several online search tools and literature archives, generic as well as biomedicine-specific, have been developed. We present this chapter in the light of three consecutive steps of literature access: searching for citations, retrieving full-text, and viewing the article. The first section presents the current state of practice of biomedical literature access, including an analysis of the search tools most frequently used by the users, including PubMed, Google Scholar, Web of Science, Scopus, and Embase, and a study on biomedical literature archives such as PubMed Central. The next section describes current research and the state-of-the-art systems motivated by the challenges a user faces during query formulation and interpretation of search results. The research solutions are classified into five key areas related to text and data mining, text similarity search, semantic search, query support, relevance ranking, and clustering results. Finally, the last section describes some predicted future trends for improving biomedical literature access, such as searching and reading articles on portable devices, and adoption of the open access policy.
PMCID: PMC4593617  PMID: 24788259
Biomedical Literature Search; Text Mining; Information Retrieval; Bioinformatics; Open Access; Relevance Ranking; Semantic Search; Text Similarity Search
9.  Uptake of Workplace HIV Counselling and Testing: A Cluster-Randomised Trial in Zimbabwe 
PLoS Medicine  2006;3(7):e238.
HIV counselling and testing is a key component of both HIV care and HIV prevention, but uptake is currently low. We investigated the impact of rapid HIV testing at the workplace on uptake of voluntary counselling and testing (VCT).
Methods and Findings
The study was a cluster-randomised trial of two VCT strategies, with business occupational health clinics as the unit of randomisation. VCT was directly offered to all employees, followed by 2 y of open access to VCT and basic HIV care. Businesses were randomised to either on-site rapid HIV testing at their occupational clinic (11 businesses) or to vouchers for off-site VCT at a chain of free-standing centres also using rapid tests (11 businesses). Baseline anonymised HIV serology was requested from all employees.
HIV prevalence was 19.8% and 18.4%, respectively, at businesses randomised to on-site and off-site VCT. In total, 1,957 of 3,950 employees at clinics randomised to on-site testing had VCT (mean uptake by site 51.1%) compared to 586 of 3,532 employees taking vouchers at clinics randomised to off-site testing (mean uptake by site 19.2%). The risk ratio for on-site VCT compared to voucher uptake was 2.8 (95% confidence interval 1.8 to 3.8) after adjustment for potential confounders. Only 125 employees (mean uptake by site 4.3%) reported using their voucher, so that the true adjusted risk ratio for on-site compared to off-site VCT may have been as high as 12.5 (95% confidence interval 8.2 to 16.8).
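As a rough check on the figures above, the pooled (crude) uptake and risk ratio can be computed directly from the reported counts. This is only a sketch: it pools employees across businesses, so it does not reproduce the mean uptake by site (51.1% and 19.2%) or the adjusted, cluster-level risk ratio of 2.8 reported by the authors. The variable names are illustrative.

```python
# Counts reported in the abstract.
on_site_tested, on_site_total = 1957, 3950        # employees who had on-site VCT
off_site_vouchers, off_site_total = 586, 3532     # employees who took a voucher

# Pooled uptake: proportion of all employees in each arm, ignoring clustering.
on_site_uptake = on_site_tested / on_site_total
off_site_uptake = off_site_vouchers / off_site_total

# Crude risk ratio: ratio of the two pooled proportions, with no adjustment.
crude_risk_ratio = on_site_uptake / off_site_uptake

print(f"on-site uptake:   {on_site_uptake:.1%}")
print(f"off-site uptake:  {off_site_uptake:.1%}")
print(f"crude risk ratio: {crude_risk_ratio:.2f}")
```

The crude ratio comes out close to 3, in the same range as the adjusted estimate; the difference reflects that the trial analysis works at the level of the randomised clusters and adjusts for confounders.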
High-impact VCT strategies are urgently needed to maximise HIV prevention and access to care in Africa. VCT at the workplace offers the potential for high uptake when offered on-site and linked to basic HIV care. Convenience and accessibility appear to have critical roles in the acceptability of community-based VCT.
Editors' Summary
Since the first case of AIDS (acquired immunodeficiency syndrome) was reported 25 years ago, AIDS has become a major worldwide epidemic, with 3 million people dying from it in 2005. AIDS is caused by the human immunodeficiency virus (HIV), which is usually spread through unprotected sex with an infected partner. HIV damages the immune system, leaving infected individuals unable to fight off other viruses and bacteria. HIV infections can be treated with drugs known as “antiretrovirals,” and in an effort to deal with the global epidemic, world leaders have committed themselves to providing universal access to these drugs for everyone who needs them by 2010. Unfortunately, although access to antiretrovirals is rapidly increasing, so is the number of people infected with HIV. Last year, there were about 5 million new HIV infections, suggesting that more emphasis on prevention will be needed to halt or reverse the spread of HIV and AIDS. An important part of prevention is testing for HIV infection, but globally only 10% of people who need testing can access it. And even where such services are available, few people use them because of the stigma attached to HIV infection and fear of discrimination.
Why Was This Study Done?
There is limited understanding about the factors that determine whether an individual will decide to have an HIV test. Yet, to reduce HIV spread, as many people at risk of infection must be tested as possible. Previous studies on VCT—a combination of voluntary testing and counseling about the implications of HIV infection and how to avoid transmitting the virus—have indicated that the convenience of getting the test, whether the test is directly offered, and the attitude of staff supplying it are all very important. In this study, the researchers asked whether providing VCT in the workplace could improve the “uptake” of HIV testing in Africa, where the HIV/AIDS epidemic is most widespread.
What Did the Researchers Do and Find?
The researchers identified businesses with occupational health clinics in Zimbabwe, a country where 25% of adults carry HIV, and divided them into two “intervention” groups. Employees at half the businesses were offered “on-site VCT”—pre-test counseling followed by same-day on-site rapid testing, results, and post-test counseling. Employees at the other businesses had the same pre-test counseling but were offered a voucher for an HIV test at an off-site testing center and a later appointment to discuss the results—so-called off-site VCT. Everyone had the same access to limited HIV care should they need it. Although half of the employees at the on-site VCT businesses took up the option of HIV testing, only a fifth of employees at the off-site VCT businesses accepted vouchers for testing, and only one in five of these people actually used their voucher. This means that on-site VCT resulted in about 12 times as many HIV tests as off-site VCT. In both interventions, most of the people who accepted testing did so soon after entering the study and very few people were tested more than once. Finally, people 25 years old or younger, manual workers, and single people were most likely to accept testing in both interventions.
What Do These Findings Mean?
These results suggest that on-site VCT in the workplace might be one way to improve uptake of HIV testing in Africa from its current low level and that providing VCT intermittently might be as effective as continuous provision. Importantly, say the researchers, the results of their study show that a relatively minor change in accessibility to testing can translate into a major difference in test uptake. This may hold true in non-occupational settings. However, these observations need to be repeated in more businesses and other settings, including those where there is no linked HIV care, before they can be generalized. Also, this study reports on the acceptability of this approach to providing VCT, but not on its impact on HIV prevention. As such, the results do not indicate whether workplace VCT prevents HIV spread as effectively as other ways of delivering VCT. This will require research investigating how HIV incidence among HIV-negative employees and the partners of HIV-positive employees is affected by different VCT strategies.
Additional Information.
Please access these Web sites via the online version of this summary at
• United States National Institute of Allergy and Infectious Diseases factsheet on HIV infection and AIDS
• United States Department of Health and Human Services information on HIV/AIDS treatment, prevention, and research
• US Centers for Disease Control and Prevention information on HIV/AIDS
• UNAIDS (Joint United Nations Programme on HIV/AIDS) information on political issues related to the HIV/AIDS epidemic and the 2004 UNAIDS/World Health Organization policy statement on HIV testing
•  Aidsmap: information on HIV and AIDS provided by the charity NAM, which includes the latest scientific and political news
• MedlinePlus encyclopedia entry on HIV/AIDS
Voluntary counseling and testing for HIV has the potential for high uptake when it is offered on-site at the workplace and linked to basic HIV care.
PMCID: PMC1483908  PMID: 16796402
10.  SLIM: an alternative Web interface for MEDLINE/PubMed searches – a preliminary study 
With the rapid growth of medical information and the pervasiveness of the Internet, online search and retrieval systems have become indispensable tools in medicine. The progress of Web technologies can provide expert searching capabilities to non-expert information seekers. The objective of the project is to create an alternative search interface for MEDLINE/PubMed searches using JavaScript slider bars. SLIM, or Slider Interface for MEDLINE/PubMed searches, was developed with PHP and JavaScript. Interactive slider bars in the search form controlled search parameters such as limits, filters and MeSH terminologies. Connections to PubMed were done using the Entrez Programming Utilities (E-Utilities). Custom scripts were created to mimic the automatic term mapping process of Entrez. Page generation times for both local and remote connections were recorded.
Alpha testing by developers showed SLIM to be functionally stable. Page generation times to simulate loading times were recorded the first week of alpha and beta testing. Average page generation times for the index page, previews and searches were 2.94 milliseconds, 0.63 seconds and 3.84 seconds, respectively. Eighteen physicians from the US, Australia and the Philippines participated in the beta testing and provided feedback through an online survey. Most users found the search interface user-friendly and easy to use. Information on MeSH terms and the ability to instantly hide and display abstracts were identified as distinctive features.
SLIM can be an interactive time-saving tool for online medical literature research that improves user control and capability to instantly refine and refocus search strategies. With continued development and by integrating search limits, methodology filters, MeSH terms and levels of evidence, SLIM may be useful in the practice of evidence-based medicine.
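For readers unfamiliar with the Entrez Programming Utilities mentioned above, the sketch below shows how a search request URL can be assembled. It is not SLIM's code (SLIM was written in PHP and JavaScript, and its scripts are not shown here): the function name `build_esearch_url` is hypothetical, the endpoint is the current public ESearch address rather than whatever SLIM used at the time, and `db`, `term`, and `retmax` are standard ESearch parameters. No network request is made.

```python
from urllib.parse import urlencode

# Current public ESearch endpoint of the NCBI E-utilities.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=20):
    """Build an ESearch URL for a PubMed query without sending it."""
    params = {
        "db": "pubmed",    # database to search
        "term": term,      # query string, may include field tags
        "retmax": retmax,  # maximum number of identifiers to return
    }
    # urlencode percent-escapes brackets and turns spaces into '+'.
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url("asthma AND humans[MeSH Terms]", retmax=5)
print(url)
```

In an interface like the one described, slider positions would presumably be translated into limits and filters appended to `term` before the URL is built; that mapping is an assumption here, not something the abstract specifies.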
PMCID: PMC1318459  PMID: 16321145
11.  Statins in the Treatment of Chronic Heart Failure: A Systematic Review 
PLoS Medicine  2006;3(8):e333.
The efficacy of statin therapy in patients with established chronic heart failure (CHF) is a subject of much debate.
Methods and Findings
We conducted three systematic literature searches to assess the evidence supporting the prescription of statins in CHF. First, we investigated the participation of CHF patients in randomized placebo-controlled clinical trials designed to evaluate the efficacy of statins in reducing major cardiovascular events and mortality. Second, we assessed the association between serum cholesterol and outcome in CHF. Finally, we evaluated the ability of statin treatment to modify surrogate endpoint parameters in CHF.
Using validated search strategies, we systematically searched PubMed for our three queries. In addition, we searched the reference lists from eligible studies, used the “see related articles” feature for key publications in PubMed, consulted the Cochrane Library, and searched the ISI Web of Knowledge for papers citing key publications.
Search 1 resulted in the retrieval of 47 placebo-controlled clinical statin trials involving more than 100,000 patients. CHF patients had, however, been systematically excluded from these trials. Search 2 resulted in the retrieval of eight studies assessing the relationship between cholesterol levels and outcome in CHF patients. Lower serum cholesterol was consistently associated with increased mortality. Search 3 resulted in the retrieval of 18 studies on the efficacy of statin treatment in CHF. On the whole, these studies reported favorable outcomes for almost all surrogate endpoints.
Since CHF patients have been systematically excluded from randomized, controlled clinical cholesterol-lowering trials, the effect of statin therapy in these patients remains to be established. Currently, two large, randomized, placebo-controlled statin trials are under way to evaluate the efficacy of statin treatment in terms of reducing clinical endpoints in CHF patients in particular.
A systematic review found that patients with heart failure have been excluded from randomised controlled trials on the use of statins. Evidence from other studies on the effectiveness of statins for patients with heart failure is weak and conflicting.
Editors' Summary
When medical researchers test a drug—or some other treatment—for a particular medical condition, they often decide not to include in their study anyone who has, in addition to the disease they are interested in, certain other health problems. This is because including patients with two or more conditions can complicate the analysis of the results and make it hard to reach firm conclusions. However, excluding patients in this way can result in uncertainty as to whether treatments are effective for anyone who suffers from the disease in question, or just for people like those who took part in the research.
A great deal of research has been conducted with drugs known as statins, which lower cholesterol levels in the blood. (A raised level of cholesterol is known to be a major risk factor for cardiovascular disease, which causes heart attacks and strokes.) As a result of this research, statins have been accepted as effective and safe. They are now, in consequence, among the most commonly prescribed medicines. Heart failure, however, is not the same thing as a heart attack. It is the name given to the condition where the muscles of the heart have become weakened, most often as a result of aging, and the heart becomes gradually less efficient at pumping blood around the body. (Some people with heart failure live for many years, but 70% of those with the condition die within ten years.) It is common for people with cardiovascular disease also to have heart failure. Nevertheless, some researchers who have studied the effects of statins have made the decision not to include in their studies any patients with cardiovascular disease who, in addition, have heart failure.
Why Was This Study Done?
The researchers in this study were aware that patients with heart failure have often been excluded from statin trials. They felt it was important to assess the available evidence supporting the prescription of statins for such patients. Specifically, they wanted to find out the following: how often have patients with heart failure been included in statin trials, what evidence is available as to whether it is beneficial for patients with heart failure to have low cholesterol, and what evidence is there that prescribing statins helps these patients?
What Did the Researchers Do and Find?
They did not do any new work involving patients. Instead, they did a very thorough search for all relevant studies of good quality that had already been published and they reviewed the results. “Randomized clinical trials” (RCTs) are the most reliable type of medical research. The researchers found there had been 47 such trials (involving over 100,000 patients) on the use of statins for treating cardiovascular disease, but all these trials had excluded heart failure patients. They found eight studies (which were not RCTs) looking at cholesterol levels and heart failure. These studies found, perhaps surprisingly, that death rates were higher in those patients with heart failure who had low cholesterol. However, they also found 18 studies (again not RCTs) on the use of statins in patients with heart failure. These 18 studies seemed to suggest that statins were of benefit to the patients who received them.
What Do These Findings Mean?
The evidence for or against prescribing statins for people with heart failure is limited, conflicting, and unclear. Further research involving RCTs is necessary. (Two such trials are known to be in progress.)
Additional Information.
Please access these Web sites via the online version of this summary at
General information about statins is available from the Web site of Patient UK
The American Heart Association Web site is a good source of information about all types of heart disease, including heart attacks and heart failure
For a definition of randomized controlled trials see Wikipedia, a free online encyclopedia that anyone can edit
More detailed information about the quality of evidence from medical research may be found in the James Lind Library
PMCID: PMC1551909  PMID: 16933967
12.  A comparison of the performance of seven key bibliographic databases in identifying all relevant systematic reviews of interventions for hypertension 
Systematic Reviews  2016;5:27.
Bibliographic databases are the primary resource for identifying systematic reviews of health care interventions. Reliable retrieval of systematic reviews depends on the scope of indexing used by database providers. Therefore, searching one database may be insufficient, but it is unclear how many need to be searched. We sought to evaluate the performance of seven major bibliographic databases for the identification of systematic reviews for hypertension.
We searched seven databases (Cochrane library, Database of Abstracts of Reviews of Effects (DARE), Excerpta Medica Database (EMBASE), Epistemonikos, Medical Literature Analysis and Retrieval System Online (MEDLINE), PubMed Health and Turning Research Into Practice (TRIP)) from 2003 to 2015 for systematic reviews of any intervention for hypertension. Citations retrieved were screened for relevance, coded and checked for screening consistency using a fuzzy text matching query. The performance of each database was assessed by calculating its sensitivity, precision, the number of missed reviews and the number of unique records retrieved.
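The performance measures named above can be illustrated with a minimal sketch. It assumes the usual definitions (sensitivity as the share of the reference set that a database retrieved, precision as the share of a database's retrieved citations that belong to the reference set); the abstract does not spell out its formulas, and the database names and record IDs below are invented for illustration.

```python
from typing import Dict, Set


def sensitivity(retrieved: Set[str], reference: Set[str]) -> float:
    """Share of the reference set that one database retrieved."""
    return len(retrieved & reference) / len(reference)


def precision(retrieved: Set[str], reference: Set[str]) -> float:
    """Share of one database's retrieved citations that are in the reference set."""
    return len(retrieved & reference) / len(retrieved)


def unique_records(retrieved_by_db: Dict[str, Set[str]],
                   reference: Set[str], name: str) -> Set[str]:
    """Relevant records that only the named database retrieved."""
    others: Set[str] = set()
    for db, ids in retrieved_by_db.items():
        if db != name:
            others |= ids
    return (retrieved_by_db[name] & reference) - others


# Hypothetical toy data: "r*" are relevant reviews, "x*" are irrelevant citations.
reference_set = {"r1", "r2", "r3", "r4"}
retrieved_by_db = {
    "DB_A": {"r1", "r2", "x1", "x2"},
    "DB_B": {"r2", "r3"},
}

for db, ids in retrieved_by_db.items():
    missed = reference_set - ids
    print(db, sensitivity(ids, reference_set), precision(ids, reference_set),
          sorted(missed), sorted(unique_records(retrieved_by_db, reference_set, db)))
```

In this toy case neither database reaches full sensitivity on its own, which mirrors the kind of comparison the study reports.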
Four hundred systematic reviews were identified for inclusion from 11,381 citations retrieved from seven databases. No single database identified all the retrieved systematic reviews for hypertension. EMBASE identified the most reviews (sensitivity 69 %) but also retrieved the most irrelevant citations with 7.2 % precision (Pr). The sensitivity of the Cochrane library was 60 %, DARE 57 %, MEDLINE 57 %, PubMed Health 53 %, Epistemonikos 49 % and TRIP 33 %. EMBASE contained the highest number of unique records (n = 43). The Cochrane library identified seven unique records and had the highest precision (Pr = 30 %), followed by Epistemonikos (n = 2, Pr = 19 %). No unique records were found in PubMed Health (Pr = 24 %), DARE (Pr = 21 %), TRIP (Pr = 10 %) or MEDLINE (Pr = 10 %). Searching EMBASE and the Cochrane library identified 88 % of all systematic reviews in the reference set, and searching the freely available databases (Cochrane, Epistemonikos, MEDLINE) identified 83 % of all the reviews.
The databases were re-analysed after systematic reviews of non-conventional interventions (e.g. yoga, acupuncture) were removed. Similarly, no database identified all the retrieved systematic reviews. EMBASE identified the most relevant systematic reviews (sensitivity 73 %) but also retrieved the most irrelevant citations with Pr = 5 %. The sensitivity of the Cochrane database was 62 %, followed by MEDLINE (60 %), DARE (55 %), PubMed Health (54 %), Epistemonikos (50 %) and TRIP (31 %). The precision of the Cochrane library was the highest (20 %), followed by PubMed Health (Pr = 16 %), DARE (Pr = 13 %), Epistemonikos (Pr = 12 %), MEDLINE (Pr = 6 %), TRIP (Pr = 6 %) and EMBASE (Pr = 5 %). EMBASE contained the most unique records (n = 34). The Cochrane library identified seven unique records. The other databases held no unique records.
The coverage of bibliographic databases varies considerably due to differences in their scope and content. Researchers wishing to identify systematic reviews should not rely on one database but search multiple databases.
PMCID: PMC4748526  PMID: 26862061
EMBASE; MEDLINE; PubMed Health; TRIP; DARE; Epistemonikos; The Cochrane library; Systematic review; Evaluation
13.  Digitising legacy zoological taxonomic literature: Processes, products and using the output 
ZooKeys  2016;189-206.
By digitising legacy taxonomic literature using XML mark-up the contents become accessible to other taxonomic and nomenclatural information systems. Appropriate schemas need to be interoperable with other sectorial schemas, atomise to appropriate content elements and carry appropriate metadata to, for example, enable algorithmic assessment of availability of a name under the Code. Legacy (and new) literature delivered in this fashion will become part of a global taxonomic resource from which users can extract tailored content to meet their particular needs, be they nomenclatural, taxonomic, faunistic or other.
To date, most digitisation of taxonomic literature has led to a more or less simple digital copy of a paper original – the output of the many efforts has effectively been an electronic copy of a traditional library. While this has increased accessibility of publications through internet access, the means by which many scientific papers are indexed and located is much the same as with traditional libraries. OCR and born-digital papers allow use of web search engines to locate instances of taxon names and other terms, but OCR efficiency in recognising taxonomic names is still relatively poor, people’s ability to use search engines effectively is mixed, and many papers cannot be searched directly. Instead of building digital analogues of traditional publications, we should consider what properties we require of future taxonomic information access. Ideally the content of each new digital publication should be accessible in the context of all previous published data, and the user able to retrieve nomenclatural, taxonomic and other data / information in the form required without having to scan all of the original papers and extract target content manually. This opens the door to dynamic linking of new content with extant systems: automatic population and updating of taxonomic catalogues, ZooBank and faunal lists, all descriptions of a taxon and its children instantly accessible with a single search, comparison of classifications used in different publications, and so on. A means to do this is through marking up content into XML, and the more atomised the mark-up the greater the possibilities for data retrieval and integration. Mark-up requires XML that accommodates the required content elements and is interoperable with other XML schemas, and there are now several written to do this, particularly TaxPub, taxonX and taXMLit, the last of these being the most atomised. We now need to automate this process as far as possible. 
Manual and automatic data and information retrieval is demonstrated by projects such as INOTAXA and Plazi. As we move to creating and using taxonomic products through the power of the internet, we need to ensure the output, while satisfying in its production the requirements of the Code, is fit for purpose in the future.
PMCID: PMC4741221  PMID: 26877659
XML; taxonomy; digitisation; nomenclature; legacy literature; zoology; botany
14.  A Day in the Life of PubMed: Analysis of a Typical Day’s Query Log 
To characterize PubMed usage over a typical day and compare it to previous studies of user behavior on Web search engines.
We performed a lexical and semantic analysis of 2,689,166 queries issued on PubMed over 24 consecutive hours on a typical day.
We measured the number of queries, number of distinct users, queries per user, terms per query, common terms, Boolean operator use, common phrases, result set size, MeSH categories, used semantic measurements to group queries into sessions, and studied the addition and removal of terms from consecutive queries to gauge search strategies.
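A few of these measurements can be sketched in a handful of lines. This is only an illustration under stated assumptions: queries are split on whitespace, only the uppercase tokens AND, OR and NOT count as Boolean operators, and the sample log is invented; the study's own tokenization and session grouping are not described here and are not reproduced.

```python
from collections import Counter
from typing import Dict, List

# Assumption: only uppercase AND/OR/NOT are treated as Boolean operators.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def query_log_stats(queries: List[str]) -> Dict[str, object]:
    """Compute simple lexical statistics over a list of query strings."""
    term_counts = []
    boolean_queries = 0
    term_freq: Counter = Counter()
    for query in queries:
        tokens = query.split()
        terms = [t.lower() for t in tokens if t not in BOOLEAN_OPERATORS]
        if len(terms) < len(tokens):  # at least one operator was present
            boolean_queries += 1
        term_counts.append(len(terms))
        term_freq.update(terms)
    return {
        "queries": len(queries),
        "mean_terms_per_query": sum(term_counts) / len(queries),
        "boolean_share": boolean_queries / len(queries),
        "common_terms": term_freq.most_common(3),
    }


# Hypothetical sample log.
sample_log = ["asthma AND children", "breast cancer", "asthma"]
stats = query_log_stats(sample_log)
print(stats)
```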
The size of the result sets from a sample of queries showed a bimodal distribution, with peaks at approximately 3 and 100 results, suggesting that a large group of queries was tightly focused and another was broad. Like Web search engine sessions, most PubMed sessions consisted of a single query. However, PubMed queries contained more terms.
PubMed’s usage profile should be considered when educating users, building user interfaces, and developing future biomedical information retrieval systems.
PMCID: PMC2213463  PMID: 17213501
15.  A Search Engine to Access PubMed Monolingual Subsets: Proof of Concept and Evaluation in French 
PubMed contains numerous articles in languages other than English. However, existing solutions to access these articles in the language in which they were written remain unconvincing.
The aim of this study was to propose a practical search engine, called Multilingual PubMed, which will permit access to a PubMed subset in 1 language and to evaluate the precision and coverage for the French version (Multilingual PubMed-French).
To create this tool, translations of MeSH were enriched (eg, adding synonyms and translations in French) and integrated into a terminology portal. PubMed subsets in several European languages were also added to our database using a dedicated parser. The response time for the generic semantic search engine was evaluated for simple queries. BabelMeSH, Multilingual PubMed-French, and 3 different PubMed strategies were compared by searching for literature in French. Precision and coverage were measured for 20 randomly selected queries. The results were evaluated as relevant to title and abstract, the evaluator being blind to search strategy.
More than 650,000 PubMed citations in French were integrated into the Multilingual PubMed-French information system. The response times were all below the threshold defined for usability (2 seconds). Two search strategies (Multilingual PubMed-French and 1 PubMed strategy) showed high precision (0.93 and 0.97, respectively), but coverage was 4 times higher for Multilingual PubMed-French.
It is now possible to freely access biomedical literature using a practical search tool in French. This tool will be of particular interest for health professionals and other end users who do not read or query sufficiently in English. The information system is theoretically well suited to expand the approach to other European languages, such as German, Spanish, Norwegian, and Portuguese.
PMCID: PMC4275477  PMID: 25448528
databases, bibliographic; French language; information storage and retrieval; PubMed; user-computer interface; search engine
16.  Minimally invasive surgical procedures for the treatment of lumbar disc herniation 
In up to 30% of patients undergoing lumbar disc surgery for herniated or protruded discs outcomes are judged unfavourable. Over the last decades this problem has stimulated the development of a number of minimally-invasive operative procedures. The aim is to relieve pressure from compromised nerve roots by mechanically removing, dissolving or evaporating disc material while leaving bony structures and surrounding tissues as intact as possible. In Germany, there is hardly any utilisation data for these new procedures – data files from the statutory health insurances demonstrate that about 5% of all lumbar disc surgeries are performed using minimally-invasive techniques. Their real proportion is thought to be much higher because many procedures are offered by private hospitals and surgeries and are paid by private health insurers or patients themselves. So far no comprehensive assessment comparing efficacy, safety, effectiveness and cost-effectiveness of minimally-invasive lumbar disc surgery to standard procedures (microdiscectomy, open discectomy) which could serve as a basis for coverage decisions, has been published in Germany.
Against this background the aim of the following assessment is:
To assess, on the basis of published scientific literature, the safety, efficacy and effectiveness of minimally-invasive lumbar disc surgery compared to standard procedures. To identify and critically appraise studies comparing the costs and cost-effectiveness of minimally-invasive procedures to those of standard procedures. If necessary, to identify research and evaluation needs and point out regulative needs within the German health care system. The assessment focusses on procedures that are used in elective lumbar disc surgery as alternative treatment options to microdiscectomy or open discectomy. Chemonucleolysis, percutaneous manual discectomy, automated percutaneous lumbar discectomy, laserdiscectomy and endoscopic procedures accessing the disc by a posterolateral or posterior approach are included.
In order to assess safety, efficacy and effectiveness of minimally-invasive procedures as well as their economic implications systematic reviews of the literature are performed. A comprehensive search strategy is composed to search 23 electronic databases, among them MEDLINE, EMBASE and the Cochrane Library. Methodological quality of systematic reviews, HTA reports and primary research is assessed using checklists of the German Scientific Working Group for Health Technology Assessment. Quality and transparency of cost analyses are documented using the quality and transparency catalogues of the working group. Study results are summarised in a qualitative manner. Due to the limited number and the low methodological quality of the studies it is not possible to conduct metaanalyses. In addition to the results of controlled trials results of recent case series are introduced and discussed.
The evidence-base to assess safety, efficacy and effectiveness of minimally-invasive lumbar disc surgery procedures is rather limited:
Percutaneous manual discectomy: Six case series (four after 1998)
Automated percutaneous lumbar discectomy: Two RCT (one discontinued), twelve case series (one after 1998)
Chemonucleolysis: Five RCT, five non-randomised controlled trials, eleven case series
Percutaneous laserdiscectomy: One non-randomised controlled trial, 13 case series (eight after 1998)
Endoscopic procedures: Three RCT, 21 case series (17 after 1998)
There are two economic analyses each retrieved for chemonucleolysis and automated percutaneous discectomy as well as one cost-minimisation analysis comparing costs of an endoscopic procedure to costs for open discectomy.
Among all minimally-invasive procedures chemonucleolysis is the only one whose efficacy may be judged on the basis of results from high quality randomised controlled trials (RCT). Study results suggest that the procedure may be (cost-)effectively used as an intermediate therapeutical option between conservative and operative management of small lumbar disc herniations or protrusions causing sciatica. Two RCT comparing transforaminal endoscopic procedures with microdiscectomy in patients with sciatica and small non-sequestered disc herniations show comparable short and medium term overall success rates. Concerning speed of recovery and return to work a trend towards more favourable results for the endoscopic procedures is noted. It is doubtful, though, whether these results from studies that are eleven and five years old are still valid for the more advanced procedures used today. The only RCT comparing the results of automated percutaneous lumbar discectomy to those of microdiscectomy showed clearly superior results of microdiscectomy. Furthermore, success rates of automated percutaneous lumbar discectomy reported in the RCT (29%) differ extremely from success rates reported in case series (between 56% and 92%).
The literature search retrieves no controlled trials to assess efficacy and/or effectiveness of laser-discectomy, percutaneous manual discectomy or endoscopic procedures using a posterior approach in comparison to the standard procedures. Results from recent case series permit no assessment of efficacy, especially not in comparison to standard procedures. Due to highly selected patients, modifications of operative procedures, highly specialised surgical units and poorly standardised outcome assessment, results of case series are highly variable and their generalisability is low.
The results of the five economic analyses are, due to conceptual and methodological problems, of no value for decision-making in the context of the German health care system.
Aside from low methodological study quality, three conceptual problems complicate the interpretation of results.
First, continuous further development of technologies leads to a diversity of procedures in use which prohibits generalisation of study results. However, diversity is noted not only for minimally-invasive procedures but also for the standard techniques against which the new developments are to be compared. The second problem refers to the heterogeneity of study populations. For most studies one common inclusion criterion was "persisting sciatica after a course of conservative treatment of variable duration". Differences among study populations are noted concerning results of imaging studies. Even within every group of minimally-invasive procedure, studies define their own in- and exclusion criteria which differ concerning degree of dislocation and sequestration of disc material. The third problem is the non-standardised assessment of outcomes, which is performed postoperatively after variable periods of time. Most studies report results in a dichotomous way as success or failure while the classification of a result is performed using a variety of different assessment instruments or procedures. Very often the global subjective judgement of results by patients or surgeons is reported. There are no scientific discussions whether these judgements are generalisable or comparable, especially among studies that are conducted under differing socio-cultural conditions. Taking into account the weak evidence-base for efficacy and effectiveness of minimally-invasive procedures it is not surprising that so far there are no dependable economic analyses.
Conclusions that can be drawn from the results of the present assessment refer in detail to the specified minimally-invasive procedures of lumbar disc surgery but they may also be considered exemplary for other fields where optimisation of results is attempted by technological development and widening of indications (e.g. total hip replacement).
Compared to standard technologies (open discectomy, microdiscectomy) and with the exception of chemonucleolysis, the developmental status of all other minimally-invasive procedures assessed must be termed experimental. To date there is no dependable evidence-base to recommend their use in routine clinical practice. To create such a dependable evidence-base further research in two directions is needed: a) The studies need to include adequate patient populations, use realistic controls (e.g. standard operative procedures or continued conservative care) and use standardised measurements of meaningful outcomes after adequate periods of time. b) Studies that are able to report effectiveness of the procedures under everyday practice conditions and furthermore have the potential to detect rare adverse effects are needed. In Sweden this type of data is yielded by national quality registries. On the one hand their data are used for quality improvement measures and on the other hand they allow comprehensive scientific evaluations. Since 2000 a continuous rise in utilisation of minimally-invasive lumbar disc surgery is observed among statutory health insurers. Examples from other areas of innovative surgical technologies (e.g. robot assisted total hip replacement) indicate that the rise will probably continue, especially because there are no legal barriers to hinder introduction of innovative treatments into routine hospital care. Upon request by payers or providers the "Gemeinsamer Bundesausschuss" may assess a treatment's benefit, its necessity and cost-effectiveness as a prerequisite for coverage by the statutory health insurance. In the case of minimally-invasive disc surgery it would be advisable to examine the legal framework for covering procedures only if they are provided under evaluation conditions.
While in Germany coverage under evaluation conditions is established practice in ambulatory health care only ("Modellvorhaben"), examples from other European countries (Great Britain, Switzerland) demonstrate that it is also feasible for hospital based interventions. In order to assure patient protection and at the same time not hinder the further development of new and promising technologies, provision under evaluation conditions could also be realised in the private health care market, although in this sector coverage is not by law linked to benefit, necessity and cost-effectiveness of an intervention.
PMCID: PMC3011322  PMID: 21289928
17.  SeqHound: biological sequence and structure database as a platform for bioinformatics research 
BMC Bioinformatics  2002;3:32.
SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment.
SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries.
The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site in the SLRI Toolkit.
PMCID: PMC138791  PMID: 12401134
sequence database; structure database; local database resource
18.  NEMO: Extraction and normalization of organization names from PubMed affiliation strings 
Background: We are witnessing an exponential increase in biomedical research citations in PubMed. However, translating biomedical discoveries into practical treatments is estimated to take around 17 years, according to the 2000 Yearbook of Medical Informatics, and much information is lost during this transition. Pharmaceutical companies spend huge sums to identify opinion leaders and centers of excellence. Conventional methods such as literature search, survey, observation, self-identification, expert opinion, and sociometry not only need much human effort, but are also noncomprehensive. Such huge delays and costs can be reduced by “connecting those who produce the knowledge with those who apply it”. A humble step in this direction is large scale discovery of persons and organizations involved in specific areas of research. This can be achieved by automatically extracting and disambiguating author names and affiliation strings retrieved through Medical Subject Heading (MeSH) terms and other keywords associated with articles in PubMed. In this study, we propose NEMO (Normalization Engine for Matching Organizations), a system for extracting organization names from the affiliation strings provided in PubMed abstracts, building a thesaurus (list of synonyms) of organization names, and subsequently normalizing them to a canonical organization name using the thesaurus.
Results: We used a parsing process that involves multi-layered rule matching with multiple dictionaries. The normalization process involves clustering based on weighted local sequence alignment metrics to address synonymy at word level, and local learning based on finding connected components to address synonymy. The graphical user interface and java client library of NEMO are available at .
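The idea of grouping name variants by finding connected components can be sketched as follows. This is not NEMO's method: the similarity measure here is Python's `difflib.SequenceMatcher` ratio, swapped in as a simple stand-in for the weighted local sequence alignment metrics the abstract mentions, and the 0.8 threshold and the sample names are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Stand-in similarity test (difflib ratio, not a local alignment score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_names(names: List[str], threshold: float = 0.8) -> List[List[str]]:
    """Group names into connected components of the 'similar' relation."""
    parent = list(range(len(names)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similar(names[i], names[j], threshold):
                parent[find(i)] = find(j)  # merge the two components

    groups: Dict[int, List[str]] = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical affiliation name variants.
variants = [
    "University of Washington",
    "Univ of Washington",
    "Harvard Medical School",
    "Harvard Med School",
]
clusters = cluster_names(variants)
print(clusters)
```

Each resulting group could then be mapped to one canonical name, for example its longest member, which is again only an illustrative choice.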
Conclusion: NEMO is developed to associate each biomedical paper and its authors with a unique organization name and the geopolitical location of that organization. This system provides more accurate information about organizations than the raw affiliation strings provided in PubMed abstracts. It can be used for: a) bimodal social network analysis that evaluates the research relationships between individual researchers and their institutions; b) improving author name disambiguation; c) augmenting National Library of Medicine (NLM)’s Medical Articles Record System (MARS) for correcting errors due to OCR on affiliation strings that are in small fonts; and d) improving PubMed citation indexing strategies (authority control) based on normalized organization name and country.
PMCID: PMC2990275  PMID: 20922666
19.  A Study on Pubmed Search Tag Usage Pattern: Association Rule Mining of a Full-day Pubmed Query Log 
The practice of evidence-based medicine requires efficient biomedical literature search such as PubMed/MEDLINE. Retrieval performance relies highly on the efficient use of search field tags. The purpose of this study was to analyze PubMed log data in order to understand the usage pattern of search tags by the end user in PubMed/MEDLINE search.
A PubMed query log file was obtained from the National Library of Medicine containing anonymous user identification, timestamp, and query text. Inconsistent records were removed from the dataset and the search tags were extracted from the query texts. A total of 2,917,159 queries were selected for this study issued by a total of 613,061 users. The analysis of frequent co-occurrences and usage patterns of the search tags was conducted using an association mining algorithm.
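The tag extraction step can be sketched as below. The sketch assumes that search tags appear as bracketed suffixes such as `[au]` or `[ti]`, and it counts pairwise tag co-occurrences as a simplified stand-in for the association mining algorithm, which the abstract does not specify. The sample queries are invented.

```python
import re
from collections import Counter
from itertools import combinations
from typing import List, Tuple

# Assumption: a search tag is any bracketed token, e.g. "smith j[au]".
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_tags(query: str) -> List[str]:
    """Return the lowercased search tags found in one query."""
    return [tag.strip().lower() for tag in TAG_PATTERN.findall(query)]


def tag_usage(queries: List[str]) -> Tuple[float, float, Counter]:
    """Share of queries with >=1 tag, share with >=2 tags, and tag-pair counts."""
    tagged = 0
    multi = 0
    pair_counts: Counter = Counter()
    for query in queries:
        tags = extract_tags(query)
        if tags:
            tagged += 1
        if len(tags) >= 2:
            multi += 1
        # Count each unordered pair of distinct tags once per query.
        pair_counts.update(combinations(sorted(set(tags)), 2))
    return tagged / len(queries), multi / len(queries), pair_counts


# Hypothetical sample queries.
sample_queries = [
    "smith j[au] AND asthma[ti]",
    "asthma",
    "lancet[ta]",
    "diabetes treatment",
]
tagged_share, multi_share, pairs = tag_usage(sample_queries)
print(tagged_share, multi_share, pairs)
```

Dividing a pair count by the number of queries gives the pair's support, the basic quantity that association rule mining builds on.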
The percentage of search tag usage was low (11.38% of the total queries) and only 2.95% of queries contained two or more tags. Three out of four users used no search tag and about two-thirds of them issued fewer than four queries. Among the queries containing at least one tagged search term, the average number of search tags was almost half of the number of total search terms. Navigational search tags were more frequently used than informational search tags. While no strong association was observed between informational and navigational tags, six (out of 19) informational tags and six (out of 29) navigational tags showed strong associations in PubMed searches.
The low percentage of search tag usage implies that PubMed/MEDLINE users do not widely utilize the features of PubMed/MEDLINE, are not aware of such features, or depend solely on the high-recall-focused query translation by PubMed’s Automatic Term Mapping. Users need further education and interactive search applications for effective use of the search tags in order to fulfill their biomedical information needs from PubMed/MEDLINE.
PMCID: PMC3552776  PMID: 23302604
20.  Design and utilization of the colorectal and pancreatic neoplasm virtual biorepository: An early detection research network initiative 
The Early Detection Research Network (EDRN) colorectal and pancreatic neoplasm virtual biorepository is a bioinformatics-driven system that provides high-quality clinicopathology-rich information for clinical biospecimens. This NCI-sponsored EDRN resource supports translational cancer research. The information model of this biorepository is based on three components: (a) development of common data elements (CDE), (b) a robust data entry tool and (c) comprehensive data query tools.
The aim of the EDRN initiative is to develop and sustain a virtual biorepository for support of translational research. High-quality biospecimens were accrued and annotated with pertinent clinical, epidemiologic, molecular and genomic information. A user-friendly annotation tool and a query tool were developed for this purpose. The system has the following components. The CDEs are developed from the College of American Pathologists (CAP) Cancer Checklists and North American Association of Central Cancer Registries (NAACCR) standards. The CDEs provide semantic and syntactic interoperability of the data sets by describing them in the form of metadata or data descriptors. The data entry tool is a portable and flexible Oracle-based, web-based application that is easily mastered. The data query tool enables investigators to search deidentified information within the warehouse through a “point and click” interface, so that only the selected data elements are essentially copied from the warehouse’s relational structure into a data mart with a dimensionally modeled structure.
The EDRN Colorectal and Pancreatic Neoplasm Virtual Biorepository database contains multimodal datasets that are available to investigators via a web-based query tool. At present, the database holds 2,405 cases and 2,068 tumor accessions. The data disclosure is strictly regulated by user’s authorization. The high-quality and well-characterized biospecimens have been used in different translational science research projects as well as to further various epidemiologic and genomics studies.
The EDRN Colorectal and Pancreatic Neoplasm Virtual Biorepository with a tangible translational biomedical informatics infrastructure facilitates translational research. The data query tool acts as a central source and provides a mechanism for researchers to efficiently query clinically annotated datasets and biospecimens that are pertinent to their research areas. The tool ensures patient health information protection by disclosing only deidentified data with Institutional Review Board and Health Insurance Portability and Accountability Act protocols.
PMCID: PMC2956178  PMID: 21031013
Colorectal and pancreatic neoplasm; tissue banking informatics
21.  Personalized online information search and visualization 
The rapid growth of online publications such as Medline and other sources raises the question of how to get the relevant information efficiently. It is important for a bench scientist, for example, to monitor related publications constantly. It is also important for a clinician, for example, to access patient records anywhere and anytime. Although time-consuming, this kind of search procedure is usually similar and simple. It typically involves a search engine and a visualization interface. Different words or combinations of words reflect different research topics. The objective of this study is to automate this tedious procedure by recording those words/terms and online sources in a database and using the information for automated search and retrieval. The retrieved information will be available anytime and anywhere through a secure web server.
We developed such a database, which stored search terms, journals and so on, and implemented a piece of software for automatically searching medical subject heading-indexed sources such as Medline and other online sources. The returned information was stored locally, as is, on a server and was visible through a Web-based interface. The search was performed daily or as otherwise scheduled, and users could log on to the website at any time without typing any words. The system has the potential to retrieve information similarly from non-medical subject heading-indexed literature or a privileged information source such as a clinical information system. Issues such as security, presentation and visualization of the retrieved information were thus addressed. One presentation issue, wireless access, was also tested. A user survey showed that the personalized online searches saved time and increased relevancy. Handheld devices could also be used to access the stored information, but were less satisfactory.
The Web-searching software or similar system has potential to be an efficient tool for both bench scientists and clinicians for their daily information needs.
PMCID: PMC1079857  PMID: 15766382
22.  User centered and ontology based information retrieval system for life sciences 
BMC Bioinformatics  2012;13(Suppl 1):S4.
Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings is the basis of the biomedical publication indexing and information retrieval process proposed by PubMed. However, current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources and no explanation for their adequacy to the query is provided. Users may thus be confused by the selection and have no idea how to adapt their queries so that the results match their expectations.
This paper describes an information retrieval system that relies on a domain ontology to widen the set of relevant documents that is retrieved and that uses a graphical rendering of query results to favor user interactions. Semantic proximities between ontology concepts and aggregating models are used to assess documents' adequacy with respect to a query. The selection of documents is displayed in a semantic map to provide graphical indications that make explicit to what extent they match the user's query; this man/machine interface favors a more interactive and iterative exploration of the data corpus, by facilitating query concept weighting and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies, one of which aims at collecting human genes related to transcription factors involved in the hemopoiesis pathway.
The ontology-based information retrieval system described in this paper (OBIRS) is freely available at: This environment is a first step towards a user-centred application in which the system highlights relevant information to support decision making.
PMCID: PMC3434427  PMID: 22373375
23.  Concept-based query expansion for retrieving gene related publications from MEDLINE 
BMC Bioinformatics  2010;11:212.
Advances in biotechnology and in high-throughput methods for gene analysis have contributed to an exponential increase in the number of scientific publications in these fields of study. While much of the data and results described in these articles are entered and annotated in the various existing biomedical databases, the scientific literature is still the major source of information. There is, therefore, a growing need for text mining and information retrieval tools to help researchers find the relevant articles for their study. To tackle this, several tools have been proposed to provide alternative solutions for specific user requests.
This paper presents QuExT, a new PubMed-based document retrieval and prioritization tool that, from a given list of genes, searches for the most relevant results from the literature. QuExT follows a concept-oriented query expansion methodology to find documents containing concepts related to the genes in the user input, such as protein and pathway names. The retrieved documents are ranked according to user-definable weights assigned to each concept class. By changing these weights, users can modify the ranking of the results in order to focus on documents dealing with a specific concept. The method's performance was evaluated using data from the 2004 TREC genomics track, producing a mean average precision of 0.425, with an average of 4.8 and 31.3 relevant documents within the top 10 and 100 retrieved abstracts, respectively.
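The effect of user-definable weights on the ranking can be illustrated with a small sketch. This is not QuExT's implementation: the concept class names, the per-document match counts, and the scoring rule (a weighted sum of concept matches) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical data: number of matched concepts per document, by concept class.
ConceptCounts = Dict[str, int]


def score(counts: ConceptCounts, weights: Dict[str, float]) -> float:
    """Weighted sum of concept matches; a class with no weight contributes 0."""
    return sum(weights.get(cls, 0.0) * n for cls, n in counts.items())


def rank(docs: Dict[str, ConceptCounts],
         weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs, highest score first, ties broken by id."""
    scored = [(doc_id, score(counts, weights)) for doc_id, counts in docs.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy documents (assumed, not taken from the paper).
docs = {
    "doc_a": {"gene": 3, "protein": 0, "pathway": 1},
    "doc_b": {"gene": 1, "protein": 2, "pathway": 4},
    "doc_c": {"gene": 2, "protein": 2, "pathway": 0},
}

# Two weight profiles: one favoring gene mentions, one favoring pathways.
gene_focus = rank(docs, {"gene": 1.0, "protein": 0.2, "pathway": 0.2})
pathway_focus = rank(docs, {"gene": 0.2, "protein": 0.2, "pathway": 1.0})

print([doc_id for doc_id, _ in gene_focus])
print([doc_id for doc_id, _ in pathway_focus])
```

With the same three documents, changing only the weights reorders the list, which is the kind of user control over ranking that the abstract describes.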
QuExT implements a concept-based query expansion scheme that leverages gene-related information available on a variety of biological resources. The main advantage of the system is to give the user control over the ranking of the results by means of a simple weighting scheme. Using this approach, researchers can effortlessly explore the literature regarding a group of genes and focus on the different aspects relating to these genes.
PMCID: PMC2873540  PMID: 20426836
24.  Evidence based medicine in clinical practice: how to advise patients on the influence of age on the outcome of surgical anterior cruciate ligament reconstruction: a review of the literature 
Objective: To determine, using a literature search, whether patient age influences the outcome of surgical reconstruction of a torn anterior cruciate ligament.
Methods: Medline (1966 to present) was searched using the PubMed interface, Embase (1974 to present) using the Datastar system, and the Cochrane Library at the Update Software web site. Papers retrieved from the three databases were independently assessed by two reviewers using preliminary inclusion criteria. Reference lists of papers satisfying the preliminary criteria were then scanned and appropriate papers reviewed. Any new papers in turn had their reference lists scanned, this process continuing until no new papers were identified. Final inclusion criteria were then applied to all papers satisfying the preliminary inclusion criteria.
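The iterative scanning of reference lists described in the Methods can be sketched as a loop that runs until a round adds nothing new. The function name, the citation map, and the `meets_criteria` predicate are hypothetical stand-ins for the reviewers' manual assessment; this is an illustration of the procedure, not the authors' tooling.

```python
from typing import Callable, Dict, Iterable, List, Set


def snowball(seeds: Iterable[str],
             references: Dict[str, List[str]],
             meets_criteria: Callable[[str], bool]) -> Set[str]:
    """Scan reference lists round by round until no new papers are identified."""
    included: Set[str] = set(seeds)
    frontier: Set[str] = set(seeds)  # papers whose reference lists are unscanned
    while frontier:
        found: Set[str] = set()
        for paper in frontier:
            for ref in references.get(paper, []):
                if ref not in included and meets_criteria(ref):
                    found.add(ref)
        included |= found
        frontier = found  # only newly added papers are scanned next round
    return included


# Toy citation map (assumed): papers named "r*" pass the preliminary criteria.
references = {
    "p1": ["r1", "r2"],
    "p2": ["r2", "x1"],
    "r1": ["r3", "p1"],
    "r3": [],
}
dataset = snowball(["p1", "p2"], references, lambda p: p.startswith("r"))
print(sorted(dataset))
```

The loop terminates because each round scans only papers added in the previous round, and a paper is never added twice.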
Results: The initial search identified 661 papers. Exclusion of duplicates produced 536 unique papers. Medline contained 445, Embase 185, and the Cochrane Library 31. Of the 536, 523 were assessed by abstract and 12 by full text; one paper was not retrieved. Application of the preliminary inclusion criteria produced 33 papers. Their reference lists contained 950 references. Scanning of these added six new papers to the dataset. These six had their reference lists assessed; no new papers were identified. Four of the 39 papers in the completed dataset satisfied the final inclusion criteria. There was wide variation in the total number of subjects in the four studies, ranging from 22 to 203 patients. The total number of different outcome measures was 17; only one measure was used by all four studies. None of the objective outcome measures showed any significant difference between age groups, and the subjective measures, which did show differences, were contradictory. A total of 108 interlibrary loans were requested by a full-time researcher, at a total cost of IR£432.00 over a 10-week period.
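The figures reported in the Results are internally consistent, which a few lines of arithmetic can confirm. The variable names are illustrative; the numbers are taken from the abstract, and the derived values (duplicates removed, cost per loan, inclusion rate) are computed here rather than reported by the authors.

```python
# Hits per database, as reported; they sum to the initial search total.
per_database = {"Medline": 445, "Embase": 185, "Cochrane Library": 31}
initial_hits = sum(per_database.values())

unique_papers = 536
duplicates_removed = initial_hits - unique_papers

# How the unique papers were handled.
assessed = {"abstract": 523, "full text": 12, "not retrieved": 1}

# Preliminary set plus papers added by scanning reference lists.
completed_dataset = 33 + 6
final_included = 4
inclusion_rate = final_included / completed_dataset

# Interlibrary loan cost in Irish pounds.
cost_per_loan = 432.00 / 108

print(initial_hits, duplicates_removed, completed_dataset)
print(round(inclusion_rate * 100, 1), cost_per_loan)
```

Derived from the reported numbers, 125 duplicates were removed, about 10% of the completed dataset met the final criteria, and each interlibrary loan cost IR£4.00 on average.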
Conclusions: When advising patients on the outcome of anterior cruciate ligament reconstruction, age should not be considered in isolation. In the absence of relevant guidelines, meta-analyses, or systematic reviews, the application of evidence based medicine to clinical practice has significant resource implications.
PMCID: PMC1724514  PMID: 12055115
25.  SEACOIN – An Investigative Tool for Biomedical Informatics Researchers 
Peer-reviewed scientific literature is a prime source for accessing knowledge in the biomedical field. Its rapid growth and diverse domain coverage require systematic efforts in developing interactive tools for efficiently searching and summarizing current advances for acquiring knowledge and referencing, and for furthering scientific discovery. Although information retrieval systems exist, the conventional tools and systems remain difficult for biomedical investigators to use. There remain gaps even in the state-of-the-art systems as little attention has been devoted to understanding the needs of biomedical researchers.
Our work attempts to bridge the gap between the needs of biomedical users and systems design efforts. We first study the needs of users and then design a simple visual analytic application tool, SEACOIN. A key motivation stems from biomedical researchers’ request for a “simple interface” that is suitable for novice users in information technology. The system minimizes information overload, and allows users to search easily even in time-constrained situations. Users can manipulate the depth of information according to the purpose of usage. SEACOIN enables interactive exploration and filtering of search results via “metamorphose topological visualization” and “tag cloud,” visualization tools that are commonly used in social network sites. We illustrate SEACOIN’s usage through applications on PubMed publications on heart disease, cancer, Alzheimer’s disease, diabetes, and asthma.
PMCID: PMC3243266  PMID: 22195132
