Failure to identify relevant information in systematic reviews can result in bias [9]. The importance of searching sources of data beyond electronic databases in general, and MEDLINE in particular, has been documented, especially for clinical or randomised controlled trials [1]. Search strategies for systematic reviews of observational data on morbidities, on the other hand, are less precise, are more difficult to focus narrowly and have been studied to a lesser extent [12]. This led us to use an extensive search strategy for our systematic review, one that is highly sensitive but has low precision. Of the 64098 citations identified, only 2580 were included, which represents 4% of the scrutinized articles.
Although MEDLINE identified about 62% of all citations and 76% of electronic citations relevant for this review, sources of data other than the major electronic databases are confirmed to be crucial. Some 487 citations (about one fifth) were identified through reference lists of articles, expert contacts, congress proceedings, abstract books, hand searching of journals that are available in libraries but not indexed in electronic databases, and other emerging databases in developing countries. As expected, there was a large overlap between databases: 60% of citations were identified by two or more databases and about 44% were identified by both MEDLINE and EMBASE. These two databases also provided the largest number of unique citations and both are considered necessary. PAIS International and Econlit identified only 3 and 1 citations, respectively, that were not identified by any other database, and they could probably be disregarded in future reviews.
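The yield figures discussed above (share of included citations, unique citations per source, overlap between sources) can be derived from simple set operations on citation identifiers. The following is a minimal sketch in Python, not the procedure used in this review: the function names `unique_and_overlap` and `percent` and the toy identifiers are hypothetical, and only the totals 64098, 2580 and 487 are taken from the text.

```python
from collections import Counter


def unique_and_overlap(included_by_source):
    """Summarise yield from a dict mapping source name -> set of citation IDs.

    Returns (unique, overlap) where `unique` maps each source to the number
    of citations found by that source alone, and `overlap` is the fraction
    of distinct citations found by two or more sources.
    """
    # Count how many sources retrieved each citation.
    hits = Counter()
    for ids in included_by_source.values():
        hits.update(ids)

    unique = {
        name: sum(1 for cid in ids if hits[cid] == 1)
        for name, ids in included_by_source.items()
    }
    total = len(hits)
    shared = sum(1 for n in hits.values() if n >= 2)
    return unique, (shared / total if total else 0.0)


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Toy data (hypothetical IDs, for illustration only).
toy = {
    "MEDLINE": {1, 2, 3, 4},
    "EMBASE": {3, 4, 5},
    "LILACS": {6},
}
unique, overlap = unique_and_overlap(toy)

# Figures reported in the text: 2580 of 64098 citations were included,
# and 487 of the 2580 came from sources other than the major databases.
precision_pct = percent(2580, 64098)
non_database_pct = percent(487, 2580)
```

A source with a unique count close to zero, as reported here for PAIS International and Econlit, contributes little that other sources do not already cover. The denominator chosen for the overlap (all included citations, or electronic citations only) changes the percentage, so it should be stated alongside the result.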
The nature of this systematic review, with its focus on settings where the burden of disease is highest, necessitates extensive searching of developing country sources. However, literature from developing countries is difficult to access and is not well represented in MEDLINE or other well-known electronic databases [13]. An editorial by Zielinski in 1995 stated that only 2% of the journals indexed in MEDLINE or the Science Citation Index were from developing countries [15]. In 2004, the situation was similar: we calculated that about 6% of the journals indexed in MEDLINE were published in developing countries. In 1996, Latin America as a whole accounted for 0.39% of the total number of articles included in MEDLINE, down from a high of 2.03% in 1966 [16]. One reason for this is that journals are indexed on a priority system in which the impact factor of a journal influences its chances of being indexed. This results in country bias, since Western journals generally have higher impact factors and are therefore more likely to be indexed than those from developing countries.
The value of the LILACS database for improving the quality of systematic reviews has been reported previously [17]. Our analysis confirms LILACS as a unique source of information for the Latin America and Caribbean region that is not covered by other databases (117 unique citations included). Unfortunately, specific databases for other less developed regions, such as Asia and Africa, are only just emerging, or their access and functioning are limited (e.g. AIM, IMEMR, IndMED, HELLIS.ORG). Although these regional databases are included in the review, the results are not presented individually but under 'other' in Table . With these databases, we experienced language barriers, difficulty in obtaining abstracts and full-text reports, inconsistencies, missing essential information in citations (e.g. year or title) and other technical problems. We believe that the low number of citations identified by IMEMR (see Table ) is due more to the limitations mentioned above than to a lack of data. These regional databases provide unique relevant citations, but incomplete access limits their usefulness. Strengthening the functionality and improving the search facilities of these databases could provide substantial relevant information.
A limiting factor in identifying citations is the late indexing of journals in electronic databases. Search strategies for this review were run in early 2003 to identify articles published in 2002 or earlier. While it could be expected that a few articles published in 2002 would not yet be in the databases by 2003, some articles published in 1997 were only appearing in the databases as late as 2003. Traditionally, EMBASE has been found to index faster than MEDLINE, which supports the argument for searching multiple databases [18]. Furthermore, each database producer has a particular schedule that the searcher needs to be aware of. For example, MEDLINE available through OVID will cease entry of new citations in November, because of the updating of the MeSH terms by the National Library of Medicine, and the database is only updated in January of the following year. These factors need to be considered when assessing the yield from different databases. It is necessary to determine how to capture these 'late indexed' citations, whether by delaying the running of the search or by building the need to capture them into follow-up studies. The electronic search for this review would probably have captured more citations had it been run in 2004.
This systematic review involved significant financial and human resources over a 3-year period [3]. Screening a large number of citations and retrieving the full text of about 5000 articles have resource implications that need to be balanced against the benefits of the results. For this type of review, decisions on how comprehensive the search strategy should be need to take the resource implications into account. A careful selection of databases and search strategies tailored to each database would help to maximise the benefits relative to the costs.