To analyze the 2007 citation count of articles published by the Croatian Medical Journal in 2005-2006 based on data from the Web of Science, Scopus, and Google Scholar.
Web of Science and Scopus were searched for the articles published in 2005-2006. As all articles returned by Scopus were included in Web of Science, the latter list was the sample for further analysis. Total citation counts for each article on the list were retrieved from Web of Science, Scopus, and Google Scholar. The overlap and unique citations were compared and analyzed. Proportions were compared using χ2-test.
Google Scholar returned the greatest proportion of articles with citations (45%), followed by Scopus (42%), and Web of Science (38%). Almost a half (49%) of articles had no citations and 11% had an equal number of identical citations in all 3 databases. The greatest overlap was found between Web of Science and Scopus (54%), followed by Scopus and Google Scholar (51%), and Web of Science and Google Scholar (44%). The greatest number of unique citations was found by Google Scholar (n=86). The majority of these citations (64%) came from journals, followed by books and PhD theses. Approximately 55% of all citing documents were full-text resources in open access. The language of citing documents was mostly English, but as many as 25 citing documents (29%) were in Chinese.
Google Scholar shares a total of 42% of the citations returned by the two other, more influential, bibliographic resources. The list of unique citations in Google Scholar is predominantly journal based, but these journals are mainly of local character. Citations received from internationally recognized medical journals are crucial for increasing the visibility of small medical journals, but Google Scholar may serve as an alternative bibliometric tool for orientational citation insight.
Despite many controversies, bibliometric scores are widely used in measuring research output and impact at individual and collective levels, as well as in measuring the performance of scientific journals (1). The citation rate of scientific publications has been measured for many years using the citation indexes of the Institute for Scientific Information (now Thomson Reuters) (2). In general medical literature, virtually all citation analysis studies have exclusively used these databases (3). Scholarly communication has been changing rapidly and new means of sharing and archiving results have emerged (digital repositories, open-access journals, etc) (4). Several other citation databases have also become available, including Scopus (5) and Google Scholar (6), both attracting much interest in the academic community. Scopus is a multidisciplinary subscription-based bibliographic resource covering more than 16500 peer-reviewed journals, among them many titles published in less-developed and developing countries and over 1200 open-access journals (7). Falagas et al found, for example, that for citation analysis Scopus offers about 20% more coverage than Web of Science (8). Google Scholar probably has the widest coverage because it indexes traditional scientific literature, as well as preprints, institutional repositories, and conference proceedings, and it is freely available (9). Both databases were introduced in 2004. Many authors have compared various aspects of citation analyses based on the data provided by these three resources (10-13). A recent study has compared the citation counts of articles published in the 3 most prestigious general medical journals using all 3 databases (3).
The Croatian Medical Journal is a general medical journal regularly indexed by both subscription-based bibliographic databases, Web of Science (Science Citation Index Expanded) and Scopus. It is a free-access journal, with full texts available without any restriction via PubMed Central (14), the national open-access repository Hrčak (15), and the journal’s Web site. For a small journal from a small country, citation rate is not only a matter of its scientific visibility, but it could also be a matter of local financial support and manuscript inflow. By exploring the Croatian Medical Journal’s citation profile in all 3 resources, our aim was to find: a) whether the number of citations differed significantly between Web of Science and Scopus, since the latter covers more regional (among them many Croatian medical journals) and open-access journals, b) whether the Google Scholar citation score differed significantly from the numbers returned by Web of Science and Scopus, especially regarding citations deriving from peer-reviewed journals, and c) whether Google Scholar can be a qualitative replacement for expensive databases, especially in low-income communities.
In the Croatian Medical Journal citation analysis, we used the method for calculating the Thomson impact factor (16). We collected two sets of data: a) articles published in 2005-2006 and b) citations they received in 2007.
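The two-year window described here follows the usual impact-factor arithmetic: citations received in the census year by items published in the two preceding years, divided by the number of those items. A minimal sketch in Python follows; the function name and the toy counts are illustrative assumptions, not the study's data.

```python
def two_year_citation_rate(citations_in_census_year: int, items_published: int) -> float:
    """Citations received in the census year per item published in the two prior years."""
    if items_published <= 0:
        raise ValueError("items_published must be positive")
    return citations_in_census_year / items_published


# Toy numbers for illustration only (not taken from the study):
# 50 citations in the census year to 100 items from the two preceding years.
rate = two_year_citation_rate(citations_in_census_year=50, items_published=100)
print(f"{rate:.2f}")
```

The same ratio can be computed separately for each database, which is how per-article averages from different sources become comparable.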
The first-round search was conducted in Web of Science and Scopus by the publication name (Croatian Medical Journal), limited to the 2005-2006 period. Web of Science returned 294 and Scopus 286 articles. The comparison showed that all articles returned by Scopus were included in the Web of Science database. The list of Web of Science-indexed articles (n=294) was our sample for further analysis.
The number of citations was checked in all 3 databases. Every sample item was checked for citations in the Web of Science database, using the “Times Cited” field of the respective bibliographic record. Scopus was searched by the full title of each sample item, and citations were checked using the “Cited By” field of the respective bibliographic record. Google Scholar was searched by the full title of each sample item (taken from the tables of contents on the journal’s Web site), and the citations of all retrieved items were analyzed using the “Cited By” function. If an item was not found by the full title search, then it was retrieved by the author, journal title, and publication year, or title word. If not retrieved at all, it was counted as an article not indexed by Google Scholar.
Three lists of captured citations were then checked for the citations received in 2007. The separation was simple for the Web of Science and Scopus citations. However, many records found by Google Scholar did not contain the date of publication and all of them had to be verified further. The identified citation duplicates (eg, title of cited article in various languages) were eliminated.
The 2007 citations returned by all 3 databases were then analyzed and compared. Unique citations, defined as those retrieved by 1 database only and not by the other 2, were registered, as well as those found in all 3 databases. The overlap between databases was also identified and analyzed. All searches were done manually between January and March 2009.
The citations were sorted as follows (10): 1) overlap between all 3 resources; 2) overlap between Web of Science and Scopus; 3) overlap between Web of Science and Google Scholar; 4) overlap between Scopus and Google Scholar; 5) unique citations from Web of Science; 6) unique citations from Scopus; 7) unique citations from Google Scholar. Proportions were compared using χ2-test.
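The seven-way sorting above can be sketched with set membership. This is a minimal illustration in Python, assuming each citing document has already been reduced to a single deduplicated identifier; the function name, the labels, and the sample identifiers are hypothetical. The pairwise categories here are taken to exclude citations found in all 3 resources, which is an assumption about how the categories were kept disjoint.

```python
from collections import Counter


def classify_citations(wos: set, scopus: set, gs: set) -> Counter:
    """Count citations in each of the seven mutually exclusive categories."""
    labels = {
        (True, True, True): "all three",
        (True, True, False): "Web of Science + Scopus",
        (True, False, True): "Web of Science + Google Scholar",
        (False, True, True): "Scopus + Google Scholar",
        (True, False, False): "Web of Science only",
        (False, True, False): "Scopus only",
        (False, False, True): "Google Scholar only",
    }
    counts = Counter()
    # Every citation in the union belongs to exactly one membership pattern.
    for citation in wos | scopus | gs:
        key = (citation in wos, citation in scopus, citation in gs)
        counts[labels[key]] += 1
    return counts


# Hypothetical citation identifiers, for illustration only.
wos = {"c1", "c2", "c3"}
scopus = {"c1", "c2", "c4"}
gs = {"c1", "c3", "c5", "c6"}

counts = classify_citations(wos, scopus, gs)
for label, n in sorted(counts.items()):
    print(label, n)
```

In practice the hard part is the deduplication step that precedes this, for example matching one citing article recorded under titles in different languages.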
Table 1 presents the number of articles indexed from the Croatian Medical Journal, the total number of citations returned by each of the 3 analyzed databases, and the number of unique citations. The minimal difference in the number of indexed articles was due to different indexing practices for non-research material (book reviews, obituaries, etc.). Google Scholar returned the greatest share of articles with citations (45%), followed by Scopus (42%), and Web of Science (38%). We identified a group of 145 articles (49%) that had no citations, while a group of 32 articles (11%) had an equal number of identical citations in all 3 databases. The difference between Scopus and Google Scholar in the average number of citations per indexed article was minimal (1.01 vs 1.03), while the average number of citations received by a Web of Science-indexed article was 0.80. The greatest number of unique citations was found by Google Scholar.
The analysis of the distribution of unique and overlapping citations as returned by the 3 databases (Figure 1) showed that among 395 citations returned, 166 (42%) were tracked in all 3 of the databases. The greatest overlap was found between Web of Science and Scopus (213 or 54%), followed by Scopus and Google Scholar (202 or 51%) and Web of Science and Google Scholar (175 or 44%). There was a significant difference in unique citations between the 3 databases (χ2=56.5, P<0.001), as well as between the 3 pairs of databases (P<0.05).
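The reported shares can be re-derived from the counts in this paragraph. The short Python sketch below does so, and also shows what the reported χ2 value would imply for P under the assumption of 2 degrees of freedom (three databases compared in a single test). The degrees of freedom are an assumption, since the test layout is not described; for 2 degrees of freedom the upper-tail probability has the closed form exp(-χ2/2).

```python
import math

TOTAL_CITATIONS = 395  # all citations returned, as reported

# Overlap counts as reported in the text.
overlaps = {
    "all three databases": 166,
    "Web of Science and Scopus": 213,
    "Scopus and Google Scholar": 202,
    "Web of Science and Google Scholar": 175,
}

# Share of all citations, rounded to whole percentages.
shares = {name: round(100 * n / TOTAL_CITATIONS) for name, n in overlaps.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")


def chi2_pvalue_df2(statistic: float) -> float:
    """Upper-tail probability of a chi-square statistic with 2 degrees of freedom.

    Assumption: 2 degrees of freedom; the source does not state them.
    """
    return math.exp(-statistic / 2)


p_value = chi2_pvalue_df2(56.5)
print(f"P = {p_value:.1e}")
```

Under that assumption the resulting P is far below the 0.001 threshold quoted in the text.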
We examined only a random selection of the unique citations in Web of Science and Scopus, and it seems that coverage accounts for only a part (though a major one) of the difference found. For example, some of the Scopus unique citations originated from Croatian and regional journals not indexed by Web of Science. However, in Scopus we also found several citations not returned by Web of Science because of an error in the process of citing or linking. Further investigation is needed to reach a valid conclusion.
The list of Google Scholar unique citations regarding the types of citing documents, their web accessibility, and language is presented in Table 2.
The qualitative analysis of Google Scholar unique citations revealed that the majority (64%) came from journals, followed by books and PhD theses (Table 2). Approximately 55% of all citing documents were full-text resources in open access. The language of citing documents was mostly English, but as many as 25 citing documents (29%) were in Chinese.
Our study demonstrated that the Web of Science databases covered the highest-impact scientific journals as the source of citation for the Croatian Medical Journal, but that the coverage of Scopus, and especially of Google Scholar was broader and included additional local sources. It has been shown that the Web of Science is a selective source of publication citations (11). On the other hand, for a sample of high-profile general medicine articles, Google Scholar and Scopus may retrieve a greater number of citations than Web of Science (3). In our study, the difference in the number of retrieved citations was 19% between Scopus and Web of Science, 21% between Web of Science and Google Scholar, and 3% between Scopus and Google Scholar.
Previous studies have shown that the degree of citation overlap between the 3 databases varied by field of study (10,11) with no more than 31% of citations overlapping in all 3 databases (10). Our results showed that 42% of the citations could be tracked in all 3 databases. The greatest overlap was found between Web of Science and Scopus (54%), followed by Scopus and Google Scholar (51%) and Web of Science and Google Scholar (44%). Meho and Yang have found the overlap between Web of Science and Scopus of 58.2% (12), while Kousha and Thelwall (11) have found the overlap between Web of Science and Google Scholar of 57%.
There were 86 unique citations from Google Scholar that did not occur within either Scopus or Web of Science. Google Scholar produced more than twice as many unique citations as Scopus. Bakkalbasi et al (10) have also shown that Google Scholar returned the greatest number of unique citing documents for a group of oncology journals, but the difference between Scopus and Google Scholar was significantly smaller (12% vs 13%). The qualitative analysis of the Google Scholar unique citations revealed that most of them (64%) were from scholarly journals, half of them being open access. These findings are typical for the medical field in two aspects, ie, in the predominant importance of scientific journals and the continuing rapid growth of publicly accessible electronic biomedical information. Our results also confirmed the findings of Kousha and Thelwall on Chinese as a fast-rising language of scientific literature (11).
In conclusion, the significant difference in citation rate between Web of Science and Scopus was a result of the difference in coverage. Since Web of Science has recently introduced the policy of wider coverage for “regional scholarship,” we may expect that the difference in citation return will not be significant in the near future (17). Our results showed that Google Scholar shared a total of 42% of the citations returned by the 2 other, more influential, bibliographic resources. The Google Scholar list of unique citations is also predominantly journal based, but these journals are mainly of peripheral and/or local character. Although citations received from internationally recognized medical journals are crucial for increasing the visibility of small medical journals, it is also useful to follow their visibility in journals of their size and importance in the global scientific community. For these small journals, the open question of whether extra citations, especially from non-journal resources, would improve or over-value journal visibility (11) is, therefore, of minor importance.
In times of financial constraints, expensive databases are not affordable to many smaller and low-income communities. Several studies (12,18) have confirmed that, despite many open questions raised by non-transparent indexing policies and the quality of covered material, Google Scholar may serve as a complementary tool for accessing citation data. In our opinion, it may also serve as an alternative resource for quick orientational citation insight. Its use in evaluative bibliometric analysis is a matter of further research.