The aim of this paper was to determine the relative position of the English Wikipedia and other Web sites containing health information in a search engine-based approach. The results show that if the first page of results of a general search engine lists ten Web sites, Wikipedia can be found among those results in more than 70% of cases. This confirms preliminary findings by others.
17 Wikipedia had a higher average position than any other reference in this study. Our findings on resources other than Wikipedia confirm previous findings using Internet audience measurement services, which did not include Wikipedia's medical content.
20 Wikipedia ranked higher with quality articles, although this is not necessarily a causal relationship since these quality articles covered more common health topics, and we have observed that Wikipedia was more prominent among search results for common health terms in some categories. Wikipedia's good results for rare diseases compared to other online health resources also suggest that it has articles on a wide range of conditions. The results pertaining short- and long-term epidemiological influences on article traffic create a link between search engine results and page viewing. Others have previously observed the relationship between search engine activity and news coverage.
21 A study on Google Flu showed that search engine queries related to influenza-like illness correlated with the epidemiological data from the United States Centers for Disease Control and Prevention.
22 These findings were replicated for queries submitted to a Swedish medical Web site.
23 We believe that these studies support our assumptions that firstly, Internet activity can be used as a surrogate marker for consumer behavior, and secondly, that online health information seekers often use search engines to find individual health Web sites, which underscore the importance of Wikipedia as a prominent source of information in such searches.
Our study has several strengths. The software tool we used allowed us to check a large set of keywords on multiple search engines while avoiding observer bias. The use of a broad set of keywords from governmental online health information initiatives was important to avoid selection bias, and additionally it might make these data useful from a policy-making point of view; we provide data that might be relevant to the search engines position of the government-sponsored health information Web sites MedlinePlus and NHS Direct Online.
However, our study design does have limitations. We did not perform a weighted analysis based on often-used health-related keywords (such as “Diabetes”);
3,17 so in our study, each keyword was given equal importance. Some keywords were listed together with their abbreviations, which results in multiple counting. Nevertheless, this could mimic how people use search engines, with some people using an abbreviation while others might know the full term. Indeed, consumers appear to differ widely in the queries they use to find specific information,
24 which also limits the generalizability of our search terms. Because we wanted to avoid selection biases, we also retained keywords that returned several non-medical Web sites as search engine results (for words like “Walkers” or abbreviations like “CFS” for chronic fatigue syndrome), which favors Wikipedia because it contains more than just medical information. Unexpectedly, the conditions listed by the NORD contained some fairly common disorders (see Methods); we did not remove these, again to avoid selection bias. We made a personal selection of commercial, non-profit and governmental Web sites based on manual searches on Google for comparison to Wikipedia, but our list is by no means exhaustive and has the serious drawback of possible selection bias. However, the software allows storage of the data set and post hoc analysis for additional domains. We also created several clusters to pool the impact of a single content provider who might use multiple domains; we cannot completely exclude that some less prominent United States government Web sites were not listed and might not have been counted, although we believe that any such Web sites were unlikely to have a major impact on the results. It should also be noted that we did not study sponsored search engines results, which might influence consumers. We have tried to make a simple dichotomy (quality vs. non-quality articles) based on Wikipedia's community article rating system, but we emphasize that this is a system that has not yet been externally validated as a true measure of quality (compared to expert review). The examples we have provided of real-life epidemiological changes correlating with page views of the relevant Wikipedia article are illustrative, although this remains indirect evidence. There may be other seasonal disorders or pathogens that do not follow this pattern, and disease outbreaks that do not result in increased article traffic. The latter examples may be confounded by Wikipedia's role as a source of news, as disease outbreaks straddle the border between health information and news.
With regards to the generalizability of these results, it should again be stressed that not all online health information seekers are patients, and that not all patients seek health information online. Obviously, this study says little about consumers with a native language different from English
25 or using search engines popular in other countries (like
http://Baidu.com in the People's Republic of China and
http://Guruji.com in India). In this aspect, the results from British versus American English keywords are not mutually exchangeable. Furthermore, this Internet study provides no evidence on the level of trust that patients assign to the health information they read on Wikipedia. However, a recent survey indicated a shift of priorities for both non-professional and professional Internet users from trustworthiness and accuracy of information to availability and ease of finding information.
5 Finally, differences could exist between medical specialties with regards to the importance of both general health information Web sites and Web sites devoted to a specific topic.
Although several medical scientists and policy makers have highlighted the potential use of wikis to foster collaboration on easily-accessible health information for the community,
26–30 and Wikipedia is the most prominent example of a wiki, we could identify no previous research specifically focusing on Wikipedia as a source of health information for consumers. Thus, it appears that Wikipedia provides an important area for future research on sources of online health information. However, we also found misconceptions about Wikipedia in the scientific literature: for example, a recent study examining search engine results for obstetric queries misclassified it as a commercial instead of a non-profit Web site.
31 Importantly, the open editing policy offers no way of assessing the expertise of contributors, resulting in fears of inaccuracies. This may be one reason why doctors are creating wikis where only they can contribute (such as
http://Ganfyd.org,
http://RadiologyWiki.org or
http://WikiSurgery.com).
32–34 Instead of creating new wikis, Wikipedia itself could be used by doctors, as well as patient groups and associations, to collaboratively edit articles on the topics they value.
17,35 As we have shown here, these articles are among the top results on general search engines, thus providing a free platform to disseminate information globally. Until now, doctors have lagged behind biomedical scientists in realizing the potentials of wikis.
18,36–51 However, in Feb 2009, Medpedia, Inc, in collaboration with several prominent medical faculties, the NHS, the American College of Physicians and other partners, launched its open access medical encyclopedia running on the same software and under the same license as Wikipedia. It will have content for the public as well as for experts, and allow for discussion of the subject. Contrary to Wikipedia where anyone can contribute regardless of qualifications, only experts are allowed to contribute to Medpedia (although others may suggest changes), which might alleviate quality concerns. We believe this wiki may address some of the concerns that discourage the medical community from contributing to Wikipedia.
Although this Internet study showed that Internet consumers are likely to be exposed to Wikipedia through search engine results for health-related keywords, examining quality of health information present in Wikipedia was beyond the scope of this article. Thus, further studies are urgently needed to determine whether Wikipedia articles are of sufficient quality to support patient-provider communication. Assessment of the quality of Wikipedia's freely editable content is difficult, since its articles are inherently in a constant state of flux. Examples of flagrant mistakes have been reported in the media, as well as an analysis that found that Wikipedia contains a similar numbers of mistakes compared to Encyclopedia Brittannica.
52 Clauson et al (2008) compared drug information from Wikipedia to the Medscape Drug Reference, and concluded that although Wikipedia was less complete (especially regarding dosing information, which is explicitly discouraged by Wikipedia guidelines), no factual errors were found, and it “may be a useful point of engagement for consumers” for supplemental drug information.
53 Although articles in the English Wikipedia are increasingly being referenced with articles from leading scientific journals,
16,54 and articles have been shown to improve over time,
53 Wikipedia itself makes no claim to correctness, and the medical disclaimer aptly describes the situation: “Wikipedia contains articles on many medical topics; however, no warranty whatsoever is made that any of the articles are accurate.” However, while consumers appear to rarely check the source and quality of the information they find online;
4,9 at least with a well-known brand like Wikipedia, they know that they should remain skeptical. Indeed, while Wikipedia contributors have classified over 14,000 of their articles as dealing with medical topics: only around 50 of them have been confirmed as top quality (“Featured articles”).
55 This implies that Wikipedia is still a long way from achieving the idea of its founder Jimmy Wales, who imagined a world where every human being had free access to the sum of all human knowledge in his or her language.
56 Maybe doctors, like researchers, “should read Wikipedia cautiously and amend it enthusiastically”,
57 thus fulfilling the proposed new professional obligation of making their knowledge and expertise freely available on the Internet.
58