An increasing number of people search for health information online. During the last 10 years various researchers have determined the requirements for an ideal consumer health information system. The aim of this study was to determine whether medical laymen can find a more accurate diagnosis for a given anamnesis via the developed prototype health information system than via ordinary internet search.
In a randomized controlled trial, the prototype information system was evaluated by the assessment of two sample cases. Participants had to determine the diagnosis of a patient with a headache via information found searching the web. A patient’s history sheet and a computer with internet access were provided to the participants and they were guided through the study by an especially designed study website. The intervention group used the prototype information system; the control group used common search engines and portals. The numbers of correct diagnoses in each group were compared.
A total of 140 (60/80) participants took part in two study sections. In the first case, which determined a common diagnosis, both groups did equally well. In the second section, which determined a less common and more complex case, the intervention group did significantly better (P=0.031) due to the tailored information supply.
Using medical expert systems in combination with a portal searching meta-search engine represents a feasible strategy to provide reliable patient-tailored information and can ultimately contribute to patient safety with respect to information found via the internet.
The number of people in the US and Europe that search for health related topics on the internet has steadily grown in recent years. The latest studies revealed that 61% of American adults  and 54% of European grown-ups  are so called “e-patients” . To gather health information, e-patients use either search engines or portals to locate relevant information .
The success of an internet search depends crucially on creativity, knowledge, and education. These foundations are necessary to build successful search terms. Poorly selected search strategies and terms uncover only the surface of the available information, so that the user finds the first available information rather than the best. Another study confirmed the importance of keywords, identifying them as a potential barrier to accessing health information, especially for users with limited education. Further, poor spelling skills, which are a widespread problem, are also an important factor because they often lead to incorrect or no search results. Search engine features that prompt users with alternate spellings, such as "Did you mean…", may be helpful in some cases but may also be counter-productive since common terms not intended by the user could be suggested. In addition, many patients do not know the exact name of their illness and therefore are not able to put the relevant keywords into a search engine.
Without creativity in building search terms, knowledge of how search engines work, or experience with quality-controlled information, and without sharing information with people who are familiar with health care, using the internet to gather health information presents the user with incomplete or incorrect information. The risk of misinformation is particularly damaging as it could jeopardize the relationship between patients and their doctors. The ability to generate information on the internet involves three essential components: information search, information retrieval, and information verification. Search in this context stands for the craft and technical aspects, such as how a specific search engine is to be operated. Retrieval refers to the content with which the search is elaborated. Verification, finally, is the process of excluding misunderstandings and incorrect information.
Even if a search term is properly entered, search engines only list an extract of the information available . In the fullness of hits for each keyword, the most inexperienced users will not notice that important information is not listed or has been pushed to the end of the list by search engine-optimized or paid offers . Another troubling aspect is that well over half (62%) of search users do not distinguish between information and advertising . In a survey, 25% of those who were looking for health information on the internet were overwhelmed by the quantity of information, 22% were frustrated because the information they sought could not be found, 18% found the information confusing, and 3% knew a person that suffered damages by the use of online health information .
Although search engines can quickly identify new sources of information, they are not capable of analyzing the meaning of content. For this reason irrelevant, false, or misleading information often appears at the beginning of a hit list. It is therefore useful to query multiple search engines. In this context, a meta-search engine like Metacrawler.com, which accesses various search engines, produces more diverse, and thus potentially better, results.
Unfortunately, the user’s query is usually not well specified or accurate enough to gain the information desired; thus, the search process often fails or takes much more time than expected, resulting in a frustrated and dissatisfied user. In addition, there is not only an information overflow (for example, searching for ‘headaches’ with Google results in more than 16 million hits) but also misleading, outdated, and even false and life-threatening information available.
Since both diagnosis and treatment are always associated with uncertainty, identifying appropriate health information is a challenging process. Information may be adequate in one situation, but inaccurate in another. In addition, health information is often not prepared for laymen, but rather for experts. Moreover, the internet is still only available to a limited part of the population. Access to the media and media literacy are distributed unevenly depending on the age, gender, and social position of users. This could mean an inequality in the provision of health information.
This inequality is also described by Debatin, who mentions media literacy, time budget, and level of education as key factors in internet search success. The demand for health information from the internet is highest among women from 30 to 49 years with higher education and with more than 5 years of internet experience. In this context, the results of Harbor and Chowdhury appear contradictory, since they noted in their study that only 2% of the population and 30% of students had problems in finding the required health information. They explain these results by suggesting that students asked more specific questions, required higher quality results, and were unwilling to accept poor quality information, such as advertising or unevaluated pages.
Heinlen et al. evaluated 136 online offers regarding the adherence to quality criteria. It was found that very few providers adhered to the ethical codes and standards of the National Board for Certified Counselors (NBCC). Indeed, there was no single provider that complied with all 12 revised standards. Also, 49 providers (35%) had not attended the necessary training or had no formal approval. Finally, data protection had been largely neglected, with only 22% of providers using encrypted communications .
Debatin finds it difficult for the user to assess the credibility and veracity of a website because no current indicators for these qualities have been established on the internet. The Health on the Net Foundation tries to establish the HON Code of Conduct. The same applies to the transparency criteria of the German Health Information System Action Forum (afgis). Both institutions award a quality seal, with which the operator of a website agrees to comply with the respective institution’s criteria. Whether websites actually adhere to these quality criteria is difficult to validate, making these quality seals generally controversial. Baur and Deering also argue that quality labels alone are not sufficient to ensure the quality of a site and its contents. The presence of seals of approval and certifications without rigorous verification can mislead consumers and provide a false sense of security. Forsström and Rigby identify the problem in the lack of periodic review, which allows sites that were previously awarded seals to later contain misinformation without re-evaluation or revocation of the seal.
Thus, the case for quality management of internet health information is evident . Some portals try to overcome these problems by providing reviewed and assessed information. They incidentally attach quality labels or ratings to support transparency. This has the potential to ensure quality of health information on the internet, but as Jadad and Gagliardi assert, there is no agreed upon standard for assessment and labeling. They identify 47 organizations providing means of quality assessment or labeling for health information on the internet in 1998 . Four years later, they found this number increased to 98; however, only nine of the original 47 still existed. Further, in 1998, 14 of the 47 organizations published their assessment criteria, but that number dropped to 11 of 98 by 2002 .
Additionally, portals usually do not exchange this valuable information, requiring the user to search more than one search engine to find information of interest . Governmental meta-search engines like the American healthfinder.gov, the Australian www.healthinsite.gov.au, the British www.library.nhs.uk, or the French www.has-sante.fr can overcome this problem but still leave the user with an unspecific information demand.
In 1999 the term “cybermedicine” was coined to describe a branch of consumer health informatics that explores the information demand of users and patients and consequently implements information systems that support them in disease management, prevention, and health promotion . Since then much research has been done to find ways to determine user demand, support successful search strategies, and develop mechanisms to ensure only quality information is provided to users. The result of this past research is a list of requirements describing a system. Unfortunately, such a system has yet to be developed.
Several researchers have weighed-in on the necessity of changes to internet health information searches and proposed some ideas to enact these changes. Eysenbach proposes the use of electronic questionnaires to develop a collection of user-specific information, resulting in a tailored information supply. For the future, he sees intelligent software agents, supplying the user with relevant health information based on the contents of the patient’s web-based health record . Hurrelmann and Leppin think that improving the quality of the health information supply is a crucial challenge . Jordan calls for the development of tools and procedures to deal with the overwhelming flood of information .
The higher efficiency of tailored, computer-generated information has already been shown. Mühlbacher, Wiest, and Schumacher postulate a goal-oriented, user profile-based information supply. This approach is also supported by Köhler & Hägele, who think that tailoring the content to the patient is particularly important. This is both a basis for self-management and a way to increase adherence to treatment. Goldsmith and Safran also see a positive impact on trust, the relationship between physician and patient, and confidence regarding medical treatments resulting from tailoring health information to the patient’s needs. They call for interactive tools that help patients in dealing with sickness and in health support. Brennan does see major innovations in the field of e-health associated with an increased understanding and better control of patients’ health, but the potential for innovation that is actually implemented remains limited. She aims at a rapid and comprehensive implementation, especially in the use of computer systems for patient support. Deshpande and Jadad think that the technical requirements for this have already been met. They describe their idea of a global personal health information center, a web-based tool to support personal health management. In their latest work they state that patients have already started to work on this problem using web 2.0 technologies. Various scientists do research in this area [37, 38].
The first step was completed in May 2008, when Google launched its service ‘Google Health’. Registered users can store their online health profile with diseases, allergies, medications, etc. in an electronic patient record. They also have access to different portals and health clinics. However, there is no module for support in the search for health information. Coiera, who already recognized in 2003 that the computer has strong limitations, goes one step further to require an understanding of the semantics of web sites. He calls for the development of the semantic web, which supports intelligent programs or software agents in processing the semantics of websites on the basis of information on information, so called meta-information . Eysenbach also proposed the semantic web, e.g. as a way to improve the performance of search engines, stating, "Search engines will not only better 'understand' what a user is looking for, but also what the web pages they are indexing are about" ( P.220).
To enhance patients’ communication skills Cegala and Broz emphasize the importance of information seeking and verifying . In this context the question arises whether structured and guided user initiated systems give better results for the users. This promising approach has been proposed for a long time by different scientists but has yet to be implemented [18, 29, 39].
The demonstration that expert systems are capable of better informing patients and thus contribute to better disease prevention and patient cooperation with the health system was accomplished in 1995. At that time a DOS-based system was developed that carried out an anamnesis of migraine patients and then informed them about the background of the specific category (diagnosis) from which they suffered. A major drawback identified by the users was the lack of information about specific topics due to an incomplete knowledge base, a drawback which could be dealt with by incorporating new findings .
But there is an acceptance problem regarding expert systems. The validity of expert systems should not only be demonstrated, but the user must also be convinced of it. This requires a thorough documentation of the completeness and correctness of the knowledge base and an explanation component that informs the user about how the system draws its conclusions. These requirements are, however, only met by a minority of medical expert systems. Additionally, manual data input is often necessary, which is assumed to be a source of errors even when it is the physician himself entering the data. Furthermore, many expert systems lack a user-friendly, intuitive interface, and their output styles are too complex for typical patients. This assessment is confirmed by another study showing that the expert system diagnosis process took longer than the one by the physician.
For this study a prototype website has been developed. It is directed toward German-speaking adults searching for information about headaches. Applying methods of artificial intelligence, a frame-based expert system is used to determine the patients’ information demand. The prototype was realised as the web-based information system depicted in Fig. (1). The HTML web interface guides the user through the search process by querying the information demand. This is done by an integrated expert system implemented in the web programming language PHP and supported by an SQL database. The PHP code provides the HTML pages and forms to be presented to and, if applicable, filled in by the user. The expert system uses rule-based inference to determine the diagnosis, based on frames stored in the SQL database. The user management works analogously: the front-end is provided by PHP and the user data is stored in the SQL database.
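The frame-based, rule-driven inference described above can be illustrated with a minimal sketch. It is written in Python rather than the PHP/SQL of the prototype, and everything in it is an assumption made for illustration: the `Frame` class, the slot names, and the thresholds are hypothetical and are not the IHS criteria or the prototype's knowledge base.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Answers = Dict[str, object]


@dataclass
class Frame:
    """One headache type: a code, a name, and the slot conditions it requires."""
    code: str
    name: str
    # Each slot maps an answer key to a predicate that the answer must satisfy.
    slots: Dict[str, Callable[[object], bool]] = field(default_factory=dict)

    def matches(self, answers: Answers) -> bool:
        # A frame fires only if every slot was answered and every predicate holds.
        return all(key in answers and test(answers[key])
                   for key, test in self.slots.items())


# Hypothetical, heavily simplified frames; the thresholds are illustrative only.
FRAMES: List[Frame] = [
    Frame("8.2", "Medication-overuse headache", {
        "headache_days_per_month": lambda d: d >= 15,
        "analgesic_days_per_month": lambda d: d >= 10,
    }),
    Frame("2.2", "Frequent episodic tension-type headache", {
        "headache_days_per_month": lambda d: 1 <= d < 15,
        "pain_quality": lambda q: q == "pressing",
    }),
]


def infer(answers: Answers, frames: List[Frame] = FRAMES) -> Optional[Frame]:
    """Return the first frame whose slots are all satisfied, or None."""
    for frame in frames:
        if frame.matches(answers):
            return frame
    return None


result = infer({"headache_days_per_month": 20, "analgesic_days_per_month": 12})
print(result.code if result else "no match")
```

In such a design the questionnaire answers fill the slots, and an unanswered or wrongly answered slot prevents the frame from firing, which is one way incorrect dialogue answers can lead to an incorrect internal diagnosis.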
Depending on the results of the expert system, an assortment of information according to the IHS classification  is gathered from portals and other trustworthy sources. This is done by a meta-search with a list of reliable websites that hold the quality seal of HON  or afgis . The set items are then arranged by relevance and labelled quality. The results are finally presented to the user. The information system was intensively evaluated and tested. A detailed description of the development and evaluation has already been published .
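How a whitelist-based meta-search might filter and order its results can be sketched as follows. This is only an illustration under stated assumptions: the domains are placeholders, the `Hit` structure and its `relevance` score are hypothetical, and sorting by relevance first and number of seals second is one possible reading of "arranged by relevance and labelled quality", not the prototype's documented behaviour.

```python
from dataclasses import dataclass
from typing import Dict, List, Set
from urllib.parse import urlparse

# Hypothetical whitelist: domain -> quality seals held. Placeholder domains only.
TRUSTED_SITES: Dict[str, Set[str]] = {
    "portal-a.example": {"HON"},
    "portal-b.example": {"HON", "afgis"},
}


@dataclass
class Hit:
    url: str
    title: str
    relevance: float  # hypothetical relevance score in [0, 1]


def rank_hits(hits: List[Hit],
              trusted: Dict[str, Set[str]] = TRUSTED_SITES) -> List[Hit]:
    """Keep hits from whitelisted domains, drop duplicates, sort by relevance then seals."""
    kept = []
    seen_urls = set()
    for hit in hits:
        domain = urlparse(hit.url).hostname or ""
        if domain not in trusted or hit.url in seen_urls:
            continue  # unlisted source (e.g. advertising) or duplicate URL
        seen_urls.add(hit.url)
        kept.append((hit, len(trusted[domain])))
    # Higher relevance first; ties are broken by the number of quality seals.
    kept.sort(key=lambda pair: (-pair[0].relevance, -pair[1]))
    return [hit for hit, _ in kept]


hits = [
    Hit("https://ads.example/pills", "Buy pills", 0.9),
    Hit("https://portal-a.example/moh", "Medication overuse", 0.8),
    Hit("https://portal-b.example/moh", "Medication-overuse headache", 0.8),
    Hit("https://portal-a.example/moh", "Medication overuse", 0.8),
]
for hit in rank_hits(hits):
    print(hit.url)
```

Restricting the search to a fixed list of sealed sites is what removes advertising and unreviewed pages from the result list; the price is that the quality of the output depends entirely on how well the seals themselves are maintained.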
The aim of this study was to determine whether medical laymen can find a more accurate diagnosis for a given anamnesis via the developed prototype health information system than via ordinary internet search. This research is intended to contribute to the overarching question of whether expert-system-guided internet meta-search provides a better information supply for patients seeking health information online than is possible using ordinary search engines or health portals. The research did not investigate the influence of either ethical or legal aspects of the internet health information supply.
A study was developed to assess the excess value of the prototype. In doing so the influence of the independent binary variable ‘way of internet research’ (either by prototype or by established means like search engines or portals) on the dependent variable ‘quality of results’ is examined with an intervention and a control group.
Basically, the study was designed as follows: participants are randomly allocated to the intervention or control group. Then, a pre-filled-in anamnesis form is given to the participants by a fictitious male close relative who asks them to search the internet for the specific kind of headache he suffers from. In the intervention group the prototype information system is used. The control group uses common search engines or portals. By comparing the proportion of diagnoses matching the pre-determined diagnosis in the intervention and control groups, the excess value of the prototype is determined. The study design is depicted in Fig. (2).
Headaches have been chosen as the anamnesis - the history and symptoms of the complaint - is deemed to be of substantial importance for a successful diagnosis as compared to physical examination . Moreover, headaches can be classified by an internationally agreed system released by the International Headache Society (IHS) . The symptoms are to be input via a drop down input box and can be specified on two levels of the IHS classification, group (level 1) and headache type (level 2). The study is conducted in two parts. Part one concentrates on a very common kind of headache, the “Frequent episodic tension-type headache” (IHS 2.2) . The second part focuses on a more seldom and complex kind of headache, the “Medication-overuse headache” (IHS 8.2) .
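The two-level grading of a participant's answer against the IHS code can be made concrete with a small sketch. The function name `score_diagnosis` and the numeric scores are assumptions introduced here for illustration; the codes are treated as plain "group.type" strings such as "2.2".

```python
from typing import Set


def score_diagnosis(answer: str, accepted: Set[str]) -> int:
    """Grade an IHS code: 2 = headache type matches, 1 = only the group matches, 0 = wrong."""
    if answer in accepted:
        return 2
    group = answer.split(".")[0]  # level 1 is the part before the dot
    if any(code.split(".")[0] == group for code in accepted):
        return 1
    return 0


# Part one of the study: the target is IHS 2.2.
print(score_diagnosis("2.2", {"2.2"}))  # type and group match
print(score_diagnosis("2.1", {"2.2"}))  # same group, different type
print(score_diagnosis("1.1", {"2.2"}))  # different group
```

Passing a set of accepted codes rather than a single code allows more than one diagnosis to count as fully correct where the classification does not separate them.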
To determine other influences beyond the independent variable, some control variables are introduced. First of all, participants are asked to complete socio-demographic details like age, gender (1=female, 2=male), and education. Additionally, participants have to specify the general frequency of internet usage, search for health information in the media (except the internet), search for health information on the internet, the number of studies they have already taken part in, and how they felt during the study. Moreover, the time they spent from the beginning of their participation in the study to the end is measured. To identify participants sticking to their first impression, an estimated diagnosis is requested right at the beginning of the study. For evaluation purposes only, the internal diagnosis of the expert system is stored.
Approximately 1,000 non-medical students and employees of Bamberg University have been personally approached by the study staff to participate. Because participation in the study was time-consuming (20 min on average), only 140 persons participated. As the majority declined to participate, the sample could be classified as self-selecting. Due to the randomization this should not be a problem for the validity of the results.
The data has been analyzed with PASW (SPSS) version 17. In the descriptive portion, frequencies, means, medians, maximum values, and minimum values were determined, and histograms were plotted for selected variables. To evaluate the correlation of the control variables, a test according to Spearman was conducted due to the ordinal characteristic of most of the variables. The evaluation of the statistical significance was done with a χ2 test and Fisher’s exact test. This is the most appropriate statistical analysis due to the two trial groups and the ordinal characteristic of the dependent variable. Fisher’s exact test could only be used in the second part of the trial because the dependent variable in this part is dichotomous.
A total of 140 (60/80) participants took part in the two study sections. Seventy-one were female, 63 were male, and 6 did not specify their gender. Ages varied from 19 to 61 with a mean of 23.35 years, a median of 22 years, and a standard deviation of 4.9 years. Eight participants did not provide their age. The median of the highest education level was the German University entrance qualification (“Abitur”, 113/133); seven participants did not specify their education level. One hundred eleven participants reported daily internet use, five reported weekly internet use, and 24 did not report internet usage. Approximately 45% seldom (median) read (in books, journals, or newspapers) or watched (on TV) health-related information. The same applied to searches for health information on the internet. The number of studies the participants had taken part in (“study experience”) differed from 0 to 50 studies with a mean of 2.69, a median of 1, and a standard deviation of 8.5.
As Fig. (3) shows, participation time varied from 3 to 38 min with a mean of 16 min, a median of 15 min, and a standard deviation of 7 min. Table 1 depicts the statistical analysis of the participation time for the two study sections and the two groups.
The participants were also asked about their impression of the study, with “1=a burden”; “2=too complex”; “3=OK”; and “4=good experience” as the possible answers. The majority of participants (87) rated their participation as “OK”. “Too complex” and “a burden” each were rated 7 times. Thirty-seven participants thought the study was a “good experience”. Two participants did not rate their participation in the study. Table 2 gives an overview of the main study results.
In the first part of the study, the participants had to diagnose an “Episodic tension type headache” (IHS 2.2) . In each (the intervention and the control group) there were 30 participants. Most participants rated the participation as “OK” (Ø = 3.08). The participants of the intervention group rated slightly worse (Ø =2.90) than the ones in the control group (Ø =3.27).
In the intervention group, 10 (33%) diagnoses were correct, while 24 (80%) were correct at least on the first level of the IHS classification. The internal diagnosis of the expert system was correct on the second level in 25 (83%) cases. Five participants gave a false diagnosis on level 1, even though the information system presented the correct information. Two of these chose the same diagnosis that they had already presumed at the very beginning of their participation. The diagnosis of 12 participants was correct on the first level but not on the second.
Despite a wrong diagnosis by the expert system once on the first level and once on the second, the two participants concerned provided a correct diagnosis on the second level. The participant who received wrong information on the first level from the system kept his presumed diagnosis. The mean participation time in the intervention group was 18 min and 29 sec; those with the correct diagnosis had participated on average 20 min and 7 sec and those who were correct only on the first level had a mean participation time of 19 min and 35 sec. Participants who provided an incorrect diagnosis had participated on average 14 min and 4 sec.
In part I, 10 (33%) diagnoses were correct in the control group. At least on level one, 22 (73%) diagnoses were correct. The mean participation time in the control group was 12 min and 55 sec; those with the correct diagnosis had participated on average 16 min and 36 sec and those who were correct only on the first level had a mean participation time of 13 min and 42 sec.
Participants who provided a wrong diagnosis had participated on average 10 min and 13 sec. All significant correlations are depicted in Table 3. Relevant for part I is the weak correlation of group with participation time (r=0.433) and with study rating (r=-0.267) and the weak correlation of study experience (r=-0.257) and participation time (r=0.292) with the correct diagnosis. The statistical significances for both study sections are depicted in Table 4. The number of correct diagnoses is the same in the intervention and control group. The difference of the correct diagnoses on the first IHS level is not statistically significant (χ2: P=0.542).
In the second part the diagnosis “Medication-overuse headache” (IHS 8.2) was to be determined. As the IHS classification uses the same parameters of analgesics overuse also for “Headache as an adverse event attributed to chronic medication” (IHS 8.3), this diagnosis was also considered entirely correct. The intervention and the control group each had 40 participants. In part II, all diagnoses were either correct or incorrect; no diagnosis was correct only on the first level. The most frequent rating of the participation was “OK” (Ø = 3.14). The participants of the intervention group rated their experience slightly worse (Ø = 3.00) than the ones in the control group (Ø = 3.28).
In the intervention group, 19 (48%) diagnoses were correct. The internal diagnosis of the expert system was correct in 20 (50%) cases. One participant diagnosed incorrectly although the recommendation of the expert system was correct. The mean participation time in the intervention group was 20 min and 42 sec; those with the correct diagnosis had participated on average 22 min and 22 sec. Participants who provided a wrong diagnosis had participated on average 19 min and 11 sec.
In the control group, 10 (25%) diagnoses were correct. The mean participation time in the control group was 12 min and 43 sec; those with the correct diagnosis had participated on average 13 min and 35 sec. Participants who provided an incorrect diagnosis had participated on average 12 min and 26 sec.
Relevant for part II is the moderate correlation of the group and the participation time (r=0.592) which also correlates weakly with the correct diagnosis (r=0.226). The difference of correct diagnoses in the intervention and control group is statistically significant (χ2: P=0.036, Fisher’s one-sided: P=0.031).
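The reported significance values can be rechecked from the counts given in the text. The following sketch uses only the Python standard library; it assumes an uncorrected Pearson χ2 test with one degree of freedom (no continuity correction) and a one-sided Fisher's exact test with fixed margins, which may differ in detail from the settings used in PASW. The function names are introduced here for illustration.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) under the hypergeometric distribution with fixed margins."""
    row1, col1, n = a + b, a + c, a + b + c + d
    total = math.comb(n, row1)
    upper = min(row1, col1)
    return sum(math.comb(col1, k) * math.comb(n - col1, row1 - k)
               for k in range(a, upper + 1)) / total


# Part II: 19 of 40 correct (intervention) versus 10 of 40 correct (control).
stat2, p2 = chi2_2x2(19, 21, 10, 30)
print(f"part II  chi2={stat2:.3f}  p={p2:.3f}  fisher={fisher_one_sided(19, 21, 10, 30):.3f}")

# Part I, first IHS level: 24 of 30 (intervention) versus 22 of 30 (control).
stat1, p1 = chi2_2x2(24, 6, 22, 8)
print(f"part I   chi2={stat1:.3f}  p={p1:.3f}")
```

Under these assumptions the χ2 P values round to the 0.036 and 0.542 reported for parts II and I, and the one-sided Fisher value can be compared with the reported 0.031.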
This study evaluated an information system based on an expert system and a meta-search of quality controlled websites. It has been shown that the user’s demand could be determined by the system [16, 45]. The system meets most of the requirements proposed in past studies [8, 10, 12, 18, 29, 30, 31, 33, 34]. The drawback of single source searches [11, 29] is addressed by the meta-search. As the search terms are already implemented the problem of creativity [4, 6], strategies , and literacy [5-7, 9] are also solved. The advertisement issue [12, 13] is addressed by a filter and the validation [5, 9, 10, 17, 20] is done by only providing quality controlled websites characterized by the seals of HON  and afgis . However, regarding the quality seals one has to keep in mind that their effectiveness and long-term validity have been called into question [23-25, 27, 28].
Concerning the study, a major difference is seen in the results depending on the complexity of the task. Very common headaches were easily diagnosed via internet research by non-medical university students and employees without further assistance. This is in line with the findings of Tang and Ng who found in their study that in 15 out of 26 diagnostic cases (58%) medical doctors could determine a correct diagnosis by searching with Google . A complex scenario made it more difficult to diagnose both for the control and for the intervention group. But the information system provided a significant degree of support to the user. Thus, the problems of information overflow, misleading information, or advertisement were not as strong as in the control group. This confirms the findings of Buchanan et al. that expert systems could be capable of better informing patients. In addition, the problem of the limited knowledge base has been addressed .
On the other hand it is evident that in part II the internal diagnosis of the expert system was only correct in half of the cases. This was due to false answers by the participants in the expert system dialogue. These could have arisen either because participants disregarded the anamnesis form of the patient or because the questions of the expert system were too complicated. The latter was not a problem during the testing of the system. However, this leads to a discussion about the convenience of the user interface relative to the perceived complexity of the tasks. The problem of a user-friendly and intuitive user interface and simple output still remains to be addressed.
The correlation of the group and the participation time can be explained by the character of an expert system. As stated before, the data input phase is time consuming . This fact appears to have contributed to the poor ratings in the intervention group as compared to the control group, assuming that time is precious for both university students and employees. Two users even did not believe in the expert system’s advice but preferred their gut instinct. This confirms there is still an acceptance problem using expert systems .
Issues that have not been addressed are the usage of semantic web technology [8, 18] and the expert language of the provided websites . The latter could be an explanation for the six participants finding a wrong diagnosis although only websites describing the correct diagnosis were provided.
Some effort has been invested to conduct a study representing a more diverse population than university students and employees. Unfortunately, it turned out to be very difficult to recruit enough participants to attain a representative sample. Additionally, headaches are only one disease category, leaving the question of whether the described technique could be applied to other diseases as well. In the area of headaches only two types have been chosen. It can only be assumed how the information system would perform for other types of headaches. Another limitation was the recording of the time spent on the study by the participant. It would have been better to record additionally the time spent on the expert system dialogue (for the intervention group only) and on the explicit search. From the experiences of the 20 beta testers it can only be said that the dialogue lasts on average more than 15 minutes. This leads to the assumption that the actual search time in the intervention group was much shorter than in the control group.
Many of the requirements for a consumer health information system are met by the prototype information system. The user demand is determined and the supplied information is tailored and quality controlled [8, 18, 29, 39]. The approach using medical expert systems in combination with a portal searching meta-search engine to provide reliable patient-tailored information could fill the long lasting gap in user information supply. For the first time the desired information is determined by an intelligent system guiding the user to find the desired information.
This raises the question of how such a system could be generated. Three ways appear to be feasible. First, the development is done in the context of research projects at universities. Second, the information system is developed by a syndicate of companies or institutions perhaps granted by the government. And third, it is to be developed in a web 2.0 project [34-37] analogous to Wikipedia. In either case quality assurance is crucial. Regarding the acceptance problem and the one of an appropriate interface, users should be more involved in the design process of the expert systems. Additionally, semantic web functionality  could be implemented in the future.
Physicians should be aware of the fact that patients have possibly diagnosed their own disease via internet research even if they are medical laymen. Once the positively evaluated system described in this paper has been enhanced, extended, and made available on the internet, it is even more likely that patients will visit the practitioner with a clear idea of their sufferings and possible treatments. Co-operative practitioners can use this patient knowledge to improve compliance.
We are grateful to the students and employees of Bamberg University who participated in this study. We also thank Marie-Antonia Wolf and Holger Neujahr for their support in preparation of the information system and Harald Meyer for his help with the statistical analysis. For her improving comments and the fruitful discussions we also thank Ivonne Honekamp. We finally thank the reviewers for their supporting comments.