Twelve middle school and high school students in southeast Michigan participated. Students ranged in age from 12 to 17 years, with a mean of 14 years. Half of the students were female. Of the 12 students, 7 were white, 2 were African American, 1 was Indian American, 1 was Hispanic, and 1 was Asian American. Only the 6 oldest students had searched for health information on the Internet before. This variation by age is consistent with other findings that youth aged 15 to 17 years are significantly more likely to have looked up health information (32%) than youth aged 12 to 14 years (18%) [23]. All of the students, however, had computers and access to the Internet at home. Students reported using a computer from 1 hour per week to 3 hours per day, with a mean of 12.3 hours per week.
Eleven students attempted all 6 searches, while the remaining student attempted 3, for a total of 69 searches. One search was excluded because the Internet connection was not working properly, leaving 68 searches for analysis. Searches took an average of 5 minutes and 41 seconds, ranging from just under a minute to nearly 24 minutes. This time frame is essentially the same as Eysenbach recorded for adults [15]. Although direct comparison is inappropriate since different questions were asked, the similar order of magnitude is suggestive.
Overall Search Strategy
As students thought aloud, the researchers got a sense of what students were looking at on each page. Students seemed to skip around a lot and did not skim results pages or specific Web sites in any methodical or thorough way, sometimes missing links or text that contained the answer to a question. This is also consistent with findings from non-health-related searching behavior as summarized in Hsieh-Yee [24].
Students used multiple methods to locate Web sites that they believed contained answers to the 68 questions. In 60 cases, the student started looking for an answer by visiting a search engine and entering a search term or phrase. In 2 cases, the student started by selecting from directory menus (eg, choosing the topic health). In 6 cases, the student started by entering a URL (other than that of a search engine) directly into the browser address bar. In total, there were 215 attempts to access Web sites other than search engines or directories. Nearly all of these attempts were made by following a link from a search engine, either after a search or through the use of a directory. Of the 215 attempted site visits, 4 were broken links, 3 were blocked by the filters utilized at certain schools, and 5 were PDF files (read by Acrobat Reader) that students either could not download or chose not to download because downloading was too slow. This left 203 sites that were viewed, with an average of 1.8 pages viewed per site. The distribution of pages viewed per site is shown in the accompanying figure. Note that the distribution is roughly consistent with a power law, as observed in previous studies [25]. At a reviewer's request, these data were also examined at the individual student level. Students varied a great deal in the total number of sites visited. Eleven of the 12 students went only 1 page deep on the majority of visited sites. Although the individual-level data set is not large enough to analyze more rigorously, the power law seems to operate at the individual level as well as the aggregate level.
Distribution of pages viewed per site
Even when students found a Web site that contained the answer to a question, they did not always find the answer. One example is the Alcoholics Anonymous site [26], where 8 of the 11 students ended up while searching for a local meeting. Although there was a link to a site that contained local information, only 3 of the 8 students were able to find the link, 1 of whom found it only on the second visit to the Alcoholics Anonymous site, after viewing a total of 16 pages within the site. Similarly, 6 of the 11 students who searched for whether or not Paxil causes drowsiness visited the official Paxil site [27]. Only 3 of the 6 students were able to successfully answer the question based upon the information they found at the site. Two of them failed to find the list of side effects, and 1 of them found the list but did not understand it well enough (or read it carefully enough) to answer the question correctly.
Search Engine Tactics
Seven search engines were used, including 2 meta-search engines (Dogpile and Locate.com). The meta-search engine Locate.com offers the user a number of search engines to choose from. Searches performed from the Locate.com Web site that utilized another search engine (eg, Yahoo!) are reported as if the search occurred on the destination search engine (eg, Yahoo!). The accompanying summary of search engine use reports the number of times that a particular search engine was used. If a search engine was used multiple times while searching for an answer to the same question, it is counted only once. Because students occasionally switched search engines while trying to answer the same question, there are more searches using a search engine (79) than there are attempts to answer questions (68). In total, 6 of the 12 students used only Google, 1 used only Yahoo!, and the remaining 5 changed search engines at some point.
A total of 132 search phrases were entered into the various search engines. Only 104 of those search phrases were unique. The most-frequent 2 phrases used were "diabetes" and "Paxil," each of which had 5 occurrences. There was an average of 3.6 words typed in per search phrase and 80% of the time there were 4 or fewer words per search phrase.
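To make these phrase-level measures concrete, the following Python sketch shows one way such statistics could be computed from a list of logged search phrases. It is an illustration only: the function name `phrase_stats`, the case-insensitive matching of duplicate phrases, and the sample phrases are assumptions made for demonstration and are not the study's data or analysis code.

```python
from collections import Counter


def phrase_stats(phrases):
    """Summarize a non-empty list of search phrases."""
    # Assumption: phrases that differ only in case or outer spaces are duplicates.
    normalized = [p.strip().lower() for p in phrases]
    word_counts = [len(p.split()) for p in normalized]
    return {
        "total": len(normalized),
        "unique": len(set(normalized)),
        "most_common": Counter(normalized).most_common(1)[0],
        "mean_words": sum(word_counts) / len(word_counts),
        "share_four_or_fewer": sum(c <= 4 for c in word_counts) / len(word_counts),
    }


# Hypothetical sample, not the phrases logged in the study.
sample = ["diabetes", "Paxil", "Diabetes", "alcoholics anonymous meeting"]
print(phrase_stats(sample))
```

Applied to the full log of 132 phrases, a summary of this kind would yield the total, the number of unique phrases, the most frequent phrase, the mean phrase length in words, and the share of phrases with 4 or fewer words.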
Of the 132 search phrases, 30 contained at least 1 misspelled word (eg, "tatoo," "Alchoholics," or "smokeing"), despite the fact that students could read the correctly spelled word on the index card containing the question. Some search engines (eg, Google) offer a feature that recommends an alternate search string with the correct spelling of a word. For example, if a student typed "alchoholics anonymous," the first page of results began with, "Do you mean 'alcoholics anonymous?'" Students were offered a new search string with correct spelling on 15 separate occasions, but noticed and used it only 6 times. On the remaining occasions, they used the results that were returned for the incorrect spelling. Of the 7 students who were offered corrected spelling suggestions, only 2 ever used them.
Once a search string was entered into a search engine, students varied in the number of results pages that were viewed. Students viewed only the first results page 78% of the time and 4 or fewer pages of results 93% of the time. Because search engines report a different number of links per page of search results, the accompanying distribution of search-result links viewed reports how often links were selected from the first 10 results, the second 10, and so on. Only 3 blocked links were encountered during all of the searches, suggesting that blocking software did not have a significant impact on these results.
Distribution of search-result links viewed
Successful Searching Characteristics
Of the 68 questions that students attempted to answer, 7 searches were abandoned after the student gave up or, in 2 cases, when the class period ended. Of the remaining 61 searches, 47 were successful in finding a complete, correct, and useful answer to the health question and the remaining 14 were unsuccessful. Six of the unsuccessful answers were completely incorrect and not useful, 4 were useful but only partially correct, and 4 were fully correct but not useful.
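The tallies in this paragraph can be cross-checked with a few lines of Python. The sketch below is only a consistency check on the counts reported above; the variable names are illustrative, and the overall success rate it prints is derived from those counts rather than quoted from the study.

```python
# Counts taken from the text; variable names are illustrative.
attempted = 68
abandoned = 7
completed = attempted - abandoned

successful = 47
unsuccessful = {
    "incorrect and not useful": 6,
    "useful but only partially correct": 4,
    "fully correct but not useful": 4,
}
unsuccessful_total = sum(unsuccessful.values())

# Completed searches should split exactly into successful and unsuccessful ones.
assert completed == successful + unsuccessful_total

success_rate = successful / completed
print(f"{completed} completed, {successful} successful ({success_rate:.0%})")
```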
Several factors contributed to the success of finding a correct, complete, and useful answer. One important factor was the individual who was performing the search. Although every student answered at least 1 question correctly, there was wide variation in the number of correct answers. Two students successfully answered 6 out of 6 questions, 3 students successfully answered 5 questions, 4 students successfully answered 4 questions, and the remaining 3 students successfully answered only 1 or 2 questions. Although our sample of students was too small to draw conclusions from, no distinct patterns were observed that would indicate that race, gender, Internet experience, or health searching experience were significant determinants of success. However, the older adolescents (16 to 17 years old) were successful 87% of the time (26 of 30) as compared to 68% (21 of 31) for the younger adolescents.
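As a rough arithmetic check on the per-student and per-age-group figures, the following Python sketch recomputes them from the counts given above. It assumes that the per-student successes sum to the 47 successful searches reported for the study as a whole; the variable names are illustrative and no significance test is implied.

```python
# Successes per student -> number of students with that score (from the text).
reported_scores = {6: 2, 5: 3, 4: 4}
total_successful = 47  # assumption: per-student successes sum to this total

accounted = sum(score * students for score, students in reported_scores.items())
# Successes left for the 3 remaining students, who each answered 1 or 2 questions.
remaining_successes = total_successful - accounted

# (successful searches, completed searches) per age group, as reported.
age_groups = {"16-17 years": (26, 30), "younger": (21, 31)}
rates = {group: ok / done for group, (ok, done) in age_groups.items()}

print(f"remaining 3 students: {remaining_successes} successes in total")
for group, rate in rates.items():
    print(f"{group}: {rate:.0%}")
```

The remainder left for the last 3 students falls within the range implied by "1 or 2 questions" each (between 3 and 6 in total), so the reported figures are mutually consistent under this assumption.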
Another important factor was the difficulty level of the questions themselves. The accompanying breakdown of unsuccessful searches by search topic shows the failure rate for each question. The 4 partially correct answers were split evenly between the Alcoholics Anonymous and tattoo questions. All 4 of the correct but not useful answers resulted from the HIV test question.
Unsuccessful searches by search topic
Certain search actions led to sites that contained the answer more often than others. Overall, students found answers on 22% of the sites they accessed (47 of 215). They accessed sites in 5 ways. Although not often taken, the action with the highest probability of success (47%; 7 of 15) was following a link from 1 non-search-engine site (eg, www.aa-intergroup.org) to another site (eg, www.alcoholics-anonymous.org). In most of these cases, the student accessed the first site directly from a search engine. Clicking on search engine results led to a site where students found an answer 21% of the time (35 of 166). Success rates were similar for following a recommended link from a list or menu provided by the search engine (18%; 4 of 22). Directly typing in a URL, bypassing search engines entirely, was successful only 9% of the time (1 of 11). A sponsored link from a search engine was followed only once, and the student found an incorrect answer on that site.
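The per-method success rates above can be reproduced directly from the reported counts. The Python sketch below does so and confirms that the 5 access methods add up to the overall figure of 47 of 215; the dictionary labels are paraphrases, and entering the sponsored link as 0 of 1 is an interpretation of the statement that the one site reached that way gave an incorrect answer.

```python
# (sites where an answer was found, sites accessed) for each access method.
# Counts are from the text; the sponsored-link entry (0 of 1) is an interpretation.
access_methods = {
    "link from another non-search-engine site": (7, 15),
    "search engine result": (35, 166),
    "search engine list or menu link": (4, 22),
    "URL typed directly": (1, 11),
    "sponsored link": (0, 1),
}

rates = {m: found / accessed for m, (found, accessed) in access_methods.items()}
total_found = sum(found for found, _ in access_methods.values())
total_accessed = sum(accessed for _, accessed in access_methods.values())

# List the methods from most to least successful.
for method, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    found, accessed = access_methods[method]
    print(f"{method}: {rate:.0%} ({found} of {accessed})")
print(f"overall: {total_found / total_accessed:.0%} ({total_found} of {total_accessed})")
```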
Another factor related to success was the misspelling of search terms. Of the 14 completed but unsuccessful searches, 29% (4 searches) had at least 1 misspelling, compared to only 15% (7 searches) of the 47 successful searches. Perhaps even more telling, searches with misspellings, whether successful or unsuccessful, took students 1.5 minutes longer on average than searches without misspellings. Observations confirmed that some students were unable to find an answer until they discovered and corrected their misspelling, which produced higher-quality and more relevant results.
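The two misspelling percentages follow from the counts in the text, as the short Python sketch below illustrates. The helper name `share` is an assumption for illustration, and the sketch only recomputes proportions; it does not test whether the difference is statistically significant.

```python
def share(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


# (searches with at least 1 misspelling, completed searches), from the text.
unsuccessful_misspelled, unsuccessful_total = 4, 14
successful_misspelled, successful_total = 7, 47

unsuccessful_pct = share(unsuccessful_misspelled, unsuccessful_total)
successful_pct = share(successful_misspelled, successful_total)

print(f"unsuccessful searches with a misspelling: {unsuccessful_pct:.0f}%")
print(f"successful searches with a misspelling: {successful_pct:.0f}%")
```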
Other search characteristics did not have statistically significant impacts on whether searches were successful, although this may have been due to small sample sizes. For example, the search engines were not significantly different in their percentages of successful searches. Similarly, the average number of words per search string was not significantly related to search success rate. (Data not shown.)
Certain common behaviors of the adolescent searchers were observed which were not apparent from the quantitative analysis.
First, the students were very comfortable and confident while searching online for health information. Most students knew where they wanted to start the search and navigated using quick mouse clicks and shortcut keys. However, this characteristic was likely over-represented in our population due to their strong academic performance and Internet proficiency.
Second, several searchers did not take much time in formulating a search strategy or (when applicable) choosing search terms. Instead, these searchers seemed to type in the first search string that came to mind. If the results were not what was anticipated, another search string was typed in, sometimes without even clicking on any results from the first search string. The overall approach was a trial-and-error method with frequent backtracking. The most common problem with search strings was that they were not specific enough. For example, 2 different students typed in the search string "hiv" when looking for a place that administers free and confidential HIV tests.
Third, most students quickly scanned pages, jumping from place to place within a page, rarely reading an entire paragraph. In some cases the answer to a question was contained on a page, but the student left before finding it. In other cases a link that would have led to the answer was missed. This finding supports prior research on adolescent search behavior related to nonhealth topics [7].
Fourth, students mentioned that they purposefully avoided sponsored links and advertisements, despite the fact that many of the search engines present these results first. The quantitative data confirmed this practice, as only 1 sponsored link was ever selected.
Finally, little to no attention was paid to the source of the answer. In the vast majority of cases, once an answer was located, it was simply assumed to be correct.