We identified forums by searching with the term “weight loss forum” on March 22, 2006 using Google (http://www.google.com), the most popular search engine in the US.13
This approach identified forums likely to be found by patients, rather than a random sample of forums. From the first 50 search results, we selected freely-accessible English-language websites containing weight loss forums pertaining to general weight loss or weight loss modalities of diet, exercise, medications, or surgery. The forums had to have at least ten initial messages in a specific month in early 2006, the time period to which we restricted our study (exact month withheld to preserve confidentiality). An initial message is a message which starts a discussion thread.
The units of analysis were requests for and provisions of weight loss advice. Requests for advice were included if they could be addressed by weight loss guidelines from the National Heart, Lung, and Blood Institute2 or the American College of Physicians3. We included questions on how to start losing weight (“Need help getting started!”), choose weight loss modalities (“Should I take a diet pill?”), and perform weight loss modalities (“How much should I exercise?”). Requests for advice were excluded if they fell outside the realm of the guidelines (“Do oranges have more calories than apples?”) or if participants stated they were younger than 18 years old.
To assess inter-observer reliability of dichotomous outcomes, we computed the Prevalence and Bias Adjusted Kappa (PABAK), which adjusts for rater bias, expressed as the Bias Index (BI), and for the relative probabilities of “yes” and “no” responses, expressed as the Prevalence Index (PI). We used PABAK to adjust for the low prevalence of some responses. Like kappa, PABAK=0 represents chance agreement, while PABAK=1 implies maximum agreement. BI and PI values range from 0 to 1 in absolute value, with increasing BI indicating increasing rater bias and increasing PI indicating increasingly unequal prevalence of “yes” and “no” responses.14,15
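The reliability indices described above can be sketched as follows for a 2×2 table of two raters' yes/no ratings. This is a minimal illustration that assumes the standard definitions (PABAK = 2 × observed agreement − 1, PI = (a − d)/n, BI = (b − c)/n); the function name and cell labels are hypothetical and are not taken from the study's analysis files.

```python
def agreement_stats(a, b, c, d):
    """Agreement indices for two raters on a dichotomous outcome.

    Cell labels (illustrative):
      a = both raters "yes"
      b = rater 1 "yes", rater 2 "no"
      c = rater 1 "no",  rater 2 "yes"
      d = both raters "no"
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")

    observed = (a + d) / n  # proportion of observed agreement

    # Chance agreement expected from each rater's marginal "yes" rate.
    yes1 = (a + b) / n
    yes2 = (a + c) / n
    expected = yes1 * yes2 + (1 - yes1) * (1 - yes2)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return {
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
        "pabak": 2 * observed - 1,  # kappa with prevalence and bias removed
        "pi": (a - d) / n,          # Prevalence Index
        "bi": (b - c) / n,          # Bias Index
    }


# Hypothetical counts, for illustration only.
stats = agreement_stats(a=40, b=5, c=5, d=50)
```

Because PABAK depends only on observed agreement, an agreement of 85.4% corresponds to PABAK = 2 × 0.854 − 1 ≈ 0.71, whereas kappa can be much lower when one response is rare.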
K.H. selected requests for advice which could be addressed by the guidelines. To evaluate the reliability of this process, a random sample of 12% of all initial messages, stratified by forum, was independently reviewed by K.F. Agreement was 85.4%, kappa 0.41, and PABAK 0.71 (PI = − 0.72 and BI = 0.11), indicating very good agreement after adjusting for prevalence and bias. K.H. determined whether advice was in response to questions about diet, exercise, medications, surgery, or general weight loss (nonspecific question or multiple weight loss modalities).
Provisions of advice in reply messages were evaluated for congruence with weight loss guidelines, and categorized as supported by guidelines (“accurate”), contradictory of guidelines (“erroneous”), or not addressed by guidelines. If any part of the advice was erroneous, the advice was erroneous. If none of the parts were addressed by guidelines, the advice was considered not addressed. If all parts were accurate, or if some parts were accurate while other parts were not addressed, the advice was accurate. Advice was further categorized as potentially harmful (“harmful”) or not likely harmful (“not harmful”). In the absence of a gold standard for evaluating the potential harmfulness of advice or information, we used the clinical judgment of two independent reviewers. Advice was considered potentially harmful if the requestor of advice would likely come to harm by following that advice. The reviewers determined whether erroneous or harmful advice was corrected in a subsequent post within the same thread.
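The rule for combining part-level judgments into a single rating for a provision of advice can be written compactly. The sketch below assumes each part of the advice has already been labeled by a reviewer; the function name and label strings are illustrative choices, not part of the study's coding instrument.

```python
VALID_LABELS = {"accurate", "erroneous", "not addressed"}


def classify_advice(part_labels):
    """Combine per-part labels into one rating for a provision of advice.

    Rules, in order of precedence:
      1. Any erroneous part makes the advice erroneous.
      2. If every part is not addressed by guidelines, the advice is
         not addressed.
      3. Otherwise (all accurate, or a mix of accurate and not addressed),
         the advice is accurate.
    """
    labels = list(part_labels)
    if not labels:
        raise ValueError("advice must have at least one part")
    unknown = set(labels) - VALID_LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")

    if "erroneous" in labels:
        return "erroneous"
    if all(label == "not addressed" for label in labels):
        return "not addressed"
    return "accurate"
```

For example, `classify_advice(["accurate", "not addressed"])` yields `"accurate"`, while adding a single `"erroneous"` part changes the rating to `"erroneous"`.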
K.H. and K.F. independently assessed the advice in two separate stages to minimize the influence of correction of advice when evaluating whether the advice was concordant with the guidelines. In the first stage, advice was divided into the three categories (“accurate,” “not addressed,” or “erroneous”), with agreement 83% and kappa 0.58, indicating moderate agreement. When advice was categorized as erroneous or not erroneous, agreement was 92%, kappa 0.47, and PABAK 0.83 (PI = 0.83 and BI = 0.003), indicating very good agreement after adjusting for prevalence and bias. Advice was categorized as harmful or not harmful during this first stage. Percent agreement was 94%, kappa 0.51, and PABAK 0.89 (PI = 0.88 and BI = − 0.002). In the second stage, the reviewers determined whether erroneous or harmful advice was corrected in a subsequent post. For correction of erroneous advice, agreement was 80%, kappa 0.57, and PABAK 0.61 (PI = − 0.30 and BI = − 0.02), indicating good agreement. For correction of harmful advice, agreement was 88%, kappa 0.72, and PABAK 0.77 (PI = − 0.42 and BI = − 0.02). Disagreements were resolved by consensus.
We performed chi square (χ2) tests of independence to examine the relationship between the topic of advice (diet, exercise, medications, surgery, or general) and occurrence of erroneous advice or harmful advice. The reference category was general advice.
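As an illustration of the test statistic, the sketch below computes the Pearson χ2 statistic and its degrees of freedom for a contingency table in plain Python. The counts are hypothetical and do not come from the study; the function name is likewise an assumption, and the P value (which would require the χ2 distribution from a statistics package) is not computed here.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts, e.g. rows = advice
    topic, columns = erroneous / not erroneous. Every row and column
    total must be positive so that all expected counts are nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if min(row_totals) <= 0 or min(col_totals) <= 0:
        raise ValueError("every row and column total must be positive")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical 2x2 comparison of one topic against the reference category:
# rows = [diet, general], columns = [erroneous, not erroneous].
statistic, dof = chi_square_independence([[10, 20], [20, 10]])
```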
We examined relationships between the dependent variables of erroneous advice, harmful advice, correction of erroneous advice, and correction of harmful advice and the forum-specific independent variables of forum activity, presence of moderators as participants, and age of forum. Independent variables were dichotomized. Two forums containing 56.4% (16731 of 29684) of all messages were high-activity forums. The remaining forums were low-activity forums. Since these two high-activity forums were from the SparkPeople website, analysis of relationships between activity and dependent variables may have been confounded by website-specific characteristics. Therefore, we employed the same sequence of logistic regression analyses using an alternative cutoff point between high- and low-activity forums. This alternative cutoff point was 1000 total messages per month, since it lay within a large gap dividing lower and higher activity forums. Six forums from five websites were categorized as high-activity with this cutoff, accounting for 87.9% (26104 of 29684) of all messages.
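The dichotomization by activity and the reported message shares can be reproduced with a few lines of arithmetic. This sketch uses only the totals stated above; the function names are hypothetical, and treating a forum with exactly 1000 messages as high-activity is an assumption, since the cutoff lay within a gap and no such forum is described.

```python
def activity_level(messages_per_month, cutoff=1000):
    """Label a forum as high- or low-activity using the alternative cutoff.

    Assumption: a forum exactly at the cutoff counts as high-activity.
    """
    return "high" if messages_per_month >= cutoff else "low"


def share_of_messages(subset_total, overall_total):
    """Percentage of all messages contributed by a subset of forums."""
    return round(100 * subset_total / overall_total, 1)


TOTAL_MESSAGES = 29684

# Two high-activity forums under the original categorization.
original_share = share_of_messages(16731, TOTAL_MESSAGES)

# Six high-activity forums under the 1000-messages-per-month cutoff.
alternative_share = share_of_messages(26104, TOTAL_MESSAGES)
```

With the reported totals, `original_share` evaluates to 56.4 and `alternative_share` to 87.9, matching the percentages in the text.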
For moderators as participants, forums were categorized as with or without moderators who can post messages. For age, forums were categorized as starting before February 2004 or during February 2004 and later, with data available for 16 of 18 forums.
We performed standard logistic regression analysis for each dependent variable, followed by hierarchical regression analyses if the omnibus χ2 from the standard logistic regression analysis was significant.16 When logistic regressions with all three predictors did not converge, we used relevant combinations of two predictors. We entered in the final logistic regression model the independent variables which were consistently significantly related to the dependent variable. Logistic regressions for erroneous and harmful advice were performed with the full data set. Logistic regressions for correction of erroneous and harmful advice were limited to forums with erroneous or harmful advice. Statistical analyses were performed with SPSS 14.0 (SPSS Inc., Chicago, IL). Level of significance was set at α < 0.05.
The protocol was deemed exempt by the Institutional Review Board at the University of Texas Health Science Center at Houston.