Evaluate patient opinions on acceptable risks in exchange for a given degree of weight loss and their implications for sample size determination in obesity randomized clinical trials (RCTs).
Survey of patients entering RCTs for weight loss in a university based clinical research setting and power calculations based on their responses.
Men (N= 8) and women (N= 66) between 24 and 73 years of age with body mass indices (BMIs; kg/m2) ranging from 26.8 to 40.5.
Survey responses to questions assessing the added risk of serious adverse events (SAEs) or death one is willing to assume for a given degree of weight loss.
For 5% and 10% weight loss against risk for death per se, the mean acceptable risk tended to be around 3.5%, but the median (0.00) and mode (0.00) suggested that for most individuals, only a risk of ≤ 1% would be acceptable. These figures, together with estimated dropout rates and base rates of SAEs (including deaths) from recent obesity trials, indicate that 1-year 2-group obesity RCTs would need tens of thousands of participants per group to have 80% power to detect risks that are meaningful to patients at the 2-tailed 0.05 α level.
Patient education is needed to explain which risks are realistically detectable in RCTs so that patients may provide truly informed consent or RCT standards should be modified to meet patients’ implicit expectations.
Obesity is associated with reduced lifespan1 and many aspects of poor health,2 and it is increasingly prevalent.3 Current non-surgical treatments are only modestly effective.4 New treatments, particularly pharmaceuticals, are therefore actively being studied in randomized clinical trials (RCTs).5–12 On September 8th, 2004, the US Food and Drug Administration (FDA) held a special Advisory Board Meeting of the Endocrinologic and Metabolic Drugs Advisory Committee.13 The purpose was to reconsider the FDA’s procedures for reviewing new drug applications for obesity treatment and to provide FDA with advice on how to enhance this process, including what to recommend to companies when designing trials.
One question that was extensively discussed, with no clear consensus being reached, was how many participants should be included in weight loss trials. The discussion centered around safety assessment because determining sample size to evaluate efficacy is much more straightforward and generally requires fewer participants than does a rigorous assessment of risk of serious adverse events (SAEs) that are likely to be relatively rare. The expert panel was only able to offer personal subjective views on what risk they wanted obesity RCTs to be able to detect (see Table 1).
A lack of either empirical evidence or a shared a priori understanding of what degree of increased risk weight loss trials should be designed to detect or rule out appeared to contribute to the inability to near a consensus on sample size recommendations. To our knowledge, no empirical data directly address this question and none were presented at the Sept 8th hearing. FDA’s current draft (Feb 2007) guidance document recommends that approximately 3,000 subjects be randomized to active doses of the investigational drug and at least 1,500 to placebo for a minimum of one year.14 FDA estimated that this sample size would provide 80% power to detect, with 95% confidence, an approximately 50% increase in the incidence of an adverse effect that occurs at a rate of 3% in the placebo group (i.e., 4.5% vs 3%). For adverse events that are significant and/or serious, but occur at rates below 3%, this sample size may not be adequate.
In June 2007, FDA held a meeting seeking the advice of its Endocrinologic and Metabolic Drugs Advisory Committee regarding approval of rimonabant, the first in a new class of drugs that block the CB1 cannabinoid receptor, for treatment of obesity.15 An analysis of 74 cases of suicidality in a sample of approximately fifteen thousand subjects suggested that patients exposed to rimonabant (20 mg) had a higher incidence of suicidality with an odds ratio of 1.9 (1.1, 3.1); when the analysis was limited to obesity trials only, the odds ratio was 1.8 (0.8, 3.8). Although the overall risk was very small, given the lack of demonstrated benefits of weight loss drugs on hard outcomes such as death and serious cardiac events, there was a heightened concern about higher risk of psychiatric adverse events, particularly depression and suicidality. The advisory committee voted unanimously to recommend that rimonabant not be approved at that time.16 More recently, on July 10, 2008, FDA held a meeting with its advisory committees17, seeking advice regarding the risk of suicidality with antiepileptic drugs (AEDs), some of which are indicated to treat not only epilepsy, but also migraine and psychiatric disorders. The FDA analysis suggested that 2.1 per 1000 (95% CI: 0.7, 4.2) more patients treated with AEDs experienced suicidal behavior or ideation than patients given placebo.18 Interestingly, the epilepsy patients treated with AEDs had the largest estimated odds ratio (3.5 [95% CI: 1.2, 12.1]) compared to those given these drugs for a psychiatric indication (1.5 [95% CI: 0.9, 2.4]). Nevertheless, the committee recommended that physicians and consumers be educated about the risk through a bolded warning, but not a black box (highest category warning). These two examples illustrate the thinking that a certain degree of risk would be acceptable or tolerated, based on the degree of expected benefit and the severity of the disease state for which the treatment is indicated.
In the case of rimonabant, published papers of specific obesity trials failed to clearly identify the psychiatric risks. A Cochrane meta-analysis of the 4 published obesity trials with this drug did not identify the higher risk of psychiatric adverse events.19 In a subsequently published meta-analysis, Christensen et al. reported that patients given rimonabant were 2.5 times more likely to discontinue the treatment because of depressive mood disorders relative to those given placebo (OR=2.5; p=.01; number needed to harm=49 [19−316]).20 According to data presented at the June 2007 FDA advisory committee meeting, three cases of completed suicide were identified in the entire rimonabant clinical trial database of more than twenty thousand patients, with all deaths occurring among those receiving rimonabant, and none on placebo. This case further exemplified the need for studying larger populations to identify extremely rare risks that might not be acceptable for the given benefit.
Many factors and perspectives could be considered in determining sample size, but a critical one is clearly the maximum degree of increased risk that would be considered tolerable by a consumer/patient. We therefore collected preliminary data on the degree of increased risk of SAEs that the consumer finds tolerable and, from that, calculated how many patients would be needed in RCTs to detect such risks.
The sample consisted of 74 adults entering obesity RCTs in 2005 and 2006. Their educational levels were: high school (23), College (26), Masters (22), and Doctorate (3). Mean (SD) of BMI for men and women, respectively were 34.8 (2.6) and 33.6 (3.6). Mean (SD) of age for men and women, respectively were 46.4 (14.0) and 48.9 (11.4). All participants provided written informed consent prior to participation. The study was approved by the institutional review board of the University of Pennsylvania.
Participants completed a “risk tolerance” survey that required them to rate how much of an increased risk of (1) an SAE and (2) death they would be willing to tolerate in order to lose 5%, 10%, 20%, 30%, 40%, and 50%, respectively, of their current body weight. The results presented in this manuscript involve only the 5% and 10% weight losses, the magnitudes of weight loss inducible by currently available and investigational anti-obesity drugs. For each participant, the percentage of weight loss was also expressed in pounds, individualized for that person, in order to make the imagined weight loss more meaningful; obese patients more typically enter weight loss programs stating their weight loss goals in terms of pounds rather than percentages. Further, to help respondents have a common understanding of increasing risk, they were provided the following examples: 0.01% risk = the probability of getting 13 heads if you flip a fair coin 13 times; 0.10% risk = the probability of getting 10 heads if you flip a fair coin 10 times; 1.0% risk = the probability of getting 7 heads if you flip a fair coin 7 times; 10.0% risk = the probability of getting 5 or more heads if you flip a fair coin 6 times; 20.0% risk = the probability of getting 4 or more heads if you flip a fair coin 5 times; and 50.0% risk = the probability of getting 1 head if you flip a fair coin 1 time.
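The coin-flip anchors are rounded versions of the stated percentages. A minimal Python sketch, assuming fair coins and independent flips (the helper name `prob_at_least_heads` is illustrative and does not come from the survey), computes the exact binomial probability behind each anchor:

```python
from math import comb


def prob_at_least_heads(k: int, n: int) -> float:
    """Probability of k or more heads in n independent flips of a fair coin."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


# (stated risk in %, minimum number of heads, number of flips), as listed in the survey text
anchors = [
    (0.01, 13, 13),
    (0.10, 10, 10),
    (1.0, 7, 7),
    (10.0, 5, 6),
    (20.0, 4, 5),
    (50.0, 1, 1),
]

for stated, k, n in anchors:
    exact = 100 * prob_at_least_heads(k, n)
    print(f"stated {stated}%  exact {exact:.3f}%  ({k} or more heads in {n} flips)")
```

Under these assumptions the exact probabilities are 1/8192, 1/1024, 1/128, 7/64, 6/32, and 1/2, so the anchors approximate the stated risks rather than equal them (for example, 7 heads in 7 flips is about 0.78%, not 1.0%).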
For each imagined weight loss level, participants circled the increased risks of (1) an SAE and (2) death that they would be willing to tolerate. Responses were reported on a visual analogue scale that ranged from 0% to 50% increased risk (see Appendix A for Instructions and the Visual Analogue Scale). The definition of SAE provided to participants was based on the definition used by the FDA and delineated in the US Code of Federal Regulations.
We calculated descriptive statistics (means, medians, mode and standard deviations) for the risk of SAE or death for each percent weight loss. Risks were also calculated by sex to find differences that may exist between males and females. The baseline descriptive statistics were calculated based on a sample size of 74 people. The sample size decreases by 3 in the rest of the analyses due to missing data. For the sensitivity analysis, we removed any observations that did not have a monotonically increasing tolerable risk as percent weight loss increased. This was done to eliminate participants who might not have fully understood the questionnaire. This reduced the sample size to 48 for death risk and 34 for SAE risk which we did not analyze separately by sex due to the small sample size.
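The sensitivity-analysis filter can be sketched in Python as follows. The responses below are hypothetical, not study data, and the text does not say whether ties between adjacent weight-loss levels were allowed; this sketch assumes they were (a non-strict, "never decreasing" reading).

```python
def is_non_decreasing(risks):
    """True if the tolerated risk never drops as the weight-loss level rises."""
    return all(a <= b for a, b in zip(risks, risks[1:]))


# Hypothetical tolerated risks (%) at 5, 10, 20, 30, 40, and 50% weight loss
responses = {
    "A": [0.0, 0.0, 0.1, 0.1, 1.0, 1.0],  # kept: risk never decreases
    "B": [5.0, 1.0, 1.0, 0.1, 0.1, 0.0],  # dropped: risk falls as weight loss grows
    "C": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # kept under the non-strict assumption
}

# Keep only participants whose responses pass the monotonicity check
kept = {pid: r for pid, r in responses.items() if is_non_decreasing(r)}
print(sorted(kept))
```

A strict reading (each level must exceed the previous one) would replace `a <= b` with `a < b` and would also exclude participants who answered 0% throughout.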
We calculated sample size for a parallel group RCT with equal allocation to each of two groups (treatment and control), duration of 1 year, an exponential distribution of drop-out with 50% dropping at 1 year, and a 2-tailed α level of 0.05. In considering drop-out rates, we noted that “A major obstacle to the evaluation of the clinical trials (for obesity) is the potential bias resulting from low study completion rates. Completion rates varied from 52.8% of phentermine recipients in a 9-month study, to 40% of fenfluramine recipients in a 24-week comparative study with phentermine and 18% of amfepramone recipients in a 24-week study. One-year completion rates range from 51% to 73% for sibutramine and from 66% to 85% for orlistat”.21 In a recent trial of rimonabant, drop-out rates were roughly 50% at 1 year.5
For sample size calculations, we provide the number needed to be treated under the assumption that with respect to SAEs, those completing the study are similar to those dropping out, so called non-informative censoring.
Sample sizes were calculated assuming a two-sided Type I error of 0.05 and power of 80%, using the following formula22:
$$n = \frac{\left[ z_{1-\alpha/2}\sqrt{2\bar{p}(1-\bar{p})} + z_{1-\beta}\sqrt{p_1(1-p_1) + p_2(1-p_2)} \right]^2}{(p_1 - p_2)^2}$$
where n is the required sample size per group, z denotes a standard normal quantile, α is the type I error rate, β is the type II error rate, p1 is the population proportion in the placebo group, p2 is the population proportion in the intervention group, p̄ = ½(p1+p2), and p̄ = p1 = p2 under H0. The required sample size, n, was then doubled to account for a 50% drop-out rate.
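A minimal Python sketch of this kind of calculation is given here. It assumes the standard normal-approximation formula for comparing two proportions, built from the quantities defined above (α, β, p1, p2, and the pooled proportion). The function names `n_per_group` and `n_with_dropout` are illustrative and are not taken from the cited reference.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group sample size for a two-sided test comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # quantile for the two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # quantile for power = 1 - beta
    p_bar = (p1 + p2) / 2                          # pooled proportion under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def n_with_dropout(p1: float, p2: float, dropout: float = 0.5, **kwargs) -> int:
    """Inflate the per-group size by 1 / (1 - dropout); 50% drop-out doubles it."""
    return ceil(n_per_group(p1, p2, **kwargs) / (1 - dropout))


# Placebo SAE rate of 4.2% versus 5.9% on treatment, no drop-out adjustment
print(n_per_group(0.042, 0.059))
# Control rate of 2% versus 2.5% on treatment, doubled for 50% drop-out
print(n_with_dropout(0.02, 0.025))
```

Under these assumptions, `n_per_group(0.042, 0.059)` returns 2604, which agrees with the per-group figure the article reports for the rimonabant SAE comparison, and `n_with_dropout(0.02, 0.025)` returns a little over 27,000 per group.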
Participants (male and female combined) were more willing to risk an SAE than death for both 5% and 10% weight loss (Table 2). On average, males were willing to take a greater risk of death than females for both 5% and 10% weight loss (Table 3). Males were willing to take a greater risk of SAEs than females at 5% weight loss while females were willing to take a slightly higher risk of SAEs than males at 10% weight loss (Table 3). For weight losses of 5% and 10%, females were willing to take a greater risk of SAEs than death. Similarly, males were willing to take greater risk of SAEs than death at 5% weight loss; however, males were willing to take greater risk of death than SAEs for a 10% weight loss. Somewhat paradoxically, the mean self-assessed tolerable risks for death and SAEs sometimes decreased as the percent weight loss increased. On the other hand, the medians provided a much more stable picture of the ‘majority’ view than did the means. The medians did not exhibit the seemingly-paradoxical decrease in tolerable risk for larger weight losses (Table 3). Because the decreasing trend is counterintuitive, we analyzed the subset of our 71 participants whose tolerable risk of death or SAE consistently increased with increasing weight loss. There were 48 people who consistently said they would take a greater risk of death for a greater weight loss, while 34 people consistently said they would take a greater risk of SAE for a greater weight loss. In this subset analysis (see Table 4), the mean tolerable risk of death drops sharply to 0.23% for a 5% weight loss, while the mean tolerable risk of SAE drops to about 0.68%.
The medians were used in selecting risk inputs for the power analysis because the power analyses were designed to reflect the number of participants needed to satisfy the majority of the participants, not the average of the majority and a minority willing to accept very high risks. The sample size per group needed to detect an increase in the SAE or death rate over the control rate (doubled to account for a 50% dropout rate) with 80% power and a 2-tailed test was calculated for the maximum absolute increase in risk that participants would find tolerable (see Table 5). As can be seen, even if the control SAE rate is as small as 0.02, detecting an increase of only ½% (actually larger than the medians reported as tolerable by our participants) would require over 27,000 participants per group.
The present study had several strengths, the first of which was the sample on which data were collected. Participants were actual obese patients enrolled in a university-based weight loss program and, therefore, were very motivated to lose weight. This is the very group of individuals for whom questions of risk tolerance are the most pertinent and the population to whom results would be generalized. Second, the questionnaire asked risk tolerance levels for mortality and SAEs, rather than focusing on just one. Because weight loss treatments could have differential effects on mortality and occurrence of SAEs, it is valuable to have separate ratings of risk tolerance for both. Third, because no prior study has addressed this question, these findings provide the first data to address a topic of major public health significance especially given the interest in the development of new anti-obesity agents. Also, our results are consonant with those of a similar recent study23 that found that most patients undergoing obesity surgery were willing to assume only very small risks of death, with probabilities judged to be tolerable of a similar order of magnitude to those selected by our participants.
In the article by Christensen et al., 20 serious adverse events occurred more frequently in the rimonabant group (5.9%) compared to the placebo groups (4.2%). This was a significant increase in the odds of a serious adverse event = 1.4 (p=0.03). This is about the magnitude of increased risk that has been reported for increase in the odds of an MI with Vioxx.24 If we were to design a trial to study this risk prospectively, a two group chi-square test with a 0.05 two-sided significance level with 80% power to detect this difference between the rimonabant Group SAE proportion of 0.059 and the placebo Group SAE proportion of 0.042 (odds ratio of 1.43) would require a sample size in each group of 2604 or a total of 5208 patients with evaluable results unadjusted for dropouts or noncompliance.
Numbers needed to treat (NNT) are useful quantities that give clinicians some idea of the importance of differences in outcomes that can be attributed to an improvement in therapy. The number needed to treat is the number of patients who would have to undergo the therapy to yield one additional person with a positive benefit. Similarly, the number needed to harm (NNH) is the number of patients treated that would yield one additional person with a serious adverse event over the comparator treatment. When different treatments are compared, the NNTs may not be comparable. NNT is computed by taking the inverse of the absolute risk reduction, that is, the inverse of the absolute difference in the proportion of events. In the meta-analysis by Christensen et al., the NNH would be 1/(.059−.042) = 1/(.017) = 58.8 or 59. That is, for every 59 patients treated with rimonabant, 1 additional SAE would be expected compared with the same number of patients given placebo.
It should be noted that NNTs and NNHs are useful for comparisons when the trials are very similar, but can be misleading when the event rates are very different, such as can happen when very different patient populations are studied. For example, if the SAE rates in a very healthy population were 0.016 versus 0.008 for rimonabant versus control, the relative risk would be 2 and the NNH would be 1/(0.016−0.008) = 1/0.008 = 125. If the rates were from a population of sicker patients, such that the SAE rates were 0.06 and 0.03 respectively, the NNH would be 1/(0.06−0.03) = 1/0.03 = 33.3 or 34. Thus, while the relative risk is 2 in both cases in this example, in the sicker population only about a fourth as many patients need to be treated before we would expect one additional case. The NNT and NNH are therefore solely functions of the absolute difference between the rates and have little relationship to the relative risk reduction or relative increase in risk.
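The arithmetic in the two preceding paragraphs can be reproduced with a short Python sketch. The function names are illustrative, and the rates are the ones quoted in the text; no other data are assumed.

```python
def nnh(p_treated: float, p_control: float) -> float:
    """Number needed to harm: inverse of the absolute risk increase."""
    return 1.0 / (p_treated - p_control)


def relative_risk(p_treated: float, p_control: float) -> float:
    """Ratio of the event rate on treatment to the event rate on control."""
    return p_treated / p_control


def odds_ratio(p_treated: float, p_control: float) -> float:
    """Ratio of the odds of an event on treatment to the odds on control."""
    return (p_treated / (1 - p_treated)) / (p_control / (1 - p_control))


# (label, SAE rate on treatment, SAE rate on control), as quoted in the text
scenarios = [
    ("Christensen et al. meta-analysis", 0.059, 0.042),
    ("healthy population example", 0.016, 0.008),
    ("sicker population example", 0.06, 0.03),
]

for label, pt, pc in scenarios:
    print(
        f"{label}: NNH = {nnh(pt, pc):.1f}, "
        f"RR = {relative_risk(pt, pc):.2f}, OR = {odds_ratio(pt, pc):.2f}"
    )
```

The two hypothetical populations share a relative risk of 2 but have very different NNHs, which is the point made above: NNH tracks the absolute difference in rates, not the relative one.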
A limitation of the study includes the use of a sample that was from a single site from the northeast. Thus, generalizability of results needs to be established in larger samples from other geographic regions. We also note that the proportion of men in our sample was low. That being said, data show that most (roughly 80%) individuals who participate in obesity treatment trials are women. For example, Brennan et al.25 reviewed 5 studies with 73–84% female in RCTs testing Sibutramine, Muls et al.26 had 80% females, and McMillan-Price et al.27 conducted a dietary RCT for weight loss with 75% females. Thus, while our sample may not be representative of the general population or even the obese population overall in terms of gender, it is representative of the population to which we wish to generalize, namely obese individuals who enter RCTs. Hence the low proportion of men in the study may be considered normative. Given that this was the first study to address the problem of obese patients’ perception of risk and how much weight they might want to lose considering the risk associated with the weight loss, this seems an appropriate sample. Future research should study larger samples of men to obtain better estimates for men. Finally, our findings may not pertain to perceptions of tolerable risks for childhood obesity treatments, which may be even lower for many individuals.
A second concern is that some participants gave responses that seemed counterintuitive to the investigators, which suggests either that our questionnaire may have tapped constructs other than pure risk tolerance or that our preconceived notions of risk tolerance do not match the risk perceptions of many people. One study indicated that participants may misunderstand risks due to confusion with probabilities and reported that subjects actually perceived a risk of 1/200 as more likely than one of 1/100.28 This suggests the possibility that many subjects simply do not understand risk quantification well.
Alternatively, these seemingly counterintuitive results may be valid indicators of subjects’ perceptions. Clinical staff had communicated to subjects that they should only expect modest weight losses during the RCTs. Therefore, the 5 and 10% amounts of weight loss are reasonable amounts to focus on for this study group. Others have stated that the 5 and 10% weight losses are reasonable targets when measuring the effects of anti-obesity pharmaceuticals.21 Our subjects were, for the most part, below BMI 40 which is a commonly used cut-off for bariatric surgery.29 Hence, to expect this study group to anticipate and realistically entertain a 20–50% weight loss may not have been reasonable. Indeed, Fabricatore et al.30 recently reported that “Respondents' weight loss expectations for their upcoming attempt (8.0% reduction in initial weight) were significantly more modest than their goals for that attempt (16.8%), and smaller than the losses that they expected (12.0%), and achieved (8.9%) in their most recent past attempt… Results suggest that overweight and obese individuals can select realistic weight loss expectations that are more modest than their ideal goals.” Thus, while obese people may desire large weight losses, in most cases, they do not realistically expect them. With this in mind, the initially counterintuitive results may make sense from a perspective of how individuals determine how much they would be willing to pay for a benefit like weight loss (i.e., an inquiry into subjective valuation as practiced by economists). Individuals may make this type of determination based on the desirability of the benefit as well as the plausibility of its attainment. Although our questionnaire implicitly asked subjects to assume that each weight loss level was attainable, subjects may have been unable or unwilling to engage in such a suspension of disbelief. In turn, this may have caused them to discount the value of the larger weight losses. 
With these large weight losses discounted, it is then rational to be willing to pay much less (i.e., assume less risk) for the weight losses. In other words, the subjects may have truly been willing to pay less risk to lose more weight because they did not think losing large amounts of weight was realistic.
The implications of these results are clear. They suggest that the maximum added risks that most participants are willing to tolerate in exchange for realistic weight losses are far smaller than the added risks that can be detected in most extant obesity RCTs. This suggests that either patients do not have realistic expectations regarding the safety assessments that current obesity RCTs can offer, or patients are not really expecting obesity RCTs to rule out risks they find excessive. If the latter, then there may be no need for action. However, if the former is true, it may suggest that patients do not have a good understanding of risk and what current obesity RCTs are capable of doing and therefore may not be providing truly informed consent. The possibility that patients do not have realistic expectations or a clear understanding of risks and the safety assessments that current obesity RCTs can offer is made all the more plausible by the odd pattern of results obtained for some participants when they stated that they were counterintuitively willing to assume larger risks in exchange for smaller weight losses. If this is true, a better education campaign for participants in RCTs involving investigational drugs and perhaps patients taking marketed drugs may be needed. Subsequently, participants may need to be willing to accept greater uncertainty once risks they judge to be meaningful have been ruled out or RCT sizes may need to be increased. Finally, future research should attempt to develop better ways of asking patients in obesity RCTs about the risks they find tolerable in order to replicate and extend this research.
Financial support for this study was provided in part by NIH grants T32HL007457 and P30DK056336 and by the Almond Board of California. The funding agreement ensured the authors' independence in designing the study, interpreting the data, writing, and publishing the report.
This questionnaire asks about the maximum increase in risk of death that you would consider tolerable to lose weight. We ask this question to better understand the magnitude of risks that people wishing to lose weight are willing to expose themselves to in order to achieve that weight loss. This questionnaire addresses this by asking you how much of an increased risk you would be willing to tolerate to lose six different amounts of body weight – these amounts are specified below. For each amount, you will be asked to circle the increased risk of death that you would tolerate on a response line that looks like a ruler. Higher scores represent a higher increased risk that you would tolerate. We say “increased risk” because there is always a risk of death for anyone living, and so we are interested in knowing about how much of an increased or additional risk you would tolerate to lose weight. Although these are hypothetical or imaginary situations, please consider thoughtfully.
This questionnaire asks about the maximum increase in risk of a serious adverse event that you would consider tolerable to lose weight. We ask this question to better understand the magnitude of risks that people wishing to lose weight are willing to expose themselves to in order to achieve that weight loss. This questionnaire addresses this by asking you how much of an increased risk you would be willing to tolerate to lose six different amounts of body weight – these amounts are specified below. For each amount, you will be asked to circle the increased risk of a serious adverse event that you would tolerate on a response line that looks like a ruler. Higher scores represent a higher increased risk that you would tolerate. We say “increased risk” because there is always a risk of a serious adverse event for anyone living, and so we are interested in knowing about how much of an increased or additional risk you would tolerate to lose weight. Although these are hypothetical or imaginary situations, please consider thoughtfully. [Definition of Serious Adverse Event: A Serious Adverse Event (other than death) includes any of the following outcomes: a sickness or illness, being in the hospital overnight or something that causes you to have to stay in the hospital longer than expected, a permanent or major disability, or an inherited birth defect or events that require medical or surgical treatment to stop one of these listed in this definition.]
The authors report no conflicts of interest with the funders of the study.