Understanding patient-specific differences in risk tolerance can assist in making difficult regulatory and clinical decisions about new treatments that offer the potential for greater effectiveness in relieving disease symptoms but also carry risks of disabling or fatal side effects. The aim of this study is to elicit benefit-risk trade-off preferences for hypothetical treatments with varying efficacy and risk levels using a stated-choice (SC) survey. We derive estimates of “maximum acceptable risk” (MAR) that can help decisionmakers identify welfare-enhancing alternatives. In the case of children, parent caregivers are responsible for treatment decisions, and their risk tolerance may be quite different from adult patients' own tolerance for treatment-related risks. We estimated and compared the willingness of Crohn's disease (CD) patients and parents of juvenile CD patients to accept serious adverse event (SAE) risks in exchange for symptom relief. The analyzed data were from 345 patients over the age of 18 and 150 parents of children under the age of 18. The estimation results provide strong evidence that adult patients and parents of juvenile patients are willing to accept tradeoffs between treatment efficacy and risks of SAEs. In exchange for improved treatment efficacy, parents of juvenile CD patients are about as risk tolerant for their children as adult CD patients are for themselves. SC surveys provide a systematic method for eliciting preferences for benefit-risk tradeoffs. Understanding patients' own risk perceptions and their willingness to accept risks in return for treatment benefits can help inform risk management decision making.
Economists generally advocate using benefit-cost analysis to improve the efficiency of resource allocation decisions. Benefit-cost analysis requires calculating monetary values for both costs and benefits of possible alternatives. Monetizing benefits can be controversial, particularly for nonmarket goods such as environmental and health services (Pacala & Socolow, 2004; MacKinnon, 1986; Kelman, 1981). While monetary values provide a convenient numeraire for evaluating welfare consequences of decisions, alternative numeraires may be more relevant and more acceptable in some institutional settings. We derive and estimate a willingness-to-pay (WTP) analog called “maximum acceptable risk” (MAR) for benefit-risk analysis that can help decisionmakers identify welfare-enhancing alternatives.
Physicians and regulators often are faced with weighing the benefits of medical interventions against the potential risks. These judgments may be informed by data on the incidence of favorable and unfavorable outcomes. Because outcomes are dissimilar, however, benefit-risk judgments inevitably require attaching subjective decision weights to heterogeneous outcomes. Regulatory authorities such as the Food and Drug Administration (FDA) employ ad hoc methods for weighing benefits against possible safety risks for new pharmaceuticals. Regulators use professional judgment to gauge the benefits of the treatment and the “acceptable” level of risk to patients. The regulatory process does not currently require any systematic assessment of the weights that patients themselves attach to benefits and risks. Regulators sometimes have approved relatively high-risk treatments for particularly serious conditions, such as HIV and cancer, but have been reluctant to accept such benefit-risk tradeoffs for other conditions (Mendeloff, 1995). FDA decisions also have resulted in several recent, controversial withdrawals of new treatments because of safety concerns (McClellan, 2007).
Evidence of patients' willingness to trade off risk for benefits collected using systematic methods could help in evaluating such regulatory tradeoffs. Although many patients appear to be willing to tolerate small risks of serious adverse health outcomes in exchange for health improvements, individual preferences regarding such tradeoffs have not been carefully quantified. Typically, decisionmakers consider anecdotal evidence of patient preferences presented at hearings or through interested third parties, if patient preferences are considered at all. However, patients have heterogeneous preferences over risks and benefits, and informed decisions require a better understanding of how such preferences differ with respect to individual characteristics and circumstances. One important issue, for both policy and economics, is how the preferences of individuals and caregivers differ. In this study, we focus on how benefit-risk preferences differ between adult patients and parents of juvenile patients. Understanding patient-specific differences in risk tolerance for new treatments that offer improved efficacy can assist in identifying appropriate treatments and in informing welfare-enhancing regulatory decisions.
In this article we elicit the preferences of two groups: adult patients and the parents of juvenile patients. The literature on valuing improvements in children's health approaches this issue from several angles. Typically, children do not have control over money or their activities, and economists are unsure about the ability of children to express valid preferences over topics like medical care and mortality risk. Because children's preferences cannot be modeled like adults', incorporating children into economic theory or benefit-cost models has always posed a challenge, especially in the area of estimating the monetary benefits of health improvements (Dickie & Gerking, 2003). Most studies designed to estimate the child's willingness to pay for improvements in health acknowledge the theoretical and practical difficulties of eliciting meaningful values from children and instead elicit the parent's willingness to pay. Often, the researchers are interested in identifying the ratio of willingness to pay for improving child versus parent health that could be used to scale estimates for other health effects where only adult data exist. Several studies in environmental and health economics have elicited parents' preferences for improving or protecting their children's health, and in some cases, these preferences have been compared to adults' preferences regarding their own health. Most of these studies have specifically measured parents' WTP to improve their children's health and do not include risks of serious side effects in the analysis (Chenevier & LeLorier, 2005; Dickie & Messman, 2004; Agee & Crocker, 2004; Dickie & Gerking, 2003; Sorum, 1999; Viscusi et al., 1991). Dickie and Messman (2004), for example, estimated marginal rates of substitution between a mild case of illness for a child and a parent and found that parents value improvements in their children's health about twice as highly as their own.
Dickie and Gerking (2003) investigated the relative value of reducing parents' own risk of skin cancer versus their children's risk of skin cancer from sun exposure using risk-risk tradeoffs. They found that parents are willing to accept about a 2.5% increase in risk of nonmelanoma skin cancer for themselves in return for lowering their children's risk by 1%.
The problem of measuring the value of health improvements for children also has been approached from the standpoint of intrahousehold allocations of resources. A large literature in both developed and developing country settings models the factors that affect household allocations. Dickie (2005) evaluated how family resource allocations affect children's health, and the value of children's health. He found that children whose parents invest in preventive and remedial medical care experience fewer days of illness, and estimated WTP to avoid one day of illness-induced school loss is about $100 to $150.
The implications of the WTP studies for parents' benefit-risk tradeoffs for their child's medical treatments are not clear. The previous studies elicit tradeoffs between efficacy and money, and tradeoffs between risks to parents and risks to children. However, none of these studies elicit tradeoffs between efficacy and risk. Although these studies provide evidence of differences between adults' preferences for their own health versus their children's health, they are less informative about differences in preferences regarding benefit-risk tradeoffs related to treatment choices. It is possible that parents are more cautious about risks to their children and are thus more risk averse on their children's behalf than on their own. On the other hand, placing a high value on improving their children's health could be consistent with both high WTP and high risk tolerance.
The appropriate regulatory and clinical balance between benefits and risks may vary among patients with different characteristics. Patients may vary both in the likelihood that they will benefit from treatment and the likelihood that they will be harmed by treatment. In addition, they may vary in their tolerance for treatment risks. It is possible that some patients with a relatively favorable balance between likely beneficial and likely harmful outcomes would nevertheless be too risk averse to accept even small treatment risks in return for significantly better efficacy. Thus, regulators may be interested both in evidence on the distribution of benefits and risks among groups of patients and in evidence on these groups' tolerance for bearing risk. Parents of juvenile patients may be particularly averse to treatment risks, and thus could represent a possible lower bound on risk tolerance for a particular condition.
In this study, we evaluate and compare benefit-risk preferences related to treating Crohn's disease (CD), a condition that causes chronic inflammation of the gastrointestinal tract. CD is often diagnosed at a relatively young age, and peak incidence is between 15 and 30 years of age (Hanauer et al., 1998). Many CD treatments carry risks of serious adverse events (SAEs). The SAE risks associated with some CD treatments include lymphoma; serious infections, such as tuberculosis and pneumonia; and progressive multifocal leukoencephalopathy (PML) (Sandborn et al., 2005; Colombel et al., 2004; Su et al., 2004; Yousry et al., 2006).
The objective of the study is to use stated-choice (SC) survey techniques, also referred to as conjoint-analysis or discrete choice experiments, to estimate and compare the MAR of SAEs that adult patients and parents of juvenile patients with CD are willing to accept in exchange for a specified level of relief from CD symptoms. As applied to health-care decision making, SC is a systematic, survey-based method of eliciting tradeoffs. It is based on the hedonic principle that medical interventions are composed of a set of features or attributes and that the utility obtained from a particular intervention is a function of these attributes. SC data yield quantitative estimates of the marginal rates of substitution among treatment attributes. Thus, it is possible to quantify mean MARs for different SAEs, different treatment benefits, and different populations. Our methods can assist regulatory authorities in identifying decisions that yield positive net benefits to identifiable patient subgroups such as children and adults.
In the next section, we describe our methodology and the estimation of mean MAR for each type of mortality risk. Section 2 describes the development of the survey instrument. Sections 3 and 4 present the empirical method and the results of the analysis, respectively. The final section presents a discussion of these results and the study.
The SC method has been widely used and validated in marketing research, transportation, and environmental economics (Cattin & Wittink, 1982; Louviere, 1988; Bateman et al., 2002; Bennett & Adamowicz, 2001). SC is increasingly used in health economics to elicit patients' and physicians' stated preferences for health-care interventions, treatment alternatives, and health-care services. (See Ryan and Gerard (2003) for a recent review of health applications.)
Subjects answer a series of trade-off questions that vary according to an experimental design with known statistical properties. The resulting data make it possible to estimate a utility function with treatment attribute levels as arguments (Louviere et al., 2000). The marginal utility estimates indicate perceived rates of substitution among treatment attributes. Moreover, SC utility estimates obtained from trade-off data can identify response nonlinearities and can incorporate both clinical and nonclinical factors that influence perceived utility. Finally, the resulting parameter estimates can provide estimates of risk-equivalent welfare measures analogous to income-equivalent welfare measures.
According to hedonic-utility principles, all commodities over which consumers make choices (including therapeutic treatment options) can be thought of as being composed of a set of attributes. When applied to evaluating health preferences, SC multiattribute choice alternatives usually are framed as “health profiles” associated with possible treatment options. Each alternative j is described according to a vector of distinct attributes (Xj). This vector might include several different, but not necessarily mutually exclusive, therapeutic benefits, mild-to-moderate side effects, and SAE risks associated with the treatment alternative. Here, Xj may include both continuous and categorical attributes, and the categorical attributes may or may not be naturally ordered.
In a choice-format SC survey, respondents are presented with a series of evaluation tasks involving choices between two or more treatment options. Each treatment option is described according to the same attribute categories but the levels of these attributes are varied across options and across choice tasks. Applying a random utility modeling (RUM) framework, the random utility associated with each alternative is assumed to be a function of these attributes plus a random error term:
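A linear-in-parameters form consistent with this description is sketched below. The linear index $X_j\beta$ and the symbol $\varepsilon_j$ for the error term are notational assumptions.

```latex
\begin{equation}
U_j = V(X_j) + \varepsilon_j = X_j \beta + \varepsilon_j \tag{1}
\end{equation}
```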
We can incorporate benefit-risk trade-off preferences by letting Xj consist of (1) a vector of health condition attributes, XBj, such as the severity of day-to-day chronic disease symptoms, the frequency of flare-ups (brief periods of intensified symptoms), and (2) a single fatal SAE risk, pj. The expected utility of a treatment option j, can then be expressed as:
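One expected-utility form consistent with the definitions in this section weights the benefit index by $(1 - p_j)$ and the SAE term by $p_j$. The symbols $X_j^{B}$, $\beta_B$, and $\beta_R$ follow the text; the exact layout is an assumption.

```latex
\begin{equation}
E[V_j] = (1 - p_j)\, X_j^{B} \beta_B + p_j\, \beta_R \tag{2}
\end{equation}
```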
Here, βR can be interpreted as the disutility of experiencing the SAE with certainty. We multiply beneficial outcomes by (1 − pj) in estimation since benefits are realized only if the SAE outcome is not realized. Equation (3) defines the increase in risk Δp* that equates the expected utility of two treatments, i and j:
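Assuming expected utility of the form $(1 - p)X^{B}\beta_B + p\,\beta_R$, and assuming that treatment $j$ carries the risk of treatment $i$ plus the increment $\Delta p^*$, the indifference condition can be sketched as:

```latex
\begin{equation}
\Delta V = \bigl[(1 - p_i - \Delta p^*)\, X_j^{B}\beta_B + (p_i + \Delta p^*)\,\beta_R\bigr]
         - \bigl[(1 - p_i)\, X_i^{B}\beta_B + p_i\,\beta_R\bigr] = 0 \tag{3}
\end{equation}
```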
The MAR associated with this higher efficacy benefit is the increased SAE risk from treatment j, Δp*, that would make an individual indifferent between treatment i and treatment j. Analogous to WTP, MAR is the increased probability of the SAE associated with treatment j that would exactly offset the benefits of treatment j, such that ΔV = 0. The estimated parameter vector can be used to estimate MAR for a specific, clinically relevant SAE in exchange for specified treatment benefits if treatment j provides a higher level of efficacy than treatment i, such that (ΔX)βB > 0.
Solving this equation for Δp* gives us the MAR associated with switching from treatment i to treatment j.
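Under an assumed expected utility of the form $(1 - p)X^{B}\beta_B + p\,\beta_R$, with baseline risk $p_i$ for treatment $i$ and $\Delta X = X_j^{B} - X_i^{B}$, the algebra can be sketched as follows. Collecting the terms of the indifference condition in $\Delta p^*$ and rearranging:

```latex
\begin{align}
0 &= (1 - p_i)\,(\Delta X)\beta_B - \Delta p^*\bigl(X_j^{B}\beta_B - \beta_R\bigr) \notag \\
\Delta p^* &= \frac{(1 - p_i)\,(\Delta X)\beta_B}{X_j^{B}\beta_B - \beta_R} \tag{4}
\end{align}
```

Because $\beta_R$ is a disutility, the denominator is positive whenever $X_j^{B}\beta_B > \beta_R$. In that case $\Delta p^*$ is positive exactly when $(\Delta X)\beta_B > 0$.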
Using a statistical design with known properties to design the choice questions and a discrete choice estimator such as conditional or random parameters logit applied to effects-coded attribute levels, the observed pattern of respondent choices allows the estimation of preference parameters for all attribute levels (McFadden, 1981; Hensher et al., 2005). Since SC data allow us to estimate the relative magnitudes of the parameters in the β vector, including the βB and βR parameters, MAR can be directly calculated from the RUM model results using Equation (4).
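As a numerical illustration of that calculation, the sketch below computes MAR from a benefit index and an SAE disutility and checks that the result leaves expected utility unchanged. The expected-utility form, the function names, and the coefficient values are all assumptions chosen for illustration; none of them are estimates from this study.

```python
def expected_utility(v_benefit, beta_risk, p):
    """Assumed form: benefits accrue with probability (1 - p), the SAE with probability p."""
    return (1.0 - p) * v_benefit + p * beta_risk


def max_acceptable_risk(v_benefit_i, v_benefit_j, beta_risk, p_i=0.0):
    """Risk increase that makes treatment j exactly as attractive as treatment i.

    v_benefit_* are benefit indices (X^B times beta_B); beta_risk is the
    (negative) utility of experiencing the SAE with certainty.
    """
    denominator = v_benefit_j - beta_risk
    if denominator <= 0:
        raise ValueError("benefit utility of j must exceed the SAE disutility")
    return (1.0 - p_i) * (v_benefit_j - v_benefit_i) / denominator


# Hypothetical coefficients, for illustration only.
v_i, v_j, beta_r = -0.5, 0.5, -19.5
mar = max_acceptable_risk(v_i, v_j, beta_r)
print(f"MAR = {mar:.3f}")
```

With these made-up values the benefit gain is 1.0 and the denominator is 20.0, so the respondent would accept up to a 5% SAE risk for the improvement.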
Two surveys were developed, one for adult patients with CD and one for parents of juvenile CD patients. The adult-patient version of the survey was designed first, and then modified for administration to parents. Modifications were minimized to maintain comparability between the two surveys. The main component of the survey instrument is a set of SC trade-off tasks that incorporate salient and clinically relevant features of CD treatments. These features were identified in a review of the literature, consultations with medical experts, and interviews with CD patients. Patients and medical experts, including physicians, might have different perspectives on therapies and the important aspects of the therapies. Medical experts know what is clinically relevant and what information is needed to compare alternative treatments to evidence from clinical trials. We therefore confirmed the relevance of the treatment features with patients through pretest interviews. Based on this information, the treatment attributes include four measures of treatment efficacy (severity of daily symptoms and activity limitations, frequency of flare-ups, prevention of serious complications, and the need for oral steroids) and the risks of three potentially fatal SAEs (death or severe disability from PML, death from serious infections, and death from lymphoma). Table II lists the seven treatment attributes and the range of levels selected for the survey questions.
For each of the three SAEs, we selected four levels of risk that ranged from 0% to 5% for experiencing the SAE within the next 10 years. The 10-year time frame was advantageous from both conceptual and methodological perspectives. A multiyear time period is clinically consistent with the chronic nature of CD and its treatment. Ten years is long enough to account for chronic symptoms and risk exposure. In addition, a 10-year period increases the magnitude of risks to a range that individuals can evaluate. Previous studies indicate that most people have difficulty conceptualizing probabilities less than one in 1,000 (Krupnick et al., 2002). Face-to-face pretest interviews confirmed that 10 years is a reasonable planning horizon for both CD patients and parents. The 10-year time span introduces the possibility that individuals may be discounting future risks, which is discussed further below.
Because individuals may have difficulty understanding and evaluating low-level risks represented by percentages, the survey describes probabilities in two ways. The first approach expresses risks in absolute terms using a “risk grid” (Krupnick et al., 2002), in which each square of a 1,000-square grid represents one patient. This method helps subjects conceptualize small probabilities by using shaded squares, where the shaded squares indicate patients who would die from an SAE, and the unshaded squares indicate patients who would avoid or recover from the SAE.
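The arithmetic behind such a grid can be sketched in a few lines. The text layout, the 40-column width, and the function name are assumptions made for illustration; the survey's actual graphic is not reproduced here.

```python
def risk_grid(probability, squares=1000, columns=40):
    """Return (number of shaded squares, text rows) for a given probability.

    '#' marks a patient who would die from the SAE, '.' one who would not.
    """
    shaded = round(probability * squares)
    cells = ["#"] * shaded + ["."] * (squares - shaded)
    rows = ["".join(cells[k:k + columns]) for k in range(0, squares, columns)]
    return shaded, rows


# The highest SAE risk level in the survey, 5% over 10 years.
shaded, rows = risk_grid(0.05)
print(shaded, "of 1,000 squares shaded")
print("\n".join(rows[:3]))
```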
The second approach expresses risk in comparative terms, using a “risk ladder” (Smith, 1988). This method places the risks of interest in the context of more familiar risks with similar severity and probability of occurrence. In the patient survey, risk ladders specific to a 50-year-old man or woman (depending on the gender of the subject) presented mortality risks from accident, heart attack, cancer, and all causes. The risks were expressed as the average number of deaths out of 1,000 individuals that would result over the next 10 years. The highest risk level used in the SC questions, 5% risk of death, is less than the risk of death from all causes for a 50-year-old man and about equal to the risk of death from all causes for a 50-year-old woman. Adapting these graphs for parents of juvenile CD patients presented two challenges. First, a 15-year-old girl or boy faces a risk of death for all causes over the next 10 years of less than 0.5% and a little over 1%, respectively, which is significantly lower than the highest SAE risk presented in the survey (5%). Second, the two most common mortality risks for children are not diseases but accidents and homicide/suicide. However, pretests with parents (described below) suggested that the risk ladders were useful in conveying comparable risks to the parents. The risk ladders in the pretest and final versions of the survey were specific to a 15-year-old girl or boy (depending on the gender of the child), and the comparable risks were all causes, homicide/suicide, accident, and cancer.
The initial draft of the patient survey instrument was pretested to explore two main issues: (1) the patients' ability to understand and accept the treatment attributes and levels presented to them in the questionnaire; and (2) their willingness to trade off treatment efficacy against SAE risks in the SC questions. The pretests also were used to test the length and wording of the survey instrument and to evaluate the two methods used to communicate SAE risks.
The first pretest for the patient survey employed face-to-face interviews with 10 patients (5 males and 5 females) between the ages of 24 and 71 years who had been diagnosed with CD. The participants were encouraged to “think aloud,” describing their thoughts as they completed the survey instrument. The second pretest was a pilot survey administered to 51 CD patients drawn from registered users of an informational web service, HealthTalk, which includes patients and others who are interested in CD. The pilot survey used a simplified set of SC tasks designed to investigate subjects' willingness to trade off changes in symptom severity and daily activity limitations against changes in SAE risks and subjects' willingness to trade off the different SAE risks against each other. While information from these pretests was largely supportive of the initial draft survey, minor adjustments were made to the content and ordering of the text in the final survey to improve comprehension.
The parent survey was adapted from the patient version, which had been pretested and administered before the parent survey was designed. The parent version of the survey was pretested using face-to-face interviews with eight parents of children with CD. The pretest focused on assessing whether the existing set of attributes and levels was appropriate for assessing parents' preferences, the risk tolerance of parents relative to adults with CD, and the ability of parents to answer questions about their children's symptom severity and daily activity limitations.
Parents responded to the survey in a manner similar to patients, so the final parent survey presented the same range of SAE risks as the patient version (0-5%). The main concern expressed by parents was the level of confidence with which they were able to report their children's symptoms, which depended on the age and gender of the child. In the introductory text for the parent survey, we encouraged parents to consult with their children about symptom severity and daily activity limitations, although this problem is unavoidable in surveys of parents and other surrogate respondents. To the extent that it is the parent who makes medical decisions, however, the parent's perceptions of symptom severity play a role. Parents' willingness to trade off benefits and risks reflects more than their concern about their child's perceived welfare, which itself differs from the child's experienced utility. Parents must also consider their role as caregivers, the welfare of the household, and other factors that the juvenile patients themselves would not consider.
The trade-off tasks in the final survey asked subjects to choose between two treatment options with differing treatment attributes. Table I provides an example of the SC question format from the patient survey. We employed a variation of a commonly used algorithm to construct a near-optimal experimental design resulting in 45 pairs of treatment options (Huber & Zwerina, 1996; Kanninen, 2002; Zwerina et al., 1996). To reduce respondent burden, the tradeoff tasks were blocked into five sets of nine questions. Each subject was randomly assigned to receive one of the five sets of questions. Each subject also completed an additional choice in which one treatment alternative was unambiguously better than the other alternative to test whether subjects understood the evaluation task.
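The blocking and assignment step can be sketched as follows. Only the counts (45 pairs, five blocks of nine) come from the text; random allocation of pairs to blocks, the seed, and the function names are assumptions, since the rule used to form the blocks is not described.

```python
import random


def block_design(n_pairs=45, n_blocks=5, seed=0):
    """Split choice-pair indices into equally sized blocks (assumed random split)."""
    rng = random.Random(seed)
    pairs = list(range(n_pairs))
    rng.shuffle(pairs)
    size = n_pairs // n_blocks
    return [sorted(pairs[b * size:(b + 1) * size]) for b in range(n_blocks)]


def assign_block(rng, n_blocks=5):
    """Randomly assign one respondent to a block of questions."""
    return rng.randrange(n_blocks)


blocks = block_design()
respondent_rng = random.Random(1)
assigned = [assign_block(respondent_rng) for _ in range(10)]
print([len(b) for b in blocks], assigned)
```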
In addition, the survey included standard demographic items (e.g., age, gender, education) as well as a number of items about the patient's experience with CD and CD medication history. To assess the current severity of the patient's condition, questions were included from the Crohn's Disease Activity Index (Best et al., 1976) and the Short Inflammatory Bowel Disease Questionnaire (SIBDQ) (Irvine et al., 1996).
In addition to measuring subjects' preferences, the set of SC questions used in the survey also was designed to provide internal validity tests using consistency patterns in each subject's trade-off choices. These tests do not examine the relationship between the preferences expressed in response to the survey questions and “real” preferences (or the decision an individual would actually make); instead, they test the internal consistency of the responses with the basic principles of utility theory. Stated-preference consistency can be detected using transitivity, stability, and logic tests. Transitivity requires that if subjects indicate they prefer treatment A to treatment B at one point in the question sequence and indicate they prefer treatment B to treatment C at another point, then they should also prefer treatment A to treatment C in a third question. The stability of preferences requires that subjects who prefer treatment A to treatment B at one point in the sequence of choice questions should prefer treatment A to treatment B at any subsequent point. The logic test determined whether patients understood the evaluation task sufficiently to indicate a preference for an unambiguously better treatment.
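The stability and transitivity checks described above can be sketched as simple functions over a respondent's recorded choices. The data layout (tuples of two option labels and the chosen label) and the function names are hypothetical.

```python
def stability_violations(choices):
    """Count repeated pairs where the respondent reversed an earlier choice.

    choices: list of (option_a, option_b, chosen) tuples in question order.
    """
    first_choice = {}
    violations = 0
    for a, b, chosen in choices:
        key = frozenset((a, b))
        if key in first_choice and first_choice[key] != chosen:
            violations += 1
        first_choice.setdefault(key, chosen)
    return violations


def has_intransitive_cycle(choices):
    """True if some x is preferred to y, y to z, and z to x."""
    prefers = set()
    for a, b, chosen in choices:
        rejected = b if chosen == a else a
        prefers.add((chosen, rejected))
    options = {label for pair in prefers for label in pair}
    for x, y in prefers:
        for z in options - {x, y}:
            if (y, z) in prefers and (z, x) in prefers:
                return True
    return False


consistent = [("A", "B", "A"), ("B", "C", "B"), ("A", "C", "A")]
cyclic = [("A", "B", "A"), ("B", "C", "B"), ("A", "C", "C")]
print(has_intransitive_cycle(consistent), has_intransitive_cycle(cyclic))
```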
Subjects also may adopt decision rules to simplify the trade-off tasks that are inconsistent with the assumptions used to interpret and analyze the data. Noncompensatory decision rules, such as selecting treatment alternatives based on the best level of a single attribute rather than evaluating all treatment advantages and disadvantages, can bias preference estimates if the choices do not reflect real preferences for that single attribute level. We test for this dominant-preference response pattern and control for it statistically in estimation.
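One way to flag such a response pattern is sketched below: an attribute is flagged when the respondent chose the alternative with its better level in every task where the two alternatives differed on it. The data layout, the rank convention (higher is better), and the attribute names are assumptions. The sketch covers detection only, not the statistical control used in estimation.

```python
def dominant_attributes(tasks, choices):
    """Return attributes on which the respondent always picked the better level.

    tasks: list of (levels_a, levels_b) dicts mapping attribute -> rank,
           where a higher rank is the more desirable level.
    choices: list of "A" or "B", one per task.
    """
    dominant = []
    for attr in tasks[0][0]:
        informative = 0
        always_better = True
        for (levels_a, levels_b), choice in zip(tasks, choices):
            if levels_a[attr] == levels_b[attr]:
                continue  # this task says nothing about the attribute
            informative += 1
            better = "A" if levels_a[attr] > levels_b[attr] else "B"
            if choice != better:
                always_better = False
                break
        if always_better and informative > 0:
            dominant.append(attr)
    return dominant


# Hypothetical respondent who always picks the option with milder symptoms.
tasks = [
    ({"symptoms": 2, "pml_risk": 0}, {"symptoms": 1, "pml_risk": 2}),
    ({"symptoms": 0, "pml_risk": 2}, {"symptoms": 2, "pml_risk": 1}),
    ({"symptoms": 1, "pml_risk": 1}, {"symptoms": 2, "pml_risk": 0}),
]
choices = ["A", "B", "B"]
print(dominant_attributes(tasks, choices))
```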
The surveys were programmed for web administration by MRxHealth, an international market research and consulting firm (www.MRxHealth.com). We administered the adult-patient version between September 2005 and January 2006 and the parent version between February and March 2006. Subjects were informed that a $20 donation would be made to the Crohn's and Colitis Foundation of America in recognition of their contribution to the project.
Respondents were U.S. residents at least 18 years of age. Adult patients had to be diagnosed with CD (self-reported), while the parent sample included parents or guardians of a child 18 years of age or younger diagnosed with CD who reported being involved with making medical decisions for their child. The web-based survey instrument was administered to two samples. The first sample was constructed using both CD adult patients and parents drawn from a list of subscribers to the HealthTalk website. These subjects received an email invitation that provided a direct link to the online survey. The second sample involved adult patients only, all of whom were enrolled through an invitation extended by clinical practice sites. These patients arguably may be more representative of the general patient population than subscribers to the HealthTalk website. However, there were no significant differences in preference estimates between the HealthTalk and clinic samples, so we have pooled both groups in this analysis.
Of the 357 adult patients who completed the survey, 345 adult patients passed data quality checks and were included in the survey analysis. For the parent sample, of the 107 subjects who started the survey, 105 parents were included in the analysis. The excluded subjects consisted of those who did not complete all of the SC questions or had no variation in their answers to the trade-off questions (always chose alternative A or B).
We used multivariate, random parameters or mixed-logit regression to estimate coefficients for each attribute level. Mixed logit avoids potential estimation bias from unobserved taste heterogeneity in discrete choice models by estimating a distribution of tastes for each preference parameter (Revelt & Train, 1998). In addition, because each subject provided responses to multiple trade-off questions, we estimated a mixed-logit panel model to account for within-subject correlation. Regressors in the mixed-logit model included all variables listed in Table II.
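The simulated likelihood that underlies a panel mixed-logit model can be sketched as follows: draw taste parameters, compute the logit probability of the whole observed choice sequence for each draw, and average over draws. This is an illustration only. Independent normal taste distributions, the two-alternative layout, the function names, and the toy inputs are assumptions, not the estimator or data used in this study.

```python
import math
import random


def logit_prob_a(x_a, x_b, beta):
    """Binary logit probability of choosing alternative A."""
    v_a = sum(b * x for b, x in zip(beta, x_a))
    v_b = sum(b * x for b, x in zip(beta, x_b))
    return 1.0 / (1.0 + math.exp(v_b - v_a))


def simulated_panel_likelihood(tasks, choices, mean, sd, draws=500, seed=1):
    """Average, over taste draws, of the probability of one subject's choice sequence."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        beta = [rng.gauss(m, s) for m, s in zip(mean, sd)]
        sequence_prob = 1.0
        for (x_a, x_b), choice in zip(tasks, choices):
            p_a = logit_prob_a(x_a, x_b, beta)
            sequence_prob *= p_a if choice == "A" else 1.0 - p_a
        total += sequence_prob
    return total / draws


# Toy inputs: two attributes, two choice tasks for one subject.
toy_tasks = [([1, 0], [0, 1]), ([0, 1], [1, 0])]
toy_choices = ["A", "B"]
likelihood = simulated_panel_likelihood(toy_tasks, toy_choices, [1.0, 0.0], [0.5, 0.5])
print(likelihood)
```

Multiplying probabilities within a subject before averaging over draws is what lets the same taste draw apply to all of that subject's answers, which is how the panel form accounts for within-subject correlation.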
The utility of a given treatment i is a function of the categorical β coefficients for efficacy attributes and α coefficients for 10-year risks of SAEs. Thus, the empirical specification for utility Vi is:
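One way to write a specification consistent with this description is sketched below. The indices $m$ (efficacy attribute levels), $k$ (SAE types), and $l$ (risk levels), and the effects-coded indicators $X_{im}$ and $R_{ikl}$, are notational assumptions; $P_{ik}$ denotes the 10-year probability of SAE $k$ under treatment $i$.

```latex
\begin{equation}
V_i = \Bigl(1 - \sum_{k} P_{ik}\Bigr) \sum_{m} \beta_m X_{im}
      + \sum_{k} \sum_{l} \alpha_{kl} R_{ikl}
\end{equation}
```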
The benefits are realized only if the SAE outcome is not realized. Thus, we multiply benefits by (1 − ΣP) in estimation, where ΣP is the total SAE risk. The experimental design included random combinations of risks and risk levels.
Table III summarizes patients' experiences with CD. Adult patients have significantly more severe symptoms than juvenile patients (p < 0.05). A majority of the adult patients in the sample (66%) have been diagnosed with CD for 6 or more years; a majority of the juvenile patients (90%) have been diagnosed with CD for less than 6 years. Around 45% of the adult patients and 65% of the juvenile patients were reported to have general well-being at the “well” level during the last 7 days. The mean SIBDQ score for 27% of the adult patients and 59% of the juvenile patients exceeded 170, a value generally accepted as corresponding to a state of CD remission. Mean ages of the adult patients and the juvenile patients were 44 years (SD = 13) and 14 years (SD = 3), respectively. Around 94% of the adult patients and 97% of the parents of the juvenile patients were white. The annual household income in the adult patient sample and in the parent survey sample was $75,000 (SD = 47,000) and $104,000 (SD = 47,000), respectively.
Following established practice for SC data, attribute levels were effects coded rather than dummy coded. The parameter for the omitted category is the negative sum of the included-category parameters (Hensher et al., 2005). Thus, zero is the mean effect for each attribute and positive and negative coefficients are interpreted relative to the mean. Table IV contains the effects-coded mixed-logit coefficient and standard deviation estimates with their standard errors for both the adult patient and parent surveys. The coefficient estimates for the efficacy and SAE attributes generally are consistent with the natural ordering of the categories. The only exceptions were for the “reduce” and “no effect on” complications, and the 0% and 0.5% risk of lymphoma in the parent survey; however, the coefficient estimates for these attribute levels were not significantly different from each other. Less severe symptoms, higher efficacy in reducing complications, longer time to next flare-up, and no need for oral steroids have significantly larger utility than the worse levels of these attributes. Similarly, coefficient estimates are smaller for higher SAE risks. Mixed-logit standard deviation estimates indicate the amount of unobserved taste heterogeneity in the data. The coefficient estimates from the parent survey usually have higher standard deviation estimates than those from the patient survey, suggesting greater heterogeneity in parents' preferences. The parameter estimates are confounded with the scale factors, so we cannot compare estimates from the two data sets directly (Swait & Louviere, 1993). However, since the MAR estimates are free of scale, we compare MARs in the next section.
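Effects coding and the recovery of the omitted-category parameter can be illustrated with a short sketch. The level names, the choice of the last level as the omitted one, and the coefficient values are hypothetical.

```python
def effects_code(level, levels):
    """Effects-coded row for `level`; the last entry of `levels` is the omitted category."""
    included = levels[:-1]
    if level == levels[-1]:
        return [-1] * len(included)
    return [1 if level == name else 0 for name in included]


def omitted_coefficient(included_coefficients):
    """The omitted category's parameter is the negative sum of the included ones."""
    return -sum(included_coefficients)


levels = ["remission", "mild", "moderate", "severe"]  # hypothetical level names
print(effects_code("mild", levels))    # [0, 1, 0]
print(effects_code("severe", levels))  # [-1, -1, -1]

included = [0.9, 0.3, -0.4]            # made-up coefficients
all_coefficients = included + [omitted_coefficient(included)]
print(all_coefficients)
```

Because the four coefficients sum to zero, each one reads as a deviation from the attribute's mean effect rather than from a reference level, as it would under dummy coding.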
The relative contribution of different attributes to overall utility can be represented by the marginal utility parameter estimates. To facilitate comparisons, the relative contribution of each attribute was measured as the percentage contribution of the difference between the best and worst level for each attribute to the overall utility. The treatment attribute with the greatest overall importance is symptom severity and daily activity limitations (0.30 for adult patients and 0.31 for parents), followed by the risks of the three SAEs (Fig. 1). The least important attribute in both samples is the range of time shown to next flare-up (0.03). Looking at preferences over the three SAEs, the range of risks for PML indicates that it is the most important adverse event for adult patients (0.20), followed by risk of lymphoma (0.15) and risk of serious infections (0.14). Parents judged the relative importance of the range of risks shown for serious infection (0.19), lymphoma (0.18), and PML (0.17) to be very similar. In addition to the trade-off questions, we asked subjects how worried they would be if the chance of dying from an SAE was 1%. Thirty-eight percent of the adult patients reported that they would be quite or extremely worried about PML, 32% about lymphoma, and 25% about serious infection. For parents, 69% reported they would be quite or extremely worried about PML and lymphoma and 61% about serious infection. These results support the findings from the choice experiment that parents appear to be equally concerned about risk of death from any cause to their children.
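The importance measure described above (the best-to-worst utility range of each attribute as a share of the summed ranges) can be sketched as follows. The attribute names and level effects are hypothetical placeholders, not the estimates behind Fig. 1.

```python
def relative_importance(level_effects):
    """Share of the total best-to-worst utility range due to each attribute."""
    ranges = {attr: max(vals) - min(vals) for attr, vals in level_effects.items()}
    total = sum(ranges.values())
    return {attr: r / total for attr, r in ranges.items()}


# Hypothetical effects-coded level estimates, for illustration only.
example_effects = {
    "symptom_severity": [1.0, 0.2, -0.2, -1.0],
    "pml_risk": [0.5, 0.1, -0.1, -0.5],
    "time_to_flare_up": [0.25, -0.25],
}
shares = relative_importance(example_effects)

for attr, share in shares.items():
    print(f"{attr}: {share:.2f}")
```

Note that a share computed this way depends on the range of levels shown in the survey, which is why the text refers to "the range of risks shown" rather than to the risk itself.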
MARs for five clinically relevant benefit levels are provided in Table V (see footnote 7). On average, both adult patients and parents in our sample were willing to accept higher levels of risk in return for improvements in symptom severity and daily activity limitations (holding all other attributes constant). For the benefits shown in Table V, 10-year MAR estimates ranged from less than 1% to more than 10%. For the largest possible improvement from severe to remission (no symptoms), the estimated MARs for adult patients are highest for lymphoma, followed by serious infection and PML, while the ranking for parents is lymphoma, PML, and infection. For benefit categories starting from a severe state, the MAR point estimates are higher for parents than for patients in six out of nine cases. For benefit categories starting from a moderate state, the MAR point estimates are lower for parents than for adult patients in all six cases, with an average difference of nearly 2%. These results suggest possibly greater risk tolerance on the part of parents for treating severe CD symptoms in their children, and smaller risk tolerance for treating moderate CD symptoms. However, these differences are not statistically significant because of the relatively small sample size for the parent survey.
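As a rough illustration of how a MAR can be read off estimated utilities, the sketch below finds the risk level at which the utility loss from SAE risk equals the utility gain from a given benefit. It assumes linear interpolation between the risk levels shown and returns no value beyond the highest level; both are simplifying assumptions of this sketch and are not necessarily the procedure used to produce Table V. All numbers are hypothetical. The last lines also illustrate why MARs are free of scale: multiplying every coefficient by the same constant leaves the result unchanged.

```python
def max_acceptable_risk(benefit_gain, risk_levels, risk_utils):
    """Risk (in %) at which the utility loss from risk equals `benefit_gain`.

    risk_levels: increasing risk percentages, starting at the no-risk level.
    risk_utils: estimated utility at each risk level.
    Returns None when the gain exceeds the loss at the highest level shown.
    """
    if benefit_gain <= 0:
        return 0.0
    losses = [risk_utils[0] - u for u in risk_utils]
    for i in range(1, len(risk_levels)):
        lo, hi = losses[i - 1], losses[i]
        if hi >= benefit_gain and hi > lo:
            # Linear interpolation within the bracketing segment (assumption).
            frac = (benefit_gain - lo) / (hi - lo)
            width = risk_levels[i] - risk_levels[i - 1]
            return risk_levels[i - 1] + frac * width
    return None


# Hypothetical values, for illustration only.
risk_levels = [0.0, 0.5, 2.0, 5.0]   # 10-year risk, in percent
risk_utils = [0.6, 0.4, 0.0, -1.0]   # utility at each risk level
benefit_gain = 1.1                   # utility gain from the symptom improvement

mar = max_acceptable_risk(benefit_gain, risk_levels, risk_utils)

# Rescaling every coefficient by the same factor leaves the MAR unchanged.
scale = 2.0
mar_scaled = max_acceptable_risk(
    scale * benefit_gain, risk_levels, [scale * u for u in risk_utils]
)
print(mar, mar_scaled)
```

Scale invariance is what permits comparing MARs across the adult patient and parent samples even though the raw coefficients from the two models cannot be compared directly.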
Fig. 2 presents a similar result, displaying the predicted choice probabilities for adult patients and parents at given levels of serious infection risks for improvements from severe CD to remission and from moderate CD to remission, respectively. As anticipated, the predicted choice probability decreases as the level of risk increases and decreases as the clinical benefit decreases. The probability of an adult patient accepting a 10-year risk of 5% is more than 50% in return for improvements from either severe or moderate disease states to remission. The probability of a parent accepting a 5% risk is also more than 50% for an improvement from severe CD to remission for their children. However, for an improvement from moderate CD to remission, the predicted probability of a parent accepting risk declines much faster and lies below the patient choice probability at the 2% and 5% risk levels. For this smaller treatment benefit, the probability of a parent accepting a 0.5% risk is over 90%, but the probability of accepting a 5% risk is only about 20%.
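The qualitative pattern in Fig. 2 can be reproduced with a simple binary logit sketch in which the probability of accepting a treatment depends on the utility gain from the benefit minus the utility loss from risk. This is a simplification: it uses fixed utilities, whereas predictions from a mixed-logit model would average such probabilities over the estimated taste distribution. The function name and all values are hypothetical.

```python
import math


def accept_probability(benefit_gain, risk_loss):
    """Binary logit probability of choosing the treatment over the status quo."""
    return 1.0 / (1.0 + math.exp(-(benefit_gain - risk_loss)))


# Hypothetical utility losses at increasing risk levels, for illustration only.
risk_losses = {0.5: 0.2, 2.0: 0.6, 5.0: 1.6}
large_benefit = 2.0   # e.g., severe CD to remission (hypothetical utility gain)
small_benefit = 1.1   # e.g., moderate CD to remission (hypothetical utility gain)

for risk, loss in risk_losses.items():
    p_large = accept_probability(large_benefit, loss)
    p_small = accept_probability(small_benefit, loss)
    print(f"risk {risk}%: large benefit {p_large:.2f}, small benefit {p_small:.2f}")
```

In this simplified setting the acceptance probability equals 50% exactly when the risk loss equals the benefit gain, which is the MAR condition; probabilities above 50% correspond to risks below the MAR.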
SC tasks are cognitively challenging, particularly in cases where the level of utility in a pair of treatments is similar. Even attentive participants may give some inconsistent responses. The validity results from the adult-patient sample and the parent sample were quite similar. Using the consistency patterns described above, around 8% of adult patients and 6% of parents failed the transitivity test. Around 8% of adult patients and 10% of parents failed the stability test (reversed their choices in the second question). For the logic test, the failure rate was 13% and 8% for the adult patients and parents, respectively. The results of these internal validity tests compare favorably with other studies. In this study, patients with CD performed significantly better than patients with diabetes and bipolar disorder in other stated-preference health surveys (Johnson & Mathews, 2001; Johnson et al., 2008). Around 61% of patients with diabetes failed the stability test, and 26% failed the logic test. In the study of patients with bipolar disorder, around 20% of patients failed the stability test, and 8% failed the logic test. To confirm the reliability of our results, we excluded responses from participants who failed each of these internal validity tests. We found that there were no significant differences in the point estimates after excluding inconsistent subjects, and differences in confidence intervals were negligible.
We found some evidence of dominant preferences; 19% of adult patients and 22% of parents appear to have based their decisions solely on the symptom severity attribute. We included interaction terms between treatment efficacy and a dominant-preference dummy to control for the influence of subjects who focused exclusively on symptom severity and daily activity limitations. To the extent that these subjects' responses actually indicate a strong preference for reductions in symptom severity and daily activity limitations and a high tolerance for SAEs, these statistical controls result in lower-bound MAR estimates.
The implications of previous research on parents' WTP to improve their children's health are unclear for treatment-related benefit-risk tradeoffs. It is possible that parents could both be more cautious about risks to their children and also place a high value on improving their children's health. Our results suggest that adult patients and parents of juvenile CD patients in our sample were willing to accept similar levels of SAE risk. However, the results also suggest that while most parents are willing to tolerate low risks, they are much less willing to tolerate high risks for small benefits compared to adult patients. It may be that parents' concern for their children's quality of life leads them to state a greater willingness to accept low levels of risk, but at higher levels of risk, parents are more cautious than adult patients.
These results might also be explained by adaptation theory. Patients with less serious cases have to imagine what it would be like to have more serious symptoms. Because patients with more severe cases of the disease often learn to adapt over time, more serious symptoms may not be as detrimental to quality of life as the less severe patient imagines. Johnson et al. (2007) found that patients with more severe CD are less tolerant of SAE risks than patients with less severe CD. They also found that patients whose CD symptoms had little or no effect on their activities of daily living were willing to take more risks compared with patients who reported considerable problems or inability to engage in daily activities. In this study, the juvenile patients had less severe symptoms than the adult patients. This may explain why parents of juvenile patients were willing to take more risks than adult patients for improvements from severe CD.
The relative importance of the three SAE risks was almost identical for parents, while adult patients viewed PML risk somewhat differently than the other two risks. The preferences of the adult patients over the three risks are consistent with the empirical literature on risk perception, which suggests that characteristics other than probability affect people's assessment of the seriousness of risks (Slovic, 1987). In our sample, parents were less sensitive to the cause of death. It is possible that the preferences of the parents reflect their aversion to any risk that threatens the life of their child regardless of the source.
To our knowledge, this is the first study to compare the tolerance for treatment-related risks between adult patients and parents of juvenile patients. Our results have several important implications. First, responses to the SC survey provide strong evidence that adult patients and parents of juvenile patients are willing to accept tradeoffs between treatment efficacy and risks of SAEs. Overall, the choices indicate a systematic preference for treatments that provide improvements in the four treatment efficacy attributes and smaller exposure to risks of SAEs. However, subjects clearly balance the risks and benefits of treatments, and the preference for improved efficacy can outweigh concerns about SAE risks. While this seems obvious, regulatory decisions are currently made without any data on the rate at which patients are willing to trade off risks and benefits. Without a systematic understanding of patient preferences, regulators may assign little or no value to the benefits of a drug and thus conclude that no one would be willing to accept the risk of an SAE.
Second, the pattern of choices observed in the trade-off tasks allows us to calculate the maximum level of SAE risk subjects are willing to tolerate to achieve treatment benefits. For example, for a treatment offering a decline in symptom severity and daily activity limitations from moderate CD to remission, estimates of the mean 10-year MAR for PML death or severe disability were around 3.95% for adult patients and 3.62% for parents. For a treatment offering a larger improvement from severe CD symptoms to remission, the MAR increases to 7.00% for adult patients and 8.26% for parents.
Finally, the treatment attribute that had the largest overall effect on preferences was symptom severity and daily activity limitations. This was followed by the range of risks shown for PML for adult patients and serious infection for parents. The attribute with the smallest overall effect was the time between flare-ups. Consistent with the finding that parents do not perceive differences among the three types of risk, there is no clear distinction among the MARs for the three SAEs among parents.
These results are subject to several limitations and qualifications. First, while stated-preference methods are widely used in health economics to elicit utility estimates, assess health-related quality of life, and evaluate drug development strategies, they have limitations. One inherent limitation is that SC trade-off tasks ask subjects to evaluate hypothetical treatments. These tradeoffs are intended to simulate possible clinical decisions, but do not have the same clinical, financial, and emotional consequences of actual decisions. Thus, differences can arise between stated and actual choices. We have minimized such potential differences by offering alternatives that mimic real-world tradeoffs as closely as possible. We also have estimated models that understate risk tolerance by controlling for respondents who focused only on efficacy levels.
The survey instrument provided balanced information on treatment attributes that was reviewed by experienced clinicians, but some subjects may have had difficulty assimilating and applying the information provided. In particular, although we provided numeric and graphic representations of SAE risk and comparable risk probabilities, numeracy skills in the general population are poorly developed. Subjects may have applied simplifying heuristics in comparing probabilities that are inconsistent with actual numeric magnitudes. Nevertheless, the observed levels of risk tolerance appear plausible relative to actual mortality risks such as accidents and heart attacks, and it is likely that similar errors in judging risk would occur in real-world decisions. Finally, to the extent that respondents do not see treatments as a set of attributes that can vary independently, the hedonic framework does not accurately model their preferences. Respondents who reject the implications of the hedonic framework may provide inconsistent responses, reject certain combinations of attributes, or dominate on an attribute.
To emphasize the chronic nature of CD and to raise the risk level high enough to facilitate trade-off evaluations, we defined the risk exposure over a 10-year period. It is not clear to what extent possible discounting or assumptions about the timing of the risk during this period affected our estimates. Sensitivity analysis that placed the risk at the beginning and the end of the 10-year interval did not affect our qualitative results. However, if adult patients and parents view discounting differently, this might account for some of the differences between the two samples.
Patients and parents enrolled from the HealthTalk website were not screened to confirm their or their child's reported CD diagnosis. However, the willingness to accept risk of adult patients enrolled from the HealthTalk website was similar to that of adult patients enrolled from clinical practice sites (Johnson et al., 2007). Nevertheless, the proportion of subjects in the HealthTalk Internet panel who may not have been diagnosed with CD is unknown.
Finally, our comparison of the preferences of adult patients and the parents of juvenile patients has limitations. The adult CD patients and parents of juvenile patients in this sample are different groups of people. Other studies, which asked the same individual about their WTP for themselves and for their child, found that parents are willing to pay more to improve their children's health than their own. It is possible that the parents in our sample, if they themselves were diagnosed with CD, would be willing to accept more risk for themselves than they were willing to accept for their child.
It is important to emphasize that we report only mean values for risk tolerance estimates for a particular sample of patients and caregivers. Individual subjects' actual risk tolerance may be greater or less than these estimates, and our results should not be interpreted as a guide to therapeutic practice. However, licensing authorities such as FDA do not regulate therapeutic practice. The focus of licensing decisions is on societal welfare and the average patient. Licensing involves a judgment about the overall societal value of making a treatment available to patients. The appropriateness of a particular treatment for a particular patient is left to the professional judgment of an individual physician and the benefit-risk preferences of individual patients. Quantitative estimates of preferences for combinations of risks and benefits developed using rigorous and theoretically sound techniques such as those used in this study may assist regulatory authorities in evaluating new treatments and making the rationale for decisions more transparent.
Regulators and physicians must consider a variety of evidence when they make decisions about approving or prescribing a treatment. Without information about the relative weight patients and caregivers place on alternative outcomes, these decisionmakers can assume patients are risk neutral and place equal weight on alternative outcomes or impose their own assumptions about the “correct” weight that should be put on outcomes. Experience eliciting WTP in the context of economic benefit valuation provides systematic methods for soliciting and quantifying patient preferences over outcomes in either risk- or dollar-denominated measures. The results from our study suggest that SC models offer the opportunity to explore the heterogeneous risk preferences of individuals and can contribute to more informed decision making.
This study was funded by Elan Pharmaceuticals, San Diego, California. The views expressed here do not necessarily reflect those of Elan.
6. This directly corresponds to money equivalents such as compensating surplus.
7. MARs for other combinations of efficacy attributes can be obtained from the authors by request.