A “utility” is a measure of health-related quality of life (HRQOL) that ranges between 0 (death) and 1 (perfect health). Disease-targeted utilities are mandatory to conduct cost–utility analyses. Given the economic and healthcare burden of irritable bowel syndrome (IBS), cost–utility analyses will play an important role in guiding health economic decision-making. To inform future cost–utility analyses in IBS, we measured and validated the IBS utilities.
We analyzed data from Rome III IBS patients in the Patient Reported Observed Outcomes and Function (PROOF) Cohort, a longitudinal multi-center IBS registry. At entry, the patients completed a multi-attribute utility instrument (EuroQOL), bowel symptom items, IBS severity measurements (IBS Severity Scale (IBSSS), Functional Bowel Disease Severity Index (FBDSI)), HRQOL indexes (IBS quality-of-life instrument (IBS-QOL), Centers for Disease Control-4 (CDC-4)), and the Worker Productivity Activity Index for IBS (WPAI). We repeated assessments at 3 months.
There were 257 patients (79% women; age = 43±15 years) at baseline and 85 at 3 months. The mean utilities in patients with severe vs. non-severe IBS were 0.70 and 0.80, respectively (P < 0.001). There were no differences in utilities among IBS with constipation (IBS-C), IBS with diarrhea (IBS-D), and mixed IBS (IBS-M) subgroups. EuroQOL utilities correlated with FBDSI (r = 0.31; P < 0.01), IBSSS (r = 0.36; P < 0.01), IBS-QOL (r = 0.36; P < 0.01), CDC-4 (r = 0.44; P < 0.01), WPAI presenteeism (r = 0.16; P < 0.01), abdominal pain (r = 0.43; P < 0.01), and distension (r = 0.18; P = 0.01). The utilities in patients reporting “considerable relief” of symptoms at 3 months vs. those without considerable relief were 0.78 and 0.73, respectively (P = 0.02).
EuroQOL utilities are valid and reliable in IBS. The utility of severe IBS (0.7) is similar to Class III congestive heart failure and rheumatoid arthritis. These validated utilities can be employed in future IBS cost–utility analyses.
Irritable Bowel Syndrome (IBS) is a prevalent and expensive condition affecting 5–10% of the US population at a cost exceeding $20 billion annually (1–3). The health economic burden of IBS is amplified by the negative impact of the syndrome on health-related quality of life (HRQOL) resulting from a range of physical, psychological, and social stressors (4–7). Patients with IBS have a lower HRQOL compared with “normal” non-IBS cohorts (4). Moreover, IBS has the same physical HRQOL as diabetes, and a lower physical HRQOL compared with depression or gastroesophageal reflux disease (4). Perhaps more surprisingly, mental HRQOL scores are lower in IBS than in chronic renal failure, an organic condition marked by considerable physical and mental disability (4). This HRQOL decrement can, in some cases, be so severe as to raise the risk of suicidal behavior independent of comorbid psychiatric diseases (8). In short, IBS unquestionably has a negative impact on HRQOL, and failing to recognize this impact could undermine the physician–patient relationship and lead to dissatisfaction with care. As HRQOL decrements are common in IBS, the American College of Gastroenterology recommends that clinicians carry out routine screening for diminished HRQOL in their IBS patients (3).
Beyond the immediate clinical implications of HRQOL measurement, there are important health economic consequences of accurately measuring HRQOL in IBS. A health economic principle is that third party payers—and society itself—are willing to expend more money for conditions that severely impact HRQOL compared with conditions that do not (9). In the setting of limited resources, payers should first allocate funds to conditions that are prevalent and highly morbid, and then progress to conditions that are less prevalent and less morbid. The guiding economic principle is to provide the most good to most people within the constraints of a limited budget. The traditional litmus test for allocating funds, technically defined as the incremental cost-effectiveness ratio, or ICER (9), can be summarized with a more colloquial question: “Is the juice worth the squeeze?” In health economic terms, HRQOL is the “juice” and money is the “squeeze.”
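For readers unfamiliar with the ICER, its standard textbook definition (not spelled out in this article) can be written as follows. The subscripts are notation assumed here for illustration: C is the cost of a strategy and E is its effectiveness, which in a cost-utility analysis is expressed in quality-adjusted life-years.

```latex
\[
\mathrm{ICER} \;=\; \frac{C_{\text{new}} - C_{\text{comparator}}}{E_{\text{new}} - E_{\text{comparator}}}
\]
```

In the terms of the analogy above, the numerator is the additional “squeeze” and the denominator is the additional “juice.”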
With this background, it becomes clear why accurately measuring HRQOL is vital in IBS—a condition marked by considerable morbidity, but rarely mortality. If survival were the only outcome of interest in driving funding decisions, then IBS would not be a viable target for scarce resources. However, incorporating HRQOL into economic decision-making acknowledges the impact of non-mortal conditions such as IBS.
The process of incorporating HRQOL into budget allocation decisions requires valid and reliable utility scores. Utilities are a measure of HRQOL that range between 0 (death) and 1 (perfect health) (9–11). Disease-targeted utilities are the core metric for conducting cost-utility analyses and form the basis of quality-adjusted life-years. Quality-adjusted life-years account for both quantity and quality of life—not merely overall survival (9).
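As a minimal sketch of how a utility enters a quality-adjusted life-year calculation, the snippet below multiplies a constant utility by the time spent in that health state. The function name `qalys` and the 10-year horizon are illustrative assumptions, and no discounting is applied.

```python
def qalys(utility: float, years: float) -> float:
    """Undiscounted quality-adjusted life-years for a constant utility."""
    if not 0.0 <= utility <= 1.0:
        raise ValueError("utility must lie between 0 (death) and 1 (perfect health)")
    if years < 0:
        raise ValueError("years must be non-negative")
    return utility * years


# Hypothetical 10-year horizon: perfect health vs. a health state valued at 0.70
perfect_health = qalys(1.0, 10)
reduced_health = qalys(0.70, 10)
print(perfect_health - reduced_health)  # QALYs lost to the lower utility
```

Under these assumptions, a decade lived at a utility of 0.70 yields about 7 quality-adjusted life-years rather than 10, even though survival is identical.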
To inform future cost-utility analyses in IBS, we measured and comprehensively validated IBS utilities using the EuroQOL, a multi-attribute generic utility measure that is widely used throughout healthcare (12). We specifically sought to: (i) establish whether the EuroQOL can yield valid utilities in IBS by measuring construct validity with a range of important IBS symptoms and indexes; (ii) compare utilities for IBS with constipation (IBS-C), IBS with diarrhea (IBS-D), and mixed IBS (IBS-M); (iii) compare utilities for severe vs. non-severe IBS, as defined by validated instruments; and (iv) establish a rule for interpreting EuroQOL change scores in IBS, both for clinical applications and for future clinical trials incorporating the EuroQOL as an end point.
We evaluated consecutive patients aged ≥ 18 years with Rome III-positive IBS (including IBS-C, IBS-D, and IBS-M) enrolled in the IBS Patient Reported Observed Outcomes and Function (PROOF) cohort. PROOF is an internet-based, longitudinal, observational registry of IBS patients identified within a network of eight geographically diverse US Centers. These included five University-based academic Centers (University of California Los Angeles, Mayo Clinic Scottsdale, Columbia University, University of Michigan, Beth Israel Deaconess), one community-based primary care clinic (Cedars-Sinai Medical Center outpatient care clinic), one Health Maintenance Organization general gastroenterology clinic (Kaiser Bellflower), and one community-based private general gastroenterology clinic (Atlanta Gastroenterology Associates). PROOF does not mandate any specified treatments or protocols; patients receive the “usual care” of their healthcare providers. In this regard, PROOF is a natural history cohort outside the context of a traditional clinical trial. Patients are enrolled by PROOF investigators at each of these Centers using pre-printed, unique-numbered brochures, and complete the questionnaires online. Patients without internet access or unwilling to complete online elicitations are offered paper surveys. Each of the PROOF investigators is an experienced gastroenterologist with knowledge regarding the appropriate application of the Rome III criteria. The cohort is administered centrally through the University of California at Los Angeles/Veteran Administration (UCLA/VA) Center for Outcomes Research and Education (CORE). Patients access the online survey through the UCLA/VA CORE website and receive $10 for participating. Before enrollment, patients complete a set of introductory screens that present the Rome III diagnostic items for IBS.
Patients who do not meet Rome III criteria at the time of survey are not allowed to enter the full survey, and are subsequently removed from the cohort. Thus, there are two lines of security to maximize the likelihood that all patients have Rome III-positive IBS: an initial screen by an experienced gastroenterologist, and a secondary application of the Rome III criteria at the time of the survey.
After a baseline survey, all participants receive a follow-up online survey at 3 months. Those failing to complete the online survey receive a paper survey by mail. The baseline PROOF questionnaire collects a wide range of biopsychosocial variables, including disease-specific and generic health HRQOL measures, psychological distress measures, resource utilization measures, worker productivity data (including absenteeism and presenteeism), severity indexes, intestinal and “extra-intestinal” comorbidities, concurrent treatments, and symptom profiles (Table 2). The study was approved by the University of California at Los Angeles Institutional Review Board and was conducted in accordance with the institutional guidelines regulating human subject research.
The main outcome of this study was the EuroQOL utility index (also known as the EQ5D) (12). The EuroQOL is a multi-attribute HRQOL instrument that is widely used throughout healthcare. It consists of five items covering five dimensions, namely mobility, self-care, ability to carry out usual activities, pain/discomfort, and anxiety/depression (Figure 1). Each item has three response levels, as portrayed in Figure 1, allowing for a total of 243 possible health state combinations. Each health state maps to a utility score rating from a representative societal sample, as established earlier in the validation of the instrument (12). Each patient completing the five-item EuroQOL is assigned a utility score between 0 (worst HRQOL) and 1 (perfect HRQOL). For further information regarding the EuroQOL and its development, refer to the EuroQOL user's manual, website (www.EuroQOL.org), and related publications (12). We obtained permission from the EuroQOL group to employ the instrument for this study, and registered the study with the EuroQOL database.
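The figure of 243 health states follows from five dimensions with three levels each (3^5 = 243). A short sketch that enumerates them is shown below; the numeric level coding (1 to 3) is an assumption made for illustration, and the societal utility weights attached to each state are not reproduced here.

```python
from itertools import product

# The five EuroQOL dimensions named in the text
DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)

# Assumed coding for the three response levels (1 = best, 3 = worst)
LEVELS = (1, 2, 3)

# Every possible combination of one level per dimension
health_states = list(product(LEVELS, repeat=len(DIMENSIONS)))

print(len(health_states))  # 243
print(dict(zip(DIMENSIONS, health_states[0])))  # the best state: level 1 on every dimension
```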
Before calculating the mean utility scores in our IBS cohort, we first sought to establish whether the EuroQOL is a valid instrument in IBS. One method of establishing an instrument's validity is to measure its relationship with other established biopsychosocial domains. Thus, in order for the EuroQOL to show baseline construct validity, we hypothesized a priori that its scores must significantly correlate with other predetermined IBS constructs, a list of constructs culled from the literature as key components of the IBS illness experience (Table 1). In other words, if the EuroQOL were unable to correlate with these key constructs, then it would be considered unrelated to the IBS illness experience, thus undermining the relevance of any utilities generated by the instrument in IBS patients. Specifically, we measured “IBS severity” with the IBS Severity Scale (IBSSS) (13) and the Functional Bowel Disease Severity Index (FBDSI) (14), disease-targeted HRQOL with the IBS quality of life instrument (IBS-QOL) (15), generic HRQOL with the Centers for Disease Control-4 (CDC-4) instrument (16), and worker productivity with the IBS version of the Work Productivity Activity Index (WPAI:IBS) (17). In addition, we measured a range of individual IBS symptoms, including abdominal pain, bloating, distension, stool frequency, stool form, urgency, disease duration, flare duration, and IBS subtype.
In addition to baseline correlations, we measured the ability of the EuroQOL to track longitudinally with our prespecified IBS constructs. Patients completed the EuroQOL and concurrent indexes at baseline and 3-month follow-up periods. We then measured correlations in longitudinal change scores between the EuroQOL and individual items and constructs. This set of analyses sought to establish whether the EuroQOL was sufficiently responsive to change, i.e., whether it could longitudinally track in the same direction as concurrently measured biopsychosocial constructs.
We sought to establish whether the EuroQOL could significantly distinguish between clinically relevant subgroups. We conducted three sets of analyses: (i) compared utility scores between IBS-C, IBS-D, and IBS-M subgroups, using ANOVA (analysis of variance); (ii) compared utility scores with a t-test between severe and non-severe IBS patient groups, using dichotomized IBSSS as the stratifier to define severity groups; and (iii) compared utility scores with a t-test between those who reported having at least “considerable relief” of their overall IBS symptoms at 3 months vs. those with less or no relief of symptoms, i.e., a binary outcome similar to end points used in IBS clinical trials (22).
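To illustrate what correlating longitudinal change scores involves, the sketch below computes each patient's follow-up minus baseline difference on two instruments and then a Pearson correlation between the two sets of differences. The article does not state which correlation coefficient was used, so Pearson is an assumption, and all values and variable names are hypothetical rather than study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def change_scores(baseline, followup):
    """Per-patient change: follow-up value minus baseline value."""
    return [f - b for b, f in zip(baseline, followup)]


# Hypothetical scores for four patients (not study data)
utility_baseline = [0.70, 0.80, 0.65, 0.75]
utility_followup = [0.75, 0.78, 0.70, 0.75]
construct_baseline = [55, 70, 50, 62]
construct_followup = [63, 68, 58, 62]

r = pearson_r(
    change_scores(utility_baseline, utility_followup),
    change_scores(construct_baseline, construct_followup),
)
print(round(r, 2))
```

A coefficient near zero would suggest the utility measure does not move with the comparison construct; whether a positive or negative sign is expected depends on the scoring direction of that construct.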
In order for any patient-reported outcome (PRO) to be successfully used in prospective trials, it is important to first establish an a priori responder definition for the PRO. When using a linear outcome, such as HRQOL, “responders” can be defined as those achieving a “minimally clinically important difference” or “MCID” on the response scale (23). For example, the MCID on the IBS-QOL, a commonly used HRQOL measure in IBS, is 10 points (24). Thus, any patient meeting or exceeding a 10-point change on the IBS-QOL is considered to be a “responder.” There are various ways to measure the MCID of a PRO, but the optimal approach is to link change scores to patient report of improvement using a balanced 13-point response scale, as described by Guyatt (23). In this technique, the response scale is administered at the follow-up period, and asks patients to consider their overall health compared with the last time it was evaluated. The scale includes six levels of improvement, six levels of decrement, and one level for “almost the same” which balances the scale in the middle. Patients scoring +1 (“a little bit better”) or +2 (“somewhat better”) are considered to have minimally improved. The MCID is then defined as the mean change score in this subgroup of minimal responders. We employed this technique, using a Guyatt scale at 3-month follow-up, to calculate the MCID of the EuroQOL in our IBS population. We used Stata Statistical Software Release 8.0 (Stata Corporation, College Station, TX) for all the analyses.
There were 257 patients at baseline and 85 at 3 months. Table 2 provides an overview of key baseline characteristics of the sample. The patient profiles are consistent with earlier studies in IBS, namely the patients were primarily young (mean age = 43±15 years) and female (79%). The population was diverse across demographic characteristics, including race, education, and income. A total of 18% of the cohort had IBS-C, 33% IBS-D, and 49% IBS-M using Rome III criteria. On a 20-point severity numeric rating scale, the mean severity was 11±5. Using IBSSS criteria for severity, 17%, 28%, and 55% of patients had mild, moderate, and severe IBS symptoms, respectively.
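The MCID calculation described above can be sketched in a few lines. This is a Python illustration rather than the Stata code used for the study; the integer coding of the 13-point scale (-6 to +6, with 0 for “almost the same”), the field names, and the patient records are all hypothetical.

```python
from statistics import mean


def mcid(records, minimal_levels=(1, 2)):
    """Mean utility change among patients reporting minimal improvement.

    Each record holds a baseline utility, a follow-up utility, and a
    response on the 13-point scale, coded here as an integer from -6 to +6.
    """
    changes = [
        rec["followup"] - rec["baseline"]
        for rec in records
        if rec["guyatt"] in minimal_levels
    ]
    if not changes:
        raise ValueError("no patients in the minimal-improvement subgroup")
    return mean(changes)


# Hypothetical patients (not study data)
records = [
    {"baseline": 0.70, "followup": 0.74, "guyatt": 1},   # a little bit better: included
    {"baseline": 0.80, "followup": 0.82, "guyatt": 2},   # somewhat better: included
    {"baseline": 0.65, "followup": 0.85, "guyatt": 5},   # large improvement: excluded
    {"baseline": 0.75, "followup": 0.75, "guyatt": 0},   # almost the same: excluded
    {"baseline": 0.78, "followup": 0.70, "guyatt": -3},  # worsened: excluded
]

print(round(mcid(records), 2))  # 0.03
```

Only the two minimally improved patients contribute, so the result is the mean of their change scores (0.04 and 0.02); patients with larger improvement, no change, or worsening are deliberately left out of the estimate.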
The mean EuroQOL utility was 0.75±0.14 at baseline and 0.74±0.14 at the 3-month follow-up period. Table 3 shows the baseline and longitudinal correlations of the EuroQOL with the a priori IBS constructs. At baseline, the EuroQOL significantly correlated with both measures of IBS severity (FBDSI, IBSSS), disease-targeted HRQOL (IBS-QOL), generic HRQOL (CDC-4), work presenteeism, hospital anxiety and depression scores, and key bowel symptoms, including abdominal pain and distension. The EuroQOL achieved statistically significant longitudinal tracking with changes in the FBDSI, IBSSS, IBS-QOL, CDC-4, WPAI:IBS (worker productivity activity index for IBS) presenteeism, and hospital anxiety and depression scores. In short, the instrument significantly correlated with a wide range of disparate instruments that jointly capture the illness experience of IBS, both at baseline and longitudinally.
There were no significant differences in EuroQOL utilities between IBS-C (0.76), IBS-D (0.76), and IBS-M (0.73) subgroups (P = 0.55). The mean utilities in patients with severe vs. non-severe IBS (using standard IBSSS definitions) were 0.70 and 0.80, respectively (P < 0.001). The utilities in patients reporting “considerable relief” of symptoms at 3 months vs. those without considerable relief were 0.78 and 0.73, respectively (P = 0.02).
There were 21 patients who improved minimally during the 3-month period, as defined by a +1 or +2 improvement on the follow-up Guyatt response scale. The mean EuroQOL change score in this subgroup was 0.03 points, suggesting that patients must improve by at least 0.03 points on a utility scale to meet the MCID benchmark.
Health economists generally prioritize funding to conditions that are mortal yet treatable. Once these conditions are supported, additional funding is then allocated to treatable conditions that affect morbidity but not mortality—conditions such as IBS. Within this group of conditions, health economists rank order their funding decisions on the basis of incremental cost-utility, a process that accounts for the HRQOL of untreated vs. treated disease (9). This is more than theory—it is the fundamental process by which health economic decision-making unfolds. In light of this reality, it is vital to accurately and reliably measure health utilities in conditions marked by morbidity but not mortality.
Earlier studies have measured utilities in patients with IBS. For example, Pare et al. (25) calculated a mean EuroQOL score of 0.64 in a cohort of unclassified IBS patients in Canada, and Akehurst et al. (26) arrived at a mean utility of 0.68 in a cohort of Rome I IBS patients in the United Kingdom. However, these studies have limitations, including (i) the IBS patients were not fully described in terms of IBS subgroupings, symptom scores, severity indexes, and psychosocial variables, thus making it difficult to generalize to other populations; (ii) the studies did not measure the fundamental psychometric properties of the EuroQOL in IBS, leaving it uncertain whether EuroQOL-derived utilities are valid in the first place; (iii) the studies did not establish whether the EuroQOL could distinguish between important subgroups, including severity or response-status groupings; (iv) the studies did not measure utility scores in IBS bowel habit subtypes; and (v) the studies did not measure the MCID for EuroQOL utilities in IBS. In this study, we aimed to explicitly measure the psychometric properties of the EuroQOL in IBS, establish whether the EuroQOL utilities could distinguish between clinically relevant subgroups, and calculate an MCID for this questionnaire in the IBS population.
We found that the mean baseline utility for our IBS cohort was 0.75. To put this into perspective, this is equivalent to a 35-year-old IBS patient stating that she would be willing to forego 10 of her remaining 40 years of projected life in exchange for an immediate and lasting IBS cure. Moreover, the utility for severe IBS is 0.7—the same value as class III congestive heart failure (27). More than numerical values to plug into future cost-utility analyses, these scores highlight the considerable burden of illness engendered by IBS, and remind us that IBS is not simply a nuisance condition.
In addition, we found that utility scores do not vary significantly among IBS-C, IBS-D, and IBS-M subgroups, as defined by Rome III criteria. This is consistent with earlier studies that found no differences in generic HRQOL scores between groups (5). Similarly, recent data do not reveal differences in overall severity scores between these subgroups (18). Nonetheless, it is possible that a disease-targeted HRQOL instrument might detect meaningful differences between these sub-populations. In the meantime, this study adds to earlier data that the patient-reported illness experience is similar among the IBS bowel habit subtypes.
We found that the EuroQOL exhibits excellent baseline and longitudinal validity in IBS. Specifically, the EuroQOL correlates both cross-sectionally and longitudinally with a wide range of disparate IBS severity anchors, including bowel symptoms, HRQOL scores, severity scores, and even worker productivity (Table 3). In addition, the EuroQOL can discriminate between important subgroups, including global responders and severity groups. This indicates that the EuroQOL accurately captures the illness experience of IBS, and that EuroQOL utilities generated in IBS patients are sufficiently valid for use in future cost-utility studies, clinical trials, and even in everyday clinical practice where tenable.
In order for a PRO to be useful in a clinical trial, it is important to have an a priori definition of a “responder.” This is traditionally measured with the MCID on the PRO scale (23). Using the technique described by Guyatt (23), we found that improvement by 0.03 points on a 0–1 scale defined “minimal improvement” in our IBS cohort. To help interpret the meaning of a 0.03-point improvement, it is equivalent to a patient with 33 years of projected lifespan saying he is willing to forego 1 of those years in exchange for an IBS cure; i.e., he would be willing to live only for 32 instead of 33 years, if it meant being cured of IBS forever. These “time trade-off” exercises (10), although seemingly academic, are at the heart of health economic decision-making. Moreover, they provide an immediate and important insight into severity of disease, because they require patients to value their disease in terms of life-years.
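The arithmetic behind these time trade-off interpretations can be checked with a short sketch: a utility decrement multiplied by the remaining life expectancy gives the number of life-years a patient would be willing to give up. The function name is hypothetical, and the simple proportional relationship is the simplification implied by the examples in the text.

```python
def years_forgone(utility_decrement: float, remaining_years: float) -> float:
    """Life-years traded away for a given utility decrement (simple proportional model)."""
    if not 0.0 <= utility_decrement <= 1.0:
        raise ValueError("utility decrement must lie between 0 and 1")
    return utility_decrement * remaining_years


# Mean baseline utility of 0.75, i.e., a decrement of 0.25 from perfect health,
# for a patient with 40 projected years of life remaining
print(years_forgone(1.0 - 0.75, 40))  # 10.0

# MCID of 0.03 for a patient with 33 projected years of life remaining
print(round(years_forgone(0.03, 33), 2))  # 0.99, i.e., about 1 year
```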
Our study has limitations. First, this is an observational cohort of patients, not a tightly controlled clinical trial. However, we believe there are important benefits of monitoring IBS patients outside of a clinical trial. For one, we have found that IBS patients in this cohort did not, in fact, improve over time as measured using EuroQOL. The mean utility fell from 0.75 to 0.74 over a 3-month period, reflecting a combination of some patients improving and others worsening. Although our study was not designed to answer the question of whether IBS patients “get better” over time, it is noteworthy that this cohort did not achieve a mean improvement in generic HRQOL over the course of a 3-month natural history study. This highlights the important differences between carefully conducted and monitored trials vs. everyday practice. Academic theory and clinical reality must be distinguished, and this observational cohort shows that treatment of IBS in everyday clinical practice may not fully comport with the results of clinical trials. More to the point of our study, an observational cohort is well suited for the purpose of psychometric validation of PROs. It is by no means mandatory to validate PROs in the context of a clinical trial. In fact, it is arguably suboptimal to use clinical trials as a platform for psychometric validation.
A second limitation is that we only monitored patients over a 3-month period. As IBS is a chronic condition, it is possible that utility scores might vary with longer follow-up periods. However, the Rome guidelines recommend a minimum 4 to 12-week period for purposes of clinical investigation (28), and our study complies with this standard. Nonetheless, we plan to continue following this cohort over an extended 12-month period to evaluate for subsequent changes.
Our study is further limited by the relatively small number of patients at follow-up vs. baseline. This occurred for several reasons, including (i) some patients opted to only participate in the baseline survey, and requested no further communications thereafter; (ii) patients only received payment for initial enrollment—not for follow-up assessments; (iii) as our recruitment is a rolling process, some patients had not completed the 3-month follow-up at the time of analysis, and (iv) many patients did not respond to our follow-up email or paper mail requests. To check whether the non-responders were systematically different from the responders, we compared baseline demographics (age, gender) and severity score (IBSSS, FBDSI, severity numeric rating scales) between groups. There were no significant differences between these groups at baseline.
Finally, the seemingly low utilities of our IBS cohort might be a reflection of a sampling bias of severe patients. However, although our cohort included patients from tertiary care referral centers, it also included patients from a range of secondary care centers. Moreover, the mean severity score was only 11 on a 20-point scale, indicating a spread of severity without a skew to more severe patients. Similarly, using standard IBSSS criteria, 45% of the cohort had non-severe symptoms. In short, our sample seems to generalize well to other physician-seeking IBS patients, not only on the basis of overall severity scores, but also on the basis of sociodemographic characteristics, as presented in Table 2.
In conclusion, EuroQOL utilities exhibit excellent construct and discriminant validity in IBS. Scores on the questionnaire can be readily interpreted with an MCID of 0.03 points—a value that could potentially serve as an end point for clinical trials, or even for everyday use in practices that monitor HRQOL scores like a “vital sign.” The utility of severe IBS (0.7) is on par with class III congestive heart failure (27). These validated utilities can now be used in future cost-utility analyses in IBS in an effort to further define the health economic implications of competing treatment strategies for this expensive, prevalent, and oftentimes morbid disorder.
CONFLICT OF INTEREST
Potential competing interest: Brennan Spiegel received grant support from Takeda. He is a consultant for Prometheus, Takeda, McNeil, and Rose Pharmaceuticals.
Financial support: Spiegel is supported by a Veteran’s Affairs Health Services Research and Development Career Development Award (RCD 03-179-2), and the CURE Digestive Disease Research Center (NIH 2P30 DK 041301-17). Chang, Naliboff, and Mayer are supported by NIH grant no. P50 DK64539, and Spiegel, Mayer, and Naliboff are supported by NIH Center Grant 1 R24 AT002681-NCCAM from the UCLA Center for Neurobiology of Stress.
Guarantor of the article: Brennan Spiegel, MD, MSHS.
Specific author contributions: Spiegel and Chang formulated the hypotheses and aims of the study, wrote the study protocol, and prepared the manuscript. Spiegel carried out the analyses in concert with Chang and Bolus. Chang, Chey, Dulai, Esrailian, Harris, Karsan, Lembo, Lucak, Talley, and Tillisch assisted with patient recruitment and review of the manuscript. Naliboff and Mayer provided intellectual input to the manuscript.
Disclaimer: The opinions and assertions contained in this article are the sole views of the authors and are not to be construed as official or as reflecting the views of the Department of Veterans Affairs.