Cost-effectiveness analysis of a randomized plus observational cohort trial
To analyze cost-effectiveness using Spine Patient Outcomes Research Trial (SPORT) data over 4 years, comparing surgery with non-operative care for three common diagnoses: spinal stenosis (SpS), degenerative spondylolisthesis (DS) and intervertebral disc herniation (IDH).
Spine surgery rates continue to rise in the US, but the safety and economic value of these procedures remain uncertain.
Patients with image-confirmed diagnoses were followed in randomized or observational cohorts with data on resource use, productivity and EQ-5D health state values measured at 6 weeks, 3, 6, 12, 24, 36, and 48 months. For each diagnosis, cost per quality-adjusted life year (QALY) gained in 2004 US Dollars was estimated for surgery relative to non-operative care using a societal perspective, with costs and QALYs discounted at 3% per year.
Surgery was performed initially or during the 4-year follow-up among 414/634 (65.3%) SpS, 391/601 (65.1%) DS and 789/1,192 (66.2%) IDH patients. Surgery improved health, with persistent QALY differences observed through 4 years (SpS QALY gain 0.22, 95%CI: 0.15, 0.34; DS QALY gain 0.34, 95%CI: 0.30, 0.47; IDH QALY gain 0.34, 95%CI: 0.31, 0.38). Costs per QALY gained decreased for SpS from $77,600 at 2 years to $59,400 (95%CI: $37,059, $125,162) at 4 years; for DS from $115,600 to $64,300/QALY (95%CI: $32,864, $83,117); and for IDH from $34,355 to $20,600/QALY (95%CI: $4,539, $33,088).
Comparative effectiveness evidence for clearly defined diagnostic groups from SPORT shows good value for surgery compared with non-operative care over 4 years.
The American Recovery and Reinvestment Act of 2009 mandated a $1.1 billion investment in comparative effectiveness research, defined by the Institute of Medicine (IOM) as “…the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition….”1 While the role of economic endpoints in comparative effectiveness research remains controversial, the marked growth in complex spine surgery and accompanying expenditures in the US population over the past two decades has prompted concern regarding spine surgery’s value for both individual patients and society.2-4 Begun more than a decade ago, the Spine Patient Outcomes Research Trial (SPORT) addresses IOM priority conditions and evaluates the comparative effectiveness of surgery and non-operative care using clinical and economic endpoints from both randomized and observational study cohorts.5-10
SPORT was designed with a secondary objective of assessing the cost-effectiveness of spine surgery for patients with back and/or leg symptoms for three specific clinical conditions. The economic value of surgery relative to non-operative care at 2 years compared favorably with many health interventions.11,12 However, surgery for degenerative spondylolisthesis was somewhat more costly than for patients with stenosis alone (mean cost per QALY gained of $115,600 vs. $77,600 for stenosis alone). This was largely due to differences in the initial cost of surgery for patients with degenerative spondylolisthesis; these patients often undergo fusion surgery, which is more costly than decompressive laminectomy alone (the most common procedure in patients who have only stenosis). In contrast to prior literature, we hypothesized that surgery’s value in these well-defined conditions would improve over time. This would occur if health gains remained durable, especially if patients receiving surgery had lower ongoing health care costs relative to non-operatively treated patients, taking into account the offsetting cost of repeat surgeries, which would have the potential to diminish surgery’s cost-effectiveness. In this paper we report SPORT 4-year cost-effectiveness outcomes for all patient groups.
Details of SPORT’s design and conduct are provided elsewhere.6-8,10,13 In brief, participants enrolled in either a randomized or observational cohort from 13 participating U.S. multidisciplinary spine practices in 11 states between March 2000 and March 2005 and were followed for outcomes over 4 years. Participants in the randomized group were assigned treatment while those in the observational cohort chose their treatment. Eligible participants were aged 18 and older with well-defined symptoms, physical findings and imaging-confirmed diagnosis of spinal stenosis either alone (SpS) or associated with degenerative spondylolisthesis (DS), or diagnosis of intervertebral disc herniation (IDH). Non-operative treatments were ‘usual care’ determined by patients’ and physicians’ choice. For SpS, the protocol surgical intervention was a standard posterior laminectomy. For DS, the protocol surgery was the same procedure with or without bilateral single-level fusion with or without instrumentation. For IDH, the protocol surgical intervention was a standard open discectomy. An independent Data Safety and Monitoring Board oversaw the study and a human subjects committee approved the protocol at each institution.
For the cost-effectiveness analysis, treatment effectiveness was measured using quality-adjusted life years (QALYs), which account for both length and quality of life,14 by weighting time spent in each health state by a health state value. Health state values, which reflect societal health preferences on a scale where a year in best imaginable health is assigned a value of 1 and death is assigned a value of 0, were obtained using the EuroQol EQ-5D instrument with US scoring.15,16 Secondary analyses used the SF-6D (UK scoring) health state values derived from the SF-36 health status instrument.17 Mean health states were estimated at baseline, 6 weeks, 3, 6, 12, 24, 36 and 48 months.
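The weighting of time by health state value can be illustrated with a minimal Python sketch. It computes undiscounted QALYs as the area under a health state value curve across the follow-up schedule. The trapezoidal interpolation between visits, the approximation of 6 weeks as 1.5 months, and the example values are illustrative assumptions, not SPORT data or necessarily the study's exact calculation.

```python
# Follow-up schedule from the text, in months; 6 weeks is approximated as 1.5 months.
VISIT_MONTHS = [0, 1.5, 3, 6, 12, 24, 36, 48]


def qalys_from_values(months, health_state_values):
    """Area under the health state value curve, in years (trapezoidal rule)."""
    if len(months) != len(health_state_values):
        raise ValueError("months and health_state_values must have equal length")
    total = 0.0
    for i in range(1, len(months)):
        interval_years = (months[i] - months[i - 1]) / 12.0
        mean_value = (health_state_values[i] + health_state_values[i - 1]) / 2.0
        total += interval_years * mean_value
    return total


# Hypothetical health state values for one treatment group (not study results).
example_values = [0.50, 0.60, 0.70, 0.75, 0.80, 0.80, 0.80, 0.80]
example_qalys = qalys_from_values(VISIT_MONTHS, example_values)
print(round(example_qalys, 3))
```

A person in best imaginable health (value 1 at every visit) would accrue 4 QALYs over the 48 months under this scheme.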
Participants were given health care diaries to assist them in tracking both medical resource use and time lost from work and other activities. Total costs included direct medical costs (based on patient-reported utilization; limited to spine-related services except for physician visits and hospitalizations) and indirect costs (based on patient-reported time away from work and/or usual activities due to spine-related problem(s)). Information was collected from patients with questionnaires at each time point, using either a 6-week (at 6 weeks and 3 months) or 1-month recall period. Reports of hospitalizations, surgeries and device use were not confined to a recall window.
Direct medical costs included any emergency room or outpatient visits (surgeons, chiropractors, other physicians, physical therapists, acupuncturists, or other health care providers) and spine-related diagnostic tests (radiograph, computed tomography scan, magnetic resonance imaging); electromyography; injections; devices (e.g., braces, canes, walkers); medications; and rehabilitation or nursing home days. To estimate direct medical costs, unit costs were assigned to each visit, test, and procedure based on 2004 Medicare national allowable payment amounts18 (see Appendix Table), with medication costs based on 2004 average wholesale prices.19 For each participant, medical resource use was multiplied by unit costs to estimate total direct medical cost at each time point. All costs are expressed in 2004 US dollars.
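The costing step described above (resource use multiplied by unit costs) can be sketched as follows. The service names and dollar amounts are hypothetical placeholders and are not the Medicare-based values from the Appendix Table.

```python
# Hypothetical unit costs in 2004 US dollars (placeholders, not Appendix values).
UNIT_COSTS = {
    "physician_visit": 60.0,
    "physical_therapy_visit": 40.0,
    "mri": 500.0,
    "injection": 200.0,
}


def direct_medical_cost(resource_use, unit_costs=UNIT_COSTS):
    """Sum of (units used x unit cost) over all services reported at one time point."""
    unknown = set(resource_use) - set(unit_costs)
    if unknown:
        raise KeyError(f"no unit cost for: {sorted(unknown)}")
    return sum(count * unit_costs[item] for item, count in resource_use.items())


# Hypothetical patient report for a single recall period.
reported_use = {"physician_visit": 2, "physical_therapy_visit": 4, "mri": 1}
cost_at_visit = direct_medical_cost(reported_use)
print(cost_at_visit)
```

Rejecting services without a unit cost, rather than silently costing them at zero, is a design choice of this sketch.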
Surgery costs depended on the procedure performed and occurrence of complications, which in turn determined the diagnosis-related group. The observed 2004 Medicare mean total diagnosis-related group payment was used to reflect hospital-related surgery costs. Surgeon costs were based on 2004 Medicare allowable amounts using the resource-based relative value scale.20 Anesthesiology costs were estimated using operative time. For hospitalizations not associated with a spine surgery, costs were based on the diagnosis-related group using mean observed 2004 Medicare payments.
At each follow-up the impact of spine-related problems on productivity was assessed. Participants were asked to report missed work days if employed outside of the home and missed homemaking days if housekeeping was designated as the primary work activity. Use of unpaid caregivers for spine-related problems (including spousal care giving) was also obtained. Costs were estimated using the standard human capital approach21 by multiplying the change in hours worked by the gross-of-tax wage rate based on self-reported wages at study entry. Costs for missed days of housekeeping and unpaid caregivers were valued based on average wages plus non-health benefits for individuals ages 35 and older.22-24
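A minimal sketch of the human capital valuation described above follows. The 8-hour day used to convert missed days to hours, the function names, and all dollar figures are assumptions made for illustration; the study's wage inputs are not reproduced here.

```python
HOURS_PER_DAY = 8  # assumed conversion from missed days to hours


def indirect_cost(missed_work_days, self_reported_hourly_wage,
                  missed_housekeeping_days, caregiver_hours,
                  average_hourly_wage_with_benefits):
    """Productivity cost for one recall period under the human capital approach.

    Missed paid work is valued at the participant's own gross wage; missed
    housekeeping and unpaid caregiver time are valued at an average wage
    plus non-health benefits.
    """
    inputs = (missed_work_days, self_reported_hourly_wage,
              missed_housekeeping_days, caregiver_hours,
              average_hourly_wage_with_benefits)
    if any(x < 0 for x in inputs):
        raise ValueError("inputs must be non-negative")
    paid_work = missed_work_days * HOURS_PER_DAY * self_reported_hourly_wage
    housekeeping = (missed_housekeeping_days * HOURS_PER_DAY
                    * average_hourly_wage_with_benefits)
    caregiving = caregiver_hours * average_hourly_wage_with_benefits
    return paid_work + housekeeping + caregiving


# Hypothetical participant: 3 missed work days at $20/h, 2 missed housekeeping
# days and 5 caregiver hours valued at an assumed $15/h.
example_indirect = indirect_cost(3, 20.0, 2, 5, 15.0)
print(example_indirect)
```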
Data were analyzed separately by disease group according to treatment received for the pooled SPORT randomized and observational cohorts, using longitudinal regression models fitted with generalized estimating equations.25,26 Separate models were fit for EQ-5D and for 30-day costs measured at each follow-up time point after surgery or the beginning of non-operative therapy. If a visit was missing, all other available visits for that patient were included in the analysis.
The treatment indicator (surgery versus non-operative care) was a time-dependent covariate, allowing for variable surgery times. Following surgery, outcomes were assigned to the surgical group, with follow-up times measured from the date of surgery. To adjust for potential confounding in each model and the possible effects of missing data, baseline variables associated with missing data or treatment received were included as covariates. All models included a fixed effect for center. To account for correlations among repeated measurements for individuals, including observations before and after surgery, the longitudinal regression models were fit with PROC GENMOD (SAS version 9.1 Windows XP Pro, Cary, NC), specifying a compound symmetry assumption for the working covariance matrix.
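The time-dependent treatment indicator can be illustrated with a short sketch that labels each follow-up observation and resets the clock at surgery. It assumes that non-operative follow-up is timed from enrollment and that a visit on the day of surgery still counts as non-operative; both are assumptions of this sketch, and the function and variable names are hypothetical.

```python
from datetime import date


def label_observation(visit_date, nonop_start_date, surgery_date=None):
    """Return (treatment group, days since that group's time zero) for one visit."""
    if surgery_date is not None and visit_date > surgery_date:
        # After surgery: assign to the surgical group, time measured from surgery.
        return "surgery", (visit_date - surgery_date).days
    # Before surgery, or never operated: time measured from start of non-operative care.
    return "non-operative", (visit_date - nonop_start_date).days


# Hypothetical participant who enrolls in January and has surgery in July.
enrolled = date(2001, 1, 10)
surgery = date(2001, 7, 10)
visits = [date(2001, 4, 10), date(2002, 1, 10)]
labels = [label_observation(v, enrolled, surgery) for v in visits]
print(labels)
```

Under this scheme one participant can contribute observations to both groups, which is why the regression models account for within-person correlation.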
The primary cost-effectiveness endpoint was the incremental cost-effectiveness ratio (ICER) estimated as cost per QALY gained for surgery relative to non-operative treatment. For stenosis patients with or without degenerative spondylolisthesis we report cost per QALY gained by surgery type relative to non-operative care.
Mean total costs and QALYs from baseline to 4 years were estimated for each diagnosis and treatment group using a 3% annualized discount rate for both endpoints. Discounting is used to weigh near-term costs and health more heavily in the analysis than those occurring in the future. A time-weighted average was used to estimate the difference in QALYs between the surgical and non-operative treatments based on adjusted mean differences in EQ-5D estimated from longitudinal regression models at each follow-up. QALY differences between treatment groups were estimated using a common baseline EQ-5D value. For costs, mean differences were based on adjusted mean costs summed across time points for each treatment group. To estimate a confidence interval for the cost per QALY gained, a bootstrap method was applied using 1000 samples taken with replacement from the original sample with the individual as the unit of observation.
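The discounting and bootstrap steps can be sketched in Python under simplifying assumptions. The sketch treats the first year as undiscounted, uses unadjusted group means rather than the regression-adjusted means described above, resamples individuals within each treatment group, and takes percentile limits; none of these details is confirmed by the text. The data are simulated, not SPORT data.

```python
import random
from statistics import mean

DISCOUNT_RATE = 0.03  # 3% per year, as stated in the text


def discounted_total(yearly_values, rate=DISCOUNT_RATE):
    """Present value of yearly amounts; the first year is left undiscounted (assumption)."""
    return sum(v / (1.0 + rate) ** year for year, v in enumerate(yearly_values))


def icer(surgery, nonop):
    """Cost per QALY gained; each arm is a list of (cost, qalys) per individual."""
    d_cost = mean(c for c, _ in surgery) - mean(c for c, _ in nonop)
    d_qaly = mean(q for _, q in surgery) - mean(q for _, q in nonop)
    return d_cost / d_qaly


def bootstrap_icer_ci(surgery, nonop, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling individuals with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        s = rng.choices(surgery, k=len(surgery))
        n = rng.choices(nonop, k=len(nonop))
        estimates.append(icer(s, n))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated individuals (hypothetical 4-year discounted totals).
_rng = random.Random(42)
surgery_arm = [(_rng.gauss(30000, 5000), _rng.gauss(3.0, 0.3)) for _ in range(200)]
nonop_arm = [(_rng.gauss(15000, 5000), _rng.gauss(2.7, 0.3)) for _ in range(200)]

point_estimate = icer(surgery_arm, nonop_arm)
ci_lower, ci_upper = bootstrap_icer_ci(surgery_arm, nonop_arm)
print(round(point_estimate), round(ci_lower), round(ci_upper))
```

Because the ICER is a ratio, its bootstrap distribution becomes unstable when the QALY difference in a resample approaches zero; the simulated arms here are separated widely enough to avoid that case.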
Sensitivity analyses of analytic assumptions included: restricting analyses to the randomized or observational cohort; limiting costs to direct medical costs only to facilitate comparison with Reference Case recommendations of the US Panel on Cost-effectiveness in Health and Medicine; increasing surgery costs to 70% of the amount billed to Medicare; using SF-6D to estimate effectiveness; and accounting for observed mortality. We also performed a sensitivity analysis in which patients who received surgery more than 2 years after study entry were censored from the analysis at the time of surgery.
A total of 414/634 (65.3%) SpS, 391/601 (65.1%) DS and 789/1,192 (66.2%) IDH participants underwent surgery. Examination of baseline participant characteristics by treatment received over 4 years (Table 1) shows that surgically treated patients, in general, were younger; more frequently perceived that their problem was getting worse; and had a definite surgical preference compared with non-operatively treated participants.
Among surgically treated patients, reoperations were not common. For SpS, 43 (10.4%) patients underwent 47 additional surgeries; for DS, 48 (12.3%) patients had 52 additional surgeries, and for IDH 70 (8.9%) patients had 82 repeat surgeries. In each case, the majority of repeat surgeries were within 2 years of the initial surgery with a substantial minority occurring after 2 years, including 32.6% of SpS, 20.8% of DS and 24.4% of IDH repeat procedures.
Higher health state values were observed over time among surgically treated patients than among non-operatively treated patients (Figure 1). Mean quality-adjusted life years over the 4-year study period ranged from 2.66 to 3.24 (Table 2). QALY differences between treatment groups over 4 years were 0.22 (95%CI: 0.15, 0.34) for SpS; 0.34 (95%CI: 0.30, 0.47) for DS; and 0.34 (95%CI: 0.31, 0.38) for IDH.
Adjusted total mean costs remained higher for surgically treated patients than for non-operatively treated patients across all patient groups (Table 2). Cost differences between treatment groups over 4 years were $13,147 (95%CI: $9,168, $21,716) for SpS; $22,127 (95%CI: $13,149, $38,317) for DS; and $6,994 (95%CI: $1,900, $11,237) for IDH. Examination of costs by treatment received showed somewhat different patterns over time (Figure 2). Ongoing direct medical costs were observed for all groups (Figure 2A) with similar expenditure patterns between treatment groups within each disease category (data not shown). The largest ongoing costs occurred among DS patients, for whom ongoing indirect costs were higher in the non-operatively treated group (Figure 2B).
The cost per QALY gained for surgery relative to non-operative care was lowest for those with IDH ($20,600) and highest for those with DS ($64,300) (Table 2). Only 23 DS patients underwent decompression alone and only 47 SpS patients underwent fusion surgery, making definitive comparisons between procedures within disease groups impractical. Among those with SpS, fusion surgery’s cost per QALY gained relative to non-operative care was $257,600 with a very wide confidence interval (Table 2). Among those with DS, fusion surgery’s cost per QALY gained relative to non-operative care was $66,300.
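As a worked check of the ratio itself, the sketch below divides the 4-year cost differences by the 4-year QALY differences reported in this section and compares the result with the reported cost per QALY gained. The inputs are the rounded point estimates from the text; the small gaps that remain are assumed to reflect rounding of the published QALY differences.

```python
# Reported 4-year point estimates from the text (2004 US dollars, QALYs).
REPORTED = {
    "SpS": {"cost_diff": 13147, "qaly_diff": 0.22, "reported_icer": 59400},
    "DS": {"cost_diff": 22127, "qaly_diff": 0.34, "reported_icer": 64300},
    "IDH": {"cost_diff": 6994, "qaly_diff": 0.34, "reported_icer": 20600},
}


def cost_per_qaly(cost_diff, qaly_diff):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    if qaly_diff <= 0:
        raise ValueError("ratio is only interpretable here for a positive QALY gain")
    return cost_diff / qaly_diff


recomputed = {
    dx: cost_per_qaly(v["cost_diff"], v["qaly_diff"]) for dx, v in REPORTED.items()
}
relative_gap = {
    dx: abs(recomputed[dx] - v["reported_icer"]) / v["reported_icer"]
    for dx, v in REPORTED.items()
}
for dx in REPORTED:
    print(dx, round(recomputed[dx]), f"{relative_gap[dx]:.1%}")
```

The recomputed ratios fall within about 2% of the reported values for all three diagnoses.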
When type of instrumentation was examined for DS patients who underwent instrumented fusion, no statistically significant differences in QALY outcomes were found. The cost-effectiveness of each type of instrumentation relative to non-operative treatment was comparable at approximately $65,000 to $75,000 per QALY gained.
In sensitivity analyses, mortality adjustment, method of QALY estimation, and limiting the analysis to surgeries occurring within 2 years had little impact on cost-effectiveness estimates (Table 3). While study cohort (randomized vs. not) had little impact on cost-effectiveness for DS or IDH, in the SpS group the randomized cohort cost per QALY gained was somewhat higher at $124,700. Estimates remained below $125,000 per QALY gained across disease groups when higher surgery costs were used.
We used longitudinal patient-reported data on resource use, productivity loss, and health-related quality of life to evaluate the cost-effectiveness of surgery relative to non-operative care for three well-defined clinical cohorts. Compared with findings over 2 years, when assessed over 4 years the value of surgery improved for all groups, and most notably for individuals with DS. This finding warrants examination of both the effectiveness and cost sides of the cost-effectiveness equation in comparison to previously reported 2-year outcomes.11,12
QALY differences at 2 years between surgically and non-operatively treated individuals of 0.17 (95%CI: 0.12, 0.22) for SpS; 0.23 (95%CI: 0.19, 0.27) for DS; and 0.21 (95%CI: 0.16, 0.25) for IDH were previously reported.11,12 Using these 2-year differences as benchmarks, the 4-year QALY results reported here continue to favor surgery (additional QALY differences of 0.05 for SpS; 0.13 for both DS and IDH for 4-year minus 2-year outcomes). However, the magnitude of the difference for SpS patients diminished far more than can be explained by the 3% per year discount rate employed in our analysis. For SpS patients, differences favoring surgery in years 3 and 4 were reduced by approximately 75% from those observed over 2 years. For DS and IDH, QALY gains remained comparable over time.
Net costs over 2 years were higher for surgically vs. non-operatively treated patients with reported differences, inclusive of initial and repeat surgery costs, of approximately $13,000 for SpS; $22,000 for DS; and $7,000 for IDH under Medicare costing.11,12 Both DS and IDH patients who were non-operatively treated tended to have ongoing costs in years 3 and 4 that were higher than those observed for surgically treated patients, with this effect being most pronounced for individuals with DS, mainly due to productivity losses. By contrast, SpS patients had costs that were fairly comparable between surgically and non-operatively treated patients, with costs during years 3 and 4 being slightly higher in operatively treated patients.
Following effectiveness and cost patterns over time resulted in improved estimates of surgery’s value–particularly for DS and IDH patients. These findings highlight the importance of following health and economic outcomes longitudinally to determine value over an extended time horizon.
While the randomized clinical trial remains a cornerstone for comparative effectiveness research, it is widely recognized that alternative study designs, including observational cohorts, are necessary to support comparative evidence development for many diseases and population subgroups. Our analyses included SPORT’s randomized and observational cohorts and adjusted for factors known to affect treatment received due to the cross-over between treatment groups. We acknowledge, however, that our analytic approach cannot control for any un-measured differences between the two patient groups. To assess the potential impact of treatment selection on cost-effectiveness results in sensitivity analyses we reported results for the observational and randomized cohorts separately and also undertook an analysis where surgeries occurring beyond two years were removed from consideration. Mean costs per QALY gained remained fairly stable and in all cases fell within the 95% confidence interval reported for the primary analysis.
Previous randomized and non-randomized observational studies have shown a diminution in effect of surgery over time and cost-effectiveness reports have been based on decision-analytic models and/or incomplete longitudinal data.27-29 For example, a model-based analysis compared types of surgery30 for stenosis patients, but did not assess surgery’s value relative to non-operative care and did not use longitudinal resource utilization data. Other economic analyses have addressed the value of spinal fusion for various populations or lumbar discectomy for IDH but have not reported outcomes using the recommended effectiveness measure, QALYs.4,14,29,31 Although our study addressed many of these shortcomings, several limitations warrant mention. First, our study relied on patient self-report of resource use and work/activity limitations over time. While these have not been validated against an external source, comparisons between treatment groups using the same self-report methodology are likely to provide reasonable estimates of cost differences between treatment groups over time. Second, although the SPORT cohorts represent the largest groups followed to date with health-related quality of life outcomes collected prospectively at multiple follow-ups, we are limited in our ability to make comparisons between types of surgery due to the predominance of one type of surgery within each disease group. For example, 79% of stenosis surgeries involved decompression without fusion while 91% of surgeries in those with degenerative spondylolisthesis involved fusion. Likewise, our study was not powered to examine surgery by fusion type (instrumentation vs. not; type of instrumentation, etc.), yet the SPORT study represents the best available evidence to date with outcomes reported over 4 years.
Finally, it is also important to emphasize that our results address the value of spine surgery in individuals with well-defined indications for surgery and cannot be generalized to other populations such as individuals with degenerative disc disease in whom surgery has become increasingly common.4
Early cost-effectiveness results from SPORT suggested good value for surgery relative to non-operative care for IDH and SpS, while the value of surgery for DS was not quite as favorable.11,12 However, it was noted that longer-term follow-up would be essential to fully characterize the cost-effectiveness of surgery for these specific indications. With follow-up over two additional years, it is evident that surgery for IDH has very favorable value regardless of the approach to costing that is undertaken. The cost-effectiveness of surgery for stenosis improved slightly, while the cost-effectiveness of surgery for DS improved markedly and now falls within the range of many commonly accepted medical and surgical practices. Continued follow-up of surgically and non-operatively treated patients is necessary to provide further evidence regarding the clinical effectiveness and cost-effectiveness of surgery over time. These data provide a basis for promoting fully informed choice for patients with disc herniation or spinal stenosis with or without degenerative spondylolisthesis who face the difficult decision of whether or not to undergo spine surgery.
The cost-effectiveness of spine surgery for patients in three well-defined clinical groups (SpS, DS, IDH) was assessed over 4 years among SPORT participants.
We gratefully acknowledge Catherine C. Lindsay, S.M. for extensive work in the development of the cost weights used in this analysis, Tamara S. Morgan for creation of the patient diaries and for her assistance in preparing this manuscript, and Loretta Pearson for editorial assistance.
Grant Support: The authors acknowledge funding from the following sources: The National Institute of Arthritis and Musculoskeletal and Skin Diseases (U01-AR45444), and the Office of Research on Women’s Health, the National Institutes of Health, and the National Institute of Occupational Safety and Health, the Centers for Disease Control and Prevention. The Multidisciplinary Clinical Research Center in Musculoskeletal Diseases is funded by NIAMS (P60-AR048094).
Grant funds were received in support of this work. One or more of the author(s) has/have received or will receive benefits for personal or professional use from a commercial party related directly or indirectly to the subject of this manuscript: e.g., honoraria, gifts, consultancies, royalties, stocks, stock options, decision making position.