The cervical total disc replacement (cTDR) was developed to treat cervical degenerative disc disease while preserving motion.
Cost-effectiveness of this intervention was established by looking at 2-year follow-up, and this update reevaluates our analysis over 5 years.
Data were derived from a randomized trial of 330 patients. Data from the 12-Item Short Form Health Survey were transformed into utilities by using the SF-6D algorithm. Costs were calculated by extracting diagnosis-related group codes and then applying 2014 Medicare reimbursement rates. A Markov model evaluated quality-adjusted life years (QALYs) for both treatment groups. Univariate and multivariate sensitivity analyses were conducted to test the stability of the model. The model adopted both societal and health system perspectives and applied a 3% annual discount rate.
The cTDR costs $1687 more than anterior cervical discectomy and fusion (ACDF) over 5 years. In contrast, cTDR had $34377 less productivity loss than ACDF. There was a significant difference in the return-to-work rate (81.6% compared with 65.4% for cTDR and ACDF, respectively; P = .029). From a societal perspective, the incremental cost-effectiveness ratio (ICER) for cTDR was −$165103 per QALY. From a health system perspective, the ICER for cTDR was $8518 per QALY. In the sensitivity analysis, the ICER for cTDR remained below the US willingness-to-pay threshold of $50000 per QALY in all scenarios (−$225816 per QALY to $22071 per QALY).
This study is the first to report the comparative cost-effectiveness of cTDR vs ACDF for 2-level degenerative disc disease at 5 years. The authors conclude that, because of the negative ICER, cTDR is the dominant modality.
ACDF, anterior cervical discectomy and fusion
AWP, average wholesale price
CEA, cost-effectiveness analysis
CPT, Current Procedural Terminology
cTDR, cervical total disc replacement
CUA, cost-utility analysis
DDD, degenerative disc disease
DRG, diagnosis-related group
FDA, US Food and Drug Administration
ICER, incremental cost-effectiveness ratio
IDE, Investigational Device Exemption
NDI, neck disability index
QALY, quality-adjusted life years
RCT, randomized controlled trial
SF-12, 12-Item Short Form Health Survey
VAS, visual analog scale
Cervical total disc replacement (cTDR) was developed to treat symptomatic cervical degenerative disc disease (DDD) while preserving motion. For single-level DDD, noninferiority and cost-effectiveness (CE) of cTDR compared with anterior cervical discectomy and fusion (ACDF) have been demonstrated for several devices in numerous studies.1-8 For 2-level symptomatic DDD, ACDF had been the standard of care until a recent randomized controlled trial (RCT) demonstrated superiority for cTDR, resulting in US Food and Drug Administration (FDA) approval.9 Similarly, the CE of 2-level cTDR was previously established using 2-year follow-up data from this trial.10 From the societal perspective, 2-level cTDR dominated ACDF at 2 years in this analysis, meaning that it was more effective at less cost.10 Although the original model allowed for extrapolation, this is less than ideal for determining long-term sustainability. Therefore, the current article reevaluates our previous conclusion by analyzing the same cohort with updated 5-year trial data.
To briefly review, 2 approaches are commonly used to conduct a cost-effectiveness analysis (CEA) for a clinical trial: simple incremental calculation vs decision analytical modeling. The simple calculation method is straightforward and uses quality-of-life data collected during the trial to calculate the aggregate for each arm to make comparisons between arms. Decision analytical modeling involves transforming the quality-of-life data into input parameters that are used to inform a decision model about the likelihood of clinical events occurring to trial subjects. The advantage of the latter is its flexibility. Decision analysis allows for time-frame extrapolation, subgroup analysis, and more robust sensitivity analyses to test generalizability. Obvious disadvantages include the time required to derive input parameters from the original trial data and the need for assumptions required to generate the model.11
This study models a cost-utility analysis (CUA), a type of CEA, because it allows for the estimation of the ratio between the cost of a health-related intervention and the benefit experienced by the patient. Therefore, the purpose was to use the best estimates of cost and utility to assemble a model that compares the 5-year CE of cTDR with ACDF for the treatment of 2-level cervical DDD. A secondary goal was to identify the value of indirect cost to society (productivity loss) with a more comprehensive return-to-work (RTW) analysis than previously evaluated.
Data were derived from a published RCT9 comparing cTDR (US FDA Investigational Device Exemption [IDE] trial for Mobi-C) with ACDF for 2-level DDD. Patients included in the trial (n = 330) had to be diagnosed with radiculopathy or myeloradiculopathy at 2 contiguous levels from C3 to C7 that was unresponsive to conservative treatment for at least 6 weeks or demonstrated progressive symptoms. The specific methodology from the RCT is not included here for the sake of brevity. All statistical analysis was conducted by using SAS version 9.3 (SAS Institute, Inc, Cary, North Carolina). All modeling was performed using a common decision-analysis software package (TreeAge Pro 2013; TreeAge Software, Williamstown, Massachusetts).
The study examined the cost and utility of cTDR vs ACDF during the 5 years after surgery. The analysis was conducted in accordance with the Panel on Cost-Effectiveness Health and Medicine convened by the US Public Health Service.12 We assessed both societal and health system perspectives. The societal perspective accounts for both direct and indirect costs, whereas the health system perspective accounts for direct costs alone. Direct medical costs included operating room time, hospital stay, postoperative medications, follow-up visits (scheduled and unscheduled), and surgery-related complications (details in the “Direct Medical Cost” section). Productivity loss was defined as lost workdays and/or the inability to continue working after having returned. Health-related utility outcome was expressed in quality-adjusted life years (QALYs). Both cost and QALY outcomes were discounted at a yearly rate of 3% to reflect their present value.12 The cost-utility outcome was calculated as the incremental cost-effectiveness ratio (ICER) for cTDR. By definition, an ICER is the difference in cost divided by the difference in QALY for 2 interventions. A value under the commonly accepted United States-based willingness-to-pay (WTP) threshold of $50000 per QALY was considered to favor cTDR in comparison with ACDF in terms of comparative CE.
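The discounting and ICER definitions above can be written out as a minimal sketch. The function names, the timing convention for discounting (first year undiscounted), and the example figures are illustrative assumptions and are not taken from the trial model. Because a negative ICER can arise either from a cheaper and more effective treatment or from a costlier and less effective one, the sketch classifies dominance from the signs of the differences rather than from the ratio alone.

```python
def present_value(amounts_by_year, rate=0.03):
    """Discount a stream of yearly amounts (costs or QALYs) to present value.

    Assumption: amounts_by_year[0] accrues in the first year and is left
    undiscounted; the paper does not state its timing convention.
    """
    return sum(a / (1.0 + rate) ** t for t, a in enumerate(amounts_by_year))


def icer(cost_new, cost_ref, qaly_new, qaly_ref):
    """Incremental cost-effectiveness ratio: cost difference / QALY difference."""
    d_qaly = qaly_new - qaly_ref
    if d_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return (cost_new - cost_ref) / d_qaly


def verdict(cost_new, cost_ref, qaly_new, qaly_ref, wtp=50000.0):
    """Classify the new treatment against the reference at a WTP threshold."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_cost < 0 and d_qaly > 0:
        return "dominant"        # cheaper and more effective
    if d_cost > 0 and d_qaly < 0:
        return "dominated"       # costlier and less effective
    # Otherwise compare via net monetary benefit at the threshold.
    net_benefit = wtp * d_qaly - d_cost
    return "cost-effective" if net_benefit >= 0 else "not cost-effective"


# Hypothetical figures, for illustration only.
print(round(present_value([100, 100, 100]), 2))
print(verdict(23000, 21000, 3.6, 3.4))   # costlier, more effective
print(verdict(20000, 21000, 3.6, 3.4))   # cheaper, more effective
```

With the hypothetical inputs shown, the first comparison is cost-effective at $50000 per QALY and the second is dominant.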
A Markov model (shown in Figure 1) was constructed to analyze postsurgical costs and health-related utility outcomes. The model initially assumed a target population of patients aged 44 years (based on the mean age from the RCT9). Each Markov state depicted both a patient's health and work status at any given time point evaluated. Overall quality of life was represented by 5 distinct health states, each having an assigned differential utility score (described further in the “Health States” section). According to the published RCT,9 preoperative demographics and health status (neck disability index [NDI], visual analog scale [VAS], and the 12-Item Short Form Health Survey [SF-12]) were not statistically different between the surgical cohorts. RTW status was dichotomized as either yes or no. In addition, if RTW was established, it was further stratified by persistence of work at each follow-up visit. Details pertaining to RTW status are described further in the “Productivity Loss” section. The model used varying time intervals per cycle to account for the typical clinical course: patients with DDD often experience rapid initial symptomatic improvement that tapers over time. The model begins with 6-week cycles from the initial surgery up to 6 months, followed by 6-month cycles thereafter. All model input parameters are provided in Table 1.
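The variable cycle structure described above can be illustrated with a small cohort simulation. This is a sketch under stated assumptions and not the TreeAge model used in the study: it tracks health states only (the study's states also carry work status), treats 6 weeks as 1.5 months so that four short cycles reach 6 months, credits each cycle with the utility of the end-of-cycle distribution, and uses hypothetical transition matrices and utilities.

```python
# Hypothetical inputs: the trial's transition probabilities and utilities
# are in the paper's tables and supplements and are not reproduced here.
STATES = ["mild", "moderate", "severe", "crippled", "bedbound"]
UTILITIES = [0.80, 0.70, 0.60, 0.50, 0.40]  # hypothetical utility per state


def cycle_lengths_months(horizon_months=60):
    """Four 1.5-month cycles (6 weeks approximated as 1.5 months),
    then 6-month cycles up to the horizon."""
    short = [1.5] * 4
    n_long = int((horizon_months - 6) // 6)
    return short + [6.0] * n_long


def improve_matrix(p, n=5):
    """Hypothetical matrix: move one state toward 'mild' with probability p."""
    m = [[0.0] * n for _ in range(n)]
    m[0][0] = 1.0
    for i in range(1, n):
        m[i][i - 1] = p
        m[i][i] = 1.0 - p
    return m


def step(dist, matrix):
    """Advance a cohort distribution by one cycle (row-stochastic matrix)."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def discounted_qalys(start, p_short, p_long, utilities,
                     rate=0.03, horizon_months=60):
    """Accumulate discounted QALYs over short, then long, cycles."""
    dist, elapsed, total = list(start), 0.0, 0.0
    for length in cycle_lengths_months(horizon_months):
        dist = step(dist, p_short if length < 6 else p_long)
        elapsed += length
        utility = sum(p * u for p, u in zip(dist, utilities))
        # Utility of the end-of-cycle state applied to the whole cycle.
        total += utility * (length / 12) / (1 + rate) ** (elapsed / 12)
    return total


# Everyone starts in the 'severe' state; faster improvement early on.
start = [0.0, 0.0, 1.0, 0.0, 0.0]
qalys = discounted_qalys(start, improve_matrix(0.3), improve_matrix(0.1), UTILITIES)
print(round(qalys, 3))
```

Separate matrices for the short and long cycles are needed because a transition probability is only meaningful for the cycle length over which it was estimated.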
The development of health states was described in detail in the 2-year article.10 In brief, they include mild, moderate, severe, crippled, and bedbound; they were created by stratifying the NDI and VAS for neck and arm pain, which are 2 principal quantitative measurements used in the RCT.9 The model then assessed transition probability between health states (see Tables, Supplemental Digital Content 1 and 2, http://links.lww.com/NEU/A841, http://links.lww.com/NEU/A842).
Each health state was assigned a health-related utility score measured in QALYs. The QALYs were evaluated by using SF-12 data collected preoperatively and at 6 weeks, and at 3, 6, 12, 24, 36, 48, and 60 months. As in the previous 2-year CEA,10 we opted to use SF-6D for converting SF-12 data into utilities.15 This SAS-based mapping algorithm was obtained from the University of Sheffield. Mean utility and standard deviations were computed for each of the 5 health states and were found to directly correlate.16 Preoperative and postoperative health states with associated utility values were then determined for each of the surgical cohorts.
Per the original RCT, data for productivity loss were limited to 3 years of follow-up for both surgical cohorts because no additional change in RTW was observed between the 2- and 3-year postoperative time points. Preoperatively, 204 (61.6%) patients worked; 74 (22.4%) were unable to work secondary to their disease; and 53 (16.0%) did not work for reasons other than the disease (eg, retired). We included only patients from the first 2 groups (n = 278) in our analysis. The proportion of patients working preoperatively did not significantly differ between surgical cohorts (58.4% in the ACDF group compared with 61.4% in the cTDR group; P = .313, 2-tailed z test). We considered 2 aspects when defining productivity loss: workdays missed (ie, RTW) and work persistence once returned.
For RTW, we calculated both the associated health state-specific and time-specific probabilities to inform the model (shown in Tables 2 and 3). With respect to work status persistence, we derived the health state-specific probabilities for continuing to work full-time, working part-time, and ceasing work completely. We assumed part-time work to be 50% of full-time work hours. A monetary value for productivity loss was calculated by multiplying the 2013 national average wage14 (the 2014 average wage was not available at the time of this analysis) by the corresponding proportion of time away from work postoperatively.
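As a sketch of the wage-based valuation just described, the expected loss for a period can be computed from the work-status probabilities. The function name and the example wage are hypothetical; only the 50% part-time assumption comes from the text, and the actual national average wage figure is not reproduced here.

```python
PART_TIME_FRACTION = 0.5  # from the text: part-time counted as 50% of full-time hours


def productivity_loss(annual_wage, years, p_full, p_part, p_none):
    """Expected wage value of work time lost over a period.

    p_full, p_part, p_none: probabilities of working full-time, working
    part-time, and not working during the period (must sum to 1).
    """
    if abs(p_full + p_part + p_none - 1.0) > 1e-9:
        raise ValueError("work-status probabilities must sum to 1")
    fraction_lost = p_part * (1.0 - PART_TIME_FRACTION) + p_none
    return annual_wage * years * fraction_lost


# Hypothetical wage and probabilities, for illustration only.
print(round(productivity_loss(40000, 1, 0.6, 0.2, 0.2), 2))
```

In the study these probabilities were specific to health state and time point, so a calculation of this kind would be applied per Markov cycle and then discounted.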
The cost-related trial data contain the following elements: operative time, hospital stay, medication use, adverse event rates, and follow-up office visits. Operating room time and hospital stay could not be directly translated into dollar amounts because the unit cost (eg, operating room per hour cost) was unknown. Instead, we used 2014 national average Medicare reimbursement rates for the associated diagnosis-related group (DRG) and Current Procedural Terminology (CPT) codes, assuming that operational differences were reflected by the difference in the Medicare reimbursement rates (Table 1). The model also considered costs associated with adverse events, but only those requiring subsequent surgeries, such as revision, fixation, or reoperation and removal. The analysis of medication use and unscheduled office visits was believed to capture other adverse events, such as pain, dysphagia (not requiring surgery), and adjacent segment disease (not requiring surgery). Revisions occurring within 30 days of the initial surgery were excluded because they are bundled with the initial DRG.
With respect to medication costs, we hypothesized that the type and number of medications taken immediately postoperatively should differ from later follow-up periods. In addition, this difference (particularly with analgesics, antispasmodics, and neuroleptics) was associated with the patient's health status. For medications missing a start date, the middle of the month (ie, the 15th) was imputed. The number of medications prescribed by postoperative follow-up period and health status is illustrated in Supplemental Digital Content 3 (see Table, http://links.lww.com/NEU/A843). We cross-referenced our data with the 2011 Redbook MarketScan17 to calculate medication costs. The Redbook file contains the average wholesale price (AWP) for all drugs assigned a national drug code. Cost was estimated using AWP times 85% (Table 4). This multiplier was based on Medicare's 2010 reimbursement rate for medications. After 2010, Medicare changed to an average sales price reimbursement method that is not publicly available. The estimated AWP was inflated to 2014 dollars by using the medical Consumer Price Index.18
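The medication costing rule above reduces to a single expression: AWP times 0.85, scaled by the ratio of the medical Consumer Price Index in the target year to that in the price year. The sketch below assumes this ratio form of inflation adjustment; the function name and the index values in the example are hypothetical placeholders, not published CPI figures.

```python
AWP_MULTIPLIER = 0.85  # from the text: cost estimated as AWP times 85%


def medication_cost(awp, cpi_price_year, cpi_target_year,
                    multiplier=AWP_MULTIPLIER):
    """Estimated reimbursed cost of a drug, inflated to target-year dollars."""
    if cpi_price_year <= 0:
        raise ValueError("price-year CPI must be positive")
    return awp * multiplier * (cpi_target_year / cpi_price_year)


# Hypothetical AWP and index values, for illustration only.
print(round(medication_cost(100.0, 400.0, 440.0), 2))
```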
Finally, costs associated with follow-up office visits (both scheduled and unscheduled) were included in the model. Scheduled office visits were those occurring at 6 weeks, and at 3, 6, 12, 24, 36, 48, and 60 months postoperatively. An analysis of the trial data revealed that the likelihood of having an unscheduled office visit is associated with worsening health state (P = .002). Therefore, we looked at health state-specific likelihood of office visits per 6 weeks (Markov cycle). As a result, for each scheduled or unscheduled office visit, the model assigned a CPT code based on the health state. For example, a normal office visit (CPT 99213) was used for mild and moderate disability states, whereas a higher-level visit (CPT 99214) was used for the severe and crippled health states.
To examine the uncertainty in our CUA, we conducted scenario, probabilistic, and threshold sensitivity analyses. First, we modified model inputs (such as age and time horizon) to examine costs and utilities under alternative scenarios (see Table 5). The target population was varied to include 30-, 55-, and 70-year age groups to assess how lifespan (and productivity) affects the model. Age-specific mortality rates were based on the 2013 National Vital Statistics Report,13 with a maximum age set at 100 years. Standard retirement age was assumed to be ≥65, after which no postoperative productivity loss occurred. The model time horizon was also varied from the base case of 5 years to include both 2- and 8-year scenarios. Second, a probabilistic analysis was used to address parameter uncertainty. The uncertainty regarding the model input parameters was characterized by probability distributions.19 β distributions (best fit for binomial data) were assigned to all probability parameters based on their point estimates and 95% confidence intervals derived from the trial data. γ distributions were used for cost items with the standard deviation, by convention, assumed to be 25% of the national reimbursement rate.19 γ distributions were also assigned to decrements in QALYs, with lower and upper bounds of 0 and 1, respectively. With distributions assigned, we ran a Monte Carlo simulation with 3000 iterations, in which the parameter values were simultaneously sampled based on their specified distributions. Simulation findings are presented in a CE acceptability curve (Figure 2). Finally, a threshold analysis was used to determine the values of key clinical parameters required to cause an absolute change in CE between the 2 surgical arms (Table 6). Absolute change was defined as the amount of parameter manipulation required for the ICER to fall below or exceed the WTP threshold of $50000 per QALY.
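The probabilistic analysis can be sketched with the Python standard library. The paper does not say how distribution parameters were fitted, so the method-of-moments fit for the β distribution (with the standard deviation backed out of the 95% confidence interval by a normal approximation) is an assumption, as are all function names and the toy increment model. Only the 25% standard deviation convention for costs, the 3000 iterations, and the two surgical cost means are taken from the text.

```python
import random


def beta_params(mean, ci_low, ci_high):
    """Method-of-moments beta parameters from a point estimate and 95% CI."""
    sd = (ci_high - ci_low) / (2 * 1.96)  # normal approximation to the CI
    common = mean * (1 - mean) / sd ** 2 - 1
    if common <= 0:
        raise ValueError("CI too wide for a beta distribution with this mean")
    return mean * common, (1 - mean) * common


def gamma_params(mean, sd_fraction=0.25):
    """Shape and scale of a gamma with SD equal to sd_fraction * mean."""
    shape = 1 / sd_fraction ** 2
    return shape, mean / shape


def acceptability(draw_increment, wtp, iterations=3000, seed=0):
    """Share of iterations with positive net monetary benefit at this WTP."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(iterations):
        d_cost, d_qaly = draw_increment(rng)
        if wtp * d_qaly - d_cost > 0:
            wins += 1
    return wins / iterations


def toy_increment(rng):
    """Hypothetical stand-in for one run of the full model."""
    cost_new = rng.gammavariate(*gamma_params(20488))  # reported cTDR surgical cost
    cost_ref = rng.gammavariate(*gamma_params(16945))  # reported ACDF surgical cost
    # Hypothetical probabilities of being in the mild state, per arm.
    p_new = rng.betavariate(*beta_params(0.60, 0.50, 0.70))
    p_ref = rng.betavariate(*beta_params(0.45, 0.35, 0.55))
    d_qaly = 0.5 * (p_new - p_ref)  # hypothetical QALY weight
    return cost_new - cost_ref, d_qaly


# One point per WTP value traces out an acceptability curve.
for wtp in (10000, 50000, 100000):
    print(wtp, acceptability(toy_increment, wtp))
```

Reusing the same seed at every WTP value evaluates each threshold on the same simulated draws, which keeps the resulting curve monotone in this toy setting.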
The baseline analysis examined the cost and utility (in QALYs) of cTDR vs ACDF for a patient with 2-level symptomatic cervical DDD during a 5-year postoperative time horizon. The results are summarized in Table 4.
With respect to direct medical cost, cTDR was projected to have a higher initial surgical cost ($20488) than ACDF ($16945), but lower costs associated with adverse events ($689), medications ($1736), and office visits ($546) compared with ACDF ($2292, $1944, and $591, respectively). Taken together, cTDR costs $1687 more than ACDF over 5 years. In contrast, cTDR had substantially less productivity loss than ACDF, $57447 vs $91824 over 3 years, respectively. This is likely secondary to significant differences in RTW rates between the 2 surgical cohorts (Table 3), a total of 80.6% compared with 65.4% for cTDR and ACDF, respectively. Looking specifically at the 6-week postoperative period, cTDR also demonstrated a statistical trend toward higher RTW rates compared with ACDF, 47.6% vs 36.5%, respectively (P = .059).
In terms of quality of life, on average, the model projected that a patient undergoing cTDR enjoyed 35.5 of 60 months in the health state of mild disability, a total of 10.4 months greater than the ACDF cohort. The ACDF cohort spent longer, on average, in all health states worse than mild disability. Consequently, cTDR patients had 3.574 QALYs compared with the ACDF patient with 3.376 QALYs at 5 years.
Therefore, from a societal perspective, which incorporates productivity loss, the ICER for cTDR was −$165103 per QALY at 5 years, indicating that cTDR dominated ACDF. From a health system perspective, which considers direct medical cost only, the ICER for cTDR was $8518 per QALY at 5 years.
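The reported ICERs can be checked against the component figures given in the results. The short calculation below uses only numbers stated in the text; the variable names are illustrative. Because the published QALYs are rounded to 3 decimals, the recomputed ratios agree with the reported values only to within a few dollars.

```python
# Figures as reported in the results (5-year horizon; productivity loss over 3 years).
direct = {
    "cTDR": {"surgery": 20488, "adverse_events": 689,
             "medications": 1736, "office_visits": 546},
    "ACDF": {"surgery": 16945, "adverse_events": 2292,
             "medications": 1944, "office_visits": 591},
}
productivity_loss = {"cTDR": 57447, "ACDF": 91824}
qalys = {"cTDR": 3.574, "ACDF": 3.376}

# Incremental values, cTDR minus ACDF.
d_direct = sum(direct["cTDR"].values()) - sum(direct["ACDF"].values())
d_productivity = productivity_loss["cTDR"] - productivity_loss["ACDF"]
d_qaly = qalys["cTDR"] - qalys["ACDF"]

# Health system perspective: direct medical cost only.
icer_health_system = d_direct / d_qaly
# Societal perspective: direct cost plus productivity loss.
icer_societal = (d_direct + d_productivity) / d_qaly

print("incremental direct cost:", d_direct)
print("incremental productivity loss:", d_productivity)
print("health system ICER:", round(icer_health_system))
print("societal ICER:", round(icer_societal))
```

The societal ICER is negative because the incremental cost is negative while the incremental QALY is positive, which is the sign pattern that defines dominance.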
When target age and time horizon were varied (Table 5), the ICER for cTDR compared with ACDF remained below the US WTP threshold of $50000 per QALY in all scenarios (−$225816 per QALY to $22071 per QALY). Results were similar with multivariate probabilistic sensitivity analysis (Figure 2): cTDR was cost-effective in more than 95% of simulation iterations when the WTP threshold was >$20000 per QALY, regardless of perspective. In the final test of generalizability, extreme thresholds were required for ACDF to be cost-effective compared with cTDR (Table 6). Similarly, only when the cTDR DRG was reimbursed at $26217 or $62637 for the health system and societal perspectives, respectively, did the ICER exceed the WTP threshold.
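A threshold analysis of this kind amounts to finding the parameter value at which the ICER crosses the WTP threshold. The sketch below uses bisection on a hypothetical linear cost model; it is not the study's model and will not reproduce the reported thresholds, which also depend on discounting and on how the parameter enters the full Markov model. All names and figures are illustrative assumptions.

```python
def find_threshold(icer_at, lo, hi, wtp=50000.0, tol=1e-6):
    """Bisection for the parameter value where icer_at(x) equals wtp.

    Assumes icer_at is continuous and increasing on [lo, hi] and that
    the crossing lies inside the bracket.
    """
    if not (icer_at(lo) <= wtp <= icer_at(hi)):
        raise ValueError("WTP threshold is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if icer_at(mid) < wtp:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def toy_icer(price, base_price=20000.0, base_delta_cost=1500.0, delta_qaly=0.2):
    """Hypothetical model: each extra dollar on the index-procedure price
    adds one dollar to the incremental cost; the QALY gain is held fixed."""
    return (base_delta_cost + (price - base_price)) / delta_qaly


# Price at which the hypothetical ICER reaches $50000 per QALY.
print(round(find_threshold(toy_icer, 20000.0, 100000.0), 2))
```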
In this study, we derived the ICER of cTDR compared with ACDF by using 5-year trial data and conducted a robust sensitivity analysis to assess generalizability. Similar to the previous 2-year review, CE was again established based on the commonly accepted WTP threshold of $50000 per QALY. This remains true while varying input parameters in multiple sensitivity analyses, reaffirming the stability of the model and establishing the sustainability of the result. From a societal perspective, cTDR imparts greater quality of life at less cost than ACDF, suggesting that it is the dominant intervention. For the base case analysis, the ICER for cTDR at 5 years (−$165103 per QALY) was far more favorable than the previous 2-year result ($24594 per QALY). This is not surprising because the QALY improvement between cTDR and ACDF increased from 0.087 at 2 years to 0.198 at 5 years. Similarly, cTDR was $2139 more costly than ACDF at 2 years compared with a cost savings of $32690 at 5 years. Again, this highlights how cTDR appears to provide patients with better quality of life at a savings over time. Our results also appear consistent with recent publications. McAnany et al8 analyzed the CE of cTDR vs ACDF for single-level DDD using the ProDisc-C trial data. They found that cTDR also dominates ACDF at 5 years for single-level disease (−$557849 per QALY).
The dramatic difference between the 2- and 5-year results was believed to be secondary to a more comprehensive RTW analysis than was previously conducted. Unlike the 2-year analysis, RTW data were available and not assumed. Furthermore, we calculated both the associated health state-specific and time-specific probabilities to inform the model and work status persistence. When comparing surgical cohorts, RTW rates differed significantly overall. At 6 months, the cTDR and ACDF RTW rates were 47.6% and 36.5%, respectively. At 3 years, the RTW rates were 80.6% compared with 65.4%, respectively. Health status-specific RTW probabilities suggested that patients with less postoperative disability were more likely to return to and sustain work. However, the results cannot be explained by the enhanced RTW analysis alone. This is because the majority (68.1%) of patients who worked before surgery were able to return to work within 6 weeks postoperatively. More importantly, 50% of patients whose condition prevented them from working before surgery were able to return to work within 6 months postoperatively. Furthermore, because the RTW analysis was limited to 3 years, other costs, complications, and/or quality-of-life differences appear to be contributing to sustained CE past this time point. It is unclear how complete 5-year RTW data would affect our results; however, the authors assume the effect to be negligible because the rate of late complications (that could affect productivity) was markedly low and there was no documented change in RTW status between the 2- and 3-year follow-up periods.
This study should be interpreted in the context of several limitations. It was conducted using decision analytical modeling, which carries inherent limitations. By definition, the Markov model is supposed to be conditional on the present state alone; future and past events are assumed independent. With disease processes, it is rarely plausible to assume that a patient's transition to another health state was not in some way dependent on their previous health state. The model also assumed that surgical cohorts began in similar health states. The authors believed that this was acceptable because the initial patient selection was randomized. Despite the stringent criteria used in the RCT,9 however, it is rarely possible to blind patients or surgeons in a surgical trial. Therefore, it is conceivable that patients receiving the novel cTDR intervention may have experienced more subjective improvement compared with the ACDF group. Similarly, surgeons may be biased toward 1 approach and make different intraoperative and postoperative decisions as a result. Although difficult to control, this was addressed by combining multiple outcome measures, such as NDI, VAS, and SF-12, all of which demonstrated reliability and consistency.
We also recognize that some cost data were not ascertainable. For example, because it is problematic to use hospital charge data to conduct a CEA, we used Medicare rates for diagnosis-related groups. As a result, differences in parameters (such as operating room time and length of stay) were not captured. However, it is likely that the marginal increases in operating room time associated with cTDR, and the resultant increased cost, are offset by the shorter length of stay observed in this same group in comparison with ACDF.9 To be able to calculate the medication-related costs, we created estimates from AWP (×0.85) because updated Medicare average sales prices were not publicly available. Although this estimate was considered appropriate, it is impossible to determine whether it overestimated or underestimated costs for both groups. Productivity loss also significantly contributed to cost. Although comprehensive, this analysis did not include aspects such as transportation costs, caregiver time/responsibilities, and educational days missed. Furthermore, in monetary terms, productivity loss was calculated by using the 2013 national average wage. It is unclear how these estimates may bias our conclusion. However, we contend that when the direction of the bias was unclear, at least both groups were treated similarly based on sound and collaborative clinical judgment.
Despite numerous limitations, univariate and multivariate sensitivity analyses were conducted in an attempt to test the strength and generalizability of the model. Despite multiple alterations of input parameters and multivariate analysis of thousands of permutations, the ICER for cTDR compared with ACDF remained below the US WTP threshold of $50000 per QALY. This finding remained true from both health system and societal perspectives. We also reiterate the significance of our threshold analysis. Not only would cTDR need to cost markedly more than its current reimbursement rate, but critical input parameters would have to substantially deviate from their baseline values for a scenario to exist in which ACDF was cost-effective compared with cTDR. For example, the cTDR complication rate would need to increase by >6281% of the base case value. In several instances, such as with ACDF complication rates or altering QALY values for disability health states, despite maximal modification and/or ranges (ie, 0 to infinity), it was not feasible to create a cost-effective scenario for ACDF vs cTDR. The authors contend that this comprehensive sensitivity analysis provides adequate reassurance regarding the generalizability of our model and conclusion.
This study is the first to report the comparative CE of cTDR and ACDF for 2-level DDD at 5 years. It is also the first surgical CUA to include work persistence as part of an extensive RTW analysis for measuring societal impact. The results appear generalizable after an extensive sensitivity analysis demonstrates consistency with similar investigations.8,20 Therefore, the authors conclude that for patients with 2-level DDD, cTDR is not only a highly cost-effective surgical modality compared with ACDF, but, at 5 years, it is the dominant modality. In a rapidly changing medical climate with emerging practice paradigms such as pay for performance and value-based purchasing, surgeons and payers will naturally gravitate toward these analyses. In addition, as the health care system becomes more informed in an established setting of scarce resources and increasing expenses, sustainable surgical technologies that improve quality of life while saving costs require serious attention.
LDR Spine provided a consulting honorarium to analyze their RCT data for the purposes of health care economics and cost-effectiveness analysis and institutional research support for this work. Dr Ament is a consultant for LDR Spine and received funding to support data analysis and preparation of the manuscript. Dr Stone has received institutional funding from LDR for cost data collection for this project and institutional funding for the randomized controlled trial (RCT) and other Mobi-C related research. Dr Nunley received institutional funding from LDR for cost data collection for this manuscript and is a paid consultant for training surgeons in the use of the Mobi-C disc, and is a patent holder for K2M and LDR Spine (specifically for the ROI-A anterior lumbar interbody fusion cage). Dr Kim receives royalties from LDR Spine. The other authors have no personal, financial, or institutional interest in any of the drugs, materials, or devices described in this article.
Supplemental digital content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal's Web site (www.neurosurgery-online.com).
The authors present a cost-utility analysis of 2-level artificial disc replacement (ADR) vs fusion with 5 years of follow-up. Decision analytical modeling was utilized with construction of a Markov model. The data were derived from a previously published US Food and Drug Administration Investigational Device Exemption (US FDA IDE) randomized controlled trial for Mobi-C, which included 330 patients. Data from the randomized controlled trial (RCT) were used for the model input parameters.
Direct medical costs favored anterior cervical discectomy and fusion (ACDF) over ADR by $1687; however, indirect costs favored ADR ($57447) over ACDF ($91824). The 3-year return-to-work (RTW) rates also favored ADR over ACDF (80.6% vs 65.4%). The incremental cost-effectiveness ratio (ICER) favored ADR at −$165103 per quality-adjusted life year (QALY) at 5 years. All scenario sensitivity analyses favored ADR from the societal perspective (direct and indirect/productivity losses). The limitations of the study derive from both the underlying RCT and the need to simplify a complex course of events into a relatively simple model. For example, the underlying RCT was done in the context of a US FDA IDE trial. The strict inclusion criteria may limit generalizability, and it is possible that the patients who participated in the trial and received the ADR noted greater satisfaction/subjective improvement with the novel treatment option than those who were randomly assigned to fusion. The underlying RCT had limited productivity loss data (indirect cost) of only 3 years and limited direct medical cost data. For example, an analysis of medication use and unscheduled office visits was used to capture nonsurgical adverse events and adjacent segment disease.
The strengths of the study are that the data were derived from a randomized controlled trial of 330 patients with 5-year follow-up, the use of both direct/indirect costs, discounting to reflect present value, and sensitivity analysis utilizing various ages and time horizons. With the changing health care landscape, it will be increasingly important for spine surgeons to demonstrate value beyond that of simple statistical significance (ie, P value < .05).
Cheerag D. Upadhyaya
Kansas City, Missouri