How case volume and quality of care relate to each other and to results of complex cancer surgery is not well understood.
Observational cohort of 14,170 patients 18 or older who underwent pneumonectomy, esophagectomy, pancreatectomy, or pelvic surgery for cancer between 10/1/2003 and 9/30/2005 at a United States hospital participating in a large benchmarking database. Case volumes were estimated within our dataset. Quality was measured by determining whether ideal patients did not receive appropriate perioperative medications (such as antibiotics to prevent surgical site infections), both as individual ‘missed’ measures and as the overall number missed. We used hierarchical models to estimate effects of volume and quality on 30-day readmission, in-hospital mortality, length of stay, and costs.
After adjustment, we noted no consistent associations between higher hospital or surgeon volume and mortality, readmission, length of stay, or costs. Adherence to individual measures was not consistently associated with improvement in readmission, mortality, or other outcomes. For example, continuing antimicrobials past 24 hours was associated with longer length of stay (21.5% higher, 95% CI 19.5% to 23.6%) and higher costs (17% higher, 95% CI 16% to 19%). In contrast, overall adherence, while not associated with differences in mortality or readmission, was consistently associated with longer length of stay (7.4% longer with one missed measure and 16.4% longer with 2 or more) and higher costs (5% higher with one missed measure, and 11% higher with 2 or more).
While hospital and surgeon volume were not associated with outcomes, lower overall adherence to quality measures is associated with higher costs, but not improved outcomes. This finding may provide a rationale for improving care systems by maximizing care consistency, even if outcomes are not affected.
The volume–outcome relationship (the association of better surgical outcomes with sites that perform a procedure more often) has become the focus of payor-driven proposals to regionalize care to high-volume centers1. This relationship has been of particular interest in complex cancer surgery, where evidence suggests that care from a more experienced surgeon and hospital produces better outcomes2–7.
However, little is known about the specific mechanisms that explain variation in outcomes between high and low volume centers or surgeons, and to what extent these can be attributed to differences in quality as measured by adherence to recommended care processes8, 9. If care quality is the primary factor explaining differences in outcomes between high and low volume centers, then patients in need of cancer surgery could expect similar results at high and low volume centers with similar quality measure performance. Conversely, if high volume centers are better regardless of adherence to recommended practices, then travel to a regional referral center would be the wisest course of action10.
When contrasted with the impact of case volume on outcomes, associations between individual quality measures and outcomes have been small4 or absent11–14. Recent data from our group confirm inconsistent associations between individual quality measures and outcomes in coronary artery bypass surgery15. However, increasing overall performance on quality measures may have a powerful impact on mortality15 and is also associated with lower costs16.
We hypothesized that, for patients undergoing complex cancer surgery, advantages seen at high-volume systems would be related to greater adherence to recommended care practices. To explore this hypothesis, we analyzed data collected from adults undergoing cancer surgery (e.g. pelvic exenteration, esophageal resection, pancreatic resection, or pneumonectomy2) in a nationally representative sample of United States hospitals. Using these data, we first examined the relationship between patient outcomes, hospital case volume, physician case volume, and care quality measures. We then examined the degree to which overall quality (an all-or-none measure of system reliability) influenced mortality in relationship to volume measures.
Our data were collected on 14,170 patients cared for by 1,629 physicians at 266 hospitals participating in Perspective (Premier Inc., Charlotte, North Carolina), a database developed for measuring quality and health care utilization, which we have used in previous research5–7.
In addition to standard hospital discharge file data, Perspective contains a date-stamped log of all materials (e.g. serial compression devices used to prevent venous thromboembolism) and medications (e.g. beta-blockers) charged for during hospitalization. Perspective charge data are collected electronically and undergo comprehensive auditing as part of Premier efforts to ensure data validity. Previous research suggests that comorbidity indices collected using Premier data correspond closely to those collected from charts17.
Located in all regions of the United States, Perspective sites are representative of the US hospital population18–20, in that they are predominantly small to mid-size non-teaching facilities and serve a largely urban patient population. Perspective sites also have performance on publicly reported quality measures similar to non-Perspective sites. The institutional review board at UCSF approved our study, and our funder (California HealthCare Foundation) had no role in the development or execution of the study, or preparation of the manuscript.
Patients were initially eligible for our analysis if they were admitted between 10/1/2003 and 9/30/2005 and were 18 years of age or older. Patients in this cohort who underwent complex cancer surgery were then identified using International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) procedure codes and diagnosis codes by replicating methods used by Begg and colleagues2. Specifically, patients had to have a principal diagnosis of cancer and to have undergone one of the following surgeries as their principal procedure during hospitalization: esophageal resection (ICD-9 = 42.40–42.42, 42.51–42.56, 42.58–42.59, 42.61–42.66, 42.68–42.69), pancreatic resection (ICD-9 = 52.51, 52.33, 52.59, 52.6, 52.7), liver resection (ICD-9 = 50.22, 50.3, 50.4), pelvic exenteration (ICD-9 = 57.71, 68.8, 48.4–48.6), or pneumonectomy (ICD-9 = 32.5, 32.3, 32.4).
In addition to patient age, sex, race or ethnicity, insurance information, and principal diagnosis, we classified comorbidities using software provided by the Agency for Healthcare Research and Quality based on methods developed by Elixhauser21. Data regarding in-hospital deaths, discharge status (home vs. other), costs, length of stay, and readmission at the index hospital at 30 days were obtained from the Perspective discharge file. Three-quarters of the hospitals that contribute data to Perspective submit actual costs directly from their hospital cost-accounting system, while in the remaining 25% costs are estimated by the hospitals by applying the Medicare cost-to-charge ratio to hospital charges. Our data also included All Patient Refined Diagnosis Related Group Risk of Mortality scores (APR-DRG), an administrative data-derived risk adjustment methodology used to account for patient severity of illness22–24. Finally, the database contained information about hospital size, teaching status, and location.
Because some hospitals in our cohort did not contribute data for the entire study period, we estimated the annual case volume by dividing each hospital’s or physician’s observed patient count by the total number of months that the hospital or physician contributed patients to the dataset. These “annualized” volumes were then divided into quartiles so that one-quarter of the patient cohort was included in each quartile of volume, as done in previous work4, 25–27.
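The annualization and patient-weighted quartile assignment described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function names are hypothetical, the factor of 12 that converts a per-month rate to a per-year rate is our assumption (the text mentions only division by months contributed), and the handling of providers that straddle a quartile boundary is one of several reasonable choices.

```python
def annualized_volume(case_count, months_contributed):
    """Estimate cases per year from the months a provider contributed data.

    Assumption: the per-month rate is scaled by 12 to give a yearly figure.
    """
    if months_contributed <= 0:
        raise ValueError("months_contributed must be positive")
    return case_count * 12 / months_contributed


def patient_weighted_quartiles(volume_by_provider, cases_by_provider):
    """Assign each provider to a volume quartile (1 = lowest, 4 = highest).

    Providers are ranked by annualized volume, and cut points are placed so
    that roughly one quarter of PATIENTS (not providers) falls in each
    quartile. A provider belongs to the quartile in which its first patient
    falls (an illustrative tie-breaking assumption).
    """
    total_cases = sum(cases_by_provider.values())
    quartiles = {}
    cases_seen = 0
    for provider in sorted(volume_by_provider, key=volume_by_provider.get):
        quartiles[provider] = min(4, 4 * cases_seen // total_cases + 1)
        cases_seen += cases_by_provider[provider]
    return quartiles
```

Because quartiles are weighted by patients, many small providers can share the lowest quartile while a handful of large ones make up the highest, which is why quartile membership counts need not be equal across providers.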
Using charge data, we translated recommendations from national guidelines8 into a series of dichotomous quality measures representing whether a perioperative medication was received during hospitalization. These measures captured whether antimicrobials were used to prevent surgical site infection on the operative day, whether an antimicrobial was continued inappropriately past the first day after surgery, and whether appropriate strategies were used to prevent venous thromboembolism on the operative day.
Because inpatient diagnosis codes cannot reliably distinguish between complications and preexisting conditions, we measured the proportion of ideal candidates for each care process who failed to receive it (a missed quality measure). For example, we considered the opportunity for beta-blocker use ‘missed’ if a patient did not receive the drug and did not have an ICD-9-coded principal or secondary diagnosis of hypotension, heart block, or congestive heart failure recorded in their hospital record. In order to provide a more sensitive measure of system-level ability to provide reliable care4, 25–27, we also counted the total number of quality measures missed during hospitalization.
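As a rough illustration of how such ‘missed’ flags and the overall count could be derived from charge data, consider the Python sketch below. The record layout, field names, and measure labels are hypothetical. Only the beta-blocker exclusions (hypotension, heart block, congestive heart failure) come from the text; the sketch assumes no exclusions for the other measures, which may not match the study's definitions.

```python
from dataclasses import dataclass


@dataclass
class Stay:
    """Hypothetical per-hospitalization summary derived from charge data."""
    beta_blocker: bool                  # beta-blocker charged during the stay
    beta_blocker_contraindicated: bool  # hypotension, heart block, or CHF coded
    antimicrobial_op_day: bool          # antimicrobial charged on operative day
    antimicrobial_after_day1: bool      # antimicrobial continued past postoperative day 1
    vte_prophylaxis_op_day: bool        # VTE prevention charged on operative day


def missed_measures(stay):
    """Return the list of quality measures missed during one hospitalization."""
    missed = []
    # A measure is missed only for ideal candidates, i.e. patients with no
    # documented contraindication.
    if not stay.beta_blocker and not stay.beta_blocker_contraindicated:
        missed.append("beta_blocker")
    if not stay.antimicrobial_op_day:
        missed.append("antimicrobial_op_day")
    if stay.antimicrobial_after_day1:
        missed.append("antimicrobial_continued")
    if not stay.vte_prophylaxis_op_day:
        missed.append("vte_prophylaxis")
    return missed


def missed_category(stay):
    """Collapse the count into the three groups reported: 0, 1, or 2+."""
    n = len(missed_measures(stay))
    return "0" if n == 0 else "1" if n == 1 else "2+"
```

Note that continuing an antimicrobial counts as a miss when the charge is present, whereas the other measures count as a miss when the charge is absent.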
Cost, length of stay, readmission, discharge status, and mortality outcomes were obtained from Perspective discharge abstract data, as described. Length of stay and costs were log-transformed to account for skew and to stabilize variance of residuals in multivariable models. Beta estimates and 95% confidence intervals for log-transformed outcomes were converted to percent differences using the formula 100*(EXP(estimate)-1).
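The back-transformation 100*(EXP(estimate)-1) can be illustrated with a short Python sketch. The function names are hypothetical, and the use of a standard error with a normal critical value of 1.96 to form the interval is our assumption; the text states only that estimates and confidence limits were converted with this formula.

```python
import math


def percent_difference(beta):
    """Convert a coefficient on a log-transformed outcome to a percent difference."""
    return 100 * (math.exp(beta) - 1)


def percent_ci(beta, standard_error, z=1.96):
    """Return (lower, point, upper) percent differences.

    Assumption: a Wald-type interval, beta +/- z * standard_error, is
    back-transformed endpoint by endpoint.
    """
    lower = beta - z * standard_error
    upper = beta + z * standard_error
    return tuple(percent_difference(b) for b in (lower, beta, upper))
```

Under this formula, a coefficient of about 0.071 on log length of stay corresponds to a stay roughly 7.4% longer. Because the exponential is not linear, the back-transformed interval is not symmetric around the point estimate.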
We first described study patients and hospitals using univariable methods. Multivariable alternating logistic models28 (SAS PROC GENMOD) were used to account for clustering of patients within physicians and physicians within hospitals for dichotomous outcomes and to calculate adjusted odds ratios and adjusted estimates. Mixed effect models (SAS PROC MIXED) were used to account for clustering of patients within physicians and within hospitals for continuous variables. Models were constructed using manual variable selection methods. Volume and quality measures were entered manually, while additional covariates (confounding factors) were selected for inclusion if they were associated with the outcome at p<0.05, if including them changed estimates for the primary predictors by more than 10%, or for face validity. All analyses were carried out using SAS version 9.1 (SAS Institute, Inc. Cary, NC).
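The covariate inclusion rule can be written as a simple decision function. The Python sketch below is illustrative only: the analyses themselves were run in SAS, the function and parameter names are hypothetical, and we assume the 10% change is measured relative to the primary predictor's estimate from the model without the candidate covariate.

```python
def keep_covariate(p_value, estimate_without, estimate_with,
                   face_validity=False, alpha=0.05, change_threshold=0.10):
    """Decide whether a candidate covariate stays in the model.

    A covariate is kept if any one of three criteria holds:
      1. it is retained for face validity,
      2. it is associated with the outcome at p < alpha, or
      3. adding it changes the primary predictor's estimate by more than
         change_threshold (relative to the estimate without it; this
         baseline is an assumption).
    """
    if face_validity or p_value < alpha:
        return True
    if estimate_without == 0:
        # Relative change is undefined; treat any movement as a change.
        return estimate_with != 0
    relative_change = abs(estimate_with - estimate_without) / abs(estimate_without)
    return relative_change > change_threshold
```

For instance, a covariate with p = 0.20 would be dropped if it moved the volume estimate from 1.00 to 1.05, but kept if it moved the estimate to 1.20.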
A total of 14,170 patients underwent one of our target surgeries at one of our study sites between 10/1/2003 and 9/30/2005. Mean age of patients was 66.2 years (standard deviation 11.0 years), and 56% were men. Most were white and had Medicare insurance. The most common Elixhauser-defined comorbidities in our cohort were hypertension (50.2%), metastatic cancer (23.8%), and chronic obstructive pulmonary disease (40.2%). Three percent (427 patients) died during the initial hospitalization or a subsequent admission to the same hospital, and 11% were readmitted within 30 days.
The proportion of patients who did not receive our target medications varied. Few did not receive a beta-blocker (15%) or had no antimicrobial charges on the operative day (9%); two-thirds had no venous thromboembolism preventative measures and 62% had antimicrobials continued after the first postoperative day. Few patients (9%) had no missed quality measures, 35% missed one, and 55% missed two or more.
Most hospitals (174 hospitals, 65%) and physicians (913 physicians, 56%) were lowest-volume (i.e., 1st quartile of volume) providers. Hospital volume ranged from 13 (IQR 8, 19) in the lowest quartile to 110 per year (IQR 105, 148) in the highest. Physician volume ranged from 4 patients per year (IQR 3, 5) in the lowest quartile to 29 (IQR 24, 41) in the highest. The mean number of quality measures missed was similar across physician and hospital volume quartiles.
Lower hospital volumes were not associated with higher risk for mortality after adjustment, although odds ratios were all greater than 1 in lower volume sites. Similarly, there were no statistically significant associations between volume measures and readmission, after adjusting for patient factors. In contrast, lower volume sites and surgeons tended to have lower costs, after adjustment. There were inconsistent associations between individual quality measures and mortality, length of stay, costs, or readmission.
In analyses assessing the association between total number of quality measures missed during hospitalization and patient outcomes, there were no statistically significant associations between the number of measures missed and our key clinical outcomes (mortality and readmission). However, both costs and length of stay were significantly increased if 1 or more measures were missed. Importantly, inclusion of overall quality in these models did not reveal any underlying associations between volume and any of our study outcomes.
In this cohort of patients undergoing complex cancer surgery, we observed no statistically significant associations between higher volume and improved outcomes, or between individual quality measures and improved outcomes. When quality was measured as an overall count, worse overall quality (indicated by the number of measures missed during hospitalization) was not associated with clinical outcomes, but was strongly associated with higher costs and length of stay. These findings suggest that quality improvement efforts aimed at improving the reliability of systems that provide care of cancer surgery patients may have substantial impact on costs of care.
A large literature describes the relationship between higher volume and better outcomes in cancer surgery2–7. This observation has led to endorsement of case volume as a way to identify preferred sites and improve patient outcomes27, an approach aptly termed ‘follow the crowd’1. However, regionalization of services poses practical problems29, and the evidence for volume benchmarks’ ability to accurately identify ‘best’ sites has limitations10, 30–33. We did not see a striking association between higher volume and better outcomes. This may be because we had a relatively small sample size compared with previous work2, 4, or because previous studies were able to include longer-term outcomes at fixed time periods34, 35. Longer periods of follow-up accrue more events, increasing statistical power to compare events across volume strata. They may also increase sensitivity to the effects of high-quality postsurgical care provided to cancer patients at more specialized high volume centers, care that takes place long after hospitalization. Although our methods selected cases using diagnosis codes used previously, pooling a number of fairly disparate surgical procedures in our study may have also limited our ability to detect a volume effect by attenuating our ability to discern hospitals’ or surgeons’ procedure-specific experience. With these limitations, it is important to note that others have found that the volume-outcomes relationship in cancer surgery may be weaker than previously described8, 36. This may be because secular trends in surgical outcomes are disproportionately affecting high-mortality centers32, or because the pressure to contract based on volume has already made substantial progress towards moving cases away from lower volume centers and towards higher volume ones.
Although we used all available diagnosis code data for risk adjustment in our models and attempted to avoid pitfalls described by others37, we may have been unable to fully adjust for shifting of higher-complexity patients to higher-volume centers. This possibility is suggested by the observation that higher volume centers were more costly, although it seems equally possible that larger centers also tend to provide more costly care.
Few of our individual process measures were associated with improvements in outcomes or resource use. While this may be because we used measures that parallel but do not entirely replicate chart-abstracted process measures, our data are consistent with previous evidence suggesting that performance on publicly reported quality measures explains only a small portion of differences in patient outcomes38. In fact, early experience with Surgical Care Improvement Project (SCIP) measures — upon which our measures were based — suggests no relationship between better performance on individual quality measures and improved outcomes11–14 in colorectal surgery. Like the SCIP measures, our quality measures address a few key processes in perioperative care and do not capture other key elements of operative or perioperative care.
In contrast, overall quality is thought to be a measure of a system’s ability to deliver care reliably7, 34, 39–41; reliability and consistency form the rationale for the growing use of checklists in clinical care42–44. In our study, overall quality represents the proportion of patients who did not ‘miss’ an opportunity to receive appropriate care. Our measure was developed (to the greatest extent possible with our data) using information that might represent appropriately withholding a medication, and as such our overall quality measure represents the cumulative impact of multiple appropriate clinical decisions, in addition to the reliability of the system of care. This study did not demonstrate the strong impact of overall care quality on mortality observed in our previous work15 (for reasons described earlier), though it is important to note that maximizing overall quality in this study would save approximately $3400 per patient. When applied to the more than 80% of patients in our study who missed at least 1 quality measure, such cost savings would have enormous economic impact.
Our study has a number of limitations. Because we used administrative data, we cannot easily distinguish complications from preexisting disease, and cannot replicate chart-based SCIP measures exactly. However, we constructed our quality measures to focus on patients with no documented contraindications, and we did not use comorbidities to define outcomes. Our quality measures focus primarily on inpatient medications and cannot distinguish between continuation of home medications and initiation of medications in hospital. This factor may be influencing the associations seen between beta-blocker use and outcomes, but is less likely to affect antimicrobial or serial compression device use. In addition, our quality measures were collected from electronic billing systems rather than chart abstraction, and have not been validated in a scientific study. However, because Premier’s business model focuses on provision of accurate benchmarking data to their members, all charge and diagnosis data are regularly audited for accuracy17. As an observational study, the results are subject to biases related to nonrandom assignment of patients to receive medications or devices, as well as documentation biases described. However, secondary analyses including adjustment for hospital-level likelihood of receipt of quality measures did not suggest this was a substantial threat (data not presented). Although Premier hospitals are similar to other US centers in terms of size, teaching status, and location, they may differ from non-Premier hospitals. While we constructed our volume measures to be consistent with those employed in previous work, they may not adequately represent expertise accrued if low volume surgeons were performing other complex surgeries. Although we selected our surgery types according to previous studies2, it is possible that our approach missed some procedures performed less frequently. 
In addition, our study had a somewhat smaller number of high volume hospitals than other studies, a factor that may have further limited our ability to see strong volume effects. Our volume measures do not take into account cases that surgeons performed outside hospitals participating in Premier or cases of similar complexity within target hospitals, and as such may underestimate surgeon volume and experience. Finally, it is likely that some surgeries in our dataset were at least partially performed by fellows or residents. However, whether the surgery was performed at a teaching hospital was not a significant predictor of outcome in our models.
Our study represents a first look at how case volume, care quality, and outcomes of care are linked. Borderline associations between improved outcomes and higher volume in our data may be countered by higher costs at high volume sites. Quality of care as measured in our data has, at best, little association with patient outcomes, but worse quality care was far costlier. Efforts to simultaneously encourage patients to ‘follow the crowd’ and increase the quality of health care have strong face validity, but may have heterogeneous impact on the value of health care.
The study was supported by Grant #05-1755 from the California HealthCare Foundation. Dr Auerbach was also supported by a K08 Patient Safety Research and Training Grant (K08 HS11416-02) from the Agency for Healthcare Research and Quality during the execution of this project.
We would like to acknowledge Erin Hartman, MS, for her expert editorial assistance, as well as Denise Remus, MD and Kathy Belk for their work in assembling the dataset used for this analysis.
Disclosure Information: Nothing to disclose.