In this cohort of patients undergoing complex cancer surgery, we observed no statistically significant association between higher volume and improved outcomes, nor between individual quality measures and improved outcomes. When quality was measured as an overall count of measures missed during hospitalization, worse overall quality was not associated with clinical outcomes but was strongly associated with higher costs and longer length of stay. These findings suggest that quality improvement efforts aimed at improving the reliability of the systems that care for cancer surgery patients may have a substantial impact on costs of care.
A large literature describes the relationship between higher volume and better outcomes in cancer surgery.2–7 This observation has led to endorsement of case volume as a way to identify preferred sites and improve patient outcomes27 — an approach aptly termed ‘follow the crowd.’1 However, regionalization of services poses practical problems,29 and the evidence that volume benchmarks can accurately identify ‘best’ sites has limitations.10,30–33 We did not see a striking association between higher volume and better outcomes. This may be because our sample size was relatively small compared with previous work,2,4 or because previous studies were able to include longer-term outcomes at fixed time periods.34,35 Longer periods of follow-up accrue more events, increasing statistical power to compare events across volume strata; they may also increase sensitivity to the effects of high-quality postsurgical cancer care provided at more specialized high volume centers long after hospitalization. Although we selected cases using previously published diagnosis codes, pooling a number of fairly disparate surgical procedures may also have limited our ability to detect a volume effect by attenuating our ability to discern hospitals’ or surgeons’ procedure-specific experience. Despite these limitations, it is important to note that others have found that the volume-outcome relationship in cancer surgery may be weaker than previously described.8,36 This may be because secular trends in surgical outcomes are disproportionately affecting high-mortality centers,32 or because pressure to contract based on volume has already moved substantial numbers of cases from lower volume centers to higher volume ones. Although we used all available diagnosis code data for risk adjustment in our models and attempted to avoid pitfalls described by others,37 we may have been unable to fully adjust for shifting of higher-complexity patients to higher-volume centers. This possibility is suggested by the observation that higher volume centers were more costly, although it seems equally possible that larger centers simply tend to provide more costly care.
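The comparison of event rates across volume strata described above can be sketched as follows. This is a minimal illustration, not the study's method: the hospital names, case volumes, and death counts are hypothetical, the tertile split is one of several reasonable stratification choices, and a real analysis would also risk-adjust using patient diagnosis codes.

```python
from collections import defaultdict

# Hypothetical per-hospital data: annual case volume and observed deaths.
hospitals = {
    "A": {"cases": 12, "deaths": 1},
    "B": {"cases": 35, "deaths": 2},
    "C": {"cases": 80, "deaths": 3},
    "D": {"cases": 20, "deaths": 2},
    "E": {"cases": 150, "deaths": 4},
    "F": {"cases": 55, "deaths": 2},
}

def volume_strata(hospitals, n_strata=3):
    """Assign hospitals to volume tertiles (low/medium/high) by case count."""
    ranked = sorted(hospitals, key=lambda h: hospitals[h]["cases"])
    size = -(-len(ranked) // n_strata)  # ceiling division
    labels = ["low", "medium", "high"]
    return {h: labels[i // size] for i, h in enumerate(ranked)}

def crude_mortality_by_stratum(hospitals):
    """Pool cases and deaths within each stratum; return crude death rates."""
    strata = volume_strata(hospitals)
    totals = defaultdict(lambda: {"cases": 0, "deaths": 0})
    for h, data in hospitals.items():
        s = strata[h]
        totals[s]["cases"] += data["cases"]
        totals[s]["deaths"] += data["deaths"]
    return {s: t["deaths"] / t["cases"] for s, t in totals.items()}

rates = crude_mortality_by_stratum(hospitals)
for stratum in ("low", "medium", "high"):
    print(f"{stratum}: {rates[stratum]:.3f}")
```

With few hospitals per stratum and few events, differences between strata such as these are easily non-significant, which is the statistical-power point made above.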
Few of our individual process measures were associated with improvements in outcomes or resource use. While this may be because we used measures that parallel but do not entirely replicate chart-abstracted process measures, our data are consistent with previous evidence suggesting that performance on publicly reported quality measures explains only a small portion of differences in patient outcomes.38 In fact, early experience with Surgical Care Improvement Project (SCIP) measures — upon which our measures were based — suggests no relationship between better performance on individual quality measures and improved outcomes in colorectal surgery.11–14 Like the SCIP measures, our quality measures address a few key processes in perioperative care and do not capture other key elements of operative or perioperative care.
In contrast, overall quality is thought to measure a system’s ability to deliver care reliably;7,34,39–41 reliability and consistency form the rationale for the growing use of checklists in clinical care.42–44 In our study, overall quality represents the proportion of patients who did not ‘miss’ an opportunity to receive appropriate care. Our measure was developed (to the greatest extent possible with our data) using information that might represent appropriately withholding a medication; as such, it represents the cumulative impact of multiple appropriate clinical decisions in addition to the reliability of the system of care. This study did not demonstrate the strong impact of overall care quality on mortality observed in our previous work15 (for reasons described earlier), though it is important to note that maximizing overall quality in this study would save approximately $3400 per patient. Applied to the more than 80% of patients in our study who missed at least 1 quality measure, such cost savings would have enormous economic impact.
Our study has a number of limitations. Because we used administrative data, we cannot easily distinguish complications from preexisting disease, and we cannot replicate chart-based SCIP measures exactly. However, we constructed our quality measures to focus on patients with no documented contraindications, and we did not use comorbidities to define outcomes. Our quality measures focus primarily on inpatient medications and cannot distinguish between continuation of home medications and initiation of medications in hospital. This factor may influence the associations seen between beta-blocker use and outcomes, but is less likely to affect antimicrobial or serial compression device use. In addition, our quality measures were collected from electronic billing systems rather than chart abstraction and have not been validated in a scientific study. However, because Premier’s business model focuses on providing accurate benchmarking data to its members, all charge and diagnosis data are regularly audited for accuracy.17 As an observational study, the results are subject to biases related to nonrandom assignment of patients to receive medications or devices, as well as the documentation biases described above. However, secondary analyses adjusting for hospital-level likelihood of receipt of quality measures did not suggest this was a substantial threat (data not presented). Although Premier hospitals are similar to other US centers in terms of size, teaching status, and location, they may differ from non-Premier hospitals. While we constructed our volume measures to be consistent with those employed in previous work, they may not adequately represent expertise accrued if low volume surgeons were performing other complex surgeries. Although we selected our surgery types according to previous studies,2 it is possible that our approach missed some procedures performed less frequently. In addition, our study had a somewhat smaller number of high volume hospitals than other studies, which may have further limited our ability to detect strong volume effects. Our volume measures do not account for cases that surgeons performed outside hospitals participating in Premier, or for cases of similar complexity within target hospitals, and as such may underestimate surgeon volume and experience. Finally, it is likely that some surgeries in our dataset were at least partially performed by fellows or residents. However, whether the surgery was performed at a teaching hospital was not a significant predictor of outcome in our models.
Our study offers a first view of how case volume, care quality, and outcomes of care are linked. Borderline associations between higher volume and improved outcomes in our data may be countered by higher costs at high volume sites. Quality of care as measured in our data has, at best, little association with patient outcomes, but worse quality care was far costlier. Efforts to simultaneously encourage patients to ‘follow the crowd’ and to increase the quality of health care have strong face validity, but may have heterogeneous impacts on the value of health care.