When using outcome measures to profile hospital quality, it is important to ensure that differences are fully adjusted for patient risk (9). The National Surgical Quality Improvement Program (NSQIP) collects a comprehensive set of detailed clinical data for this purpose (7). However, the present study provides empiric data demonstrating that a much smaller set of variables can be used without compromising risk-adjustment. Hospital O/E ratios based on a limited variable set were almost identical to the O/E ratios created using the comprehensive set of variables. This finding held true for both outcomes (morbidity and mortality) and was consistent across all 5 general surgery procedures included in the new procedure-specific NSQIP.
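To make the comparison concrete, the following is a minimal sketch of how a hospital O/E ratio can be computed under two risk models: observed events divided by the sum of each patient's predicted risk. The function name `oe_ratios`, the hospital labels, and all patient-level numbers are hypothetical and chosen only for illustration; they are not NSQIP data and do not reproduce the models used in this study.

```python
from collections import defaultdict

def oe_ratios(records):
    """Return {hospital: observed / expected} from (hospital, died, predicted_risk) tuples."""
    observed = defaultdict(int)
    expected = defaultdict(float)
    for hospital, died, risk in records:
        observed[hospital] += died   # count of observed events
        expected[hospital] += risk   # sum of model-predicted probabilities
    return {h: observed[h] / expected[h] for h in observed}

# Hypothetical patients: (hospital, died, risk from full model, risk from limited model)
patients = [
    ("A", 0, 0.02, 0.03),
    ("A", 1, 0.40, 0.38),
    ("A", 0, 0.08, 0.08),
    ("B", 0, 0.05, 0.06),
    ("B", 0, 0.10, 0.09),
    ("B", 1, 0.25, 0.26),
]

full_oe = oe_ratios((h, d, full) for h, d, full, _ in patients)
limited_oe = oe_ratios((h, d, lim) for h, d, _, lim in patients)

for hospital in sorted(full_oe):
    print(hospital, round(full_oe[hospital], 2), round(limited_oe[hospital], 2))
```

In this toy example the two models assign slightly different risks to individual patients, yet the hospital-level O/E ratios and their ordering are nearly unchanged, which is the pattern described above.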
Although this study is the first to describe the adequacy of a limited model for general surgery, there are previous studies in other surgical populations. In a large cohort of patients having coronary artery bypass surgery in Ontario, Tu and colleagues compared a 6-variable risk-adjustment model to a more comprehensive 12-variable model (4). With the additional 6 variables, the C-index increased minimally (0.77 to 0.79) and there were no clinically significant changes in hospital risk-adjusted mortality (4). The authors concluded that risk-adjustment models could be simplified to make data collection more efficient. Our findings are consistent with this work, and extend the findings to general surgery.
There are two possible reasons why the limited and full models provide equivalent risk-adjustment. First, there may be a finite number of important risk domains that are adequately captured in the limited variable set. The additional variables in the “full” model may be redundant, adding little to the predictive ability of the model. Such redundant variables may represent an important risk domain, but the domain may be captured “better” by one of the variables already in the limited set. The most important variables in all models include functional status and ASA score. These variables have strong face validity because they represent important domains of risk: frailty (functional status) and severity of comorbid disease (ASA score). These two variables may also be the strongest predictors of outcome because they are multidimensional constructs that each represent many risk domains. For example, impaired functional status may simply be the downstream consequence of having severe cardiac and pulmonary disease.
Another potential reason the limited and full risk models are equivalent is that some variables may not vary across hospitals. To confound hospital quality comparisons, a variable must satisfy two criteria. First, the variable must be associated with patient risk. A variable that does not relate to the outcome of interest cannot confound hospital comparisons. Second, the variable must vary across hospitals. A variable that is present in the same fraction of patients at every hospital, even if strongly associated with patient risk, cannot act as a confounding variable. For example, consider a variable that is strongly associated with mortality, such as preoperative sepsis. If one hospital has 1% of patients with preoperative sepsis and another has 20%, this variable would need to be included in risk-adjustment models. However, if all hospitals have 1% of patients with preoperative sepsis, this variable cannot confound comparisons and does not need to be included in risk-adjustment models. There are some empiric data supporting the idea that patients undergoing the same surgical procedure are relatively homogeneous. For example, we have previously shown that patient severity, as measured by the expected mortality rate, varies very little across hospitals performing cardiac surgery in both Pennsylvania and New York (3).
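The sepsis example can be worked through numerically. The sketch below assumes two hospitals of equal volume that deliver identical quality of care, so that any difference in mortality comes from case mix alone. The mortality risks with and without preoperative sepsis (30% and 2%) and the function names are hypothetical values chosen for illustration, not estimates from this study.

```python
def mortality_rate(sepsis_prev, risk_sepsis=0.30, risk_no_sepsis=0.02):
    """Mortality at a hospital whose outcomes depend only on its sepsis prevalence."""
    return sepsis_prev * risk_sepsis + (1 - sepsis_prev) * risk_no_sepsis

def unadjusted_oe(prevalences):
    """O/E ratios when the model omits sepsis, so every hospital gets the pooled expected rate.

    Assumes equal volume at each hospital (a simplification).
    """
    rates = [mortality_rate(p) for p in prevalences]
    pooled = sum(rates) / len(rates)
    return [rate / pooled for rate in rates]

# Sepsis prevalence differs across hospitals (1% vs 20%): omitting it distorts O/E.
varying = unadjusted_oe([0.01, 0.20])

# Sepsis prevalence is the same everywhere (1%): omitting it changes nothing.
constant = unadjusted_oe([0.01, 0.01])

print([round(x, 2) for x in varying])
print([round(x, 2) for x in constant])
```

Under these assumptions, omitting sepsis makes the hospital with 20% prevalence appear to perform far worse than expected even though quality is identical, whereas with equal prevalence both hospitals have an O/E ratio of 1 whether or not sepsis is in the model.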
The present study does have certain limitations. Although the NSQIP is rich in clinical detail, the present iteration does not collect some procedure-specific variables that could be important for optimal risk-adjustment (e.g., diverticulitis vs. colon cancer for colon resection). In the next iteration of NSQIP, there will be procedure-specific variables for each of the general surgery operations. These were selected through collaboration with surgeon experts to identify the most important variables for each procedure. The addition of these variables (approximately 3–5 per procedure) will likely dramatically improve the performance of risk-adjustment models. A second limitation of this study is a potential lack of generalizability. At present, the NSQIP disproportionately represents larger, teaching hospitals. As it expands to other hospitals, the variables needed in the “core” set may not be the same. Both of these limitations highlight the necessity of ongoing research to ensure the accuracy of more limited models over time.
This study highlights the need to consider the potential trade-off between the accuracy of risk-adjustment and the efficiency of data collection. There is no doubt that collecting all possible patient characteristics would provide the most accurate data for risk-adjustment. Unfortunately, extracting these variables from the medical record is time-consuming and expensive. Rather than treating this as a trade-off, we sought to develop a strategy that trims the waste (i.e., eliminates variables that do not add predictive value) without compromising hospital quality comparisons. This study demonstrates that most, if not all, predictive power is captured by the most important variables. Thus, we can improve the efficiency of data collection without making significant trade-offs in the accuracy of risk-adjustment.
This study has important implications for the ACS NSQIP and other quality measurement platforms. Our findings imply that collection of risk factors can be dramatically reduced. Limiting collection of patient risk factors will decrease the work of those who collect data and reduce costs to hospitals. Since data collection is the single most expensive item for hospitals participating in NSQIP, this reduction in costs would make NSQIP more affordable. This reduction also creates an opportunity to make NSQIP more useful. With fewer risk factors to collect, hospitals can expand data collection to include other data elements that would help them improve quality. For example, the next iteration of NSQIP will also collect processes of care and outcomes specific to each operation. These will likely be more informative for quality improvement than the summary O/E ratios currently reported. Making NSQIP more affordable and more useful will help the program disseminate to a broader group of hospitals and help achieve the program’s goal of becoming the default quality improvement program for all United States hospitals.