Rates of Adverse Patient Outcomes and Length of Stay
In the following table, we compare the length of stay and rates of adverse outcomes in the 11-state Medicare sample to those in the 11-state all-patient sample, and length of stay and outcome rates in the national MedPAR sample to those in the 11-state Medicare sample. We also examine the correlation of rates in the 11-state all-patient and 11-state Medicare samples, and examine the frequency with which hospitals report no adverse outcomes. These are compared separately for medical and surgical patients.
Comparison of 11-State All Patient, 11-State Medicare Only, and National Medicare Length of Stay and Rates for 11 Patient Complications
For both medical and surgical patients in the 11-state samples, length of stay is approximately 25 percent higher for Medicare patients compared with that for all patients. The rates for each outcome in the 11-state Medicare sample are also higher than in the 11-state all-patient sample. Medicare rates are 10 to 70 percent higher for medical and surgical patients. All differences are statistically significant.
In the national MedPAR sample, among the medical patients, all outcome rates are slightly lower than in the 11-state Medicare sample. Although the differences are statistically significant for all but three outcomes, the values in the two samples are close, differing on average by 6 percent. Among surgical patients, however, the rates in the national MedPAR sample are substantially higher than in the 11-state Medicare sample, on average 40 percent higher, with the differences statistically significant for all but three outcomes. The higher rates in the national MedPAR sample may be due to the inability to restrict the analysis to patients who underwent surgery on the first or second day of hospitalization.
Correlation of All-Patient and Medicare Rates
The correlation between hospital rates for the 11-state all-patient and the 11-state Medicare samples is quite high in the medical pool, consistently above .8, but lower in the major surgery pool, ranging from .6 to .9. These results suggest that Medicare-patient experience may be a better proxy for all-patient experience among medical patients than among surgical patients.
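As a minimal sketch of the hospital-level comparison described above, the following computes a Pearson correlation between two sets of per-hospital rates. The rates are hypothetical values chosen for illustration, not data from the study, and the text does not state which correlation coefficient was used, so Pearson is an assumption here.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-hospital rates (events per 100 patients at risk).
# Each position is the same hospital in both lists.
all_patient_rate = [2.1, 3.4, 1.8, 4.0, 2.9, 3.1]
medicare_rate = [2.8, 4.1, 2.0, 5.2, 3.3, 4.0]

r = pearson(all_patient_rate, medicare_rate)
print(f"hospital-level correlation: {r:.2f}")
```

In practice the two lists would hold one entry per hospital for a single outcome, and the calculation would be repeated for each outcome and each patient pool.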
Frequency with Which Hospitals Report No Adverse Outcomes
Given the low rates for some of the outcomes, it is possible that some smaller hospitals will have no patients with adverse outcomes. Zero rates are difficult to interpret in descriptive analysis, since they may result from high quality care, a small cohort of patients at risk, or poor coding. The count models used in the regressions of these outcomes on nurse staffing appropriately take into account zero rates due to the first two causes, but the power to detect statistically significant effects is diminished as the proportion of hospitals with rates of zero increases. Among surgical patients, the proportion of hospitals with zero rates can be quite high; for 4 of the 10 outcomes, more than 20 percent of hospitals in some samples have no events. For both medical and surgical patients, the proportion of hospitals with zero rates is always higher among Medicare patients, a smaller pool, than among all patients.
Comparing rates in the two Medicare samples, for medical patients, for all outcomes except pneumonia, the proportion of hospitals with no events in the national MedPAR sample is always higher than in the 11-state Medicare sample. Among surgical patients, by contrast, except for metabolic derangement, the proportion of hospitals with no events in the national MedPAR sample is always lower than in the 11-state Medicare sample. This is consistent with the higher rates of complications in the national MedPAR sample for surgical patients.
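To illustrate why a smaller pool of patients at risk produces more hospitals with zero events, the sketch below computes the probability of observing no events when the count is Poisson distributed. The Poisson form, the event rate, and the pool sizes are assumptions made for illustration; the text refers only to "count models" and does not give these values.

```python
import math

def prob_zero_events(patients_at_risk, event_rate):
    """P(no events) when the count is Poisson with mean patients_at_risk * event_rate."""
    if patients_at_risk < 0 or event_rate < 0:
        raise ValueError("inputs must be non-negative")
    return math.exp(-patients_at_risk * event_rate)

# Hypothetical underlying rate: 1 event per 100 patients at risk.
event_rate = 0.01

# Hypothetical pool sizes, from a small Medicare surgical pool to a large all-patient pool.
for pool in (50, 200, 800):
    p0 = prob_zero_events(pool, event_rate)
    print(f"{pool:4d} patients at risk -> P(zero events) = {p0:.3f}")
```

With the same underlying rate, the probability of a zero count falls quickly as the pool grows, which is the mechanism behind the higher share of zero-rate hospitals in the smaller Medicare pools.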
Relationship between Patient Outcomes and Nurse Staffing
Means and standard deviations for the nursing variables used in the regression analysis are presented in the accompanying table. Average licensed nursing hours and the mean proportion of hours provided by RNs were greater in the 11-state sample than in the MedPAR sample.
Mean and Standard Deviation of Hospital Inpatient Nurse Staffing Measures, 1997
To assess whether using outcomes for Medicare patients leads to the same conclusions as those based on all patients, we regressed counts of outcomes for each of these samples of patients on nurse staffing and other hospital variables. We examined whether the results were comparable by assessing whether regression results were in the same direction and statistical significance, and whether the magnitudes of the estimated effects were statistically equivalent in regressions where a statistically significant association was found. We first compared results in the 11-state all-patient sample to those in the 11-state Medicare sample, and then compared results in the national MedPAR sample to those in these two 11-state samples. We examined medical and surgical patients separately, and we present results only for the two measures of nurse staffing: the proportion of licensed hours from RNs and licensed hours per day. Full regression results are in Appendix 1 (online version, available at http://www.blackwellpublishing.com/products/journals/suppmat/HESR/HESR02025/HESR02025sm.htm).
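The comparison described above can be sketched as two checks on a pair of estimates for the same staffing variable: whether they agree in direction and significance, and whether their magnitudes differ. This is a minimal sketch under stated assumptions, not the authors' procedure. The `Estimate` class, the .05 threshold, and the example coefficients are hypothetical, and the z-test treats the two estimates as independent, which does not strictly hold when one sample is a subset of the other. Failing to reject equality is also weaker than a formal equivalence test.

```python
import math
from dataclasses import dataclass

@dataclass
class Estimate:
    coef: float  # log(IRR) from a count model
    se: float    # standard error of coef

    def p_value(self):
        """Two-sided p-value from a normal approximation."""
        z = abs(self.coef / self.se)
        return math.erfc(z / math.sqrt(2))

def concordance(a, b, alpha=0.05):
    """Classify agreement between two estimates by significance and direction."""
    sig_a = a.p_value() < alpha
    sig_b = b.p_value() < alpha
    if sig_a and sig_b:
        same_direction = (a.coef > 0) == (b.coef > 0)
        return "agree: association" if same_direction else "disagree: opposite direction"
    if not sig_a and not sig_b:
        return "agree: no association"
    return "disagree: significant in one sample only"

def difference_p_value(a, b):
    """Two-sided p-value for H0: equal coefficients (independence assumed)."""
    z = (a.coef - b.coef) / math.sqrt(a.se ** 2 + b.se ** 2)
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical estimates for one outcome and one staffing variable.
all_patient = Estimate(coef=math.log(0.90), se=0.040)
medicare = Estimate(coef=math.log(0.92), se=0.045)

print(concordance(all_patient, medicare))
print(f"p-value for difference in magnitude: {difference_p_value(all_patient, medicare):.2f}")
```

The example is built so that one estimate falls just short of significance while the two magnitudes are statistically indistinguishable, a pattern that can arise when a smaller sample yields a wider standard error around a similar effect.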
Among the eight measures for medical patients, there is complete agreement in the results in the 11-state all-patient and 11-state Medicare samples for four measures for which an association with nurse staffing is observed: length of stay, urinary tract infection, pneumonia, and shock/cardiac arrest. There is also complete agreement in the analyses for two measures in which no effect is observed: pressure ulcers and sepsis.
Regression of Length of Stay and Patient Complications on Nurse Staffing Variables, Medical Patients in 11-State All-Patient, 11-State Medicare Only, and National MedPAR Samples
There is disagreement for two measures. With respect to failure to rescue, in the 11-state all-patient sample, there is an association with the proportion of hours provided by RNs. The incidence risk ratio (IRR) on RN proportion for the 11-state Medicare sample is similar in magnitude to that for the all-patient sample, and the p-value for the IRR is .056, so the result falls just short of significance at the .05 level. For UGI bleeding, we find a significant association in the 11-state all-patient sample for RN proportion, but not for licensed hours per day, while in the 11-state Medicare sample, there is no significant association with RN proportion but there is for licensed hours. However, the IRRs for RN proportion and licensed hours are similar in magnitude across these two models, the p-value for RN proportion in the 11-state Medicare sample is .052, and the p-value for licensed hours in the 11-state all-patient sample is .075. Thus, across all eight measures, the two models generate results that are similar even though not totally concordant.
Similarly, there is a high degree of concordance between the results of the national MedPAR analysis and the 11-state Medicare sample for medical patients. For five outcomes (pneumonia, shock/cardiac arrest, pressure ulcer, sepsis, and failure to rescue) the results agree completely. For two outcomes for which an association with RN proportion is found in the 11-state Medicare sample (length of stay and urinary tract infections), a statistically significant association is also observed in the national MedPAR sample, although the magnitude of the IRR is significantly closer to one in the national MedPAR sample. For another measure, UGI bleeding, for which the results differed somewhat between the 11-state all-patient and 11-state Medicare samples, there is no observed association with nurse staffing in the national MedPAR sample.
Results show complete agreement for surgical patients in the 11-state all-patient and 11-state Medicare samples for three measures in which an association with nurse staffing is observed (pneumonia, failure to rescue, and metabolic derangement) and three measures in which no effect is observed (length of stay, sepsis, and wound infection).
Regression of Length of Stay and Patient Complications on Nurse Staffing Variables, Surgical Patients in 11-State All-Patient, 11-State Medicare Only, and National MedPAR Samples
There is disagreement in four measures. For one, urinary tract infections, an association is observed with RN proportion in the 11-state all-patient sample, but not the 11-state Medicare sample. The magnitudes of the IRRs are close, however, and the p-value on RN proportion in the 11-state Medicare sample is .062. For two measures, pressure ulcer and UGI bleeding, we observe an association of licensed hours or RN proportion in the 11-state Medicare sample but not the 11-state all-patient sample. Here, too, the magnitudes of the IRRs across the models are similar and the p-values on the corresponding IRRs in the all-patient sample are below .10 (pressure ulcer: p=.081; UGI bleeding: p=.077). For shock/cardiac arrest, however, the results differ substantially. An association is observed with RN proportion in the 11-state Medicare sample but not in the 11-state all-patient sample. The IRRs, while not statistically different, are much further apart than for other outcomes, and the p-value in the 11-state all-patient sample is greater than .10. While there is a high degree of concordance in results for surgical patients between the 11-state all-patient and 11-state Medicare samples, it is lower than that for medical patients.
There is substantial discordance in the results between the 11-state Medicare sample and the national MedPAR sample: only two measures, pressure ulcer and shock/cardiac arrest, are consistent. These are among the measures for which no statistically significant association was observed in the 11-state all-patient sample, suggesting that these may be more sensitive measures for Medicare surgical patients than for patients in general.
For four measures (length of stay, UTI, sepsis, and wound infections), an association of at least one nursing variable and the outcome is found in the national MedPAR sample but not the 11-state Medicare sample. While the coefficients in the length of stay analysis and the IRRs for the other three measures are not statistically different, only for sepsis are they of the same magnitude and in the predicted direction. For length of stay, the coefficient on licensed hours is three times larger in the national MedPAR sample and the p-value for this variable in the 11-state Medicare sample is .73. For urinary tract infection, the IRR on licensed hours, significant with a value of .991 in the national MedPAR sample, is over one in the 11-state Medicare sample. For wound infection, the statistically significant association of RN proportion in the national MedPAR sample is not in the predicted direction.
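To make the size of an IRR such as the .991 reported above concrete, the sketch below converts an IRR into the implied percent change in the expected count. The helper `percent_change` is a hypothetical name introduced for illustration. The calculation assumes the usual multiplicative reading of an IRR from a count model, in which each one-unit increase in the predictor multiplies the expected count by the IRR; the unit of the staffing variable is whatever the regression used.

```python
def percent_change(irr, units=1.0):
    """Percent change in the expected count for `units` more of the predictor."""
    if irr <= 0:
        raise ValueError("an IRR must be positive")
    return (irr ** units - 1.0) * 100.0

# IRR on licensed hours for urinary tract infection, national MedPAR sample (from the text).
irr_licensed_hours = 0.991

one_unit = percent_change(irr_licensed_hours)
two_units = percent_change(irr_licensed_hours, units=2)
print(f"one-unit increase: {one_unit:.2f}% change in expected count")
print(f"two-unit increase: {two_units:.2f}% change in expected count")
```

An IRR above one, as found for the same variable in the 11-state Medicare sample, would give a positive percent change, which is the opposite of the predicted direction.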
For one outcome—metabolic derangement—we observe a statistically significant association of RN proportion in the 11-state Medicare sample but not the national MedPAR sample. The IRR in the national MedPAR sample, while not statistically significant, is over one, that is, not in the expected direction.
For the remaining three outcomes (pneumonia, UGI bleeding, and failure to rescue), we find statistically significant associations with one of the two nurse staffing variables in both the 11-state Medicare sample and the national MedPAR sample, but the variable that is statistically significant differs across the two models. Only in the case of UGI bleeding are the IRRs of comparable magnitude, statistically equal, and in the predicted direction for the two samples.
As noted above, there are substantial differences between the data for outcomes and nurse staffing used in the national MedPAR and 11-state Medicare samples. To determine whether differences in the datasets produced different results across the Medicare samples, we reran the regression analysis in the 11-state Medicare sample so that it more closely matched the national MedPAR analysis. Specifically, we dropped the day-of-surgery restriction originally imposed in the 11-state analysis, reestimated the counts of expected cases based on the less restricted definition, and, using the AHA staffing data used in the national MedPAR analysis, reran the count model regressions. Results from the 11-state Medicare sample analyzed using national staffing data and less restrictively coded complications do not match those from the national MedPAR sample, and we conclude that the differences in staffing and outcome definitions in the two Medicare samples do not explain the differences observed (results not shown).
Assessing quality over time and across a large number of health care institutions is important to achieving improved quality in U.S. hospitals, and administrative datasets are important in this effort. The overall question motivating this study was whether measures of hospital quality can be constructed from data on Medicare beneficiaries alone, or whether data on all patients are required when examining correlates of quality using administrative data. We addressed this question in two ways. We examined the correlation between rates of adverse outcomes for the same hospitals for their Medicare patients and all patients at the hospital and found that correlations were high, although lower for major surgery patients. The lower correlation is likely due to the smaller pools of Medicare surgical patients and the larger number of hospitals with no cases of the adverse outcomes in the Medicare pool.
We also applied an operational test of whether comparable conclusions would be drawn from regression analysis involving Medicare-only samples and all-patient data. Comparing regressions of outcomes on measures of hospital nurse staffing, we found that results in an 11-state all-patient sample, an 11-state Medicare sample, and a national MedPAR sample were generally consistent for medical patients, but less consistent for surgical patients. Moreover, among surgical patients, results in the 11-state Medicare and national MedPAR analyses agreed for only two of the ten outcomes studied. Recoding the outcomes in the 11-state Medicare sample and using the same staffing data to make the analysis in the two Medicare samples more comparable did not resolve this conflict.
Overall, we conclude that outcome measures applied to medical patients in Medicare-only datasets are likely to yield results comparable to those that would be observed in analyses using all-patient data. Thus, using national Medicare data from medical patients in studies of hospital quality is justified.
We would urge caution, however, in using quality measures in surgical patients in Medicare-only data; these measures may not provide results comparable to those from all-patient samples or across different samples of Medicare patients. The reasons for the differences across Medicare samples are not clear. The inability to implement day-of-procedure restrictions from public use data does not explain the differences. A more likely explanation is that the smaller size of the surgical pool of patients, their lower risk for many complications, and the higher proportion of hospitals with no reported complications among surgical patients make it harder to obtain consistent results in regression-based studies of surgical patients using administrative data. The three-sample approach to cross-validating measures presented here is one way to test the usefulness of Medicare-only analysis in these patients.
This paper assesses the ability of Medicare data to substitute for all-patient data in studies of correlates of quality using regression-based techniques. A second potential use of Medicare data as a substitute for all-patient data is in studies that assess quality in specific hospitals. The high correlation of the all-patient and Medicare measures presented above suggests that Medicare data might be usable for studies of hospital-specific quality. To fully assess this potential, additional analysis is required. This would include: examining the degree of agreement in ranking hospitals by rates of complications when using each dataset; determining whether observed disagreements are associated with specific hospital characteristics, especially the relative and absolute size of the hospital's Medicare patient population; and assessing how stable rates are for quality measures for hospitals with small numbers of patients.
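One way to examine the agreement in hospital rankings mentioned above is a Spearman rank correlation between the rates computed from each dataset. The sketch below is illustrative only: the rates are hypothetical, and the choice of Spearman's coefficient is an assumption, since the text does not name a specific agreement measure.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined when one series is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical complication rates for the same six hospitals in each dataset.
# The third hospital reports no events among its Medicare patients.
all_patient_rate = [2.1, 3.4, 1.8, 4.0, 2.9, 3.1]
medicare_rate = [2.8, 3.9, 0.0, 5.2, 3.3, 4.0]

rho = spearman(all_patient_rate, medicare_rate)
print(f"rank agreement between datasets: {rho:.3f}")
```

Hospitals with zero events all tie at the lowest rank, so a large share of zero-rate hospitals limits how finely a ranking can discriminate among them.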
In conducting the comparisons reported here, we encountered many challenges that arose principally from weaknesses of currently available data, particularly the well-known problems associated with using discharge data to construct quality measures (Geraci 2000; Geraci et al. 1997; Lawthers et al. 2000; Weingart et al. 2000). Because there is no reliable coding of “present on admission” status for secondary diagnoses reported on discharge abstracts, constructing coding and exclusion rules for each adverse outcome requires considerable clinical judgment and technical skill. Complications and adverse outcomes are likely to be underreported, and underreporting may be higher where staffing is low.
Despite these difficulties, we believe that administrative datasets offer a valuable tool for understanding factors influencing quality across hospitals. While more states are making all-patient discharge datasets available, these are not universally available. Moreover, creating consistent data across many states can be both time-consuming and expensive. Consequently, Medicare MedPAR data will remain a major data source for analyzing hospital quality. The CMS should take steps to improve the usefulness of these data, including adding day-of-procedure codes to public use datasets. The CMS, the Agency for Healthcare Research and Quality through its Healthcare Cost and Utilization Project (HCUP), and individual states should take additional actions to improve the usefulness of their discharge data for studying quality. They should require consistent and accurate coding of present-on-admission status for secondary diagnoses and identify a set of “must code” secondary diagnoses that are hospital acquired and related to quality. With these changes to discharge abstracts, the ability to monitor quality of care, whether using all-patient or Medicare data, will be enhanced considerably.