We found no systematic effect of resident duty hour reform on potentially preventable safety-related events as measured by the AHRQ PSIs. Although we hypothesized that rates of PSIs related to continuity of care would worsen due to more handoffs and increased reliance on cross-cover arrangements, this did not appear to be the case among either VA or Medicare patients. We had also hypothesized that rates of PSIs related to technical skill would improve due to reduced fatigue among residents. However, there were no differences in the rate of change in this composite between more vs. less teaching-intensive hospitals in the VA. While we did see an increase in post-reform year 1 in the odds of the “Technical Care” composite among Medicare patients in more vs. less teaching-intensive hospitals, this increase was small in magnitude and no longer significant in post-reform year 2.
However, we saw higher rates of events in our “Other” PSI composite in more teaching-intensive hospitals, relative to less teaching-intensive hospitals, in post-reform year 2 in the VA. Because the absolute difference in risk was small and limited to the VA, this finding should be interpreted cautiously and in the context of our previous work, which suggested no systematic changes in mortality.3,6
There may be several explanations for the lack of any systematic change in the rates of PSIs. First, our original conceptual framework linking specific PSIs with broad domains of care may have been incorrect. Second, interventions intended to reduce physician work hours may have had unanticipated negative effects on nursing care, especially within the VA system, perhaps by reducing the availability of physicians for interdisciplinary communication or by imposing more ongoing burdens on nurses. Third, although residents may get more sleep, increased handoffs could have offsetting negative effects. Fourth, the duty hour reform still allowed 30 hours of continuous work, making residents prone to acute sleep deprivation. Finally, compliance may not have been high,31 although the data on this are limited.
Our study is the first national study to examine the association between duty hour reform and patient safety and to compare the degree of change across national samples of Medicare and VA patients. Other studies have found beneficial effects of reduced resident work hours primarily from direct observation of residents or self-report from frontline providers.5,14,15
Our study eliminates some of the methodological limitations found in other studies by comparing findings across federal and non-federal hospitals, including data for three years pre-reform and two years post-reform, utilizing indicators of patient safety developed specifically to capture potential safety-related events, and using a difference-in-differences approach to reduce the likelihood of confounding.
Despite the strengths of this study, there were limitations. We did not have clinical data for risk adjustment, limiting our analyses to administrative data, which lack clinical detail and are subject to variability in coding practices across providers.16,17
However, our difference-in-differences analysis essentially treated each hospital as its own control, factoring out inter-hospital differences in coding that were consistent over time. Nonetheless, a potential limitation of all difference-in-differences studies is unmeasured confounding due to contemporaneous interventions that may have differentially affected teaching or non-teaching hospitals. Another limitation was related to power. Although we used all available data for both the VA and Medicare and aggregated individual PSIs into composite measures because of their low prevalence,16 our confidence intervals were still relatively wide, particularly in the VA.
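The point that time-invariant coding differences cancel out of a difference-in-differences comparison can be illustrated with a small numerical sketch. The counts and the `coding_factor` below are hypothetical and are not data from this study, and the helper names `odds` and `did_ratio` are introduced only for illustration. The sketch uses a simple ratio of odds ratios across four group-by-period cells rather than the regression models used in the analysis.

```python
import math

def odds(events, discharges):
    """Odds of an event given a count of events and total discharges."""
    p = events / discharges
    return p / (1 - p)

def did_ratio(more_pre, more_post, less_pre, less_post):
    """Ratio of odds ratios: the post/pre change in the more
    teaching-intensive group divided by the post/pre change in the
    less teaching-intensive group."""
    return (more_post / more_pre) / (less_post / less_pre)

# Hypothetical (events, discharges) counts, NOT data from the study.
more_pre = odds(300, 100_000)
more_post = odds(330, 100_000)
less_pre = odds(200, 100_000)
less_post = odds(210, 100_000)

baseline = did_ratio(more_pre, more_post, less_pre, less_post)

# Assumed time-invariant coding difference: the more teaching-intensive
# group records 1.5 times the odds in BOTH periods.
coding_factor = 1.5
shifted = did_ratio(more_pre * coding_factor, more_post * coding_factor,
                    less_pre, less_post)

# The constant factor cancels, so the two estimates agree.
print(baseline, shifted, math.isclose(baseline, shifted))
```

A factor applied in only one period, such as a contemporaneous intervention affecting one group of hospitals, would not cancel in this way, which is the residual confounding concern noted above.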
We were also limited in our ability to measure patient safety using administrative data. Although the PSIs are standardized; demonstrate face, content, and predictive validity;16,32,33 and have been applied to numerous data sets,22,34,35 their criterion validity has not yet been established. It is possible that the PSIs are not sensitive enough to detect changes over time. The few published studies examining the criterion validity of the PSIs have been limited by small sample sizes or lack of a true gold standard.34–39
A recent study examining the criterion validity of five of the surgical PSIs in the VA found moderate sensitivities (19% – 56%) and positive predictive values (PPVs) (22% – 74%).40
Postoperative respiratory failure and postoperative wound dehiscence had the highest PPVs (74% and 72%, respectively) of all PSIs examined. Two current studies41,42 are examining the criterion validity of the PSIs; one study recently reported PPVs ranging from 40% for postoperative sepsis to 90% for accidental puncture or laceration.43
Present-on-admission (POA) codes, which were added to Medicare data last year but have not yet been added to VA data, will help improve PPVs in future applications.
These results, along with recent endorsement by the National Quality Forum of four PSIs (accidental puncture or laceration, iatrogenic pneumothorax, foreign body, and postoperative wound dehiscence),44 suggest that some of the PSIs, such as those in our “Technical Care” composite, may be ready to use in examining the effects of policy reforms over time. Poulose et al. (2005) also used the PSIs to evaluate a previous effort to reduce resident work hours, but they found worsening trends in accidental puncture or laceration and postoperative PE/DVT after implementation of work hour limits in New York State.7
Our findings related to the impact of work hour reform nationally are more reassuring.
At present, however, the PSIs are still regarded by both AHRQ and the user community principally as screening tools to flag potential safety-related events rather than as definitive measures.45,46
We also view the PSIs as indicators of potential safety-related events,32,40–43,47,48 although their advantages in using administrative data make them attractive relative to other measures of hospital-safety performance. No easily-obtainable, objective, alternative measures of hospital-safety performance currently exist.49
In conclusion, our study showed that implementation of the ACGME duty hour rules did not have an overall systematic impact on potential safety-related events in more vs. less teaching-intensive hospitals. These findings do not suggest, however, that implementation of duty hour reform was a mistake. Rather, they highlight the importance of obtaining a more comprehensive understanding of what approaches to implementation have worked best and the mechanisms by which outcomes for some programs improved and others worsened. To improve safety, further study is needed to assess which interventions best minimize the negative effects of physician handoffs while maximizing the benefits of reduced fatigue. Gathering data on the contribution of different system-level approaches to duty hours, such as night floats, shift work, mandatory naps, or greater use of hospitalists and physician extenders, will help to inform future resident work hour reform efforts.50
Nonetheless, the question of how to optimally regulate resident duty hours will continue to provoke debate, and this will likely persist until we can demonstrate improvements in outcomes of care rather than maintenance of the status quo.