The objective of this paper was to present a comprehensive approach to help health care organizations reliably deliver effective interventions.
Reliability in healthcare translates into using valid rate-based measures. Yet high reliability organizations have proven that the context in which care is delivered, called organizational culture, also has important influences on patient safety.
Our model to improve reliability, which also includes interventions to improve culture, focuses on valid rate-based measures. This model includes (1) identifying evidence-based interventions that improve the outcome, (2) selecting interventions with the most impact on outcomes and converting to behaviors, (3) developing measures to evaluate reliability, (4) measuring baseline performance, and (5) ensuring patients receive the evidence-based interventions. The comprehensive unit-based safety program (CUSP) is used to improve culture and guide organizations in learning from mistakes that are important, but cannot be measured as rates.
We present how this model was used in over 100 intensive care units in Michigan to improve culture and eliminate catheter-related blood stream infections; both were accomplished. Our model differs from existing models in three ways: it incorporates efforts to improve culture, a vital component for system redesign; it targets three important groups (senior leaders, team leaders, and front-line staff); and it facilitates change management (engage, educate, execute, and evaluate) for planned interventions.
In 1999 and 2001, landmark reports from the Institute of Medicine (IOM) made deficiencies in quality of care and patient safety inescapably visible to health care professionals and the public (Institute of Medicine 1999, 2001). What have we accomplished since these reports? Are we safer, and if so, how do we know? Many say we lack empiric evidence to demonstrate improved safety (Wachter 2004; Brennan et al. 2005; Leape and Berwick 2005), with few measures to broadly evaluate our progress with improvements. Current publicly reported performance measures are likely insufficient for providers to evaluate safety. In many hospitals, these performance measures apply to <10 percent of a hospital's discharges (Jha et al. 2005). We need scientifically sound and feasible measures of patient safety.
In light of these challenges, health care has turned to “high-reliability organizations” (HROs) (e.g., aviation), which have achieved a high degree of safety or reliability despite operating in hazardous conditions (Weick and Sutcliffe 2001). Exactly what does reliability mean in health care and how do we know if we are reliable? These answers remain elusive.
Reliability is often presented as a defect rate in units of 10 and generally represents the number of defects per opportunity for that defect. In health care, an opportunity for a defect usually translates to a population of patients at risk for the medical error or adverse event. For example, within a health care institution, failure to use evidence-based interventions may occur in five of 10 patients, or a catheter-related blood stream infection (CRBSI) in four of 1,000 catheter days (McGlynn et al. 2003; CDC 2004). A fundamental principle in measuring reliability is focusing on defects that can be validly measured as rates, which is not possible for most patient safety defects. Rates need a clearly defined numerator (defect) and denominator (population at risk) and must be devoid of reporting biases (see framework below).
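The two examples above can be written as a short calculation. The following is a minimal sketch, not part of the original framework; the function name `defect_rate` and its `per` parameter are hypothetical and chosen only for illustration.

```python
def defect_rate(defects, opportunities, per=1):
    """Return the number of defects per `per` opportunities.

    `defects` is the numerator (events) and `opportunities` is the
    denominator (population or exposure at risk). Both must be clearly
    defined for the result to be interpretable as a rate.
    """
    if defects < 0:
        raise ValueError("numerator (defects) cannot be negative")
    if opportunities <= 0:
        raise ValueError("denominator (population at risk) must be positive")
    return defects * per / opportunities


# Failure to use evidence-based interventions in five of 10 patients.
missed_care = defect_rate(5, 10)

# CRBSI expressed per 1,000 catheter days: four infections in 1,000 days.
crbsi = defect_rate(4, 1000, per=1000)

print(f"missed evidence-based care: {missed_care:.0%} of patients")
print(f"CRBSI: {crbsi:.1f} per 1,000 catheter days")
```

The sketch refuses to compute a rate without a positive denominator, which mirrors the requirement that a valid rate have a clearly defined population at risk.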
In addition to valid measures, HRO and health care safety experts recognize that the context in which work occurs, called “organizational culture,” has important influences on patient safety (Donald and Canter 1994; Hofmann and Stetzer 1996; Zohar 2000; Barling, Loughlin, and Kelloway 2002; Hofmann, Morgeson, and Gerras 2003; Sexton, Thomas, and Pronovost 2005). For example, the ability of staff to raise concerns or senior leaders to listen and act on those concerns can influence safety. In health care, communication failures are a leading contributing factor in all types of sentinel events reported to the Joint Commission on the Accreditation of Health Care Organizations (http://www.jcaho.org), with poor communication often occurring between the caregivers who interact most often: physicians and nurses (Sexton, Helmreich, and Thomas 2000). Valid measures of safety climate constructs can be made by systematically eliciting frontline caregivers' perceptions of the organization's commitment to safety (i.e., “safety climate”) using questionnaires (Sexton et al. 2004). Early evidence demonstrates that safety climate is responsive to interventions (Pronovost 2005). As such, strategies to improve reliability must occur in a culture that is conducive to change.
A clear framework to measure safety within a health care organization is lacking. Federal agencies, organizations, and some institutions have developed “score cards” or performance measurement reports. However, some have scores of measures, and most measures either lack validity (e.g., overall hospital mortality) or target specific patient populations (e.g., congestive heart failure), preventing generalizability of results to the entire organization (Thomas and Hofer 1999; Hayward and Hofer 2004; Lilford et al. 2004). A comprehensive approach to evaluate an organization's progress with patient safety efforts has not been clearly articulated.
In this paper, we describe a comprehensive approach for health care organizations to measure patient safety and then present an example of how this approach was applied to eliminate CRBSIs and improve safety culture in intensive care units (ICUs) in the state of Michigan (Pronovost and Goeschel 2005). Through this improvement example, we hope to highlight the importance of balancing the use of scientifically sound and feasible measures of patient safety with wisdom from front-line staff, noting that both are necessary and equally important.
Donabedian's model for measuring quality can also serve as a framework for measuring safety. In this model, structure (how care is organized) plus process (what we do) influences patient outcomes (the results achieved) (Donabedian 1966). We adapted this model to patient safety by adding a fourth element, culture (the context in which care is delivered). While most current measures of quality focus either on process or outcome elements, many safety measures involve the structure and culture of patient care delivery.
The recent focus on measuring safety has prompted consideration of new structural measures. Such measures can include institutional variables, such as how involved leaders are in patient safety efforts, or credentialing mechanisms to ensure staff competency. Other measures could be task variables, such as the presence of protocols, or team variables where staff lower on the hierarchy feel comfortable voicing concerns to team members higher up the hierarchical ranks (Pronovost, Angus et al. 2002).
A key challenge in measuring safety is clarifying what can and cannot be measured as a valid rate. To be a valid rate, the numerator (event or harm) and denominator (population at risk) should be clearly defined and measured with minimal bias. A surveillance system must be in place to accurately identify and measure both the numerator and denominator of the rate (Gordis 2004). Most safety parameters, such as information from patient safety reporting systems (PSRS), are difficult, if not impossible, to capture in the form of a valid rate. Such safety parameters are still useful, but not interpretable as rates. For example, surgical complications or medical errors that result in significant patient harm are important as a numerical count (i.e., numerator), but they are not likely valid rates because there is no clear denominator and reporting biases are present for the numerator. Establishing safety indicators that can be measured as valid rates is a critical first step in monitoring and improving safety and reliability.
We have developed a framework (Table 1) for measuring patient safety that has been previously published (Pronovost, Holzmueller, Sexton et al. 2006). In our framework, we address the critical issue of appropriate use of rates to measure safety by stratifying measurements into two categories. One category uses valid rate-based measures that are readily available using existing hospital resources. This category addresses outcome and process measures, respectively: (1) how often do we harm patients? and (2) how often do we use evidence-based medicine? The second category captures indicators that are essential to patient safety, but not measurable as valid rates. This category addresses structural and context measures, respectively: (1) how do we know we learned from mistakes? and (2) how well have we created a culture of safety—measured with the Safety Attitudes Questionnaire?
To improve the nonrate-based measures that are not related to a specific discipline in health care settings, we implement the comprehensive unit-based safety program (CUSP), which has demonstrated improvements in safety culture (Pronovost 2005). CUSP provides enough structure that a health care organization can develop a broad strategy to improve safety, yet is flexible enough to defer to the local concerns and wisdom of staff in individual care areas. As part of CUSP, a senior executive adopts a work area and actively participates in safety efforts with staff. Staff in each work area are asked to learn from one defect per month, and department and hospital leaders learn from one defect per quarter using a structured tool (Pronovost, Holzmueller, Martinez et al. 2006). The goal is to move away from just reporting and superficially reviewing multiple hazards to focusing intently on a few and mitigating the hazards (i.e., redesigning the system in which work is performed). In addition, CUSP asks safety teams to implement tools, such as daily goals and morning briefings (Pronovost et al. 2003; Thompson et al. 2005), to help improve safety culture.
Our model to improve reliability focuses on the rate-based measures of safety: how often do we harm patients and how often do we use evidence-based medicine? Rate-based measures are specific to a clinical area or discipline. Using the objective of eliminating CRBSI as an example, we describe the model below.
1. Identify interventions associated with an improved outcome in a specific patient population. To a large extent, this has been accomplished with practice guidelines or summaries of clinical research evidence. For example, the Centers for Disease Control (CDC) and others have published guidelines for preventing CRBSIs (CDC 2004).
2. Select interventions that have the biggest impact on outcomes and convert these into behaviors (Grimshaw et al. 2001b; Michie and Johnston 2004). The team should focus on approximately five interventions that are supported by strong evidence, have the greatest potential benefit, and reflect patients' values and preferences. Recent recommendations to grade evidence into “do or do not do” will greatly facilitate this step (Atkins et al. 2004). In selecting interventions, it may be helpful, if not done in the evidence review, to make a table of each potential intervention with the strength of the evidence supporting its use, the strength of the relationship (e.g., a risk ratio) between the intervention and the outcome, and the barriers in implementing the intervention (Gordis 2004).
3. Develop measures to evaluate reliability. Here, we seek a scientifically sound and feasible rate-based measure that can either be an outcome or process element of safety. The measure(s) selected should be carefully considered. Both types of measures have strengths and weaknesses that have been published (Rubin, Pronovost, and Diette 2001; Lilford et al. 2004; Pronovost, Nolan et al. 2004). For example, if the intervention is a medication, we could measure whether it was given, or which medication, what dose, and/or when it was given. Several principles guide us in deciding which of these to measure. First, choose measures that are scientifically sound or supported by the evidence. If timing or dose of antibiotic administration is important, measure when the medication was given and the dose given as two separate variables. Second, measure what is feasible, or easily collected with available resources. Third, if possible, measure where defects most commonly occur. To do this, review each step in the process for a sample of patients and identify where defects most commonly occurred. For example, evidence suggests that steroids reduce mortality in septic shock patients (Annane et al. 2002). When we monitored use of steroids for this patient population, we found that failure to prescribe the medication was the most common defect. As a result, we developed a measure to evaluate whether patients with septic shock received steroids.
Developing measures typically requires significant resources and expertise in both measurement methods and the specific clinical content (Garber 2005), which few health care organizations are likely to have available. As such, national measures should be developed and broadly shared among health care organizations.
In this case, the National Nosocomial Infection Surveillance System (NNIS) has standardized measures for CRBSI that are valid, reliable, and widely used, which prompted us to measure the outcome rather than the process. Attempts to measure the process proved neither valid nor feasible. Such a measure would require additional ICU staff—not available to us—to independently monitor the placement of all central venous catheters.
4. Measure baseline performance. This is the best test of whether the proposed measures can be feasibly collected. If baseline data cannot be collected with minimal bias, it is unlikely that these data can be collected after the intervention has been implemented. Moreover, without baseline data, an organization cannot assess if safety has improved. In addition to collecting data, a health care organization should create a database to evaluate data quality and missing data, store and analyze data, and produce reports. In our experience, few quality improvement projects create such a database.
5. Ensure patients receive evidence-based interventions. This effort is the biggest challenge. While steps 1–4 are generally performed by a team of researchers and clinicians with sufficient resources who may or may not personally implement the interventions, step 5 involves teams from the participating health care organization who will actually implement the interventions. These interventions must be tailored to address each participant's current system, culture, resources, and commitment. While there is no formula for system redesign, there are many tactics that appear effective for improving care (Grol et al. 1998; Cabana et al. 1999; Grol 2001; Pronovost, Wu et al. 2002; Pronovost, Weast et al. 2004; Pronovost and Berenholtz 2002; Bradley et al. 2005).
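Step 4 above calls for a database that can evaluate data quality and missing data before any analysis. The following is a minimal sketch of such a check, assuming monthly records are held as dictionaries; the field names (`icu`, `month`, `infections`, `line_days`), the function name, and the toy records are hypothetical and are not drawn from the project's actual database.

```python
def audit_monthly_records(records):
    """Split monthly records into usable rows and rows needing follow-up.

    A record is usable for a rate only if both the numerator (infections)
    and the denominator (central line days) are present and plausible.
    """
    usable, flagged = [], []
    for rec in records:
        infections = rec.get("infections")
        line_days = rec.get("line_days")
        if infections is None or line_days is None:
            flagged.append((rec["icu"], rec["month"], "missing value"))
        elif infections < 0 or line_days <= 0:
            flagged.append((rec["icu"], rec["month"], "implausible value"))
        else:
            usable.append(rec)
    return usable, flagged


# Toy records for illustration only; these are not study data.
toy_records = [
    {"icu": "A", "month": "2004-01", "infections": 1, "line_days": 210},
    {"icu": "A", "month": "2004-02", "infections": None, "line_days": 190},
    {"icu": "B", "month": "2004-01", "infections": 0, "line_days": 0},
]

usable, flagged = audit_monthly_records(toy_records)
for icu, month, reason in flagged:
    print(f"follow up with ICU {icu} for {month}: {reason}")
```

Running a check like this on baseline data is one way to learn early whether the proposed measure can be collected with minimal bias.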
The change model we used to improve reliability (outlined in Table 2) was designed as a practical application of theories related to diffusion of innovation and behavior change (Grimshaw et al. 2001a; Greenhalgh et al. 2004; Michie et al. 2005). The change model includes four components: engage, educate, execute, and evaluate. Each component targets senior leaders, team leaders, and front-line staff.
Engaging and educating front-line staff is challenging and resource intensive. The execute component encourages staff to use HRO theory (i.e., standardize, create independent checks, and learn from mistakes) to ensure patients receive evidence-based interventions. Here, we encourage teams to first consider how they can standardize (including reducing complexity) what they do to reduce the risk of failure. Often this step includes creating a standard order set or protocol. Next, teams create independent checks (i.e., two or more persons recheck independent of the other[s]) for key processes. Finally, when defects occur, teams are encouraged to evaluate or learn the causes.
We applied this safety framework to improve safety in over 100 Michigan ICUs. This research study, called the Keystone ICU project, was based on a collaborative model (Ovretveit et al. 2002; Mills and Weeks 2004) between the Johns Hopkins University, Quality and Safety Research Group (QSRG), and the Michigan Health & Hospital Association (MHA), Keystone Center for Patient Safety & Quality. The project was designed as a prospective cohort study to evaluate the effects of implementing patient-safety interventions. The research was conducted from September 30, 2003 to September 30, 2005. It was funded by the U.S. Agency for Healthcare Research and Quality (AHRQ) and received Institutional Review Board approval from the Johns Hopkins University School of Medicine.
In June 2003, all Michigan hospitals with ICUs were invited to participate in the Keystone ICU project. To participate, hospitals had to assemble an ICU improvement team and send a written commitment to the project, signed by a hospital senior executive. At a minimum, the ICU improvement team included a senior executive, the ICU director and nurse manager, an ICU physician and nurse, and often a department administrator. Hospital senior executives were asked to ensure that the ICU physician and nurse could commit 20 percent of their time to the project. In addition, each team agreed to implement the patient-safety interventions, collect and submit the required data in a timely manner, attend the biannual 1.5-day conferences, and participate in monthly conference calls.
The overall objective of the study was to improve patient safety using the safety score card (Table 1), in participating ICUs. We will discuss two study objectives in this paper: improving culture and eliminating CRBSI.
Before implementing the intervention (January–March 2004), and 1 year after exposure (March–May 2005), participating ICUs assessed their safety culture in a pre–post design. The Safety Attitudes Questionnaire (SAQ) (ICU version) (Pronovost and Sexton 2005) was administered to all caregivers who routinely had contact with ICU patients. This survey is reliable, sensitive to change (Gregorich, Helmreich and Wilhelm 1990; Thomas et al. 2005), and elicits attitudes shown to predict important performance outcomes (Foushee 1984; Helmreich et al. 1986; Pronovost et al. 2005). The six domains of the SAQ are perceptions of management, job satisfaction, stress recognition, working conditions, teamwork climate, and safety climate. A more detailed report of assessing and improving safety climate will be presented elsewhere (unpublished data, assessing and improving safety climate in a statewide sample of ICUs). In this study, we report context of care issues related to patient safety and perceptions of leadership to demonstrate the impact of providers surfacing and addressing safety issues with hospital leaders. Response options for each item range from 1 (disagree strongly) to 5 (agree strongly).
Throughout the study, data on the number of CRBSIs and central line days were collected monthly from the hospital Infection Control Practitioner (ICP) using CDC NNIS system definitions and standards. A quarterly CRBSI rate was calculated as the number of infections per 1,000 central line days for each 3-month period. Each quarterly CRBSI rate was assigned to one of five time periods: preimplementation baseline, peri-implementation, and 0–3, 4–6, or 7–9 months postimplementation.
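The quarterly calculation described above can be sketched as follows. This is an illustration under stated assumptions: it pools the three months of a quarter before dividing (rather than averaging monthly rates), the function name is hypothetical, and the monthly counts are toy values, not study data.

```python
def quarterly_crbsi_rate(monthly_counts):
    """Return CRBSIs per 1,000 central line days for one quarter.

    `monthly_counts` is a sequence of (infections, central_line_days)
    pairs, one per month. Returns None when no line days were recorded,
    because the rate is undefined without exposure.
    """
    infections = sum(i for i, _ in monthly_counts)
    line_days = sum(d for _, d in monthly_counts)
    if line_days == 0:
        return None
    return 1000 * infections / line_days


# Toy quarter: 2 infections over 600 central line days.
quarter = [(1, 210), (0, 190), (1, 200)]
rate = quarterly_crbsi_rate(quarter)
print(f"{rate:.1f} CRBSIs per 1,000 central line days")
```

Pooling weights each month by its exposure, so a month with few line days does not dominate the quarterly figure.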
To reduce bias in data collection, we developed a manual of operations, which included explicit definitions for each process and outcome measure. Standardized data collection forms were developed, pilot tested, revised, and distributed to ICU teams and then converted into electronic format. Teams were trained to collect data via conference calls. ICUs received monthly and quarterly ICU performance reports and compared their performance with aggregate results from the other participating ICUs.
Teams focused first on improving ICU culture using CUSP because we believed that this change was necessary before teams could redesign care and improve reliability (Sexton, Helmreich and Thomas 2000; Shortell et al. 2004b). For rate-based measures, we will discuss CRBSI.
Identify and select interventions (steps 1 and 2). To reduce CRBSI, we distilled the nearly 100-page evidence summary into five behavior-specific interventions related to central line placement: (1) wash your hands, (2) use full-barrier precautions, (3) prepare the insertion site with chlorhexidine antiseptic, (4) avoid the femoral site for insertion, and (5) remove unnecessary lines.
Develop measures and collect baseline data (steps 3 and 4). Because NNIS has standardized measures for CRBSI, we opted to measure the outcome, not the process. The median baseline rate of CRBSI was 4.2 per 1,000 catheter days.
Ensure patients receive interventions (step 5). Using our change model, we engaged ICU staff by providing an estimate of the number of deaths attributable to CRBSIs in their ICU. Indeed, harm was now visible. We educated staff by making the research evidence supporting the CRBSI intervention easily accessible in the form of original literature, concise evidence summaries, and slide presentations of the relevant literature. We accomplished engagement and education of front line ICU staff through conference calls, newsletters, and printed educational materials. Available resources limited our ability to create electronic learning tools, but this represents another potential aid for engaging and educating a large number of ICU staff. To implement the Keystone ICU project interventions, team leaders were encouraged to make a task list and associated time line for the interventions and then pilot test the interventions on a small sample of patients or caregivers before wide-scale implementation.
To execute the interventions and ensure patients reliably received these evidence-based interventions, we asked teams to standardize, create independent checks, and learn from mistakes. Complexity was reduced and standardization accomplished by creating a central line cart to store all necessary equipment and supplies for line insertion. Previously, caregivers went to eight different locations in the ICU to collect all necessary equipment. In addition, we created a checklist of the five interventions to reduce CRBSI and empowered nurses assisting with central line placement to ensure physician compliance with all five interventions under nonemergency conditions (Berenholtz et al. 2004). Finally, when a CRBSI occurred, the care team evaluated the case to identify whether it could have been prevented.
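The checklist serves as an independent check: a second person confirms each of the five interventions. A minimal sketch of that logic follows, assuming the observing nurse's confirmations are recorded as a dictionary; the data structure and function name are hypothetical, and this is not the tool used in the project.

```python
# The five behavior-specific interventions named in the text.
CHECKLIST = (
    "wash hands",
    "use full-barrier precautions",
    "prepare the insertion site with chlorhexidine antiseptic",
    "avoid the femoral site for insertion",
    "remove unnecessary lines",
)


def missed_items(observed):
    """Return checklist items the observer did not confirm.

    `observed` maps an item to True when the observer confirmed it.
    Items absent from `observed` count as not confirmed.
    """
    return [item for item in CHECKLIST if not observed.get(item, False)]


# Toy observation: every item confirmed except full-barrier precautions.
observation = {item: True for item in CHECKLIST}
observation["use full-barrier precautions"] = False

gaps = missed_items(observation)
if gaps:
    print("Pause the procedure (nonemergency) and address:", "; ".join(gaps))
```

Treating an unrecorded item as not confirmed is a deliberately conservative choice in this sketch, so that silence is never read as compliance.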
ICU teams partnered with their hospital infection control staff to implement the CRBSI intervention and monitor its impact. This approach centralized and standardized data collection and fostered local ownership and accountability for improving CRBSI.
In addition, we directly involved senior leadership from each participating hospital in specific tasks to help the project succeed. For example, strong evidence has shown that skin sterilization, specifically with chlorhexidine, before central venous catheter placement will reduce CRBSI by 50 percent (Mermel 2000). At the start of this project, 20 percent of Michigan hospitals had chlorhexidine routinely available in their ICU central line kits. Chief executive officers (CEOs) were sent a letter from the principal investigator (PJP) and project director (CG) outlining this evidence and asking them to facilitate the availability and use of chlorhexidine in their hospitals. Within 6 weeks, 76 percent of participating hospitals had chlorhexidine in-house, and by project end, all teams were using chlorhexidine (Pronovost and Goeschel 2005).
The senior leader's role is to provide teams with sufficient resources and incentives, and to remove barriers (e.g., political) to the team's success. Unfortunately, we lack a formal mechanism to evaluate the extent to which teams perceive senior leaders are performing this role. To surface barriers to successful intervention implementation and provide feedback to senior leaders regarding these barriers, we surveyed ICU teams monthly using a “team checkup” survey. Specifically, we asked about their perceptions of the adequacy of physician and senior leader support, time to implement the interventions, and support for data collection. Over half of the teams reported that inadequate support from senior and physician leaders and insufficient time significantly deterred their progress. Senior leaders were given these survey data as part of a process to evaluate their leadership role in the project.
We obtained data from 99 of 107 ICUs in 2004 and 98 of 127 in 2005. Between the 2004 and 2005 administrations, ICU mergers, closings, splits, or failure to collect data in both years left 72 ICUs intact with 2004 and 2005 data. In these 72 ICUs, we received 4,474 of 5,975 surveys (75 percent response rate) in 2004 and 3,876 of 5,965 (65 percent response rate) in 2005. Two-tailed paired sample t-tests showed that context of care items improved, and we report agreement with the following items at the respondent and ICU levels: “Patient safety is constantly reinforced as the priority in this ICU,” t(71)=5.091, p<.001, respondents pre 74 percent and post 80 percent, ICUs pre 58 percent and post 78 percent; “There is widespread adherence to clinical guidelines and evidence based criteria in this ICU,” t(71)=7.041, p<.001, respondents pre 59 percent and post 66 percent, ICUs pre 10 percent and post 25 percent; “The administration of this hospital is doing a good job,” t(71)=3.449, p<.001, respondents pre 36 percent and post 42 percent, ICUs pre 1 percent and post 3 percent; “Hospital administration supports my daily efforts,” t(71)=3.417, p<.001, respondents pre 33 percent and post 38 percent, ICUs pre 1 percent and post 3 percent. Figure 1 shows the distribution of agreement for each context of care item for 2004 and 2005.
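For readers less familiar with the notation, t(71) denotes a paired t statistic with 71 degrees of freedom, which matches 72 paired ICUs (n − 1). The sketch below shows how such a statistic is formed and rechecks the reported response rates. It uses only the Python standard library, the function name is hypothetical, the three-unit example is toy data, and the study's ICU-level scores are not reproduced here.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired observations.

    t = mean(d) / (sd(d) / sqrt(n)), where d = post - pre for each unit.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pre/post pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    n = len(diffs)
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


# Toy example with three units (not study data).
t_stat, dof = paired_t([1, 2, 3], [2, 4, 6])
print(f"t({dof}) = {t_stat:.3f}")

# Response rates recomputed from the counts reported in the text.
rate_2004 = 4474 / 5975
rate_2005 = 3876 / 5965
print(f"2004: {rate_2004:.0%}, 2005: {rate_2005:.0%}")
```

A two-tailed p value would additionally require the t distribution, which is outside the standard library and is therefore omitted from this sketch.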
At the start of the study, 107 ICUs agreed to participate and 98 ICUs collected CRBSI data using NNIS definitions. As described in Table 3, the proportion of all ICU months of observation with zero CRBSI increased from 59 percent at baseline to 80 percent by 7–9 months postimplementation (p=.005, relative risk 0.50, 95 percent CI 0.29–0.87). The prevalence of small ICUs with a relatively low number of catheter line days made the goal of eliminating CRBSI easier. Consequently, we performed a sensitivity analysis of the results. In this analysis, we focused only on observations with ≥150 catheter line days per month (52 percent of the entire sample). As expected, the proportion of ICU months with zero CRBSI was lower in this subgroup (44 percent at baseline). However, the magnitude and significance of the improvement achieved by the intervention were essentially unchanged (relative risk 0.53, 95 percent CI 0.30–0.91). For observations with <150 catheter line days per month, the proportion of ICU months with zero CRBSIs at baseline was greater than 74 percent, and the intervention also demonstrated evidence of benefit but did not reach statistical significance because of the small number of observations in this stratum in the 7–9 month postintervention time period (relative risk 0.19, 95 percent CI 0.03–1.32). Thus, the intervention appears beneficial in eliminating CRBSI across a range of ICU sizes.
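As a rough check on the headline result, the relative risk can be approximated from the two reported proportions alone. This sketch assumes the relative risk refers to the risk of an ICU month having at least one CRBSI (the complement of a zero-CRBSI month); that reading is an assumption, the function name is hypothetical, and the confidence interval cannot be reproduced without the underlying counts.

```python
def crude_relative_risk(zero_prop_baseline, zero_prop_followup):
    """Crude relative risk of an ICU month having at least one CRBSI.

    Inputs are the proportions of ICU months with zero CRBSI; the risk of
    one or more infections is taken as the complement of each proportion.
    """
    risk_baseline = 1 - zero_prop_baseline
    risk_followup = 1 - zero_prop_followup
    if risk_baseline <= 0:
        raise ValueError("baseline risk must be positive")
    return risk_followup / risk_baseline


# 59 percent zero-CRBSI months at baseline, 80 percent at 7-9 months.
rr = crude_relative_risk(0.59, 0.80)
print(f"crude relative risk: {rr:.2f}")
```

The crude value, 0.20/0.41 or about 0.49, is close to the reported 0.50; any small difference would be expected if the published estimate came from an analysis of the underlying ICU-month data rather than from rounded percentages.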
We present a framework for improving reliability in health care that was associated with a significant reduction in CRBSIs across nearly 100 ICUs in Michigan and with significant improvements in safety culture (Grol 2001). Our framework differs in several important ways from existing models. First, it incorporates efforts to improve culture. Organizational culture is the lubrication that allows for system redesign and helps ensure the sustainability of changes (Sexton, Helmreich and Thomas 2000; Shortell et al. 2004a). Second, we targeted three distinct groups to improve safety: senior leaders, team leaders, and front-line staff. Third, teams were given a manual of operations to facilitate change management (engage, educate, execute, and evaluate) for the planned interventions.
Nevertheless, there are limitations to the proposed framework. First, the resources required for this model likely exceed those available at any single hospital. Consequently, these programs are best implemented through a large consortium of hospitals (e.g., state-wide via a state hospital association). Second, measuring improvements in safety takes resources to both develop measures and collect data. Developing measures requires expertise not commonly present in most health systems, and the collection of data for many measures is not yet a routine part of hospital operations. Third, we calculated CRBSI rates using the 1,000 catheter days standardized through NNIS, which does account for the majority of risks of exposure to a catheter, but does not account for an individual patient's risk of infection from the device. Fourth, like all models, this proposed framework requires empiric validation; efforts to improve effectiveness and efficiency should be a research priority.
Nearly all of health care lacks the ability to evaluate whether it is providing safer patient care. HROs provide insight into the context of care, often called culture, that influences reliability. In this paper, we outline a framework for health care organizations to improve reliability and describe application of this model to a large cohort of ICUs in Michigan. Use of this model was associated with a significant reduction in CRBSIs and improvement in culture. We look forward to empiric validation of this model.
Funding was provided in part from the Agency for Healthcare Research and Quality (AHRQ), grant #1UC1HS14246.