Improvement collaboratives consisting of various components are used throughout healthcare to improve quality, but no study has identified which components work best. This study tested the effectiveness of different components in addiction treatment services, hypothesizing that a combination of all components would be most effective.
An unblinded cluster-randomized trial assigned clinics to one of four groups: interest circle calls (group teleconferences), clinic-level coaching, learning sessions (large face-to-face meetings), and a combination of all three. Interest circle calls functioned as a minimal intervention comparison group.
Outpatient addiction treatment clinics in the U.S.
201 clinics in 5 states.
Clinic data managers submitted data on three primary outcomes: waiting time (mean days between first contact and first treatment), retention (percent of patients retained from first to fourth treatment session), and annual number of new patients. State and group costs were collected for a cost-effectiveness analysis.
Waiting time declined significantly for 3 groups: coaching (an average of −4.6 days/clinic, P=0.001), learning sessions (−3.5 days/clinic, P=0.012), and the combination (−4.7 days/clinic, P=0.001). The coaching and combination groups significantly increased the number of new patients (19.5%, P=0.028; 8.9%, P=0.029; respectively). Interest circle calls showed no significant effects on outcomes. None of the groups significantly improved retention. The estimated cost/clinic was $2,878 for coaching versus $7,930 for the combination. Coaching and the combination of collaborative components were about equally effective in achieving study aims, but coaching was substantially more cost effective.
When trying to improve the effectiveness of addiction treatment services, clinic-level coaching appears to help improve waiting time and number of new patients, while other components of improvement collaboratives (interest circle calls and learning sessions) do not seem to add further value.
Quality improvement (QI) collaboratives are used in North America, Australia, England, and European countries to improve healthcare quality, and yet little is known about which components of collaboratives work best.1 This study reports the primary outcomes from a trial designed to identify the most effective elements in improvement collaboratives.
The study applied QI to U.S. addiction treatment clinics. The American addiction treatment system resembles health systems in many countries because 80% of it is publicly funded.2 Internationally, policy makers demand more effective treatments for alcohol and drug use disorders, and it has been suggested that process improvement can change what providers do and how treatments work.3
This study addressed problems of access and adherence to treatment by focusing on process targets (e.g., reducing waiting time). The link between process targets and patient outcomes is complex and depends in part on which patient outcomes are studied. A link was found between process goals and increased patient show rate,4 reduced medical expenses,5 and reduced arrests and incarcerations.6 Harris et al did not find an association between increased continuity of care and reduced problematic substance use;7 they did find a statistically significant but clinically modest association between improved engagement and problematic substance use.8 In a review of the literature, Humphreys and McLellan found that although process improvements can change how treatment programs work, the link to better patient outcomes is weak, in part because outcomes are so heavily influenced by events in and the environment of patients’ lives.3
The primary unit of analysis is the clinic (not the patient) because clinic leaders set organizational policies and processes that affect patient care. The study’s main goal was identifying which components of improvement collaboratives are most effective in helping clinics reduce waiting time to enter treatment, enhance treatment retention, and increase the number of new patients.
NIATx 200 was a cluster-randomized trial conducted according to the CONSORT criteria in 201 addiction treatment clinics in five U.S. states. NIATx (the Network for the Improvement of Addiction Treatment), a research center at the University of Wisconsin–Madison, promotes process improvements in addiction treatment. Details about the methods have been previously published.9 Eligible clinics were publicly funded outpatient and intensive outpatient clinics with at least 60 patients annually and no previous NIATx experience. All patients seen within an enrolled clinic were included in the analysis.
Clinics were randomized into one of four groups that used different components of improvement collaboratives:1,10,11 (1) interest circle calls, (2) coaching, (3) learning sessions, and (4) the combination of all three components. Clinics, randomized within states, were stratified by size (measured in number of patients per year) and management score.12 University of Wisconsin researchers enrolled clinics, generated the allocation sequence, and assigned clinics to groups using a computerized random number generator. Clinic staff and researchers were unblinded.
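The stratified allocation described above can be illustrated with a minimal sketch. It is not the trial's actual allocation procedure, which is not detailed here; the field names (`state`, `size_stratum`, `mgmt_stratum`), the round-robin dealing, and the seed are assumptions made only for illustration.

```python
import random
from collections import defaultdict

GROUPS = ["interest_circle_calls", "coaching", "learning_sessions", "combination"]


def assign_groups(clinics, seed=0):
    """Assign clinics to the four groups within strata (illustrative only).

    clinics: list of dicts with hypothetical keys 'id', 'state',
    'size_stratum', and 'mgmt_stratum'. Returns {clinic_id: group}.
    """
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    strata = defaultdict(list)
    for clinic in clinics:
        key = (clinic["state"], clinic["size_stratum"], clinic["mgmt_stratum"])
        strata[key].append(clinic["id"])

    assignment = {}
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)  # random order within the stratum
        offset = rng.randrange(len(GROUPS))  # random starting group
        # Deal clinics round-robin so group sizes stay balanced per stratum.
        for i, clinic_id in enumerate(ids):
            assignment[clinic_id] = GROUPS[(i + offset) % len(GROUPS)]
    return assignment
```

Dealing shuffled clinics round-robin keeps the four groups within one clinic of each other in every stratum, which is one common way to balance groups on the stratifying variables.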
The 18-month intervention, delivered at the clinic level, was divided into three 6-month periods, each with one aim and a set of web-based materials about NIATx 200 recommended practices, instructions on implementing changes, QI tools, measures, and case studies (Fig. 1).13,14 The goal was providing the same content to all participants, varied by the support provided. The components tested in this study come from Institute for Healthcare Improvement10 and NIATx14 models.
Interest circle calls were monthly teleconferences in which staff from different clinics learned from experts and discussed progress with one another. The calls were led by QI experts trained in the NIATx model.14 Interest circle calls provide a simple and inexpensive way for clinics to collaborate, but quality may vary by facilitator. In addition, the calls may conflict with some participants’ schedules, limiting participation. This was the lowest cost, lowest intensity study condition, and it functioned as a minimal intervention comparison group.
Coaching assigned one of the QI experts to help clinic leaders and change teams make improvements. Coaching involved one initial site visit, monthly phone conferences, and e-mail correspondence. During the site visit, the coach met with clinic leaders and change team members to plan the clinic’s first change project. Using follow-up calls and e-mail, the coach reviewed the assigned project aim with clinic leaders and suggested practices and QI tools from the website. Coaches encouraged clinics to work on the assigned aim for each 6-month period and wrote reports summarizing each clinic contact. Coaches tailor process improvement advice to leaders and change teams. The match between coach and organization may or may not be good, and the quality of coaching can vary.
Learning sessions occurred in each state during months 1, 6, and 12 of the 18-month intervention period. The same three coaches led all sessions. Learning sessions convened change teams from different clinics in face-to-face conferences to learn from coaches and one another. Participating state agencies hosted learning sessions. The agendas for the learning sessions guided the content delivered in subsequent months in the other three groups. Learning sessions are intended to promote peer learning.10,11 They can also provide inspiration and social support.15 Learning sessions are expensive and require that most participants travel.
The combination group had the same type and number of interest circle calls, coaching activities, and learning sessions as groups 1, 2, and 3.
In the U.S., an agency of state government in each of the 50 states coordinates services and manages federal funds for addiction treatment. The research team worked with five of these state agencies to conduct the study. The agencies recruited clinics to the study, sometimes encouraging specific clinics to apply (e.g., clinics with a good record of providing data). Compared to all eligible clinics, enrolled clinics were larger by approximately 100 annual admissions, served a smaller proportion of African Americans, and were more often not-for-profit. The five states formed two cohorts. Cohort 1 had clinics in Michigan, New York, and Washington; recruitment lasted from March through September 2007 and the interventions from October 2007 through March 2009. Cohort 2 had clinics from Massachusetts and Oregon, with all dates shifted 4 months later.
All clinics began with baseline data collection and a “walkthrough,” in which staff members assumed the role of a patient to personally experience the intake process and identify areas needing improvement.16 At the beginning of each 6-month intervention, the web-based curriculum (www.niatx200.net) was launched for the aim of that period. Staff members were encouraged to use recommended practices (e.g., establish walk-in hours) and their assigned component (e.g., coaching) to address each aim. A sustainability period lasting up to 9 months followed each 6-month intervention to assess maintenance of change.
Primary outcome measures were mean days between first contact and first treatment (waiting time), retention rate from first to fourth treatment session, and percent increase in the annual number of new patients. Each outcome was aggregated to the clinic level for analysis.
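As a rough illustration of how the first two outcome measures could be computed for one clinic, the sketch below derives waiting time and retention from per-patient dates. The record layout (`first_contact`, `sessions`) is hypothetical and is not the states' actual data format.

```python
from datetime import date
from statistics import mean


def clinic_outcomes(patients):
    """Compute waiting time and retention for one clinic (illustrative only).

    patients: list of dicts with hypothetical keys
      'first_contact' (date) and 'sessions' (sorted list of treatment dates).
    Returns None if no patient started treatment.
    """
    started = [p for p in patients if p["sessions"]]
    if not started:
        return None
    # Waiting time: days between first contact and first treatment session.
    waits = [(p["sessions"][0] - p["first_contact"]).days for p in started]
    # Retention: share of patients with a first session who reach a fourth.
    retained = [p for p in started if len(p["sessions"]) >= 4]
    return {
        "waiting_time_days": mean(waits),
        "retention_pct": 100 * len(retained) / len(started),
    }
```

Patients with no treatment session are excluded from both measures here, which mirrors retention being measured from the first treatment session rather than from the first request.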
Each state designed systems for collecting waiting time17 and retention data and designated a data manager who collected, de-identified, and sent patient data to the research team. A designated data coordinator at each clinic submitted dates of first contact, assessment, and first four treatment sessions for each patient admitted to care. States hosted training sessions for clinic data coordinators and provided technical support throughout the study. The number of new patients was collected in annual surveys.
The goal of the economic analysis was to estimate costs of each group for governmental authorities who might organize improvement collaboratives. Costs to the clinics of participating in the study, such as staff time spent on implementing changes, were not collected. The cost data collection instrument was based on the Drug Abuse Treatment Cost Analysis Program.18 The instrument collected the cost of personnel (state employees, NIATx employees, coaches, and consultants), data management, buildings and facilities, lodging, travel, phone calls, and miscellaneous costs. Costs were categorized as group specific (such as hotel costs for the learning sessions group) or non-group-specific, which included state-incurred costs for outreach, data management and infrastructure, encouraging participation, and administration.
Cost data were collected 3 times during the study period and aggregated to create a total cost estimate. To compare the cost effectiveness of the groups, the total cost of each group was divided by the total change in each group in each outcome. We calculated the annualized reduction in waiting days for each group by multiplying the average improvement in waiting days per clinic per patient by the number of clinics analyzed in the group and the average number of patients per year per clinic. A similar approach was used to calculate the annualized increase in new patients for each group and the annualized increase in the number of retained patients. We also calculated cost effectiveness by clinic.
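The waiting-time calculation described above can be written compactly. The symbols are introduced here only for illustration and do not appear in the original analysis: for group g, let \bar{\delta}_g be the average improvement in waiting days per clinic per patient, n_g the number of clinics analyzed, \bar{p}_g the average number of patients per clinic per year, and C_g the total cost of the group.

```latex
\Delta_g = \bar{\delta}_g \times n_g \times \bar{p}_g,
\qquad
\mathrm{CER}_g = \frac{C_g}{\Delta_g}
```

Here \Delta_g is the annualized reduction in waiting days for the group, and \mathrm{CER}_g is the cost per waiting day saved. The same form would apply to new patients and retained patients with the corresponding per-clinic change substituted for \bar{\delta}_g.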
In determining sample size, we wanted to detect a 10% reduction in waiting time. Power calculations were done for various sample sizes considering clinic recruitment levels anticipated in each state. We assumed a clinic attrition rate of 20% and a baseline average waiting time of 30 days. A sample of 200 clinics provided 80% power to detect a difference of 10.6% in waiting time, 7.5% in retention, and 14.2% in annual number of new patients.
Mixed-effect regression models were fit to primary outcomes, including terms to isolate state and group effects. Organization-level random effects were included to model the correlation among clinics within the same organization, and clinic-level random effects were included to model correlations between repeated observations from the same clinic over time. A vector of monthly waiting time and retention averages for each clinic served as the unit of analysis. Monthly averages based on fewer than 5 patient records were removed. To prevent larger clinics from dominating the results, equal weight was given to each clinic. The number of new patients was aggregated to an annual level to minimize seasonality. Increases in the number of new patients were summarized by changes in the natural logarithm of new patients from baseline data collected and combined for 2006 and 2007 (before the first intervention period) and surveys taken and combined for 2008 and 2009 (after the interventions began). Groups were compared pair-wise to detect statistically significant differences between groups. Clinic-level covariates accounted for in the analysis included clinic size, management score, and state affiliation.
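Two of the data-preparation steps described above, dropping monthly averages based on fewer than 5 patient records and giving each clinic equal weight, can be sketched as follows. This is a simplified illustration with a hypothetical record layout; it does not reproduce the mixed-effect regression models themselves.

```python
from collections import defaultdict
from statistics import mean

MIN_RECORDS = 5  # monthly averages based on fewer records are removed


def monthly_clinic_means(records):
    """Aggregate patient records to clinic-month averages.

    records: iterable of (clinic_id, month, value) tuples (hypothetical layout).
    Returns {(clinic_id, month): mean value}, omitting sparse clinic-months.
    """
    buckets = defaultdict(list)
    for clinic_id, month, value in records:
        buckets[(clinic_id, month)].append(value)
    return {k: mean(v) for k, v in buckets.items() if len(v) >= MIN_RECORDS}


def equal_weight_mean(monthly_means):
    """Average across clinics so that each clinic counts once.

    Each clinic is first reduced to the mean of its monthly averages, so a
    large clinic cannot dominate the result through patient volume.
    """
    per_clinic = defaultdict(list)
    for (clinic_id, _month), value in monthly_means.items():
        per_clinic[clinic_id].append(value)
    return mean(mean(values) for values in per_clinic.values())
```

With one small clinic averaging 10 waiting days and one large clinic averaging 30, the equal-weight mean is 20 regardless of how many patients each clinic saw, whereas a patient-weighted mean would be pulled toward the larger clinic.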
Table 1 shows baseline characteristics of the 201 clinics. Baseline data collection extended one month into the first intervention period as clinics increased their data collection capacity. At baseline, across all clinics, mean waiting time was 36.8 days, mean retention through 4 sessions was 74.9%, and mean number of new patients was 563 per year. Although better baseline management scores were associated with lower waiting times12 and baseline retention varied significantly by state, differences in these covariates were not associated with outcome changes.
Table 2a summarizes waiting time changes. At the end of the 6-month intervention, the coaching and combination groups had statistically significant reductions in waiting time (4.9 days, P=0.013 and 6.2 days, P=0.002, respectively). Learning sessions had a modest waiting time reduction while interest circle calls had a slight increase, but these two groups’ changes were not statistically significant. Pair-wise comparisons between groups reveal a significant difference in improvement between interest circle calls and the combination group (P=0.024).
Table 2b shows the reduction from baseline in average waiting days per patient averaged across each of the 14 months of the intervention and sustainability periods. For these months, three groups had statistically significant reductions: coaching (4.6 days), learning sessions (3.5 days), and the combination (4.7 days). Differences between coaching and interest circle calls and between the combination and interest circle calls were statistically significant (P=0.028 and P=0.024, respectively). Although the three groups ended the evaluation period with similar levels of improvement, the combination group had the greatest improvement (followed by coaching, and then learning sessions) because of the patterns of improvement over time.
None of the groups showed significant improvement in retention for the 6-month intervention period (Table 3a) or the entire intervention and sustainability period (Table 3b), and there were no significant differences between groups.
The last QI aim was increasing the number of new patients. The intervention and sustainability periods were combined for this outcome because numbers were aggregated to a yearly level to reduce seasonality. Table 4 shows that the coaching and combination groups both had statistically significant increases of 19.5% (P=0.028) and 8.9% (P=0.029), while learning sessions and interest circle calls did not differ from baseline. (Despite a substantial difference in coefficients, the coaching and combination group P values are similar because the coaching group had a higher standard error.) Pair-wise comparisons indicate that the coaching and combination groups both had significantly greater increases than interest circle calls (P=0.018 and P=0.029, respectively).
A sensitivity analysis was run to weight each clinic’s contribution to outcomes differentially (based on patient counts) rather than weighting each clinic equally. This alternative model did not change the ordering of the groups’ performance with respect to waiting time or number of new patients, suggesting that larger and smaller clinics performed similarly on these outcomes.
We used an alternative measure of retention in an exploratory analysis. Baseline retention rate from first to fourth treatment averaged nearly 75%, higher than reported elsewhere,7,8 possibly causing a ceiling effect. Furthermore, many patients can be lost between first request and treatment;19 measuring retention only from first treatment misses this early dropoff. The Massachusetts and New York data included records of all patients who requested treatment, regardless of whether they received any. For 57 clinics in these two states, it was possible to measure retention from first request to fourth treatment and capture the early dropoff. For these clinics, the baseline retention rate from first request to fourth treatment was 32%. Measuring retention this way removed the ceiling effect. Improvements in retention rate from first request (rather than first treatment) to fourth treatment were 4% for interest circle calls, 22% for coaching, 27% for learning sessions, and 26% for the combination group. These results parallel those for waiting time—interest circle calls showed little improvement while the coaching and combination groups did. No statistical test was performed because the analysis was exploratory.
Table 5a shows the non-group-specific costs incurred by states. Each state enrolled between 37 and 43 clinics and costs per state ranged from US$85,475 to US$394,729. These estimates suggest the range of costs for running a research-based collaborative for approximately 40 clinics. One state (New York) had very high data infrastructure costs, in part to create a system to collect data about first contact for all prospective patients, indicating that the costs of running a collaborative depend greatly on the data infrastructure already in place. Table 5a also shows per-clinic costs. Data infrastructure costs are excluded from the per-clinic costs because they are assumed to be one-time costs and largely independent of the number of clinics enrolled.
Table 5b shows group costs and cost-effectiveness ratios (CERs). The least expensive group was interest circle calls, followed by coaching, learning sessions, and the combination. Our approach in calculating CERs was to order the groups by cost and eliminate any dominated alternatives (i.e., more costly but less effective groups). Dominated alternatives are listed as “N/A,” and the CER is based on the next available alternative in terms of cost. For waiting time, coaching had a CER of $0.56 per patient waiting day saved compared to interest circle calls; coaching dominated learning sessions; and the CER for the combination compared to coaching was $37.30. For new patients, coaching had a CER of $11.36 per new patient compared to interest circle calls and dominated both learning sessions and the combination. While cost-effectiveness ratios are reported for retention in Table 5b, none of the groups had a statistically significant effect, making any interpretation of CERs tenuous.
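The ordering-and-elimination procedure described above can be sketched in a few lines. This is an illustrative reading of that description, handling simple dominance only. The function name and input layout are assumptions, and any numbers passed to it are the caller's own, not the study's cost data.

```python
def incremental_cers(groups):
    """Compute incremental cost-effectiveness ratios (illustrative only).

    groups: list of (name, total_cost, total_effect) tuples, where a larger
    effect is better (e.g., waiting days saved).
    Returns a list of (name, comparator, cer); cer is None when the group is
    dominated (more costly but no more effective than the comparator), and
    the cheapest group has no comparator.
    """
    ordered = sorted(groups, key=lambda g: g[1])  # order by cost
    ref_name, ref_cost, ref_effect = ordered[0]
    results = [(ref_name, None, None)]  # cheapest group is the reference
    for name, cost, effect in ordered[1:]:
        if effect <= ref_effect:
            # Dominated: reported as "N/A" and skipped as a comparator.
            results.append((name, ref_name, None))
            continue
        cer = (cost - ref_cost) / (effect - ref_effect)
        results.append((name, ref_name, cer))
        # The next group is compared with this non-dominated alternative.
        ref_name, ref_cost, ref_effect = name, cost, effect
    return results
```

Each non-dominated group is compared with the most recent cheaper group that was not eliminated, which matches the pattern in the text of coaching being compared with interest circle calls and the combination being compared with coaching.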
This randomized trial demonstrates that coaching and the combination of collaborative elements can produce statistically significant improvements in waiting time and the number of new patients compared to interest circle calls (a web site and monthly teleconferences). However, learning sessions (three face-to-face conferences at 6-month intervals) did not produce statistically significant outcome changes compared to interest circle calls. Coaching is the more cost-effective component. The combination costs almost three times as much as coaching by itself. For statistically significant differences between groups, effect sizes ranged from 0.23 to 0.60, a range common for non-laboratory studies.20 Organizational change is difficult because it depends on the interaction among individuals, the setting, the organizational climate, and the change itself.21 A study that examined 54 reviews of various interventions aimed at changing clinical practice found average effects of about 10% for main targets, similar to effects in this study.22
This study addresses a research question that has not to our knowledge been asked before: Which component(s) of improvement collaboratives are most effective in improving healthcare quality? The study is unique in the literature on QI in healthcare both in its scope and the strength of its design.
The study has limitations. Assessment of clinic-level costs was outside the scope of this research, though these costs will likely have an important effect on whether clinics ultimately decide to participate in QI collaboratives. Assessment of patient-level health outcomes was not possible. Limits to generalizability include clinics in NIATx 200 being significantly larger than the average addiction treatment center in each state, and the possibility that organizations electing to participate may have been more receptive to QI than those that declined.
The retention results show that the interventions had no significant effect, and yet the exploratory study (which included people seeking but not getting treatment) produced results similar to those for waiting time and the number of new patients, suggesting that a more accurate indication of clinic performance would require collecting data on all calls for help, not just those resulting in treatment.
It may be that the study outcomes respond to different types of improvements. The process changes recommended in the NIATx 200 curriculum may be more suitable to administrative outcomes such as waiting time and the number of new patients, while retention may require an approach focused more on clinicians and clinical practice. One could argue that coaching, which consisted in this study of one site visit and monthly phone calls, was too light or that the content and quality of the learning sessions and interest circle calls could have been different. Yet the design and content were based on established work on improvement collaboratives.10,14 Results could vary under other circumstances.
Why might coaching outperform the other groups? One reason may be that coaching tailors instruction.23 A coach can encourage a clinic to stay with a topic longer, if needed; interest circle calls and learning sessions proceed at a more predetermined pace. Coaches can be a more persistent voice for improvement than less personal interventions. Coaches can also respond directly to change leaders’ concerns and smooth transitions when staff turns over. Schein identifies three ways consultants can help organizations—being an expert resource, diagnosing problems, and consulting on process24—and the coach reports submitted after each clinic contact indicate that NIATx 200 coaches did help in these ways. Although coaching is increasingly a component of QI initiatives in healthcare,25, 26 the literature about coaching is sparse.
Why would coaching be as effective as the combination of services? The combination may have offered more information than participants could use. Information from one source (e.g., interest circle calls) also might have competed with information from another (e.g., coaching), causing confusion. The data in this and other studies23,27 also suggest that although the combination group had the best waiting time results in the first six months, coaching (and to a lesser extent, learning sessions) “caught up” and even began to surpass the combination over the sustainability period. Combining services appears to produce the greatest initial results, while coaching delivers steady gains over time.
Effective coaching bears further research. Coaching in this study drew from the work of Deming, who encouraged focusing on processes and using an outside consultant—a “master”—to help organizations make improvements.28 Others, such as Donald Schön29 and Atul Gawande,30 describe coaching as it relates to individual work performance. What are the characteristics of effective coaching for organizations and individuals? How can a good match be defined between coach and organization and coach and individual?
The study raises questions about the value of learning sessions, a key element of improvement collaboratives.10 This may be welcome news because learning sessions are expensive. The results of this study should encourage the use of QI and suggest ways to reduce the cost of doing so.
We thank the thousands of very dedicated, very busy staff members from treatment clinics in Massachusetts, Michigan, Oregon, New York, and Washington who took part in this study to improve the care of the patients they serve; Tom Hilton, Ph.D., and Redonna Chandler, Ph.D., from the National Institute on Drug Abuse for their deep interest in and commitment to the project; Susan Brandau, Michael Botticelli, Phil Chvojka, Shawn Clark, Michael Ellis, Dawn Lambert-Wacey, Eric Larson, Robin Roberts, Mark Steinberg, Karen Wheeler, Fritz Wrede, and Dagan Wright from the state authorities for substance abuse treatment, whose work was indispensable to the project; staff members at the Center for Health Enhancement Systems Studies at the University of Wisconsin–Madison: Nataliya Batina, Jim Kadunc, and John Oruongo for their help with the data analysis; Victor Cappocia, Ph.D., for his input on data analysis; Bobbie Johnson for writing and editing assistance; Skyler Neylon for input about data analysis; Victoria Sviridova for data entry and data collection; Kyle Grazier, Ph.D., School of Public Health and School of Medicine, University of Michigan, and David Zimmerman, Ph.D., Center for Health Systems Research and Analysis, University of Wisconsin–Madison, for their input on data analysis; Frank Davidoff, M.D., Institute for Healthcare Improvement, for his thoughtful reviews of the manuscript; and the coaches and facilitators for their commitment and good work with the addiction treatment clinics.
Trial registration: ClinicalTrials.gov NCT0093414.
Declaration of interest: The study was funded by the U.S. National Institute on Drug Abuse (R01 DA020832). The funder had no role in the study design, data collection, analysis, interpretation of data, the content of the study, or deciding to submit the article for publication. D McC serves as the Principal Investigator on Research Service Agreements between Oregon Health & Science University and Purdue Pharma and Alkermes, Inc.