We have previously reported the design principles and methods for the UPQUAL study.[5] These are briefly summarized here, followed by a full description of the new intervention and analyses.
The central hypothesis of UPQUAL is that improving the accuracy of electronic, point-of-care clinical decision support tools (ie, quality reminders) can create a ‘virtuous cycle.’ Entering exceptions, when appropriate, improves the accuracy of the decision supports by turning off a reminder for a period of time. This increases the positive predictive value of alerts in the future and increases clinicians' trust in reports of their performance: the accuracy of the report is determined largely by the accuracy of the data they enter. If clinicians seek to provide the highest quality of care possible, then they need to eliminate the ‘noise’ in the system by recording exceptions so they can find the patients who truly need care. They are therefore more motivated to use the clinical decision support system, including recording exceptions.
UPQUAL had several design principles, including non-disruptive reminders (‘passive alerts’); a ‘hub and spoke’ design of the clinical decision support that allowed providers to jump to patients' medication history, a health maintenance section to record outside tests, and pre-specified order sets as needed to gain information and act on the alerts; tools to allow clinicians to enter patient and medical reasons for not following recommendations (ie, exceptions) as part of routine workflow; feedback to physicians on their performance on quality measures; and feedback to physicians of the names of patients not receiving essential medications so they could reach out to patients who are not scheduled for upcoming appointments.
Setting and eligible patients
We performed this study at an academic internal medicine practice in Chicago, Illinois that uses a commercial EHR (EpicCare, Spring 2007 and then Spring 2008; Epic Systems, Verona, Wisconsin, USA). Northwestern University's institutional review board approved the study with a waiver of patient informed consent. All patients eligible for one or more quality measures () cared for by attending physicians were included.
The practice has used EpicCare since 1997. During the 2 years prior to the start of the study (2006–2007), we developed quality measurement tools using discrete data from the EHR. Physicians received printed quarterly reports of their performance on 12 quality measures (all of which were included in this intervention). They did not receive information about individual patients with quality deficits. Interruptive (ie, ‘pop-up’) point-of-care reminders with links to order entry were active for many clinical topics but were rarely used. Some measures included limited medical exceptions (eg, a documented drug allergy), but there was no mechanism for clinicians to record other medical and patient reasons for not following recommendations. These reminders were discontinued 3 months before the intervention began.
Initial implementation of the UPQUAL intervention (phase 1)
The UPQUAL intervention has been fully described previously (including figures with examples of the EHR interfaces) and is only summarized here. We used a minimally intrusive reminder: a single tab in the visit navigator which was highlighted in yellow if any measure was not satisfied and an exception was not documented. Alerts included standardized ways to capture patient reasons (eg, refusals) or medical reasons for not following an alert. Clinicians could also enter global exceptions (eg, terminal disease) to suppress multiple reminders and for performance measurement. Preventive services performed elsewhere could also be recorded.
The UPQUAL intervention was implemented on February 7, 2008. We held a 1-h initial training session to teach physicians how to use the decision support tools and to record exceptions. Performance was not used to determine compensation. We informed clinicians that medical exceptions would be peer reviewed; the vast majority of medical exceptions entered were judged valid.[11] In addition, we gave physicians printed lists each month of their patients who appeared to be eligible for an indicated medication but were not receiving it and had no exception recorded. Quarterly performance reports were continued as before the start of the study. This intervention was continued for 1 year through February 2009.
Addition of pre-visit printed clinical reminders (phase 2)
The nurses in the general internal medicine (GIM) clinic typically record vital signs and any comments for the physician (eg, ‘needs medication refill’) on a sheet that is left in a box outside the examination room. In February 2009, we implemented a system that queried the EHR for outstanding quality deficits when the patient registered and printed these for the rooming nurses to use in lieu of their previous rooming sheets. All other quality measurement and feedback remained the same.
Study measures
At each time point, patients were eligible for a measure if they had two or more office visits in the preceding 18 months, were cared for by an attending physician, and met the other measure criteria (). For chronic disease measures, we included patients when ICD-9-CM disease codes were recorded on the active problem list, past medical history, or as prior visit diagnoses. We used Structured Query Language to retrieve data from an enterprise data warehouse that contains data copied daily from the EHR. For each of the 37 months of the evaluation period (1 year prior to the intervention, phase 1, and phase 2), all patients were classified for each measure for which they were eligible as: (a) satisfied, (b) did not satisfy but had an exception, or (c) did not satisfy and had no documented exception. The primary outcome for each measure was calculated as: number satisfied/(number eligible−number not satisfied with an exception). Changes in secondary outcomes during the year after the intervention (eg, number who did not satisfy but had an exception) have been reported previously and are not reported here for the second study year.
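The primary outcome formula above can be illustrated with a minimal Python sketch. The category labels and function names here are hypothetical and chosen only for illustration; the study itself retrieved these data with SQL from the enterprise data warehouse, and this is not the authors' code.

```python
from collections import Counter

# Hypothetical labels for the three classifications described in the text.
SATISFIED = "satisfied"
EXCEPTION = "not_satisfied_exception"
NO_EXCEPTION = "not_satisfied_no_exception"


def primary_outcome(classifications):
    """Return number satisfied / (number eligible - number with an exception).

    `classifications` holds one label per eligible patient for one measure
    in one month. Returns None when the denominator is zero.
    """
    counts = Counter(classifications)
    eligible = sum(counts[label] for label in (SATISFIED, EXCEPTION, NO_EXCEPTION))
    denominator = eligible - counts[EXCEPTION]
    if denominator == 0:
        return None
    return counts[SATISFIED] / denominator


# Illustrative (made-up) month: 80 satisfied, 10 with exceptions, 10 without.
example = [SATISFIED] * 80 + [EXCEPTION] * 10 + [NO_EXCEPTION] * 10
rate = primary_outcome(example)
```

With these made-up counts, the 10 patients with exceptions are removed from the denominator, so the outcome is 80/90 rather than 80/100.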
Statistical analysis of changes in group performance during phase 2
Analyses used SAS v 9.1 (SAS Institute, Cary, North Carolina, USA) and R software package v 0.10–16 (R Foundation for Statistical Computing, Vienna, Austria). We calculated each of the 16 performance measures for the first of each month from February 1, 2007 through February 1, 2010. This yielded a 37-point time series for each measure. For the current analysis, we concentrated only on data from months 24–37. To determine whether changes in performance over this time period were statistically significant, a linear model was fit to each time series using time (ie, month) as a continuous predictor, as described previously.[5] Next, we determined the autoregressive order of the model residuals by minimizing Akaike's information criterion.[12] Finally, we fit a linear regression model with autoregressive errors (using the appropriate number of autoregressive parameters, if any were necessary) to each series. These fitted models were used to test statistical significance.[13] To ensure model validity, we examined several residual diagnostics, the Jarque–Bera and the Shapiro–Wilk tests for normality of residuals, and normal Q-Q and autocorrelation plots.[14–16]
Changes during phase 2 for physicians with low performance at the end of phase 1
In addition to the time series analyses, we conducted analyses to specifically examine changes in performance for physicians at the low end of the range within the practice. As described above, these physicians were the real target of the phase 2 intervention. We anticipated that the overall changes in performance across all physicians would be relatively small during phase 2 for two reasons: most physicians were already near the ceiling of attainable quality for many process-of-care measures (ie, near 100% for prescribing recommended medications), and, for preventive care measures, most physicians were actively using the clinical decision support tools and performance was still rising at a steady pace.
To examine changes at the physician level, we developed two composite measures, one for the seven recommended medications for patients with coronary artery disease (four measures) and/or heart failure (three measures), and one for the five preventive services. For each physician in the practice throughout the entire study period (N=31), we identified all patients eligible for each of the composite measures at the start and end of phase 2 who did not have an exception recorded. We then determined whether the medication was prescribed and determined the performance (percent satisfied). Thus, if a physician had five patients eligible for all coronary artery disease measures and four eligible for all heart failure measures, none of which had an exception documented, the physician would have 32 eligible measures and a range of possible performance from 0 to 32. Performance was reported as percent satisfied, as described above. For each of the two composite measures, we compared differences in the mean improvement in performance during phase 2 for the 15 physicians whose performance was below the median and the 15 physicians whose performance was above the median using two-sample t tests.
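The composite-measure arithmetic and the group comparison can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the pooled-variance (Student) form of the two-sample t statistic is an assumption, since the text does not say whether equal variances were assumed.

```python
from math import sqrt
from statistics import mean, variance


def eligible_measure_count(n_cad_patients, n_hf_patients,
                           cad_measures=4, hf_measures=3):
    """Count eligible patient-measure pairs for the medication composite.

    Defaults follow the text: four coronary artery disease measures and
    three heart failure measures.
    """
    return n_cad_patients * cad_measures + n_hf_patients * hf_measures


def percent_satisfied(n_satisfied, n_eligible):
    """Composite performance as a percentage of eligible measures satisfied."""
    return 100.0 * n_satisfied / n_eligible


def pooled_t_statistic(group_a, group_b):
    """Two-sample t statistic with pooled variance (assumed form)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))


# Worked example from the text: 5 CAD patients and 4 heart failure patients.
n_eligible = eligible_measure_count(5, 4)  # 5*4 + 4*3 = 32
```

In the study's comparison, `group_a` and `group_b` would hold the phase 2 improvement in composite performance for the 15 physicians below and the 15 physicians above the median, respectively.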