OBJECTIVE: To critically review the existing evidence on interventions aimed at reducing errors in health care delivery.
DESIGN: Systematic review of randomized trials of behavioral, educational, informational, and management interventions relating to medical errors. Pertinent studies were identified from MEDLINE, EMBASE, the Cochrane Clinical Trials Registry, and communications with experts.
PATIENTS: Both inpatients and outpatients qualified. No age or disease restrictions were set.
MAIN OUTCOME MEASURES: Medical errors, including medication, prescription, and diagnostic errors, and excluding preventive medicine errors and simple ordering of redundant tests.
RESULTS: Thirteen randomized studies qualified for evaluation. The trials varied extensively in their patient populations (mean age, 2 weeks to 83 years), study settings, definitions of errors, and interventions. Most studies could not feasibly implement masking and rigorous allocation concealment. In 9 of 13 studies, error rates in the control arms were very high (10% to 63%), and large treatment benefits from the studied interventions were demonstrated for the main outcome. Interventions were almost always effective in a sample of 24 nonrandomized studies evaluated for comparison. Actual patient harm from serious errors was rarely recorded.
CONCLUSIONS: Medical errors were very frequent in the studies we identified, sometimes arising in more than half of the cases where there was an opportunity for error. Relatively simple interventions may achieve large reductions in error rates. Evidence on the reduction of medical errors needs to be better categorized, replicated, and tested in study designs maximizing protection from bias. Emphasis should be placed on serious errors.
Medical errors are a common cause of morbidity and mortality in a variety of health care settings.1–3 Their importance has been increasingly recognized, as reflected in a recent report by the Institute of Medicine that drew widespread public attention.4,5 It is estimated that in the United States alone, medical errors result in 44,000 to 98,000 unnecessary deaths and approximately 1 million excess injuries each year.4,6 Extremes of age, complex and/or urgent care, and prolonged hospital stay are associated with more errors, but significant errors may also occur in the outpatient setting and for patients of all ages.4,6 Despite the increasing recognition of the importance of this phenomenon, there is limited knowledge of which interventions may be used to effectively reduce the incidence and impact of errors on medical care. There is optimism that simple behavioral interventions, systems approaches, and information technology-based approaches may all alter the incidence and consequences of medical errors.5,7 However, much of the data pertaining to medical errors have been generated by epidemiologic observational studies or intervention studies with before-after comparisons. Randomized evidence has not been carefully scrutinized.
In the present systematic review, we undertook to retrieve and critically evaluate the available randomized evidence on interventions specifically aimed at reducing medical errors. Given the intricacy of defining what constitutes a medical error and the multifaceted dimensions of this phenomenon, we anticipated that this field of research would pose important study design issues. Therefore, in addition to summarizing the available evidence, the aim of this overview was to evaluate study designs and limitations in order to make recommendations for improving future research in this field. Pertinent nonrandomized studies were also evaluated briefly for comparison and to obtain complementary information.
We considered all randomized, controlled trials that examined an intervention versus placebo or no intervention and specified the aim of reducing medical errors as a primary or secondary outcome. It is conceivable that several other interventions tested in randomized trials for other reasons may incidentally reduce errors, but unless error reduction has been set as a primary or secondary endpoint, this is impossible to know with any certainty. Furthermore, an exact definition of medical errors is difficult. For the purposes of this overview, we included medication errors (including prescription, dosing, and omission errors), prescription of inappropriate/harmful diagnostic tests or omission of necessary orders/prescriptions directly related to patient safety, and misdiagnosis errors beyond the inherent limitations of applied diagnostic tests (limitations existing even when these tests are applied and interpreted appropriately). Behavioral, educational, information, and management interventions qualified, including computerized interventions. Simple ordering of redundant tests was not an eligible outcome. Studies with emphasis on patient compliance were excluded, unless emphasis was entirely on errors (not simply missed doses) made by patients or parents because of inadequate information given by health care providers. Studies evaluating only the omission of orders or actions suggested by preventive medicine guidelines were not included in this overview, because they usually constitute omission of potential benefit rather than direct harm to the patient. 
Similarly, we did not consider studies of computerized systems for altering physician behavior where the emphasis was not specifically on errors; there is an extensive literature on this topic that has been reviewed previously.8 We also excluded studies where 1) the outcome was only interrater variability in the absence of a gold standard; 2) the intervention was a different imaging or laboratory test that could improve diagnosis beyond the diagnostic technology used in the control arm (the emphasis was on missed and wrong diagnoses with existing technologies, not on the diagnostic accuracy of new diagnostic technologies); 3) trainees rather than professional staff or the patients themselves were involved; 4) fictitious or simulated cases were considered; or 5) only time to response of caregivers was assessed without actual errors being counted. We also excluded studies where imaging studies were read with different approaches (e.g., blinded vs unblinded, use of comparative images, etc.), where the focus would be on the optimal reading and interpretation of a diagnostic test. Finally, only randomized evidence was considered for the main evaluation. However, nonrandomized studies were also collected for a complementary assessment of the evidence they may provide.
The literature search was based on MEDLINE (1966–2000) and EMBASE searches. The main search was conducted in June 1999 and updated subsequently until March 2000. The search was based on the terms “medical errors,”“prescription errors,”“diagnostic errors,” and “medication errors” in conjunction with an array of terms characteristic of randomized controlled trials (e.g., randomized controlled trial, randomized clinical trial, random, placebo). We used no language restrictions. The abstracts were screened, and studies that might qualify were retrieved in full for further screening. We also screened the references of the retrieved papers and communicated with experts and colleagues. Finally, the Cochrane Clinical Trials Registry was also screened.
Data extraction was performed in duplicate and disagreements were discussed in a consensus conference. The following data were extracted: year of publication, sample size (number of patients per arm and number of opportunities for error, if the latter was different from number of patients), study setting and study population characteristics, quality components (including double blinding, allocation concealment,9 details on the mode and adequacy of randomization, and adequacy of description of withdrawals), definition of errors and whether errors were a primary or secondary outcome, number or score of errors per arm and comparative statistics. In some studies, the number of opportunities for errors was equal to the number of patients, while in others where several decision and action items might be available per patient, the opportunities for errors count might be substantially larger. For example, in a study of computer generated reminders, it is possible that many reminders may be generated for various orders pertaining to the same patient. Finally, we also collected data on whether the definition of errors used as outcomes was adequately clear and presented in sufficient detail, whether the errors were clinically serious or at least the number of clinically serious errors was mentioned, whether data were reported on the number of patients who were actually clinically harmed and on the number of deaths due to errors, and how the information on error outcomes had been collected. Errors were categorized into diagnosis errors, medication errors (including prescription, omission, and dosing errors), and other management errors.
In order to evaluate the pertinent nonrandomized evidence, a separate search was performed without the requirement for studies to be randomized. Otherwise, the same search strategy was used as for randomized studies. The search was based on MEDLINE and did not include perusal of references or communication with experts. The aim was not to ensure that every single observational study in the field would be retrieved, but to complement our assessment of the randomized evidence by also assessing nonrandomized studies in the field. We summarized key attributes of the retrieved studies, including the following: year of publication, type of design, definition of errors, definition of interventions, sample size (in patients or opportunities for errors), clarity of error definition, whether data were collected also on clinically serious errors, and whether the intervention was found to be statistically significantly effective.
Given the large heterogeneity of study designs and definitions of outcomes, we did not attempt a formal quantitative synthesis by meta-analytic methods. Instead, we expressed the results of each randomized study using the same treatment effect metrics, including risk ratios, risk differences, and numbers needed to treat, whenever applicable.10 Numbers needed to treat are based on patients or opportunities for error, depending on the study outcome definition. Selected attributes of randomized and nonrandomized studies were compared with Fisher's exact test and odds ratios (ORs) were estimated. Analyses were conducted in SPSS 10.0 (SPSS, Inc., Chicago, Ill) and P-values are 2-tailed.
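The treatment effect metrics used above can be computed directly from two error rates. The sketch below is illustrative Python using made-up rates, not figures from any particular trial in the review; it shows how risk ratio, risk difference, and number needed to treat relate to one another:

```python
from math import ceil

def effect_metrics(control_rate, intervention_rate):
    """Treatment-effect metrics from two error rates.

    Rates are errors per patient (or per opportunity for error,
    depending on the study's outcome definition).
    """
    risk_ratio = intervention_rate / control_rate
    risk_difference = control_rate - intervention_rate  # absolute reduction
    # Number needed to treat: patients (or opportunities) that must be
    # handled with the intervention to prevent one error; rounded up.
    nnt = ceil(1 / risk_difference)
    return risk_ratio, risk_difference, nnt

# Illustrative numbers only: a 40% control-arm error rate
# reduced to 25% by the intervention.
rr, rd, nnt = effect_metrics(0.40, 0.25)
print(round(rr, 3), round(rd, 2), nnt)  # 0.625 0.15 7
```

Note that when the denominator is opportunities for error rather than patients, the NNT is interpreted per opportunity (e.g., per order or prescription), not per patient.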
For the identification of randomized trials, 52 papers were retrieved and reviewed at full length. Of those, 39 were excluded as ineligible (nonrandomized design, improper randomization, or no control arm [n = 14], interrater variability outcome only and/or no medical error outcomes [n = 5], trainees only being involved and fictitious/simulated cases only being considered [n = 1], intervention consisting of new technology for drug delivery or mode of reading a diagnostic test [n = 6], focus only on redundant testing [n = 5], focus only on preventive interventions [n = 8]).
Characteristics of the 13 eligible randomized studies11–23 are summarized in Table 1. These studies addressed very diverse aspects of errors in health care delivery, ranging from medication, prescription, and dosage errors to misdiagnosis/mismanagement and lack of recognition of the seriousness of illness. Errors could be made by physicians, nurses, pharmacists, patients themselves, or (in the case of sick children) their parents. The interventions were equally diverse, including leaflets, automated systems and computerized reminders, instructive protocols, multidisciplinary approaches, algorithms, pharmacist interventions, ergonomic changes (illumination in the workplace), and performance of aspects of health care by health professionals other than physicians. Outcomes were measured either on the basis of patients or of opportunities for error. Three studies involved more than 1,000 patients, and 2 more had more than 1,000 opportunities for error. The settings were extremely diverse, offering a sample of the multifarious nature of current medical care. The mean age of the patient populations, when reported, ranged from 2 weeks to 83 years.
Only 1 randomized study scored positively on all 4 quality items. Masking was implemented in only 2 trials, and only 1 of the 2 provided detailed information on how allocation concealment was ensured. Masking and allocation concealment were usually impractical or even impossible. All studies offered appropriate details on their randomization procedures. The unit of randomization could be the patient (or parent of a sick child), the health care provider, the ward, or even a time unit (in one study, 3 different levels of illumination were randomized to be applied in the workplace for 7 random days each in a sample of 21 days). Ten of the 13 studies offered adequate data on withdrawals and exclusions.
As shown in Table 2, the definition of errors used as outcomes was adequately clear, with the exception of one study, which simply stated that a serious error is one with a potential health risk, without further details. Four studies targeted diagnostic errors, 9 studies targeted medication errors, and 3 also targeted other management errors. In 10 of the 13 studies, the targeted errors were not necessarily clinically serious, and in most cases they were probably not serious at all. Only 2 studies17,19 clearly reported clinically significant harm to the patients; no study recorded any information on deaths. It is unclear whether any deaths due to errors occurred at all in any of the 13 studies, although 17 errors were coded as life-threatening in one study.
The data collection methods varied across protocols. Several studies used chart reviews or benefited from the electronic medical records they had set up as part of the intervention. Direct observation of health care delivery by an independent investigator was also commonly used.
The main results of the 13 randomized trials are summarized in Table 3. In 9 of the 13 studies, the interventions were found to be effective in reducing error rates. However, even in the 4 cases where the randomized intervention did not reduce error rates significantly, the conclusion of the study authors was (at least in part) favorable for the tested intervention. In one study,16 the intervention (a structured protocol for assessing the need for obtaining a radiograph after lower extremity injury) actually increased the error rate; however, it significantly decreased the emergency department waiting time. In another study, investigators demonstrated that nurse practitioners did not make more errors than junior doctors in evaluating minor injuries in the emergency department; this was an equivalence rather than superiority design.23 In a third study without statistical significance,18 errors were not the primary endpoint of the study; the intervention was superior to control in terms of interrater concordance of the parent with the pediatrician in recognizing the child's severity of illness. Finally, one study19 found no additional benefit from adding a team approach to a computerized physician order entry system for reducing medication errors, but in a before-after comparison (in the same report), the computerized physician order entry system already seemed to provide a large benefit.
In 9 of the 13 randomized studies, the error rates in the control arms of the studies were remarkably high (10% to 63.3%). Given the high control rate and significant treatment effects, sizable risk differences and attractive estimates of numbers needed to treat were calculated in most of the studies (Table 3).
A total of 24 nonrandomized studies were also retrieved and evaluated.14,19,24–45 Their characteristics are summarized in Table 4. They included 18 before-after comparisons (comparisons of different time periods) and 6 studies with concurrent controls, of which 2 were pseudo-randomized (patients were rotated through the different arms,38 or a random sample was selected for evaluation from the experimental arm after sequential allocation).42 Thirteen of the 24 studies had been published before 1990 (vs only 2 of the 13 randomized trials). The studies usually dealt with medication errors, but some also addressed other management errors. Interventions and sample sizes varied widely and are summarized in Table 4. Five studies did not even specify their sample size (typically, only error rates were reported). Five studies had unclear definitions of what counted as an error. Only 4 studies gave some data on clinically serious errors, but even these did not have serious errors as the main study outcome. Finally, the intervention was almost ubiquitously found to be effective. There were only 3 exceptions: in one study,34 the authors described 14 different interventions but claimed that only 3 were actually implemented; the failure of implementation was felt to be responsible for the lack of effect. One other exception occurred in a pseudorandomized study42 that did not demonstrate any improvement in errors reflecting serious drug interactions by using a computerized medication profile; the third exception occurred in a study evaluating an automated medication cart-filing system.43
In the comparison of randomized versus nonrandomized studies, nonrandomized reports were more likely to have been published before 1990 (OR, 6.5; 95% confidence interval [95% CI], 1.2 to 36; P = .035). As compared to nonrandomized reports, those of randomized design were more likely to report data on clinically serious errors (OR, 3.1; 95% CI, 0.7 to 14.7; P = .229), to find that the intervention was not effective in reducing errors (OR, 3.1; 95% CI, 0.7 to 15; P = .213), and to specify the sample size (OR, undefined; P = .140), but these associations did not reach statistical significance.
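The sample odds ratio and Fisher's exact test for such 2x2 comparisons can be reproduced from the counts given in the text (13 of 24 nonrandomized studies vs 2 of 13 randomized trials published before 1990). Below is a minimal stdlib Python sketch implementing the standard two-sided convention (summing the probabilities of all tables, with the same margins, that are no more probable than the observed one); the helper name is ours, not from any statistical package:

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (sample_odds_ratio, two_sided_p).
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def p_table(k):
        # Hypergeometric probability of k in the top-left cell,
        # with all margins held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = p_table(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum tables no more likely than the observed one; the small
    # tolerance guards against floating-point ties.
    p_value = sum(p_table(k) for k in range(k_min, k_max + 1)
                  if p_table(k) <= p_obs * (1 + 1e-9))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value

# Counts from the text: published before/after 1990,
# nonrandomized (13/11) vs randomized (2/11).
or_, p = fisher_exact_2x2(13, 11, 2, 11)
print(round(or_, 1), round(p, 3))  # 6.5 0.035
```

This reproduces the OR of 6.5 and P = .035 reported above; the confidence interval would additionally require a variance estimate for the log odds ratio.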
Medical errors have recently drawn wide publicity, and the identification of simple measures to reduce their impact has been considered a public priority.4 This overview shows the complexity involved in studying interventions aimed at minimizing errors in health care delivery. The definition of medical errors is difficult to set within clear boundaries. Medical errors are multifaceted and may involve the diagnosis, management, ordering, carrying out of orders, prescriptions, and other medical care by physicians, nurses, pharmacists, other staff, the patients themselves, or the parents and custodians responsible for their care. The heterogeneity of the studies retrieved in this overview shows these diverse components in the fabric of health care where errors may arise.
In this overview, we used a very strict definition of errors with many exclusion criteria in an effort to bring the topic into sharp focus. We acknowledge that we may have missed some studies, which are sometimes difficult to even code for database purposes. Other studies, such as those assessing interventions to reduce the omission of preventive measures, or even those assessing interventions to reduce redundant test ordering or improve the behavior and reaction time of physicians, may be perceived as interventions aiming to reduce errors in a broader sense. There is an even more extensive literature on these topics, some of which has already been reviewed.8
Given the observed high frequency of errors, there is a dearth of randomized trial data about ways to reduce medical errors. Even with a less comprehensive search, we were able to identify a larger amount of nonrandomized evidence. Compared with the randomized trials, the observational studies were older, although several recent ones were also identified; and they had an equally high, if not even higher, rate of positive findings, raising concerns about publication bias and selection biases in targeting study settings and patient populations. In many cases, it is conceivable that when large improvements are seen in nonrandomized studies, randomized trials may never be performed, such as in the case of unit dose distribution systems or, more recently, several computerized interventions. Randomized trials may not always be applicable to all settings and questions46 and observational studies may not always square with the results of randomized trials.47 Observational studies have the advantage of studying effectiveness in real practice settings, while randomized trials may be prone to study efficacy under more controlled circumstances with less generalizability. Observational evidence may yield useful insights and should not be discarded.
However, there are several cases where randomization would have been easy to implement without affecting the generalizability or the difficulty of conducting a study. Studies using concurrent controls usually are easy to transform into randomized designs. Finally, before-after comparisons are easier to implement than randomized trials, but one should caution that the use of historical controls may sometimes inflate the magnitude of the treatment effect. Before-after comparisons may be particularly biased if they extend over a long period of time during which several other changes and interventions that may affect the error rate are implemented (either systematically or sporadically) in the health system where the study is conducted. Such serendipitous changes may result in large reductions of the error rate, and they may often be unknown or difficult to model and adjust for appropriately. For example, a Hawthorne effect may be observed where physicians make fewer errors over time, especially after a computerized system has been installed, regardless of the merits of the system itself.
Many other important studies in the field were excluded, because they had no control comparison at all. For example, one otherwise well-designed study implemented a computer-alert system to prevent injury from adverse drug events.48 The outcome was the number of true-positive alerts. Measuring such parameters in uncontrolled studies is useful for assessing the performance of a system. However, comparative outcomes (vs no intervention or a standard intervention) on hard endpoints, e.g., deaths or serious morbidity due to iatrogenic errors, would prove indispensable. Such outcomes need to be studied with the gold standard of randomized trials, especially if the proposed interventions are not devoid of cost.
The heterogeneity of the retrieved studies precludes any attempt at their quantitative synthesis. However, it is interesting that the majority of both randomized and nonrandomized studies give positive results for reducing medical errors, and practically all of them show that some or all of the interventions under study are worth adopting, based on the results for one or more outcomes. The magnitude of the reported treatment effects is often very large. This probably means that there is large room for improvement in controlling medical errors, often with relatively simple interventions. Regardless of the magnitude of the treatment effects, the error rates in the control groups are also remarkably high, ranging from 10% to 63% in 9 of the 13 studies. These high rates probably constitute strong evidence of the high incidence of medical errors.
Alternatively, it is also possible that biases may be operating in this field of research. Such biases could stem either from publication lag and publication bias49,50 or from the difficulty of assessing outcomes objectively in studies where masking and allocation concealment are difficult to achieve.9 Theoretically, this is a field of research where a study with negative results (e.g., when a behavioral or computerized intervention does not affect the error rate) may be difficult to publish. However, given that the large majority of studies, both randomized and nonrandomized, yield positive results, the evidence published to date gives limited insight into why error prevention interventions may sometimes fail rather than succeed. Studies explicitly designed to study this aspect should be encouraged. The available data suggest that even with the large error reductions achieved, error rates remain high in the intervention groups. Reasons for these failures require better study.
Some other methodologic limitations of the available randomized evidence on medical errors need to be discussed. First, many of the tested interventions may be difficult to generalize, and/or their effectiveness may depend on familiarity with and training in their principles. Studies in acute hospitals were sometimes conducted in renowned tertiary care, university-affiliated medical centers in the United States; it is unknown whether the same favorable results would be obtained in other hospitals or in other parts of the world. Furthermore, many of the prior trials have dealt with specialized populations such as children and their parents, the elderly, acute care, or specific outpatient settings. Their findings may not be generalizable to different populations. In addition, many involved cognitive interventions (reminders, educational tools, etc.), and their effectiveness may vary depending on the rigor and intensity of their application.
Some studies used less than optimal randomization methods, in ways that the allocation of the patients to each of the compared arms could have been manipulated. Almost all of the included studies were unmasked, although this was usually fully justified given the nature of the studied observations. The unit of randomization in these studies must be carefully selected. In some cases, using the patient as the unit of randomization may be inappropriate: skills acquired while managing intervention patients may be applied to the care of control patients.
These limitations being acknowledged, clearly all the identified pertinent studies required substantial ingenuity for their design, since studying medical errors requires innovation both in the way interventions are applied and in the way outcomes are assessed. Several of the studies included more than 1,000 patients, and the results are probably valid. However, before the proposed interventions are widely adopted, their cost-effectiveness and effects on quality of life should also be considered alongside their effects on error counts. Variability across the trials may make generalization difficult. While it is easy to agree that preventing errors is desirable, specific interventions must be evaluated on their own merits with respect to effectiveness and cost-effectiveness in specific health care settings.
Finally, not all errors are equal, and their impact may vary tremendously. For example, although medication errors are strongly linked to adverse drug events,51 most medication or prescription errors will not result in any harm. Many studies on medical errors have given limited information on the seriousness of the captured errors. Even when outcomes were defined to include only "serious" errors, the seriousness may have varied substantially from study to study, and even within the same study, different "serious" errors may have had different consequences for patients. We found a dearth of data pertaining to actual clinical harm to the patient, and no study focused on patient deaths. Many studies assessed parameters that almost certainly would not have resulted in death. Although less serious errors that affect many patients and that may expend extensive resources are definitely worth studying, we believe that studies incorporating hard patient outcomes, such as iatrogenic mortality and serious iatrogenic illness, should be encouraged.
In order to perform such studies, large numbers of patients would have to be included, since the event rates are likely to be low. Given the fact that errors occur at various steps in the sequence of medical care and could involve various care providers (e.g., physicians, pharmacists, nurses) or even the patients themselves, observational studies may be used to provide insight on which steps are more error-prone. These steps could then be the target for probing the efficacy of interventions by randomized trials. Still, in order to design such studies, we have to first recognize that medical errors, including serious ones, should be subjected to study from the perspective of a systematic approach and become the target of large experimental investigation, rather than be silenced for legal purposes.
We thank Ms. Priscilla Chew for her invaluable help in the literature search and Ms. Marian Perez for the retrieval of articles.