In this study of elderly veterans, we found substantial discordance between drug quality assessments made by drugs-to-avoid criteria and individualized expert assessments. Half or more of the drugs flagged by the Beers and Zhan criteria were not considered problematic upon individualized, implicit expert review. Moreover, the Beers and Zhan criteria identified only 8–15% of drugs that experts judged to be problematic. Similarly discordant results were observed at the level of the patient, with limited correlation between patients taking drugs-to-avoid medications and those with prescribing problems identified on expert review.
Our finding that drugs-to-avoid criteria detected only a small fraction of prescribing problems found on individualized expert review is not surprising. Drugs-to-avoid criteria are not intended to identify all problematic drugs, but to have high specificity and high positive predictive value – that is, to focus on a limited number of drugs for which consensus indicates that use is often (or almost always) inappropriate.1, 14
However, our findings suggest suboptimal accuracy of the Beers and Zhan criteria even for this limited goal. Half or more of the drugs identified as problematic by the Beers and Zhan criteria were not judged as problematic by the expert reviewers. Although the developers of these criteria were careful to note that there may be exceptions to the judgments rendered by their criteria, these exceptions were as common as, or more common than, the rule. These findings support the claim, frequently made by physicians, that many of the drugs included in the Beers and Zhan drugs-to-avoid criteria are appropriate in selected circumstances.3, 8
Of note, there is no single, universally accepted standard for defining prescribing problems, so we cannot definitively conclude that the drugs-to-avoid criteria were incorrect in every instance where they disagreed with individualized expert review. Nonetheless, to the extent that individualized drug review represents a careful, patient-oriented assessment in real-world clinical settings, our findings suggest that drugs-to-avoid criteria have limited ability to distinguish between drugs that do and do not pose a problem for patients.
Beyond these limitations in evaluating individual drugs, our findings suggest limited accuracy of drugs-to-avoid criteria when applied at the level of the patient (defined by the presence or absence of an offending drug on the patient’s medication list). Concordance between the Beers criteria and expert review was only slightly above that expected by chance, with the Beers criteria having almost no ability to discriminate between subjects with and without prescribing problems defined by expert review (as reflected by likelihood ratios close to 1). The Zhan criteria had a positive likelihood ratio of 2.5, somewhat better than the Beers criteria but still reflecting weak ability to distinguish between patients with and without prescribing problems identified on expert review.
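For readers less familiar with these accuracy measures, the following minimal sketch shows how sensitivity, specificity, positive predictive value, likelihood ratios, and chance-corrected agreement (Cohen's kappa) can be derived from a 2x2 table that cross-classifies patients by drugs-to-avoid status and expert review. The class name `TwoByTwo` and the counts are hypothetical and chosen only for illustration; they are not data from this study, and this is not the analysis code used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Hypothetical 2x2 table: drugs-to-avoid flag vs. expert review.

    Assumes every row and column total is non-zero; no guarding
    against division by zero is attempted in this sketch.
    """
    tp: int  # flagged by criteria, problem found on expert review
    fp: int  # flagged by criteria, no problem on expert review
    fn: int  # not flagged, problem found on expert review
    tn: int  # not flagged, no problem on expert review

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Share of flagged patients who truly have a problem.
        return self.tp / (self.tp + self.fp)

    def lr_positive(self) -> float:
        # A value near 1 means a positive flag barely shifts the odds.
        return self.sensitivity() / (1 - self.specificity())

    def lr_negative(self) -> float:
        return (1 - self.sensitivity()) / self.specificity()

    def kappa(self) -> float:
        # Cohen's kappa: observed agreement corrected for chance.
        n = self.tp + self.fp + self.fn + self.tn
        observed = (self.tp + self.tn) / n
        expected = (
            (self.tp + self.fp) * (self.tp + self.fn)
            + (self.fn + self.tn) * (self.fp + self.tn)
        ) / n**2
        return (observed - expected) / (1 - expected)


# Illustrative counts only (not study data).
table = TwoByTwo(tp=10, fp=10, fn=90, tn=90)
print(f"PPV={table.ppv():.2f}  LR+={table.lr_positive():.2f}  "
      f"LR-={table.lr_negative():.2f}  kappa={table.kappa():.2f}")
```

With these illustrative counts, half of the flagged patients have no problem on expert review, both likelihood ratios are approximately 1, and kappa is approximately 0. This is the pattern described above, in which a flag carries almost no information about whether expert review will find a prescribing problem.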
These results follow a limited body of previous work. In a small study of a homeless geriatric population, a clinical pharmacist recommended drug changes for 60% of Beers criteria drugs identified on medical record review (76% when previously discontinued drugs were excluded).26
In contrast, a study conducted in nursing homes found uneven and generally minimal changes in the use of medications from a drugs-to-avoid list after CMS implemented a policy mandating utilization review of patients taking these drugs, suggesting that most such drugs were maintained even after individualized review.7
Finally, in a previous report from the Enhanced Pharmacy Outpatient Clinic study we found low levels of inter-rater reliability between the Beers criteria and other commonly used measures of prescribing quality, including the Medication Appropriateness Index and use of ≥9 medications (one definition of “polypharmacy”).27
Notwithstanding the problems identified above, the Beers and Zhan criteria are useful when applied in a suitable context. First, these criteria may have utility for identifying prescribing problems in retrospective review of elders’ medication lists.26
This application shows promise insofar as it uses drugs-to-avoid criteria to screen drugs for individualized review, rather than using the criteria as the final arbiter of appropriateness.7
Second, drugs-to-avoid criteria may be particularly valuable when applied at the time of the prescribing decision, for example through prior physician education and/or clinical alerts integrated into electronic prescribing systems.8
By definition, many of the drugs on these lists have high rates of adverse effects and/or limited efficacy, warranting caution in prescribing. Thus, many of the Beers and Zhan criteria drugs taken by patients in our study may have been suboptimal choices at the time they were initially prescribed even if they later proved to have good efficacy and few side effects for certain patients. For example, a reviewer might caution against starting elders on diphenhydramine given its high incidence of side effects. However, if a patient with refractory pruritus had been taking diphenhydramine for one year with good symptom control and no side effects, the same reviewer would likely not recommend that the drug be stopped. As a result, the positive predictive value of the Beers and Zhan criteria may be higher when used prospectively to avoid harmful drugs, rather than retrospectively to evaluate drugs currently in use.
While there may be clinical applications of drugs-to-avoid criteria, these criteria have increasingly been used as quality measures to assess and compare prescribing quality across providers and health systems – and in this process have often been reinterpreted not as “potentially inappropriate medications” but as “definitely inappropriate medications”.3, 28, 29
Our study demonstrates substantial deficiencies when these criteria are employed for this purpose. In particular, we found that half or more of the quality “problems” identified by the criteria may in fact not have been problems. The ambiguity of quality judgments made by drugs-to-avoid criteria is further amplified when comparing care across physicians or institutions. Given that the appropriateness of these drugs may vary substantially across different clinical settings and that the number of medications a patient receives is strongly linked to the presence of Beers and Zhan criteria drugs, comparisons of prescribing quality using drugs-to-avoid criteria may be particularly challenging when patients’ clinical scenarios, level of illness burden, and medication use vary between institutions or physicians.3, 8, 18, 30
Our results should be interpreted in the context of our study design and the limitations of our measures. First, subjects were recruited from a single VA medical center, and were taking a minimum of 5 medications. Second, the expert pharmacist reviews are an imperfect measure of prescribing quality, and different experts may give different assessments of prescribing appropriateness. (Of note, although this study did not conduct dual independent ratings of appropriateness for each patient, the ultimate decisions about prescribing recommendations were made by consensus between an expert pharmacist and a physician, thus limiting the influence of any one rater on the results.) Thus, our expert reviews should not be considered a criterion standard of prescribing quality, and further studies are needed to confirm our findings in different care settings and with different expert raters. Third, the recommendations generated by the study’s expert raters reflected the individual clinical circumstances of the patient. Thus, our results should be interpreted as evaluating drugs-to-avoid criteria against real-world clinical situations, rather than against more abstract notions of appropriateness.
Measuring and improving the quality of drug prescribing in older patients is essential for increasing the overall quality of health care for the elderly population. Unfortunately, drugs-to-avoid criteria performed poorly when used as quality measures to assess the current state of a patient’s drug therapy. As a result, use of these tools to judge a physician’s quality of care and to compare performance across providers and health plans may lead to erroneous conclusions. Rather, drugs-to-avoid criteria are best used to warn physicians of potential problems prior to prescribing, and as a simple yet insensitive means to identify potentially inappropriate drugs for follow-up with individualized review.