Our experience with this project has confirmed our belief that implementing automated guidelines is still extremely difficult—despite having started with a state-of-the-art clinical information system, garnering significant institutional commitment from the outset, employing a powerful underlying knowledge model, and starting with as ideal a guideline as possible. A number of lessons have been learned at each step of the process.
Choice of Guideline
We limited the scope of the guideline to secondary prevention to minimize complexity and to maximize consensus. First, this subset of the NCEP guidelines enjoyed significant backing by scientific evidence as well as wide acceptance by clinicians and addressed an important clinical problem. Second, data required to compute and navigate the guideline were all contained in the EMR (cholesterol levels, problem lists, and medications); in other words, interactive dialogs with the clinician to collect these data were not required. Third, the secondary prevention portion of the guideline was relatively easy to translate because the decision logic and recommendations were explicit and measurable (check cholesterol level, start drug therapy, or adjust drug therapy). In comparison, the primary prevention portion of the NCEP guidelines had less scientific support, less acceptance by clinicians, and vague logic and recommendations.
It became clear that even simple and relatively straightforward guidelines can be interpreted in different ways, depending on one’s perspective or specialty. Much effort was spent trying to achieve agreement among our experts about details of the guideline. Although initial efforts tried to put too much corrective action into the algorithm’s recommendations, the experts ultimately focused on a more pragmatic goal. This goal was simply to ensure that the basic and most important recommendations of the NCEP guidelines were being followed, not to pre-specify every medical decision related to the management of hypercholesterolemia or to replace the clinician or substitute for his or her medical education. For example, rather than recommend one particular drug (or drug class) over another (which entails factoring in highly nuanced patient-specific data that are not stored in or easily accessible from the EMR), we decided to implement the more general reminder that the patient simply qualified for pharmacologic treatment. Then, by linking to background reference information about the mechanism, effectiveness, costs, and side effects of various lipid-lowering medications, we preserved the autonomy of the clinician to make the best decision for the patient.
We were pleasantly surprised to learn that our knowledge model was not the project’s limiting step. Indeed, GLIF was easily extended, even to deal with execution modalities that were not anticipated at the start of the process, notably the ability to support different notifications and actions from the same step, depending on whether the user was currently interacting with the guideline. Others have also successfully extended GLIF in similar ways.62
One noticeable but surmountable obstacle that had as much to do with the original guideline as with the knowledge model used to encode it was conflicting or borderline data. For example, the NCEP guideline does not specify what to do if more than a single recent LDL is available. For any specific patient, a human can quickly integrate the levels over time and judge whether it is reasonable to use the lowest, highest, most recent, median, or mean value. The computer is limited to an analyst’s best a priori guess, which must then be applied to every subsequent patient.
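To make the a priori choice concrete, the following minimal sketch shows how a fixed summary policy might be applied to several recent LDL results. The function name `summarize_ldl`, the policy labels, and the sample values are hypothetical illustrations, not part of the NCEP guideline or of our implementation.

```python
from datetime import date
from statistics import mean, median


def summarize_ldl(results, policy="most_recent"):
    """Reduce several (date, LDL in mg/dL) results to one value.

    `policy` stands in for the analyst's a priori guess; whichever
    policy is chosen is then applied identically to every patient.
    """
    if not results:
        raise ValueError("no LDL results available")
    if policy == "most_recent":
        # Pick the value belonging to the latest result date.
        return max(results, key=lambda r: r[0])[1]
    reducers = {"lowest": min, "highest": max, "median": median, "mean": mean}
    if policy not in reducers:
        raise ValueError(f"unknown policy: {policy}")
    return reducers[policy]([value for _, value in results])


# Hypothetical patient with three recent LDL results (mg/dL).
results = [
    (date(1999, 1, 10), 128),
    (date(1999, 4, 2), 97),
    (date(1999, 6, 15), 104),
]
for policy in ("most_recent", "lowest", "highest", "median", "mean"):
    print(policy, summarize_ldl(results, policy))
```

For this hypothetical patient, the "lowest" policy yields a different value from every other policy, so the choice of policy alone could change which recommendation is issued.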
Although our guideline model allows different recommendations for different test results, it does not flexibly handle borderline laboratory values, such as an LDL of 102 mg/dL. The NCEP guideline itself is precise enough about this point, but clinicians in practice might deviate from the strict guideline for such a close result, rightly or wrongly.
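One way a borderline result might be represented, which our system did not do, is to place a tolerance band above the goal. The sketch below assumes an LDL goal of 100 mg/dL (suggested by the 102 mg/dL example) and an arbitrary margin of 5 mg/dL. The function name `classify_ldl` and both parameters are hypothetical.

```python
def classify_ldl(ldl_mg_dl, goal=100.0, margin=5.0):
    """Classify an LDL value against a goal with a tolerance band.

    `goal` and `margin` are illustrative assumptions. A strict
    guideline corresponds to margin=0, which removes the
    "borderline" category entirely.
    """
    if ldl_mg_dl <= goal:
        return "at_goal"
    if ldl_mg_dl <= goal + margin:
        return "borderline"
    return "above_goal"


print(classify_ldl(102))              # falls inside the assumed band
print(classify_ldl(102, margin=0.0))  # strict reading of the goal
```

A band of this kind only relocates the threshold problem to the edge of the band, so the width would itself have to be agreed on by the experts.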
We used Visio to represent the sequence of decisions and actions at a highly conceptual level, as a flowchart. This version was passed back and forth among the experts and "debugged" by hand. Because PCAPE cannot read Visio data, the flowchart representation had to be re-entered step by step into the editor, which, though powerful, was not particularly user-friendly. A simple change in the Visio flowchart, such as the insertion of a new decision step, could mean a 15-minute interaction with PCAPE.
Others have developed integrated tools that link graphically based authoring and editing of guidelines with execution engines of one kind or another.63–65 Tools of this kind, which directly translate the flowchart specification of a guideline into executable code, not only would speed development of computer-based guidelines but also would help ensure the fidelity of the translations. Without such a tool, the PCAPE version had to be debugged independently of the expert-verified flowchart. Even after extensive testing in a "live" test environment and then again in a real-world pilot clinic, some important bugs slipped by our scrutiny. These were most commonly related to issues with modeling the passage of time or with supporting synchronous interaction between clinician and computer.
Clinicians who use our EMR are quite familiar with encounter sheet-based reminders. Other encounter sheet reminders at our institution are followed 5–60% of the time (the wide variation is due to differences among the reminders that we have implemented).48
Based on how physicians interact with our EMR in the inpatient arena, we hypothesized that direct synchronous interaction with an electronic guideline would have added value in the outpatient setting as well. However, despite incentives to do so, such as access to more detailed recommendations and background information, citations of supporting references, links to patient handouts, and facilitated documentation, clinicians almost never opted to interact in real time with the guideline. Instead, they relied only on the brief reminders printed at the bottom of patient encounter sheets. This finding is consistent with the observation of McDonald et al. that physicians do not take advantage of ancillary features that require extra time and effort.66
Whether the lack of online interaction with the cholesterol algorithm reflected obstacles in using the guideline application itself or the EMR in general or whether it was a characteristic of the problem domain is not clear. The end result was that the guideline’s ability to collect data and to disseminate in-depth recommendations was limited. Indeed, without synchronous or interactive forms of messaging, it is difficult to determine whether a recommendation has been read, let alone accepted or rejected, except by using proxies such as new LDL results or changes in the medication list (which do, in fairness, reflect the intended goal).
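The proxy approach can be sketched as follows: a reminder is counted as acted on if a new LDL result or a medication-list change appears within some window after the reminder was printed. The function name `reminder_followed`, the 90-day window, and the sample dates are hypothetical assumptions chosen for illustration, not values from our evaluation.

```python
from datetime import date, timedelta


def reminder_followed(reminder_date, ldl_dates, med_change_dates, window_days=90):
    """Infer, by proxy, whether a printed reminder was acted on.

    Returns True if any LDL result date or medication-change date
    falls after the reminder and within `window_days` of it. This
    cannot show that the reminder was read, only that an action
    consistent with it occurred.
    """
    window_end = reminder_date + timedelta(days=window_days)
    events = list(ldl_dates) + list(med_change_dates)
    return any(reminder_date < event <= window_end for event in events)


# Hypothetical example: reminder printed 1 March, new LDL drawn 15 April.
print(reminder_followed(date(1999, 3, 1), [date(1999, 2, 1), date(1999, 4, 15)], []))
```

Such a proxy cannot distinguish a recommendation that was read and rejected from one that was never seen.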
Our implementation of automated guidelines also might have been more effective had it been used in conjunction with an outpatient physician order entry system. In contrast to inpatient alerts and warnings, which have been so successful at our institution,46,67,68 there was no way to facilitate the actual implementation of recommended outpatient actions, such as ordering a lipid level or prescribing a statin, because we did not have outpatient order entry. Outpatient order entry with rule-based decision support (as opposed to multistep and persistent algorithms such as ours) has been successfully implemented at other centers.60,70
Of course, order entry does not guarantee compliance with guidelines. For example, a recent study by Dexter et al. documented that one user interface model in an order entry system did not increase compliance with guidelines, whereas another user interface model did.25
We also envision additional data that can be included to make the guideline’s recommendation more meaningful. For instance, knowing details of the context of the visit (urgent, general check-up, health maintenance) can help determine the most appropriate mode of messaging. Also, additional data elements not commonly found in EMRs, such as information about modifiable risk factors (e.g., diet and exercise), may allow finer tuning of decisions and recommendations. This information could be captured with user dialogs, but, as noted above, getting physicians to provide such data is difficult. Interestingly, in our inpatient order entry system, there are many situations in which physicians enter supplemental data reliably and frequently. It may be that because entering orders is a necessary and regular part of clinical workflow in the hospital, greater interaction and user data entry have become acceptable. On the other hand, investigating a clinical algorithm—especially when the basic answer is already revealed—may be perceived as peripheral to the clinical workflow in the office, making extra interaction unnecessary and/or unacceptable.