Balance between desirable and undesirable effects
The first key determinant of the strength of a recommendation is the balance between the desirable and undesirable consequences of the alternative management strategies, on the basis of the best estimates of those consequences (table). Consider, for instance, the use of antenatal steroids in women destined to deliver an infant prematurely. High quality evidence shows that administration of steroids to mothers decreases the risk of infant respiratory distress syndrome with minimal side effects, inconvenience, and costs. The advantages of administration of steroids hugely outweigh the disadvantages, indicating the appropriateness of a strong recommendation.
Table: Determinants of strength of recommendation
When advantages and disadvantages are closely balanced, a weak recommendation becomes appropriate. Consider, for instance, patients with atrial fibrillation at low risk of stroke. Warfarin can reduce that low risk even further but adds inconvenience and an increased risk of bleeding. The right choice under such circumstances is not self evident and is likely to differ between patients.
As with all other aspects of a grading system, a tension exists between the important goal of simplicity and the danger of oversimplification. We have presented the trade-off between advantages and disadvantages as a dichotomy: clear difference versus a close call. Of course, the reality is a continuum between these extremes. Nevertheless, the forced dichotomisation allows simplification of a process that many people already find complex and may enhance the transparency of decision making.
Quality of evidence
The second factor that determines the strength of a recommendation is the quality of the evidence. If we are uncertain of the magnitude of the benefits and harms of an intervention, making a strong recommendation for or against that intervention becomes problematic. Thus, even when an apparent large gradient exists in the balance of advantages and disadvantages, guideline developers will be appropriately reluctant to offer a strong recommendation if the quality of the evidence is low.
For instance, graduated compression stockings have an apparent large effect in reducing deep venous thrombosis in people making long plane journeys. The randomised trials from which the estimate of effect comes were, however, seriously flawed—randomisation was unconcealed, the techniques for measuring deep venous thrombosis were not reproducible, and the studies were not blinded. Despite the apparent large benefit, and the only major disadvantage being inconvenience, use of stockings warrants only a weak recommendation.3
Values and preferences
The third determinant of the strength of recommendation is uncertainty about, or variability in, values and preferences. Given that alternative management strategies will always have advantages and disadvantages, and thus a trade-off will occur, how a guideline panel values benefits, risks, and inconvenience is critical to any recommendation and the strength of the recommendation. One could argue that, given the very limited study the subject has received, large uncertainty always exists about values and preferences. On the other hand, some systematic study of values and preferences has been completed, and clinicians’ experience with patients provides additional insight.
Consider, for instance, prevention of stroke in patients with atrial fibrillation. Warfarin, relative to no antithrombotic treatment, reduces the risk of stroke—in relative terms—by approximately 65%, but at an appreciable increased risk of severe gastrointestinal bleeding. Devereaux and colleagues asked 63 physicians and 61 patients how many serious gastrointestinal bleeds they would tolerate in 100 patients and still be willing to prescribe or take warfarin to prevent eight strokes (four minor and four major) in 100 patients.4
Figure 1 shows the results. Whereas physicians gave a wide diversity of responses, most patients placed a high value on avoiding a stroke and were ready to accept a bleeding risk of 22% (22 serious bleeds in 100 patients) to reduce their chances of having a stroke by 8 percentage points (eight strokes prevented in 100 patients). Even among patients, however, diversity in values and preferences was apparent; a few patients were ready to accept only a small risk of bleeding to achieve the same reduction in stroke risk. These data suggest that only in patients at high risk of stroke would a strong recommendation for warfarin be warranted.
Fig 1 Varying thresholds of major gastrointestinal bleeding found acceptable by patients and physicians for prevention of eight strokes in 100 patients
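The arithmetic behind the warfarin example can be made explicit with a short sketch. The 65% relative risk reduction, the eight strokes prevented per 100 patients, and the 22 bleeds per 100 patients come from the text above; the two baseline stroke risks and the function names are hypothetical and serve only to illustrate how absolute benefit depends on baseline risk.

```python
def strokes_prevented_per_100(baseline_risk_per_100: float,
                              relative_risk_reduction: float) -> float:
    """Absolute number of strokes prevented per 100 treated patients."""
    return baseline_risk_per_100 * relative_risk_reduction


def bleeds_tolerated_per_stroke_prevented(max_bleeds_per_100: float,
                                          prevented_per_100: float) -> float:
    """How many serious bleeds are accepted for each stroke prevented."""
    return max_bleeds_per_100 / prevented_per_100


# Figures quoted in the text.
RRR_WARFARIN = 0.65          # approximate relative risk reduction
PREVENTED_IN_SCENARIO = 8    # strokes prevented per 100 patients
PATIENT_THRESHOLD = 22       # bleeds per 100 that most patients accepted

ratio = bleeds_tolerated_per_stroke_prevented(PATIENT_THRESHOLD,
                                              PREVENTED_IN_SCENARIO)
print(f"Bleeds tolerated per stroke prevented: {ratio}")  # 22 / 8 = 2.75

# Hypothetical baseline stroke risks (per 100 patients), NOT from the study.
for label, baseline in [("low risk", 2.0), ("high risk", 12.0)]:
    prevented = strokes_prevented_per_100(baseline, RRR_WARFARIN)
    print(f"{label}: about {prevented:.1f} strokes prevented per 100 patients")
```

With a fixed relative risk reduction, the absolute number of strokes prevented shrinks as baseline risk falls, whereas the bleeding risk does not shrink with it. This is one way to see why a strong recommendation is plausible only for patients at high risk of stroke.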
Contrast this with the decision faced by pregnant women with deep venous thrombosis. Warfarin therapy between the sixth and 12th week of pregnancy puts women’s unborn infants at risk of relatively minor developmental abnormalities. The alternative, heparin, eliminates the risk to the child. This benefit, however, comes with disadvantages of pain (heparin injections), inconvenience, and cost. Nevertheless, clinicians’ experience is that women overwhelmingly place a very high value on preventing fetal complications. As a result, a strong recommendation for substitution of heparin is warranted.
Given the paucity of empirical examinations of patients’ values and preferences, well resourced guideline panels will usually have to rely on consultation with individual patients and patients’ groups to gain insight into patients’ values. Less well resourced panels must rely on their intuitive impressions of these values. In either case, when a recommendation is particularly dependent on values and preferences, panels must state the values underlying their decision. For instance, the following assumption provided the basis for recommendations in a guideline for antithrombotic treatment in pregnant women: “While we are unaware of any research specifically addressing women’s preferences regarding antithrombotic therapy in pregnancy, anecdotal evidence suggests that many, though not all women, give higher priority to the impact of any treatment on the health of their unborn baby than to effects on themselves.”5
Costs
The final determinant of the strength of a recommendation is cost. One could consider cost as one of the outcomes when weighing up the advantages and disadvantages of competing management strategies. Cost, however, is much more variable over time, geographical areas, and implications than are other outcomes. Drug costs tend to plummet when patents expire, and charges for the same drug differ widely across jurisdictions. In addition, the resource implications vary widely. For instance, a year’s prescription of the same expensive drug may pay for a single nurse’s salary in the United States, six nurses’ salaries in Poland, and 30 nurses’ salaries in China.
Thus, although higher costs will reduce the likelihood of a strong recommendation in favour of a particular intervention, the context of the recommendation will be critical. In considering resource allocation, guideline panels must thus be very specific about the setting to which a recommendation applies and the perspective that is used—that is, which costs were considered. Furthermore, recommendations that are heavily influenced by costs are likely to change over time as resource implications evolve.
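The nurse-salary comparison above is an opportunity-cost calculation, which the following sketch makes concrete. The monetary amounts are hypothetical placeholders chosen only so that the ratios match those quoted in the text (one, six, and 30 salaries); they are not real drug prices or salaries, and the function name is illustrative.

```python
# Hypothetical annual cost of the drug, in arbitrary currency units.
ANNUAL_DRUG_COST = 60_000

# Hypothetical annual nurse salaries, chosen to reproduce the quoted ratios.
NURSE_SALARY = {
    "United States": 60_000,
    "Poland": 10_000,
    "China": 2_000,
}


def nurse_salaries_forgone(drug_cost: float, nurse_salary: float) -> float:
    """Number of annual nurse salaries that one year of the drug would pay for."""
    return drug_cost / nurse_salary


opportunity_cost = {
    country: nurse_salaries_forgone(ANNUAL_DRUG_COST, salary)
    for country, salary in NURSE_SALARY.items()
}

for country, salaries in opportunity_cost.items():
    print(f"{country}: one year of the drug = {salaries:g} nurse salaries")
```

The same drug price therefore represents a very different sacrifice in each setting, which is why a panel needs to state the setting and the cost perspective alongside any recommendation that depends on cost.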