Although some have suggested that quantitative treatment benefit information may be best interpreted when it is presented as NNT,2,3 our study suggests just the opposite. Patients had more difficulty interpreting written treatment benefit information when it was presented as NNT. This effect was evident whether patients were comparing the benefits of two treatments or calculating the exact effect of a treatment on a given baseline risk of disease. The difficulty was magnified in patients with lower numeracy levels.
Patients' difficulties with NNT could perhaps have been predicted from the results of several studies examining the perception of risk in both rate (X in 1,000) and proportion (1 in X) formats.8–10 All of these studies found that patients had more difficulty with the “1 in X” scale, perhaps because larger risks are represented by smaller numbers in the denominator. Because NNT is essentially a “1 in X” scale, reporting the number of people who must be treated for 1 person to benefit, it is not surprising that patients would have difficulty comparing and calculating treatment benefits presented as NNT.
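The inversion on the “1 in X” scale can be illustrated with a minimal sketch. The function name and the example risks below are hypothetical and are not taken from the study; the sketch only converts “1 in X” values to “X in 1,000” rates to show that a smaller denominator corresponds to a larger risk.

```python
def per_thousand(one_in_x: float) -> float:
    """Convert a "1 in X" risk into an "X in 1,000" rate."""
    if one_in_x <= 0:
        raise ValueError("X must be positive")
    return 1000 / one_in_x


# Hypothetical risks for illustration only (not study data).
denominators = [20, 50, 200]
rates = [per_thousand(x) for x in denominators]

for x, rate in zip(denominators, rates):
    print(f"1 in {x} is the same risk as {rate:g} in 1,000")
```

Read in the “1 in X” format, the denominators rise from 20 to 200 while the risk they describe falls, which is the reversal a reader has to keep in mind when comparing NNT values.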
This study suggests that written information on treatment benefits is better understood when it is presented as ARR or RRR in the context of a given baseline risk of disease. When comparing the effectiveness of 2 treatments, both the ARR and RRR require equivalent and straightforward tasks: the patient must choose the treatment with the larger risk reduction. When calculating the effect of a given treatment on a baseline risk, the ARR requires the simplest task: subtraction. The RRR presentation requires a proportional calculation, but it is a very familiar one, akin to figuring out how much money would be saved during a sale at the store. Familiarity may serve to smooth differences attributable to format. The effect of the combination presentation is more difficult to characterize. It is as easily interpretable as the ARR and RRR presentations when patients are asked to compare the effectiveness of 2 treatments, but performs no better than the NNT presentation when patients are asked to calculate the effect of a treatment on a given baseline risk of disease. This result is harder to explain, but may be due to information overload or the more difficult construct of the presentation format. Regardless, patients' poor performance with the combination presentation when calculating the effect of treatment on a given risk of disease makes this format a less desirable one for clinicians. More information is not necessarily better.
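The calculation task described above can be sketched for each format. This is an illustrative sketch under stated assumptions: the function names and the numbers (a 10% baseline risk, an ARR of 2 percentage points, an RRR of 20%, an NNT of 50) are hypothetical and do not come from the survey instrument. The NNT case assumes the standard relationship ARR = 1/NNT.

```python
def risk_after_arr(baseline: float, arr: float) -> float:
    """ARR format: subtract the absolute risk reduction from the baseline."""
    return baseline - arr


def risk_after_rrr(baseline: float, rrr: float) -> float:
    """RRR format: reduce the baseline proportionally, like a sale discount."""
    return baseline * (1 - rrr)


def risk_after_nnt(baseline: float, nnt: float) -> float:
    """NNT format: first invert NNT to recover the ARR, then subtract."""
    if nnt <= 0:
        raise ValueError("NNT must be positive")
    return baseline - 1 / nnt


# Hypothetical example values (not study data); all describe the same treatment.
baseline = 0.10
print(f"ARR 2 points: {risk_after_arr(baseline, 0.02):.2f}")
print(f"RRR 20%:      {risk_after_rrr(baseline, 0.20):.2f}")
print(f"NNT 50:       {risk_after_nnt(baseline, 50):.2f}")
```

All three calls describe the same treatment effect, but only the NNT version requires an inversion before the subtraction, which is consistent with the extra difficulty patients showed with that format.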
Because even patients who received the “simplest” risk presentation formats had difficulty comparing and calculating treatment benefit information, this study again raises questions about whether patients can independently make informed medical decisions using written quantitative information. We did note that patients who had a recent medical discussion with their doctor, or who reported receiving at least some quantitative information from their doctor, interpreted treatment benefits correctly more often than patients who did not report these interactions. This finding may indicate that patients who are more educated, or have better numeracy skills, are more likely to receive quantitative information from their physicians. Alternatively, it may indicate that patients can learn the skills needed to interpret quantitative presentations of treatment benefits through discussions with their doctor. We are aware of at least one effort to prepare patients to better interpret quantitative information on treatment benefit through a computerized risk tutorial.11
Additionally, several researchers continue to explore more accessible ways to present quantitative information. A recent comparison of graphical and numerical presentations of treatment benefit, however, showed that numbers were interpreted as well as (in comparison tasks) or better than (in calculation tasks) graphical presentations.12 Thus, a continued effort to improve patient interpretation of numerical treatment benefits is indicated.
The Evidence-Based Medicine Working Group has recently proposed presenting patients with the Likelihood of Being Helped Versus Harmed as a means of communicating treatment benefit.13 This treatment benefit presentation format incorporates an individual patient's values with the number of patients who need to be treated for a benefit to be realized in 1 and the number of patients who need to be treated for a harm to be realized in 1, to report that a patient is x times more likely to be helped than harmed. The Likelihood of Being Helped Versus Harmed format avoids the problematic “1 in x” scale of the Number Needed to Treat and obviates the need for a patient to calculate the exact effect of a treatment on a given baseline risk of disease, because the goal of this calculation is to facilitate the weighing of harms and benefits. This presentation format therefore deserves further testing. Until this and other innovations in the quantitative presentation of treatment benefits are tested, our study supports presenting treatment benefit information to patients as ARR or RRR, when a baseline risk of disease is available, and verifying patient understanding.
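The arithmetic behind a statement of the form “x times more likely to be helped than harmed” can be sketched as the ratio of the chance of benefit (1/NNT) to the chance of harm (1/NNH). The sketch below is a simplification, not the Working Group's published procedure: the function name, the `harm_weight` parameter (a stand-in for how a patient values the harm relative to the outcome prevented), and the example numbers are all assumptions made for illustration.

```python
def likelihood_helped_vs_harmed(nnt: float, nnh: float, harm_weight: float = 1.0) -> float:
    """Ratio of the chance of benefit (1/NNT) to the weighted chance of harm.

    harm_weight is a hypothetical value adjustment: 1.0 means the patient
    rates the harm as equally bad as the outcome prevented, and 0.5 means
    the harm is rated half as bad.
    """
    if nnt <= 0 or nnh <= 0 or harm_weight <= 0:
        raise ValueError("nnt, nnh, and harm_weight must be positive")
    # (1 / nnt) / (harm_weight / nnh), rearranged to avoid dividing twice.
    return nnh / (nnt * harm_weight)


# Hypothetical numbers for illustration only (not study data).
print(likelihood_helped_vs_harmed(nnt=20, nnh=100))                   # 5.0
print(likelihood_helped_vs_harmed(nnt=20, nnh=100, harm_weight=0.5))  # 10.0
```

The patient is shown only the final ratio, so neither the “1 in x” inversion nor the baseline-risk calculation has to be performed by the reader.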
Our study does have several potential limitations. First, written information on treatment benefits was presented out of context in this study, reducing patients' personal involvement in the tasks measuring perception of treatment benefit. Previous research has shown that the degree of issue involvement influences how a patient processes information:14 those who are highly involved process information in a detailed and integrative way, whereas those who are less involved process information superficially. Involving patients in actual treatment decisions would be expected to increase their processing of quantitative information, although these effects may be diminished by the burden of acute illness, which may make patients less able to process complex information. Regardless, we expect that the out-of-context presentation would affect all risk presentation format groups equally.
Second, we asked subjects to interpret individual benefit from NNT. This is not the task for which NNT was proposed. This is, however, the task implied by those who claim that NNT is easily understood by patients; patients are intrinsically interested in how an intervention affects them, not how an intervention affects the population from which probabilistic information was derived.
Third, ARR, RRR, and NNT can be worded in many different ways. Whether alternate wording of the presentation of treatment benefits would produce different results has not been tested. In our study, the readability of treatment benefits varied by risk reduction format (Flesch-Kincaid grade levels 5.8 [RRR], 8.3 [ARR], 11.5 [NNT], and 10.9 [COMBO]), despite the fact that the readability of the entire presentation regarding treatment benefits was similar (Flesch-Kincaid grade levels from 10.3 to 11.8). It may be possible to word treatment benefit presentations so that they differ less in reading grade level. Future research will help us determine what proportion of the difference in patient understanding is due to differences in the readability of the presentations versus differences inherent to the concepts themselves.
Fourth, we did not measure literacy. Inability to understand the written presentations of treatment benefit and the written questions (Flesch-Kincaid grade level 8) could have accounted for some of our findings. Whether presenting the information orally would change results should be investigated in future studies.
Fifth, subjects in this study had no opportunity to ask questions about treatment benefits. It is possible that simple clarification from a physician would have significantly improved patient understanding of some of these risk reduction formats.
Sixth, we used between-subject comparisons rather than within-subject comparisons of the risk reduction formats to reduce the length of the survey, increase the feasibility of survey administration in our clinic setting, and reduce “training” effects. Although within-subject comparisons have the advantage of allowing each subject to act as his or her own control, we were able to study large numbers of patients to minimize the effect of between-subject variation. Additionally, the findings of three within-group comparisons15–17 and three between-group comparisons18–20 of the persuasiveness of alternate risk reduction formats in physicians have shown a high degree of consistency.
Seventh, our results may not be generalizable to patients in other age groups. Older adults are more likely than middle-aged and younger adults to demonstrate limited numeracy skills.21
Finally, the nonconsecutive nature of our sample may also affect generalizability. When two eligible patients presented to the clinic at the same time, our sole research assistant could approach only one. We are not aware of appreciable differences between the patients who were approached to participate in the study and those who were not, but we made no formal attempt to monitor differences. Similarly, we do not have information on the patients who refused to participate in our study. We suspect, however, that people who did not participate were less confident in their quantitative abilities than those who participated.
Despite these limitations, this study provides important information to clinicians who wish to help their patients make informed decisions: many patients have poor numeracy skills; patients have difficulty interpreting quantitative information; and NNT, and sometimes combination presentations, are interpreted less successfully than ARR or RRR.
To address patients' limited ability to use quantitative information, clinicians may, in the short term, want to use written quantitative information only with patients with higher numeracy skills; present information that uses comparison, not calculation; present risk reduction information as ARR or RRR rather than as NNT or a combination presentation; and verify patient understanding after the presentation of treatment benefit information.
In the longer term, however, we believe researchers should continue to explore the robustness of current observations on ARR, RRR, and NNT in different populations and in different risk reduction scenarios, with alternate wording and different combinations of ARR, RRR, and NNT. Both clinicians and researchers should also work to improve patient understanding of quantitative information through exploring new presentation formats and developing patient tutorials on how to interpret quantitative health information.