A total of 808 articles were initially retrieved in the PubMed search, of which 734 met the inclusion criteria. Figure shows the flow chart of this literature review. Of the eligible articles, 62 (8.4%) reported a number needed to treat (Table ). One article used the method proposed by Lubsen et al. [15] but described the results as "number of patient years of treatment to save one life" and not as "NNT" or "number of patients ...". Thus, this article was classified as a non-NNT-reporting article. The 62 NNT-reporting articles had a median sample size of 553 (range 47 to 12639). Furthermore, 56 of the 62 (90.3%) articles calculated the number needed to treat for the primary endpoint, 5 (8.1%) for primary and secondary endpoints, and 1 (1.6%) only for a secondary outcome. The distribution of the 734 articles across the four journals considered (BMJ, JAMA, Lancet, and NEJM) was 90, 199, 190, and 255, respectively (Table ). As the results indicated no trend over the three years, we do not show the results for the single years. NEJM published the largest number of RCTs but had the lowest use of NNTs (19 of 255 articles, 7.5%), whereas the BMJ, with the smallest number of RCTs, had the highest use of NNTs (13 of 90 articles, 14.4%). The BMJ was also the journal with the largest percentage of NNT-reporting articles presenting confidence intervals for NNT estimates (7 of 13, 53.8%).
Reporting of the number needed to treat (NNT) and corresponding 95% confidence interval (CI) in randomised controlled trials (RCTs) in leading medical journals in the years 2003–2005
Time-to-event outcomes were investigated in 373 (51%) articles; the other 361 articles used binary outcomes. In 3 articles, survival techniques as well as 2 × 2 tables were used for data analysis. The use of both methods was adequate in these articles because the follow-up time was equal for all patients and no censoring occurred. As the NNTs were calculated on the basis of 2 × 2 tables, these articles were classified as RCTs using binary outcomes. Of the 62 articles reporting NNTs, 34 presented time-to-event outcomes and 28 presented binary outcomes. Of the 34 NNT-reporting articles with time-to-event outcomes, only 17 (50%) applied an appropriate calculation method (Table ). In all these articles, either the NNT calculation was clearly based on survival probabilities estimated from the Kaplan-Meier survival curve or the Cox regression model, or the reported NNT equalled our recalculated NNT. In the remaining 17 (50%) of the 34 NNT-reporting articles with time-to-event outcomes, the calculation was seemingly based on naive proportions (rates from 2 × 2 tables). This approach neglects varying follow-up times and censoring and was therefore classified as inappropriate. Where possible, we recalculated the NNT based upon estimated survival probabilities. In Table the published and recalculated NNTs of the 17 articles with 95% confidence intervals (if recalculation was possible) and the corresponding absolute differences are summarized. A table providing some details (citation, experimental and control intervention, outcomes, sample size, follow-up time, published NNT, and corresponding 95% confidence interval) of the 34 NNT-reporting articles with time-to-event outcomes is given as Additional file 1.
Reporting of the number needed to treat (NNT) and corresponding 95% confidence interval (CI) in randomised controlled trials (RCTs) with time-to-event outcomes in leading medical journals in the years 2003–2005
Reported and recalculated NNTs with 95% confidence intervals (CIs) from 17 studies using inappropriate methods to calculate NNTs for time-to-event data
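The distinction between naive proportions and estimated survival probabilities can be illustrated with a minimal sketch. The data below are entirely hypothetical toy values (they are not taken from any of the reviewed articles), and the helper names `km_risk`, `naive_risk` and `nnt` are introduced here only for illustration. The sketch computes the Kaplan-Meier risk estimate at a fixed time point for each group and contrasts the resulting NNT with the one obtained from naive proportions, which ignore censoring.

```python
def km_risk(times, events, t):
    """Kaplan-Meier estimate of the cumulative risk 1 - S(t).

    times  : follow-up time of each patient
    events : 1 if the patient had the event at that time, 0 if censored
    t      : time point at which the risk is evaluated
    """
    survival = 1.0
    # Distinct times up to t at which at least one event occurred
    event_times = sorted({u for u, e in zip(times, events) if e == 1 and u <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)  # still under observation at u
        n_events = sum(1 for x, e in zip(times, events) if x == u and e == 1)
        survival *= 1.0 - n_events / at_risk
    return 1.0 - survival


def naive_risk(events):
    """Naive proportion: events divided by patients, ignoring follow-up time."""
    return sum(events) / len(events)


def nnt(risk_control, risk_treated):
    """NNT as the reciprocal of the absolute risk reduction."""
    return 1.0 / (risk_control - risk_treated)


# Hypothetical toy data (times in years); several patients are censored early.
control_times = [1, 2, 2, 3, 4, 4, 5, 5, 5, 5]
control_events = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]
treated_times = [2, 3, 3, 4, 5, 5, 5, 5, 5, 5]
treated_events = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

t = 5
km_control = km_risk(control_times, control_events, t)
km_treated = km_risk(treated_times, treated_events, t)
nnt_km = nnt(km_control, km_treated)
nnt_naive = nnt(naive_risk(control_events), naive_risk(treated_events))

print(f"NNT from Kaplan-Meier risks at t={t}: {nnt_km:.2f}")
print(f"NNT from naive proportions:          {nnt_naive:.2f}")
```

With censoring present, the two NNT values differ; the size and direction of the discrepancy depend on the censoring pattern and on the chosen time point.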
To explain the methods of our calculations we present one typical example. One study provided the information "The number needed to treat to prevent 1 cardiovascular event would be 40 patients with IGT over 3.3 years". Additionally, the naive proportions of patients experiencing an event were given as 32/686 in the placebo group and 15/682 in the intervention group. Obviously, the result of NNT = 40 is based upon these naive proportions, because 1/(32/686 - 15/682) ≈ 1/0.025 = 40. However, due to varying follow-up times and censoring, the naive proportions are not valid estimates of the corresponding risks at the time point 3.3 years, which is only the mean follow-up time. An adequate approach to estimate the required risks for a specified time point is given by the Kaplan-Meier method.
We enlarged the Kaplan-Meier incidence curve given in the paper and visually determined the corresponding risk estimates at the time point 1200 days (approximately 3.3 years) as accurately as possible. We found the risk values 0.0410 and 0.0235 for the placebo and the intervention group, respectively. Thus, the recalculated NNT is given by 1/(0.0410 - 0.0235) = 1/0.0175 = 57.1, and the reported NNT of 40 is about 30% too low.
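The arithmetic of this example can be retraced with a short sketch. All numbers are the ones quoted in the text; the function name `nnt` and the variable names are illustrative assumptions only.

```python
def nnt(risk_control, risk_treated):
    """NNT as the reciprocal of the absolute risk reduction."""
    return 1.0 / (risk_control - risk_treated)


# Naive proportions reported in the study (placebo vs. intervention)
naive_nnt = nnt(32 / 686, 15 / 682)

# Risks read off the Kaplan-Meier curve at 1200 days
km_nnt = nnt(0.0410, 0.0235)

# Published value and its shortfall relative to the recalculated NNT
reported_nnt = 40
relative_shortfall = (km_nnt - reported_nnt) / km_nnt

print(f"NNT from naive proportions:  {naive_nnt:.1f}")
print(f"NNT from Kaplan-Meier risks: {km_nnt:.1f}")
print(f"Reported NNT too low by:     {relative_shortfall:.0%}")
```

The unrounded naive risk difference gives a value slightly above 40; the published figure of 40 corresponds to rounding the difference to 0.025 before taking the reciprocal.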
In the 62 NNT-reporting articles, corresponding confidence intervals were presented in 21 studies (6 of the 34 studies with time-to-event outcomes and 15 of the 28 studies with binary outcomes). Among the 62 NNT-reporting articles, 1 article used the term "number needed to screen" (NNS), 2 articles used the terms "number needed to treat for one patient to benefit" (NNTB) and "number needed to treat for one patient to be harmed" (NNTH), and 1 article used the term "number needed to harm" (NNH).
The absolute risk reduction was given in 33 (53.2%) of the 62 NNT-reporting articles (17 with time-to-event data and 16 with binary data). A corresponding confidence interval for the absolute risk reduction was given in 21 (63.6%) of these 33 articles (7 with time-to-event data and 14 with binary data).
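To make the link between the absolute risk reduction (ARR), its confidence interval and the interval for the NNT concrete, the following is a minimal sketch for a binary outcome. It assumes a simple Wald interval for the ARR and obtains the NNT interval by inverting the ARR limits. This is one commonly described approach and not necessarily the method used in the reviewed articles. The event counts are hypothetical, and the function name `arr_and_nnt_with_ci` is introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def arr_and_nnt_with_ci(events_c, n_c, events_t, n_t, level=0.95):
    """ARR and NNT with Wald-type confidence intervals for a 2 x 2 table.

    Returns (arr, arr_ci, nnt, nnt_ci). nnt_ci is None when the ARR interval
    includes zero, because the inverted limits then no longer form a single
    bounded interval.
    """
    p_c = events_c / n_c  # risk in the control group
    p_t = events_t / n_t  # risk in the treatment group
    arr = p_c - p_t
    se = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    arr_ci = (arr - z * se, arr + z * se)
    nnt = 1.0 / arr
    if arr_ci[0] > 0:
        # Reciprocals reverse the order of the limits
        nnt_ci = (1.0 / arr_ci[1], 1.0 / arr_ci[0])
    else:
        nnt_ci = None
    return arr, arr_ci, nnt, nnt_ci


# Hypothetical counts: 60/300 events under control, 30/300 under treatment
arr, arr_ci, nnt_point, nnt_ci = arr_and_nnt_with_ci(60, 300, 30, 300)
print(f"ARR = {arr:.3f}, 95% CI {arr_ci[0]:.3f} to {arr_ci[1]:.3f}")
print(f"NNT = {nnt_point:.1f}, 95% CI {nnt_ci[0]:.1f} to {nnt_ci[1]:.1f}")
```

This sketch applies to binary outcomes with equal follow-up only. For time-to-event data, the risks and their standard errors would have to come from survival estimates rather than from the raw proportions.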