Measures of a therapy’s effect size are important guides to clinicians, patients, and policy-makers on treatment decisions in clinical practice. The ECASS 3 trial demonstrated a statistically significant benefit of intravenous TPA for acute cerebral ischemia in the 3–4.5 hour window, but an effect size estimate incorporating benefit and harm across all levels of poststroke disability has not previously been derived.
Joint outcome table specification was employed to derive number needed to treat to benefit (NNTB) and number needed to treat to harm (NNTH) values summarizing treatment impact over the entire outcome range on the modified Rankin Scale of global disability, including both expert-dependent and expert-independent (algorithmic and repeated random sampling) array generation.
For the full 7-category mRS, algorithmic analysis demonstrated that the NNTB for 1 additional patient to have a better outcome by 1 or more grades than with placebo must lie between 4.0 and 13.0. In bootstrap simulations, the mean NNTB was 7.1. Expert joint outcome table analyses indicated that the NNTB for improved final outcome was 6.1 (95%CI 5.6–6.7) and the NNTH 37.5 (95%CI 34.6–40.5). Benefit per hundred patients treated was 16.3 and harm per hundred 2.7. The likelihood of help to harm (LHH) ratio was 6.0.
Treatment with TPA in the 3–4.5 hour window confers benefit on about half as many patients as treatment under 3 hours, with no increase in the conferral of harm. About one in six patients has a better and one in thirty-five a worse outcome as a result of therapy.
In human cerebral ischemia, the infarct core progressively expands over minutes to hours. In a typical ischemic stroke, nearly 2 million nerve cells are lost in each minute in which reperfusion is not established.1 Consequently, the benefits conferred by intravenous thrombolytic therapy for acute cerebral ischemia are highly time-dependent. The two NINDS TPA trials and subgroups of other trials demonstrated that treatment with intravenous tissue plasminogen activator (TPA) confers substantial net benefit when administered between 1–3 hours after onset.2–5 Although the 0–1 hour window has not directly been investigated (only 2 of the 624 patients in the NINDS trials were treated within 1 hour of onset),6 an even greater benefit is presumed in hyper-early patients. Recently, the ECASS 3 trial prospectively confirmed a lesser, but still statistically significant, benefit for intravenous TPA in the 3 to 4.5 hour window,7 a benefit previously suggested by pooled analysis of 3–4.5 hour window patients from 4 prior trials.5
For adoption in routine care, the results of a clinical trial must be not only statistically significant, but also clinically significant. As the clinical maxim observes, “a difference to be a difference must make a difference.” In deciding how best to apply the findings of a positive clinical trial, physicians, patients, family, and payors must consider the magnitude of the effect size demonstrated. Accordingly, a public health priority is to derive an index of the magnitude of the treatment effect demonstrated in the ECASS 3 trial.
The number needed to treat (NNT) is a statistically valid and clinically useful indicator of treatment effect magnitude.8 In the primary trial paper, the ECASS 3 investigators reported as 14 the net NNT to benefit (NNTB) for the trial’s dichotomized primary endpoint, achievement of an extremely good outcome (mRS 0–1) at 3 months. However, for treatments that alter outcomes across a range of health state transitions valued by patients, the NNTB for a single dichotomized transition provides an incomplete guide to treatment impact.9, 10
Joint outcome table specification permits derivation of NNTB and NNT to harm (NNTH) values summarizing treatment impact over the entire outcome range assessed by an ordinal scale.11, 12 This method has been successfully applied to trials of intravenous fibrinolysis, intra-arterial fibrinolysis, and neuroprotective therapy for acute ischemic stroke, and trials of clipping versus coiling and neuroprotective therapy for aneurysmal subarachnoid hemorrhage.12 The results have been incorporated into an educational tool for patients and family members endorsed by the American College of Emergency Physicians, the American Academy of Neurology, and the American Heart Association.13 We undertook joint outcome table analysis of the ECASS 3 trial findings.
Derivation of number needed to treat values across an ordinal outcome range requires knowledge of final group distributions and the within-patient variance.11 In the primary ECASS 3 report, the group distributions on all 7 levels of the modified Rankin Scale (mRS) of global disability were reported, and treatment with alteplase was associated with a statistically significant (p = 0.02) beneficial shift in outcomes across the outcome range.7 Parallel group clinical trials do not provide data regarding the within-patient variance. Accordingly, in this study, equivalent information was derived using three methods of joint outcome table completion: 1) expert specification, 2) minimum and maximum algorithms, and 3) random sampling from all permitted joint distributions.
The expert specification followed methods previously described.11 Briefly, seven emergency physician and neurologist clinicians highly experienced in acute stroke care, and with no significant TPA-related competing interests, independently populated a joint outcome table for a model population of 1000 patients matching the ECASS 3 cohort. Each expert produced an array under the injunction that individual patient responses be those that are biologically most plausible within the constraint that group outcomes match the observed trial results.
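As a minimal sketch of how a completed joint outcome table yields NNTB and NNTH, the Python below counts patients whose outcome with treatment is better or worse than with placebo. The 3-level table, its counts, and the function name `nnt_from_joint_table` are hypothetical illustrations, not the experts' 7 × 7 ECASS 3 arrays. Rows are assumed to index the outcome with placebo and columns the outcome with TPA, with lower levels being better.

```python
def nnt_from_joint_table(table):
    """Return (NNTB, NNTH) from a square joint outcome table of patient counts.

    Assumed convention: table[i][j] is the number of patients who would reach
    level i with placebo and level j with treatment; a lower level is better.
    """
    n = len(table)
    total = sum(sum(row) for row in table)
    # Cells below the diagonal: treated outcome is better than placebo outcome.
    better = sum(table[i][j] for i in range(n) for j in range(i))
    # Cells above the diagonal: treated outcome is worse than placebo outcome.
    worse = sum(table[i][j] for i in range(n) for j in range(i + 1, n))
    nntb = total / better if better else float("inf")
    nnth = total / worse if worse else float("inf")
    return nntb, nnth


# Hypothetical 3-level table for 1000 model patients (not trial data).
example_table = [
    [300, 10, 0],
    [80, 250, 10],
    [20, 60, 270],
]
nntb, nnth = nnt_from_joint_table(example_table)
print(nntb, nnth)  # 6.25 50.0
```

In this toy table 160 of 1000 patients improve by at least one level and 20 worsen, so the NNTB is 6.25 and the NNTH is 50.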
The minimum and maximum algorithm specifications were also conducted according to prior described methods.14 The distribution of harm conferred by therapy was set at the median estimate of the expert panel. The maximum possible NNTB compatible with the data was then derived by completing the joint outcome table following the rule that every patient who benefits from therapy does so by improving only the minimum possible number of levels compatible with the final trial group outcome distributions. The minimum possible NNTB compatible with the data was derived by completing the table following the rule that every patient who benefits from therapy does so by improving the maximum possible number of grades compatible with the group data. The central value within the possible range was obtained by calculating the geometric mean of the minimum and maximum NNTBs.
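The central-value step can be checked with a few lines of Python. This sketch uses the minimum and maximum NNTB values that this paper reports for the full 7-category mRS (4.0 and 13.0); the variable names are illustrative only.

```python
import math

nntb_min = 4.0   # lowest NNTB compatible with the group data (reported value)
nntb_max = 13.0  # highest NNTB compatible with the group data (reported value)

# Geometric mean of the two bounds.
central_nntb = math.sqrt(nntb_min * nntb_max)
print(round(central_nntb, 1))  # 7.2
```

One property of the geometric mean here: because an NNT is the reciprocal of a proportion, the geometric mean of two NNTs equals the reciprocal of the geometric mean of the two corresponding proportions.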
The automated random sampling technique followed previously described principles.15 Within the 7 × 7 joint distribution matrix with fixed marginal frequencies adding to 1000, there are 26 unknown frequencies subject to 12 constraints (6 column, 6 row). The vector “x” is a 26-dimensional vector that satisfies the 12 constraints. This is a linear system of the form Ex = f with the inequality constraint x > 0, where the 12 × 26 matrix E and the 12 × 1 vector f are obtained from the constraints. Since computing all possible values of x is not practical, we used the R function “xsample” in the R program library “limSolve” to randomly sample values of the vector “x” from all possible joint outcome table solutions. Using the mirror algorithm, 3000 samples were taken from the large population of all possible solutions, without replacement. The mean and range of the NNTB values of these random samples from all possible NNTB values under the constraints were analyzed.
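The idea of sampling joint tables that all share the same marginal totals can be illustrated in plain Python. This is not the limSolve “xsample” mirror algorithm used in the study: it is a swapped-in, simpler technique, a random walk over integer tables that shifts one patient around a randomly chosen 2 × 2 sub-block so that every row and column total stays fixed. The 3-level starting table, the function names, and the step counts are hypothetical, and unlike the study's analysis this sketch does not hold the harm cells fixed.

```python
import random


def nntb(table):
    """NNTB for a joint table (rows: placebo level, columns: treated level)."""
    total = sum(sum(row) for row in table)
    better = sum(table[i][j] for i in range(len(table)) for j in range(i))
    return total / better if better else float("inf")


def margin_preserving_step(table, rng):
    """Shift one patient around a random 2 x 2 sub-block; margins are unchanged."""
    r1, r2 = rng.sample(range(len(table)), 2)
    c1, c2 = rng.sample(range(len(table)), 2)
    if table[r1][c2] > 0 and table[r2][c1] > 0:  # keep every cell non-negative
        table[r1][c1] += 1
        table[r2][c2] += 1
        table[r1][c2] -= 1
        table[r2][c1] -= 1


def sample_nntbs(start, n_samples=200, steps_between=500, seed=0):
    """Record the NNTB after every `steps_between` random moves."""
    rng = random.Random(seed)
    table = [row[:] for row in start]
    values = []
    for _ in range(n_samples):
        for _ in range(steps_between):
            margin_preserving_step(table, rng)
        values.append(nntb(table))
    return values, table


# Hypothetical starting table for 1000 model patients (not trial data).
start_table = [
    [300, 10, 0],
    [80, 250, 10],
    [20, 60, 270],
]
values, last_table = sample_nntbs(start_table)
print(min(values), sum(values) / len(values), max(values))
```

Every sampled table reproduces the same two group distributions, yet the NNTB varies from sample to sample, which is why the study summarizes the sampled values by their mean and range.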
In the initial analysis, NNTs for change by 1 or more grades across all 7 levels of the mRS were derived. However, nearly half of individuals at risk for stroke do not consider a severely disabled outcome such as mRS rank 5 (severe disability, bedridden, incontinent, and requiring constant nursing care and attention) to be a better outcome than an mRS of 6 (dead).16 Accordingly, NNTs were also derived from the algorithmic and expert joint outcome tables across 6 levels of the mRS, collapsing mRS levels 5 and 6 into one worst outcome category.
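Collapsing the two worst categories amounts to merging the last two rows and the last two columns of a joint outcome table. The sketch below shows this on a hypothetical 3-level table (not trial data); the function names are illustrative, and rows are assumed to index the placebo outcome and columns the treated outcome, with lower levels being better.

```python
def collapse_worst_two(table):
    """Merge the last two levels (rows and columns) of a square joint table."""
    n = len(table)
    merged_rows = [row[:] for row in table[:n - 2]]
    merged_rows.append([a + b for a, b in zip(table[n - 2], table[n - 1])])
    return [row[:n - 2] + [row[n - 2] + row[n - 1]] for row in merged_rows]


def count_better(table):
    """Patients whose treated outcome is at a better (lower) level."""
    return sum(table[i][j] for i in range(len(table)) for j in range(i))


# Hypothetical 3-level table for 1000 model patients.
full = [
    [300, 10, 0],
    [80, 250, 10],
    [20, 60, 270],
]
collapsed = collapse_worst_two(full)
print(collapsed)  # [[300, 10], [100, 590]]
print(1000 / count_better(full), 1000 / count_better(collapsed))  # 6.25 10.0
```

Transitions between the two merged levels no longer count as benefit or harm, so the NNTB after collapsing cannot be lower than before.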
Benefit per hundred patients treated (BPH) was derived in standard fashion by the formula BPH = 100/NNTB and harm per hundred patients treated (HPH) similarly by HPH = 100/NNTH. Likelihood of help to harm ratios (LHH) were derived in standard fashion by the formula LHH = BPH/HPH.17
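These formulas can be worked through with the expert-derived values reported in this paper (NNTB 6.1, NNTH 37.5). The Python below is a minimal sketch; the function name `per_hundred` is illustrative.

```python
def per_hundred(nnt):
    """Convert a number needed to treat into events per 100 patients treated."""
    return 100.0 / nnt


nntb, nnth = 6.1, 37.5   # expert joint outcome table values from this paper
bph = per_hundred(nntb)  # benefit per hundred
hph = per_hundred(nnth)  # harm per hundred
lhh = bph / hph          # likelihood of help to harm

print(round(bph, 1), round(hph, 1), round(lhh, 1))  # 16.4 2.7 6.1
```

Starting from the rounded NNT values gives an LHH of about 6.1; the small difference from the reported 6.0 may reflect rounding or the way the individual expert values were aggregated, which the text does not specify.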
In the ECASS 3 trial, the mean mRS score in the TPA group was 1.99 (SD 1.9) and in the placebo group 2.20 (SD 1.9), yielding a mean difference of 0.21. Net number needed to treat values for each of the 6 possible dichotomizations of the mRS are shown in Table 1, and range from −66.7 to 13.7.
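For a single dichotomization, the net NNT is the reciprocal of the absolute difference in the proportion of patients reaching the good-outcome side of the cut. The short sketch below assumes that standard definition; the two absolute differences are back-calculated from the extremes of the reported NNT range for illustration and are not taken from Table 1.

```python
def net_nnt(risk_difference):
    """Net NNT = 1 / (proportion with good outcome on TPA minus on placebo)."""
    return 1.0 / risk_difference


# Illustrative absolute differences, back-calculated from the reported NNT range.
print(round(net_nnt(0.073), 1))   # 13.7
print(round(net_nnt(-0.015), 1))  # -66.7
```

A negative value indicates that, at that particular cut point, slightly fewer treated than placebo patients fell on the favorable side.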
For the full 7-category mRS, algorithmic analysis found the lowest possible NNTB consistent with the group outcomes of the ECASS 3 trial for 1 additional patient to have a better outcome by 1 or more grades with TPA than with placebo was 4.0 and the highest possible 13.0. The geometric mean of this range was an NNTB of 7.2. The independent expert joint outcome table analyses indicated that the biologically most plausible NNTB for improved final outcome on the mRS was 6.1 (95%CI 5.6–6.7, range 5.7–7.4) and the NNTH 37.5 (95%CI 34.6–40.5, range 34.0–41.7). The resulting expert likelihood of help to harm (LHH) ratio was 6.0. The results of the bootstrap simulations are shown in Figure 1. The mean NNTB was 7.08 (SD 0.90).
For the 6-category mRS, algorithmic joint outcome table analysis of the group outcomes of the ECASS 3 trial indicated that the lowest possible NNTB for 1 additional patient to have a better outcome by 1 or more grades with TPA than with placebo was 4.2 and the highest possible 16.1. The independent expert joint outcome table analyses indicated that the biologically most plausible NNTB for improved final outcome on the mRS was 6.9 (95%CI 6.3–7.5, range 6.3–8.3) and the NNTH 39.9 (95%CI 36.9–43.1, range 34.5–45.5). The resulting expert likelihood of help to harm ratio was 5.8.
Table 2 compares the benefit and harm per hundred patients treated values for expert joint outcome table and dichotomized analyses of the ECASS 3 trial (3–4.5 hour window) and the two trials constituting the NINDS TPA Study (1–3 hour window).11 Across all 7 levels of the mRS, in both time windows, approximately 3 per 100 patients are harmed by therapy, while in the 1–3 hour window approximately 32 per 100 patients benefit and in the 3–4.5 hour window 16 per 100 patients benefit.
Patients and physicians consider as desirable several health state transitions across the spectrum of poststroke disability.18–20 Well-informed treatment decisions will take all valued transitions into account. Using previously described expert-based and expert-independent methods, we derived NNTB and NNTH values for outcome shifts over the entire range of the modified Rankin scale when patients with acute cerebral ischemia are treated with intravenous TPA in the 3–4.5 hour time window. The biologically most likely NNTB of 6.1 and NNTH of 37.5 indicate that, for every 100 patients treated in the 3–4.5 hour window, 16.4 will have a better and 2.7 a worse outcome by 1 or more levels on the modified Rankin Scale of global disability. Clinicians can use these values in the acute setting to inform patients and family members of the benefits and risks associated with intravenous TPA administered more than 3 and less than 4.5 hours after onset.
It is instructive to contrast the magnitude of the treatment effect of IV TPA in the 3–4.5 hour window with that in the 1–3 hour window. In biologically most plausible analyses, only half as many patients in the later time window benefit as in the earlier, while the risk of harm is about the same. These results emphasize the fundamental importance of treating patients quickly and of continuous quality improvement to reduce time to therapy at every level of regional stroke systems of care. Nonetheless, the benefit to risk ratio in the 3–4.5 hour window of 6.0 suggests that, for those patients who have unavoidably missed the 3 hour window but are treatable within 4.5 hours, therapy will generally be worth pursuing.
The expert-independent methods employed in this study provide support for the expert-based best estimates. The algorithmic technique demonstrates that the NNTB over the entire outcome range must lie between 4.0 and 13.0, all values lower than any of the dichotomized NNTBs. The geometric mean within this possible range is 7.2, fairly close to the expert-derived value of 6.1, lending it plausibility. This plausibility is further enhanced by the results of the sampling simulations, which similarly yielded a mean NNTB of 7.1.
The small difference between the expert-based and the expert-independent NNTB values provides insight into the experts’ views of the likely pattern of treatment effects yielded by TPA therapy in the 3–4.5 hour window. In parallel design clinical trials, the group outcome distribution data circumscribes a total amount of benefit of a treatment at a population level that can be allocated variously to individual patients. When individual patients benefit a lot, fewer patients benefit at all, and the NNTB is high. When individual patients benefit a little, more patients benefit at all, and the NNTB is low. The bootstrap simulation (and the geometric mean of the algorithmic-determined potential range) project that the degree of benefit experienced by individual patients is the average of the possible range. In contrast, experts project the degree of benefit experienced by individual patients based on physiologic knowledge and clinical experience. Their projections of individual benefit may be higher or lower than the average. In the current study of ECASS 3, that the expert-specified NNTB is smaller than the mean of all possible NNTBs determined by bootstrap simulations indicates that the experts project that individual patients who respond to TPA in a later time window have their outcomes improved only to a restricted degree. As a result, the total group benefit indicated by the trial outcome distribution is allocated to numerous, rather than only a few, individual patients, lowering the expert-derived NNTB estimate below the expert-independent NNTB estimates.
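The trade-off between how much each patient benefits and how many patients benefit can be made concrete with a deliberately simplified calculation. The sketch below takes the mean mRS difference of 0.21 reported for ECASS 3, treats it as a fixed budget of net grade-steps in a model population of 1000, and assumes that every beneficiary improves by the same number of grades and that no one is harmed. These are simplifying assumptions for illustration, so the results are not expected to match the algorithmic bounds of 4.0 and 13.0.

```python
MEAN_SHIFT = 0.21  # mean mRS difference between groups reported for ECASS 3
PATIENTS = 1000    # model population size

# Net grade-steps of improvement available to allocate across patients.
total_grades = MEAN_SHIFT * PATIENTS


def nntb_if_each_improves(grades_per_patient):
    """NNTB when every beneficiary improves by the same number of grades."""
    beneficiaries = total_grades / grades_per_patient
    return PATIENTS / beneficiaries


for k in (1, 2, 3):
    print(k, round(nntb_if_each_improves(k), 1))
# 1 4.8
# 2 9.5
# 3 14.3
```

Spreading the same total benefit in 1-grade steps reaches about three times as many patients as spreading it in 3-grade steps, which is the sense in which smaller individual gains correspond to a lower NNTB.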
The number needed to harm values were similar in the 1–3 hour and 3–4.5 hour window, reflecting the similar rates of symptomatic intracerebral hemorrhage (SICH) in the two NINDS TPA trials and in the ECASS 3 trial. Any minor or major symptomatic hemorrhage (NINDS protocol definition of SICH) occurred 5.8% more often among TPA than placebo patients in the 1–3 hour NINDS trials,21 versus 4.4% more often among TPA than placebo patients in ECASS 3,7 supporting the expert assigned slightly lower harm per hundred treated values in the 3–4.5 hour window, and likely reflecting the milder strokes at entry in ECASS 3. For worsening of final outcome, the NNTH values derived by the experts more closely matched the difference in the rate of major SICH (2.2%, ECASS 3 protocol definition of SICH) than the difference in the rate of any SICH, in accord with prognostic models indicating that patients experiencing minor SICH generally do not have their final outcomes altered as a result.22
Comparison of the results for intravenous fibrinolysis with IV TPA in the 3–4.5 hour window with those for intra-arterial fibrinolysis in the overlapping 3–6 hour window is also instructive. Based on the PROACT 2 trial results, across the entire mRS, for every 100 patients treated with IA lysis in the 3–6 hour window, 20.8 benefit and 3.5 are harmed (LHH 6.0),12 versus 16.4 and 2.7 for IV TPA in the 3–4.5 hour window. Differences in trial populations between the PROACT 2 trial and the ECASS 3 trial include important prognostic features such as age, pretreatment stroke severity, and confirmed presence of middle cerebral artery occlusion, making direct comparisons tentative. Nonetheless, the concordant findings in these two trials of a reduced frequency of benefit in the greater than 3 hour window, compared with treatment in the first 3 hours, support a biological constraint upon the degree of benefit that can be achieved with recanalization therapy once several hours have passed since onset.
This study has limitations. The NNTB and NNTH values are derived from a randomized trial and do not necessarily translate to an observational setting unless consideration is restricted to patients who would have been eligible for ECASS 3. The analysis treats only transitions in health state from one modified Rankin Scale level to the next as valuable, and does not give value to improvements or worsening in health that occur entirely within the bounds of a scale level, which may lead to underestimation of the full impact of treatment. The expert-independent analyses are independent with regard to derivation of NNTB, but not NNTH. Completely expert-independent models cannot be employed as they are vulnerable to random person statistical effects that lead to overestimation of the degree of both benefit and harm.23
The positive results of the ECASS 3 trial are a major advance in stroke science that now require translation into everyday clinical practice, including hospital treatment policies and consent processes. Joint outcome table analysis indicates that, among individuals matching the ECASS 3 cohort, as a result of treatment with TPA in the 3–4.5 hour window, about one in six patients has a better and one in thirty-five a worse disability final outcome. Clinicians, patients, family members, and policy makers should consider these effect size indices when making consent and policy decisions about lytic therapy in later time windows.
This work was supported in part by NIH-NINDS Awards U01 NS 44364 and P50 NS044378.
Potential Competing Interests Disclosures
JLS has served as a scientific consultant regarding trial design and conduct to CoAxia, Concentric Medical, Talecris, and Cygnis (all modest); received lecture honoraria from Ferrer and Boehringer Ingelheim (modest); received devices for use in an NIH multicenter clinical trial from Concentric Medical (modest); was an unpaid investigator in a multicenter prevention trial sponsored by Boehringer Ingelheim; has declined consulting/honoraria monies from Genentech since 2002; is an employee of the University of California, which holds a patent on retriever devices for stroke.
JGornbein is an employee of the University of California, which holds a patent on retriever devices for stroke.
JGrotta has served as a scientific consultant regarding trial design and conduct to Lundbeck (modest); holds a patent on caffeinol in stroke licensed by InnerCool (significant).
DL has served as a scientific consultant regarding trial design and conduct to CoAxia, Concentric Medical, Brainsgate (all modest); was an unpaid investigator in a multicenter prevention trial sponsored by Boehringer Ingelheim; is an employee of the University of California, which holds a patent on retriever devices for stroke.
HL has served as a scientific consultant regarding trial design and conduct to CoAxia, Concentric Medical, and Talecris (all modest); is on a prevention Speaker’s Bureau for Boehringer Ingelheim (modest).
LS has served as a scientific consultant regarding trial design and conduct to CoAxia, Concentric Medical, Phressia, RTI Health Solutions, and CryoCath (all modest); was a site investigator in a multicenter trial run by Forest for which the MGH Neurology Service received payments based on the clinical trial contracts for the number of subjects enrolled (modest).
SS has served as a scientific consultant regarding trial design and conduct to Astra Zeneca; was a site investigator in a multicenter trial run by Vernalis for which the UC Regents received payments based on the clinical trial contracts for the number of subjects enrolled; received devices or study agent for use in NIH multicenter clinical trials from Concentric Medical, Genentech, Ekos Medical (all modest); is an employee of the University of California, which holds a patent on retriever devices for stroke.