Trials in which no patient develops the outcome of interest are conventionally included in estimates of pooled RD, but not RR and OR. We explored this inconsistency using 3 published meta-analyses and found that when zero total event trials are included, there is a relatively small reduction in the magnitude of the pooled RR and OR and in the width of their confidence intervals, resulting in a slightly more conservative estimate of treatment effect. In contrast, for RD, where zero total event trials are traditionally included, the effect is more pronounced, and in 1 extreme case, inclusion of zero total event trials negated an otherwise statistically significant treatment effect. Therefore, excluding zero total event trials could change the clinical implication of a meta-analysis that used RD exclusively to pool similarly extreme data.
The greater effect of zero total event trials on RD versus RR and OR occurs because trials with low event rates have a much higher weight in the pooled estimate of RD compared to the other measures (graphically depicted in reference [16]). Because the pooled RR and OR are dominated by trials with at least 1 event in both groups, they are relatively insensitive to the inclusion of low-weight zero total event trials. Even when significant heterogeneity is present, which increases the relative weighting of low-weighted zero total event trials in random effects analyses, the changes in these pooled estimates are still relatively small (see Figure ).
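The difference in weighting can be illustrated with a minimal sketch of fixed effect inverse variance weights. The three trials below are hypothetical and are not taken from our examples. The sketch assumes that 0.5 is added to every cell of any trial with a zero cell, for both RD and OR, and it ignores the between-study variance that a random effects model would add.

```python
def corrected_cells(events_t, n_t, events_c, n_c, cc=0.5):
    """Return the 2x2 cells, adding cc to every cell if any cell is zero."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if 0 in (a, b, c, d):
        a, b, c, d = a + cc, b + cc, c + cc, d + cc
    return a, b, c, d


def weight_rd(events_t, n_t, events_c, n_c):
    """Inverse variance weight of a trial for the risk difference."""
    a, b, c, d = corrected_cells(events_t, n_t, events_c, n_c)
    p_t, p_c = a / (a + b), c / (c + d)
    variance = p_t * (1 - p_t) / (a + b) + p_c * (1 - p_c) / (c + d)
    return 1 / variance


def weight_log_or(events_t, n_t, events_c, n_c):
    """Inverse variance weight of a trial for the log odds ratio."""
    a, b, c, d = corrected_cells(events_t, n_t, events_c, n_c)
    return 1 / (1 / a + 1 / b + 1 / c + 1 / d)


# Hypothetical trials: (events treated, n treated, events control, n control).
# The last one is a zero total event trial.
trials = [(10, 100, 20, 100), (5, 100, 12, 100), (0, 50, 0, 50)]

rd_weights = [weight_rd(*t) for t in trials]
or_weights = [weight_log_or(*t) for t in trials]

# Share of the total weight carried by the zero total event trial.
rd_share = rd_weights[-1] / sum(rd_weights)
or_share = or_weights[-1] / sum(or_weights)

print(f"RD weight share of zero total event trial: {rd_share:.1%}")
print(f"OR weight share of zero total event trial: {or_share:.1%}")
```

In this hypothetical data set the zero total event trial carries most of the weight for RD but only a few percent for OR, which is the pattern described above.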
We present an extreme example (Figure ) where the inclusion of zero total event trials in a meta-analysis using RD as the effect estimator negates a statistically significant treatment effect obtained when such trials are excluded. However, such situations would be expected to occur rarely because the inclusion of these trials has opposite impacts on the treatment effect (which becomes closer to nil) and its confidence interval (which narrows). For RR and OR, where the changes are smaller, it is even more unlikely that inclusion of zero total event trials would negate the statistical significance of a treatment effect, especially when the meta-analysis contains trials with at least 1 event in both groups.
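The opposite impacts on the point estimate and on the confidence interval can be sketched with a fixed effect inverse variance pooled RD. The trials are hypothetical, the 0.5 correction is assumed to apply to every cell of a trial with a zero cell, and a fixed effect model is used here only to keep the arithmetic short; it is not the random effects analysis of our examples.

```python
import math


def rd_and_variance(events_t, n_t, events_c, n_c, cc=0.5):
    """Risk difference and its variance, with cc added to all cells if any is zero."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if 0 in (a, b, c, d):
        a, b, c, d = a + cc, b + cc, c + cc, d + cc
    p_t, p_c = a / (a + b), c / (c + d)
    variance = p_t * (1 - p_t) / (a + b) + p_c * (1 - p_c) / (c + d)
    return p_t - p_c, variance


def pool_fixed(trials):
    """Fixed effect inverse variance pooled RD with a 95% confidence interval."""
    stats = [rd_and_variance(*t) for t in trials]
    weights = [1 / var for _, var in stats]
    estimate = sum(w * rd for w, (rd, _) in zip(weights, stats)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se


# Hypothetical trials: (events treated, n treated, events control, n control).
event_trials = [(10, 100, 20, 100), (5, 100, 12, 100)]
zero_trials = [(0, 50, 0, 50)]

without_zero = pool_fixed(event_trials)
with_zero = pool_fixed(event_trials + zero_trials)

for label, (est, lo, hi) in [("excluded", without_zero), ("included", with_zero)]:
    print(f"zero total event trial {label}: RD {est:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

With these hypothetical numbers the pooled RD moves toward 0 and the interval narrows when the zero total event trial is included, yet the interval now crosses 0, so the shift in the estimate outweighs the gain in precision.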
The addition of zero total event trials decreased heterogeneity in the examples provided. One might have expected that heterogeneity would be increased by adding zero total event trials to a group of trials, each with at least 1 event, that on average show a non-zero treatment effect. However, in these examples the treatment effect was similar between zero total event trials and event trials. Therefore, the net effect of including more trials in the meta-analysis was to reduce heterogeneity. This arises because the Q statistic, which provides an assessment of heterogeneity, follows a chi-squared distribution under the null hypothesis of homogeneity, with degrees of freedom equal to one less than the number of trials. If heterogeneity increases only slightly as more trials are added, the increase in Q is small relative to the increased degrees of freedom and Q is less likely to be statistically significant. Similarly, the I² measure of heterogeneity (calculated as 100% × [Q − degrees of freedom]/Q) decreases. Figure illustrates this effect. Prior to the addition of the zero total event trials, there is a significant treatment effect and substantial heterogeneity for each of the effect measures. After adding the zero total event trials, the treatment effect is smaller but still significant, and the degree of heterogeneity, expressed as I², is reduced or eliminated for each of the effect measures.
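The relationship between Q, its degrees of freedom, and I² can be made concrete with a short sketch. The effect sizes and variances below are hypothetical values on the RD scale, and I² is truncated at 0 when Q falls below its degrees of freedom, as is conventional.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I-squared (percent)."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, 100 * (q - df) / q) if q > 0 else 0.0
    return q, df, i2


# Two hypothetical event trials with clearly different risk differences.
effects = [-0.10, -0.02]
variances = [0.0004, 0.0004]
q_before, df_before, i2_before = heterogeneity(effects, variances)

# Add three hypothetical zero total event trials (RD of 0, lower precision).
effects_all = effects + [0.0, 0.0, 0.0]
variances_all = variances + [0.0016, 0.0016, 0.0016]
q_after, df_after, i2_after = heterogeneity(effects_all, variances_all)

print(f"before: Q = {q_before:.2f} on {df_before} df, I2 = {i2_before:.1f}%")
print(f"after:  Q = {q_after:.2f} on {df_after} df, I2 = {i2_after:.1f}%")
```

In this sketch Q rises when the extra trials are added, but the degrees of freedom rise proportionally more, so I² falls.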
For the RD effect measure, the weighting of the zero total event trials in the pooled result is comparable to the proportion of such trials in the meta-analysis. Therefore, including a large proportion of zero total event trials with identical results (i.e. an RD of exactly 0) may result in reduced heterogeneity by simply overwhelming the other results even if there is marked heterogeneity among the remaining trials. Even in the presence of equal underlying event rates in the treatment and control groups one would expect some random variation around an RD of 0; however, zero total event trials are frequently small, which makes it less likely that any events are observed in either arm. Thus, these small underpowered studies contribute to an exaggerated decrease in heterogeneity. One would expect this to be less of a factor for OR and RR because the weighting of zero total event trials in the effect measure is lower, and sometimes significantly lower, than the proportion of such trials included in the meta-analysis (for example see figures and ).
We used a continuity correction of 0.5 since this is the correction most commonly used [3,4]. Sweeting et al [5] have recently proposed two alternative continuity corrections, one based on the reciprocal of the group (i.e. treatment or control) size opposite the zero cell, and the second based on an empirical estimate of the pooled effect size using the studies in the meta-analysis with events in both the treatment and control arms. Applying these corrections instead of 0.5 in two of our examples (dopamine and antibiotics to prevent rheumatic fever) gives very similar results. For the third extreme example (heparin to prevent non-fatal pulmonary embolism), there are no trials with events in both the treatment and control groups, preventing the use of the second alternative correction. Inclusion of zero total event trials using the continuity correction based on the reciprocal of the opposite group size negates a statistically significant treatment effect obtained when such trials are excluded for all three effect estimators [similar to RD in Figure , the lower bounds of the 95% confidence interval for the RR and OR effect estimators also cross unity if this alternative correction is used (results not shown)]. Using this correction in our case examples shows that including or excluding zero total event trials can change the clinical implication of a meta-analysis pooling similarly extreme data, regardless of the effect measure used.
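The difference between the constant correction and the correction based on the reciprocal of the opposite group size can be sketched for a single zero total event trial with unequal arms. The scaling used here, in which the two corrections are proportional to the reciprocal of the opposite arm size and sum to 1, is our reading of that proposal and should be treated as an assumption, as should the choice to add each correction to both the event and non-event cells. The trial sizes are hypothetical.

```python
def constant_correction(n_t, n_c, k=0.5):
    """Same correction for both arms (the conventional 0.5)."""
    return k, k


def reciprocal_correction(n_t, n_c):
    """Corrections proportional to 1/(opposite arm size), scaled to sum to 1.

    Assumption: k_t is proportional to 1/n_c and k_c to 1/n_t, which after
    scaling gives k_t = n_t / (n_t + n_c) and k_c = n_c / (n_t + n_c).
    """
    total = n_t + n_c
    return n_t / total, n_c / total


def corrected_rr(events_t, n_t, events_c, n_c, correction):
    """Relative risk after adding the arm's correction to events and non-events."""
    k_t, k_c = correction(n_t, n_c)
    risk_t = (events_t + k_t) / (n_t + 2 * k_t)
    risk_c = (events_c + k_c) / (n_c + 2 * k_c)
    return risk_t / risk_c


# Hypothetical zero total event trial with a 2:1 allocation ratio.
n_treated, n_control = 100, 50

rr_constant = corrected_rr(0, n_treated, 0, n_control, constant_correction)
rr_reciprocal = corrected_rr(0, n_treated, 0, n_control, reciprocal_correction)

print(f"RR with constant 0.5 correction: {rr_constant:.3f}")
print(f"RR with reciprocal correction:   {rr_reciprocal:.3f}")
```

Under these assumptions the constant correction produces an RR well below 1 purely because the arms differ in size, whereas the reciprocal correction yields an RR of 1. With equal arm sizes both corrections reduce to 0.5 per cell.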
We focused on the impact of zero total event trials on summary effect measures in meta-analyses using inverse variance weighting, which is the only commonly used method that can incorporate between-study heterogeneity in a random effects model. We chose published illustrative meta-analyses that combined high- and low-risk patients. We did not consider other issues such as the choice of effect measure, appropriateness of combining high- and low-risk patients, the choice of a fixed vs. random effects model, or other methods for addressing heterogeneity.
In contrast to our illustrative examples combining high- and low-risk patients, two recently published simulation studies provide comprehensive comparisons of multiple meta-analytic methods when baseline event rates are low [5,17]. These studies, both of which excluded zero total event trials for the OR simulations, suggest that in general the commonly used continuity correction of 0.5 biases the Mantel-Haenszel and inverse variance OR estimators [5,17]. The simulation study that examined RD demonstrated that when events are rare, using RD widened confidence intervals and thus lowered statistical power compared to other methods [17]. Finally, although the inverse variance method is the only non-Bayesian method that incorporates between-study heterogeneity, the same authors found that it gives biased effect estimates when event rates are low. Although this bias is present at moderate event rates and effect sizes, it becomes greater than 1% when event rates are very low (± 1%) or treatment effects very large (RR ± 0.5) [17]. The overall event rates in our examples were 7.7% [13], 1.1% [14], and 1.5% [15], respectively.