We use an example data set describing mortality in randomised controlled trials of treatments for chronic obstructive pulmonary disease (COPD). The data set consists of a subset of trials from Baker et al [11]. We chose five trials providing mortality data for the following comparators: salmeterol, fluticasone, salmeterol fluticasone combination (SFC) or placebo. Three of the trials report count data [12], one reports hazard ratio data from a two-arm trial [15] and one reports hazard ratio data from a multi-arm (four-arm) trial [16]. Hazard ratio data were obtained from the trial publications and an existing meta-analysis of hazard ratio data [17].
The count statistics used are presented in Table 1 and the hazard ratio statistics in Table 2. The derived estimates of the mean log hazard ratio and its standard error, required for the analysis, are also presented in Table 2; these were estimated using formulae (1) and (2):
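Formulae (1) and (2) are not reproduced in this extract. Assuming the standard approach for deriving a log hazard ratio and its standard error from a published hazard ratio with a 95% confidence interval, they would take the form:

```latex
\ln \widehat{HR} = \ln(HR) \qquad (1)

se\!\left(\ln \widehat{HR}\right)
  = \frac{\ln\!\left(HR_{\mathrm{upper}}\right) - \ln\!\left(HR_{\mathrm{lower}}\right)}{2 \times 1.96} \qquad (2)
```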
Table 2 Hazard ratio and log hazard ratio statistics
The results of this example analysis are purely illustrative of the methodology and do not provide any indication of the comparative effectiveness of treatments for decision-making purposes, as the data set omits relevant direct and indirect data.
Using the example data set we show how to:
(a) Perform a meta-analysis of count statistics on the log hazard ratio scale;
(b) Reflect correlation in relative treatment effect estimates from multi-arm trials;
(c) Combine count and hazard ratio statistics in a single analysis on the log-hazard ratio scale; and
(d) Include a random effect in the analysis, whilst preserving the correlation in relative treatment effects for multi-arm trials.
We also discuss how other possible presentations of survival data, not available from our motivating example, could be incorporated into an analysis on the log-hazard ratio scale.
All analyses described were conducted using WinBUGS [18]. The WinBUGS code for the fixed and random effects analyses is presented in the Appendix.
(a) Meta-analysing count statistics on the log hazard ratio scale
The count data are incorporated in the network meta-analysis model using a binomial likelihood:
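The displayed likelihood is not reproduced in this extract; written in standard notation consistent with the definitions that follow, it is:

```latex
r_{s,k} \sim \mathrm{Binomial}\!\left(F_{s,k},\, n_{s,k}\right)
```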
where rs,k is the cumulative count of subjects who have experienced an event in arm k of study s; ns,k is the total number of subjects in arm k of study s; and Fs,k is the cumulative probability of a subject having experienced an event (or 'failure').
A log cumulative hazard for each trial arm, ln(Hs,k), is then derived from Fs,k:
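The derivation relies on the standard relationship between the cumulative hazard and the cumulative probability of an event, which gives the complementary log-log transformation:

```latex
H_{s,k} = -\ln\!\left(1 - F_{s,k}\right)
\quad\Rightarrow\quad
\ln\!\left(H_{s,k}\right) = \ln\!\left(-\ln\!\left(1 - F_{s,k}\right)\right)
```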
The log cumulative hazard estimates are then included in a treatment effect model with a linear regression structure. The log cumulative hazard is estimated as the sum of a study specific 'baseline' term αs and a treatment effect coefficient βk:
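The displayed equation is not reproduced in this extract. Subtracting the coefficient of the study's baseline treatment, βb, so that only within-study contrasts contribute, the linear predictor would take the form:

```latex
\ln\!\left(H_{s,k}\right) = \alpha_s + \beta_k - \beta_b
```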
where β1 = 0 for the reference treatment (placebo in our example) and βb represents the treatment effect for the baseline treatment in study s. The fixed study level 'baseline' term is a nuisance parameter, included to ensure that the treatment effect estimates are informed by within trial differences between treatment arms and not by differences in baseline event rates across trials.
Under an assumption of proportional hazards, the βk coefficient is equal to both the log cumulative hazard ratio and the log hazard ratio:
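In symbols consistent with the surrounding text (the displayed identity is not reproduced in this extract):

```latex
\beta_k - \beta_b
  = \ln\!\left(\frac{H_{s,k}}{H_{s,b}}\right)
  = \ln\!\left(\frac{h_{s,k}}{h_{s,b}}\right)
```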
where hs,b represents the hazard for the baseline treatment in study s. This identity allows us to combine the count statistics analysed on the log cumulative hazard scale with the hazard ratio data analysed on the log hazard scale. It also demonstrates that analysis of count statistics on the log hazard scale does not require a stronger assumption than proportional hazards.
(b) Reflecting correlations in relative treatment effects from multi-arm trials
Estimates of relative treatment effects from trials with more than two treatment arms will be correlated [10]. In our example the TORCH trial [16] is the only multi-arm trial reporting hazard ratio data. Estimated treatment effects from TORCH will be correlated; for example, the hazard ratio comparing SFC to placebo and the hazard ratio comparing salmeterol to placebo will be correlated due to their joint dependence on the time to event data in the placebo arm.
If a network meta-analysis is based on estimates of treatment effect in individual trial arms rather than estimates of relative treatment effect between arms ("contrast" statistics), this correlation will automatically be captured in the analysis [10]. This is the case when a network meta-analysis is based on count statistics. However, if the network meta-analysis is conducted based on estimates of relative treatment effect, this correlation between arms will not automatically be captured [10].
For multi-arm trials reporting hazard ratio statistics, this problem can be addressed by converting the log hazard ratios (contrast statistics) to log hazards (arm-specific statistics). Log hazards for individual trial arms are derived by nominally setting the log hazard for the baseline treatment b for the trial to zero. The mean log hazards for the other treatments are then equal to the log hazard ratios compared to the baseline treatment.
The variance for a log hazard ratio is the sum of the variances for the individual log hazards. Standard errors of the log hazards for each trial arm can therefore be estimated by solving simultaneous equations based on the standard errors for the set of log-hazard ratios. For example:
where se²i,j is the variance of the log hazard ratio comparing arm i to arm j, and sei is the standard error of the log hazard for arm i.
The standard errors of the log hazards for the other treatment arms are then estimated as:
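To make the algebra concrete, the sketch below (illustrative code with hypothetical numbers, not taken from the paper) solves the simultaneous equations for a three-arm trial: since each squared log hazard ratio standard error is the sum of the two arm-level variances, three pairwise standard errors determine the three arm-level standard errors.

```python
import math

def arm_level_ses(se_21, se_31, se_32):
    """Recover arm-level log hazard standard errors from pairwise
    log hazard ratio standard errors in a three-arm trial.

    Uses se_ij^2 = se_i^2 + se_j^2, so for arm 1:
    se_1^2 = (se_21^2 + se_31^2 - se_32^2) / 2.
    """
    var_1 = (se_21**2 + se_31**2 - se_32**2) / 2
    var_2 = se_21**2 - var_1
    var_3 = se_31**2 - var_1
    return tuple(math.sqrt(v) for v in (var_1, var_2, var_3))

# Hypothetical pairwise standard errors, constructed here from
# arm-level values 0.06, 0.08 and 0.05 purely for illustration:
se_21 = math.sqrt(0.06**2 + 0.08**2)
se_31 = math.sqrt(0.06**2 + 0.05**2)
se_32 = math.sqrt(0.08**2 + 0.05**2)
print([round(s, 4) for s in arm_level_ses(se_21, se_31, se_32)])  # → [0.06, 0.08, 0.05]
```

For a four-arm trial such as TORCH the same approach extends, using additional contrasts to determine the extra arm-level standard errors.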
Mean log hazards and associated standard errors for each treatment in TORCH [16] are reported in Table 3. These are derived from the hazard ratio data presented in Table 2.
Table 3 Log hazard statistics for a multi-arm trial (TORCH)
In order to estimate standard errors of the log hazards for each treatment, we required estimates of the uncertainty associated with four treatment contrasts. In some cases these data may not be available and thus the methods presented in equations 7 and 8 may not be feasible. For example, hazard ratios and associated measures of uncertainty may only be available for each active treatment relative to a single common comparator (e.g. placebo), as is commonly reported in the published literature.
In this situation, we can approximate the standard error for the comparison between active treatments by assuming the standard error is proportional to
. For example
(c) Combining count and hazard ratio statistics in a network meta-analysis
The log hazard ratio statistics from two-arm trials comparing treatments k and b are incorporated in the network meta-analysis model using a normal likelihood, in which the observed log hazard ratio estimate for study s (comparing treatments k and b) is assumed normally distributed with the corresponding variance.
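The displayed likelihood is not reproduced in this extract; writing lnHRs for the observed estimate, Vs for its variance and θs for the predicted log hazard ratio (symbols chosen here for illustration), it would take the form:

```latex
\ln HR_s \sim \mathrm{N}\!\left(\theta_s,\, V_s\right)
```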
The log hazard ratio estimates are then included in a treatment effect model with a linear regression structure, with the predicted log hazard ratio for a study s comparing treatments k and b equal to the difference between the two treatment coefficients:
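Denoting the predicted log hazard ratio for study s by θs (a symbol introduced here for illustration), the difference of coefficients is:

```latex
\theta_s = \beta_k - \beta_b
```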
where β1 = 0 for the reference treatment (in our example placebo) and βb represents the treatment effect for the baseline treatment in study s. As in equation 5, the βk coefficient is equal to the log hazard ratio for treatment k compared to the reference treatment.
The log hazard statistics from a multi-arm trial are incorporated in the analysis using normal likelihood functions: one for the baseline treatment b, and one for each of the other treatments. In each case the observed statistic is the log hazard for treatment arm k from study s, together with its associated variance.
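The displayed likelihoods are not reproduced in this extract; writing lnhs,k for the observed log hazard of arm k in study s, Vs,k for its variance and θs,k for its predicted value (symbols chosen here for illustration), they would take the form:

```latex
\ln h_{s,b} \sim \mathrm{N}\!\left(\theta_{s,b},\, V_{s,b}\right),
\qquad
\ln h_{s,k} \sim \mathrm{N}\!\left(\theta_{s,k},\, V_{s,k}\right) \quad (k \neq b)
```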
The log hazard estimates are then included in a treatment effect model with a linear regression structure. The log hazard is estimated as the sum of a study specific 'baseline' term αs and a treatment effect coefficient βk:
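Writing θs,k for the predicted log hazard of arm k in study s (a symbol introduced here for illustration, since the displayed equation is not reproduced), and subtracting the coefficient of the study's baseline treatment so that only within-study contrasts contribute:

```latex
\theta_{s,k} = \alpha_s + \beta_k - \beta_b
```

For the baseline arm itself (k = b) this reduces to αs.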
where β1 = 0 for the reference treatment (placebo in our example) and βb represents the treatment effect for the baseline treatment in study s. The fixed study level 'baseline' term is a nuisance parameter, included to ensure that the treatment effect estimates are informed by within trial differences between treatment arms and not by differences in baseline event rates across trials. As the βk coefficient is equal to the log hazard ratio for the cumulative count data, the log hazard ratio data and the log hazard data, they can be combined within a single analysis. Where an individual study reports both cumulative count and hazard ratio data, only one set of data should be included in the analysis to avoid double counting.
(d) Incorporating a random effect
In a random effects analysis of a network containing multi-arm trial contrast data, the correlation in the random effects must also be taken into account. Again this is due to the joint dependence of the multiple contrast estimates on common trial arms.
This correlation is reflected in the model by separating the random effect deviation for each contrast into the contributions to the random effect deviation of the two treatments that form the contrast. This is achieved by modifying the linear predictor component of the model for the cumulative count, log hazard ratio and log hazard data:
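The modified linear predictor is not reproduced in this extract; writing ws,k for the random effect deviation of arm k in study s (a symbol introduced here for illustration), the predicted log hazard, for example, would become:

```latex
\theta_{s,k} = \alpha_s + \beta_k - \beta_b + w_{s,k} - w_{s,b}
```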
The random effect deviation for arm k of study s is assumed to be normally distributed with zero mean and variance σ²/2, where σ² is the random effect variance for a treatment contrast:
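In symbols (with ws,k denoting the arm-level random effect deviation, a notation introduced here for illustration):

```latex
w_{s,k} \sim \mathrm{N}\!\left(0,\, \sigma^2/2\right)
```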
This approach assumes that σ² is the same for all treatments and consequently that the random effect variance will be the same for all treatment contrasts. The assumption of a common random effect variance across treatment contrasts implies that the covariance for any pair of treatment contrasts from the same study will equal half the treatment contrast random effect variance [10].
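This covariance structure can be checked by simulation; the sketch below (illustrative code, not from the paper) draws independent arm-level deviations with variance σ²/2 and confirms that each contrast has variance close to σ², while two contrasts sharing a baseline arm have covariance close to σ²/2.

```python
import random

random.seed(1)
sigma2 = 0.04                  # assumed contrast-level random-effect variance
sd_arm = (sigma2 / 2) ** 0.5   # arm-level deviations have variance sigma2/2
n = 200_000

def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Independent arm-level deviations for baseline b and treatments k, j
d_b = [random.gauss(0, sd_arm) for _ in range(n)]
d_k = [random.gauss(0, sd_arm) for _ in range(n)]
d_j = [random.gauss(0, sd_arm) for _ in range(n)]

# Contrast-level deviations: both contrasts share the baseline arm b
c_kb = [k - b for k, b in zip(d_k, d_b)]
c_jb = [j - b for j, b in zip(d_j, d_b)]

var_kb = cov(c_kb, c_kb)        # close to sigma2
cov_shared = cov(c_kb, c_jb)    # close to sigma2 / 2
print(round(var_kb, 3), round(cov_shared, 3))
```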
A vague prior for the study specific baseline, αs ~ N(0, 10⁶), is used to ensure estimates of treatment effect are informed by within trial differences between treatment arms, and not by differences in absolute response between trials. A vague prior is also used for the treatment effect coefficients, with βk ~ N(0, 10⁶) and β1 = 0 (representing placebo).
Each model was run for 40,000 burn-in simulations followed by 200,000 further iterations, which were thinned to every 20th simulation to reduce autocorrelation.
Two sets of initial values were used and convergence was assessed by examining caterpillar plots and Brooks-Gelman-Rubin (BGR) statistics. The deviance information criterion (DIC) was used to compare the fit of the fixed and random effects models [19].