Although we found large between-centre differences in outcome in the CRASH trial, taking these into account did not substantially change the estimated treatment effect. Neither did we see major differences in treatment effect by centre. This study provides no support for the hypothesis that between-centre differences in outcome affect the chances of demonstrating a treatment effect in RCTs, in contrast to current beliefs in this clinical area [7
Considering differences between centres in outcome and in estimated treatment effect could be of importance from two perspectives. First, between-centre heterogeneity in the treatment effect may indicate limited generalizability, which is of importance, for example, when registering a drug in a particular country. In our study there was no clinically meaningful heterogeneity in overall treatment effect. Although the between-centre differences in the treatment effect were statistically significant, the 95% range was small (1.17-1.26). Clearly, determining generalizability is not solely a statistical issue but requires clinical judgement on the extent to which the trial results might apply to another population.
Some trials have estimated the heterogeneity of the treatment effect between centres, countries or regions, but did not use random effect modelling. The PLATO study (The Study of Platelet Inhibition and Patient Outcomes) compared two platelet inhibitors (Ticagrelor versus Clopidogrel) for prevention of cardiovascular events in patients with acute coronary syndrome. The overall treatment effect was a hazard ratio (HR) of 0.84 in favour of Ticagrelor. The treatment effect was also tested separately in four geographic regions: Asia-Australia (N = 1,714), Central-South America (N = 1,237), Europe-Middle East-Africa (N = 13,859), and North America (N = 1,814). In Europe-Middle East-Africa the estimated HR was 0.80 (95% CI: 0.72-0.90). The HRs in Asia-Australia and Central-South America were 0.80 and 0.86, neither of which was statistically significant. The estimated HR in North America, however, was 1.25 (95% CI: 0.93-1.67). The authors state that "the difference in results between patients enrolled in North America and those enrolled elsewhere raises the questions of whether geographic differences between populations of patients or practice patterns influenced the effects of the randomized treatments, although no apparent explanations have been found."
This interpretation shows the importance of distinguishing statistical from clinical reasoning. Although the statistical analysis showed significant differences between geographic regions in the PLATO trial, which could be an indication of limited generalizability, the authors had no biological or mechanistic explanation for the heterogeneity of the treatment effect, and no heterogeneity was expected beforehand. In a situation where region-specific estimates of the treatment effect are desired, or when heterogeneity in the treatment effect is expected, we would recommend using a random effect model to estimate the between-region differences in treatment effect. On the other hand, a limited number of centres, countries or regions complicates estimation of the heterogeneity in treatment effect.
Second, it is thought that heterogeneity between centres might reduce statistical power to detect the treatment effect [9]. Provided that a trial is large enough, randomization will ensure that the intervention and control group are similar with regard to known and unknown confounders [10]. As expected, our study showed that taking into account between-centre differences did not affect statistical significance.
Several explanations can be given for our findings. First, differences in outcome between centres in RCTs may be caused by patient characteristics, which we adjusted for in this analysis. We would not expect patient characteristics to result in differences in treatment effect between centres if the treatment is assumed to work for all patients included in the trial. Second, there may be differences in care. If these only affect the baseline event rate (e.g. less ICU capacity), the treatment effect is not likely to be influenced. In contrast, there could be differences in care that interact with the treatment: for example, if time to hospital arrival is structurally longer in some places, an acute treatment may be less effective. If such an interaction is expected, it would usually be captured in the inclusion criteria, such as inclusion within a certain time after injury. In our study we found large differences in outcome between the centres but limited variability in the treatment effect. In other words, there was no substantial interaction between centre and treatment, although such an interaction might have been expected since the CRASH trial comprised an acute treatment and was conducted in low- to high-income countries. This is also an important finding from the perspective of standardisation of care in trials, which some consider very important [9]. Our study suggests that if non-standardized care only influences the absolute risk and does not interact with the treatment, there is no reason to put much effort into standardizing care.
We consider our results to be applicable to drug interventions, which work on physiological mechanisms. Trials investigating a more complex intervention, such as surgery or a complex treatment strategy, may be more sensitive to differences in quality of care. The effect of outcome differences on the treatment effect is not expected to be related to the magnitude of the treatment effect. We recognize that further studies are required to confirm or refute these findings for other types of interventions and for other diseases. Moreover, it is crucial to think in advance about the mechanism of the treatment, and whether heterogeneity or homogeneity of the treatment effect by centre is expected.
In this study we have assessed heterogeneity of the treatment effects on a relative scale, but we can also use an absolute scale (risk difference). We found that there is no heterogeneity on the relative scale, despite heterogeneity in the absolute risks per centre. This combination implies that there is heterogeneity in treatment effects on an absolute scale, which is important to realize when considering treatment for individuals [14
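The relation between the two scales can be illustrated with a small calculation. The sketch below assumes a single odds ratio that is identical in every centre and three baseline risks; all numbers are hypothetical and are not CRASH data, and the function name is introduced only for illustration.

```python
def risk_under_treatment(baseline_risk, odds_ratio):
    """Convert a control-group risk to the treated-group risk for a given odds ratio."""
    control_odds = baseline_risk / (1.0 - baseline_risk)
    treated_odds = control_odds * odds_ratio
    return treated_odds / (1.0 + treated_odds)


# Hypothetical values: the same relative effect in every centre,
# but different baseline (control-group) risks per centre.
ODDS_RATIO = 1.2
baseline_risks = [0.10, 0.25, 0.50]

# Absolute effect (risk difference) per centre.
risk_differences = [
    risk_under_treatment(p, ODDS_RATIO) - p for p in baseline_risks
]

for p, rd in zip(baseline_risks, risk_differences):
    print(f"baseline risk {p:.2f}: risk difference {rd:.4f}")
```

With these assumed inputs the risk difference grows from under 2 percentage points at a baseline risk of 0.10 to about 4.5 percentage points at a baseline risk of 0.50, even though the odds ratio is the same in each centre.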
The demonstration of hetero- or homogeneity in treatment effects by country or centre in a single study is conceptually the same as the demonstration of hetero- or homogeneity in a meta-analysis. The CRASH trial could be seen as a prospective meta-analysis of 40 trials in 40 different countries. A simple way of showing the heterogeneity in treatment effects would be to present the results in a meta-analysis forest plot and test for heterogeneity. This was done for the CRASH trial (data not shown), and likewise did not indicate heterogeneity.
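One common way to test for heterogeneity in such a meta-analysis is Cochran's Q with the derived I² statistic. The sketch below is a minimal illustration of that generic technique, not necessarily the test used for the CRASH data; the function name and the per-country log odds ratios and standard errors are hypothetical.

```python
def cochran_q(estimates, std_errors):
    """Fixed-effect pooled estimate, Cochran's Q, degrees of freedom and I^2.

    estimates  : per-country effect estimates (e.g. log odds ratios)
    std_errors : their standard errors
    """
    # Inverse-variance weights
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, df, i_squared


# Hypothetical log odds ratios and standard errors for four countries
log_odds_ratios = [0.15, 0.20, 0.18, 0.25]
standard_errors = [0.10, 0.12, 0.08, 0.15]

pooled, q, df, i_squared = cochran_q(log_odds_ratios, standard_errors)
print(f"pooled log OR = {pooled:.3f}, Q = {q:.3f} on {df} df, I^2 = {i_squared:.2f}")
```

Under the null hypothesis of homogeneity, Q is compared with a chi-square distribution on `df` degrees of freedom. With many small strata the test has limited power, so a non-significant Q is weak evidence of homogeneity on its own.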
Our finding that between-centre differences were not explained by patient characteristics corresponds to previous studies in TBI [8]. Part of the between-centre differences were actually between-country differences. This could be an indication that centre differences are caused by structural differences between countries, such as availability of resources and organisation of trauma care. The exact explanation of outcome differences between centres and countries requires further study.
Our study has some limitations. First, we did not consider differences in data quality between the centres, which might affect the estimated treatment effect [7]. Second, the CRASH trial might be considered an exception in the sense that the treatment was harmful. However, it is unlikely that our results would depend on the direction of the treatment effect.