These trials might not have been so necessary if better use had been made of existing evidence. Analyses suggested selection bias in observational studies of use of hormone replacement therapy,5
and this was supported by many similar analyses. Also, in 1980 the Coronary Drug Project, a double blind secondary coronary prevention trial among men, had shown that compliers to placebo had a highly significant reduced relative risk of death from coronary heart disease over five years (0.64) compared with men who did not comply.6
This protection remained even after 40 baseline variables had been adjusted for, and remains difficult to explain. Those who take drugs are compliers, and thus compliance bias could be important. None the less, enthusiasts for hormone replacement therapy thought these doubts had little relevance because most cohort studies suggested such large benefits.
We remained concerned about the validity of the results. Therefore, well before the publication of the Women's Health Initiative trial results, we retrieved and analysed the available randomised studies of the short term efficacy of various hormone replacement therapies.7
Many of these studies were used to provide evidence for licensing. We chose randomised studies with a non-hormonal control, three or more months of treatment, and mention of adverse events, including cardiovascular episodes by allocation.
Twenty three trials met the criteria and included a total of around 2000 women allocated to hormone replacement therapy and 1300 to control treatment. A higher proportion of the women taking hormone replacement therapy had cardiovascular events than women in the control groups. Crude estimates put the relative risk at around 1.39 for cardiovascular outcomes and 1.64 for outcomes including thromboembolic events, neither of which differed significantly from 1. However, when tested against a hypothesised true relative risk of 0.5 for use of hormone replacement therapy, both estimates were significantly different. This suggested that hormone replacement therapy was not as protective as the observational data had shown.
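The logic of testing an estimate against a protective value rather than against no effect can be illustrated with a minimal sketch. It assumes a simple Wald test on the log of a crude relative risk, which may not be the method used in the original analysis. The group sizes are the approximate totals given above; the event counts (15 and 7) are hypothetical, chosen only so that the crude ratio lands near 1.39, and are not the trial data. The function name `rr_test` is likewise an illustration.

```python
import math

def rr_test(events_trt, n_trt, events_ctl, n_ctl, rr_null=1.0):
    """Wald z test of a crude relative risk against a hypothesised value."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Approximate standard error of log(RR) for two independent proportions
    se = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    z = (math.log(rr) - math.log(rr_null)) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value
    return rr, z, p

# HYPOTHETICAL event counts (not from the trials); group sizes are approximate
rr, z_one, p_one = rr_test(15, 2000, 7, 1300, rr_null=1.0)
_, z_half, p_half = rr_test(15, 2000, 7, 1300, rr_null=0.5)

print(f"crude RR = {rr:.2f}")
print(f"versus RR = 1.0: z = {z_one:.2f}, p = {p_one:.3f}")
print(f"versus RR = 0.5: z = {z_half:.2f}, p = {p_half:.3f}")
```

With so few events the estimate is too imprecise to exclude a relative risk of 1, yet it can still lie far enough from 0.5 to rule out a halving of risk.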
When we published these findings in 1997, we were ridiculed.8
“For one, I shall continue to tell my patients that hormone replacement therapy is likely to help prevent coronary disease,” asserted one expert commentator.9
Critics claimed that the choice of trials was selective, the quality of trials was inadequate, and the follow up too short. Against all the observational evidence, these results just looked wrong.
We sought to improve the methods by seeking unpublished randomised licensing data using the same criteria as for published data. We were able to obtain data in Finland, ultimately by resorting to the High Court, which rejected objections from the companies to the Ministry of Health.10
Apparently obtaining such data would not be possible in the United Kingdom (Michael Rawlins, personal communication 2003).11
When the extra data from the six unpublished studies that met our criteria were added, the pooled relative risk for cardiovascular events increased to 1.78. We tested this against protective relative risk values of 0.7 and 0.5, and it differed significantly from both. The evidence now hinted at publication bias; the relative risk in the unpublished trials was around 4.25 for cardiovascular events. Altogether 29 of the 200 existing trials (15%) provided useful information; 30% of the 200 trials had reasonable controls, but only 4% properly recorded cardiovascular events. We often had to retrieve this information from data sheets.
Our 1997 results agreed well with those of the Women's Health Initiative primary prevention trial, which reported in 2002 an overall relative risk for coronary heart disease with current use of hormone replacement therapy of 1.29 (95% confidence interval 1.07 to 1.85). Beral et al's overview of primary prevention trials and secondary prevention trials,12
which omitted the small trials we had used, estimated the effect of hormone replacement therapy to be 1.11 (0.96 to 1.30). Since the relative risk of coronary heart disease in the first year of the Women's Health Initiative study was also 1.78 (later revised to 1.81),13 our results can no longer be accused of being systematically biased because of the low proportion of satisfactory trials we were able to include.
Efficacy of new drugs has to be proved in randomised trials
Recording of rare adverse events is currently haphazard and unreliable in efficacy trials
Many of these trials are not in the public research domain
Systematic synthesis of trials with reliable recording of adverse events would enable earlier detection of unexpected effects
Regulators should require drug manufacturers to record adverse effects and make the results public