To assess the effect of interpretive bias in trials we hand searched studies published in the BMJ from 2002 to 2006 for trials that showed a non-significant difference in the opposite direction to that hypothesised. Two researchers (CEH, NM) identified, by agreement, the papers with a P value above 0.05 and below 0.3 on the primary outcome. Our choice of limits for the P value was arbitrary. It was driven by our decision to identify trials with an unexpected difference that could be important but might not have reached statistical significance because of a lack of statistical power (type II error).
The decision to use a P value of 0.05 or a 95% confidence interval to determine statistical significance is arbitrary but widely accepted.4
Ideally, we should judge the findings of a study not only on its statistical significance but in terms of its relative harms and benefits. Statistical significance is important, however, to guide us in the interpretation of a study’s results.
We found 17 papers where there was a difference between the two groups and this difference had a P value of between 0.05 and 0.30. Of these 17 trials, seven (table 1) showed differences in the opposite direction to that specified by the hypothesis.
Table 1 Trials with negative non-significant results published in BMJ, 2002-6
We calculated three confidence intervals for each identified trial: 95%, 67%, and 51%. We chose 67% because this interval is about half as wide as the 95% interval (that is, the z value for the 67% confidence interval is about half the z value for the 95% interval) and 51% because this range shows where, more often than not, the true treatment estimate will lie. Not every value within a confidence interval is equally plausible. Values that are close to the point estimate are more likely to correspond to the true value than values towards the extremes of the interval.
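The relation between the three levels can be checked with a short calculation. The following is a minimal sketch, assuming a normal approximation for the treatment estimate; the function names `z_for_level` and `confidence_interval` are illustrative and do not come from the trials or from our analysis.

```python
from statistics import NormalDist


def z_for_level(level: float) -> float:
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    # A central interval with coverage `level` leaves (1 - level) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + level / 2)


def confidence_interval(estimate: float, std_error: float, level: float) -> tuple[float, float]:
    """Normal-approximation interval: estimate +/- z * standard error."""
    z = z_for_level(level)
    return estimate - z * std_error, estimate + z * std_error


if __name__ == "__main__":
    for level in (0.95, 0.67, 0.51):
        print(f"{level:.0%} interval: z = {z_for_level(level):.2f}")
```

Under this approximation the z values are about 1.96, 0.97, and 0.69, so the 67% interval is roughly half as wide as the 95% interval.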
We used the information in the box in each paper entitled “What this study adds” to determine whether the authors recommended the intervention. We then assessed the data in the paper and used the three confidence intervals to make our own recommendation. In four studies the authors seemed to recommend that the intervention should or could be used (table 2). We disagreed with this conclusion for three of these studies and were unsure about the fourth, as discussed below.
Table 2 Interpretation of trials with non-significant negative results published in BMJ
Sex education programme for 13-15 year olds
Twenty five schools in Scotland were randomised to receive either normal sex education or an enhanced package.5
The trial was powered to show a 33% reduction in termination rates and had over 99% follow-up after 4.5 years. The intervention schools had an increase of 15.7 terminations per 1000 compared with the control schools (P=0.26). Although the 95% confidence interval did not exclude an 11% decrease in terminations, it also included a 42% increase. The 67% confidence interval did not pass through zero, so on balance the intervention was more likely to be associated with an increase in terminations than a decrease. The cost of the intervention was up to 45 times greater than that of usual sex education.
To support use of the intervention the authors refer to an earlier report that “pupils and teachers preferred the SHARE programme . . . It also increased pupils’ knowledge of sexual health . . . and had a small but beneficial effect on beliefs about alternatives to sexual intercourse and intentions to resist unwanted sexual activities and to discuss condoms with partners.” Although the authors admit that the programme “was not more effective than conventional provision,” they do not discuss the possibility that the increase in termination rates might be real and that the programme should be withdrawn until further research supported its implementation. Indeed, the Scottish Executive supports its use in Scottish schools.
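When a report gives only a point estimate and a P value, an interval at another level can be approximated by working backwards to the standard error. The sketch below assumes a two-sided P value from a normal test of no difference; the helper `interval_from_p` is hypothetical, and the output is a rough check rather than a reproduction of the trial's own analysis.

```python
from statistics import NormalDist


def interval_from_p(estimate: float, p_value: float, level: float) -> tuple[float, float]:
    """Approximate interval from an estimate and its two-sided P value.

    Assumes the P value comes from a normal test of "no difference",
    so that |estimate| / standard error equals the z statistic.
    """
    normal = NormalDist()
    z_observed = normal.inv_cdf(1 - p_value / 2)  # z statistic implied by the P value
    std_error = abs(estimate) / z_observed
    z_level = normal.inv_cdf(0.5 + level / 2)     # critical value for the requested level
    return estimate - z_level * std_error, estimate + z_level * std_error


# Figures quoted above for the sex education trial:
# 15.7 more terminations per 1000 in the intervention schools, P=0.26.
low67, high67 = interval_from_p(15.7, 0.26, 0.67)
low95, high95 = interval_from_p(15.7, 0.26, 0.95)
print(f"67% interval: {low67:.1f} to {high67:.1f} per 1000")
print(f"95% interval: {low95:.1f} to {high95:.1f} per 1000")
```

On these assumptions the approximate 67% interval lies wholly above zero, whereas the 95% interval spans it, which matches the pattern described for this trial.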
Providing free child safety equipment to prevent injuries
A total of 3428 families were randomised to provide 80% power to show a 10% reduction in medically attended injuries.9
Free safety equipment was offered to families living in deprived areas along with advice from health visitors. Data on injuries attended in primary care were available for more than 80% of participants, and data on injuries attended in secondary care for more than 92%. There was an increased risk of medically attended injuries in the intervention group (P=0.08). The 67% confidence interval suggested that, on balance, the true effect is most likely an increase in the risk of injuries. The intervention is associated with increased cost and increased risk.
Despite this, the authors seem to use proxy measures of outcome as justification for the intervention: “Our findings in relation to safety practices and degrees of satisfaction are encouraging for safety equipment schemes such as those organised by SureStart.” The authors also note that it was unlikely that the intervention would not reduce injury rates because “several observational studies have shown a lower risk of injury among people with a range of safety practices.” Observational studies are potentially biased, which is one of the main reasons we do randomised trials. It is, therefore, surprising to seek reassurance from non-randomised data when a randomised trial shows the “wrong” result. The authors suggest that bias could have been introduced because of differential raised parental awareness, although they acknowledge that the intervention could have increased injury through the process of risk compensation.
Oral misoprostol for induction of labour
In this trial, 741 pregnant women with an indication for prostaglandin induction of labour were randomised to oral misoprostol or vaginal dinoprostone gel.10
The trial was powered to show a 30% difference in vaginal birth after 24 hours. Follow-up was 100% in both groups, and adherence to allocated treatment was greater than 99%. In the oral misoprostol group, 46% of women did not achieve a vaginal birth within 24 hours compared with 41% of the vaginal dinoprostone group. The 95% confidence interval suggested that, at best, the intervention could be associated with an improvement corresponding to a relative risk of 0.95.
The authors stated that there was no difference between the two treatments but that women preferred oral treatment. However, the 67% confidence interval excluded no difference, suggesting that oral treatment increased the risk of delayed vaginal birth. We could not make a definite recommendation because the risk of caesarean section was reduced in the intervention group (relative risk 0.82, P=0.13), and the 67% confidence interval (0.73 to 0.91) for this outcome favours the intervention.
Lidocaine spray to reduce pain during vaginal delivery
This trial randomised 185 women to receive a topically applied anaesthetic spray or placebo.11
The primary outcome was pain during delivery. Follow-up was 100% at delivery. Pain on delivery was 4.8 points higher in the intervention group, although the 95% confidence interval suggested that the spray could reduce pain by 1.7 points or increase it by 11.2 points. The 67% interval suggested that the true difference was an increase in pain. An adjusted analysis suggested a bigger difference in pain scores. Therefore, this intervention should not be used.
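The narrower intervals can be approximated from a published 95% interval alone. The sketch below assumes a roughly symmetric, normal-approximation interval and uses the unadjusted figures quoted above for the lidocaine trial; the function `rescale_interval` is an illustrative helper, not part of any trial's analysis.

```python
from statistics import NormalDist


def rescale_interval(estimate: float, lower95: float, upper95: float,
                     level: float) -> tuple[float, float]:
    """Approximate an interval at `level` from a reported 95% interval."""
    normal = NormalDist()
    z95 = normal.inv_cdf(0.975)
    # Half the width of the 95% interval is z95 standard errors.
    std_error = (upper95 - lower95) / (2 * z95)
    z_level = normal.inv_cdf(0.5 + level / 2)
    return estimate - z_level * std_error, estimate + z_level * std_error


# Difference in pain score of 4.8 points, 95% interval -1.7 to 11.2.
low67, high67 = rescale_interval(4.8, -1.7, 11.2, 0.67)
low51, high51 = rescale_interval(4.8, -1.7, 11.2, 0.51)
print(f"67% interval: {low67:.1f} to {high67:.1f} points")
print(f"51% interval: {low51:.1f} to {high51:.1f} points")
```

Under these assumptions both narrower intervals lie entirely above zero, consistent with the 67% interval pointing to an increase in pain.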