We identified 70 parallel group randomised trials that received ethics approval in 1994-5 and were subsequently published (table 2).9
The median publication year was 1999 (range 1995-2003). Fifty two of the trials evaluated drugs, and 56 were funded in full or in part by industry. Most trials involved two study arms (n=47), multiple centres (46), and some form of blinding (49). The median achieved sample size per study arm was 66 (10th-90th centile range 13-324). The most common specialty fields were endocrinology (n=11), anaesthesiology (5), cardiology (5), infectious diseases (5), and oncology (5).
Table 2 Characteristics of published parallel group randomised trials
Sixty nine trials were designed and reported as superiority trials. One trial was stated to be an equivalence trial in the protocol but reported as a superiority trial in the publication; no explanation was given for the change.
Sample size calculation
Overall, only 11 trials fully and consistently reported all of the requisite components of the sample size calculation in both the protocol and the publication.
Completeness of reporting—An a priori sample size calculation was reported for 62 trials; 28 were described only in the protocol and 34 in both the protocol and the publication. Thirty seven protocols and 21 publications reported all of the components of the sample size calculation (figure). Individual components were reported in 74-100% of protocols and 48-75% of publications (table 1). Nine protocols provided only the calculated sample size without any further details about the calculation. Among trials that reported an estimated minimum clinically important effect size (delta), 20/53 protocols and 10/33 publications stated the basis on which the figure was derived.
Reporting of sample size calculations and data analyses in publications compared with protocols
Comparison of calculated and actual sample sizes—Sixty two trials provided a calculated sample size in the protocol. Of these, 30 subsequently recruited a sample size within 10% of the calculated figure from the protocol; 22 trials randomised at least 10% fewer participants than planned as a result of early stopping (n=3), poor recruitment (2), and unspecified reasons (17); and 10 trials randomised at least 10% more participants than planned as a result of lower than anticipated average age (1), a higher than expected recruitment rate (1), and unspecified reasons (8). A calculated sample size was as likely to be reported accurately in the publication if there was a discrepancy with the actual sample size compared with no discrepancy (11/32 v 14/30).
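The 10% threshold used above to group trials can be sketched as a small classification rule. This is only an illustration: the function name, the integer percentage parameter, and the example trial sizes are hypothetical and are not taken from the study data.

```python
def classify_recruitment(planned: int, actual: int, tolerance_pct: int = 10) -> str:
    """Return 'under', 'over', or 'within' for an achieved sample size.

    Integer arithmetic is used so that the exact 10% boundary is not
    affected by floating point rounding.
    """
    if planned <= 0 or actual < 0:
        raise ValueError("planned must be positive and actual non-negative")
    if 100 * actual <= (100 - tolerance_pct) * planned:
        return "under"   # at least tolerance_pct% fewer participants than planned
    if 100 * actual >= (100 + tolerance_pct) * planned:
        return "over"    # at least tolerance_pct% more participants than planned
    return "within"      # within tolerance_pct% of the calculated figure


# Hypothetical trials as (planned, actual) pairs; not data from the cohort.
examples = [(200, 150), (200, 195), (200, 240)]
labels = [classify_recruitment(p, a) for p, a in examples]
print(labels)
```

Applied to the 62 trials with a calculated sample size in the protocol, a rule of this kind would yield the three groups reported above (30 within, 22 under, and 10 over).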
Discrepancies between publications and protocols—Both the publications and the protocols for 34 trials described a sample size calculation. Overall, we noted discrepancies in at least one component of the published sample size calculation when compared with the protocol for 18 trials (figure). Publications for eight trials reported components that had not been pre-specified in the protocol, and 16 had explicit discrepancies between information contained in the publication and protocol (table 3, box 2). None of the publications mentioned any amendments to the original sample size calculation.
Table 3 Discrepancies in sample size calculations reported in trial publications compared with protocols
Box 2 Anonymised examples of unacknowledged discrepancies in sample size calculations and statistical analyses reported in publications compared with protocols
Sample size calculation
Changed delta (1)
- Outcome: disease progression or death rate
- Protocol: delta 10%; event rates unspecified
- Publication: delta 6%; event rates 16% and 10%
Changed delta (2)
- Outcome: mean number of active joints
- Protocol: delta 2.5 joints
- Publication: delta 5 joints
Changed standard deviation
- Outcome: mean symptom score
- Protocol: 1.4
- Publication: 0.49
- Outcome: survival without disease progression
- Protocol: 90%
- Publication: 80%
Changed sample size estimate
- Outcome: thromboembolic complication rate
- Protocol: 2200
- Publication: 1500
Statistical analyses
Changed primary outcome analysis
- Outcome: global disease assessment
- Protocol: χ² test
- Publication: analysis of covariance
New subgroups added to publication
- Outcome: time to progression or death
- Protocol: baseline disease severity
- Publication: duration of previous treatment*, type of previous treatment*, blood count*, disease severity
Omitted covariates for adjusted analysis in publication
- Outcome: neurological score at six months
- Protocol: baseline neurological score, pupil reaction, age, CT scan classification, shock, haemorrhage
- Publication: no adjusted analysis reported
*Described explicitly as pre-specified despite not appearing in the protocol
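To illustrate why an unacknowledged change in delta or standard deviation matters, the sketch below applies the standard normal approximation for comparing two means, n per arm = 2(z(1-α/2) + z(1-β))²(SD/delta)². The formula is a textbook approximation and not necessarily the method used in any of the trials. The delta values of 2.5 and 5 joints come from the box; the standard deviation of 5 joints, the two sided α of 0.05, and the power of 80% are assumptions made purely for illustration.

```python
import math
from statistics import NormalDist


def n_per_arm(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants per arm for a two sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile corresponding to the power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


ASSUMED_SD = 5.0  # hypothetical; the box does not report a standard deviation for this trial

n_protocol = n_per_arm(delta=2.5, sd=ASSUMED_SD)     # delta stated in the protocol
n_publication = n_per_arm(delta=5.0, sd=ASSUMED_SD)  # delta stated in the publication
print(n_protocol, n_publication)
```

Under these assumptions, doubling delta reduces the required sample size to about a quarter, because n is proportional to 1/delta². By the same formula, a standard deviation of 0.49 rather than 1.4 would reduce the required sample size by a factor of about 8, since (1.4/0.49)² is roughly 8.2.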
The specific method of handling protocol deviations in the primary statistical analysis (as defined in box 1) was named or described in 37 protocols and 43 publications (figure). Overall, the primary method described for handling protocol deviations in the publication differed from that described in the protocol for 19/43 trials; table 4 provides details. None of these discrepancies was acknowledged in the journal publication.
Table 4 Discrepancies in primary method of handling protocol deviations, as reported in publications compared with protocols
Thirty protocols and 33 publications used the term “intention to treat” analysis and applied a variety of definitions (table 5). Few of these protocols (n=7) and publications (3) made it explicit whether study participants were analysed in the group to which they were originally randomised. Most protocols (22) and publications (18) incorrectly excluded participants from the intention to treat analysis for reasons other than loss to follow-up (table 5).
Table 5 Definitions of “intention to treat” analysis used in protocols and publications
The method of handling missing data was described in only 16 protocols and 49 publications (figure). Methods reported in publications differed from the protocol for 39/49 trials. Published methods were often not pre-specified in the protocol (38/49). For one trial, the protocol stipulated that missing data would be counted as failures, whereas in the publication they were excluded from the analysis.
Primary outcome analysis and overall number of tests
Fifty four trials designated at least one outcome measure as primary in the protocol (n=49) or publication (43). The statistical method for analysing the primary outcome measure was described in 39 protocols and 42 publications. Overall, 25 publications that described the statistical test for primary outcome measures differed from the protocol (figure, box 2).
The median number of between group statistical tests defined in 44 protocols was 30 (10th-90th centile range 8-218); the other 26 protocols contained insufficient statistical detail. Publications for all 70 trials reported a median of 22 (8-71) tests. Half of the protocols (n=36) and publications (34) did not define whether hypothesis testing was one or two sided. Interestingly, we found one neurology trial that used two sided P values in one publication (all P values >0.1) and a one sided P value in another (P=0.028).
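The consequence of running this many tests can be sketched with a simple familywise error calculation. The code assumes independent tests, each at a significance level of 0.05; neither assumption is stated in the protocols or publications, and real trial outcomes are usually correlated, so the figures are indicative only. The second function shows how a one sided P value relates to a two sided one for a symmetric test statistic when the observed effect lies in the hypothesised direction.

```python
def familywise_error(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def two_sided_from_one_sided(p_one_sided: float) -> float:
    """Two sided P value implied by a one sided P value (symmetric statistic,
    effect in the hypothesised direction)."""
    return min(1.0, 2 * p_one_sided)


# Median numbers of tests reported above: 30 in protocols, 22 in publications.
for n_tests in (30, 22):
    print(n_tests, round(familywise_error(n_tests), 3))

# A one sided P value of 0.028 under the stated assumptions.
print(round(two_sided_from_one_sided(0.028), 3))
```

Under these assumptions, the chance of at least one spurious significant result is roughly two in three for 22 tests and nearly four in five for 30 tests, and a one sided P value of 0.028 would correspond to a two sided value above the conventional 0.05 threshold.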
Overall, 25 trials described subgroup analyses in the protocol (n=13) or publication (20). All had discrepancies between the two documents (figure, box 2). Twelve of the trials with protocol specified analyses reported only some (n=7) or none (5) in the publication. Nineteen of the trials with published subgroup analyses reported at least one that was not pre-specified in the protocol. Protocols for 12 of these trials specified no subgroup analyses, whereas seven specified some but not all of the published analyses. Only seven publications explicitly stated whether the analyses were defined a priori; four of these trials claimed that the subgroup analyses were pre-specified even though they did not appear in the protocol.
Overall, 28 trials described adjusted analyses in the protocol (n=18) or publication (18). Of these, 23 had discrepancies between the two documents (figure, box 2). Twelve of the trials with protocol specified covariates reported no adjustment (n=10) or omitted at least one pre-specified covariate (2) from the published analysis. Twelve of the trials with published adjusted analyses used covariates that were not pre-specified in the protocol. Ten of these trials did not mention any adjusted analysis in the protocol, whereas two trials added new covariates to those specified in the protocol. Publications for only one trial explicitly stated whether the covariates were defined a priori.
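The comparisons described above for subgroup and adjusted analyses amount to set differences between what the protocol specified and what the publication reported. The sketch below illustrates this with the subgroup example from box 2. The function name is hypothetical, and the labels were normalised by hand on the assumption that "baseline disease severity" in the protocol and "disease severity" in the publication refer to the same subgroup.

```python
from __future__ import annotations


def compare_prespecified(protocol: set[str], publication: set[str]) -> dict[str, set[str]]:
    """Split analyses into those omitted from, added to, or consistent with the protocol."""
    return {
        "omitted": protocol - publication,     # pre-specified but not published
        "added": publication - protocol,       # published but not pre-specified
        "consistent": protocol & publication,  # present in both documents
    }


# Subgroup variables from the box 2 example, with labels normalised by hand.
protocol_subgroups = {"disease severity"}
published_subgroups = {
    "duration of previous treatment",
    "type of previous treatment",
    "blood count",
    "disease severity",
}

result = compare_prespecified(protocol_subgroups, published_subgroups)
print(sorted(result["added"]))
```

A trial would count as having a discrepancy whenever either the "omitted" or the "added" set is non-empty.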
Interim analyses and data monitoring boards
Interim analyses were described in 13 protocols, but reported in only five corresponding publications. An additional two trials reported interim analyses in the publications, despite the protocol explicitly stating that there would be none. A data monitoring board was described in 12 protocols but in only five of the corresponding publications.