Figure 1 summarises selection of articles. The electronic search yielded 1070 citations, including 281 reports of parallel group superiority randomised controlled trials with two arms. We selected 215 articles (appendix 2) that reported only one primary outcome.
Fig 1 Study screening process
Description of trials
Table 1 describes the characteristics of the included studies. The median sample size of the trials was 425 (interquartile range [IQR] 158-1041), and 112 reports (52.1%) claimed significant results for the primary endpoint. Seventy-six percent were multicentre trials, with a median of 23 centres (IQR 7-59). The three most frequent medical areas of investigation were cardiovascular diseases (26%; n=56 articles), infectious diseases (11%; n=24), and haematology and oncology (10%; n=22). Interobserver agreement in extracting the data from reports was good; κ coefficients ranged from 0.76 to 1.00.
Table 1 Characteristics of 215 included studies
Reporting of required parameters for a priori sample size calculation
Ten articles (5%) did not report any sample size calculation. Only 113 (53%) reported all the required parameters for the calculation. Table 2 describes the reporting of necessary parameters for sample size calculation.
Table 2 Reporting of parameters required for a priori sample size calculation for the 215 articles
The median of the expected treatment effect for dichotomous or time to event outcomes (relative difference of event rates) was 33.3% (IQR 24.8-50.0) and the median of the expected effect size for continuous outcomes was 0.53 (0.40-0.69) (fig 2).
Fig 2 Histogram of assumptions of treatment effect. For dichotomous and time to event outcomes: relative difference of event rates (larger rate minus smaller rate, divided by rate in control group). For continuous outcomes: standardised effect size.
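The two measures of assumed treatment effect defined in the caption of fig 2 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the input values are invented, and the standardised effect size is assumed to be the usual difference in means divided by the standard deviation.

```python
def relative_difference_of_event_rates(rate_control: float, rate_treatment: float) -> float:
    """Larger rate minus smaller rate, divided by the rate in the control group."""
    if rate_control <= 0:
        raise ValueError("control group rate must be positive")
    larger = max(rate_control, rate_treatment)
    smaller = min(rate_control, rate_treatment)
    return (larger - smaller) / rate_control


def standardised_effect_size(mean_difference: float, standard_deviation: float) -> float:
    """Assumed definition: absolute difference in means divided by the standard deviation."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return abs(mean_difference) / standard_deviation


# Invented example: an assumed event rate of 30% in the control group and 20% with treatment.
print(round(relative_difference_of_event_rates(0.30, 0.20), 3))
# Invented example: an assumed mean difference of 5.3 units with a standard deviation of 10.
print(standardised_effect_size(5.3, 10.0))
```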
The design of 35 of the 215 trials (16%) was described elsewhere. In two of these, the primary outcome described in the report differed from that in the design article. In 31 of the 35 design articles (89%), the data for the sample size calculation were given. For 16 of these 31 (52%), the assumptions reported in the trial report differed from those in the design article.
Reporting of sample size calculation in online trial registration database
Of the 215 selected articles, 113 (53%) reported registration of the trial in an online database. Among them, 87 (77%) were registered in ClinicalTrials.gov, 23 (20%) in controlled-trials.com (ISRCTN registry), and three (3%) in another database. An expected sample size was given in the online database for 96 articles (85%); in 46 of these (48%) it was equal to the target sample size reported in the article. The relative difference between the registered and reported sample size was greater than 10% in 18 articles (19%) and greater than 20% in five articles (5%). The parameters for the sample size calculation were not stated in the online registration databases for any of the trials.
Replication of sample size calculation
We were able to replicate sample size calculations for 164 articles: 113 reported all the required parameters, and 51 omitted only the α risk or whether the test was one or two tailed. We were able to compare our recalculated sample size and the target sample size for 157 articles, since seven did not report any target sample size. The recalculated sample size was equal to the authors’ target sample size for 27 articles (17%) and close to it (absolute value of the difference <5%) for 76 (48%). The absolute value of the difference between the replicated sample size calculation and the authors’ target sample size was greater than 10% for 47 articles (30%) and greater than 50% for 10 (6%). Twenty-eight recalculations (18%) were more than 10% lower than the reported sample size, and 19 recalculations (12%) were more than 10% larger than the reported sample size (fig 3). The results were similar when we analysed only the 113 articles reporting all the required parameters.
Fig 3 Differences between target sample size and replicated sample size calculations. Differences in sample size calculations are relative differences between target sample size given in materials and methods section of articles and our recalculation.
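To make the replication step concrete, the sketch below recalculates a sample size for a dichotomous outcome and compares it with a reported target. It rests on several assumptions not stated in this section: the standard normal approximation formula with unpooled variance, defaults of a two tailed α of 0.05 and 80% power, and a relative difference expressed relative to the reported target. The function names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p_control: float, p_treatment: float,
                                alpha: float = 0.05, power: float = 0.80,
                                two_sided: bool = True) -> int:
    """Per-group sample size for comparing two proportions (normal approximation)."""
    if p_control == p_treatment:
        raise ValueError("the two event rates must differ")
    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)   # critical value for the type I error
    z_beta = NormalDist().inv_cdf(power)       # critical value for the type II error
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2
    return math.ceil(n)


def relative_difference(target_n: int, recalculated_n: int) -> float:
    """Assumed convention: (recalculated - target) / target."""
    return (recalculated_n - target_n) / target_n


# Invented example: assumed event rates of 30% (control) and 20% (treatment).
n_group = n_per_group_two_proportions(0.30, 0.20)   # 291 per group
recalculated_total = 2 * n_group                    # 582 in total
# Invented reported target of 600 participants.
print(n_group, recalculated_total, relative_difference(600, recalculated_total))
```

Under these assumptions the recalculation is 3% lower than the invented target and would count as close (absolute difference below 5%) in the classification used above.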
Comparisons between a priori parameters and corresponding estimates in results section
A comparison between the a priori assumptions and observed data was feasible for 145 of the 157 articles reporting enough parameters to recalculate the sample size and reporting the results of the authors’ calculations.
Assumptions about control group
The median relative difference between the control group pre-specified parameters and their estimates was 3.3% (IQR −16.7 to 21.4). The median difference was 2.0% (−15 to 21) for dichotomous or time to event outcomes and 11% (−24 to 27) for continuous outcomes. The absolute value of the relative difference was greater than 30% for 45 articles (31%) and greater than 50% for 24 (17%). Figure 4 shows that the differences between the assumptions and the results were large and small in roughly even proportions, whether the results were significant or not. The size of the trial and the differences between the assumptions for the control group and the results did not seem to be substantially related (rho=0.03, 95% confidence interval −0.05 to 0.15).
Fig 4 Relative differences between assumptions and results for control groups
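The comparison between a priori assumptions for the control group and the observed estimates can be sketched in the same way. The sign convention (observed minus assumed, divided by assumed) is an assumption, since this section does not state the direction of the difference; the function names and values are hypothetical, and the 30% and 50% thresholds are those reported above.

```python
def control_relative_difference(assumed: float, observed: float) -> float:
    """Assumed convention: (observed - assumed) / assumed."""
    if assumed == 0:
        raise ValueError("assumed value must be non-zero")
    return (observed - assumed) / assumed


def discrepancy_band(assumed: float, observed: float) -> str:
    """Classify the absolute relative difference using the thresholds in the text."""
    size = abs(control_relative_difference(assumed, observed))
    if size > 0.50:
        return "greater than 50%"
    if size > 0.30:
        return "greater than 30%"
    return "30% or less"


# Invented example: an assumed control event rate of 40% and an observed rate of 25%.
print(control_relative_difference(0.40, 0.25), discrepancy_band(0.40, 0.25))
```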
Overall, 73 articles (34%) reported enough parameters for us to replicate the sample size calculation, had an accurate calculation (the replicated sample size calculation differed by less than 10% from the reported target sample size), and had accurate assumptions for the control group (the difference between the a priori assumptions and their estimates was less than 30%) (fig 5).
Fig 5 Articles selected for analysis of sample size calculations
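The three conditions combined in the overall figure can be expressed as a single check per article. This is an illustrative sketch only: the `Article` record and its field names are hypothetical, missing values stand for parameters that were not reported, and the 10% and 30% thresholds are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    target_n: Optional[int]             # target sample size reported by the authors
    recalculated_n: Optional[int]       # our replication, if enough parameters were reported
    assumed_control: Optional[float]    # a priori assumption for the control group
    observed_control: Optional[float]   # corresponding estimate in the results section


def meets_all_criteria(article: Article) -> bool:
    """Replicable, accurate calculation (<10%), and accurate control assumption (<30%)."""
    values = (article.target_n, article.recalculated_n,
              article.assumed_control, article.observed_control)
    if any(v is None for v in values):
        return False
    calculation_gap = abs(article.recalculated_n - article.target_n) / article.target_n
    assumption_gap = (abs(article.observed_control - article.assumed_control)
                      / abs(article.assumed_control))
    return calculation_gap < 0.10 and assumption_gap < 0.30


# Invented examples.
print(meets_all_criteria(Article(600, 582, 0.40, 0.35)))
print(meets_all_criteria(Article(600, 400, 0.40, 0.35)))
```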