We recommend alternatives to the power-based approach for choosing and justifying sample sizes for early studies of new ideas. An established alternative in the statistical literature is the “value of information” approach (4), but this has rarely been used in practice. It calculates both the cost and value that can be expected over a range of sample sizes, choosing the one that maximizes value minus cost. The methods, however, may be too complex for frequent use with early studies.
A more recent proposal is to examine nine possibilities for the estimated effect and confidence interval that a study may produce (1). These possibilities result from three possible effect sizes (the one that is expected or hoped for, no effect, and a possibility in between) crossed with three levels of background uncertainty (best guess, high uncertainty, and low uncertainty). The proposal then discusses how valuable the study would be under each possibility.
Another recently developed approach focuses on costs and diminishing returns to select sample sizes that cannot be validly criticized as “inadequate”, because they produce more projected value per dollar spent than any larger sample size (2). Although the projected value of a study is difficult to quantify, specific projections are not needed for these methods: they rely only on the fact that projected value has diminishing marginal returns, which holds true regardless of how projected value is quantified (2). For early studies, statistical arguments identify such a sample size, called nroot. It is determined by examining the projected total cost of the study at different possible sample sizes and then choosing the sample size that minimizes the ratio of total cost to the square root of the sample size. A spreadsheet performing the calculations is available as a supplemental file for reference (1). The calculations are particularly simple if the total study cost is the sum of fixed costs, which do not depend on sample size, plus a set cost per subject. In this linear cost case, nroot is equal to the total fixed study costs divided by the incremental cost per subject.
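As an illustration, the rule can be checked numerically. The sketch below, using hypothetical fixed and per-subject costs, searches for the sample size that minimizes total cost divided by the square root of the sample size, and compares the result with the closed-form answer for the linear cost case (fixed costs divided by incremental cost per subject).

```python
import math

def nroot_by_search(total_cost, n_max=10_000):
    """Sample size minimizing total_cost(n) / sqrt(n) over n = 1..n_max."""
    return min(range(1, n_max + 1), key=lambda n: total_cost(n) / math.sqrt(n))

# Hypothetical linear cost structure: fixed costs plus a set cost per subject.
fixed_cost = 50_000      # total fixed study costs (dollars)
cost_per_subject = 500   # incremental cost per subject (dollars)

nroot_search = nroot_by_search(lambda n: fixed_cost + cost_per_subject * n)
nroot_formula = fixed_cost / cost_per_subject  # closed form for linear costs

print(nroot_search, nroot_formula)  # both methods give nroot = 100
```

The search works for any cost function, linear or not, which is useful when per-subject costs change with scale (for example, volume discounts on assays); the closed form applies only in the linear case.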
Planning a sample size with nroot adapts appropriately to the cost concerns of innovative studies. Higher incremental costs per subject reduce nroot, reflecting the practical reality that very high per-subject costs make smaller sample sizes more desirable. When costs are extreme, nroot may be very small (even N=1), but in such cases larger sample sizes will often be impractical anyway. Reviewers and funders must then evaluate whether a small and expensive study is worthwhile.
While we recommend using nroot for innovative studies, sample sizes larger than nroot might also be justifiable, especially when the idea is promising enough that the money spent to increase sample size above nroot produces added projected value that exceeds the added expense (22). Large studies may be practical for some innovative ideas, and we advocate more tolerance regarding sample size rather than a new form of knee-jerk intolerance directed at highly innovative studies with large sample sizes.
Finally, although the preceding approaches are valid and worthy of consideration, investigators may simply choose a sample size that has worked well for similar past studies. This approach lacks a theoretical justification but appeals to common sense. Importantly, such choices will generally seem reasonable and be affordable, making them a suitable solution in many cases. These common-sense sample size determinations are familiar to most who do these types of studies, but they can be difficult to justify to reviewers and regulators looking for an approach that seems more rigorous and objective.