Funding agencies, ethics review boards, journals, and investigators are often preoccupied with power calculations and sample sizes required in clinical trials. We argue that the current practice of sample size justification for randomized clinical trials (RCTs) represents a willing self-deception. Recognizing and adjusting to current realities of RCT conduct may be necessary.
In the high-income nations where most RCTs are organized and funded, chronic diseases are responsible for most morbidity and mortality. In these conditions, multiple pathogenic and behavioral mechanisms determine outcome. Thus, we can anticipate only small to moderate treatment effects from therapies that address only one or two mechanisms. Furthermore, events often occur over a prolonged period of time.
Investigators organizing clinical trials therefore face daunting obstacles. Providing definitive answers in the face of low event rates and small-to-moderate treatment effects necessitates sample sizes in the thousands or tens of thousands. Organizing trials that will enroll such samples involves enormous challenges, as does monitoring the quality of enrollment and data collection once the trials begin. Funding for such mega-trials is very limited, and is often restricted to industry sources.
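The arithmetic behind these sample sizes can be sketched with the standard normal-approximation formula for comparing two proportions. The sketch below is illustrative only: the 10% control event rate, the two-sided alpha of 0.05, the 80% power, and the function name `n_per_group` are assumptions chosen for the example, not figures from any trial discussed here.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(control_rate, rrr, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect a relative risk reduction.

    Uses the normal approximation for two independent proportions:
    n = (z_alpha/2 + z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    p1 = control_rate                      # assumed event rate in the control arm
    p2 = control_rate * (1.0 - rrr)        # event rate implied by the assumed effect
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


# Hypothetical scenario: 10% of control patients have an event.
for rrr in (0.30, 0.20, 0.15, 0.10):
    print(f"RRR {rrr:.0%}: about {n_per_group(0.10, rrr):,} patients per arm")
```

Under these assumed inputs, a 15% relative risk reduction calls for nearly 6,000 patients per arm, and a 10% reduction for more than 13,000 per arm. Lower event rates push the numbers higher still.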
Even very large trials often produce results that are far from definitive. For instance, the CAPRIE trial (Clopidogrel versus Aspirin in Patients at Risk of Ischemic Events) addressed the relative merits of these two drugs in over 19,000 patients with atherosclerotic vascular disease. The confidence interval around the statistically significant reduction in vascular events with clopidogrel included a lower boundary (0.3% relative risk reduction) that would preclude administration of this expensive drug [1]. Furthermore, the results of even very large trials may prove discrepant with one another [2].
Thus, it is seldom that single trials, even very large ones, provide definitive answers. The scientific community has appropriately accepted that only systematic reviews and meta-analyses combining high-quality evidence from many RCTs will yield robust answers. Individual trials are best viewed as providing important information that contributes to the larger body of evidence.
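How pooling sharpens an estimate can be sketched with a fixed-effect, inverse-variance meta-analysis of log risk ratios. Everything in the example is assumed for illustration: the five trials in `TRIALS` are invented counts, not data from any real study, and the helper names are hypothetical. A fixed-effect model also assumes a single common treatment effect; a random-effects model would be the usual alternative when trials are heterogeneous.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# HYPOTHETICAL trials, invented for illustration only:
# (events_treated, n_treated, events_control, n_control)
TRIALS = [
    (40, 500, 50, 500),
    (33, 400, 40, 400),
    (52, 600, 60, 600),
    (25, 300, 32, 300),
    (42, 500, 49, 500),
]


def log_rr_and_variance(a, n1, c, n2):
    """Log risk ratio and its large-sample variance for one trial."""
    return log((a / n1) / (c / n2)), 1 / a - 1 / n1 + 1 / c - 1 / n2


def ci_ratio(log_estimate, variance, level=0.95):
    """Confidence interval on the risk-ratio scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sqrt(variance)
    return exp(log_estimate - half_width), exp(log_estimate + half_width)


def pool_fixed_effect(trials):
    """Inverse-variance weighted mean of log risk ratios and its variance."""
    weighted_sum = 0.0
    total_weight = 0.0
    for trial in trials:
        estimate, variance = log_rr_and_variance(*trial)
        weighted_sum += estimate / variance   # weight each trial by 1/variance
        total_weight += 1.0 / variance
    return weighted_sum / total_weight, 1.0 / total_weight


for trial in TRIALS:
    est, var = log_rr_and_variance(*trial)
    low, high = ci_ratio(est, var)
    print(f"single trial: RR {exp(est):.2f} (95% CI {low:.2f} to {high:.2f})")

pooled_est, pooled_var = pool_fixed_effect(TRIALS)
low, high = ci_ratio(pooled_est, pooled_var)
print(f"pooled:       RR {exp(pooled_est):.2f} (95% CI {low:.2f} to {high:.2f})")
```

Because the pooled variance is the reciprocal of the summed weights, it is always smaller than the variance of any single contributing trial, which is the sense in which each trial adds to the larger body of evidence.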
The clinical trial community responds to these problems with a variety of understandable, pragmatic strategies that nevertheless involve a degree of denial. Investigators typically decide how many patients they can feasibly enroll and then find ways of making assumptions that will justify embarking on a trial with a feasible sample size. These assumptions typically involve choosing a level of delta (the threshold effect below which they are ready to accept a false negative result) that exceeds the minimum effect many patients would consider important.
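The effect of choosing delta to fit a feasible sample can be illustrated by running the calculation in the other direction: fix the number of patients and ask what power the trial has for different assumed effects. This is a rough sketch using the normal approximation for two proportions; the 1,000 patients per arm, the 10% control event rate, and the function name `approx_power` are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_arm, control_rate, rrr, alpha=0.05):
    """Approximate power of a two-arm trial for an assumed relative risk reduction.

    Normal approximation; the negligible probability of a significant
    result in the wrong direction is ignored.
    """
    p1 = control_rate
    p2 = control_rate * (1.0 - rrr)
    standard_error = sqrt((p1 * (1.0 - p1) + p2 * (1.0 - p2)) / n_per_arm)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(abs(p1 - p2) / standard_error - z_alpha)


# Hypothetical feasible trial: 1,000 patients per arm, 10% control event rate.
for rrr in (0.15, 0.25, 0.40):
    print(f"assumed RRR {rrr:.0%}: power about {approx_power(1000, 0.10, rrr):.0%}")
```

With these assumed inputs, the trial looks adequately powered only if the investigators posit a relative risk reduction of around 40%. For a 15% reduction, which many patients might still consider important, power is well under 30%.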
Other popular strategies are even more problematic. Investigators choose composite endpoints whose components vary widely in their importance to patients, creating a high risk of misleading interpretation [3]. Investigators focus on outcomes that are more frequent, but less important: for instance, in patients with diabetes, crossing a threshold of serum glucose, or earlier need for a second medication, rather than complications of illness such as major vascular events, neuropathy, or visual impairment.
Perhaps an even more damaging consequence of the unrealistic insistence that individual trials be powered to produce definitive results is that RCTs that would contribute to the body of knowledge are never undertaken. It is unclear how many potential trialists abandon conduct of a trial when they confront its sample size implications, or when they face demands from funding agencies and review committees regarding the sample size they must generate. Our experience in the world of clinical investigation suggests, however, that a large number of potential trials get abandoned. The result is that questions that could ultimately be resolved by a systematic review and meta-analysis remain unanswered, or inadequately answered [4].
How can we resolve this dilemma? Peer-review granting agencies should cease to ask the question “Is this trial powered to definitively answer its primary question?” Rather, they should ask a series of more appropriate questions. How important is the issue the investigators propose to address? Are other groups throughout the world investigating the same or similar questions? How much trial funding is the agency willing to provide? Within the limits of what the agency is willing to fund, have the investigators gone to appropriate lengths to arrange collaborations that will maximize the number of patients they can enroll?
One could question whether it is ethical to enroll patients in a trial that makes no pretense of definitively answering a question. Indeed, some have asserted that underpowered trials are unethical [5]. But is it not ethical to contribute to a body of knowledge that will ultimately lead to a definitive answer? Is it not unethical to tell patients that a trial will be definitive when that is very unlikely? Is it not more ethical to provide patients with a realistic notion of the contribution of the trial in which they may participate: that it will be one of a number of such studies that will ultimately resolve the issue?
What will result from the conceptual shift we propose? First, agencies and investigators will undertake more RCTs, and evidence for important questions will accumulate more quickly. Second, investigators will be less tempted to stretch their resources and capacities for quality control, and the validity of RCTs will improve. Third, when we abandon the current self-deception surrounding sample size justification, our minds will open to new strategies for efficiently obtaining crucial evidence on important health issues.