HHS Public Access author manuscript; accepted for publication in a peer-reviewed journal.
 
Clin Trials. Author manuscript; available in PMC 2011 August 1.
Published in final edited form as:
PMCID: PMC3085081
NIHMSID: NIHMS288455

Calculating Sample Size in Trials Using Historical Controls

Song Zhang
Department of Clinical Sciences UT Southwestern Medical Center Dallas, TX
Jing Cao
Department of Statistical Science Southern Methodist University Dallas, TX

Abstract

Background

Makuch and Simon [1] developed a sample size formula for historical control trials (the M-S approach). When assessing power, they assumed the true control treatment effect to be equal to the observed effect from the historical control group. Many researchers have pointed out that the M-S approach does not preserve the nominal power and type I error once the uncertainty in the true historical control treatment effect is taken into account.

Purpose

To develop a sample size formula that properly accounts for the underlying randomness in the observations from the historical control group.

Methods

We reveal the extremely skewed nature in the distributions of power and type I error, obtained over all the random realizations of the historical control data. The skewness motivates us to derive a sample size formula that controls the percentiles, instead of the means, of the power and type I error.

Results

A closed-form sample size formula is developed to control arbitrary percentiles of power and type I error for historical control trials. A simulation study further demonstrates that this approach preserves the operational characteristics in a more realistic scenario where the population variances are unknown and replaced by sample variances.

Limitations

The closed-form sample size formula is derived for continuous outcomes. The formula is more complicated for binary or survival time outcomes.

Conclusions

We have derived a closed-form sample size formula that controls the percentiles instead of means of power and type I error in historical control trials, which have extremely skewed distributions over all the possible realizations of historical control data.

Keywords: clinical trial design, sample size, historical controls, percentiles of type I error and power

1 Introduction

Randomized clinical trials (RCTs) have become the gold standard for comparing treatment effects. Despite their rigorous scientific basis, there are situations where RCTs are infeasible due to concerns of ethics, patient preference, cost, and regulatory acceptability. For example, the resources required by an RCT might be prohibitive for some phase II trials, which are intended to obtain preliminary data on the effectiveness of a new treatment [2]. As another example, when evidence already exists showing the superiority of a new treatment over the standard one, it might be unethical for an RCT to assign patients to a potentially inferior treatment. One solution is to use a historical control trial (HCT), where the experimental therapy is compared with a control therapy (referred to as historical control or HC) that has been evaluated in a previously conducted trial. Because an HCT can be smaller in size and easier to conduct, it has been widely applied in clinical research [3, 4, 5, 6, 7, 8, 9].

Makuch and Simon [1] developed a sample size formula for HCTs with a binary outcome. In power calculation, they assumed that the observed response rate from the HC group was the true control response rate. Their formula was based on the two-sample test statistic employed in RCTs but the power calculation only accounted for the sampling variability in the experimental group. The sample size solution was obtained through a numerical search. Using a similar idea, Dixon and Simon [10] provided a sample size formula for HCTs with exponential survival outcomes. Chang et al. [11] presented a two-stage design for phase II clinical trials with HC and continuous outcomes. More discussions about the HCT sample size calculation can be found in [12] and [13].

The estimated sample size for an HCT is usually much smaller than that required by an RCT. Lee and Tseng [14] pointed out that the sample size reduction in an HCT is largely unjustified because it rests on the strong assumption that the observed historical control response rate is equal to the true control response rate. They proposed a uniform power method to control the expected power, taking into account the uncertainty in the HC response rate. The resulting sample size is closer to the RCT sample size than the one based on Makuch and Simon's method (M-S). Korn and Freidlin [15] compared three approaches to HCT design: the M-S approach, the RCT approach, and the one-sample approach (based on a one-sample test that the experimental treatment effect is greater than the observed HC treatment effect). The authors suggested adopting the RCT approach because it preserves the unconditional power over the random HC observations.

In this study, we investigate the sample size calculation for HCTs with continuous outcomes, accounting for the uncertainty caused by the unknown true HC treatment effect. We provide a unified framework for the M-S, RCT and one-sample approaches, where they are shown to control either the mean or the median of the random power and type I error, obtained over all the possible realizations of the HC data given the true HC effect. We further demonstrate through simulation that the distributions of power and type I error are extremely skewed. This extreme skewness leads to undesirable properties of sample sizes calculated to control the means of power and type I error. One revealing example in our simulation is that with the mean power controlled at 0.8, a slight decrease in the mean type I error from 0.06 to 0.05 leads to a drastic increase in sample size from 286 to 487. This observation motivates us to develop a sample size formula that controls the percentiles, instead of the means, of the random power and type I error. To our knowledge, this is the first study in HCT design to demonstrate the extreme skewness in the distributions of power and type I error, and to estimate sample size based on the percentiles of power and type I error. It provides researchers a sensible way to assess the risk in an HCT. The proposed formula has a closed form, which can be easily computed using a scientific calculator.

The rest of this paper is organized as follows. In Section 2 we review the three different approaches (M-S, RCT, and one-sample) to sample size calculation in HCT under a unified framework. A simulation study is conducted to demonstrate the extreme skewness in the distributions of power and type I error. In Section 3 we present a sample size formula to control arbitrary percentiles of power and type I error. We evaluate its performance through simulation in two scenarios: an ideal scenario (population variances known) and a more realistic scenario (population variances unknown). In Section 4 we provide a real application of the proposed method. The final section is devoted to discussion.

2 A Unified Framework

We briefly review the M-S, RCT, and one-sample approaches to HCT sample size calculation. Suppose that a clinical trial compares the outcomes of an experimental group with those of an HC group, and that the outcome variable is continuous and normally distributed. Let $Y_j \sim N(\theta_0, \sigma_0^2)$, $j = 1, \ldots, m$, be the m observations from the HC group, and $X_i \sim N(\theta_1, \sigma_1^2)$, $i = 1, \ldots, n$, be the n observations from the experimental group. We define $Y = \{Y_1, \ldots, Y_m\}$ and $X = \{X_1, \ldots, X_n\}$. The variances $\sigma_0^2$ and $\sigma_1^2$ are assumed to be known. With null hypothesis $H_0: \theta_1 = \theta_0$ and alternative hypothesis $H_1: \theta_1 > \theta_0$, the standard test statistic is

$$Z(X, Y) = \frac{\bar{X} - \bar{Y}}{\sqrt{\sigma_1^2/n + \sigma_0^2/m}},$$
(1)

where $\bar{X}$ and $\bar{Y}$ are the sample means from the two groups. Given type I error α, power 1 − β, m, $\sigma_0^2$, $\sigma_1^2$, and difference in treatment effects Δ = θ1 − θ0, the sample size n is obtained by solving

$$P\{Z(X, Y) > z_{1-\alpha} \mid H_1\} = 1 - \beta,$$
(2)

where $z_{1-\alpha}$ is the 100(1 − α)th percentile of the standard normal distribution. Here and in the rest of the paper we do not distinguish between the estimated sample size and the solution to the sample size equation, with the understanding that the final sample size is the smallest integer greater than or equal to the solution.

In the M-S approach, the $Y_j$'s from the HC group are considered not subject to sampling variability because they have been observed before the clinical trial. This consideration leads to the following manipulation of (2),

$$P\{Z(X, Y) > z_{1-\alpha} \mid \bar{Y}, H_1\} = \Phi\left(\frac{\theta_0 + \Delta - \bar{Y} - z_{1-\alpha}\sqrt{\sigma_1^2/n + \sigma_0^2/m}}{\sigma_1/\sqrt{n}}\right),$$
(3)

where Φ(·) is the cumulative distribution function of the standard normal distribution. Thus we find n by solving

$$\Phi\left(\frac{\theta_0 + \Delta - \bar{Y} - z_{1-\alpha}\sqrt{\sigma_1^2/n + \sigma_0^2/m}}{\sigma_1/\sqrt{n}}\right) = 1 - \beta.$$
(4)

Since the true HC effect (θ0) is usually unknown, it is cancelled out of the equation by assuming $\bar{Y} = \theta_0$. This is a strong assumption, especially when the number of HC observations is limited. Traditionally Equation (4) has been solved through a numerical search [11]. Here we present a closed-form solution and a necessary and sufficient condition for its existence.

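Theorem 1. Define $a = z_{1-\alpha}^2\sigma_0^2/m - \Delta^2$, $b = 2\Delta\sigma_1 z_{1-\beta}$, and $c = \sigma_1^2(z_{1-\alpha}^2 - z_{1-\beta}^2)$. In clinical trials we usually specify α and β such that α ≤ β. Equation (4) has a unique sample size solution if and only if $\Delta > z_{1-\alpha}\sigma_0/\sqrt{m}$, and the solution is

$$n_0 = \left(\frac{-b - \sqrt{b^2 - 4ac}}{2a}\right)^2.$$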
(5)

Proof. See Appendix A.1.

Theorem 1 helps researchers avoid a time-consuming numerical search. It also points out a potential pitfall in the M-S approach, where too small an assumed difference in the treatment effects ($\Delta \le z_{1-\alpha}\sigma_0/\sqrt{m}$) would lead to no solution for the sample size.
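The closed form can be checked against a direct search. The following is a minimal sketch, not the authors' code: it assumes the quadratic is written in r = √n with coefficients a = z²(1−α)σ0²/m − Δ², b = 2Δσ1·z(1−β) and c = σ1²(z²(1−α) − z²(1−β)), and the function names are hypothetical. The brute-force routine simply increases n until the M-S power (computed with the HC sample mean taken as the true mean) reaches 1 − β.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: quantiles and CDF


def ms_sample_size(delta, sigma0, sigma1, m, alpha=0.05, beta=0.2):
    """Closed-form M-S sample size; returns None when no solution exists."""
    z_a = STD_NORMAL.inv_cdf(1 - alpha)
    z_b = STD_NORMAL.inv_cdf(1 - beta)
    if delta <= z_a * sigma0 / math.sqrt(m):
        return None  # assumed difference is too small for any n to work
    # Coefficients of a*r^2 + b*r + c = 0 with r = sqrt(n) (assumed notation).
    a = z_a**2 * sigma0**2 / m - delta**2
    b = 2 * delta * sigma1 * z_b
    c = sigma1**2 * (z_a**2 - z_b**2)
    r = (-b - math.sqrt(b * b - 4 * a * c)) / (2 * a)  # the positive root
    return math.ceil(r * r)


def ms_power(n, delta, sigma0, sigma1, m, alpha=0.05):
    """M-S power: two-sample cutoff, HC sample mean treated as the true mean."""
    z_a = STD_NORMAL.inv_cdf(1 - alpha)
    se = math.sqrt(sigma1**2 / n + sigma0**2 / m)
    return STD_NORMAL.cdf((delta - z_a * se) / (sigma1 / math.sqrt(n)))


def ms_sample_size_search(delta, sigma0, sigma1, m, alpha=0.05, beta=0.2,
                          n_max=10**6):
    """Brute-force check: smallest n whose M-S power reaches 1 - beta."""
    for n in range(1, n_max + 1):
        if ms_power(n, delta, sigma0, sigma1, m, alpha) >= 1 - beta:
            return n
    return None


print(ms_sample_size(0.3, 1.0, 1.0, 80))         # 144
print(ms_sample_size_search(0.3, 1.0, 1.0, 80))  # 144
print(ms_sample_size(0.15, 1.0, 1.0, 80))        # None: delta is too small
```

With m = 80, unit variances and Δ = 0.3, both routes give 144, and Δ = 0.15 falls below the threshold z(0.95)/√80 ≈ 0.184, so no sample size exists.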

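In the one-sample approach, the hypotheses are specified as $H_0: \theta_1 = \bar{Y}$ and $H_1: \theta_1 > \bar{Y}$ based on the assumption that $\theta_0 = \bar{Y}$. The one-sample test statistic $Z_1(X, Y) = (\bar{X} - \bar{Y})/(\sigma_1/\sqrt{n})$ is employed. The sample size estimate is $n_1 = \sigma_1^2 (z_{1-\alpha} + z_{1-\beta})^2/\Delta^2$.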

In the RCT approach, the HC group is treated as a regular control arm in an RCT. The sample size is estimated by $n_2 = \sigma_1^2 / \{\Delta^2/(z_{1-\alpha} + z_{1-\beta})^2 - \sigma_0^2/m\}$, which is based on a two-sample test.
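The step from the two-sample test to the RCT sample size can be written out as follows. This is a sketch under the stated model (normal outcomes, known variances), treating both sample means as random; it is a reconstruction for illustration rather than a quotation of the original derivation.

```latex
\begin{aligned}
1 - \beta
  &= P\{Z(X, Y) > z_{1-\alpha} \mid \theta_1 - \theta_0 = \Delta\}
   = \Phi\!\left(\frac{\Delta}{\sqrt{\sigma_1^2/n + \sigma_0^2/m}} - z_{1-\alpha}\right) \\
  &\Longrightarrow\quad
   \sqrt{\sigma_1^2/n + \sigma_0^2/m} = \frac{\Delta}{z_{1-\alpha} + z_{1-\beta}} \\
  &\Longrightarrow\quad
   n_2 = \frac{\sigma_1^2}{\Delta^2/(z_{1-\alpha} + z_{1-\beta})^2 - \sigma_0^2/m}.
\end{aligned}
```

A solution exists only when the denominator is positive, that is, when Δ exceeds $(z_{1-\alpha} + z_{1-\beta})\sigma_0/\sqrt{m}$.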

The three approaches produce drastically different sample size estimates. For example, given m = 80, $\sigma_0^2 = \sigma_1^2 = 1$, Δ = 0.3, α = 0.05, and 1 − β = 0.8, the estimated sample sizes are n0 = 144, n1 = 69, and n2 = 487, respectively. The formulas of n0, n1 and n2 do not depend on the specific values of the HC sample mean ($\bar{Y}$) or the true mean (θ0). The test statistics, however, are calculated based on the HC sample mean. In a particular study, the unknown difference between $\bar{Y}$ and θ0 has a great impact on the realized power and type I error. We conduct Simulation 1 to compare the performance of n0, n1 and n2. Details of the simulation algorithm are presented in Appendix A.2.
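The one-sample and RCT figures quoted above can be reproduced in a few lines. This is a minimal sketch with hypothetical function names; it assumes the one-sample size is σ1²(z(1−α) + z(1−β))²/Δ² and the RCT size is σ1² / {Δ²/(z(1−α) + z(1−β))² − σ0²/m}, both rounded up to the next integer.

```python
import math
from statistics import NormalDist

Q = NormalDist().inv_cdf  # standard normal quantile function


def one_sample_size(delta, sigma1, alpha=0.05, beta=0.2):
    """One-sample approach: the HC sample mean is treated as a fixed target."""
    return math.ceil(sigma1**2 * (Q(1 - alpha) + Q(1 - beta))**2 / delta**2)


def rct_size(delta, sigma0, sigma1, m, alpha=0.05, beta=0.2):
    """RCT approach: the HC group is treated as a regular control arm."""
    denom = delta**2 / (Q(1 - alpha) + Q(1 - beta))**2 - sigma0**2 / m
    if denom <= 0:
        return None  # the HC arm is too small for any n to reach the power
    return math.ceil(sigma1**2 / denom)


print(one_sample_size(0.3, 1.0))     # 69
print(rct_size(0.3, 1.0, 1.0, 80))   # 487
```

The gap between 69 and 487 comes entirely from the σ0²/m term: the one-sample approach ignores the sampling variability of the HC mean, while the RCT approach charges for all of it.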

The realization of a particular HC data set (Y(k)) leads to a conditional power ($q_v^{(k)}$) and a conditional type I error ($h_v^{(k)}$) under sample size nv. Over the random realizations of Y, we have a random power, qv, and a random type I error, hv. The distributions of qv and hv provide a global view of the variability in power and type I error for HCTs given an unknown HC treatment effect.

Without loss of generality, we set the true HC effect at θ0 = 0. We also assume m = 80, $\sigma_0^2 = \sigma_1^2 = 1$, Δ = 0.3, α = 0.05, and 1 − β = 0.8. Figure 1 shows the results of Simulation 1. The graphs in the first column indicate that both the conditional power and type I error decrease monotonically as the difference between the observed and true HC treatment effects, $\bar{Y}^{(k)} - \theta_0$, increases. Table 1 lists the conditional powers and type I errors given Y(k), with $\bar{Y}^{(k)}$ ranging from two standard errors below to two standard errors above θ0. For example, under n0, when the observed HC effect ($\bar{Y}^{(k)}$) is one standard error ($\sigma_0/\sqrt{m}$) away from the true effect (θ0), the conditional power ranges from 0.313 to 0.987 and the conditional type I error from 0 to 0.088, which deviate far from the nominal levels of 1 − β = 0.8 and α = 0.05. Note that such a deviation is not a rare event because for a particular HC data set, there is a 32% chance that the sample mean is one standard error or further away from its true mean. The second and third columns of Figure 1 show that the distributions of type I error and power are extremely skewed, which is also revealed by the difference between their means and medians (achieved at $\bar{Y}^{(k)} = \theta_0$) in Table 1.

Figure 1
The type I errors ($h_v^{(k)}$) and powers ($q_v^{(k)}$) under n0, n1, and n2. The first column plots $h_v^{(k)}$ and $q_v^{(k)}$ versus the difference between the observed and true HC effects, $\bar{Y}^{(k)} - \theta_0$, with black for equation M161 and red for equation M162. The second and third columns plot the histograms of hv and qv, respectively ...
Table 1
Simulation 1, Conditional Type I Errors and Powers Given Y(k)

We briefly explain why the power and type I error have skewed distributions over the random realizations of the HC data. Taking the one-sample approach (n1) as an example, for a particular HC data set Y(k), it can be shown that the conditional type I error is

$$h_1^{(k)} = 1 - \Phi\left(z_{1-\alpha} + \frac{\bar{Y}^{(k)} - \theta_0}{\sigma_1/\sqrt{n_1}}\right).$$

In the parentheses, $z_{1-\alpha}$ is usually the dominant term and shifts the probability computation to the tail area of the normal distribution. As a result, although the sample mean ($\bar{Y}^{(k)}$) is symmetric around the true mean (θ0), the impact of the sample mean being greater or smaller than the true mean is different. For $\bar{Y}^{(k)} > \theta_0$, the conditional type I errors have a range of (0, α), which is narrow under commonly specified significance levels. On the other hand, for $\bar{Y}^{(k)} < \theta_0$, the conditional type I errors have a much wider range, (α, 1). In summary, it is because researchers usually set power and significance level in the tail area (i.e., α close to 0 and 1 − β close to 1) that the random power and type I error have skewed distributions.
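A short worked calculation makes the asymmetry concrete. It assumes the Simulation 1 setting (m = 80, unit variances, α = 0.05, n1 = 69) and the one-sample conditional type I error 1 − Φ(z(1−α) + (Ȳ − θ0)/(σ1/√n1)); with equal variances, a shift of one HC standard error corresponds to √(n1/m) ≈ 0.929 on the z scale. The rounded values are illustrative.

```latex
\begin{aligned}
\bar{Y} = \theta_0 + \sigma_0/\sqrt{m}: \quad
  & 1 - \Phi\!\left(z_{0.95} + \sqrt{n_1/m}\right)
    = 1 - \Phi(1.645 + 0.929) \approx 0.005, \\
\bar{Y} = \theta_0 - \sigma_0/\sqrt{m}: \quad
  & 1 - \Phi\!\left(z_{0.95} - \sqrt{n_1/m}\right)
    = 1 - \Phi(1.645 - 0.929) \approx 0.237.
\end{aligned}
```

Two equally likely deviations of the HC sample mean therefore move the type I error by about 0.045 in one direction and about 0.19 in the other, which is the source of the right skew.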

Finally, Table 1 provides empirical evidence for the theory presented in Theorem 2, which states a unified framework for nv (v = 0, 1, 2).

Theorem 2. The sample sizes (n0, n1, n2) control the random power and type I error in such a way that

  1. The M-S approach (n0) controls the mean of type I error at α and the median of power at 1 − β;
  2. The one-sample approach (n1) controls the medians of type I error and power at α and 1 − β, respectively;
  3. The RCT approach (n2) controls the means of type I error and power at α and 1 − β, respectively.

Proof. See Appendix A.3.

Theorem 2 suggests that the M-S approach tries to reach a compromise between the one-sample and RCT approaches by controlling the mean type I error at α and the median power at 1 − β.
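Theorem 2 can be checked numerically without simulating individual patients. The sketch below is an illustration, not the paper's Simulation 1: it assumes the Simulation 1 setting, uses closed-form conditional type I error and power given d = Ȳ − θ0 (derived from the normal model), and draws only the HC sample mean. All names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, median

STD = NormalDist()
M, SIGMA0, SIGMA1, DELTA, ALPHA = 80, 1.0, 1.0, 0.3, 0.05
Z_A = STD.inv_cdf(1 - ALPHA)


def two_sample(d, n):
    """Conditional (type I error, power) of Z(X, Y) given d = Ybar - theta0."""
    crit = Z_A * math.sqrt(SIGMA1**2 / n + SIGMA0**2 / M)  # cutoff for Xbar - Ybar
    scale = math.sqrt(n) / SIGMA1
    return 1 - STD.cdf((crit + d) * scale), STD.cdf((DELTA - d - crit) * scale)


def one_sample(d, n):
    """Conditional (type I error, power) of the one-sample statistic given d."""
    scale = math.sqrt(n) / SIGMA1
    return 1 - STD.cdf(Z_A + d * scale), STD.cdf((DELTA - d) * scale - Z_A)


random.seed(2010)
# Random realizations of the HC sample mean around the true mean.
draws = [random.gauss(0.0, SIGMA0 / math.sqrt(M)) for _ in range(20000)]

results = {}
for label, rule, n in [("n0", two_sample, 144),
                       ("n1", one_sample, 69),
                       ("n2", two_sample, 487)]:
    h, q = zip(*(rule(d, n) for d in draws))
    results[label] = {"mean_h": fmean(h), "median_h": median(h),
                      "mean_q": fmean(q), "median_q": median(q)}
    print(label, {k: round(v, 3) for k, v in results[label].items()})
```

Up to Monte Carlo error, the mean type I error under n0 and n2 and the median type I error under n1 should sit near 0.05, while the median power under n0 and n1 and the mean power under n2 should sit near 0.8; the remaining summaries drift far from the nominal levels.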

3 Sample Size Controlling the Percentiles

Simulation 1 shows that the distributions of power and type I error, observed over all the random realizations of HC data, are extremely skewed. For random variables with extremely skewed distributions, making decisions based on a location parameter such as a percentile is usually more desirable than based on the mean. We propose a sample size formula to control arbitrary percentiles of the random power and type I error. It provides a more sensible way to assess the risk in HCTs.

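Theorem 3. Suppose in an HCT the goal is to control the $(1 - p_q)$th percentile of the power at 1 − β, and the $p_h$th percentile of the type I error at α. Then the required sample size is

$$n^* = \frac{\sigma_1^2 (z_{1-\alpha} + z_{1-\beta})^2}{\left\{\Delta - (z_{p_h} + z_{p_q})\sigma_0/\sqrt{m}\right\}^2},$$
(6)

and the null hypothesis is rejected if

$$Z^*(X, Y) = \frac{\bar{X} - (\bar{Y} + z_{p_h}\sigma_0/\sqrt{m})}{\sigma_1/\sqrt{n^*}} > z_{1-\alpha}.$$

The parameters $p_q$ and $p_h$ can be specified arbitrarily as long as the condition $\Delta > (z_{p_h} + z_{p_q})\sigma_0/\sqrt{m}$ holds.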

Proof. See Appendix A.4.

According to Theorem 3, let q* and h* be the random power and type I error under sample size n*. Then we have $P(q^* > 1 - \beta) = p_q$ and $P(h^* < \alpha) = p_h$. Suppose an HCT is conducted to assess the effectiveness of a new drug. Given a particular HC data set, the realized power and type I error depend on the random difference between the HC sample mean ($\bar{Y}$) and its true mean (θ0). However, if the researchers enroll n* subjects, they can be confident that, over all the possible HC realizations, the power of the clinical trial would be greater than 1 − β with probability $p_q$, and the type I error would be smaller than α with probability $p_h$. It is easy to check that the sample size under the one-sample approach (n1) is a special case of n* at $p_q = p_h = 0.5$.

In other words, we propose sample size n* so that the operational characteristics (realized power and type I error) of an HCT are more desirable than the nominal levels with certain pre-specified probabilities (pq and ph). Controlling the power and type I error by percentiles instead of means (the RCT approach) is more effective when their distributions are extremely skewed. Furthermore, the arbitrarily specified pq and ph provide flexibility in accommodating researchers' preference for risk control.
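The percentile-based sample size is straightforward to compute. The sketch below is an illustration with a hypothetical function name; it assumes n* = σ1²(z(1−α) + z(1−β))² / {Δ − (z(ph) + z(pq))σ0/√m}², rounded up, and returns None when the condition Δ > (z(ph) + z(pq))σ0/√m fails.

```python
import math
from statistics import NormalDist

Q = NormalDist().inv_cdf  # standard normal quantile function


def percentile_sample_size(delta, sigma0, sigma1, m, p_h, p_q,
                           alpha=0.05, beta=0.2):
    """Sample size controlling percentiles of type I error (p_h) and power (p_q)."""
    shift = (Q(p_h) + Q(p_q)) * sigma0 / math.sqrt(m)
    if delta <= shift:
        return None  # the required percentiles cannot be met for any n
    return math.ceil(sigma1**2 * (Q(1 - alpha) + Q(1 - beta))**2
                     / (delta - shift)**2)


print(percentile_sample_size(0.3, 1.0, 1.0, 80, p_h=0.8, p_q=0.7))  # 286
print(percentile_sample_size(0.3, 1.0, 1.0, 80, p_h=0.7, p_q=0.8))  # 286
print(percentile_sample_size(0.3, 1.0, 1.0, 80, p_h=0.5, p_q=0.5))  # 69
```

Because ph and pq enter only through the sum of their normal quantiles, swapping them leaves the sample size unchanged, and setting both to 0.5 recovers the one-sample size.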

Based on the same setting as in Simulation 1, we conduct Simulation 2 to explore the properties of n*. We consider different combinations of pq and ph, ranging from 0.5 to 0.9. Table 2 lists the estimated sample size n*, the empirical means of type I error and power, equation M45 and equation M46, and the empirical percentiles, equation M47 and equation M48. From Table 2 we have two observations. First, for sample size calculation, pq and ph are exchangeable in the sense that switching their values leads to the identical sample size, which is also clear from Equation (6). Second, when the distributions of type I error and power are extremely skewed, it is more sensible to control the percentiles instead of means. For example, under (ph = 0.8, pq = 0.7), we are confident that by enrolling 286 patients, the type I error would be smaller than 0.05 with probability 0.8, and the power would be greater than 0.8 with probability 0.7. It provides a high assurance for researchers. Note that in this case the mean power is 0.8 but the mean type I error is 0.06, slightly higher than the nominal α = 0.05. As demonstrated in Table 1, the required sample size is n2 = 487 (under the RCT approach) if we control the mean power and mean type I error at 0.8 and 0.05, respectively. That is, in order to reduce the mean type I error from 0.06 to 0.05, researchers need to enroll 201 additional patients, due to the skewness in the type I error distribution.

Table 2
Simulation 2, Type I Errors and Powers under n*

In Simulations 1 and 2, we have assumed the population variances of the HC and experimental groups ($\sigma_0^2$ and $\sigma_1^2$) to be known, which is usually unrealistic in practice. We conduct Simulation 3 to further assess the performance of n* in a more realistic scenario. It proceeds as follows: a) To compute the required sample size n*, the assumed Δ and $\sigma_1^2$ are plugged into Equation (6). However, $\sigma_0^2$ is replaced by $s_0^2$, the HC sample variance. b) In hypothesis testing, we compute the test statistic Z*(X, Y) with $\sigma_0^2$ replaced by $s_0^2$, and $\sigma_1^2$ replaced by $s_1^2$, the sample variance in the experimental group. Thus the estimated sample size includes additional uncertainty from $s_0^2$, and the test statistic includes additional uncertainty from $s_0^2$ and $s_1^2$. The detailed algorithm of Simulation 3 is presented in Appendix A.5.

Table 3 lists the results of Simulation 3. The sample size n* becomes random when we replace the HC population variance ($\sigma_0^2$) with a random sample variance ($s_0^2$). When pq = ph = 0.5, the impact of the additional randomness is negligible. That is, after applying the integer restriction on sample size, the calculated n* is constant at 69, the same as its counterpart in Table 2. As pq or ph increases, the standard deviation of n* increases, and the mean of n* deviates from the fixed sample size (in Table 2) computed under the population variance. This is because under large pq and ph, the distribution of n* is heavily skewed to the right. Such skewness also arises from the tail behavior of the normal distribution. For example, under (pq = 0.9, ph = 0.8), the random sample size has a mean of 3065.59 and a standard deviation of 29348.36. Its distribution has an extremely long tail (the 99th percentile is greater than 200000). The skewness is much more severe under (pq = ph = 0.9), where we omit the simulation due to computer overflow. On the other hand, the operational characteristics of the clinical trial remain unchanged with additional randomness from the sample variances. Specifically, the realized controlling percentiles, equation M63 and equation M64, agree with the nominal levels. Furthermore, the means of power and type I error are close to those (in Table 2) obtained under the population variances. Taken together, the proposed sample size n* successfully controls the percentiles of power and type I error even when the population variances are unknown.

Table 3
Simulation 3, Type I Errors and Powers under n*

4 Example

The safety and efficacy of laparoscopic rectopexy for rectal prolapse will be compared with those of the open rectopexy procedure, which was conducted several months ago by the same group of surgeons at the same institution [16]. Data will be collected prospectively for the laparoscopic rectopexy group and by hospital chart review for the HC group. The HC group includes 24 consecutive patients who had undergone conventional open rectopexy without having concomitant gynecologic procedures. These patients required an average of 71.5 milligrams of morphine during the first 48 hours after the procedure, with a standard deviation of 45.9 milligrams. It is expected that the average amount of morphine needed during the first 48 hours after laparoscopic rectopexy will be 41.5 milligrams with a standard deviation of 35.0 milligrams. We estimate the number of patients needed to detect the difference in morphine requirement during the first 48 hours between open and laparoscopic procedures, controlling the 70th percentile (ph = 0.7) of type I error at 5%, and the 30th percentile (pq = 0.7) of power at 80%. The numbers of patients needed in the laparoscopic group are n0 = 14 (the M-S approach), n1 = 9 (the one-sample approach), n2 = 22 (the RCT approach), and n* = 19 (the proposed approach), respectively. Note that the above sample sizes are obtained by replacing the unknown HC population variance with the sample variance.
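The arithmetic behind n* = 19 can be traced as follows. This is a worked sketch that assumes the percentile formula n* = σ1²(z(1−α) + z(1−β))² / {Δ − (z(ph) + z(pq))σ0/√m}², takes Δ as the magnitude of the expected reduction in morphine use, and uses rounded normal quantiles (z0.95 ≈ 1.6449, z0.80 ≈ 0.8416, z0.70 ≈ 0.5244).

```latex
\begin{aligned}
\Delta &= 71.5 - 41.5 = 30, \qquad
\sigma_0/\sqrt{m} = 45.9/\sqrt{24} \approx 9.369, \\
(z_{p_h} + z_{p_q})\,\sigma_0/\sqrt{m} &\approx (0.5244 + 0.5244)(9.369) \approx 9.827, \\
n^* &= \frac{\sigma_1^2\,(z_{0.95} + z_{0.80})^2}{(\Delta - 9.827)^2}
     \approx \frac{(35.0)^2\,(1.6449 + 0.8416)^2}{(20.173)^2}
     \approx \frac{7573.6}{407.0} \approx 18.6
     \;\Longrightarrow\; n^* = 19.
\end{aligned}
```

Setting the percentile adjustment to zero in the same expression gives 7573.6/900 ≈ 8.4, which rounds up to the one-sample size n1 = 9.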

5 Discussion

We have provided a unified framework for three existing approaches (M-S, one-sample, and RCT) in HCTs, by showing that they either control the mean or the median of power and type I error. We further developed a closed-form sample size formula to control arbitrary percentiles of the random power and type I error. It provides more flexibility in assessing the risk in HCTs and accommodates the extreme skewness in the distributions of power and type I error. We limited our discussion to the HCTs with continuous outcomes. In the future we will extend it to HCTs with binary and survival time outcomes.

Similar to the existing approaches, the proposed sample size formula (n*) requires the population variances of the HC and the experimental group to be known. Through simulation study, we demonstrated that the proposed approach successfully controls the percentiles of power and type I error in a more realistic scenario, where the true variances are unknown and they are replaced with observed sample variances. One reviewer kindly pointed out that in situations where the measurements are continuous with bounded support, say on (a, b), a sample size formula can be derived without requiring the population variances. Specifically, we can define equation M65 with equation M66. Applying the arcsin transformation on equation M67, we can calculate sample size based on sin−1(equation M68), whose variance is free of the sampled population's true variance.

Lee and Tseng [14] presented sample size calculation for HCTs with binary outcomes controlling the means of power and type I error. Theorem 2 states that the same goal is achieved by n2 for HCTs with continuous outcomes. The computation in [14] is more complicated due to the transformation performed on binary data. For continuous outcomes, when the HC variance is assumed to be known, the sample size formula does not depend on observations from the HC group. Thus one pair of null and alternative hypotheses leads to one unique sample size estimate. For binary outcomes, the sample size formula computed under the arcsin transformation depends on the observations from the HC group. Thus one pair of hypotheses leads to many possible sample size estimates, each determined by a random realization of the HC data. In [14], the authors had to deal with the expectation of sample sizes.

The term ($\bar{Y} + z_{p_h}\sigma_0/\sqrt{m}$) in the numerator of Z*(X, Y) is the $p_h$th percentile of the posterior distribution [θ0 | Y] under a flat prior, which suggests a potential connection of the proposed approach to a Bayesian sample size calculation. Nonetheless, in Appendix A.4, the derivation of n* is strictly in the frequentist paradigm, where the randomness of type I error and power comes from the uncertainty in the HC data Y, not from a random variable θ0 (as in a Bayesian method). For example, we set type I error h* = α at $\bar{Y} = \theta_0 - z_{p_h}\sigma_0/\sqrt{m}$, the $(1 - p_h)$th percentile of $\bar{Y}$. Because h* is monotonically decreasing in $\bar{Y}$, the $p_h$th percentile of h* is controlled at α.

6 Acknowledgments

This study is supported in part by NIH grants UL1 RR024982 and P50 CA70907. The authors thank the two reviewers and associate editor for their constructive comments and suggestions.

Appendix A.1. Proof of Theorem 1

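Proof. Assuming $\bar{Y} = \theta_0$ and applying some simple algebra, we transform (4) to

$$\Delta\sqrt{n} - z_{1-\beta}\sigma_1 = z_{1-\alpha}\sqrt{\sigma_1^2 + \sigma_0^2 n/m}.$$

Squaring both sides and rearranging, we have

$$\left(z_{1-\alpha}^2\sigma_0^2/m - \Delta^2\right) n + 2\Delta\sigma_1 z_{1-\beta}\sqrt{n} + \sigma_1^2\left(z_{1-\alpha}^2 - z_{1-\beta}^2\right) = 0, \quad \text{that is,} \quad a r^2 + b r + c = 0 \text{ with } r = \sqrt{n}.$$
(7)

From (7) we can find a closed-form solution for $\sqrt{n}$ subject to the constraint that $\sqrt{n} > 0$.

First we need b2 − 4ac ≥ 0, where a, b and c are defined in Theorem 1. Because $b^2 - 4ac = 4\sigma_1^2 z_{1-\alpha}^2\{\Delta^2 - (z_{1-\alpha}^2 - z_{1-\beta}^2)\sigma_0^2/m\}$, this condition implies $\Delta \ge \sqrt{z_{1-\alpha}^2 - z_{1-\beta}^2}\,\sigma_0/\sqrt{m}$, and two possible roots

$$r_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad r_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$

Fact 1. No plausible solution exists under $\sqrt{z_{1-\alpha}^2 - z_{1-\beta}^2}\,\sigma_0/\sqrt{m} \le \Delta \le z_{1-\alpha}\sigma_0/\sqrt{m}$.

  • If $\Delta = z_{1-\alpha}\sigma_0/\sqrt{m}$ then a = 0, and the solution to (7) is r = −c/b. Because c > 0 when α < β and b > 0 by definition, we eliminate r due to the positive constraint on $\sqrt{n}$.
  • If $\sqrt{z_{1-\alpha}^2 - z_{1-\beta}^2}\,\sigma_0/\sqrt{m} \le \Delta < z_{1-\alpha}\sigma_0/\sqrt{m}$ then a > 0 and 4ac > 0. It is easy to show that r1 < 0 and r2 < 0.

Sufficiency: We demonstrate that the condition $\Delta > z_{1-\alpha}\sigma_0/\sqrt{m}$ implies (5) being the unique sample size solution. From the condition we have a < 0 and 4ac < 0. Together with b > 0, it is easy to show that r1 < 0 and r2 > 0. Thus (5) is the unique sample size solution.

Necessity: We demonstrate that (5) being the unique sample size solution implies $\Delta > z_{1-\alpha}\sigma_0/\sqrt{m}$. (5) being the unique solution is equivalent to r2 being the unique solution for $\sqrt{n}$. There are two scenarios:

  1. b2 − 4ac = 0 or $\Delta = \sqrt{z_{1-\alpha}^2 - z_{1-\beta}^2}\,\sigma_0/\sqrt{m}$. It is eliminated due to Fact 1.
  2. b2 − 4ac > 0 and r1 < 0 and r2 > 0. Note that the condition b2 − 4ac > 0 implies $\Delta > \sqrt{z_{1-\alpha}^2 - z_{1-\beta}^2}\,\sigma_0/\sqrt{m}$. We eliminate $\sqrt{z_{1-\alpha}^2 - z_{1-\beta}^2}\,\sigma_0/\sqrt{m} < \Delta \le z_{1-\alpha}\sigma_0/\sqrt{m}$ based on Fact 1. The validity of $\Delta > z_{1-\alpha}\sigma_0/\sqrt{m}$ is established by Sufficiency.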

Thus we complete the proof.

Appendix A.2. Algorithm of Simulation 1

Simulation 1. First we compute sample sizes nv for a given set of (m, $\sigma_0^2$, $\sigma_1^2$, Δ, α, β), where v = 0, 1, 2 denote the M-S, one-sample, and RCT approach, respectively. Then we generate null experimental data sets $X_v^{0(l)}$ (each of size $n_v$) from $N(\theta_0, \sigma_1^2)$, and alternative experimental data sets $X_v^{(l)}$ from $N(\theta_0 + \Delta, \sigma_1^2)$, for l = 1, …, L and v = 0, 1, 2. The superscript 0 indicates that the null distribution is true, and the superscript (l) indicates the lth experimental data set generated. For iteration k = 1, …, K,

  1. Simulate HC data $Y^{(k)} = \{Y_1^{(k)}, \ldots, Y_m^{(k)}\}$ from $N(\theta_0, \sigma_0^2)$;
  2. Estimate the conditional type I error given Y(k) by $h_v^{(k)} = L^{-1}\sum_{l=1}^{L} I\{Z_v(X_v^{0(l)}, Y^{(k)}) > z_{1-\alpha}\}$. Note that Zv(X, Y) = Z(X, Y) for v = 0 and 2.
  3. Estimate the conditional power given Y(k) by $q_v^{(k)} = L^{-1}\sum_{l=1}^{L} I\{Z_v(X_v^{(l)}, Y^{(k)}) > z_{1-\alpha}\}$.

The superscript (k) of $h_v^{(k)}$ and $q_v^{(k)}$ indicates that they are computed given the kth simulated HC data set. We set K = L = 5000.
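The steps above can be sketched for the one-sample approach (v = 1) as follows. This is an illustrative reduction rather than the authors' program: it assumes the Simulation 1 setting with n1 = 69, uses smaller K and L to keep the run short, and draws the experimental sample means directly from their sampling distribution instead of generating individual observations, since the test statistic depends on the experimental data only through the mean.

```python
import math
import random
from statistics import NormalDist, fmean, median

random.seed(1)
M, SIGMA0, SIGMA1, THETA0, DELTA, ALPHA = 80, 1.0, 1.0, 0.0, 0.3, 0.05
N1 = 69            # one-sample size for this setting
K, L = 400, 2000   # reduced from K = L = 5000 to keep the sketch fast
Z_A = NormalDist().inv_cdf(1 - ALPHA)
SE_X = SIGMA1 / math.sqrt(N1)  # standard error of the experimental mean

# Null and alternative experimental data sets, summarized by their means.
xbar_null = [random.gauss(THETA0, SE_X) for _ in range(L)]
xbar_alt = [random.gauss(THETA0 + DELTA, SE_X) for _ in range(L)]

h1, q1 = [], []
for _ in range(K):
    # Step 1: simulate HC data and take its sample mean.
    ybar = fmean([random.gauss(THETA0, SIGMA0) for _ in range(M)])
    cutoff = ybar + Z_A * SE_X  # Z_1 > z_{1-alpha}  <=>  Xbar > cutoff
    # Step 2: conditional type I error given this HC data set.
    h1.append(sum(x > cutoff for x in xbar_null) / L)
    # Step 3: conditional power given this HC data set.
    q1.append(sum(x > cutoff for x in xbar_alt) / L)

print("type I error: mean %.3f, median %.3f" % (fmean(h1), median(h1)))
print("power:        mean %.3f, median %.3f" % (fmean(q1), median(q1)))
```

The median type I error should land near 0.05 while the mean is pulled upward by the long right tail, which is the skewness the simulation is designed to expose.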

Appendix A.3. Proof of Theorem 2

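Proof. We first state the facts that $\bar{X} \sim N(\theta_1, \sigma_1^2/n_v)$, $\bar{Y} \sim N(\theta_0, \sigma_0^2/m)$, $E(\bar{Y}) = \theta_0$, and median($\bar{Y}$) = θ0. For n0 and n2, the null hypothesis is rejected if $Z(X, Y) > z_{1-\alpha}$. Marginalizing with respect to Y is equivalent to marginalizing with respect to $\bar{Y}$. Thus for v = 0, 2,

$$E(h_v) = E_{\bar{Y}}\left[P\{Z(X, Y) > z_{1-\alpha} \mid \bar{Y}, \theta_1 = \theta_0\}\right] = P\{Z(X, Y) > z_{1-\alpha} \mid \theta_1 = \theta_0\} = P(U_v > z_{1-\alpha}) = \alpha.$$

We have the third equality through random variable transformation, where $U_v = (\bar{X} - \bar{Y})/\sqrt{\sigma_1^2/n_v + \sigma_0^2/m}$ and it is easy to show that Uv ~ N(0, 1).

In a similar fashion, we can show that E(q2) = 1 − β,

$$E(q_2) = P\{Z(X, Y) > z_{1-\alpha} \mid \theta_1 = \theta_0 + \Delta\} = P\left(U > z_{1-\alpha} - \frac{\Delta}{\sqrt{\sigma_1^2/n_2 + \sigma_0^2/m}}\right) = 1 - \beta.$$

We have the second equality by defining $U = (\bar{X} - \bar{Y} - \Delta)/\sqrt{\sigma_1^2/n_2 + \sigma_0^2/m}$. The third equality is obtained by plugging in the expression of n2 and recognizing U ~ N(0, 1).

We then show that median(q0) = 1 − β. From (3) we have

$$q_0^{(k)} = \Phi\left(\frac{\theta_0 + \Delta - \bar{Y}^{(k)} - z_{1-\alpha}\sqrt{\sigma_1^2/n_0 + \sigma_0^2/m}}{\sigma_1/\sqrt{n_0}}\right).$$

First we recognize that $q_0^{(k)}$ is a decreasing function of $\bar{Y}^{(k)}$. Second, n0 is the solution to $q_0^{(k)}$ = 1 − β by setting $\bar{Y}^{(k)}$ = θ0 = median($\bar{Y}$). These two points lead to the conclusion that median(q0) = 1 − β. Note that E(q2) and $q_0^{(k)}$ have different expressions because the former marginalizes with respect to the random $\bar{Y}$, while the latter is defined conditional on a particular Y(k).

Now we show that median(h1) = α,

$$h_1^{(k)} = P\{Z_1(X, Y^{(k)}) > z_{1-\alpha} \mid \bar{Y}^{(k)}, \theta_1 = \theta_0\} = 1 - \Phi\left(z_{1-\alpha} + \frac{\bar{Y}^{(k)} - \theta_0}{\sigma_1/\sqrt{n_1}}\right).$$

Thus $h_1^{(k)}$ is a decreasing function of $\bar{Y}^{(k)}$ and $h_1^{(k)} = \alpha$ at $\bar{Y}^{(k)} = \theta_0$. We therefore conclude that median(h1) = α. A similar argument leads to the conclusion that median(q1) = 1 − β.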

Appendix A.4. Proof of Theorem 3

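Proof. First we demonstrate that based on Z*(X, Y), the $p_h$th percentile of type I error is controlled at α: For a given $\bar{Y}$, the type I error is

$$h^* = P\{Z^*(X, Y) > z_{1-\alpha} \mid \bar{Y}, \theta_1 = \theta_0\} = 1 - \Phi\left(z_{1-\alpha} + \frac{\bar{Y} - \theta_0 + z_{p_h}\sigma_0/\sqrt{m}}{\sigma_1/\sqrt{n^*}}\right).$$

Thus h* = α when $\bar{Y} = \theta_0 - z_{p_h}\sigma_0/\sqrt{m}$, which is the $(1 - p_h)$th percentile of $\bar{Y}$. Together with the fact that h* is a monotonically decreasing function in $\bar{Y}$, we have $P(h^* < \alpha) = P(\bar{Y} > \theta_0 - z_{p_h}\sigma_0/\sqrt{m}) = p_h$. Note that this statement holds for any n*.

Then we solve for n* which controls the $(1 - p_q)$th percentile of power at 1 − β: The conditional power given $\bar{Y}$ is

$$q^* = P\{Z^*(X, Y) > z_{1-\alpha} \mid \bar{Y}, \theta_1 = \theta_0 + \Delta\} = \Phi\left(\frac{\theta_0 + \Delta - \bar{Y} - z_{p_h}\sigma_0/\sqrt{m}}{\sigma_1/\sqrt{n^*}} - z_{1-\alpha}\right).$$
(8)

It is obvious that q* is monotonically decreasing in $\bar{Y}$. Using this property, if we set q* = 1 − β at $\bar{Y} = \theta_0 + z_{p_q}\sigma_0/\sqrt{m}$, the $p_q$th percentile of $\bar{Y}$, we can achieve the goal of controlling the $(1 - p_q)$th percentile of power at 1 − β, because

$$P(q^* > 1 - \beta) = P(\bar{Y} < \theta_0 + z_{p_q}\sigma_0/\sqrt{m}) = p_q.$$

Thus by plugging q* = 1 − β and $\bar{Y} = \theta_0 + z_{p_q}\sigma_0/\sqrt{m}$ into (8), we can solve for n* from the following equation,

$$\Phi\left(\frac{\Delta - (z_{p_h} + z_{p_q})\sigma_0/\sqrt{m}}{\sigma_1/\sqrt{n^*}} - z_{1-\alpha}\right) = 1 - \beta.$$

The solution for n*, equation (6), can be obtained after some algebra. The condition $\Delta > (z_{p_h} + z_{p_q})\sigma_0/\sqrt{m}$ is due to the positive constraint on $\sqrt{n^*}$.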

Appendix A.5. Algorithm of Simulation 3

Simulation 3. For iteration k = 1, …, K,

  1. Generate HC data $Y^{(k)}$ from $N(\theta_0, \sigma_0^2)$. Compute the sample variance $s_0^{2(k)}$;
  2. Estimate the required sample size n*(k) based on Formula (6), with $\sigma_0^2$ replaced by $s_0^{2(k)}$;
  3. Given sample size n*(k), generate null experimental data sets $X^{0(k,l)}$ from $N(\theta_0, \sigma_1^2)$, and alternative experimental data sets $X^{(k,l)}$ from $N(\theta_0 + \Delta, \sigma_1^2)$, for l = 1, …, L;
  4. Compute the empirical type I error by $h^{*(k)} = L^{-1}\sum_{l=1}^{L} I\{Z^*(X^{0(k,l)}, Y^{(k)}) > z_{1-\alpha}\}$. Note that we replace the population variances ($\sigma_0^2$ and $\sigma_1^2$) in $Z^*(X^{0(k,l)}, Y^{(k)})$ by sample variances ($s_0^{2(k)}$ and $s_1^{2(k,l)}$). Here $s_1^{2(k,l)}$ is the sample variance of $X^{0(k,l)}$. Similarly, we compute the empirical power q*(k).
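Steps 1 and 2 alone already show how the sample size becomes random. The sketch below is an illustration under stated assumptions, not the authors' program: it uses the Simulation 1 setting, draws the HC sample variance through its chi-square distribution rather than generating raw HC data, and treats a violated existence condition as an infinite sample size (a choice made here, since the paper does not describe how such draws are handled). Names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, median

random.seed(3)
M, SIGMA0, SIGMA1, DELTA, ALPHA, BETA = 80, 1.0, 1.0, 0.3, 0.05, 0.2
Q = NormalDist().inv_cdf  # standard normal quantile function


def n_star(s0, p_h, p_q):
    """Percentile-based sample size with the HC standard deviation set to s0."""
    gap = DELTA - (Q(p_h) + Q(p_q)) * s0 / math.sqrt(M)
    if gap <= 0:
        return math.inf  # assumption: no finite n satisfies the condition
    return math.ceil(SIGMA1**2 * (Q(1 - ALPHA) + Q(1 - BETA))**2 / gap**2)


def simulate_n_star(p_h, p_q, k=5000):
    """Distribution of n* when sigma0^2 is replaced by a sample variance."""
    sizes = []
    for _ in range(k):
        # (m - 1) * s0^2 / sigma0^2 follows a chi-square with m - 1 d.f.,
        # i.e. a gamma distribution with shape (m - 1)/2 and scale 2.
        chi2 = random.gammavariate((M - 1) / 2, 2.0)
        s0 = SIGMA0 * math.sqrt(chi2 / (M - 1))
        sizes.append(n_star(s0, p_h, p_q))
    return sizes


flat = simulate_n_star(0.5, 0.5)
skewed = simulate_n_star(0.8, 0.7)
print("p_h = p_q = 0.5:", min(flat), max(flat))
print("p_h = 0.8, p_q = 0.7: median %.0f, mean %.1f" % (median(skewed), fmean(skewed)))
```

At ph = pq = 0.5 the normal quantiles vanish, so the HC variance drops out and every draw gives 69. For larger percentiles the denominator shrinks toward zero as the sample variance grows, which is what stretches the right tail of n*.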

References

[1] Makuch RW, Simon RM. Sample size considerations for non-randomized comparative studies. Journal of Chronic Diseases. 1980;33(3):175–181.
[2] Vickers AJ, Ballen V, Scher HI. Setting the bar in phase II trials: The use of historical data for determining "go/no go" decision for definitive phase III testing. Clinical Cancer Research. 2007;13(3):972–976.
[3] Cho SD, Krishnaswami S, Mckee JC, Zallen G, Silen ML, Bliss DW. Analysis of 29 consecutive thoracoscopic repairs of congenital diaphragmatic hernia in neonates compared to historical controls. Journal of Pediatric Surgery. 2009;44(1):80–86.
[4] Abe T, Kakemura T, Fujinuma S, Maetani I. Successful outcomes of EMR-L with 3D-EUS for rectal carcinoids compared with historical controls. World Journal of Gastroenterology. 2008;14(25):4054–4058.
[5] Storm C, Steffen I, Schefold JC, Krueger A, Oppert M, Jorres A, Hasper D. Mild therapeutic hypothermia shortens intensive care unit stay of survivors after out-of-hospital cardiac arrest compared to historical controls. Critical Care. 2008;12(3).
[6] Van Rooij WJ, De Gast AN, Sluzewski M. Results of 101 aneurysms treated with polyglycolic/polylactic acid microfilament nexus coils compared with historical controls treated with standard coils. American Journal of Neuroradiology. 2008;29(5):991–996.
[7] Ando R, Nakamura A, Nagatani M, Yamakawa S, Ohira T, Takagi M, Matsushima K, Aoki A, Fujita Y, Tamura K. Comparison of past and recent historical control data in relation to spontaneous tumors during carcinogenicity testing in Fischer 344 rats. Journal of Toxicologic Pathology. 2008;21(1):53–60.
[8] Song JY, Chung BS, Choi KC, Shin BS. A 5-year period clinical observation on herpes zoster and the incidence of postherpetic neuralgia (2002–2006); a comparative analysis with the historical control group of a previous study (1995–1999). Korean Journal of Dermatology. 2008;46(4):431–436.
[9] Loudon I. The use of historical controls and concurrent controls to assess the effects of sulphonamides, 1936–1945. Journal of the Royal Society of Medicine. 2008;101(3):148–155.
[10] Dixon DO, Simon R. Sample size considerations for studies comparing survival curves using historical controls. Journal of Clinical Epidemiology. 1988;41(12):1209–1213.
[11] Chang MN, Shuster JJ, Kepner JL. Group sequential designs for phase II trials with historical controls. Controlled Clinical Trials. 1999;20(4):353–364.
[12] Kepner J, Wackerly D. Some observations on the Makuch/Simon approach to sample size determination in clinical trials with historical controls. Communications in Statistics Part B: Simulation and Computation. 2001;30(3):611–621.
[13] Chang MN, Shuster JJ, Kepner JL. Sample sizes based on exact unconditional tests for phase II clinical trials with historical controls. Journal of Biopharmaceutical Statistics. 2004;14(1):189–200.
[14] Lee JJ, Tseng C. Uniform power method for sample size calculation in historical control studies with binary response. Controlled Clinical Trials. 2001;22(4):390–400.
[15] Korn EL, Freidlin B. Conditional power calculations for clinical trials with historical controls. Statistics in Medicine. 2006;25(17):2922–2931.
[16] Abraham NS, DuraiRaj R, Young JM, Young CJ, Solomon MJ. How does an historic control study of a surgical procedure compare with the "gold standard"? Diseases of the Colon and Rectum. 2006;49(8):1141–1148.