The adequacy of the sample size formula will be assessed through simulation for large effect sizes and non-ignorable missing situations. We estimate the required number of subjects (*n*) using equation (2), and then, for each of 5000 simulated samples, generate *n* subjects from a multivariate normal distribution with the damped exponential correlations and missing data patterns. We consider two missing patterns. One is independent missing, where *p*_{jj′} = *p*_{j}*p*_{j′}, and the other is monotone missing, where *p*_{jj′} = *p*_{j′} for *j* < *j′* (*p*_{1} ≥ ··· ≥ *p*_{K}). We set the maximum number of measurements for each subject at *K*=6, and use the following vectors for the probability of assessment at each time point:

Note that *P*_{1}, *P*_{2} and *P*_{3} describe scenarios where an increasing number of subjects miss visits over time, with the dropout rate at the end of study being 0.3, while *P*_{4} denotes no missing data. In sample size estimation, a common practice is to use *n* = *n*_{0}/(1 − *q*), where *n*_{0} is the sample size estimate under no dropout and *q* is the expected dropout rate. Following this procedure, the final sample size estimates under *P*_{1}, *P*_{2} and *P*_{3} would all be *n* = *n*_{0}/0.7, even though different amounts of information have been lost in the data. Simply adjusting the sample size estimate by *n* = *n*_{0}/(1 − *q*) is crude and conservative. As we demonstrate in the simulation study, the sample sizes estimated by (2) under *P*_{1}, *P*_{2} and *P*_{3} are usually much smaller than the *P*_{4} estimate inflated by the factor 1/0.7.

We estimate the sample size using the formula given in equation (2) with the prespecified missing data patterns, missing probabilities, damped exponential correlation structures, *σ*^{2}=1, type I error *α*=0.05, power 1−*γ*=0.8, *β*_{40} = 0.1 and = 0.5. Once the sample size (*n*) is estimated, we generate 5000 replicated samples of *n* subjects using *β* = (0.3, 0.3, 0.5, 0.1), with correlated measurement errors generated from the multivariate normal distribution. Note that the sample size estimate does not depend on the values of *β*_{1}, *β*_{2} and *β*_{3}.
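The data-generation step can be sketched as follows. The mean-model form is inferred from the control-arm mean *μ*_{0j} = *β*_{1} + *β*_{3}*t*_{j} given in Section 3.3 (so the treatment arm adds *β*_{2} + *β*_{4}*t*_{j}); equally spaced times *t*_{j} = *j*, the random half-and-half group assignment, and the seed are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

K = 6                                   # measurements per subject
beta = np.array([0.3, 0.3, 0.5, 0.1])   # (beta_1, beta_2, beta_3, beta_4)
t = np.arange(1, K + 1)                 # assumed equally spaced times t_j = j

def simulate_sample(n, R, sigma2=1.0):
    """One replicate of n subjects with correlated normal errors.

    Inferred mean model: y_ij = beta_1 + beta_2*g_i + beta_3*t_j + beta_4*g_i*t_j + e_ij,
    with error vectors e_i ~ N(0, sigma2 * R) for a K x K working correlation R.
    """
    g = rng.integers(0, 2, size=n)      # 0 = control, 1 = treatment
    mu = beta[0] + beta[1] * g[:, None] + (beta[2] + beta[3] * g[:, None]) * t
    e = rng.multivariate_normal(np.zeros(K), sigma2 * R, size=n)
    return g, mu + e
```

Missingness would then be imposed on the complete data by thinning observations according to the chosen assessment probabilities.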

3.1. Damped exponential family of correlation structures

We examine the effect of correlation using the damped exponential family of correlation structures proposed by Munoz *et al*. [6]. The correlation between two observations separated by *s* units of time is modeled by *ρ*_{j,j+s} = *ρ*^{c}, where *c* = *s*^{θ}. Here, *ρ* is the correlation between observations separated by one unit of time and *θ* is a damping parameter. The damped exponential structure provides a rich family of correlations: compound symmetry (CS) and first-order autoregressive (AR(1)) structures are obtained by setting *θ*=0 and *θ*=1, respectively, and the correlation structure of the repeated measurements changes from CS to AR(1) in graded steps as *θ* increases from 0 to 1. We investigate the effect of the correlation structures on the sample size estimate using values of *θ* between 0 and 1 and *ρ*=0.1, 0.25 and 0.5.
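A minimal construction of this correlation family, with the CS and AR(1) special cases checked at *θ*=0 and *θ*=1:

```python
import numpy as np

def damped_exp_corr(K, rho, theta):
    """K x K damped exponential correlation matrix: corr(y_j, y_{j+s}) = rho**(s**theta).

    theta = 0 gives compound symmetry (all off-diagonal entries equal rho);
    theta = 1 gives AR(1) (rho**s at lag s).
    """
    s = np.abs(np.subtract.outer(np.arange(K), np.arange(K))).astype(float)
    R = rho ** (s ** theta)
    # NumPy evaluates 0.0**0.0 as 1.0, so for theta = 0 the diagonal would
    # come out as rho**1; reset it to 1 explicitly.
    np.fill_diagonal(R, 1.0)
    return R
```

Intermediate values such as *θ*=0.5 interpolate between the two classical structures, which is what drives the graded change in sample size reported below.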

Table I presents the sample size estimates required to produce 80% power, along with the empirical powers computed from the simulated samples under the independent and monotone missing data patterns. As *θ* increases, the sample size increases, since the dependency decays as two measurement times get farther apart. The estimated sample size decreases as *ρ* increases. The empirical power obtained using the estimated sample size is generally close to the nominal value of 80%. Furthermore, given *ρ* and *θ*, the estimated sample sizes decrease in order from *P*_{1} to *P*_{4}. Comparing the estimated sample sizes under independent and monotone missingness, we found that the difference increases with *ρ*. For example, under *P*_{2} and the CS correlation structure, the difference in sample sizes is 199−197=2 when *ρ*=0.1, 175−169=6 when *ρ*=0.25 and 135−124=11 when *ρ*=0.5. Table I also shows that adjusting for missing data by *n*_{0}/(1−*q*) is too conservative. For example, under the monotone missing pattern, for *ρ*=0.25 and *θ*=1, the estimated sample sizes under *P*_{1}, *P*_{2} and *P*_{3} are 270, 266 and 262, respectively. Under the naive adjustment method, the required sample size would be *n*_{0}/0.7, larger than any of these estimates.

| **Table I** Sample size *n* (empirical power) based on *P*_{i} (*i*=1,…,4) as defined in (3) with effect size *β*_{40} = 0.1, type I error=0.05, nominal power=0.8, damped correlation *ρ*_{j,j+s} = *ρ*^{c}, where *c* = *s*^{θ} and *ρ* is the correlation between observations separated by one unit of time |

To better understand the impact of dropout on sample size and power, we conduct a simulation study with the dropout rate at the end of study equal to 0.5. The vectors of assessment probability are defined as follows:

The comparison of Tables I and II indicates that the dropout rate has a great impact on the sample size. For a given missing pattern, *ρ* and *θ*, a higher dropout rate leads to a much larger sample size. The differences between independent and monotone missing patterns are also more pronounced under the higher dropout rate. For example, under the CS correlation structure, the difference in sample sizes is 240−234=6 when *ρ*=0.1, 220−206=14 when *ρ*=0.25 and 187−159=28 when *ρ*=0.5.

| **Table II** Sample size *n* (empirical power) based on *P*_{i} (*i*=1,…,4) as defined in (4) with effect size *β*_{40} = 0.1, type I error=0.05, nominal power=0.8, damped correlation *ρ*_{j,j+s} = *ρ*^{c}, where *c* = *s*^{θ} and *ρ* is the correlation between observations separated by one unit of time |

3.2. Small sample size

The GEE method is based on a large-sample approximation, and a large value of *β*_{4} leads to a small sample size estimate, so we investigate the performance of the sample size formula when the estimated sample size is small. We set *β*_{40} = 0.2, and then estimate the required sample size *n* to achieve 80% power using the assessment probability vectors given in (3) and the damped exponential correlation structure under independent and monotone missing data patterns. Five thousand samples of repeated measurements are generated. Table III reports the estimated sample sizes and the corresponding empirical powers calculated from the simulated samples under independent and monotone missing data patterns. According to formula (2), the sample size estimate is proportional to 1/*β*_{40}^{2}. Thus, the estimated sample sizes in Table III are about one-fourth of those in Table I. The empirical powers are generally close to the nominal 80% power.
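Because the sample size scales with the inverse square of the effect size, an estimate obtained for one value of *β*_{40} can be rescaled to another. A one-line sketch (the base estimate of 200 is hypothetical, not a value from the tables):

```python
import math

def scale_n(n_base, beta_base, beta_new):
    """Rescale a sample-size estimate using n proportional to 1/beta_40**2."""
    return math.ceil(n_base * (beta_base / beta_new) ** 2)
```

Doubling *β*_{40} from 0.1 to 0.2 quarters the estimate, matching the roughly one-fourth reduction seen between Tables I and III.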

| **Table III** Sample size *n* (empirical power) based on *P*_{i} (*i*=1,…,4) as defined in (3) with effect size *β*_{40} = 0.2, type I error=0.05, nominal power=0.8, damped correlation *ρ*_{j,j+s} = *ρ*^{c}, where *c* = *s*^{θ} and *ρ* is the correlation between observations separated by one unit of time |

3.3. Non-ignorable missing

The sample size formula given in equation (2) is constructed under the MCAR assumption, which has been maintained in Sections 3.1 and 3.2. In this section, we investigate the performance of the sample size formula under non-ignorable missingness, where the missing probability depends on unobserved outcomes [7]. Specifically, we assume that a higher outcome value leads to a lower chance of being observed. We define *μ*_{0j} = *β*_{1} + *β*_{3}*t*_{j}, *j*=1,…,*K*, to be the mean response at time *j* for the control arm. The probability of followup for individual *i* at time *j* is *q*_{ij} = *p*_{j}*v*_{ij}, with *v*_{ij} = 1 − *a*(Φ(*y*_{ij}; *μ*_{0j}, 1) − 0.5). Here Φ(·; *μ*, *s*^{2}) is the cumulative normal distribution function with mean *μ* and variance *s*^{2}, *p*_{j} is the overall assessment probability at time *j*, and *a* is a tuning parameter controlling the impact of the outcome on followup. For example, when *a*=0, we have *q*_{ij} = *p*_{j} for *i*=1,…,*n*. Model (5) clearly shows the dependence of *q*_{ij} on *y*_{ij}. When *a*>0, larger values of *y*_{ij} lead to smaller values of *v*_{ij}, and hence a lower followup rate *q*_{ij}. For the control group, *y*_{ij} centers around *μ*_{0j}, which means that *v*_{ij} fluctuates around 1. On the other hand, with *β*_{2}=0.3 and *β*_{4}=0.1, the treatment group on average has outcomes greater than *μ*_{0j}, or on average *v*_{ij}<1 for a positive value of *a*. As a result, the treatment group tends to have a lower followup rate. We consider four values of the tuning parameter, *a*=(0.5, 0.75, 1, 1.25), and the same four vectors of overall followup probability specified in (3).
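The followup model can be sketched as below. The product form *q*_{ij} = *p*_{j}*v*_{ij} is our reading of the garbled model (5); it is consistent with the text's statements that *q*_{ij} = *p*_{j} when *a*=0 and that *v*_{ij} fluctuates around 1 when *y*_{ij} centers around *μ*_{0j}:

```python
import math

def phi_cdf(x, mu, sd=1.0):
    """Normal CDF Phi(x; mu, sd^2), computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

def followup_prob(y_ij, mu_0j, p_j, a):
    """q_ij = p_j * v_ij with v_ij = 1 - a*(Phi(y_ij; mu_0j, 1) - 0.5).

    Outcomes above the control-arm mean mu_0j push Phi above 0.5, so for
    a > 0 they shrink v_ij below 1 and lower the followup probability.
    """
    v_ij = 1.0 - a * (phi_cdf(y_ij, mu_0j) - 0.5)
    return p_j * v_ij
```

At the control-arm mean the followup probability equals *p*_{j} regardless of *a*, while above-average outcomes are followed up less often, which is precisely what penalizes the (higher-mean) treatment arm.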

We generate the simulated data by imposing non-ignorable missingness, and evaluate the performance of the sample size formula in terms of empirical power when the MCAR assumption is violated. We compute the empirical power as the proportion of samples rejecting *H*_{0}: *β*_{4}=0 among the 5000 simulated samples. Table IV gives the sample size estimates and empirical powers under the monotone missing data patterns; larger values of *a* lead to greater loss in power. The simulation results under the independent missing data patterns are similar to those under the monotone patterns, and the corresponding table is not included here. Under both monotone and independent missing data patterns, we did not observe an obvious trend in the effects of *θ*, *ρ* or assessment probability. Table IV shows the impact of non-ignorable missingness: many of the empirical powers fall below the nominal value of 0.8, especially for *a*=1.25. Note that the missing mechanism assumed in (5) leads to higher dropout among subjects with above-average measurements; that is, large measurement values yield high dropout. Such a mechanism causes an underestimation of *β*_{40} and an empirical power smaller than the nominal level. The assessment probabilities observed under non-ignorable missingness with the independent missing pattern, averaged over *ρ* and *θ*, are also plotted. The baseline assessment probability is *P*_{1}=(1, 0.82, 0.79, 0.76, 0.73, 0.7). Under the assumed missing mechanism (5), the treatment group suffers higher dropout than the control group; thus, each value of *a* corresponds to two curves in the plot, the upper curve for the control group and the lower curve for the treatment group. This group disparity grows as *a* increases.
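The empirical power computation amounts to counting two-sided Wald rejections of *H*_{0}: *β*_{4}=0 across replicates. A sketch, assuming each replicate's GEE fit yields a z-statistic for *β*_{4} (the fitting step itself is omitted):

```python
import numpy as np
from statistics import NormalDist

def empirical_power(z_stats, alpha=0.05):
    """Fraction of replicates whose two-sided Wald test rejects H0: beta_4 = 0.

    z_stats: array of Wald statistics beta4_hat / se(beta4_hat), one per replicate.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    return float(np.mean(np.abs(z_stats) > z_crit))
```

With 5000 replicates the Monte Carlo standard error of an estimated power near 0.8 is about sqrt(0.8·0.2/5000) ≈ 0.006, so deviations from the nominal level well beyond that margin, as seen for large *a*, reflect the violated MCAR assumption rather than simulation noise.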

| **Table IV** Sample size *n* (empirical power) based on *P*_{i} (*i*=1,…,4) as defined in (3), where sample size *n* is for effect size *β*_{40} = 0.1, type I error=0.05, nominal power=0.8, damped correlation *ρ*_{j,j+s} = *ρ*^{c}, where *c* = *s*^{θ} and *ρ* is the correlation between observations separated by one unit of time |