Int J Biostat. 2011 January 1; 7(1): Article 29.
Published online 2011 July 22. doi:  10.2202/1557-4679.1286
PMCID: PMC3154087

Combining Censored and Uncensored Data in a U-Statistic: Design and Sample Size Implications for Cell Therapy Research

Abstract

The assumptions that anchor large clinical trials are rooted in smaller, Phase II studies. In addition to specifying the target population, intervention delivery, and patient follow-up duration, physician-scientists who design these Phase II studies must select the appropriate response variables (endpoints). However, endpoint measures can be problematic. If the endpoint assesses the change in a continuous measure over time, then the occurrence of an intervening significant clinical event (SCE), such as death, can preclude the follow-up measurement. In addition, the ideal continuous endpoint measurement may be contraindicated in a fraction of the study patients, a circumstance that requires a less precise substitute in this subset of participants.

A score function that is based on the U-statistic can address these issues of 1) intercurrent SCE's and 2) response variable ascertainments that use different measurements of different precision. The scoring statistic is easy to apply, clinically relevant, and provides flexibility for the investigators' prospective design decisions. Sample size and power formulations for this statistic are provided as functions of clinical event rates and effect size estimates that are easy for investigators to identify and discuss. Examples are provided from current cardiovascular cell therapy research.

Keywords: U-statistic, clinical trials, score function, stem cells

Introduction

This manuscript develops a U-statistic that incorporates two nettlesome but unavoidable features of collecting continuous endpoint measures in modern cardiology clinical trials: 1) the occurrence of a clinical event (e.g., death) during the trial that precludes the measurement of the endpoint at the end of the study, and 2) the requirement, through clinical circumstances, that a less precise determination of the endpoint (e.g., echocardiographic determination) be substituted for a more precise determination (e.g., magnetic resonance imaging).

Endpoint selection is challenging in early human cardiovascular cell therapy clinical trials. Possible choices for the endpoint are the size of the heart damaged by a heart attack, known as the infarct region [Strauer BE, et al. (2002), Abdel-Latif A, et al. (2007)], or changes in the percent of blood ejected by the left ventricle with each heart beat, or left ventricular ejection fraction (LVEF) [Assmus B, et al. (2002)]. Recent attention has focused on other measures of left ventricular dysfunction, e.g., left ventricular end-diastolic volume (LVEDV) (how large the left ventricle becomes at the peak of the cardiac cycle when it is full of blood), and left ventricular end-systolic volume (LVESV) (how small the ventricle is after it has ejected its blood content) [Penicka M, et al. (2007)].

Continuous response variables (endpoints) provide necessary statistical power in well designed clinical experiments. However, since continuous endpoints require measurements at both baseline and during the follow-up period, clinical events can complicate the collection of these important measures. For example, the occurrence of an intervening significant clinical event (SCE) (e.g., death) precludes the follow-up measurement, reducing the precision of the overall measure of therapy effect by reducing the number of endpoint-evaluable subjects. In addition, the observation that there may be a greater proportion of subjects with an SCE in the control group than in the treatment group introduces an informative censoring complication to the analysis. The informative censoring approach of Follmann and Wu [Follmann D and Wu M. (1995)] provides a useful tool for analyzing data in the presence of informative censoring; however, there is no literature on trial design and sample size computations using the informative censoring procedure.

In addition, the most precise measure of the continuous endpoint may not be in the subject’s best interest. For example, while some believe that cardiac magnetic resonance (cMR) is superior to echocardiographic measures of heart function [Grothues (2002)], cMR measures cannot be obtained in everyone. An implantable (metallic) device, e.g., a pacemaker, remains a contraindication to cardiac magnetic resonance imaging. If a substantial fraction of the subjects who were recruited for the study had one of these devices implanted during their follow-up period, they would be unable to undergo cMR at the end of the study. A statistical test that incorporates the most precise heart function measure when available, but uses the less precise measure in its absence, would permit the study to use each subject’s data, regardless of the measurement that is indicated by the subject’s condition.

This manuscript discusses the development of a U-statistic to permit 1) the inclusion of a dichotomous endpoint (SCE) when its occurrence precludes measuring the continuous endpoint measure, and 2) the use of the most precise endpoint information when it is available and less precise information when it is not. The statistic’s development and calibration are based on commonly used clinical research measures that physician scientists understand. Examples are provided from ongoing cardiovascular research.

1. Background

Coronary artery disease (CAD) remains the single largest killer of Americans, producing myocardial infarctions and heart failure (HF). [Rosamond W, et al. (2006)] Recent research has reduced coronary heart disease mortality [Ford ES, et al. (2007)]. However CAD remains a leading cause of HF. Seven million heart attack hospitalizations in the US have generated almost 5 million subjects living with HF who face end-stage HF with its 5-year mortality of approximately 50%. [Levy D, et al. (2002), Roger VL, et al. (2004)] The large burden faced by these subjects with limited options has spurred the investigation of alternative treatments. One potential treatment strategy is the use of bone marrow–derived mononuclear cells (BMMNCs), a source of stem cells that shows promise for the treatment of these subjects.

Phase II clinical trials in cell therapy research are underway in the United States. Studies in animal models have demonstrated that heart function can be significantly improved with bone marrow-derived stem cells following experimental heart attacks induced in animals [Orlic D, et al. (2001), Kocher AA, et al. (2001), Jackson KA, et al. (2001), Yoon YS, et al. (2005)]. Although data supporting significant heart regeneration in these preclinical studies has not been uniform [Murry CE, et al. (2004), Balsam LB, et al. (2004)], it has led to a number of clinical trials testing the strategy that delivery of a subject’s own (or autologous) bone marrow-derived mononuclear cells (BMMNCs) into the infarct region following AMI may improve heart function [Schächinger V, et al. (2006), Janssens S, et al. (2006), Wollert KC, et al. (2004), Lunde K, et al. (2006)].

In light of the relative paucity of mechanistic studies into important questions, such as timing of cell delivery, the National Heart, Lung, and Blood Institute (NHLBI) established the Cardiovascular Cell Therapy Research Network (CCTRN) to accelerate research into the use of cell-based therapies for the management of cardiovascular diseases. The Network is simultaneously conducting two trials in the acute myocardial infarction environment, TIME [Traverse, et al. (2010)] and LateTIME [Traverse, et al. (2010)], and one trial in subjects with chronic heart failure and ongoing ischemia, FOCUS [Willerson, et al. (2010)]. Each of these trials has the characteristics of 1) having continuous measure primary endpoints, 2) enrolling patients who can have SCE’s that preclude the final endpoint measure, and 3) having the most precise measure of the primary endpoint unavailable in a substantial fraction of the population, requiring the use of a less precise available measure.

2. Methods

This method is based on the two-sample U-statistic [Kowalski J and Tu XM. (2007)], a well established, nonparametric measure of effect based on an investigator-determined scoring mechanism. Our development is modeled after the U-statistic’s implementation to score the occurrence of a combination of two discrete endpoints in a cardiovascular clinical trial [Moyé LA, et al. (1992), Moyé LA, (1991), Penicka M, et al. (2007)]. A recent use of this statistic in medical research has been its application to multivariate ordinal data [Wittkowski KM, et al. (2004)].

In its simplest adaptation, the U-statistic “builds itself up” from a prospectively selected scoring procedure. Let there be n observations in the control group. Let each of the n subjects in the control group have a continuous endpoint measure xi, i = 1, 2, 3, ..., n. Similarly, let the primary endpoint measure for each of the m subjects in the active group be indexed by yj, j = 1, 2, 3, …, m.

The U-statistic requires a simple scoring mechanism, denoted by ϕi,j. This is the assignment of a score, designed in this paper, based on comparing the ith subject in the control group with the jth subject in the active group. The score may be as simple as ϕi,j = 1 if xi > yj; ϕi,j = 0 if xi = yj; or ϕi,j = −1 if xi < yj. Since each of the n control group subjects will be compared to each of the m active group subjects, there are nm comparisons. The U-score statistic, We, is simply the average of these nm scores,

$$W_e = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \varphi_{i,j} \qquad (1)$$
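
As a minimal sketch of the averaging described above (not the authors' code), the simple sign score and the resulting statistic We can be computed as follows. The function names and the sample values are hypothetical.

```python
from typing import Sequence


def sign_score(x: float, y: float) -> int:
    """Simple score from the text: 1 if x > y, 0 if x == y, -1 if x < y."""
    return (x > y) - (x < y)


def u_score(control: Sequence[float], active: Sequence[float]) -> float:
    """Average of the n*m pairwise scores between control and active subjects."""
    n, m = len(control), len(active)
    if n == 0 or m == 0:
        raise ValueError("both groups must be non-empty")
    total = sum(sign_score(x, y) for x in control for y in active)
    return total / (n * m)


control = [1.0, 2.0, 3.0]  # hypothetical x_i values
active = [2.0, 4.0]        # hypothetical y_j values
print(u_score(control, active))  # -0.5
```

With this sign convention a negative value indicates that active group values tend to exceed control group values.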

The normalized statistic based on these scores for a test of the null hypothesis (H0) of no treatment effect versus the alternative hypothesis (Ha) of a change in the distribution of the yj’s based on the treatment is

equation M2
(2)

Under mild regularity conditions and adequate sample size, we assume that (2) follows a standard normal distribution. We can then compute the sample size (assuming n = m) from

equation M3
(3)

where v0 = Var [We | H0], va = Var [We | Ha], α is the probability of a type I error, β is the probability of a type II error, and Zc is the cth percentile value from the standard normal distribution. Alternatively, power may be computed from

equation M4
(4)

where ΦZ (z) is the cumulative distribution function of the standard normal distribution.
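
The following sketch shows one standard way to approximate power for a normally distributed statistic whose variance differs under H0 and Ha. It is not a reconstruction of equation (4). It assumes that E[We | H0] = 0, that the test is two sided, and that the rejection probability in the opposite tail is negligible. The function name and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(mu_a: float, v0: float, va: float, alpha: float = 0.05) -> float:
    """Normal-approximation power for a two-sided test of a zero null mean.

    mu_a : assumed mean of the statistic under Ha
    v0   : variance of the statistic under H0
    va   : variance of the statistic under Ha
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Reject when |W| > z_crit * sqrt(v0); under Ha, W ~ N(mu_a, va).
    return std_normal.cdf((abs(mu_a) - z_crit * sqrt(v0)) / sqrt(va))


# Hypothetical inputs, for illustration only.
print(approx_power(mu_a=0.30, v0=0.01, va=0.012))
```

Because v0 and va both shrink as the group sizes grow, the same function can be evaluated over a grid of sample sizes to find the smallest size that reaches a target power.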

However, the adoption of this statistic requires a careful justification of the scoring mechanism required for the response variables (endpoints). The setting for our evaluations is that of a randomized clinical trial with both a control and an active group. We will construct the score statistic in two cases:

  • Case 1. A dichotomous right censored measure combined with a single continuous response variable.
  • Case 2. A dichotomous right censored measure combined with two continuous response variables to be used in a hierarchy determined by the precision of the two response variables.

The mathematics of Case 1 will be developed in detail, and then applied to the Case 2 scenario, which is the scenario that we face in the CCTRN network cell therapy studies.

Case 1: A dichotomous right censored measure combined with a single continuous response variable

The investigators’ goal is to compare the change in the measure of a single continuous response variable over time in the control group to the change in that same variable in the active group. Left ventricular ejection fraction (LVEF) is an example of a commonly measured continuous endpoint. LVEF is the percent of the blood in the left ventricle ejected at each beat (for subjects without heart disease, LVEF is typically larger than 50%). To assess changes over time in a variable such as LVEF, there should be a measurement at baseline and at the end of the study. However, the investigators recognize that this goal may not be achievable in all subjects because of the occurrence of death or another SCE. We will assume that (as is the case with LVEF) an increase in the response variable over time corresponds to improved health status.

Let r be this continuous endpoint variable. Then, for the ith subject in the control group, i = 1, 2, 3, …, n, let di(x) = ri,2(x) − ri,1(x) be the change in this endpoint variable over the duration of the study. Assume that di(x) has mean μΔR(x) and variance σ²ΔR(x). Analogously, let dj(y) = rj,2(y) − rj,1(y) be the change in the endpoint measure for the jth subject in the active group, with mean μΔR(y) and known variance σ²ΔR(y). Under the null hypothesis of the study, μΔR(x) = μΔR(y). If we assume that larger values of μΔR correspond to improved health, then under the alternative hypothesis, the researchers expect that μΔR(x) < μΔR(y).

However, the occurrence of a significant clinical event (SCE) (e.g., a death or a recurrent myocardial infarction (MI)) can affect the follow-up measurement of the continuous variable. The hallmark of the SCE is that 1) its occurrence during the trial either precludes the follow-up measurement (as in the case of death), or perturbs the measurement to the point that the effect of therapy can be difficult to assess (e.g., the occurrence of an intercurrent heart attack), and 2) the SCE event rates in the randomized groups may themselves be related to the therapy effect. The occurrence of an intervening SCE (itself an underpowered evaluation in a small study) reduces the power of the LVEF measure by decreasing the number of subjects who survive to have the follow-up measurement.

In this case we define the scoring mechanism, ϕi,j, as follows:

ϕi,j = 1
if both the ith subject in the control group and the jth subject in the active group experience an SCE during the study, and the time to event for the control group subject is less than the time to event for the active group subject.
ϕi,j = 1
if the ith subject in the control group experiences an SCE during the study and the jth subject in the active group does not experience an SCE during the study.
ϕi,j = −1
if both the ith subject in the control group and the jth subject in the active group experience an SCE during the course of the study, but the time to event for the control group subject is greater than the time to event for the active group subject.
ϕi,j = −1
if the ith subject in the control group does not experience an SCE during the study and the jth subject in the active group does experience an SCE during the study.
ϕi,j = c
if neither the ith subject in the control group nor the jth subject in the active group experiences an SCE during the study, and the change in the continuous measure r for the control group subject is less than the change in the continuous measure for the active group subject.
ϕi,j = −c
if neither the ith subject in the control group nor the jth subject in the active group experiences an SCE during the study, and the change in the continuous measure r for the control group subject is greater than the change in the continuous measure for the active group subject.
ϕi,j = 0
otherwise.

Under this mechanism, the occurrence of an early SCE (e.g., a death) in one group is considered worse than a subject survival or a later occurring SCE in the other treatment group. If both subjects in the comparison have no SCE, then the change in the response variable is compared.

With some additional notation, the assignment of this scoring system permits the computation of the mean and variance of We under the null and alternative hypothesis.

Notation:

Define CX(i)(E, R) as the endpoint status of the ith subject in the control group, and CY(j)(E, R) as the endpoint status of the jth subject in the active group. We will use this notation to allow us to capture either 1) the time to the occurrence of an SCE if one has occurred during the course of the trial, or 2) the change in the continuous variable if an SCE has not occurred.

If an SCE has occurred for the ith subject in the control group, then CX(i)(E, R) = CX(i)(+, R), and its value is the time to the occurrence of the SCE. Since the SCE has occurred during the course of the study, then 0 ≤ CX(i)(+, R) ≤ T where T is the maximum time a subject is to be followed in the research protocol. If an SCE has not occurred, then CX(i)(E, R) = CX(i)(–, R), and we set CX(i)(–, R) to equal the change in the continuous measure. Identical notation applies to the jth subject in the active group, CY(j)(E, R).

For example, if in a 180-day clinical trial the 4th subject in the control group died on day 117, then CX(4)(E, R) = CX(4)(+, R) = 117, the positive sign signifying that the SCE occurred. Alternatively, if the 5th subject in the active group survived the trial and experienced a six-unit increase in the continuous response variable, then CY(5)(E, R) = CY(5)(–, R) = 6, the minus sign in CY(5)(–, R) indicating that no SCE occurred during the study.
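
The Case 1 scoring rules and the notation example above can be sketched in code. This is an illustrative sketch rather than the authors' implementation: the `Subject` record and the function name are hypothetical, and the score for a pair in which the control subject has the larger change is assumed to be −c, by symmetry with the SCE rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subject:
    sce_time: Optional[float]  # days to SCE, or None if no SCE occurred
    change: Optional[float]    # change in r; used only when no SCE occurred


def case1_score(control: Subject, active: Subject, c: float) -> float:
    """Score one control/active pair; positive values favor the active group."""
    x_event = control.sce_time is not None
    y_event = active.sce_time is not None
    if x_event and y_event:
        # Both had an SCE: the earlier event is the worse outcome.
        if control.sce_time < active.sce_time:
            return 1.0
        if control.sce_time > active.sce_time:
            return -1.0
        return 0.0
    if x_event:
        return 1.0
    if y_event:
        return -1.0
    if control.change is None or active.change is None:
        return 0.0  # no usable comparison: the "otherwise" rule
    if control.change < active.change:
        return c
    if control.change > active.change:
        return -c  # assumed sign for the reversed comparison
    return 0.0


control_4 = Subject(sce_time=117.0, change=None)  # died on day 117
active_5 = Subject(sce_time=None, change=6.0)     # survived, six-unit increase
print(case1_score(control_4, active_5, c=4.0))    # 1.0
```

The pair from the worked example scores 1 because the control subject had an SCE and the active subject did not; the weight c matters only when neither subject has an SCE.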

Using this notation and letting 1x∈A be the indicator function that takes the value of 1 when x is a member of set A and 0 otherwise, we can write the score function ϕi,j as

equation M7
(5)

For this function we can compute its expected value under both the null (E[ϕi,j | H0]) and alternative (E[ϕi,j | Ha]) hypotheses (Appendix A).

Sample size computation

Assuming a normal distribution for We, we compute that,

equation M8
(6)

And from consideration of the type II error, we may write,

equation M9
(7)

and

equation M10
(8)

The variance terms in (8), are computed (Appendix B) to be

equation M11

where the constants A0, B0, Aa, and Ba are functions of expectations of ϕi,j under the null and alternative hypotheses. In this scenario, an exact solution for the sample size n is available. If we write m = kn, where k is known (for example, if there are an equal number of subjects in the active and control groups, then k = 1), then substituting for the variance term, we may write

equation M12
(9)

Squaring both sides, simplifying, and squaring again, with expansion and further simplification produces the quartic equation

equation M13
(10)

where

equation M14
(11)

Power can be more directly computed as

equation M15
(12)

An asymptotic solution is also available (Appendix C)

equation M16
(13)

Now substituting the expressions for Var[ϕi,j | H0] and Var[ϕi,j | Ha] from (13) into (8) to compute the sample size of the trial, we write

equation M17
(14)

Noting that the total number of subjects in the study is n in the control group plus kn in the active group, we can write

equation M18
(15)

Power can be expressed as

equation M19
(16)

Case 2. Competing clinical measures of different precision

This second case adds one level of complexity to Case 1. The investigators’ goal is to compare the change in a continuous measure over time in the control group to that of the active group. However, in this circumstance the investigators have two competing assessments of the same continuous endpoint variable. For example, in the TIME study, while the cMR measure of LVEF is the most precise, it must be replaced by echocardiographic determinations of ejection fraction in subjects who cannot undergo cMR. The first assessment, denoted by the continuous variable r, is the most precise but is not available in all subjects. The second, denoted by s, is less precise, but is available for everyone.

For this case, we can modify the scoring function from Case 1 by adding the following conditions:

ϕi,j = d
if neither the ith subject in the control group nor the jth subject in the active group experiences an SCE during the study, the change in the variable r is not available for both the control and active group subjects, and the change in s in the control group subject is less than the change in s in the active group subject.
ϕi,j = −d
if neither the ith subject in the control group nor the jth subject in the active group experiences an SCE during the study, the change in the variable r is not available for both the active and control group subjects, and the change in s in the control group subject is greater than the change in s in the active group subject.

The computations follow the development of Case 1. E[ϕi,j ϕi,j] now requires 22 terms (Appendix D).
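
The Case 2 hierarchy can be sketched as an extension of the Case 1 rules. This is an illustrative sketch under stated assumptions, not the authors' code: the record and function names are hypothetical, and "r is not available for both subjects" is read here as "r is missing for at least one subject in the pair", so that r is never compared with s.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subject2:
    sce_time: Optional[float]  # days to SCE, or None if no SCE occurred
    change_r: Optional[float]  # change in the more precise measure r, if available
    change_s: Optional[float]  # change in the less precise measure s


def _favor_active(control_value: float, active_value: float) -> int:
    """Return 1 if the active value is larger, -1 if smaller, 0 if tied."""
    return (active_value > control_value) - (active_value < control_value)


def case2_score(control: Subject2, active: Subject2, c: float, d: float) -> float:
    """Score one pair: SCE timing first, then r (weight c), then s (weight d)."""
    x_event = control.sce_time is not None
    y_event = active.sce_time is not None
    if x_event and y_event:
        # A later SCE in the active subject favors the active group.
        return float(_favor_active(control.sce_time, active.sce_time))
    if x_event:
        return 1.0
    if y_event:
        return -1.0
    if control.change_r is not None and active.change_r is not None:
        return c * _favor_active(control.change_r, active.change_r)
    if control.change_s is not None and active.change_s is not None:
        return d * _favor_active(control.change_s, active.change_s)
    return 0.0


# Hypothetical pair: r is missing for the control subject, so s is compared.
x = Subject2(sce_time=None, change_r=None, change_s=3.0)
y = Subject2(sce_time=None, change_r=5.0, change_s=1.0)
print(case2_score(x, y, c=4.0, d=2.0))  # -2.0
```

Falling back to s only when r cannot be compared keeps every comparison within a single measurement modality.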

3. Results

A series of evaluations of this U-statistic for clinical measures of event rates and the effect of therapy on the continuous variable were carried out (Figures 1 and 2).

Figure 1
Continuous endpoint weight c influence on the relationship between SCE prevalence and sample size
Figure 2
Influence of efficacy on the relationship between the continuous endpoint weight c and trial size

Figure 1 identifies the relationship between the trial size (total number of subjects in both the active and the placebo group) and the probability of a significant clinical event, for a given effect of cell therapy on the SCE rate, as a function of c, the weight ascribed to the continuous endpoint measure in the analysis. In this circumstance we assume that the change in the response variable in the active group is five units greater than the change in the control group (Δ = 5). We also assume the standard deviation of this change is 7 for each group (90% power and a two-sided type I error rate of 0.05 are assumed for all analyses). In each of the curves in Figure 1, the trial size is larger for larger probabilities of an SCE. Larger probabilities of an SCE increase the proportion of subjects who have no measure of the continuous endpoint at the conclusion of the study, and larger sample sizes are required in order to maintain the power of the evaluation of the therapy’s impact on the continuous measure.

We also note the sample size increases as the value of c decreases. The value of c is the relative weight in the scoring system. As c decreases the impact of a nonzero comparison between the active and control group measures has less weight than that of the comparison of SCE timings. This diminished weight for comparison generates the need for more continuous measure comparisons in the cohort, thereby increasing the sample size. Figure 2 demonstrates the same effect of decreasing sample size for larger values of the continuous weighting function c. Here e represents the effect of the therapy on the SCE rate (represented as the percent reduction in the SCE control group rate experienced by the active group subjects). Sample size decreased as e increased. Note that for all values of efficacy evaluated, the sample size stabilized for values of c greater than 3.

The well established relationship between sample size and treatment effect (Δ) is demonstrated in Figure 3. In the paradigm of combining a continuous and a dichotomous endpoint, the sample size decreases as the effect size increases, and increases as the standard deviation of the difference (σΔ) increases. Analogously, it is well accepted that when a dichotomous measure is used as a response variable in a clinical trial, the trial size increases as the prevalence of the dichotomous variable increases and decreases with increasing efficacy of treatment against that response variable. This is demonstrated in Figure 4. Thus, the score statistic is a function of the effect of the cell therapy on the continuous measure, as reflected by Δ and σΔ, and also of the efficacy of the therapy on the SCE rate, e. Figure 4 demonstrates that, while larger values of the probability of an SCE still produce larger trial sizes, the efficacy of the therapy on the SCE rate moderates this relationship.

Figure 3
Trial size as a function of standard deviation (σ) and effect size (Δ) of the response variable.
Figure 4
Trial size as a function of the SCE prevalence (p) and the intervention’s efficacy on prevalence (e)

Case 2

In this research scenario there are two continuous measures, each with weights c and d. As both c and d increase, the weight of each continuous endpoint increases, and the sample size decreases. However, the larger values of c and d have diminishing impact on the sample size.

Example. In a study in heart failure [Pfeffer MA, et al. 1992], LVEF was expected to increase slightly in the control group, and anticipated to increase to a greater degree in the active group. However, subjects with heart failure die or have clinical events precluding the assessment of this measure. In addition, LVEF was measured in two ways. The premier measurement was through a radionuclide ventriculogram (RVG). RVG-LVEF measures were the most accurate; however, the requirement of radiation exposure limited the utility of this procedure. An alternative was to use echocardiography to obtain the LVEF. Echo-based LVEF, new at the time, was safe but less precise. The presence of two continuous measures with one preferred over the other, in addition to the occurrence of SCE’s, is the circumstance for which the proposed sample size computation was designed.

From a sample of data from this study (Table 2), we use the percent of significant clinical events, in concert with the data for expected changes in RVG-LVEF and echo muscle mass, to compute E[ϕij | Ha]. The terms for the variance, E[ϕijϕij | H0], E[ϕijϕij | Ha], and E[ϕijϕij | Ha], are computed from the eighteen terms in Table 1, each individual expression being based on the data from Table 2 reflecting the expected changes in both continuous endpoint measures during the course of the study in both the control and active groups. Using equations (10) and (11), an exact sample size of 540 (active plus control group subjects) was identified. The quantiles or probabilities of a bivariate (or general k-variate) normal distribution under the null and alternative hypotheses are numerically available, e.g., in the R packages multcomp [Hothorn T, et al., 2011] and mvtnorm [Genz et al., 2011], or as an add-in package in Excel. Using equation (15), the asymptotic solution of 536 is identified, with little difference seen between the asymptotic and exact solutions.

Table 1
Values of ϕijϕij used in computing E[ϕijϕij] for Case 1.
Table 2
LVEF Heart Failure Example

4. Discussion

This manuscript demonstrates a method to combine prospectively declared mortality measures with continuous endpoints that maintains the clinical hierarchy of the occurrence of events, using information from the continuous effect size. It is based on work involving a less complex function in a clinical trial with two dichotomous endpoints [Pfeffer MA et al., 1992]. No imputation is required, and the difficulties with worst rank assignments to missing continuous endpoint data are avoided.

The score function used is very specific to the problems provided here and all mathematical derivations are tied to the score function. Although the general formulations for the mean and variance of We will be the same for any score function, these computations will reflect the score function used. As an example, there is a common index used in research known as a heart failure index. It is a nonparametric index composed of clinical assessments (medication changes), hospitalizations, and the occurrence of death. The U statistic procedure proposed here would be of value in the heart failure score scenario; however, in that case, the score function would be based on pairwise changes in heart function scores between active and control group patients.

The problem posed in this manuscript is distinct from the multiple endpoint scenario where investigators choose from among several different endpoint measures. This latter dilemma has been central to clinical trial interpretation, and many important contributions to the literature have addressed this complex challenge. Clinical trialists commonly face the issue of endpoint selection and cannot resolve it in the favor of one or the other. Clinical trials can have endpoints with no priority among their selection at all [Tilley BC, et al. (1996), National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group. (1995)]. O’Brien examined the role of a rank sum test in a 34-endpoint setting [O'Brien PC. (1984)]. Lachin suggested the use of imputation, assigning a worst rank score to those subjects who are missing the continuous endpoint measure due to a mortal event [Lachin J. (1999)]. The use of Area Under the Curve (AUC) data has been particularly helpful in tumor models [Wu J. (2010)]. In addition, several workers [Tan M., (2002)] proposed two small-sample parametric inference procedures, a heuristic test and a Bayesian procedure, for incomplete longitudinal data with truncation and informative censoring arising in cancer therapy development. Other authors have proposed alternative solutions [O'Brien PC. (1984), Tang DI, et al. (1989a), Tang DI, et al. (1989b), Tang DI, et al. (1993), Tang DI and Getter NL. (1999)].

Of particular use are the weighting values c and d. The investigator has complete control over the values of these weights but must choose them carefully. For example, in clinical trials in which the predominant response value is a dichotomous random variable, weights in the range 0 ≤ d, c ≤ 1 are attractive. Since the dichotomous variable occurs so frequently (e.g., mortality) and is only replaced by the continuous measures in the cases where vital status information is not available, discounting the contribution of the continuous variables is appropriate. However, in studies such as smaller cell therapy studies, where the response variable is continuous and relatively small numbers of subjects have SCE’s, a greater weight for the continuous measure can be justified. In our cell therapy studies, the value of c = 4 is appropriate. We advocate selecting d such that 0 ≤ d < c since the less precise measure should have less influence on the test statistic than the more precise one. However, these values must be chosen before any endpoint analysis takes place to avoid selections that are biased by the investigators’ observations of the values of the final response variables. The U statistic itself is close to normality for small samples [Mann, Whitney (1947)].

The analysis that we propose creates a new endpoint, and that new endpoint as defined by the score function is more complex. However that change does not overly complicate the interpretation of the result. For example in this case, the score function generates an analysis of either 1) an improvement in heart function or 2) longer survival without a death or heart attack. We believe that this new endpoint is understandable to a research, regulatory and clinical community already comfortable with complex endpoints e.g., fatal and nonfatal heart attacks, or fatal and nonfatal strokes. In addition, the fact that the score function never makes cross modality comparisons, comparing only MRI to MRI changes, or echo to echo changes, helps to keep the interpretation clear.

Complications in applying this procedure include the possibility that the event rate of the significant clinical event and the standard deviation of the continuous measure differ from those assumed during the study’s design phase. In addition, interim review of the statistics by Data Safety and Monitoring Boards introduces new complexities. Neither of these is assessed specifically in this manuscript.

Three clinical trials in the NHLBI sponsored Cardiovascular Cell Therapy Research Network (CCTRN) are currently underway in which we will assess the utility of this approach.

Figure 5
Relationship between sample sizes and weighting factors c and d.

Appendix A. Computing E[φi,j]

The notation from the previous section permits us to write the expected value of φi,j under the hypothesis Hk, k = 0 for the null hypothesis, and k = a for the alternative hypothesis. We assume throughout this manuscript that the time to an SCE and the continuous measure are independent. From equation (5), we may write

equation M20
(17)

This computation is straightforward when the probability distributions of 1) the occurrence of SCE’s and 2) the probability distribution of the continuous response variable r are known. For example, assume the time to an SCE follows an exponential distribution with parameter λx in the control group and λy in the active group. Also assume that the change in the continuous measure r follows a normal distribution with mean as before μΔR (x) and standard deviation σΔR (x) in the control group, and analogously mean μΔR (y), and standard deviation σΔR (y) in the active group. The first term on the right hand side of equation (17) is

equation M21

As another example, the last term on the right hand side of equation (17) is

equation M22
(18)
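The individual terms of equation (17) are products of elementary probabilities under the stated exponential and normal assumptions. The following is a minimal sketch of three such building blocks, not a reproduction of equations (17) or (18), whose exact terms depend on the score function of equation (5). The function names and the arguments `lam_x`, `lam_y`, `T`, `mu_x`, `sd_x`, `mu_y`, `sd_y` are illustrative stand-ins for λx, λy, the study duration, μΔR (x), σΔR (x), μΔR (y), and σΔR (y).

```python
import math


def prob_no_sce_both(lam_x, lam_y, T):
    """P[neither subject has an SCE before T], independent exponential times."""
    return math.exp(-(lam_x + lam_y) * T)


def prob_control_sce_first(lam_x, lam_y, T):
    """P[U < V and U < T] for U ~ Exp(lam_x), V ~ Exp(lam_y), independent.

    The control subject's SCE occurs during the study and before the
    active subject's SCE.
    """
    total = lam_x + lam_y
    return lam_x / total * (1.0 - math.exp(-total * T))


def prob_smaller_change(mu_x, sd_x, mu_y, sd_y):
    """P[d_x < d_y] for independent normal changes d_x and d_y."""
    z = (mu_y - mu_x) / math.sqrt(sd_x ** 2 + sd_y ** 2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

As a consistency check, `prob_control_sce_first(lam_x, lam_y, T)`, its mirror image `prob_control_sce_first(lam_y, lam_x, T)`, and `prob_no_sce_both(lam_x, lam_y, T)` partition the sample space and therefore sum to one.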

Since the null hypothesis assumes no treatment effect, we let λx = λy and μΔR (x) = μΔR (y), to see that

equation M23
(19)

Under the alternative hypothesis Ha of a treatment effect, either λx ≠ λy or μΔR (x) ≠ μΔR (y) (or both), permitting us to write,

equation M24
(20)

Appendix B. Variance Computation

The computation of the variance of We, while somewhat more complicated than that of the mean, is tractable. Assume n subjects in the control group and m subjects in the active group. Then,

equation M25
(21)

Further,

equation M26
(22)

The last term on the right of the second line of (22) is easily evaluated.

equation M27

where E[φij] has already been computed under both the null (19) and alternative (20) hypotheses.

To evaluate equation M28 from (22), we rewrite equation M29 as

equation M30

This helpful simplification is due to Gehan [Gehan EA. (1965)]. We may now pass the expectation argument through the preceding equation to find

equation M31
(23)

and evaluating term by term we see

equation M32

E[φij φij′] is the expected value of the product of the scoring function between 1) the ith control group subject and the jth active group subject, and 2) the same ith control group subject but a different j′th active group subject, where j ≠ j′. E[φij φi′j] is an analogous computation involving the ith and i′th subjects in the control group and the jth subject in the active group. We may now rewrite (23) as

equation M33
(24)

Thus

equation M34

Rather than solve for two unknowns, m and n, we let m = kn, where k is known (for example, if there are twice as many subjects in the active group as in the control group, then k = 2). We may then write

equation M35

Further simplification reveals

equation M36
(25)

Since the variance can be computed under both the null and alternative hypothesis, we may write

equation M37

where A0 and B0 are computed under the null hypothesis and Aa and Ba are computed under the alternative.
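The decomposition used in this appendix can be illustrated with the generic variance identity for a double sum over control and active subjects. The sketch below assumes that the statistic is the unnormalised sum of φij over all nm pairs; if We is instead scaled by nm, the result is divided by (nm)². The function names are hypothetical, and the sign kernel used in the simulation is the simple Mann-Whitney comparison, chosen only because its moments are known in closed form. It is not the score function of this manuscript.

```python
import random


def variance_double_sum(n, m, e_phi, e_phi_sq, e_same_control, e_same_active):
    """Variance of W = sum_i sum_j phi_ij for n controls and m actives.

    e_same_control = E[phi_ij * phi_ij'] (same control i, actives j != j')
    e_same_active  = E[phi_ij * phi_i'j] (controls i != i', same active j)
    Pairs that share no subject are independent and contribute no covariance.
    """
    var_phi = e_phi_sq - e_phi ** 2
    cov_control = e_same_control - e_phi ** 2
    cov_active = e_same_active - e_phi ** 2
    return n * m * (var_phi + (m - 1) * cov_control + (n - 1) * cov_active)


def simulate_variance(n, m, reps=20000, seed=1):
    """Monte Carlo variance of W for the sign kernel with iid N(0, 1) data."""
    rng = random.Random(seed)
    totals = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # phi_ij = +1 if y_j > x_i, -1 if y_j < x_i
        totals.append(sum((yj > xi) - (yj < xi) for xi in x for yj in y))
    mean = sum(totals) / reps
    return sum((w - mean) ** 2 for w in totals) / (reps - 1)
```

For the sign kernel under the null hypothesis with continuous data, E[φij] = 0, E[φij²] = 1, and both cross-product expectations equal 1/3, so the identity reduces to nm(n + m + 1)/3. Calling `simulate_variance(4, 5)` should land near `variance_double_sum(4, 5, 0.0, 1.0, 1/3, 1/3)`.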

Appendix C.

The asymptotic approach for Var [We]:

Working from equation (25)

equation M38

Ignoring terms on the order of n⁻², we have

equation M39
(26)

Equation (26) is the variance of We under the alternative hypothesis, Var [We | Ha]. Under the null hypothesis, we assume E[φij φij′] = E[φij φi′j], and E[φij] = 0, producing equation M40. We therefore may write

equation M41
(27)

Substituting the expressions for Var [We | H0] and Var [We | Ha] from (13) into (8) to compute the sample size of the trial, we write

equation M42
(28)

Noting that the total number of subjects in the study is n in the control group plus kn in the active group, we can write

equation M43
(29)

Power can be expressed as

equation M44
(30)
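The way a sample size and a power follow from the null and alternative variances can be sketched with the generic normal-approximation calculation. This is an illustration under stated assumptions rather than a transcription of equations (28) through (30): it assumes a two-sided test, a statistic with mean 0 under H0 and mean `delta` under Ha, and asymptotic variances of the form `v0 / n` and `va / n`, with the allocation ratio k already absorbed into `v0` and `va`. All names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def controls_needed(delta, v0, va, alpha=0.05, power=0.80):
    """Control-group size n (not rounded) for a two-sided level-alpha test."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ((z_alpha * math.sqrt(v0) + z_beta * math.sqrt(va)) / delta) ** 2


def power_at(n, delta, v0, va, alpha=0.05):
    """Approximate power with n controls, ignoring the opposite rejection tail."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(delta) * math.sqrt(n) - z_alpha * math.sqrt(v0)
    return _STD_NORMAL.cdf(shift / math.sqrt(va))


def total_subjects(n, k):
    """Total enrolment: n controls (rounded up) plus k times as many actives."""
    n_control = math.ceil(n)
    return n_control + math.ceil(k * n_control)
```

Evaluating `power_at` at the unrounded value returned by `controls_needed` recovers the target power, which is a quick way to confirm that the two formulas are mutually consistent.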

Appendix D. Structure for E[φij φij′]

The structure for E[φij φij′] can be identified and tabulated (Table 1), revealing 18 terms, each of which is evaluated under the null and alternative hypotheses. For example, one of the terms may be written as,

equation M45
(31)

Here U follows an exponential distribution with parameter λx, T is the duration of the study, and V and W are i.i.d. exponentially distributed random variables with parameter λy. Note that the final expressions for the expectation are in terms of the parameters λx, λy, μΔR (x), μΔR (y), equation M46, and equation M47. These are available from the clinical scientists. Thus (31) may be written as

equation M48
(32)

which may be evaluated under the null hypothesis, where λx = λy, or the alternative, where λx ≠ λy.

However, terms that involve comparison of the continuous response variable between three subjects must be handled differently. Consider the circumstance where the ith subject in the control group’s LVEF has increased by more than the jth and the j′th subject in the active group. Then one of the expressions required for E[φij φij′] is

equation M49
(33)

where di is the change in the response variable for the ith subject in the control group over the duration of the study, and dj and dj′ are the response variable changes for the jth and j′th subjects in the active group respectively. The expression e^(−(λx + 2λy)T) is the probability that all three subjects (one in the control group and two in the active group) have no SCE throughout the course of the trial and therefore have the continuous measure assessed at baseline and at the end of the study.

To compute P[(di < dj) ∩ (di < dj′)], recall that for the one control group subject, di follows a equation M50, and for the two active group subjects, dj and dj′ are identically distributed as equation M51. If we define the random variables U and V in the affine transformation

equation M52
(34)

We may then write P[(di < dj) ∩ (di < dj′)] = P[U < 0 ∩ V < 0]. Since the joint distribution of ri (x), rj (y), and rj′ (y) is multivariate normal with mean vector u and covariance matrix Σ,

equation M53
(35)

applying the transformation of (34), we see that the joint distribution of U and V is bivariate normal with mean and variance

equation M54
(36)

Thus, the desired probability P[U < 0 ∩ V < 0] is simply the evaluation of this region over the bivariate normal distribution defined in (36), and the probability required by P[(di < dj) ∩ (di < dj′)] is therefore available.
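A numerical value for this probability can be obtained without a bivariate normal routine. The sketch below swaps in a different but equivalent technique: it conditions on di and integrates over its normal density, using the fact that dj and dj′ are independent of di and of each other, so the result equals the bivariate normal region described above. It then multiplies by the no-SCE probability, on the assumption that the term in equation (33) is the product of the two factors described in the text. Function and argument names are hypothetical, with `mu_x`, `sd_x`, `mu_y`, `sd_y` standing for μΔR (x), σΔR (x), μΔR (y), σΔR (y).

```python
import math


def _norm_pdf(t, mu, sd):
    z = (t - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def _norm_upper_tail(t, mu, sd):
    """P[X > t] for X ~ N(mu, sd^2)."""
    return 0.5 * math.erfc((t - mu) / (sd * math.sqrt(2.0)))


def prob_control_below_both(mu_x, sd_x, mu_y, sd_y, steps=2000):
    """P[(d_i < d_j) and (d_i < d_j')] by Simpson's rule over the value of d_i.

    steps must be even; the integration range is mu_x +/- 8 sd_x.
    """
    lower = mu_x - 8.0 * sd_x
    width = 16.0 * sd_x / steps
    total = 0.0
    for s in range(steps + 1):
        t = lower + s * width
        weight = 1 if s in (0, steps) else (4 if s % 2 else 2)
        tail = _norm_upper_tail(t, mu_y, sd_y)
        total += weight * _norm_pdf(t, mu_x, sd_x) * tail * tail
    return total * width / 3.0


def term_all_three_measured(lam_x, lam_y, T, mu_x, sd_x, mu_y, sd_y):
    """No-SCE probability for one control and two actives, times the ordering
    probability (assumed product form)."""
    no_sce = math.exp(-(lam_x + 2.0 * lam_y) * T)
    return no_sce * prob_control_below_both(mu_x, sd_x, mu_y, sd_y)
```

Under the null hypothesis the three changes are identically distributed, so the control subject's change is the smallest of the three with probability 1/3, which gives a simple check on the integration.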

Each of the eighteen terms is computed similarly, and assembled in accordance with Table 1 to construct E[φij φij′] under the null and alternative hypotheses. An analogous table can be constructed for computing E[φij φi′j]. Then E[φij φij′ | H0] and E[φij φi′j | H0] can be substituted into equation (13) to compute Var [We | H0], and analogously, E[φij φij′ | Ha] and E[φij φi′j | Ha] will be used to compute Var [We | Ha]. Alternatively, E[φij φij′ | H0], E[φij φi′j | H0] and E[φij φij′ | Ha], E[φij φi′j | Ha] can be substituted into equations (25) and (10) to compute the sample size for the exact computation, or equation (15) in the case of the asymptotic solution. These expectations can also be used to compute the power in equations (12) and (16) for the exact and asymptotic solutions respectively.

Contributor Information

Lemuel A Moyé, University of Texas Health Science Center at Houston.

Dejian Lai, University of Texas School of Public Health.

Kaiyan Jing, University of Texas School of Public Health.

Mary Sarah Baraniuk, University of Texas School of Public Health.

Minjung Kwak, National Heart, Lung and Blood Institute.

Marc S. Penn, The Cleveland Clinic Foundation.

Colon O. Wu, National Heart, Lung and Blood Institute.

References

1. Abdel-Latif A, et al. Adult bone marrow-derived cells for cardiac repair. Arch Intern Med. 2007;167:989–997. [PubMed]
2. Assmus B, et al. Transplantation of progenitor cells and regeneration enhancement in acute myocardial infarction (TOPCARE-AMI). Circulation. 2002;106:3009–3017. doi: 10.1161/01.CIR.0000043246.74879.CD. [PubMed] [Cross Ref]
3. Balsam LB, et al. Haematopoietic stem cells adopt mature haematopoietic fates in ischemic myocardium. Nature. 2004;428:668–673. doi: 10.1038/nature02460. [PubMed] [Cross Ref]
4. Follmann D, Wu M. An approximate generalized linear model with random effects for informative missing data. Biometrics. 1995 Mar;51(1):151–68. doi: 10.2307/2533322. [PubMed] [Cross Ref]
5. Ford ES, et al. Explaining the decrease in U.S. deaths from coronary disease, 1980–2000. N Engl J Med. 2007;356:2388–98. doi: 10.1056/NEJMsa053935. [PubMed] [Cross Ref]
6. Gehan EA. A generalized Score test for comparing arbitrarily singly-censored samples. Biometrika. 1965;52:203–223. [PubMed]
7. Genz A, Bretz F, Miwa T, I X, Leisch F, Scheipl F, Bornkamp B, Hothorn T. Mvtnorm: multivariate normal and t distributions. R package version 0.9-96. 2011. URL http://CRAN.R-project.org/package=mvtnorm.
8. Grothues F, Smith GC, Moon JC, Bellenger NG, Collins P, Klein HU, Pennel DJ. Comparison of Interstudy Reproducibility of Cardiovascular Magnetic Resonance With Two-Dimensional Echocardiography in Normal Subjects and in Subjects With Heart Failure or Left Ventricular Hypertrophy. Am J Cardiol. 2002;90:29–34. doi: 10.1016/S0002-9149(02)02381-0. [PubMed] [Cross Ref]
9. Hothorn T, Bretz F, Westfall P. Simultaneous Inference in General Parametric Models. Biometrical Journal. 2008;50(3):346–363. doi: 10.1002/bimj.200810425. [PubMed] [Cross Ref]
10. Jackson KA, et al. Regeneration of ischemic cardiac muscle and vascular endothelium by adult stem cells. J Clin Invest. 2001;107:1395–1402. doi: 10.1172/JCI12150. [PMC free article] [PubMed] [Cross Ref]
11. Janssens S, et al. Autologous bone marrow-derived stem-cell transfer in subjects with ST-segment elevation myocardial infarction: double-blind, randomised controlled trial. Lancet. 2006;367:113–121. doi: 10.1016/S0140-6736(05)67861-0. [PubMed] [Cross Ref]
12. Kocher AA, et al. Neovascularization of ischemic myocardium by human bone-marrow-derived angioblasts prevents cardiomyocyte apoptosis, reduces remodeling and improves cardiac function. Nat Med. 2001;7:430–436. doi: 10.1038/86498. [PubMed] [Cross Ref]
13. Kowalski J, Tu XM. Modern Applied U-statistics. New York: Wiley; 2007.
14. Lachin J. Worst-rank score analysis with informatively missing observations in clinical trials. Controlled Clinical Trials. 1999;20:408–422. doi: 10.1016/S0197-2456(99)00022-7. [PubMed] [Cross Ref]
15. Levy D, et al. Long-term trends in the incidence of and survival with heart failure. N Engl J Med. 2002;347:1397–402. doi: 10.1056/NEJMoa020265. [PubMed] [Cross Ref]
16. Lunde K, et al. Intracoronary injection of mononuclear bone marrow cells in acute wall infarction. NEJM. 2006;355:1199–1209. doi: 10.1056/NEJMoa055706. [PubMed] [Cross Ref]
17. Mann B, Whitney DR. On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics. 1947;18:50–60. doi: 10.1214/aoms/1177730491. [Cross Ref]
18. Moyé LA, et al. Analysis of a Clinical Trial Involving a Combined Mortality and Adherence Dependent Interval Censored Endpoint. Statistics in Medicine. 1992;11:1705–1717. doi: 10.1002/sim.4780111305. [PubMed] [Cross Ref]
19. Moyé LA, for the SAVE Cooperative Group. Rationale and Design of a Trial to Assess Subject Survival and Ventricular Enlargement after Myocardial Infarction. Am J Cardiol. 1991;68:70D–79D. [PubMed]
20. Murry CE, et al. Haematopoietic stem cells do not transdifferentiate into cardiac myocytes in myocardial infarcts. Nature. 2004;428:664–668. doi: 10.1038/nature02446. [PubMed] [Cross Ref]
21. National Institute of Neurological Disorders and Stroke rt- PA Stroke Study Group Tissue plasminogen activator for acute ischemic stroke. New England Journal of Medicine. 1995;333:1581–1587. [PubMed]
22. O'Brien PC. Procedures for Comparing Samples with Multiple Endpoints. Biometrics. 1984;40:1079–1087. [PubMed]
23. Orlic D, et al. Bone marrow cells regenerate infarcted myocardium. Nature. 2001;410:701–705. doi: 10.1038/35070587. [PubMed] [Cross Ref]
24. Penicka M, et al. Research Correspondence: Intracoronary injection of autologous bone marrow-derived mononuclear cells in subjects with large anterior acute myocardial infarction: A prematurely terminated randomized study. Journal of the American College of Cardiology. 2007;49:2373–2374. doi: 10.1016/j.jacc.2007.04.009. [PubMed] [Cross Ref]
25. Pfeffer MA, et al. Effect of Captopril on mortality and morbidity in subjects with left ventricular dysfunction after myocardial infarction - results of the Survival and Ventricular Enlargement Trial. N Engl J Med. 1992 Sep 3;327(10):669–677. doi: 10.1056/NEJM199209033271001. [PubMed] [Cross Ref]
26. Roger VL, et al. Trends in heart failure incidence and survival in a community-based population. JAMA. 2004;292:344–50. doi: 10.1001/jama.292.3.344. [PubMed] [Cross Ref]
27. Rosamond W, et al. Heart disease and stroke statistics--2007 update: a report from the American Heart Association Statistics Committee and Stroke Statistics Subcommittee. Circulation. 2007;115:e69–171. doi: 10.1161/CIRCULATIONAHA.106.179918. [PubMed] [Cross Ref]
28. Schächinger V, et al. Intracoronary bone marrow-derived progenitor cells in acute myocardial infarction. NEJM. 2006;355:1210–1221. doi: 10.1056/NEJMoa060186. [PubMed] [Cross Ref]
29. Strauer BE, et al. Repair of infarcted myocardium by autologous intracoronary mononuclear bone marrow cell transplantation in humans. Circulation. 2002;106:1913–1918. doi: 10.1161/01.CIR.0000034046.87607.1C. [PubMed] [Cross Ref]
30. Tan M, Fang HB, Tian GL, Houghton PJ. Small sample inference for incomplete longitudinal data with truncation and censoring in tumor xeno-graft models. Biometrics. 2002;58:612–620. doi: 10.1111/j.0006-341X.2002.00612.x. [PubMed] [Cross Ref]
31. Tang DI, et al. An approximate likelihood ratio test for a normal mean vector with nonnegative components with application to clinical trials. Biometrika. 1989a;76:577–583. doi: 10.1093/biomet/76.3.577. [Cross Ref]
32. Tang DI, et al. Design of group sequential clinical trials with multiple endpoints. Journal of the American Statistical Association. 1989b;84:776–779. doi: 10.2307/2289665. [Cross Ref]
33. Tang DI, et al. On the design and analysis of randomized clinical trials with multiple endpoints. Biometrics. 1993;49:23–30. doi: 10.2307/2532599. [PubMed] [Cross Ref]
34. Tang DI, Getter NL. Closed Testing Procedures for Group Sequential Clinical Trials with Multiple Endpoints. Biometrics. 1999;55:1188–1192. [PubMed]
35. Tilley BC, et al. Use of a global test for multiple outcomes in stroke trials with application to the National Institute of Neurological Disorders and Stroke t-PA stroke trial. Stroke. 1996;27:2136–2142. doi: 10.1161/01.STR.27.11.2136. [PubMed] [Cross Ref]
36. Traverse JH, Henry TD, Vaughan D, Ellis SG, Pepine CJ, Willerson JT, Zhao DXM, Piller LB, Penn MS, Byrne BJ, Perin EC, Gee AP, Hatzopoulous AK, McKenna DH, Forder JR, Taylor DA, Cogle CR, Olson RE, Jorgenson BC, Sayre SL, Vojvodic RW, Gordon DJ, Skarlatos SI, Moyé LA, Simari RD, for the Cardiovascular Cell Therapy Research Network Rationale and Design for TIME: A Phase-II, Randomized, Double-Blind, Placebo-Controlled Pilot Trial Evaluating the Safety and Effect of Timing of Administration of Bone Marrow Mononuclear Cells Following Acute Myocardial Infarction. American Heart Journal. 158:356–63. [PMC free article] [PubMed]
37. Traverse JH, Henry TD, Vaughan D, Ellis SG, Pepine CJ, Willerson JT, Zhao DXM, Piller LB, Penn MS, Byrne BJ, Perin EC, Gee AP, Hatzopoulous AK, McKenna DH, Forder JR, Taylor DA, Cogle CR, Olson RE, Jorgenson BC, Sayre SL, Vojvodic RW, Gordon DJ, Skarlatos SI, Moyé LA, Simari RD, for the Cardiovascular Cell Therapy Research Network Rationale and Design for TIME: A Phase-II, Randomized, Double-Blind, Placebo-Controlled Pilot Trial Evaluating the Safety and Effect of Timing of Administration of Bone Marrow Mononuclear Cells Following Acute Myocardial Infarction. American Heart Journal. 158:356–63. [PMC free article] [PubMed]
38. Willerson JT, Perin EC, Ellis SG, Pepine CJ, Henry TD, Zhao DX, Lai D, Penn MS, Byrne BJ, Silva G, Gee A, Traverse JH, Hatzopoulos AK, Forder JR, Martin D, Kronenberg M, Taylor DA, Cogle CR, Baraniuk S, Westbrook L, Sayre SL, Vojvodic RW, Gordon DJ, Skarlatos SI, Moyé LA, Simari RD, Cardiovascular Cell Therapy Research Network (CCTRN) Intramyocardial injection of autologous bone marrow mononuclear cells for subjects with chronic ischemic heart disease and left ventricular dysfunction (First Mononuclear Cells injected in the US [FOCUS]): Rationale and design. Am Heart J. 2010 Aug;160(2):215–23. doi: 10.1016/j.ahj.2010.03.029. [PMC free article] [PubMed] [Cross Ref]
39. Wittkowski KM, et al. Combining several ordinal measures in clinical studies. Statistics in Medicine. 2004;23:1579–1592. doi: 10.1002/sim.1778. [PubMed] [Cross Ref]
40. Wollert KC, et al. Intracoronary autologous bone-marrow cell transfer after myocardial infarction: the BOOST randomized controlled clinical trial. Lancet. 2004;364:141–148. doi: 10.1016/S0140-6736(04)16626-9. [PubMed] [Cross Ref]
41. Wu JR, Houghton PJ. Interval approach to assessing antitumor activity for tumor xenograft studies. Pharmaceutical Statistics. 2010;9(1):46–54. doi: 10.1002/pst.369. [PMC free article] [PubMed] [Cross Ref]
42. Yoon YS, et al. Clonally expanded novel multipotent stem cells from human bone marrow regenerate myocardium after myocardial infarction. JCI. 2005;115:326–338. [PMC free article] [PubMed]

Articles from The International Journal of Biostatistics are provided here courtesy of Berkeley Electronic Press