The assumptions that anchor large clinical trials are rooted in smaller, Phase II studies. In addition to specifying the target population, intervention delivery, and patient follow-up duration, physician-scientists who design these Phase II studies must select the appropriate response variables (endpoints). However, endpoint measures can be problematic. If the endpoint assesses the change in a continuous measure over time, then the occurrence of an intervening significant clinical event (SCE), such as death, can preclude the follow-up measurement. Finally, the ideal continuous endpoint measurement may be contraindicated in a fraction of the study patients, a change that requires a less precise substitution in this subset of participants.
A score function that is based on the U-statistic can address these issues of 1) intercurrent SCE's and 2) response variable ascertainments that use different measurements of different precision. The scoring statistic is easy to apply, clinically relevant, and provides flexibility for the investigators' prospective design decisions. Sample size and power formulations for this statistic are provided as functions of clinical event rates and effect size estimates that are easy for investigators to identify and discuss. Examples are provided from current cardiovascular cell therapy research.
This manuscript develops a U-statistic that incorporates two nettlesome but unavoidable features of collecting continuous endpoint measures in modern cardiology clinical trials: 1) the occurrence of a clinical event (e.g., death) during the trial that precludes the measurement of the endpoint at the end of the study, and 2) the requirement, imposed by clinical circumstances, that a less precise determination of the endpoint (e.g., echocardiographic determination) be substituted for a more precise determination (e.g., magnetic resonance imaging).
Endpoint selection is challenging in early human cardiovascular cell therapy clinical trials. Possible choices for the endpoint are the size of the heart damaged by a heart attack, known as the infarct region [Strauer BE, et al. (2002), Abdel-Latif A, et al. (2007)], or changes in the percent of blood ejected by the left ventricle with each heart beat, or left ventricular ejection fraction (LVEF) [Assmus B, et al. (2002)]. Recent attention has focused on other measures of left ventricular dysfunction e.g., left ventricular end-diastolic volume (LVEDV) (how large the left ventricle becomes at the peak of the cardiac cycle when it is full of blood), and left ventricular end systolic volume (LVESV) (how small the ventricle is after it has ejected its blood content [Penicka M, et al. (2007)]).
Continuous response variables (endpoints) provide necessary statistical power in well designed clinical experiments. However, since continuous endpoints require measurements at both baseline and during the follow-up period, clinical events can complicate the collection of these important measures. For example, the occurrence of an intervening significant clinical event (SCE) (e.g., death) precludes the follow-up measurement, reducing the precision of the overall measure of therapy effect by reducing the number of endpoint-evaluable subjects. In addition, the observation that there may be a greater proportion of subjects with an SCE in the control group than in the treatment group introduces a new informative censoring complication to the analysis. The informative censoring approach of Follmann, Wu, et. al. [Follmann D and Wu M. (1995)] provides a useful tool for analyzing data in the presence of informative censoring; however, there is no literature on trial design and sample size computations using the informative censoring procedure.
In addition, obtaining the most precise measure of the continuous endpoint may not be in the subject’s best interest. For example, while some believe that cardiac magnetic resonance (cMR) is superior to echocardiographic measures of heart function [Grothues (2002)], cMR measures cannot be obtained in everyone. Implantable metallic devices, e.g., a pacemaker, remain contraindications to cardiac magnetic resonance imaging. If a substantial fraction of the subjects recruited for the study have one of these devices implanted during follow-up, they will be unable to undergo cMR at the end of the study. A statistical test that incorporates the most precise heart function measure when available, but uses the less precise measure in its absence, would permit the study to use each subject’s data, regardless of the measurement that is indicated by the subject’s condition.
This manuscript discusses the development of a U-statistic to permit 1) the inclusion of a dichotomous endpoint (SCE) when its occurrence precludes measuring the continuous endpoint measure, and 2) the use of the most precise endpoint information when it is available and less precise information when it is not. The statistic’s development and calibration are based on commonly used clinical research measures that physician scientists understand. Examples are provided from ongoing cardiovascular research.
Coronary artery disease (CAD) remains the single largest killer of Americans, producing myocardial infarctions and heart failure (HF). [Rosamond W, et al. (2006)] Recent research has reduced coronary heart disease mortality [Ford ES, et al. (2007)]. However CAD remains a leading cause of HF. Seven million heart attack hospitalizations in the US have generated almost 5 million subjects living with HF who face end-stage HF with its 5-year mortality of approximately 50%. [Levy D, et al. (2002), Roger VL, et al. (2004)] The large burden faced by these subjects with limited options has spurred the investigation of alternative treatments. One potential treatment strategy is the use of bone marrow–derived mononuclear cells (BMMNCs), a source of stem cells that shows promise for the treatment of these subjects.
Phase II clinical trials in cell therapy research are underway in the United States. Studies in animal models have demonstrated that heart function can be significantly improved with bone marrow-derived stem cells following experimentally induced heart attacks [Orlic D, et al. (2001), Kocher AA, et al. (2001), Jackson KA, et al. (2001), Yoon YS, et al. (2005)]. Although data supporting significant heart regeneration in these preclinical studies have not been uniform [Murry CE, et al. (2004), Balsam LB, et al. (2004)], they have led to a number of clinical trials testing the strategy that delivery of a subject’s own (autologous) bone marrow-derived mononuclear cells (BMMNCs) into the infarct region following acute myocardial infarction (AMI) may improve heart function [Schächinger V, et al. (2006), Janssens S, et al. (2006), Wollert KC, et al. (2004), Lunde K, et al. (2006)].
In light of the relative paucity of mechanistic studies into important questions, such as timing of cell delivery, the National Heart, Lung, and Blood Institute (NHLBI) established the Cardiovascular Cell Therapy Research Network (CCTRN) to accelerate research into the use of cell-based therapies for the management of cardiovascular diseases. The Network is simultaneously conducting two trials in the acute myocardial infarction environment, TIME [Traverse, et. al. (2010)] and LateTIME [Traverse, et. al. (2010)], and one trial in subjects with chronic heart failure and ongoing ischemia, FOCUS [Willerson, et. al. (2010)]. Each of these trials 1) has a continuous primary endpoint measure, 2) enrolls patients who can have SCE’s that preclude the final endpoint measure, and 3) has the most precise measure of the primary endpoint unavailable in a substantial fraction of the population, requiring the use of a less precise measure.
This method is based on the two-sample U-statistic [Kowalski J and Tu XM. (2007)], a well established, nonparametric measure of effect based on an investigator-determined scoring mechanism. Our development is modeled after the U-statistic’s implementation to score the occurrence of a combination of two discrete endpoints in a cardiovascular clinical trial [Moyé LA, et al. (1992), Moyé LA, (1991), Penicka M, et al. (2007)]. A recent use of this statistic in medical research has been its application to multivariate ordinal data [Wittkowski KM, et al. (2004)].
In its simplest adaptation, the U-statistic “builds itself up” from a prospectively selected scoring procedure. Let there be n observations in the control group. Let each of the n subjects in the control group have a continuous endpoint measure xi, i = 1, 2, 3, ..., n. Similarly, let the primary endpoint measure for each of the m subjects in the active group be indexed by yj, j = 1, 2, 3, …, m.
The U-statistic requires a simple scoring mechanism, denoted by ϕi,j. This is the score, designed in this paper, that is assigned by comparing the ith subject in the control group with the jth subject in the active group. The score may be as simple as ϕi,j = 1 if xi > yj; ϕi,j = 0 if xi = yj; or ϕi,j = −1 if xi < yj. Since each of the n control group subjects will be compared to each of the m active group subjects, there are nm comparisons. The U-score statistic, We, is simply the average of these nm scores,
$$W_e = \frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m}\phi_{i,j}. \tag{1}$$
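The averaging of pairwise scores described above can be sketched in a few lines of Python. This is a minimal illustration using the simple sign score from the text; the function names and the data values are hypothetical and are not taken from the trials.

```python
from typing import Sequence


def sign_score(x: float, y: float) -> int:
    """Simple score from the text: 1 if x > y, 0 if x == y, -1 if x < y."""
    return (x > y) - (x < y)


def u_score_statistic(control: Sequence[float], active: Sequence[float]) -> float:
    """Average of the n*m pairwise scores (control subject i vs. active subject j)."""
    n, m = len(control), len(active)
    if n == 0 or m == 0:
        raise ValueError("both groups need at least one subject")
    total = sum(sign_score(x, y) for x in control for y in active)
    return total / (n * m)


# Hypothetical endpoint values, for illustration only.
control_values = [1.0, 3.0, 2.0]   # x_i, n = 3
active_values = [4.0, 2.0]         # y_j, m = 2
w_e = u_score_statistic(control_values, active_values)
```

With this sign convention, a negative value of `w_e` indicates that control group values tend to be smaller than active group values.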
The normalized statistic based on these scores for a test of the null hypothesis (H0) of no treatment effect versus the alternative hypothesis (Ha) of a change in the distribution of the yj’s based on the treatment is
$$\frac{W_e - E[W_e \mid H_0]}{\sqrt{\operatorname{Var}[W_e \mid H_0]}}. \tag{2}$$
Under mild regularity conditions and with an adequate sample size, (2) approximately follows a standard normal distribution, and we can compute the sample size from (assuming n = m)
where v0 = Var [We | H0], va = Var [We | Ha], α is the probability of a type I error, β is the probability of a type II error, and Zc is the cth percentile value from the standard normal distribution. Alternatively, power may be computed from
where ΦZ (z) is the cumulative distribution function of the standard normal distribution.
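As a rough illustration of how power and sample size follow from the normal approximation, the sketch below uses a textbook two-sided form. It is an assumption-laden sketch and is not claimed to reproduce the paper's own sample size and power equations: `delta` stands for E[We | Ha] − E[We | H0], the rejection probability in the opposite tail is ignored, and the sample size helper further assumes that both variances shrink as a0/n and aa/n (a leading-order simplification). All function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def approx_power(delta: float, v0: float, va: float, alpha: float = 0.05) -> float:
    """Approximate two-sided power given Var[We|H0] = v0 and Var[We|Ha] = va."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf((abs(delta) - z_alpha * sqrt(v0)) / sqrt(va))


def approx_n_per_group(delta: float, a0: float, aa: float,
                       alpha: float = 0.05, power: float = 0.90) -> float:
    """Per-group n under the ASSUMED scaling v0 = a0 / n and va = aa / n."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ((z_alpha * sqrt(a0) + z_beta * sqrt(aa)) / abs(delta)) ** 2


# Hypothetical inputs, for illustration only.
n_needed = approx_n_per_group(delta=0.2, a0=1.0, aa=1.2)
achieved = approx_power(delta=0.2, v0=1.0 / n_needed, va=1.2 / n_needed)
```

In practice `n_needed` would be rounded up to the next integer, and the exact variance expressions of the paper would replace the simple a/n scaling assumed here.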
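However, the adoption of this statistic requires a careful justification of the scoring mechanism required for the response variables (endpoints). The setting for our evaluations is that of a randomized clinical trial with both a control and an active group. We will construct the score statistic in two cases: Case 1, in which a single continuous response variable is measured and an SCE can preclude its follow-up measurement, and Case 2, in which two measurements of the same continuous response variable are available, one more precise than the other, again in the presence of SCEs.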
The mathematics of Case 1 will be developed in detail, and then applied to the Case 2 scenario, which is the scenario that we face in the CCTRN network cell therapy studies.
The investigators’ goal is to compare the change in the measure of a single continuous response variable over time in the control group to the change in that same variable in the active group. Left ventricular ejection fraction (LVEF) is an example of a commonly measured continuous endpoint. LVEF is the percent of the blood in the left ventricle ejected at each beat (for subjects without heart disease, LVEF is typically larger than 50%). To assess changes over time in a variable such as LVEF, there should be a measurement at baseline and at the end of the study. However, the investigators recognize that this goal may not be achievable in all subjects because of the occurrence of death or another SCE. We will assume that (as is the case with LVEF) an increase in the response variable over time corresponds to improved health status.
Let r be this continuous endpoint variable. Then, for the ith subject in the control group, i = 1, 2, 3, …, n, let di (x) = ri,2(x) − ri,1(x) be the change in this endpoint variable over the duration of the study. Assume that di(x) has mean μΔR (x) and variance σ²ΔR (x). Analogously, let dj(y) = rj,2(y) − rj,1(y) be the change in the endpoint measure for the jth subject in the active group, with mean μΔR (y) and known variance σ²ΔR (y). Under the null hypothesis of the study, μΔR (x) = μΔR (y). If we assume that larger values of μΔR correspond to improved health, then under the alternative hypothesis, the researchers expect that μΔR (x) < μΔR (y).
However, the occurrence of a significant clinical event (SCE), e.g., a death or a recurrent myocardial infarction (MI), can affect the follow-up measurement of the continuous variable. The hallmark of the SCE is that 1) its occurrence during the trial either precludes the follow-up measurement (as in the case of death), or perturbs the measurement to the point that the effect of therapy can be difficult to assess (e.g., the occurrence of an intercurrent heart attack), and 2) the SCE event rates in the randomized groups may themselves be related to the therapy effect. The occurrence of an intervening SCE (itself an underpowered evaluation in a small study) reduces the power of the LVEF measure by decreasing the number of subjects who survive to have the follow-up measurement.
In this case we define the scoring mechanism, ϕi,j, as follows:
Under this mechanism, the occurrence of an early SCE (e.g., a death) in one group is considered worse than a subject survival or a later occurring SCE in the other treatment group. If both subjects in the comparison have no SCE, then the change in the response variable is compared.
With some additional notation, the assignment of this scoring system permits the computation of the mean and variance of We under the null and alternative hypotheses.
Define CX(i)(E, R) as the endpoint status of the ith subject in the control group, and CY(j)(E, R) as the endpoint status of the jth subject in the active group. We will use this notation to allow us to capture either 1) the time to the occurrence of an SCE if one has occurred during the course of the trial, or 2) the change in the continuous variable if an SCE has not occurred.
If an SCE has occurred for the ith subject in the control group, then CX(i)(E, R) = CX(i)(+, R), and its value is the time to the occurrence of the SCE. Since the SCE has occurred during the course of the study, then 0 ≤ CX(i)(+, R) ≤ T where T is the maximum time a subject is to be followed in the research protocol. If an SCE has not occurred, then CX(i)(E, R) = CX(i)(–, R), and we set CX(i)(–, R) to equal the change in the continuous measure. Identical notation applies to the jth subject in the active group, CY(j)(E, R).
For example, if in a 180 day clinical trial, the 4th subject in the control group died on day 117, then CX(4)(E, R) = CX(4)(+, R) = 117, the positive sign signifying that the SCE event occurred. Alternatively, if the 5th subject in the active group survived the trial and experienced a six unit increase in the continuous response variable, then CY(5)(E, R) = CY(5)(–, R) = 6, the minus sign in CY(5)(–, R) indicating that no SCE occurred during the study.
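The scoring rule and endpoint-status notation described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact score function: the sign convention (+1 when the active group subject fares better), the treatment of equal SCE times as ties, and the names `Outcome` and `case1_score` are all hypothetical. The weight `c` on continuous comparisons is the relative weight referred to later in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    """Endpoint status of one subject: an SCE time or a continuous change."""
    sce_time: Optional[float] = None  # day of the SCE; None if no SCE occurred
    change: Optional[float] = None    # change in the continuous measure; None after an SCE


def sign(value: float) -> int:
    return (value > 0) - (value < 0)


def case1_score(control: Outcome, active: Outcome, c: float = 1.0) -> float:
    """Score one control/active pair (ASSUMED convention: +1 means active fares better)."""
    if control.sce_time is not None and active.sce_time is not None:
        # Both had an SCE: the earlier SCE is the worse outcome.
        return float(sign(active.sce_time - control.sce_time))
    if control.sce_time is not None:
        return 1.0    # only the control subject had an SCE
    if active.sce_time is not None:
        return -1.0   # only the active subject had an SCE
    # Neither had an SCE: compare the continuous changes, weighted by c.
    return c * sign(active.change - control.change)


# The two subjects from the example in the text (180 day trial).
control_4 = Outcome(sce_time=117.0)   # died on day 117
active_5 = Outcome(change=6.0)        # survived, six unit increase
pair_score = case1_score(control_4, active_5)
```

Here `pair_score` is 1.0 under the assumed convention, because the control group subject had an SCE and the active group subject did not.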
Using this notation and letting 1{x ∈ A} be the indicator function that takes the value of 1 when x is a member of set A and 0 otherwise, we can write the score function ϕi,j as
For this function we can compute its expected value under both the null (E[ϕi,j | H0]) and alternative (E[ϕi,j | Ha]) hypotheses (Appendix A).
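To make the expected score concrete, the sketch below evaluates it in closed form under the distributional assumptions used in Appendix A: exponential SCE times with rates λx and λy, normally distributed changes, and independence between the two. It also assumes a Case 1 score of +1 when the active group subject fares better, −1 when the control group subject fares better, and a weight c on continuous comparisons; that sign convention and the function name are hypothetical, so this is a sketch and not the paper's own expression. The 10% and 5% SCE probabilities are illustrative; Δ = 5, a standard deviation of 7, and c = 4 are values mentioned in the text.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def expected_score(lam_x: float, lam_y: float, mu_x: float, mu_y: float,
                   sd_x: float, sd_y: float, T: float, c: float = 1.0) -> float:
    """E[score] for one control/active pair under the stated assumptions."""
    if lam_x + lam_y <= 0:
        raise ValueError("at least one SCE rate must be positive")
    # Probability that neither subject has an SCE by time T.
    q = exp(-(lam_x + lam_y) * T)
    # P(control has the earlier or only SCE) minus P(active has the earlier or only SCE).
    sce_part = (lam_x - lam_y) / (lam_x + lam_y) * (1.0 - q)
    # P(active change > control change) minus the reverse, for two independent normals.
    z = (mu_y - mu_x) / sqrt(sd_x ** 2 + sd_y ** 2)
    continuous_part = c * q * (2.0 * NormalDist().cdf(z) - 1.0)
    return sce_part + continuous_part


T = 180.0                              # trial length in days
lam_control = -log(1.0 - 0.10) / T     # hypothetical 10% SCE probability over the trial
lam_active = -log(1.0 - 0.05) / T      # hypothetical 5% SCE probability over the trial
null_value = expected_score(lam_control, lam_control, 0.0, 0.0, 7.0, 7.0, T, c=4.0)
alt_value = expected_score(lam_control, lam_active, 0.0, 5.0, 7.0, 7.0, T, c=4.0)
```

Under the null hypothesis (equal rates and equal means) both parts vanish, so the expected score is zero, which is the property the normalized test statistic relies on.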
Assuming a normal distribution for We, we compute that,
And from consideration of the type II error, we may write,
The variance terms in (8) are computed (Appendix B) to be
where the constants A0, B0, Aa, and Ba are functions of expectations of ϕi,j under the null and alternative hypotheses. In this scenario, an exact solution for the sample size n is available. If we write m = kn, where k is known (for example, if there are an equal number of subjects in the active group and in the control group, then k = 1), then substituting for the variance term, we may write
Squaring both sides, simplifying, and squaring again, with expansion and further simplification produces the quartic equation
Power can be more directly computed as
An asymptotic solution is also available (Appendix C)
Now substituting equations for Var [We | H0] and Var [We | Ha] from (13) into (8) to compute the sample size of the trial, we write
Noting that the total number subjects in the study is n control group plus kn in the active group, we can write
Power can be expressed as
This second case adds one level of complexity to Case 1. The investigators’ goal is to compare the change in a continuous measure over time in the control group to that of the active group. However, in this circumstance the investigators have two competing assessments of the same continuous endpoint variable. For example, in the TIME study, while the cMR measure of LVEF is the most precise measure, it will need to be replaced by echocardiographic determinations of ejection fraction in subjects who cannot undergo cMR. The first assessment, denoted by the continuous variable r, is the most precise but is not available in all subjects. The second, denoted by s, is less precise, but is available for everyone.
For this case, we can modify the scoring function from Case 1 adding the following conditions
The computations follow the development of Case 1. E[ϕi,j ϕi,j′] now requires 22 terms (Appendix D).
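One way the Case 1 score might extend to two measurements is sketched below. The extra conditions are assumptions for illustration: SCE comparisons are handled as in Case 1 with +1 meaning the active group subject fares better; the precise measure r is compared with weight c only when both subjects have it; otherwise the less precise measure s, available for everyone without an SCE, is compared with weight d, so that no cross-modality comparison is made. The names `Outcome2` and `case2_score` are hypothetical, as is the default d = 2; c = 4 is the value the text suggests for cell therapy studies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome2:
    sce_time: Optional[float] = None   # day of the SCE; None if no SCE occurred
    change_r: Optional[float] = None   # change in the precise measure (may be unavailable)
    change_s: Optional[float] = None   # change in the less precise measure


def sign(value: float) -> int:
    return (value > 0) - (value < 0)


def case2_score(control: Outcome2, active: Outcome2,
                c: float = 4.0, d: float = 2.0) -> float:
    """Score one pair under the ASSUMED rules described in the lead-in."""
    if control.sce_time is not None and active.sce_time is not None:
        return float(sign(active.sce_time - control.sce_time))
    if control.sce_time is not None:
        return 1.0
    if active.sce_time is not None:
        return -1.0
    if control.change_r is not None and active.change_r is not None:
        # Both subjects have the precise measure: compare r with weight c.
        return c * sign(active.change_r - control.change_r)
    # Otherwise fall back to the less precise measure with weight d.
    return d * sign(active.change_s - control.change_s)


# Hypothetical subjects: the active subject could not undergo the precise measurement.
control_subject = Outcome2(change_r=1.0, change_s=2.0)
active_subject = Outcome2(change_s=5.0)
fallback_score = case2_score(control_subject, active_subject)
```

In this example the comparison falls back to s, so `fallback_score` equals d rather than c.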
Figure 1 identifies the relationship between the trial size (total number of subjects in both the active and the placebo group) and the probability of a significant clinical event, as a function of the effect of cell therapy on the significant clinical event rate and of c, the weight ascribed to the continuous endpoint measure in the analysis. In this circumstance we assume that the change in the response variable in the active group is five units greater than the change in the control group (Δ = 5). We also assume the standard deviation of this change is 7 in each group (90% power and a two-sided type I error rate of 0.05 are assumed for all analyses). In each of the curves in Figure 1, the trial size is larger for larger probabilities of an SCE. Larger probabilities of an SCE increase the proportion of subjects who have no measure of the continuous endpoint at the conclusion of the study, and larger sample sizes are required in order to maintain the power of the evaluation of the therapy’s impact on the continuous measure.
We also note the sample size increases as the value of c decreases. The value of c is the relative weight in the scoring system. As c decreases the impact of a nonzero comparison between the active and control group measures has less weight than that of the comparison of SCE timings. This diminished weight for comparison generates the need for more continuous measure comparisons in the cohort, thereby increasing the sample size. Figure 2 demonstrates the same effect of decreasing sample size for larger values of the continuous weighting function c. Here e represents the effect of the therapy on the SCE rate (represented as the percent reduction in the SCE control group rate experienced by the active group subjects). Sample size decreased as e increased. Note that for all values of efficacy evaluated, the sample size stabilized for values of c greater than 3.
The well established relationship between sample size and treatment effect (Δ) is demonstrated in Figure 3. In the paradigm of combining a continuous and a dichotomous endpoint, the sample size decreases as the effect size increases, and increases as the standard deviation of the difference (σΔ) increases. Analogously, it is well accepted that when a dichotomous measure is used as a response variable in a clinical trial, the trial size increases as the prevalence of the dichotomous variable increases and decreases with increasing efficacy of treatment against that response variable. This is demonstrated in Figure 4. Thus, the score statistic is a function of the effect of the cell therapy on the continuous measure, as reflected by Δ and σΔ, and also of the efficacy of the therapy on the SCE rate, e. Figure 4 demonstrates that, while larger values of the probability of an SCE still produce larger trial sizes, the efficacy of the therapy on the SCE rate moderates this relationship.
In this research scenario there are two continuous measures, each with weights c and d. As both c and d increase, the weight of each continuous endpoint increases, and the sample size decreases. However, the larger values of c and d have diminishing impact on the sample size.
Example. In a study in heart failure [Pfeffer MA, et. al. 1992] LVEF was expected to increase slightly in the control group, and anticipated to increase to a greater degree in the active group. However subjects with heart failure die or have clinical events precluding the assessment of this measure. In addition LVEF was measured in two ways. The premier measurement was through a radionuclide ventriculogram (RVG). RVG-LVEF measures were the most accurate; however, the requirement of radiation exposure limited the utility of this procedure. An alternative was to use echocardiography to obtain the LVEF. Echo-based LVEF, new at the time, was safe but less precise. The presence of two continuous measures with one preferred over the other, in addition to the occurrence of SCE’s is a circumstance in which the proposed sample size computation was designed.
From a sample of data from this study (Table 2), we use the percent of significant clinical events, in concert with the data for expected changes in RVG-LVEF and echo muscle mass, to compute E[ϕij | Ha]. The terms for the variance, E[ϕijϕi′j | H0], E[ϕijϕi′j | Ha], and E[ϕijϕij′ | Ha], are computed from the eighteen terms in Table 1, each individual expression based on the data from Table 2 reflecting the expected changes in both continuous endpoint measures during the course of the study in both the control and active groups. Using equations (10) and (11), an exact sample size of 540 (active plus control group subjects) was identified. The quantiles or probabilities of a bivariate (or general k-variate) normal distribution under the null and alternative hypotheses are numerically available, e.g., in the R packages multcomp [Hothorn T, et. al., 2011] and mvtnorm [Genz et. al., 2011], or as an add-in package in Excel. Using equation (15), the asymptotic solution of 536 is identified, with little difference seen between the asymptotic and exact solutions.
This manuscript demonstrates a method that combines prospectively declared mortality measures with continuous endpoints and maintains the clinical hierarchy of the occurrence of events, using information from the continuous effect size. It is based on work involving a less complex function in a clinical trial with two dichotomous endpoints [Pfeffer MA et. al., 1992]. No imputation is required, and the difficulties with worst rank assignments to missing continuous endpoint data are avoided.
The score function used is very specific to the problems presented here, and all mathematical derivations are tied to the score function. Although the general formulations for the mean and variance of We will be the same for any score function, these computations will reflect the score function used. As an example, there is a common index used in research known as a heart failure index. It is a nonparametric index composed of clinical assessments (medication changes), hospitalizations, and the occurrence of death. The U statistic procedure proposed here would be of value in the heart failure score scenario; however, in that case, the score function would be based on pairwise changes in heart function scores between active and control group patients.
The problem posed in this manuscript is distinct from the multiple endpoint scenario where investigators choose from among several different endpoint measures. This latter dilemma has been central to clinical trial interpretation, and many important contributions to the literature have addressed this complex challenge. Clinical trialists commonly face the issue of endpoint selection and cannot resolve it in favor of one endpoint or the other. Clinical trials can have endpoints with no priority among their selection at all [Tilley BC, et al. (1996), National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group. (1995)]. O’Brien examined the role of a rank sum test in the multiple endpoint setting [O'Brien PC. (1984)]. Lachin suggested the use of imputation, assigning a worst rank score to those subjects who are missing the continuous endpoint measure due to a mortal event [Lachin J. (1999)]. The use of Area Under the Curve (AUC) data has been particularly helpful in tumor models [Wu J. (2010)]. In addition, several workers [Tan M., (2002)] proposed two small-sample parametric inference procedures, a heuristic test and a Bayesian procedure, for incomplete longitudinal data with truncation and informative censoring arising in cancer therapy development. Other authors have proposed alternative solutions [O'Brien PC. (1984), Tang DI, et al. (1989a), Tang DI, et al. (1989b), Tang DI, et al. (1993), Tang DI and Getter NL. (1999)].
Of particular use are the weighting values c and d. The investigator has complete control over the values of these weights but must choose them carefully. For example, in clinical trials in which the predominant response variable is a dichotomous random variable, weights in the range 0 ≤ d ≤ c ≤ 1 are attractive. Since the dichotomous variable occurs so frequently (e.g., mortality) and is only replaced by the continuous measures in the cases where vital status information is not available, discounting the contribution of the continuous variables is appropriate. However, in studies such as smaller cell therapy studies, where the response variable is continuous and relatively small numbers of subjects have SCE’s, a greater weight for the continuous measure can be justified. In our cell therapy studies, the value of c = 4 is appropriate. We advocate selecting d such that 0 ≤ d < c, since the less precise measure should have less influence on the test statistic than the more precise one. However, these values must be chosen before any endpoint analysis takes place, to avoid selections that are biased by the investigators’ observations of the values of the final response variables. The U statistic itself is close to normality for small samples [Mann, Whitney (1947)].
The analysis that we propose creates a new endpoint, and that new endpoint as defined by the score function is more complex. However that change does not overly complicate the interpretation of the result. For example in this case, the score function generates an analysis of either 1) an improvement in heart function or 2) longer survival without a death or heart attack. We believe that this new endpoint is understandable to a research, regulatory and clinical community already comfortable with complex endpoints e.g., fatal and nonfatal heart attacks, or fatal and nonfatal strokes. In addition, the fact that the score function never makes cross modality comparisons, comparing only MRI to MRI changes, or echo to echo changes, helps to keep the interpretation clear.
Complications in the application of this procedure include the possibility that the event rate of the significant clinical event and the standard deviation of the continuous measure differ from those assumed during the study’s design phase. In addition, interim review of the statistics by Data Safety and Monitoring Boards introduces new complexities. Neither of these is assessed specifically in this manuscript.
Three clinical trials in the NHLBI sponsored Cardiovascular Cell Therapy Research Network (CCTRN) are currently underway in which we will assess the utility of this approach.
The notation from the previous section permits us to write the expected value of ϕi,j under the hypothesis Hk, k = 0 for the null hypothesis, and k = a for the alternative hypothesis. We assume throughout this manuscript that the time to an SCE and the continuous measure are independent. From equation (5), we may write
This computation is straightforward when the probability distributions of 1) the occurrence of SCE’s and 2) the probability distribution of the continuous response variable r are known. For example, assume the time to an SCE follows an exponential distribution with parameter λx in the control group and λy in the active group. Also assume that the change in the continuous measure r follows a normal distribution with mean as before μΔR (x) and standard deviation σΔR (x) in the control group, and analogously mean μΔR (y), and standard deviation σΔR (y) in the active group. The first term on the right hand side of equation (17) is
As another example, the last term on the right hand side of equation (17) is
Since the null hypothesis assumes no treatment effect, we let λx = λy and μΔR (x) = μΔR (y), to see that
Under the alternative hypothesis Ha of a treatment effect, then either λx ≠ λy and/or μΔR (x) ≠ μΔR (y), permitting us to write,
The computation of the variance of We, while somewhat more complicated than the mean, is executable. Assume n subjects in the control group and m subjects in the active group, then,
The last term on the right of the second line of (22) is easily evaluated.
where the expected value E[ϕij] has already been computed under both the null (19) and alternative (20) hypotheses.
To evaluate the remaining term from (22), we rewrite it as
This helpful simplification is due to Gehan [Gehan EA. (1965)]. We may now pass the expectation argument through the preceding equation to find
and evaluating term by term we see
E[ϕijϕij′] is the expected value of the product of the scoring function between 1) the ith control group subject and the jth active group subject, and 2) the same ith control group subject but a different j′th active group subject where j ≠ j′. E[ϕijϕi′j] is an analogous computation involving the ith and i′th subjects in the control group and the jth subject in the active group. We may now rewrite (23) as
Rather than solve for two unknowns, m and n, we let m = kn, where k is known (for example, if there are twice as many subjects in the active group as in the control group, then k = 2). We may then write
Further simplification reveals
Since the variance can be computed under both the null and alternative hypothesis, we may write
where A0 and B0 are computed under the null hypothesis and Aa and Ba are computed under the alternative.
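The variance computation can be illustrated with the standard decomposition for a two-sample U-statistic with independent subjects: only pairs of scores that share a control subject or share an active subject are correlated. The sketch below is that generic decomposition, written in terms of E[ϕij²], E[ϕij], E[ϕijϕij′] and E[ϕijϕi′j]; how it maps onto the constants A and B above is not asserted here, and the function and argument names are hypothetical.

```python
def variance_of_w(e_phi_sq: float, e_phi: float,
                  e_shared_control: float, e_shared_active: float,
                  n: int, m: int) -> float:
    """Var[We] for n control and m active subjects.

    e_phi_sq         : E[phi_ij ** 2]
    e_phi            : E[phi_ij]
    e_shared_control : E[phi_ij * phi_ij'], same control subject, two active subjects
    e_shared_active  : E[phi_ij * phi_i'j], two control subjects, same active subject
    """
    mean_sq = e_phi ** 2
    same_pair = e_phi_sq - mean_sq                         # n*m identical pairs
    shared_control = (m - 1) * (e_shared_control - mean_sq)  # n*m*(m-1) pairs sharing i
    shared_active = (n - 1) * (e_shared_active - mean_sq)    # n*m*(n-1) pairs sharing j
    return (same_pair + shared_control + shared_active) / (n * m)


# Check against the simple sign score with no ties under the null hypothesis,
# where E[phi^2] = 1, E[phi] = 0 and both cross expectations equal 1/3.
null_variance = variance_of_w(1.0, 0.0, 1.0 / 3.0, 1.0 / 3.0, n=10, m=10)
```

For the simple sign score this reduces to (n + m + 1) / (3nm), the familiar Mann-Whitney variance rescaled to the average score, which gives a quick sanity check on any implementation.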
The Asymptotic approach for Var [We] :
Working from equation (25)
Ignoring terms on the order of n−2, we have
Equation (26) is the variance of We under the alternative hypothesis, Var [We | Ha]. Under the null hypothesis, we assume E[ϕijϕi′j] = E[ϕijϕij′] and E[ϕij] = 0, which simplifies the variance. We therefore may write
Now substituting equations for the Var [We | H0] and Var [We | Ha] from (13) into (8) to compute the sample size of the trial, we write
Noting that the total number subjects in the study is n control group plus kn in the active group, we can write
Power can be expressed as
The structure for E[ϕijϕij′] can be identified and tabulated (Table 1), revealing 18 terms, each of which is evaluated under the null and alternative hypotheses. For example, one of the terms may be written as
Here U follows an exponential distribution with parameter λx, T is the duration of the study, and V and W are i.i.d. exponentially distributed random variables with parameter λy. Note that the final expressions for the expectation are in terms of the parameters λx, λy, μΔR (x), μΔR (y), σΔR (x), and σΔR (y). These are available from the clinical scientists. Thus (31) may be written as
Which may be evaluated under the null hypothesis where λx = λy, or the alternative where λx ≠ λy.
However, terms that involve comparison of the continuous response variable between three subjects must be handled differently. Consider the circumstance where the LVEF of the ith subject in the control group has increased by less than that of both the jth and the j′th subjects in the active group. Then one of the expressions required for E[ϕijϕij′] is
where di is the change in the response variable for the ith subject in the control group over the duration of the study, and dj and dj′ are the response variable changes for the jth and j′th subjects in the active group respectively. The expression exp[−(λx + 2λy)T] is the probability that all three subjects (one in the control group and two in the active group) have no SCE throughout the course of the trial and therefore have the continuous measure assessed at baseline and at the end of the study.
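To compute P[(di < dj) ∩ (di < dj′)], recall that for the one control group subject, di follows a normal distribution with mean μΔR (x) and variance σ²ΔR (x), and for the two active group subjects, dj and dj′ are identically distributed as normal with mean μΔR (y) and variance σ²ΔR (y). If we define the random variables U and V through the affine transformation
$$U = d_i - d_j, \qquad V = d_i - d_{j'}, \tag{34}$$
we may then write P[(di < dj) ∩ (di < dj′)] = P[U < 0 ∩ V < 0]. With the three subjects' changes taken as independent, the joint distribution of di, dj, and dj′ is multivariate normal with mean vector u and variance Σ,
$$u = \begin{pmatrix} \mu_{\Delta R}(x) \\ \mu_{\Delta R}(y) \\ \mu_{\Delta R}(y) \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \sigma^2_{\Delta R}(x) & 0 & 0 \\ 0 & \sigma^2_{\Delta R}(y) & 0 \\ 0 & 0 & \sigma^2_{\Delta R}(y) \end{pmatrix}. \tag{35}$$
Applying the transformation of (34), we see that the joint distribution of U and V is bivariate normal with mean and variance
$$\begin{pmatrix} \mu_{\Delta R}(x) - \mu_{\Delta R}(y) \\ \mu_{\Delta R}(x) - \mu_{\Delta R}(y) \end{pmatrix}, \qquad \begin{pmatrix} \sigma^2_{\Delta R}(x) + \sigma^2_{\Delta R}(y) & \sigma^2_{\Delta R}(x) \\ \sigma^2_{\Delta R}(x) & \sigma^2_{\Delta R}(x) + \sigma^2_{\Delta R}(y) \end{pmatrix}. \tag{36}$$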
Thus, the desired probability P[U < 0 ∩ V < 0] is simply the evaluation of this region over the bivariate normal distribution defined in (36), and the probability required by P[(di < dj) ∩ (di < dj′)] is therefore available.
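For readers without a bivariate normal routine at hand, the same probability can be obtained by a different route than the bivariate evaluation described above: conditioning on the control subject's change di, the two active group changes are independent, so P[(di < dj) ∩ (di < dj′)] is a one-dimensional integral of the control density times the squared upper-tail probability of the active distribution. The sketch below evaluates that integral with Simpson's rule using only the Python standard library. It assumes independent normal changes; the function name, step count, and integration limits of ten standard deviations are hypothetical choices.

```python
from math import exp
from statistics import NormalDist


def prob_control_below_both(mu_x: float, sd_x: float,
                            mu_y: float, sd_y: float,
                            steps: int = 2000) -> float:
    """P[d_i < d_j and d_i < d_j'] for independent normal changes (Simpson's rule)."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even number of intervals
    control = NormalDist(mu_x, sd_x)
    active = NormalDist(mu_y, sd_y)
    lower, upper = mu_x - 10.0 * sd_x, mu_x + 10.0 * sd_x
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        t = lower + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        # Density of d_i at t times P(d_j > t) * P(d_j' > t).
        total += weight * control.pdf(t) * (1.0 - active.cdf(t)) ** 2
    return total * h / 3.0


p_null = prob_control_below_both(0.0, 7.0, 0.0, 7.0)   # identical distributions
p_alt = prob_control_below_both(0.0, 7.0, 5.0, 7.0)    # active group improves by 5 units

# Factor for "no SCE in any of the three subjects"; the rates are hypothetical.
lam_x, lam_y, T = 0.0006, 0.0003, 180.0
no_sce_factor = exp(-(lam_x + 2.0 * lam_y) * T)
term = no_sce_factor * p_alt
```

When the three changes are identically distributed, the control subject's change is the smallest of three exchangeable values with probability one third, which provides a convenient check on the numerical integration.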
Each of the eighteen terms is computed similarly, and assembled in accordance with Table 1 to construct E[ϕijϕij′] under the null and alternative hypotheses. An analogous table can be constructed for computing E[ϕijϕi′j]. Then E[ϕijϕij′ | H0] and E[ϕijϕi′j | H0] can be substituted into equation (13) to compute Var [We | H0], and analogously, E[ϕijϕij′ | Ha] and E[ϕijϕi′j | Ha] will be used to compute Var [We | Ha]. Alternatively, E[ϕijϕij′ | H0], E[ϕijϕi′j | H0] and E[ϕijϕij′ | Ha], E[ϕijϕi′j | Ha] can be substituted into equations (25) and (10) to compute the sample size for the exact computation, or equation (15) in the case of the asymptotic solution. These expectations can also be used to compute the power in equations (12) and (16) for the exact and asymptotic solutions respectively.
Lemuel A Moyé, University of Texas Health Science Center at Houston.
Dejian Lai, University of Texas School of Public Health.
Kaiyan Jing, University of Texas School of Public Health.
Mary Sarah Baraniuk, University of Texas School of Public Health.
Minjung Kwak, National Heart, Lung and Blood Institute.
Marc S. Penn, The Cleveland Clinic Foundation.
Colin O. Wu, National Heart, Lung and Blood Institute.