Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC3035160

Formats

Article sections

- SUMMARY
- 1. Introduction
- 2. Quality of Life Data in Clinical Trials
- 3. Principal Stratification Approach
- 4. Analysis of the QOL Data
- 5. Summary
- Supplementary Material
- References

Authors

Related links

J Am Stat Assoc. Author manuscript; available in PMC 2011 September 1.

Published in final edited form as:

J Am Stat Assoc. 2010 September 1; 105(491): 912–929.

doi: 10.1198/jasa.2010.ap08739.PMCID: PMC3035160

NIHMSID: NIHMS192613

See other articles in PMC that cite the published article.

In longitudinal clinical trials, when outcome variables at later time points are only defined for patients who survive to those times, the evaluation of the causal effect of treatment is complicated. In this paper, we describe an approach that can be used to obtain the causal effect of three treatment arms with ordinal outcomes in the presence of death using a principal stratification approach. We introduce a set of flexible assumptions to identify the causal effect and implement a sensitivity analysis for non-identifiable assumptions which we parameterize parsimoniously. Methods are illustrated on quality of life data from a recent colorectal cancer clinical trial.

In longitudinal clinical trials, outcome variables at later time points are only defined for patients who survive to those times. This complicates analyses, since those who survive on each treatment arm to any given time point are not a random sample of those randomized to each treatment arm. Standard approaches to account for missing data in the literature implicitly ‘impute’ values of outcome after dropout. For quality of life (QOL) data, if a subject drops out due to deteriorating disease status or death, the QOL will not be defined after the dropout time. If we restrict the analyses to subgroup of patients who survived, the resulting estimate of the treatment effect will no longer have a causal interpretation.

One way to conduct analyses in this circumstance is to construct the joint distribution of the longitudinal responses and death times (Hogan and Laird, 1997; Pauler et al., 2003; Kurland and Heagerty, 2005; Lee and Daniels, 2007). Hogan and Laird (1997) proposed a likelihood-based mixture model approach to model the joint distribution for incomplete repeated measurements and right-censored event times. Pauler et al. (2003) proposed a pattern mixture model for longitudinal quality of life data with nonignorable missingness due to dropout and censorship by death. Recently, Kurland and Heagerty (2005) explored regression models conditioning on being alive as a valid target of inference using regression models that condition on survival status rather than a specific survival time. Lee and Daniels (2007) used a pattern mixture approach with patterns defined on the observed tumor progression or death times which is similar to ideas in Hogan and Laird (1997) and Pauler et al. (2003). Each of these approaches to drawing inference in the presence of death, in effect, stratify the analysis using events that occur after randomization and do not maintain the causal interpretation of the treatment effect.

To obtain the causal effect of treatment, we start by defining potential outcomes. Potential outcomes are all the outcomes that would be observed if each of the treatments had been applied to each of the subjects (Rubin, 1974, 1978; Holland, 1986). Frangakis and Rubin (2002) used the concept of potential outcomes in an approach called ‘principal stratification’. Principal stratification partitions subjects into sets with respect to posttreatment variables. For example, consider two treatments, Tx = 0, 1 and a binary response *Y*(Tx). The causal effect of the treatment on response is *Y*(1) − *Y*(0). However, we only observe either *Y*(1) or *Y*(0) for each subject. Let *D*(Tx) be a binary posttreatment variable. In this case, there are four principal strata defined by the pairs of potential values of *D*(Tx), (i) ({*i*|*D*(0) = 0, *D*(1) = 0}), (ii) ({*i*|*D*(0) = 0, *D*(1) = 1}), (iii) ({*i*|*D*(0) = 1, *D*(1) = 0}), and (iv) ({*i*|*D*(0) = 1, *D*(1) = 1}). The principal strata are not affected by treatment assignment. Causal effects are defined within these principal strata.

Frangakis and Rubin (2002) discussed causal effects in studies where the outcome does not exist due to death. Rubin (2000) and Hayden et al. (2005) referred to the estimand in Frangakis and Rubin (2002) as ‘the survivors average causal effect (SACE)’. Egleston et al. (2007) proposed assumptions to identify the SACE among survivors at a fixed time point in a cohort study of older adults and implemented a sensitivity analysis for some of those assumptions. Rubin (2006) introduced the causal effect of a treatment on an outcome that is censored by death in a Quality of Life study. Recently, Lee and Daniels (2008) used a ‘principal stratification’ approach to estimate the SACE for ordinal data in a two treatment arm setting.

The identification of the *SACE* relies on untestable assumptions. For example, the widely used monotonicity assumption (in the context of death) states that “no subject would be dead if randomized to an active treatment, but would be alive if randomized to a control treatment”. This assumption (and variations) has been used in a variety of settings, including death and noncompliance (Zhang and Rubin, 2003; Gilbert et al., 2003; Cheng and Small, 2006; Egleston et al., 2007; Lee and Daniels, 2008; Cheng, 2009; Imai 2008; Jin and Rubin, 2008). Stochastic monotonicity assumptions have been developed recently as a weaker and more realistic alternative in the context of compliance (Roy et al., 2008) and instrumental variables (Small and Tan, 2008).

In this paper, we describe an approach that can be used to obtain the causal effect of treatment in the presence of death in a setting of three treatment arms with longitudinal ordinal outcomes. We replace the common deterministic monotonicity assumption with a more flexible set of assumptions that allows incorporation of subject specific covariates to increase its applicability to our setting. Overall, we provide a realistic framework that has only four sensitivity parameters for longitudinal ordinal data involving comparisons among three treatments. Although our main interest will be treatment differences at the end of the study, we will make efficient use of all available longitudinal data which is essential due to the high rate of dropouts unrelated to death.

The paper is arranged as follows. In section 2, we describe the clinical trial that motivated our analyses. In section 3, we formally define the causal effects of interest, provide some assumptions that are necessary to identify them, and then derive how the causal effects can be estimated from these assumptions along with modeling assumptions for the observed data in this complex setting. We also propose a sensitivity analysis procedure for some of these untestable assumptions. Methods are illustrated on the quality of life data in section 4. Section 5 contains conclusions and a discussion.

As a secondary outcome, many clinical trials measure aspects of a patient’s sense of well-being and ability to perform various tasks to assess the effects that cancer and its treatment have on the patient. Such quality of life (QOL) outcomes in clinical trials are generally obtained repeatedly over time using questionnaires that can be completed by the patient without direct supervision. These standardized questionnaires rate factors such as pain, treatment side-effects, mood, energy level, family and social interactions, sexual function, ability to work, and ability to maintain routine daily activities. Here, we describe the QOL data set with which we will work.

We analyze QOL data from a recent colorectal cancer clinical trial (Trial N9741, Goldberg et al., 2004). The trial randomized patients with advanced colorectal cancer to compare three different two drug combinations. The main objective of the trial was to find a better treatment for colorectal cancer. The trial led to a new standard of care.

A total of 795 patients with colorectal cancer were randomly assigned to one of three treatments between May 1999 and April 2001. Patients received either irinotecan and bolus fluorouracil plus leucovorin (IFL arm), oxaliplatin and infused fluorouracil plus leucovorin (FOLFOX arm), or irinotecan and oxaliplatin (IROX arm). At the time the trial was initiated, the IFL regimen was the FDA approved standard of care (Saltz et al., 2000). Based on the primary endpoints of time to progression and overall survival, the FOLFOX arm was identified as the superior treatment and FOLFOX was also comparatively safe relative to the other treatments.

However, given that the toxicity profiles were quite different on the three treatment arms, it was of interest to see if there was a negative impact of the ‘better’ treatment (FOLFOX) on patient’s QOL. In this paper we compare the quality of life data on the three treatments.

The QOL outcomes measured here included fatigue, nausea, vomiting, diarrhea, and dehydration. Previous analyses split the QOL scores into 12 week windows. The patient’s visit is the window number during which the survey was filled out (0=baseline, 1=1–84 days after going on study, 2=85–168 days after going on study,…, 5=337–420 days after going on study). We will analyze the data from the first five windows (up to about one year).

For illustration, we focus on one QOL measure, fatigue. Fatigue is measured on a 5 point ordinal scale (1: I am usually not tired at all; 2: I am occasionally rather tired; 3: There are frequently periods when I am quite tired; 4: I am usually very tired; 5: I feel exhausted most of the time). Because very few patients reported category 5, we collapsed category 4 and 5 into one category.

Similar to other longitudinal studies, the QOL data collection in this clinical trial had substantial dropout. The overall dropout rates were similar in the three treatment groups (96% for IFL; 98% for IROX; 98% for FOLFOX). About 95% of dropouts occurred between baseline and the fourth visit. Dropout was related to whether cancer had progressed, death, toxicity, and other factors. A large number of subjects dropped out due to tumor progression or death (40.5%). The dropout rates due to progression/death were marginally higher on the IFL and IROX arms than on the FOLFOX arm. An interesting feature of this dataset is that we have the actual progression/death times for all subjects including those who dropped out for other reasons. More detailed description of the pattern of dropouts in this study is given in Lee and Daniels (2007).

We will use this data to illustrate the models and methods introduced in the subsequent sections.

Consider a three-arm trial. In our application the three treatments were IFL (standard of care) and FOLFOX and IROX (both experimental). Let Tx_{i} be the treatment indicator for subject *i* = 1, … , *N*,

$${\text{Tx}}_{i}=\{\begin{array}{cc}0,\hfill & \text{if subject}\phantom{\rule{thinmathspace}{0ex}}i\phantom{\rule{thinmathspace}{0ex}}\text{is assigned to IFL};\hfill \\ 1,\hfill & \text{if subject}\phantom{\rule{thinmathspace}{0ex}}i\phantom{\rule{thinmathspace}{0ex}}\text{is assigned to IROX};\hfill \\ 2,\hfill & \text{if subject}\phantom{\rule{thinmathspace}{0ex}}i\phantom{\rule{thinmathspace}{0ex}}\text{is assigned to FOLFOX},\hfill \end{array}$$

and let *D _{it}*(Tx

$${D}_{\mathit{\text{it}}}({\text{Tx}}_{i})=\{\begin{array}{cc}0,\hfill & \text{alive};\hfill \\ 1,\hfill & \text{tumor progressed or dead}.\hfill \end{array}$$

Note *T* is last visit. Obviously, if *D _{it}*(Tx

We define the following eight (principal) strata:

*A*(0) = {_{t}*i*|(*D*(0),_{t}*D*(1),_{t}*D*(2)) = (0, 0, 0)}: the subjects who would be alive under all three arms at visit_{t}*t*.*A*(1) = {_{t}*i*|(*D*(0),_{t}*D*(1),_{t}*D*(2)) = (1, 0, 0)}: the subjects who would be alive under the two experimental treatments but not alive under the standard of care (IFL) at visit_{t}*t*.*A*(2) = {_{t}*i*|(*D*(0),_{t}*D*(1),_{t}*D*(2)) = (1, 1, 0)}: the subjects who would be alive under FOLFOX but not alive under the standard of care (IFL) and IROX at visit_{t}*t*.*A*(3) = {_{t}*i*|(*D*(0),_{t}*D*(1),_{t}*D*(2)) = (1, 0, 1)}: the subjects who would be alive under IROX but not alive under the standard of care (IFL) and FOLFOX at visit_{t}*t*.*A*(4) = {_{t}*i*|(*D*(0),_{t}*D*(1),_{t}*D*(2)) = (0, 0, 1)}: the subjects who would be alive under the standard of care (IFL) and IROX but not alive under FOLFOX at visit_{t}*t*.*A*(5) = {_{t}*i*|(*D*(0),_{t}*D*(1),_{t}*D*(2)) = (0, 1, 0)}: the subjects who would be alive under the standard of care (IFL) and FOLFOX but not alive under IROX at visit_{t}*t*.*A*(6) = {_{t}*i*|(*D*(0),_{t}*D*(1),_{t}*D*(2)) = (0, 1, 1)}: the subjects who would be alive under standard of care (IFL) but not alive under the two experimental treatments at visit_{t}*t*.*A*(7) = {_{t}*i*|(*D*(0),_{t}*D*(1),_{t}*D*(2)) = (1, 1, 1)}: the subjects who would not be alive under any treatment at visit_{t}*t*.

Now, let *Y _{it}*(Tx

$${t}_{=}$$

where, for treatment tx, *Y _{t}*(tx) is the potential outcome if a subject is alive at visit

$${\mathrm{\mathit{\text{it}}}}_{=}$$

Unfortunately, given * _{it}* alone, we cannot determine to which stratum individual

In the next section, we formally define the causal effects of interest and then state some additional assumptions that are necessary to estimate them.

Suppress *i* here for clarity. We assume each individual at visit *T* (last visit) belongs to one of 8 basic principal strata (Frangakis and Rubin, 2002) defined by the unique combinations of (*D _{T}*(0),

For each pairwise treatment comparison, we define the three causal effects of interest at time *T* (last visit) for *k* = 1, … , *K* − 1,

- $$\begin{array}{cc}{\mathit{\text{SACE}}}_{k,1}(1,0)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(1)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))=(0,0,0)\}}{\text{odds}\{{Y}_{T}(0)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))=(0,0,0)\}}\hfill \\ \hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(1)>k|{A}_{T}(0)\}}{\mathit{\text{odds}}\{{Y}_{T}(0)>k|{A}_{T}(0)\}},\hfill \\ {\mathit{\text{SACE}}}_{k,2}(1,0)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(1)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))=(0,0,1)\}}{\mathit{\text{odds}}\{{Y}_{T}(0)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))=(0,0,1)\}}\hfill \\ \hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(1)>k|{A}_{T}(4)\}}{\mathit{\text{odds}}\{{Y}_{T}(0)>k|{A}_{T}(4)\}},\hfill \\ {\mathit{\text{SACE}}}_{k,3}(1,0)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(1)>k|{A}_{T}(0){A}_{T}(4)\}\mathit{\text{odds}}\{{Y}_{T}(0)k|{A}_{T}(0){A}_{T}(4)\},}{}\hfill \end{array}$$
- $$\begin{array}{cc}{\mathit{\text{SACE}}}_{k,1}(2,0)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))=(0,0,0)\}}{\text{odds}\{{Y}_{T}(0)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))=(0,0,0)\}}\hfill \\ \hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)>k|{A}_{T}(0)\}}{\mathit{\text{odds}}\{{Y}_{T}(0)>k|{A}_{T}(0)\}},\hfill \\ {\mathit{\text{SACE}}}_{k,2}(2,0)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))=(0,1,0)\}}{\mathit{\text{odds}}\{{Y}_{T}(0)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))=(0,1,0)\}}\hfill \\ \hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)>k|{A}_{T}(5)\}}{\mathit{\text{odds}}\{{Y}_{T}(0)>k|{A}_{T}(5)\}},\hfill \\ {\mathit{\text{SACE}}}_{k,3}(2,0)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)>k|{A}_{T}(0){A}_{T}(5)\}\mathit{\text{odds}}\{{Y}_{T}(0)k|{A}_{T}(0){A}_{T}(5)\},}{}\hfill \end{array}$$
- $$\begin{array}{cc}{\mathit{\text{SACE}}}_{k,1}(2,1)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)>k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))(0,0,0)\}\mathit{\text{odds}}\{{Y}_{T}(1)k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))(0,0,0)\}}{}\hfill & =\frac{\text{odds}\{{Y}_{T}(2)k|{A}_{T}(0)\}}{\mathit{\text{odds}}\{{Y}_{T}(1)k|{A}_{T}(0)\}},\hfill \\ {\mathit{\text{SACE}}}_{k,2}(2,1)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))(1,0,0)\}\mathit{\text{odds}}\{{Y}_{T}(1)k|({D}_{T}(0),{D}_{T}(1),{D}_{T}(2))(1,0,0)\}}{}\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)k|{A}_{T}(1)\}}{\mathit{\text{odds}}\{{Y}_{T}(1)k|{A}_{T}(1)\}},\hfill \\ {\mathit{\text{SACE}}}_{k,3}(2,1)\hfill & =\frac{\mathit{\text{odds}}\{{Y}_{T}(2)k|{A}_{T}(0){A}_{T}(1)\}\mathit{\text{odds}}\{{Y}_{T}(1)k|{A}_{T}(0){A}_{T}(1)\},}{}\hfill \hfill \hfill \end{array}$$There are three SACEs for comparing IROX and IFL, three for comparing FOLFOX and IFL, and three for comparing IROX and FOLFOX. The
*SACE*_{k,1}(1, 0) is the odds ratio that the outcome at time*T*is larger than*k*for comparing IROX and IFL for subjects who would be alive under all three treatment arms. The*SACE*_{k,2}(1, 0) is the odds ratio that the outcome at time*T*is larger than*k*for comparing IROX and IFL for subjects who would be alive under IROX and IFL but not alive under FOLFOX. The*SACE*_{k,3}(1, 0) is the odds ratio that the outcome at time*T*is larger than*k*for comparing IROX and IFL for subjects who would be alive either under all three treatment arms or only under IROX and IFL. The*SACE*_{k,.}(2, 0) and*SACE*_{k,.}(2, 1) are defined similarly for comparisons of FOLFOX and IFL, and FOLFOX and IROX, respectively.

Define _{t} = (_{1}, … , _{t}) to be the history of potential outcomes up to and including the outcomes at visit *t*. Before introducing the assumptions, it is convenient to define some additional quantities. Let *g _{T}* (tx,

$${\text{logit}g}_{T}(\text{tx},\mathit{w})={\mathit{w}}^{T}{\mathit{\psi}}_{\text{tx}}.$$

(1)

That is, the probability of being alive on the treatments tx = 0, 1, 2 is a function of baseline covariates with coefficients depending on treatment tx. Also, let

$${p}_{T}^{*}(1,\mathit{w})=P({D}_{T}(1)=0|{D}_{T}(0)=0,\mathit{w}),$$

(2)

$${p}_{T}^{*}(2,\mathit{w})=P({D}_{T}(2)=0|{D}_{T}(0)=0,\mathit{w}),$$

(3)

$${q}_{T}^{*}(\mathit{w})=P({D}_{T}(2)=1|{D}_{T}(1)=1,{D}_{T}(0)=1);$$

(4)

thus, ${p}_{T}^{*}(1,\mathit{w})$ is the probability of a patient with baseline covariates *w* being alive on IROX at time *T*, given that they would be alive on IFL at time T. Similarly, ${p}_{T}^{*}(2,\mathit{w})$ represents the probability of being alive on FOLFOX at time *T*, given that they would be alive on IFL at time *T* for a patient with baseline covariates, * w*. The ${q}_{T}^{*}(\mathit{w})$ is the probability of a patient with baseline covariates

$${h}_{T,\text{tx}}(k)=P({Y}_{T}>k|S>T,\text{tx}).$$

(5)

The *h*_{T,tx}(*k*) is the probability that the (ordinal) QOL response is greater than *k* for a subject on treatment tx that has not died or progressed by time *T*. We now describe the assumptions.

We first make the common Stable Unit Treatment Value Assumption (SUTVA, Rubin, 1978) which states that a patient’s potential outcomes are unrelated to the treatment status of other patients. Next, we list the assumptions necessary to identify the causal effect of interest:

**Deterministic**: The assumption thatimplies that there will be no subjects who die on both experimental treatments and survive on the standard of care.$$P({D}_{T}(1)=1|{D}_{T}(0)=0,{D}_{T}(2)=1)=0,$$(6)**Stochastic I**: We assume ${p}_{T}^{*}(j,\mathit{w})$ for*j*= 1, 2 are given bywhere ${U}_{j}(\mathit{w})=\text{min}\phantom{\rule{thinmathspace}{0ex}}\{1,\frac{{g}_{T}(j,\mathit{w})}{{g}_{T}(0,\mathit{w})}\}$. To understand$${p}_{T}^{*}(j,\mathit{w})={g}_{T}(j,\mathit{w})+\rho \{{U}_{j}(\mathit{w})-{g}_{T}(j,\mathit{w})\},\text{0\u2264\rho \u22641,}$$(7)*U*(_{j}*w*), note that given*g*(_{T}*j, w*), we have the following bounds on the conditional distribution,$${g}_{T}(j,\mathit{w})\le {p}_{T}^{*}(\mathit{w})\le \text{min}\phantom{\rule{thinmathspace}{0ex}}\left\{1,\frac{{g}_{T}(j,\mathit{w})}{{g}_{T}(0,w)}\right\}.$$This modeling assumption was used in Roy et al. (2008) in the context of compliance in a smoking cessation trial. Note that if ρ = 0, then ${p}_{T}^{*}(j,\mathit{w})={g}_{T}(j,\mathit{w})$, which implies that*D*(1) is independent of_{T}*D*(0) and_{T}*D*(2) is also independent of_{T}*D*(0). If ρ = 1, then ${p}_{T}^{*}(j,\mathit{w})={U}_{j}(\mathit{w})$, which is the largest possible probability that is compatible with the marginal distributions. The assumption states that being alive on FOLFOX (or IROX) is positively correlated with being alive on IFL (standard of care); i.e., if a subject is alive on IFL, they are more likely to be alive on FOLFOX (or IROX)._{T}**Stochastic II:**We assume ${q}_{T}^{*}(\mathit{w})$ is given bywhere ${U}_{3}(\mathit{w})=\text{min}\phantom{\rule{thinmathspace}{0ex}}\left\{1,\frac{1-{g}_{T}(2,\mathit{w})}{P({D}_{T}(1)=1,{D}_{T}(0)=1)}\right\}$. Similarly, we have$${q}_{T}^{*}(\mathit{w})=(1-{g}_{T}(2,\mathit{w}))+\nu \{{U}_{3}(\mathit{w})-(1-{g}_{T}(2,\mathit{w}))\},\text{0\u2264\nu \u22641,}$$(8)This assumption is similar to b) above, but addresses all three treatments. It says that dying on FOLFOX is positively correlated with dying on both IROX and IFL; that is, patients who would die on IROX and IFL (standard of care) would be more likely to die on FOLFOX.$$(1-{g}_{T}(2,\mathit{w})\le {q}_{T}^{*}(\mathit{w})\le \text{min}\left\{1,\frac{(1-{g}_{T}(2,\mathit{w}))}{P({D}_{T}(1)=1,{D}_{T}(0)=1)}\right\}.$$

Both stochastic monotonicity assumptions address the concern of a deterministic monotonicity assumption not being reasonable between two treatment arms and implicitly allows the assumption to vary with subject specific covariates, * w*. In addition, they both imply a positive covariance (correlation) between the event on the left and right of the conditioning. The sensitivity parameters in Assumption I are ρ and ν.

Recall _{t} is all the potential outcome up to and including time *t*. Assumption 2 states that the treatment arm is unrelated to the set of potential outcomes; this holds by randomization of the treatments.

For *k* = 1, … , *K* − 1,

$$\frac{\mathit{\text{odds}}({Y}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(j))}{\mathit{\text{odds}}({Y}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(0))}=\tau ,\text{ifj=4,5}$$

(9)

$$\frac{\mathit{\text{odds}}({Y}_{T}(1)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(j))}{\mathit{\text{odds}}({Y}_{T}(1)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(0))}=\tau ,\text{ifj=1,4}$$

(10)

$$\frac{\mathit{\text{odds}}({Y}_{T}(2)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(j))}{\mathit{\text{odds}}({Y}_{T}(2)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(0))}=\tau ,\text{ifj=1,5}$$

(11)

$$\frac{\mathit{\text{odds}}({Y}_{T}(1)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(3))}{\mathit{\text{odds}}({Y}_{T}(1)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(0))}=\mathrm{\lambda},$$

(12)

and

$$\frac{\mathit{\text{odds}}({Y}_{T}(2)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(2))}{\mathit{\text{odds}}({Y}_{T}(2)\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|{A}_{T}(0))}=\mathrm{\lambda},$$

(13)

Note that *A _{T}*(1),

Note also that the two sensitivity parameters, τ and λ, do not depend on *k*. As such, the assumption is referred to as *proportional odds*. In total, Assumption 3 has *two* sensitivity parameters.

Assumptions like Assumption 3, which involve a few sensitivity parameters, are fundamental to the ability to reasonably elicit sensitivity parameters to conduct inference (Egleston et al., 2009). The semiparametric models of Robins and colleagues (Robins et al., 1995; Rotnitzky et al., 1998; Scharfstein et al., 1999) for MNAR missingness also make simplifying assumptions to reduce the sensitivity parameters (in that context, in the missing data mechanism). Egleston et al. (2007) and Lee and Daniels (2008) use similar assumptions to Assumption 3 in much simpler contexts. Other variations on Assumption 3 are possible and would be specific to the application. The assumptions in this case were carefully formulated with the expertise and guidance of Dr. Sargent, who was the statistician on the N9741 clinical trial.

Our goal in the following will be to use the observed data, _{t}, along with the three assumptions in Section 3.2 to draw inference about the *SACE*. For clarity of presentation, we demonstrate the identification of the *SACE* in two parts. First, we identify the joint distribution of **D _{T}** = (

We identify the marginal and joint distributions of **D _{T}** using Theorem I.

(Identification of the joint distribution of **D _{T}**): Under Assumptions 1 and 2,

A detailed proof of this result can be found in the Appendix.

Using Theorem I, the marginal and joint probabilities of **D _{T}** can be written as functions of (1)–(4) as follows:

- The marginal distributions,
*P*(*D*(tx) = 0) for tx = 0, 1, 2 can be expressed as_{T}$$P({D}_{T}(\text{tx})=0)=P({D}_{T}(\text{tx})=0)={E}_{w}\phantom{\rule{thinmathspace}{0ex}}\{{g}_{T}(\text{tx},\mathit{w})\},$$(14) - The joint distribution,
*P*(**D**), which corresponds to_{T}*P*(*A*(_{T}*j*)) for*j*= 0, … , 6 can be expressed as$$\begin{array}{cc}P({A}_{T}(0))=\hfill & {E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{p}_{T}^{*}(2,\mathit{w}){g}_{T}(0,\mathit{w})\}-P({A}_{T}(5)),\hfill \\ P({A}_{T}(1))=\hfill & {E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{g}_{T}(2,\mathit{w})-{g}_{T}(0,\mathit{w}){p}_{T}^{*}(2,\mathit{w})\}-P({A}_{T}(2)),\hfill \\ P({A}_{T}(2))=\hfill & {E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}[(1-{q}_{T}^{*}(\mathit{w}))\phantom{\rule{thinmathspace}{0ex}}\{1-{g}_{T}(0,\mathit{w})\phantom{\rule{thinmathspace}{0ex}}(1-{p}_{T}^{*}(1,\mathit{w}))-{g}_{T}(1,\mathit{w})\}],\hfill \\ P({A}_{T}(3))=\hfill & {E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{g}_{T}(1,\mathit{w})-{p}_{T}^{*}(1,\mathit{w}){g}_{T}(0,\mathit{w})\}-P({A}_{T}(1)),\hfill \\ P({A}_{T}(4))=\hfill & {E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{p}_{T}^{*}(1,\mathit{w}){g}_{T}(0,\mathit{w})\}-P({A}_{T}(0)),\hfill \\ P({A}_{T}(5))=\hfill & {E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}[{g}_{T}(0,w)\phantom{\rule{thinmathspace}{0ex}}\{1-{p}_{T}^{*}(1,\mathit{w})\}],\hfill \\ P({A}_{T}(6))=\hfill & 0.\hfill \end{array}$$These expressions are used for the identification and estimation of the causal effects in the subsequent sections.

We now consider identification of the conditional distributions of *Y _{T}* given

$$P({Y}_{T}(\text{tx})>k|{D}_{T}(\text{tx})=0)\text{}=\text{}P({Y}_{T}k|{D}_{T}=0,\text{Tx}=\text{tx})\stackrel{\text{let}}{=}{h}_{T,\text{tx}}(k),$$

for tx = 0, 1, 2. We demonstrate the identification of the *SACE*s using this result, Theorem I, and Assumption 3.

Under assumptions 1, 2, and 3, *SACE*_{k,.}(1, 0) is identified.

A simple proof of this result starts by showing that the numerators and denominators of *SACE*_{k,.}(1, 0) can all be written as functions of *P*(*Y _{T}*(0) >

From Assumption 3, we have

$$P({Y}_{T}(0)>k|{A}_{T}(4))=\frac{\tau P({Y}_{T}(0)>k|{A}_{T}(0))}{1+(\tau -1)P({Y}_{T}(0)>k|{A}_{T}(0))},$$

(15)

$$P({Y}_{T}(1)>k|{A}_{T}(4))=\frac{\tau P({Y}_{T}(1)>k|{A}_{T}(0))}{1+(\tau -1)P({Y}_{T}(1)>k|{A}_{T}(0))}.$$

(16)

By identifying *P*(*Y _{T}*(0) >

For *SACE*_{k,3}(1, 0), we need

$$P({Y}_{T}(0)>k|{A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(4))=\frac{P({Y}_{T}(0)k|{A}_{T}(0))P({A}_{T}(0))+P({Y}_{T}(0)k|{A}_{T}(4))P({A}_{T}(4))}{P({A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(4))}$$

(17)

$$P({Y}_{T}(1)>k|{A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(4))=\frac{P({Y}_{T}(1)k|{A}_{T}(0))P({A}_{T}(0))+P({Y}_{T}(1)k|{A}_{T}(4))P({A}_{T}(4))}{P({A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(4))}$$

(18)

which are functions of *P*(*A _{T}*(0)

Similar arguments can be used to identify *SACE*_{k,.}(2, 0) and *SACE*_{k,.}(2, 1). We state this formally in the following two corollaries.

Under assumptions 1, 2, and 3, *SACE*_{k,.}(2, 0) is identified.

Given the identification results in Theorems I and II and following a similar argument to Theorem I, all the relevant numerators and denominators can be written as a function of the quantities necessary to compute *SACE*_{k,.}(1, 0) except for *P*(*Y _{T}*(2) >

Under assumptions 1, 2, and 3, *SACE*_{k,.}(2, 1) is identified.

The entire proof is in the Appendix.

As shown in the previous two subsections (and the Appendix), the *SACE* is a function of *g _{T}* (tx,

Breakdown of subjects who were alive by treatment group at the time of study completion. Proportions are in parentheses

Note that we can use the empirical expectation to estimate *E_{w}* {

$${\widehat{E}}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{f(\mathit{w};\text{tx})\}=\frac{1}{N}{\displaystyle \sum _{i=1}^{N}\widehat{f}({\mathit{w}}_{i};\text{tx}).}$$

For example, *E _{w}* {

$$\begin{array}{c}{\widehat{E}}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{g}_{T}(\text{tx},\mathit{w})\}=\frac{1}{N}{\displaystyle \sum _{i=1}^{N}{\widehat{g}}_{T}(\text{tx},{\mathit{w}}_{i}),}\hfill \\ {\widehat{E}}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{p}_{T}^{*}(j,\mathit{w}){g}_{T}(\text{tx},\mathit{w})\}=\frac{1}{N}{\displaystyle \sum _{i=1}^{N}{\widehat{p}}_{T}^{*}(j,{\mathit{w}}_{i}){\widehat{g}}_{T}(\text{tx},{\mathit{w}}_{i}).}\hfill \end{array}$$

To estimate *h*_{T,tx}(*k*), given in (5), we need to specify models for the observed data and missing data due to loss to follow-up (not progression/death). Before stating the models, we specify one additional assumption that allows us to estimate the above quantities from the observed data. Recall that *S* is the progression/death time. So, *S* = *j* {*D _{j}* = 1,

*R _{t}* {

This assumption states that missingness of outcome is independent of the value of the outcome given the previous history of outcomes and Tx for all subjects with progress/death time *j*. It is MAR conditional on progression/death time. Missingness assumptions like this have recently been termed *partially ignorable* (Harel and Schafer, 2009). This assumption will allow us to use all the available data on survivors for the analysis.

We specify marginalized transition models of order two, IOMTM(2) (Lee and Daniels, 2007), conditional on progression/death time as follows, *S*,

$$\text{log}\frac{P({Y}_{t}\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|S=j,\text{tx})}{1-P({Y}_{t}\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|S=j,\text{tx})}={\beta}_{0k}(j)+\beta (j,\text{tx}),$$

(19)

$$\text{log}\frac{P({Y}_{t}=k|{Y}_{i,t-1},{Y}_{\mathit{\text{it}}-2},S=j,\text{tx})}{P({Y}_{t}=K|{Y}_{i,t-1},{Y}_{i,t-2},S=j,\text{tx})}={\mathrm{\Delta}}_{\mathit{\text{itk}}}(j,\text{tx})+{\gamma}_{11}^{(k)}(j,\text{tx}){Y}_{i,t-1}+{\gamma}_{12}^{(k)}(j,\text{tx}){Y}_{i,t-1}^{2}+{\gamma}_{21}^{(k)}(j,\text{tx}){Y}_{i,t-2}+{\gamma}_{22}^{(k)}(j,\text{tx}){Y}_{i,t-2}^{2},$$

(20)

where β(*j*, tx) are the coefficients of treatment arm tx, β_{01}(*j*) > > β_{0K−1}(*j*) and ${\gamma}_{1l}^{(k)}(j,\text{tx}),\phantom{\rule{thinmathspace}{0ex}}{\gamma}_{2l}^{(k)}(j,\text{tx})$ for *l* = 1, 2 are the linear and quadratic coefficients of of the lag one and two responses, respectively. Model (19) is a cumulative logit model, a common model for ordinal data. However, the link function in the conditional probabilities (20) is a multinomial logit because we expect the *K* − 1 (here, *K* = 4) conditional probabilities to have different coefficients ${\gamma}_{1l}^{(k)}(\xb7,\xb7)$, depending on *k*.

Clearly, for models (19) and (20) to form a coherent probability model, the Δ_{itk}(*j*, tx) will be constrained. The Δ_{itk}(*j*, tx) is found using the following identity

$$P({Y}_{\mathit{\text{it}}}=k|S=j,\text{tx})={\displaystyle \sum _{{g}_{1}=1}^{K}{\displaystyle \sum _{{g}_{2}=1}^{K}P({Y}_{\mathit{\text{it}}}=k|{Y}_{\mathit{\text{it}}-1}={g}_{1},}{Y}_{\mathit{\text{it}}-2}={g}_{2},S=j,\text{tx})\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}}P({Y}_{\mathit{\text{it}}-1}={g}_{1},{Y}_{\mathit{\text{it}}-2}-{g}_{2}|S=j,\text{tx}).$$

We refer the reader to Lee and Daniels (2007) for more details on these models. Within this modeling framework, *h*_{T,tx}(*k*) is given by

$$\begin{array}{cc}{h}_{T,\text{tx}}(k)\hfill & =P({Y}_{T}\phantom{\rule{thinmathspace}{0ex}}>\phantom{\rule{thinmathspace}{0ex}}k|S>T,\text{tx})\hfill \\ \hfill & =\frac{{\displaystyle {\sum}_{j>T}P({Y}_{T}>k|S=j,\text{tx})P(S=j|\text{tx})}}{{\displaystyle {\sum}_{j>T}P(S=j|\text{tx})}},\hfill \end{array}$$

(21)

where *P*(*S* = *j*|tx) is a multinomial mass function for *S*, and *P*(*Y _{T}* >

To calculate standard errors for *SACE*, we use the delta method. Let

$${\mathit{\theta}}^{T}=({\psi}_{0}^{T},{\psi}_{1}^{T},{\psi}_{2}^{T},{\mathit{\beta}}_{0}^{T}(\xb7),{\mathit{\beta}}^{T}(\xb7)),$$

where ${\mathit{\beta}}_{0}^{T}(\xb7)={({\beta}_{0}^{T}(0),{\beta}_{0}^{T}(1),{\beta}_{0}^{T}(2))}^{T}$ and **β**^{T}(·) = (β^{T}(0), β^{T}(1), β^{T}(2)). The variance of the *SACE*_{K−1}(Tx, Tx’) for Tx ≠ Tx’ is given by

$$\text{Var}({\mathit{\text{SACE}}}_{K-1}(\text{Tx},\text{Tx}\u2019))=\mathrm{\Delta}{\mathit{\text{SACE}}}_{K-1}^{T}(\text{Tx},\text{Tx}\u2019)\text{Var}(\widehat{\theta})\mathrm{\Delta}{\mathit{\text{SACE}}}_{K-1}(\text{Tx},\text{Tx}\u2019),$$

where **Δ***SACE*_{K−1}(Tx, Tx’) is the gradient of *SACE*_{K−1}(Tx, Tx’) with respect to **θ**. Writing ${\mathit{\theta}}_{1}^{T}=({\mathit{\psi}}_{0}^{T},{\mathit{\psi}}_{1}^{T},{\mathit{\psi}}_{2}^{T})\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}{\mathit{\theta}}_{2}^{T}=({\mathit{\beta}}_{0}^{T}(\xb7),{\mathit{\beta}}^{T}(\xb7))$, we have

$$\mathit{\text{var}}(\widehat{\mathit{\theta}})=\left(\begin{array}{cc}\hfill \mathit{\text{var}}({\widehat{\mathit{\theta}}}_{1}(\xb7))\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill \mathit{\text{var}}({\widehat{\mathit{\theta}}}_{2}(\xb7))\hfill \end{array}\right)\phantom{\rule{thinmathspace}{0ex}}.$$

Details on computing the gradient and variance of are given in the Supplemental web materials (http://www.amstat.org/publications/jasa/supplemental materials).

The identification of the *SACE* relies on untestable assumptions (in particular, Assumptions 1 and 3). We will use a sensitivity analysis procedure to draw inference about *SACE* by varying the sensitivity parameters (τ,λ) in Assumption 3, and (ρ ν) in Assumption 1. We discuss reasonable ranges for the former two sensitivity parameters in Section 4.1.

We use the approach proposed in Section 3 to analyze the QOL data from Section 2. We start out with a discussion of the elicitation of reasonable ranges for the sensitivity parameters.

Dr. Sargent, the primary statistician for this clinical trial, provided the expertise on the sensitivity parameters for the analysis (including the expertise needed for the specification of Assumption 3). Dr. Sargent is both a PhD level statistician and a professor of oncology. Having both the quantitative and clinical knowledge is an important asset in eliciting values for sensitivity parameters. The feasibility of the eliciting of sensitivity parameters in general has been discussed in Shepherd et al. (2007). The following decisions regarding reasonable values for the sensitivity parameters were elicited through extensive discussion. The sensitivity parameters τ and λ are unlikely to be outside the range of (1/2, 1.5); that is, τ, λ [1/2, 1.5]. The key to these ranges is that it was thought very unlikely for more than a 50% increase or decrease in these odds ratios.

We conduct a sensitivity analysis and draw inferences in the next section.

We are interested in subjects who would be alive on at least two treatment arms until the end of the study. Table 1 gives the number of patients alive at the end of the study by treatment.

Baseline covariates related to survival, *w*, included in the analyses were age (AGE), patient performance status (PS), and number of sites of metastatic disease (NOSITE). These three factors are well established to impact patient survival in this setting. To simplify computations of the empirical expectations in Section 3.4, we categorized each of the covariates. The cutpoints were chosen by Dr. Sargent using his extensive experience with the trial and the disease. Age was discretized as < 70 (*AGE* = 0) and ≥ 70 (*AGE* = 1). Performance status (PS) is a measurement of the patient’s ability to carry out activities of daily living, measured on a 5 point ordinal scale, where 0 represents full health and 4 represents death. Thus a higher PS indicates *worse* health. We discretized performance status into high {0 or 1} (*PS* = 0) vs. low {2,3,4} (*PS* = 1). The number of sites of metastatic disease was broken into three categories, {1,2, > 2} and coded using two indicator variables, NOSITE1 and NOSITE2, respectively as {(0,0), (1,0), (0,1)}.

We fit the logistic regression models in (1) using these covariates. The regression coefficient estimates, , are given in Table 2. For FOLFOX, the estimated coefficients of performance status and number of metastatic sites were significant; for IFL, performance status was significant and for IROX, number of metastatic sites was significant. For all three arms, the coefficients indicated that the estimated survival probability decreased for patients with lower performance status and decreased for patients with more sites of metastases.

Maximum likelihood estimates of ψ_{tx}. Standard errors are in parentheses. AGE=I{age < 70}; PS=I{performance status (2; 3; 4)}; NOSITE1 = I{# of sites = 2}; NOSITE2 = I{# of sites > 2}.

The MLE’s of *h*_{T,.}(K − 1) based on the IOMTM(2) (specified in (19) and (20)) are given in Table 3. The probability of fatigue for survivors under IROX was highest among the three arms. The probability under IFL was the lowest.

In the following, we present inferential results and sensitivity analyses for log *SACE*_{K−1,.}(1, 0), log *SACE*_{K−1,.}(2, 0), and log *SACE*_{K−1,.}(2, 1). Note that under Assumption 3, we have following equalities

$$\begin{array}{c}\text{log}\phantom{\rule{thinmathspace}{0ex}}{\mathit{\text{SACE}}}_{K-1,1}(1,0)\text{log}\phantom{\rule{thinmathspace}{0ex}}{\mathit{\text{SACE}}}_{K-1,2}(1,0),\hfill & \text{log}\phantom{\rule{thinmathspace}{0ex}}{\mathit{\text{SACE}}}_{K-1,1}(2,0)\text{log}\phantom{\rule{thinmathspace}{0ex}}{\mathit{\text{SACE}}}_{K-1,2}(2,0),\hfill & \text{log}\phantom{\rule{thinmathspace}{0ex}}{\mathit{\text{SACE}}}_{K-1,1}(2,1)\text{log}\phantom{\rule{thinmathspace}{0ex}}{\mathit{\text{SACE}}}_{K-1,2}(2,1).\hfill \end{array}$$

As such, we only focus *SACE*_{K−1,1}(·,·) and *SACE*_{K−1,3}(·,·) in the following. For each given set of sensitivity parameters, log *SACE*_{k,1}(1, 0) and log S*ACE*_{k,3}(1, 0) are the log odds ratios that the fatigue score at time *T* is larger than *k* for comparing IROX and IFL (standard of care) for subjects who would be alive under all three treatment arms and who would be alive either under all three treatment arms or only under IFL (standard of care) and IROX. The log *SACE*_{k,1}(2, 0) and log *SACE*_{k,3}(2, 0) are the log odds ratios similarly defined for comparing IFL and FOLFOX; and log *SACE*_{k,1}(2, 1) and log *SACE*_{k,3}(2, 1), for comparing the two experimental treatments, IROX and FOLFOX.

The estimated values of log *SACE*_{K−1,1}(1, 0) and associated p-values testing whether they are different from zero over the range of τ and λ elicited in Section 4.1 are given in Figure 1 for various values of ρ and ν. When (ρ, ν) = (0, 0), which corresponds to dying on IROX being independent of dying on IFL, dying on FOLFOX being independent of dying on IFL, and dying on FOLFOX being independent of dying on both IROX and IFL, the estimated values of log *SACE*_{K−1,1}(1, 0) increased as τ and λ respectively increased and decreased. The estimated values of log *SACE*_{K−1,1}(1, 0) were greater than 0 for all τ and λ (Figure 1(a)), implying that the probability of fatigue on the IROX treatment was greater than the probability of a patient’s fatigue on the IFL treatment. The estimated values of log *SACE*_{K−1,1}(1, 0) increased as λ decreased for (ρ, ν) = (0.5, 0.5) which corresponds to a moderate correlation between death probabilities on the two treatments and three treatments, respectively (Figure 1(c)). In contrast, when (ρ,ν) = (1, 1) (highest correlation), the log *SACE*_{K−1,1}(1, 0) decreased as τ increased (Figure 1(e)). However, for all values of (ρ, ν) there were no significant differences between IROX and IFL. (Figure 1(b), 1(d), 1(f)).

Maximum likelihood estimates of log *SACE*_{K−1,1}(1, 0) with ρ, ν {0, 0.5, 1.0}, respectively, and corresponding p-values for log *SACE*_{K−1,1}(1, 0).

Figure 2 presents estimated values of log *SACE*_{K𢄒1,3}(1, 0) and corresponding p-values. The pattern was similar for log *SACE*_{K𢄒1,1}(1, 0) (Figure 2(a), 2(c), 2(e)) and there were no significant differences between the two treatment arms for all values of (ρ, ν) (Figure 2(b), 2(d), 2(f)). We note that the plots for estimated values of log *SACE*_{K−1,1}(1, 0) and log *SACE*_{K−1,3}(1, 0) looked similar. However, there were small difference between these estimated values.

Maximum likelihood estimates of log *SACE*_{K−1,1}(1, 0) with ρ, ν {0, 0.5, 1.0}, respectively, and corresponding p-values for log *SACE*_{K−1,3}(1, 0).

Figures 3 and and44 show similar results for log *SACE*_{K−1,1}(2, 0) and log *SACE*_{K−1,3}(2, 0). The estimated values of log *SACE*_{K−1,1}(2, 0) and log *SACE*_{K−1,3}(2, 0) were greater than 0 for all values of (ρ, ν). However, there were no significant differences between FOLFOX and IFL.

Maximum likelihood estimates of log *SACE*_{K−1,1}(2, 0) with ρ, ν {0, 0.5, 1.0}, respectively, and corresponding p-values for log *SACE*_{K−1,1}(2, 0).

Maximum likelihood estimates of log *SACE*_{K−1,3}(2, 0) with ρ, ν {0, 0.5, 1.0}, respectively, and corresponding p-values for log *SACE*_{K−1,3}(2, 0).

Figure 5 and and66 give the estimated values of log *SACE*_{K−1,1}(2, 1) and log *SACE*_{K−1,3}(2, 1) and the corresponding p-values. The estimated values of *SACE*_{K−1,1}(2, 1) were less than 0 for all values of (ρ, ν), implying that the probability of fatigue on IROX was larger than the probability of fatigue on FOLFOX. However, again the differences were not significant.

Maximum likelihood estimates of log *SACE*_{K−1,1}(2, 1) with ρ, ν {0, 0.5, 1.0}, respectively, and corresponding p-values for log *SACE*_{K−1,1}(2, 1).

In this paper, we have proposed a principal stratification approach to estimate the causal effect of treatment in the presence of death for a three arm study with two active treatments and a longitudinal ordinal outcome. Our formulation of sensitivity parameters in Assumption 3 is very much context specific. Based on this approach, we concluded that patients’ fatigue levels were not influenced by treatment, an important finding that allows the use of the treatment with the most favorable anti-cancer activities. However, we note that the power was low for this comparison with few patients still on study at one year.

In our approach, instead of just using the data *Y _{T}* observed at the end of the study, we used all available longitudinal data using the models in Section 3.4 along with Assumption 4. This was essential to the viability of our approach as only 18 subjects in total had the QOL measure at time

A key assumption in our proposed approach is deterministic monotonicity, Assumption Ia, which assumes that there are no patients who would die on both experimental arms but survive on the control arm. While this is weaker than a more typical assumption that no patient would die on either experimental arm but be alive on the control arm, it still requires justification. In this trial, both experimental arms were superior for overall survival compared to the control, and were also less toxic. In addition, despite extensive analyses of both baseline clinical and blood-based predictors (pharmacogenomics) reported elsewhere (Sanoff et al., 2008; McLeod et al., 2003), no predictors of differential treatment efficacy between the three treatments has been identified. Thus, while personalized medicine has the hope to identify the optimal therapy for each patient (which for some patients may be the control), at the present time for this trial, Assumption Ia is not unreasonable.

A difficult research problem is the best way to ’honestly’ elicit sensitivity parameters in causal inference problems, often from physicians. Here we had the benefit of a statistician, Dr. Sargent, who is also a professor of oncology and was intimately involved in this trial. However eliciting from experts *without* this dual expertise remains a challenge.

In the current approach, we compute the large sample variance of SACE using the delta method. We plan in future work to examine its operating characteristic in small samples. We also plan to explore a fully Bayesian approach and as such, use informative priors for the sensitivity parameters, (τ, λ, ρ, ν), to obtain a single inference that characterizes our uncertainty about the sensitivity parameters. This will also allow us to easily make non-large sample inferences. This is ongoing work.

We would like to thank Ms. Erin Green of the Mayo Clinic for her help in data collection and clarifying issues with the data. The authors gratefully acknowledge the associate editor and two referees for insightful comments that have greatly improved the manuscript. This project was supported by NIH grants CA85295, HL079457, and CA25224.

We know that the marginal distributions, *P*(*D _{T}*(

$$P({D}_{T}(\text{tx})=0)=P({D}_{T}=0|\text{Tx}=\text{tx})={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{P({D}_{T}=0|\text{Tx}=\text{tx},\mathit{w})\}={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{g}_{T}(\text{tx},\mathit{w})\}.$$

(22)

Now we move to identification of the bivariate distributions *P*(*D _{T}* (

$$\begin{array}{cc}P({D}_{T}(1)=0,{D}_{T}(0)=0)\hfill & =P({D}_{T}(1)=0|{D}_{T}(0)=0)P({D}_{T}(0)=0)\hfill \\ \hfill & ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{p}_{T}^{*}(1,\mathit{w}){g}_{T}(0,\mathit{w})\},\hfill \\ P({D}_{T}(1)=1,{D}_{T}(0)=0)\hfill & =P({D}_{T}(1)=1|{D}_{T}(0)=0)P({D}_{T}(0)=0)\hfill \\ \hfill & ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{(1-{p}_{T}^{*}(1,\mathit{w})){g}_{T}(0,\mathit{w})\}.\hfill \end{array}$$

Therefore, both *P*(*D _{T}*(1) = 0,

$$\begin{array}{c}\text{P(DT(1)=0)}\hfill & =P({D}_{T}(0)=0,{D}_{T}(1)=0)+P({D}_{T}(0)=1,{D}_{T}(1)=0)\hfill \\ =P({D}_{T}(0)=0,{D}_{T}(1)=0)+P({D}_{T}(1)=0|{D}_{T}(0)=1)P({D}_{T}(0)=1),\hfill \end{array}$$

*P*(*D _{T}*(1) = 0|

$$\sum _{j,j\prime}P({D}_{T}(1)=j,{D}_{T}(0)=j\prime )=1,$$

*P*(*D _{T}*(1) = 1;

We now show that

$$P({D}_{T}(2)=j,{D}_{T}(0)=j\prime )\text{for}\phantom{\rule{thinmathspace}{0ex}}j,j\prime =0,1,$$

are identified. From Assumption 1 and (3), we have the following results

$$\begin{array}{cc}P({D}_{T}(2)=0,{D}_{T}(0)=0)\hfill & =P({D}_{T}(2)=0|{D}_{T}(0)=0)P({D}_{T}(0)=0)\hfill \\ \hfill & ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{p}_{T}^{*}(2,\mathit{w}){g}_{T}(0,\mathit{w})\},\hfill \\ P({D}_{T}(2)=1,{D}_{T}(0)=0)\hfill & =\{1-P({D}_{T}(2)=0|{D}_{T}(0)=0)\}\phantom{\rule{thinmathspace}{0ex}}P({D}_{T}(0)=0)\hfill \\ \hfill & ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{(1-{p}_{T}^{*}(2,\mathit{w})){g}_{T}(0,\mathit{w})\}.\hfill \end{array}$$

Therefore, both *P*(*D _{T}*(2) = 0,

Since

$$\begin{array}{c}\text{P(DT(2)=0)}\hfill & =P({D}_{T}(2)=0,{D}_{T}(0)=0)+P({D}_{T}(2)=0|{D}_{T}(0)=1)P({D}_{T}(0)=1),\hfill \end{array}$$

*P*(*D _{T}*(2) = 0|

$$\sum _{j,j\prime}P({D}_{T}(2)=j,{D}_{T}(0)=j\prime )=1.$$

Now we will prove *P*(*A _{T}*(

$$P({D}_{T}(0)=j,{D}_{T}(1)=l,{D}_{T}(2)=k),$$

for *j, l, k* = 0, 1 are identified (except (*j, l, k*) = (1, 1, 1)). We first identify *P*(*A _{T}* (5)) =

$$\begin{array}{c}\text{P(DT(0)=0,DT(1)=1,DT(2)=0)}\hfill & =P({D}_{T}(0)=0)-P({D}_{T}(0)=0,{D}_{T}(1)=0)\hfill \\ =P({D}_{T}(0)=0)-P({D}_{T}(1)=0|{D}_{T}(0)=0)P({D}_{T}(0)=0)\hfill \\ ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}[{g}_{T}(0,\mathit{w})\phantom{\rule{thinmathspace}{0ex}}\{1-{p}_{T}^{*}(1,\mathit{w})\}].\text{}\text{Assumption Ib}.\hfill \end{array}$$

We now identify *P*(*A _{T}* (0)) =

$$\begin{array}{c}\text{P(DT(0)=0,DT(1)=0,DT(2)=0)}\hfill & =P({D}_{T}(0)=0,{D}_{T}(2)=0)-P({D}_{T}(0)=0,{D}_{T}(1)=1,{D}_{T}(2)=0)\hfill \\ =P({D}_{T}(2)=0|{D}_{T}(0)=0)P({D}_{T}(0)=0)-P({A}_{T}(5))\hfill \\ ={E}_{\mathit{w}}\{{p}_{T}^{*}(2,\mathit{w}){g}_{T}(0,\mathit{w})\}-P({A}_{T}(5)).\hfill \end{array}$$

To identify *P*(*A _{T}*(2)) =

$$\begin{array}{c}\text{P(DT(0)=1,DT(1)=1,DT(2)=0)}\hfill & =P({D}_{T}(2)=0|{D}_{T}(0)=1,{D}_{T}(1)=1)\phantom{\rule{thinmathspace}{0ex}}P({D}_{T}(0)=1,{D}_{T}(1)=1)\hfill \\ =P({D}_{T}(2)=0|{D}_{T}(0)=1,{D}_{T}(1)=1)\phantom{\rule{thinmathspace}{0ex}}\{1-P({D}_{T}(0)=0,{D}_{T}(1)=0)-P({D}_{T}(0)=0,{D}_{T}(1)=1)-P({D}_{T}(0)=1,{D}_{T}(1)=0)\}\hfill \\ =P({D}_{T}(2)=0|{D}_{T}(0)=1,{D}_{T}(1)=1)\phantom{\rule{thinmathspace}{0ex}}[1-P({D}_{T}(1)=0|{D}_{T}(0)=0)\phantom{\rule{thinmathspace}{0ex}}P({D}_{T}(0)=0)-P({D}_{T}(1)=1|{D}_{T}(0)=0)\phantom{\rule{thinmathspace}{0ex}}P({D}_{T}(0)=0)-\{P({D}_{T}(1)=0)-P({D}_{T}(1)=0|{D}_{T}(0)=0)P({D}_{T}(0)=0)\}]\hfill \\ ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}[(1-{q}_{T}^{*}(\mathit{w}))\{1-{g}_{T}(0,\mathit{w})-{g}_{T}(1,\mathit{w})+{p}_{T}^{*}(1,\mathit{w}){g}_{T}(0,\mathit{w})\}].\phantom{\rule{thinmathspace}{0ex}}\text{Assumption}\phantom{\rule{thinmathspace}{0ex}}1\mathrm{c}).\hfill \end{array}$$

To identify *P*(*A _{T}*(1)) =

$$\begin{array}{c}\text{P(DT(0)=1,DT(1)=0,DT(2)=0)}\hfill & =P({D}_{T}(2)=0,{D}_{T}(0)=1)-P({D}_{T}(0)=1,{D}_{T}(1)=1,{D}_{T}(2)=0)\hfill \\ =\{1-P({D}_{T}(0)=0|{D}_{T}(2)=0)\}\phantom{\rule{thinmathspace}{0ex}}P({D}_{T}(2)=0)-P({A}_{T}(2))\hfill \\ =P({D}_{T}(2)=0)-P({D}_{T}(0)=0)P({D}_{T}(2)=0)|{D}_{T}(0)-P({A}_{T}(2))\hfill \\ ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{g}_{T}(2,\mathit{w})-{g}_{T}(0,\mathit{w}){p}_{T}^{*}(2,\mathit{w})\}-P({A}_{T}(2)).\hfill \end{array}$$

To identify *P*(*A _{T}* (4)) =

$$\begin{array}{c}\text{P(DT(0)=0,DT(1)=0,DT(2)=1)}\hfill & =P({D}_{T}(0)=0,{D}_{T}(1)=0)-P({D}_{T}(0)=0,{D}_{T}(1)=0,{D}_{T}(2)=0)\hfill \\ =P({D}_{T}(1)=0|{D}_{T}(0)=0)P({D}_{T}(0)=0)-P({A}_{T}(0))\hfill \\ ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{p}_{T}^{*}(1,\mathit{w}){g}_{T}(0,\mathit{w})\}-P({A}_{T}(0)).\hfill \end{array}$$

To identify *P*(*A _{T}*(3)) =

$$\begin{array}{c}\text{P(DT(0)=1,DT(1)=0,DT(2)=1)}\hfill & =P({D}_{T}(0)=1,{D}_{T}(1)=0)-P({D}_{T}(0)=1,{D}_{T}(1)=0,{D}_{T}(2)=0)\hfill \\ =\{1-P({D}_{T}(0)=0|{D}_{T}(1)=0)\}\phantom{\rule{thinmathspace}{0ex}}P({D}_{T}(1)=0)-P({A}_{T}(1))\hfill \\ =P({D}_{T}(1)=0)-P({D}_{T}(1)=0|{D}_{T}(0)=0)\phantom{\rule{thinmathspace}{0ex}}P({D}_{T}(0)=0)-P({A}_{T}(1))\hfill \\ ={E}_{\mathit{w}}\phantom{\rule{thinmathspace}{0ex}}\{{g}_{T}(1,\mathit{w})-{p}_{T}^{*}(1,\mathit{w}){g}_{T}(0,\mathit{w})\}-P({A}_{T}(1)).\hfill \end{array}$$

Finally, to identify *P*(*A _{T}* (6)) =

$$\begin{array}{c}\text{P(DT(0)=0,DT(1)=1,DT(2)=1)}\hfill & =P({D}_{T}(1)=1|{D}_{T}(0)=0,{D}_{T}(2)=1)P({D}_{T}(0)=0,{D}_{T}(2)=1)=0.\text{Assumption}\phantom{\rule{thinmathspace}{0ex}}1\mathrm{a}).\hfill \end{array}$$

In addition to (15) and (16), we have from Assumption 3

$$P({Y}_{T}(0)>k|{A}_{T}(5))=\frac{\tau P({Y}_{T}(0)>k|{A}_{T}(0))}{1+(\tau -1)\phantom{\rule{thinmathspace}{0ex}}P({Y}_{T}(0)>k|{A}_{T}(0))},$$

(23)

$$P({Y}_{T}(1)>k|{A}_{T}(1))=\frac{\tau P({Y}_{T}(1)>k|{A}_{T}(0))}{1+(\tau -1)\phantom{\rule{thinmathspace}{0ex}}P({Y}_{T}(1)>k|{A}_{T}(0))},$$

(24)

$$P({Y}_{T}(1)>k|{A}_{T}(3))=\frac{\mathrm{\lambda}P({Y}_{T}(1)>k|{A}_{T}(0))}{1+(\mathrm{\lambda}-1)\phantom{\rule{thinmathspace}{0ex}}P({Y}_{T}(1)>k|{A}_{T}(0))}.$$

(25)

To identify *P*(*Y _{T}*(0) >

$$P({Y}_{T}(0)>k|{D}_{T}(1)=0)=P({Y}_{T}(0)>k|{A}_{T}(0))P({A}_{T}(0)|{D}_{T}(0)=0)+P({Y}_{T}(0)>k|{A}_{T}(4))P({A}_{T}(4)|{D}_{T}(0)=0)+P({Y}_{T}(0)>k|{A}_{T}(5))P({A}_{T}(5)|{D}_{T}(0)=0).$$

Now we replace *P*(*Y _{T}*(0) >

$${A}_{0}{P}^{3}({Y}_{T}(0)>k|{A}_{T}(0))+{B}_{0}{P}^{2}({Y}_{T}(0)>k|{A}_{T}(0))+{C}_{0}P({Y}_{T}(0)>k|{A}_{T}(0))-{h}_{T,0}(k)=0,$$

(26)

where

$$\begin{array}{c}{A}_{0}=P({A}_{T}(0)|{D}_{T}(0)=0)\phantom{\rule{thinmathspace}{0ex}}{(\tau -1)}^{2},\hfill \\ {B}_{0}=P({A}_{T}(0)|{D}_{T}(0)=0)\phantom{\rule{thinmathspace}{0ex}}(2\tau -2)-{h}_{T,0}(k)\phantom{\rule{thinmathspace}{0ex}}{(\tau -1)}^{2},+\tau \phantom{\rule{thinmathspace}{0ex}}(\tau -1)P({A}_{T}(4)|{D}_{T}(0)=0)+\tau (\tau -1)P({A}_{T}(5)|{D}_{T}(0)=0)\hfill \\ {C}_{0}=P({A}_{T}(0)|{D}_{T}(0)=0)+\tau P({A}_{T}(4)|{D}_{T}(0)=0)+\tau P({A}_{T}(5)|{D}_{T}(0)=0)-(2\tau -2)\phantom{\rule{thinmathspace}{0ex}}{h}_{T,0}(k).\hfill \end{array}$$

The probabilities *P*(*A _{T}*(

To identify *P*(*Y _{T}*(1) >

$$\begin{array}{c}\text{P(YT(1)k|DT(1)=0)}\hfill & =P({Y}_{T}(1)k|{A}_{T}(0))P({A}_{T}(0)|{D}_{T}(1)=0)+P({Y}_{T}(1)k|{A}_{T}(1))P({A}_{T}(1)|{D}_{T}(1)=0)+P({Y}_{T}(1)k|{A}_{T}(3))P({A}_{T}(3)|{D}_{T}(1)=0)+P({Y}_{T}(1)k|{A}_{T}(4))P({A}_{T}(4)|{D}_{T}(1)=0).\hfill \end{array}$$

We replace *P*(*Y _{T}*(1) >

$${A}_{1}{P}^{4}({Y}_{T}(1)>k|{A}_{T}(0))+{B}_{1}{P}^{3}({Y}_{T}(1)>k|{A}_{T}(0))+{C}_{1}{P}^{2}({Y}_{T}(1)>k|{A}_{T}(0))+{D}_{1}P({Y}_{T}(1)>k|{A}_{T}(0))-{h}_{T,1}(k)=0,$$

(27)

where

$$\begin{array}{c}{A}_{1}=P({A}_{T}(0)|{D}_{T}(1)=0)\phantom{\rule{thinmathspace}{0ex}}{(\tau -1)}^{2}(\mathrm{\lambda}-1),\hfill \\ {B}_{1}=P({A}_{T}(0)|{D}_{T}(1)=0)\phantom{\rule{thinmathspace}{0ex}}\{(\tau -1)\phantom{\rule{thinmathspace}{0ex}}(\mathrm{\lambda}-1)+(\tau +\mathrm{\lambda}-2)\phantom{\rule{thinmathspace}{0ex}}(\tau -1)\}+P({A}_{T}(1)|{D}_{T}(1)=0)\tau \phantom{\rule{thinmathspace}{0ex}}(\mathrm{\lambda}-1)\phantom{\rule{thinmathspace}{0ex}}(\tau -1)+P({A}_{T}(3)|{D}_{T}(1)=0)\mathrm{\lambda}{(\tau -1)}^{2}+P({A}_{T}(4)|{D}_{T}(1)=0)\tau \phantom{\rule{thinmathspace}{0ex}}(\tau -1)\phantom{\rule{thinmathspace}{0ex}}(\mathrm{\lambda}-1)-{h}_{T,1}(k)\phantom{\rule{thinmathspace}{0ex}}{(\tau -1)}^{2}(\mathrm{\lambda}-1),\hfill \\ {C}_{1}=P({A}_{T}(0)|{D}_{T}(1)=0)\phantom{\rule{thinmathspace}{0ex}}(2\tau +\mathrm{\lambda}-3)+P({A}_{T}(1)|{D}_{T}(1)=0)\tau \phantom{\rule{thinmathspace}{0ex}}(\tau +\mathrm{\lambda}-2)+P({A}_{T}(3)|{D}_{T}(1)=0)\mathrm{\lambda}(2\tau -2)+P({A}_{T}(4)|{D}_{T}(1)=0)\tau \phantom{\rule{thinmathspace}{0ex}}(\tau +\mathrm{\lambda}-2)-{h}_{T,1}(k)\phantom{\rule{thinmathspace}{0ex}}\{(\tau -1)\phantom{\rule{thinmathspace}{0ex}}(\mathrm{\lambda}-1)+(\tau +\mathrm{\lambda}-2)\phantom{\rule{thinmathspace}{0ex}}(\tau -1)\},\hfill \\ {D}_{1}=P({A}_{T}(0)|{D}_{T}(1)=0)+P({A}_{T}(1)|{D}_{T}(1)=0)\tau +P({A}_{T}(3)|{D}_{T}(1)=0)\mathrm{\lambda}+P({A}_{T}(4)|{D}_{T}(1)=0)\tau -{h}_{T,1}(k)\phantom{\rule{thinmathspace}{0ex}}(2\tau +\mathrm{\lambda}-3).\hfill \end{array}$$

The probabilities *P*(*A _{T}* (0)|

Via the following equations,

$$P({Y}_{T}(0)>k|{A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(5))=\frac{P({Y}_{T}(0)k|{A}_{T}(0))P({A}_{T}(0))+P({Y}_{T}(0)k|{A}_{T}(5))P({A}_{T}(5))}{P({A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(5))}$$

(28)

$$P({Y}_{T}(2)>k|{A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(5))=\frac{P({Y}_{T}(2)k|{A}_{T}(0))P({A}_{T}(0))+P({Y}_{T}(2)k|{A}_{T}(5))P({A}_{T}(5))}{P({A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(5))}$$

(29)

and the fact that *P*(*A _{T}*(0)

From Assumption 3, we have following relations

$$P({Y}_{T}(2)>k|{A}_{T}(1))=\frac{\tau P({Y}_{T}(2)>k|{A}_{T}(0))}{1+(\tau -1)P({Y}_{T}(2)>k|{A}_{T}(0))},$$

(30)

$$P({Y}_{T}(2)>k|{A}_{T}(2))=\frac{\mathrm{\lambda}P({Y}_{T}(2)>k|{A}_{T}(0))}{1+(\mathrm{\lambda}-1)P({Y}_{T}(2)>k|{A}_{T}(0))},$$

(31)

$$P({Y}_{T}(2)>k|{A}_{T}(5))=\frac{\tau P({Y}_{T}(2)>k|{A}_{T}(0))}{1+(\tau -1)P({Y}_{T}(2)>k|{A}_{T}(0))}.$$

(32)

Therefore, we can identify *P*(*Y _{T}*(2) >

To identify *P*(*Y _{T}*(2) >

$$\begin{array}{c}\text{P(YT(2)k|DT(2)=0)}\hfill & =P({Y}_{T}(2)k|{A}_{T}(0))P({A}_{T}(0)|{D}_{T}(2)=0)+P({Y}_{T}(2)k|{A}_{T}(1))P({A}_{T}(1)|{D}_{T}(2)=0)+P({Y}_{T}(2)k|{A}_{T}(2))P({A}_{T}(2)|{D}_{T}(2)=0)+P({Y}_{T}(2)k|{A}_{T}(5))P({A}_{T}(5)|{D}_{T}(2)=0).\hfill \end{array}$$

Now we replace *P*(*Y _{T}*(2) >

$${A}_{2}{P}^{4}({Y}_{T}(2)>k|{A}_{T}(0))+{B}_{2}{P}^{3}({Y}_{T}(2)>k|{A}_{T}(0))+{C}_{2}{P}^{2}({Y}_{T}(2)>k|{A}_{T}(0))+{D}_{2}P({Y}_{T}(2)>k|{A}_{T}(0))-{h}_{T,2}(k)=0,$$

(33)

where

$$\begin{array}{c}{A}_{2}=P({A}_{T}(0)|{D}_{T}(2)=0)\phantom{\rule{thinmathspace}{0ex}}{(\tau -1)}^{2}(\mathrm{\lambda}-1),\hfill \\ {B}_{2}=P({A}_{T}(0)|{D}_{T}(2)=0)\phantom{\rule{thinmathspace}{0ex}}\{(\tau -1)\phantom{\rule{thinmathspace}{0ex}}(\mathrm{\lambda}-1)+(\tau +\mathrm{\lambda}-2)\phantom{\rule{thinmathspace}{0ex}}(\tau -1)\}+P({A}_{T}(1)|{D}_{T}(2)=0)\tau \phantom{\rule{thinmathspace}{0ex}}(\mathrm{\lambda}-1)\phantom{\rule{thinmathspace}{0ex}}(\tau -1)+P({A}_{T}(2)|{D}_{T}(2)=0)\phantom{\rule{thinmathspace}{0ex}}\mathrm{\lambda}\phantom{\rule{thinmathspace}{0ex}}{(\tau -1)}^{2}+P({A}_{T}(5)|{D}_{T}(2)=0)\tau \phantom{\rule{thinmathspace}{0ex}}(\tau -1)\phantom{\rule{thinmathspace}{0ex}}(\mathrm{\lambda}-1)-{h}_{T,2}(k)\phantom{\rule{thinmathspace}{0ex}}{(\tau -1)}^{2}(\mathrm{\lambda}-1),\hfill \\ {C}_{2}=P({A}_{T}(0)|{D}_{T}(2)=0)\phantom{\rule{thinmathspace}{0ex}}(2\tau +\mathrm{\lambda}-3)+P({A}_{T}(1)|{D}_{T}(2)=0)\tau \phantom{\rule{thinmathspace}{0ex}}(\tau +\mathrm{\lambda}-2)+P({A}_{T}(2)|{D}_{T}(2)=0)\phantom{\rule{thinmathspace}{0ex}}\mathrm{\lambda}\phantom{\rule{thinmathspace}{0ex}}(2\tau -2)+P({A}_{T}(5)|{D}_{T}(2)=0)\tau \phantom{\rule{thinmathspace}{0ex}}(\tau +\mathrm{\lambda}-2)-{h}_{T,2}(k)\phantom{\rule{thinmathspace}{0ex}}\{(\tau -1)\phantom{\rule{thinmathspace}{0ex}}(\mathrm{\lambda}-1)+(\tau +\mathrm{\lambda}-2)\phantom{\rule{thinmathspace}{0ex}}(\tau -1)\},\hfill \\ {D}_{2}=P({A}_{T}(0)|{D}_{T}(2)=0)+P({A}_{T}(1)|{D}_{T}(2)=0)\tau +P({A}_{T}(2)|{D}_{T}(2)=0)\mathrm{\lambda}+P({A}_{T}(5)|{D}_{T}(2)=0)\tau -{h}_{T,2}(k)\phantom{\rule{thinmathspace}{0ex}}(2\tau +\mathrm{\lambda}-3).\hfill \end{array}$$

The probabilities *P*(*A _{T}*(0)|

Similar to (17), (18), (28), and (29), we have following expressions for *SACE*_{k,3}(2, 1),

$$P({Y}_{T}(1)>k|{A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(1))=\frac{P({Y}_{T}(1)k|{A}_{T}(0))P({A}_{T}(0))+P({Y}_{T}(1)k|{A}_{T}(1))P({A}_{T}(1))}{{A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(1))}$$

(34)

$$P({Y}_{T}(2)>k|{A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(1))=\frac{P({Y}_{T}(2)k|{A}_{T}(0))P({A}_{T}(0))+P({Y}_{T}(2)k|{A}_{T}(1))P({A}_{T}(1))}{{A}_{T}(0)\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{A}_{T}(1)}$$

(35)

In the process of identifying *SACE*_{k,3}(1, 0) and *SACE*_{k,3}(2, 0), we identified *P*(*Y _{T}*(1) >

- Cheng J. Estimation and inference for the causal effect of receiving treatment on a multinomial outcome. Biometrics. 2009;65:96–103. [PubMed]
- Cheng J, Small DS. Bounds on causal effects in three-arm trials with non-compliance. Journal of the Royal Statistical Society, Series B. 2006;68:815–836.
- Egleston BL, Scharfstein DO, Freeman ER, West SK. Causal inference for non-mortality outcomes in the presence of death. Biostatistics. 2007;8:526–545. [PubMed]
- Egleston BL, Scharfstein DO, MacKenzie E. On estimation of the survivor average causal effect in observational studies when important confounders are missing due to death. Biometrics. 2009;65:497–504. [PMC free article] [PubMed]
- Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. [PubMed]
- Gilbert PB, Ronald JB, Hudgens MG. Sensitivity analysis for the assessment of causal vaccine effects on viral load in HIV vaccine trials. Biometrics. 2003;59:531–541. [PubMed]
- Goldberg RM, Sargent DJ, Morton RF, Fuchs CS, Ramanathan RK, Williamson SK, Findlay BP, Pitot HC, Albets SR. A randomized controlled trial of fluorouracil plus leucovorin, irinotecan, and oxaliplatin combinations in patients with previously untreated metastatic colorectal cancer. Journal of Clinical Oncology. 2004;22:23–30. [PubMed]
- Harel O, Schafer JL. Partial and latent ignorability in missing-data problems. Biometrika. 2009;96:37–50.
- Hayden D, Pauler DK, Schoenfeld D. An estimator for treatment comparisons among survivors in randomized trials. Biometrics. 2005;61:305–310. [PubMed]
- Hogan JW, Laird NM. Mixture models for the joint distribution of repeated measures and event times. Statistics in Medicine. 1997;16:239–257. [PubMed]
- Holland P. Statistics and causal inference. Journal of the American Statistical Association. 1986;81:945–961.
- Imai K. Sharp bounds on the causal effects in randomized experiments with truncation-by-death. Statistics and Probability Letters. 2008;78:144–149.
- Jin H, Rubin D. Principal stratification for causal inference with extended partial compliance. Journal of the American Statistical Association. 2008;103:101–110.
- Kurland BF, Heagerty PJ. Directly parameterized regression conditioning on being alive: analysis of longitudinal data truncated by deaths. Biostatistics. 2005;6:241–258. [PubMed]
- Lee K, Daniels M. A class of Markov models for longitudinal ordinal data. Biometrics. 2007;63:1060–1067. [PMC free article] [PubMed]
- Lee K, Daniels M. Marginalized Models for Longitudinal Ordinal Data with Application to Quality of Life Studies. Statistics in Medicine. 2008;27:4359–4380. [PMC free article] [PubMed]
- McLeod HL, Sargent DJ, Marsh S, Fuchs C, Ramanathan RK, Williamson S, Findlay B, Thibodeau S, Petersen G, Goldberg R. Pharmaco-genetic analysis of systemic toxicity and response after 5-fluorouracil (5FU)/CPT-11, 5FU/oxaliplatin (oxal), or CPT-11/oxal therapy for advanced colorectal cancer (CRC): Results from an intergroup trial. Proc Am Soc Clin Oncol. 2003;22 abstract 1013.
- Pauler DK, McCoy S, Moinpour C. Pattern mixture models for longitudinal quality of life studies in advanced stage disease. Statistics in Medicine. 2003;22:795–809. [PubMed]
- Robins JM, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Journal of the American Statistical Association. 1995;90:106–121.
- Rotnitzky A, Robins JM, Scharfstein DO. Semiparametric regression for repeated outcomes with nonignorable nonresponse. Journal of the American Statistical Association. 1998;93:1321–1339.
- Roy J, Hogan JW, Marcus BH. Principal stratification with predictors of compliance for randomized trials with 2 active treatments. Biostatistics. 2008;9:277–289. [PubMed]
- Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology. 1974;66:688–701.
- Rubin DB. Bayesian inference for causal effects. Annals of Statistics. 1978;6:34–58.
- Rubin DB. Dawid AP, editor. Causal inference without counterfactuals. Journal of the American Statistical Association. 2000;6:34–58. Comment on.
- Rubin DB. Causal inference through potential outcomes and principal stratification: Application to studies with ‘censoring’ due to death. Statistical Science. 2006;21:299–309.
- Saltz LB, Cox JV, Blanke C, Rosen LS, Fehrenbacher L, Moore MJ, Maroun JA, Ackland SP, Locker PK, Pirotta N, Elfring GL, Miller LL. for the Irinotecan Study Group. Irinotecan plus fluorouracil and leucovorin for metastatic colorectal cancer. New England Journal of Medicine. 2000;343:905–914. [PubMed]
- Sanoff HK, Sargent DJ, Campbell ME, Morton RF, Fuchs CS, Ramanathan RK, Williamson SK, Findlay BP, Pitot HC, Goldberg RM. Five-Year Data and Prognostic Factor Analysis of Oxaliplatin and Irinotecan Combinations for Advanced Colorectal Cancer: N9741. Journal of Clinical Oncology. 2008;26:5721–5727. [PMC free article] [PubMed]
- Scharfstein DO, Rotnitzky A, Robins JM. Adjusting for Non-ignorable Drop-out Using Semiparametric Non-response Models (with Discussion) Journal of the American Statistical Association. 1999;94:1096–1146.
- Shepherd BE, Gilbert PB, Mehrotra DV. Eliciting a counterfactual sensitivity parameter. American Statistician. 2007;61:56–63.
- Small D, Tan Z. A stochastic monotonicity assumption for the instrumental variables method. Working paper. 2008
- Zhang JL, Rubin DB. Estimation of causal effects via principal stratification when some outcomes are truncated by “death” Journal of Educational and Behavioral Statistics. 2003;28:353–368.

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |