Given the distribution of

_{28} and the first three assumptions above, the potential survival function at day

*m* (

*m* ≤ 28) is identified via the following G-computation formula (

Robins, 1986). The identification formula is derived as follows. By conditioning on

*H*_{0},

*U*_{0} and

*S*_{1} and

*H*_{1}, we can write

Under sequential randomization,

By further conditioning on

*U*_{1},

*S*_{2} and

*H*_{2},

Using sequential randomization,

By continuing to condition sequentially in time and employing sequential randomization,

Finally, by employing stochastic consistency,

This latter multivariate integral has been called the “G-computation” integral or formula. To illustrate how the formula works, consider the following example.

4.1. Example

Consider the treatment of patients with ALI on the first two days. Suppose that (1) all patients survive day 1, (2) all patients are on assist-control or ‘at-risk’ on day one, (3) 70 percent of the patients on day 1 are treated with LTVV (*A*_{1} = 1) and the others with traditional ventilation (*A*_{1} = 0), (4) the assignment of treatment on day one is completely random, (5) organ failure on day 1 is highly associated with treatment on day 1, with patients on LTVV at much lower risk (20%) for organ failure as compared to those who are not (80%), (6) the risk of death on day 2 is higher for those with organ failure on day 1 and is lower for those treated with LTVV on day 1, (7) all surviving patients on day 2 are on assist-control ventilation on day 2, (8) treatment with LTVV on day 2 is positively associated with treatment with LTVV on day 1 and negatively associated with organ failure on day 1, (9) organ failure on day 2 is positively associated with organ failure on day 1 and negatively associated with treatment with LTVV on day 1 and (10) the risk of death on day 3 is higher for those with organ failure on day 2 and lower for those treated with LTVV on day 2. The distribution of the observed data is displayed in .

In this example, we are interested in comparing the probability of surviving past day 3 under the LTVV regime to (a) the overall probability of surviving past day 3 and (b) the probability of surviving past day 3 among patients who are observed to adhere to the LTVV regime. Using the G-computation formula and noting that

*S*_{1} = 1,

,

,

*U*_{t} =

*OF*_{t} (

*t* = 1, 2), we obtain

In contrast, the overall three-day survival probability is 0.60 and the survival probability among those observed to adhere to the LTVV protocol is 0.844.

In this example, organ failure serves as a time-varying confounder. That is, organ failure on day 1 is affected by treatment on day 1 and predicts subsequent treatment. Organ failure on day 1 is also negatively related to survival on day 3. The three-day survival probability for those who adhere to the LTVV regime is higher than the counterfactual survival probability because those who adhere to therapy tend to have less organ failure than those who do not. The overall three-day survival probability is even lower because non-adherence to therapy is associated with greater organ failure and worse survival, regardless of organ failure status.

4.2. Additional Modeling Assumptions

The G-computation formula teaches us that the

is a functional of the distribution of

_{28}. We can write the likelihood of

_{28} as follows:

In our analysis, we assume that a patient's status on day

*t* only depends on his most recent history and the baseline APACHE III score (

*H*_{0}). Thus, the likelhood can be written as:

Further modeling restrictions result from structural features of our datasets as well as from the clinical judgement of our physician co-authors (DN and RB). Structurally, we know that all patients survive day one and whenever a patient is on ventilator on day

*t* – 1, she either dies or remains in the hospital and in the ICU at time

*t*. Also, once a patient leaves the hospital (ICU), she does not return the hospital (ICU). With these considerations, we write

where

*PD*_{t} =

*PP*_{t} –

*PE*_{t}, conditioning on

*R*_{t},

*HSP*_{t},

*ICU*_{t} or

*VNT*_{t} implies that

*S*_{t} = 1, conditioning on

*ICU*_{t} or

*VNT*_{t} implies

*HSP*_{t} = 1 and conditioning on

*VNT*_{t} implies

*ICU*_{t} = 1. It is important to note that we model

*PD*_{t} instead of

*PP*_{t} since we are conditioning on

*PE*_{t} and that

*VT*_{t} does not appear in the likelihood because it is a deterministic function of

*LC*_{t} and

*PD*_{t} (i.e.,

*VT*_{t} =

*LC*_{t}PD_{t}). The distributions on the right hand side of the above three displays are modeled with separate regression models. The binary outcomes,

*S*_{t},

*HSP*_{t},

*ICU*_{t},

*VNT*_{t},

*DS*_{t} and

*OF*_{t}, are modeled using logistic regression. The continuous variables,

*H*_{0},

*LC*_{t},

*PE*_{t},

*PH*_{t},

*PD*_{t},

*SO*_{t}, and

*SR*_{t}, are linearly transformed to the range [0, 1] (using the bounds displayed in ) and modeled using Beta regression models (

Ferrari and Cribari-Neto, 2004). In these models, the logit of the mean (

*μ*) is modeled as a function of covariates and the variance is equal to

*μ*(1 –

*μ*)/(

*η* + 1), where

*η* is a scalar dispersion parameter. In all regression models, there are separate intercepts for days 1 to 13, and common intercepts for days 14 to 21 and days 22-28; there are no interactions between covariates; the regression coefficients for time-varying covariates are assumed to be time-invariant as the dispersion parameters (in the Beta regression models).

| **Table 4**Bounds of selected variables |

4.3. Inference

We adopt a Bayesian approach to inference. We assume independently normally distributed mean zero, variance 100 priors on all the regression parameters. For each model, we built up a shrinkage prior on the time-varying intercept parameters as follows: the intercept on day one was assumed to be normally distributed with mean zero with variance 5, the intercept on day *t* (*t* = 2, . . ., 13) was assumed to be normally distributed with mean equal to the intercept on day *t* – 1 and variance *σ*^{2}, the common intercept for days 14 to 21 (days 22 to 28) was assume to be normally distributed with mean equal to the intercept on day 13 (day 21) and variance *σ*^{2}, and *σ*^{2} was assumed to be uniform on the interval [0, 5]. Finally, the dispersion parameters *η* in the Beta regression models were assumed to be uniform on the interval [0, 100].

The posterior sampling for the model parameters was performed in WinBUGS. Twelve-thousand iterations of which six-thousand were used for burn-in were performed. Trace plots of multiple chains were examined to check for convergence. For each iteration, we evaluated the G-computation formula by Monte-Carlo integration. The Monte-Carlo scheme is complicated by the fact that the integration requires us to sample from *dF*(*S*_{t}|*S*_{t–1} = 1, *A*_{t–1}, *H*_{t–1}, *U*_{t–1}, *H*_{0}), *dF*(*H*_{t}|*S*_{t} = 1, *A*_{t–1} = 1, *H*_{t–1}, *U*_{t–1}, *H*_{0}, *S*_{t} = 1) *dF*(*U*_{t}|*S*_{t} = 1, *A*_{t} = 1, *H*_{t}, *U*_{t–1}, *H*_{0}) instead of *dF*(*S*_{t}|*S*_{t–1} = 1, *H*_{t–1}, *U*_{t–1}, *H*_{0}), *dF*(*H*_{t}|*S*_{t} = 1, *H*_{t–1}, *U*_{t–1}, *H*_{0}), *dF*(*U*_{t}|*S*_{t} = 1, *H*_{t}, *U*_{t–1}, *H*_{0}), where we know that *A*_{t} is a deterministic function of *U*_{t}.

Our sampling scheme proceeds according to . For the sake of completeness, we document the procedure in each of the letter-marked boxes. In Box (A), the model in

(4.1) is used. The sampling in Box (B) proceeds as follows:

- If
*ICU*_{t–1} = 0, sample *HSP*_{t} from (4.2), else set *HSP*_{t} = 1. - If
*ICU*_{t–1} = 1 and *VNT*_{t–1} = 0, sample *ICU*_{t} from (4.3), elseif *R*_{t–1} = 1, set *ICU*_{t} = 1, else set *ICU*_{t} = 0. - If
*R*_{t–1} = 1, sample *VNT*_{t} from (4.4) else sample *VNT*_{t} from (4.5);

the sampling in Box (C) proceeds as follows:

- If
*R*_{t–1} = 1, sample *LC*_{t} from (4.6), else sample *LC*_{t} from (4.7);

the sampling in Box (D) proceeds as follows:

and sampling in Box (E) proceeds as follows:

- If
*R*_{t} = 1, sample *OF*_{t} from (4.20), elseif *R*_{t} = 0 and *R*_{t–1} = 1, sample *OF*_{t} from (4.21), elseif *R*_{t} = 0, *R*_{t–1} = 0, *ICU*_{t–1} = 1, sample *OF*_{t} from (4.22), else sample *OF*_{t} from (4.23).

This scheme is repeated 10,000 times and the resulting survival curves are averaged. As this is performed for each posterior draw from the model parameters, we obtain 6000 posterior draws of the survival curve of interest.

To evaluate, at least in part, the goodness of fit of our model, we can use a modification of the above procedure to obtain posterior draws for the distribution of the “observed” survival curve and compare it to the empirically observed survival curve. We note that we can express the “observed” survival curve as the following integral:

This suggests that we can obtain posterior draws by using the same algorithm above, except that, in Fugure 4, we omit the evaluation of whether

*A*_{t} = 1 or not.