Given the distribution of
28 and the first three assumptions above, the potential survival function at day
m (
m ≤ 28) is identified via the following G-computation formula (
Robins, 1986). The identification formula is derived as follows. By conditioning on
H0,
U0 and
S1 and
H1, we can write
Under sequential randomization,
By further conditioning on
U1,
S2 and
H2,
Using sequential randomization,
By continuing to condition sequentially in time and employing sequential randomization,
Finally, by employing stochastic consistency,
This latter multivariate integral has been called the “G-computation” integral or formula. To illustrate how the formula works, consider the following example.
4.1. Example
Consider the treatment of patients with ALI on the first two days. Suppose that (1) all patients survive day 1, (2) all patients are on assist-control or ‘at-risk’ on day one, (3) 70 percent of the patients on day 1 are treated with LTVV (A1 = 1) and the others with traditional ventilation (A1 = 0), (4) the assignment of treatment on day one is completely random, (5) organ failure on day 1 is highly associated with treatment on day 1, with patients on LTVV at much lower risk (20%) for organ failure as compared to those who are not (80%), (6) the risk of death on day 2 is higher for those with organ failure on day 1 and is lower for those treated with LTVV on day 1, (7) all surviving patients on day 2 are on assist-control ventilation on day 2, (8) treatment with LTVV on day 2 is positively associated with treatment with LTVV on day 1 and negatively associated with organ failure on day 1, (9) organ failure on day 2 is positively associated with organ failure on day 1 and negatively associated with treatment with LTVV on day 1 and (10) the risk of death on day 3 is higher for those with organ failure on day 2 and lower for those treated with LTVV on day 2. The distribution of the observed data is displayed in .
In this example, we are interested in comparing the probability of surviving past day 3 under the LTVV regime to (a) the overall probability of surviving past day 3 and (b) the probability of surviving past day 3 among patients who are observed to adhere to the LTVV regime. Using the G-computation formula and noting that
S1 = 1,

,

,
Ut =
OFt (
t = 1, 2), we obtain
In contrast, the overall three-day survival probability is 0.60 and the survival probability among those observed to adhere to the LTVV protocol is 0.844.
In this example, organ failure serves as a time-varying confounder. That is, organ failure on day 1 is affected by treatment on day 1 and predicts subsequent treatment. Organ failure on day 1 is also negatively related to survival on day 3. The three-day survival probability for those who adhere to the LTVV regime is higher than the counterfactual survival probability because those who adhere to therapy tend to have less organ failure than those who do not. The overall three-day survival probability is even lower because non-adherence to therapy is associated with greater organ failure and worse survival, regardless of organ failure status.
4.2. Additional Modeling Assumptions
The G-computation formula teaches us that the

is a functional of the distribution of
28. We can write the likelihood of
28 as follows:
In our analysis, we assume that a patient's status on day
t only depends on his most recent history and the baseline APACHE III score (
H0). Thus, the likelhood can be written as:
Further modeling restrictions result from structural features of our datasets as well as from the clinical judgement of our physician co-authors (DN and RB). Structurally, we know that all patients survive day one and whenever a patient is on ventilator on day
t – 1, she either dies or remains in the hospital and in the ICU at time
t. Also, once a patient leaves the hospital (ICU), she does not return the hospital (ICU). With these considerations, we write
where
PDt =
PPt –
PEt, conditioning on
Rt,
HSPt,
ICUt or
VNTt implies that
St = 1, conditioning on
ICUt or
VNTt implies
HSPt = 1 and conditioning on
VNTt implies
ICUt = 1. It is important to note that we model
PDt instead of
PPt since we are conditioning on
PEt and that
VTt does not appear in the likelihood because it is a deterministic function of
LCt and
PDt (i.e.,
VTt =
LCtPDt). The distributions on the right hand side of the above three displays are modeled with separate regression models. The binary outcomes,
St,
HSPt,
ICUt,
VNTt,
DSt and
OFt, are modeled using logistic regression. The continuous variables,
H0,
LCt,
PEt,
PHt,
PDt,
SOt, and
SRt, are linearly transformed to the range [0, 1] (using the bounds displayed in ) and modeled using Beta regression models (
Ferrari and Cribari-Neto, 2004). In these models, the logit of the mean (
μ) is modeled as a function of covariates and the variance is equal to
μ(1 –
μ)/(
η + 1), where
η is a scalar dispersion parameter. In all regression models, there are separate intercepts for days 1 to 13, and common intercepts for days 14 to 21 and days 22-28; there are no interactions between covariates; the regression coefficients for time-varying covariates are assumed to be time-invariant as the dispersion parameters (in the Beta regression models).
| Table 4Bounds of selected variables |
4.3. Inference
We adopt a Bayesian approach to inference. We assume independently normally distributed mean zero, variance 100 priors on all the regression parameters. For each model, we built up a shrinkage prior on the time-varying intercept parameters as follows: the intercept on day one was assumed to be normally distributed with mean zero with variance 5, the intercept on day t (t = 2, . . ., 13) was assumed to be normally distributed with mean equal to the intercept on day t – 1 and variance σ2, the common intercept for days 14 to 21 (days 22 to 28) was assume to be normally distributed with mean equal to the intercept on day 13 (day 21) and variance σ2, and σ2 was assumed to be uniform on the interval [0, 5]. Finally, the dispersion parameters η in the Beta regression models were assumed to be uniform on the interval [0, 100].
The posterior sampling for the model parameters was performed in WinBUGS. Twelve-thousand iterations of which six-thousand were used for burn-in were performed. Trace plots of multiple chains were examined to check for convergence. For each iteration, we evaluated the G-computation formula by Monte-Carlo integration. The Monte-Carlo scheme is complicated by the fact that the integration requires us to sample from dF(St|St–1 = 1, At–1, Ht–1, Ut–1, H0), dF(Ht|St = 1, At–1 = 1, Ht–1, Ut–1, H0, St = 1) dF(Ut|St = 1, At = 1, Ht, Ut–1, H0) instead of dF(St|St–1 = 1, Ht–1, Ut–1, H0), dF(Ht|St = 1, Ht–1, Ut–1, H0), dF(Ut|St = 1, Ht, Ut–1, H0), where we know that At is a deterministic function of Ut.
Our sampling scheme proceeds according to . For the sake of completeness, we document the procedure in each of the letter-marked boxes. In Box (A), the model in
(4.1) is used. The sampling in Box (B) proceeds as follows:
- If ICUt–1 = 0, sample HSPt from (4.2), else set HSPt = 1.
- If ICUt–1 = 1 and VNTt–1 = 0, sample ICUt from (4.3), elseif Rt–1 = 1, set ICUt = 1, else set ICUt = 0.
- If Rt–1 = 1, sample VNTt from (4.4) else sample VNTt from (4.5);
the sampling in Box (C) proceeds as follows:
- If Rt–1 = 1, sample LCt from (4.6), else sample LCt from (4.7);
the sampling in Box (D) proceeds as follows:
and sampling in Box (E) proceeds as follows:
- If Rt = 1, sample OFt from (4.20), elseif Rt = 0 and Rt–1 = 1, sample OFt from (4.21), elseif Rt = 0, Rt–1 = 0, ICUt–1 = 1, sample OFt from (4.22), else sample OFt from (4.23).
This scheme is repeated 10,000 times and the resulting survival curves are averaged. As this is performed for each posterior draw from the model parameters, we obtain 6000 posterior draws of the survival curve of interest.
To evaluate, at least in part, the goodness of fit of our model, we can use a modification of the above procedure to obtain posterior draws for the distribution of the “observed” survival curve and compare it to the empirically observed survival curve. We note that we can express the “observed” survival curve as the following integral:
This suggests that we can obtain posterior draws by using the same algorithm above, except that, in Fugure 4, we omit the evaluation of whether
At = 1 or not.