

Int J Biostat. 2011 January 1; 7(1): 33.

Published online 2011 September 2. doi: 10.2202/1557-4679.1351

PMCID: PMC3204669

Eric J. Tchetgen Tchetgen, Harvard University;

Copyright © 2011 The Berkeley Electronic Press. All rights reserved


Suppose that having established a marginal total effect of a point exposure on a time-to-event outcome, an investigator wishes to decompose this effect into its direct and indirect pathways, also known as natural direct and indirect effects, mediated by a variable known to occur after the exposure and prior to the outcome. This paper proposes a theory of estimation of natural direct and indirect effects in two important semiparametric models for a failure time outcome. The underlying survival model for the marginal total effect and thus for the direct and indirect effects, can either be a marginal structural Cox proportional hazards model, or a marginal structural additive hazards model. The proposed theory delivers new estimators for mediation analysis in each of these models, with appealing robustness properties. Specifically, in order to guarantee ignorability with respect to the exposure and mediator variables, the approach, which is multiply robust, allows the investigator to use several flexible working models to adjust for confounding by a large number of pre-exposure variables. Multiple robustness is appealing because it only requires a subset of working models to be correct for consistency; furthermore, the analyst need not know which subset of working models is in fact correct to report valid inferences. Finally, a novel semiparametric sensitivity analysis technique is developed for each of these models, to assess the impact on inference, of a violation of the assumption of ignorability of the mediator.

Suppose that, upon establishing a marginal total effect of a point exposure on an outcome of interest, an investigator wishes to decompose this effect into its direct and indirect pathways, also known as natural or pure direct and indirect effects, mediated by a variable known to occur after the exposure and prior to the outcome (Robins and Greenland, 1992, Pearl, 2001). The literature on statistical methods for causal mediation analysis has blossomed in recent years with new results on identification of direct and indirect effects, and a number of novel techniques for obtaining statistical inferences about these effects (van der Laan and Petersen, 2005, VanderWeele, 2009, Imai et al., 2010a,b, Lange and Hansen, 2011, VanderWeele, 2011, Tchetgen Tchetgen and Shpitser, 2011a,b). With the exception of Tein and Mackinnon (2003), and the recent paper by Lange and Hansen (2011) and the accompanying commentary by VanderWeele (2011), which consider a survival context, the existing literature on causal mediation analysis has largely focused on structural models for a mean effect. The current paper aims to further develop methodology for mediation analysis for survival data. Specifically, we propose a general theory of estimation of natural direct and indirect effects for two important semiparametric models of a failure time outcome. We assume that the underlying survival model for the marginal total effect, and thus for the direct and indirect effects, can either be a marginal structural Cox proportional hazards model as in Robins (1998), or a marginal structural additive hazards model. Lange and Hansen (2011) were the first to consider the use of the additive hazards model for causal mediation analysis in a survival context, whereas Tein and Mackinnon (2003) and VanderWeele (2011) consider the use of a Cox proportional hazards model for mediation analysis.

The current paper aims to extend these existing results in several important ways. First, we develop new semiparametric estimators of direct and indirect effects for each of these models, with appealing robustness properties. Specifically, the proposed approach, which is multiply robust, allows the investigator to use several flexible working models in order to adjust for a possibly large number of pre-exposure confounders for both exposure and mediating variables. Multiple robustness is appealing because it only requires a subset of these working models to be correct for unbiasedness (more precisely, for consistency); furthermore, the analyst need not know which subset of working models is in fact correct to report valid inferences. Finally, a novel semiparametric sensitivity analysis technique is also developed for each model, to assess the impact on mediation inferences of a violation of the assumption of ignorability of the mediating variable. This is an important contribution in its own right, particularly because no methodology currently exists for performing a sensitivity analysis for unmeasured confounding in the current survival context.

The theory developed in this paper closely parallels similar theory recently proposed by Tchetgen Tchetgen and Shpitser (2011a,b), who were the first to formalize the semiparametric theory for making multiply robust inferences about natural direct and indirect effects of the exposure on the mean of the outcome. In Section 2, we adapt their previous results to obtain multiply robust inferences about natural direct and indirect effects of a binary exposure on the marginal survival curve in the presence of confounding and right censoring. Because the previous theory does not directly apply to semiparametric regression models for survival data, new methodology is developed in Section 3 for obtaining multiply robust inferences about natural direct effects under a Cox proportional hazards model and an additive hazards model. Then, we develop similar multiply robust estimators of natural indirect effects for each model. Finally, Section 4 gives new results on semiparametric sensitivity analysis in a survival context.

First we introduce some notation. Throughout, we suppose that independent and identically distributed data on a vector (*E*, *M*, *X*, *T**, Δ) are collected for *n* subjects. Here, *E* is the binary exposure variable, *M* is a mediator variable with support $\mathcal{S}$, known to occur subsequently to *E* and prior to *T**, and *X* is a vector of pre-exposure variables with support $\mathcal{X}$ that confound the association between (*E*, *M*) and the underlying failure time of interest *T*. Because of censoring, we observe Δ = *I*(*T* ≤ *C*) and *T** = min(*T*, *C*) where *C* denotes an individual’s right censoring time. Unless stated otherwise, we assume that conditional on *E*, censoring is independent of (*M*, *X*, *T*), although we show in Section 4 how this latter assumption can be relaxed. To limit the amount of unmeasured confounding, we suppose that *X* contains several variables, and thus is likely of moderate to high dimension. We assume that for each level {*E* = *e*, *M* = *m*}, there exists a counterfactual variable *T*_{e,m} corresponding to the outcome had, possibly contrary to fact, the exposure and mediator variables taken the value (*e*, *m*).

Although the paper focuses on a binary exposure, we note that the extension to a polytomous exposure is trivially deduced from the exposition.

Let *D**(*t*) denote *I*(*T** ≥ *t*), *D*(*t*) denote *I*(*T* ≥ *t*) and define the corresponding counterfactual at risk process *D*_{em}(*t*) = *I*(*T*_{e,m} ≥ *t*).

$$\stackrel{\text{total}\hspace{0.17em}\text{effect}}{\overbrace{{S}_{1{M}_{1}}(t)-{S}_{0{M}_{0}}(t)}}=\stackrel{\text{natural}\hspace{0.17em}\text{indirect}\hspace{0.17em}\text{effect}}{\overbrace{{S}_{1{M}_{1}}(t)-{S}_{1{M}_{0}}(t)}}+\underset{\text{natural}\hspace{0.17em}\text{direct}\hspace{0.17em}\text{effect}}{\underbrace{{S}_{1{M}_{0}}(t)-{S}_{0{M}_{0}}(t)}}$$

(1)

As shown in the display above, the natural direct effect captures the effect of the exposure when one intervenes to set the mediator to the (random) level it would have been in the absence of exposure (Robins and Greenland, 1992, Pearl 2001). Such an effect generally differs from the controlled direct effect which refers to the exposure effect that arises upon intervening to set the mediator to a fixed level that may differ from its actual observed value (Robins and Greenland, 1992, Pearl, 2001, Robins, 2003). As noted by Pearl (2001), controlled direct and indirect effects are particularly relevant for policy making whereas natural direct and indirect effects are more useful for understanding the underlying mechanism by which the exposure operates.
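The decomposition in equation (1) can be illustrated numerically. The following is a minimal sketch, assuming the three counterfactual survival probabilities at a fixed time *t* are already available; the function name and the numerical values are hypothetical and are not taken from the paper.

```python
def decompose_survival_effect(s_1m1, s_1m0, s_0m0):
    """Split the total effect on the survival scale as in equation (1).

    s_1m1 : S_{1 M_1}(t), survival under exposure with the mediator at its exposed level
    s_1m0 : S_{1 M_0}(t), survival under exposure with the mediator at its unexposed level
    s_0m0 : S_{0 M_0}(t), survival without exposure
    """
    return {
        "total": s_1m1 - s_0m0,
        "indirect": s_1m1 - s_1m0,
        "direct": s_1m0 - s_0m0,
    }


# Hypothetical survival probabilities at a single time t (illustration only).
effects = decompose_survival_effect(s_1m1=0.75, s_1m0=0.70, s_0m0=0.62)
```

By construction the natural indirect and direct effects add up to the total effect, whatever values are supplied.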

Identification of natural direct and indirect effects requires additional assumptions. To proceed, we make the consistency assumption also known as the stable unit treatment value assumption (SUTVA):

$$\begin{array}{c}\text{if}\hspace{0.17em}E=e,\hspace{0.17em}\text{then}\hspace{0.17em}{M}_{e}=M\hspace{0.17em}\text{w.p.}1\\ \text{and}\hspace{0.17em}\text{if}\hspace{0.17em}E=e\hspace{0.17em}\text{and}\hspace{0.17em}M=m\hspace{0.17em}\text{then}\hspace{0.17em}{T}_{e,m}=T\hspace{0.17em}\text{w.p.}1\end{array}$$

In addition, we adopt the sequential ignorability assumption of Imai et al (2010) which states that for *e*, *e′* ∈ {0, 1}:

$$\{{T}_{{e}^{\prime},m},{M}_{e}\}\coprod E|X$$

(2)

$${T}_{{e}^{\prime}m}\coprod M|E=e,X$$

(3)

paired with a standard positivity assumption:

$$\begin{array}{c}{f}_{M|E,X}\hspace{0.17em}(m|E,X)\hspace{0.17em}>0\hspace{0.17em}\text{w.p.}1\hspace{0.17em}\text{for}\hspace{0.17em}\text{each}\hspace{0.17em}m\in \mathcal{S}\\ \text{and}\hspace{0.17em}{f}_{E|X}\hspace{0.17em}(e|X)\hspace{0.17em}>0\hspace{0.17em}\text{w.p.}1\hspace{0.17em}\text{for}\hspace{0.17em}\text{each}\hspace{0.17em}e\in \{0,1\}\end{array}$$

where *f*_{M|E,X} is the density of [*M*|*E*, *X*] and *f*_{E|X} is the density of [*E*|*X*]. Then, under the consistency assumption, the first part of the sequential ignorability assumption (2) and the positivity assumption, one can show that *S*_{eMe}(*t*) is identified by the g-formula of Robins (1997); under the additional assumption given by the second part of the sequential ignorability assumption (3), one can further show as in Imai et al (2010a), that:

$$\begin{array}{l}{S}_{1{M}_{0}}(t)={\theta}_{t}\\ =\underset{\mathcal{S}\times \mathcal{X}}{\iint}{S}_{T|E,M,X}(t|E=1,M=m,X=x){f}_{M|E,X}(m|E=0,X=x){f}_{X}(x)d\mu (m,x)\end{array}$$

(4)

where *f*_{M|E,X} and *f*_{X} are respectively the conditional density of the mediator *M* given (*E*, *X*) and the density of *X*.

Theorem 1 of Tchetgen Tchetgen and Shpitser (2011a) implies that in order to obtain a consistent and asymptotically normal (CAN) estimator of the functional displayed in equation (4) and thus a CAN estimator of *S*_{1M0} (*t*) under the three assumptions given above, one must consistently estimate a subset of the following quantities {*S*_{T|E,M,X}, *f*_{M|E,X}, *f*_{E|X}}. Thus, let {*Ŝ*_{T|E,M,X}, *f̂*_{M|E,X}, *f̂*_{E|X}} denote estimates of these required quantities, based on standard parametric or semiparametric working models for regression and density estimation. Because of the curse of dimensionality due to a high dimensional *X*, nonparametric methods for estimating these quantities are likely impractical for the sample sizes encountered in practice, and thus parametric or semiparametric models must be used. We emphasize that these three models are not of primary scientific interest but, as later demonstrated, are needed for making inferences about mediation.

In principle, one could simply evaluate the functional under the estimated model to obtain the maximum likelihood estimator (MLE):

$${\widehat{\theta}}_{t}^{\mathit{tm}}={\mathbb{P}}_{n}\underset{\mathcal{S}}{\int}{\widehat{S}}_{T|E,M,X}\hspace{0.17em}(t|E=1,M=m,X)\hspace{0.17em}{\widehat{f}}_{M|E,X}\hspace{0.17em}(m|E=0,X)\hspace{0.17em}d\mu (m)$$

where ℙ_{n} [·] = *n*^{−1} ∑_{i} [·]_{i}. However, one should then be concerned that mis-specification of the model for either *Ŝ*_{T|E,M,X} or *f̂*_{M|E,X} would likely lead to biased estimates of direct and indirect effects. Note that the MLE does not rely on a model for *f*_{E|X} and thus is completely robust to a mis-specified, and thus likely biased, estimate *f̂*_{E|X}. Two alternative estimators can be constructed, by essentially following the approach of Tchetgen Tchetgen and Shpitser (2011a) for mean effects, that respectively use {*Ŝ*_{T|E,M,X}, *f̂*_{E|X}} only and {*f̂*_{M|E,X}, *f̂*_{E|X}} only, and that are therefore respectively robust to mis-specification of *f̂*_{M|E,X} and *Ŝ*_{T|E,M,X}. Indeed, in the first case, one could use:

$${\widehat{\theta}}_{t}^{\mathit{te}}={\mathbb{P}}_{n}\left\{\frac{I(E=0)}{{\widehat{f}}_{E|X}(0|X)}{\widehat{S}}_{T|E,M,X}(t|E=1,M,X)\right\}$$

and in the second case one could use:

$${\widehat{\theta}}_{t}^{\mathit{em}}={\mathbb{P}}_{n}\left\{\frac{\mathrm{\Delta}}{{\widehat{S}}_{C|E}({T}^{*-}|E=1)}\frac{I(E=1)I({T}^{*}\ge t)}{{\widehat{f}}_{E|X}(E|X)}\frac{{\widehat{f}}_{M|E,X}(M|E=0,X)}{{\widehat{f}}_{M|E,X}(M|E,X)}\right\}$$

where *Ŝ*_{C|E} (*T**^{−}|*E* = *e*) denotes the exposure arm specific Kaplan-Meier estimator of the survival curve of censoring.
The weights ${\widehat{S}}_{C|E}^{-1}({T}^{*-}|E=1)$ are needed here to correctly account for censoring (Robins and Rotnitzky, 1992, Satten and Datta, 2001) under the current assumption that censoring is ignorable conditional on *E*, together with an additional standard positivity assumption for the censoring mechanism (Robins and Rotnitzky, 1992). Unfortunately, as was the case for the MLE, these alternative estimators are likely severely biased if any of the required working models is incorrect.
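The three simple estimators above can be sketched in a few lines for the special case of a binary mediator. This is only an illustration under stated assumptions: the fitted nuisance quantities are supplied as per-subject lists rather than estimated here, the function and argument names are hypothetical, and the toy data are invented.

```python
def simple_estimators(t, E, M, T_obs, delta, surv_1m, pm_0, pm_1, pe, cens_surv):
    """Return (theta_tm, theta_te, theta_em) for a binary mediator M in {0, 1}.

    surv_1m[i][m] : fitted S(t | E=1, M=m, X_i)
    pm_0[i]       : fitted P(M=1 | E=0, X_i)
    pm_1[i]       : fitted P(M=1 | E=1, X_i)
    pe[i]         : fitted P(E=1 | X_i)
    cens_surv[i]  : fitted censoring survival S_C(T*_i- | E=1)
    """
    n = len(E)
    tm = te = em = 0.0
    for i in range(n):
        # Plug-in estimator: sum over m of S(t | 1, m, X) f(m | E=0, X).
        tm += surv_1m[i][0] * (1 - pm_0[i]) + surv_1m[i][1] * pm_0[i]
        # Outcome and exposure models only: unexposed subjects, weighted by 1 / f(0 | X).
        if E[i] == 0:
            te += surv_1m[i][M[i]] / (1 - pe[i])
        # Mediator and exposure models only: uncensored exposed subjects still at risk at t.
        if E[i] == 1 and delta[i] == 1 and T_obs[i] >= t:
            f_m0 = pm_0[i] if M[i] == 1 else 1 - pm_0[i]
            f_m1 = pm_1[i] if M[i] == 1 else 1 - pm_1[i]
            em += f_m0 / (f_m1 * pe[i] * cens_surv[i])
    return tm / n, te / n, em / n


# Invented toy data: four subjects, constant fitted nuisance values.
E, M = [1, 1, 0, 0], [0, 1, 0, 1]
T_obs, delta = [5, 1, 3, 4], [1, 1, 1, 0]
surv_1m = [[0.8, 0.6]] * 4
pm_0 = pm_1 = [0.25] * 4
pe, cens_surv = [0.5] * 4, [1.0] * 4
theta_tm, theta_te, theta_em = simple_estimators(
    2, E, M, T_obs, delta, surv_1m, pm_0, pm_1, pe, cens_surv
)
```

With these inputs the three estimates differ (0.75, 0.7 and 0.5). Each one leans on a different pair of working models, so on real data disagreement of this kind may point to mis-specification of at least one of them.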

Theorem 2 of Tchetgen Tchetgen and Shpitser (2011a) allows us to partially resolve this potential difficulty, by providing a basic roadmap for constructing an estimator *θ̂*_{t} that is CAN provided that one, but not necessarily all three, of the following conditions holds:

- the estimates of the conditional survival probability *Ŝ*_{T|E,M,X} and of the conditional density of the mediator *f̂*_{M|E,X} are consistent;
- the estimates of the conditional survival probability *Ŝ*_{T|E,M,X} and of the conditional density of the exposure *f̂*_{E|X} are consistent;
- the estimates of the conditional densities of the exposure and mediator variables are consistent.

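Clearly, such an estimator *θ̂*_{t} should generally be preferred to
${\widehat{\theta}}_{t}^{\mathit{tm}}$,
${\widehat{\theta}}_{t}^{\mathit{te}}$ and
${\widehat{\theta}}_{t}^{\mathit{em}}$ because an inference using *θ̂*_{t} remains valid if any one of the three conditions above holds. The multiply robust estimator is given by: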

$${\widehat{\theta}}_{t}={\mathbb{P}}_{n}\left[\begin{array}{c}\frac{\mathrm{\Delta}}{{\widehat{S}}_{C|E}({T}^{*-}|E=1)}\frac{I(E=1)}{{\widehat{f}}_{E|X}(E|X)}\frac{{\widehat{f}}_{M|E,X}(M|E=0,X)}{{\widehat{f}}_{M|E,X}(M|E,X)}\\ \times \left\{I({T}^{*}\ge t)-{\widehat{S}}_{T|E,M,X}(t|E=1,M,X)\right\}\\ +\frac{I(E=0)}{{\widehat{f}}_{E|X}(0|X)}\left\{{\widehat{S}}_{T|E,M,X}(t|E=1,M,X)-{\widehat{\eta}}_{t}(1,0,X)\right\}\\ +{\widehat{\eta}}_{t}(1,0,X)\end{array}\right]$$

where

$${\widehat{\eta}}_{t}(1,0,X)=\underset{\mathcal{S}}{\int}{\widehat{S}}_{T|E,M,X}(t|E=1,M=m,X){\widehat{f}}_{M|E,X}(m|E=0,X)d\mu (m)$$
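The multiply robust estimator displayed above combines an inverse-weighted residual, an augmentation term for unexposed subjects, and the plug-in term. A minimal sketch for a binary mediator follows. It assumes the fitted nuisance quantities are passed in as per-subject lists; all names and the toy data are hypothetical.

```python
def multiply_robust_theta(t, E, M, T_obs, delta, surv_1m, pm_0, pm_1, pe, cens_surv):
    """Multiply robust estimate of S_{1 M_0}(t) for a binary mediator M in {0, 1}.

    surv_1m[i][m] : fitted S(t | E=1, M=m, X_i)
    pm_e[i]       : fitted P(M=1 | E=e, X_i), for e = 0, 1
    pe[i]         : fitted P(E=1 | X_i)
    cens_surv[i]  : fitted censoring survival S_C(T*_i- | E=1)
    """
    n = len(E)
    total = 0.0
    for i in range(n):
        surv_obs = surv_1m[i][M[i]]  # S(t | E=1, M_i, X_i)
        # eta_t(1, 0, X_i): S(t | 1, m, X_i) averaged over f(m | E=0, X_i).
        eta = surv_1m[i][0] * (1 - pm_0[i]) + surv_1m[i][1] * pm_0[i]
        contrib = eta
        # First term: weighted residual for uncensored exposed subjects.
        if E[i] == 1 and delta[i] == 1:
            f_m0 = pm_0[i] if M[i] == 1 else 1 - pm_0[i]
            f_m1 = pm_1[i] if M[i] == 1 else 1 - pm_1[i]
            weight = f_m0 / (f_m1 * pe[i] * cens_surv[i])
            at_risk = 1.0 if T_obs[i] >= t else 0.0
            contrib += weight * (at_risk - surv_obs)
        # Second term: correction contributed by unexposed subjects.
        if E[i] == 0:
            contrib += (surv_obs - eta) / (1 - pe[i])
        total += contrib
    return total / n


# Invented toy data: four subjects, constant fitted nuisance values.
E, M = [1, 1, 0, 0], [0, 1, 0, 1]
T_obs, delta = [5, 1, 3, 4], [1, 1, 1, 0]
theta_mr = multiply_robust_theta(
    2, E, M, T_obs, delta,
    surv_1m=[[0.8, 0.6]] * 4, pm_0=[0.25] * 4, pm_1=[0.25] * 4,
    pe=[0.5] * 4, cens_surv=[1.0] * 4,
)
```

In this sketch the two correction terms have mean zero when the corresponding working models are right, which is the mechanism behind the robustness property described in the text.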

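*θ̂*_{t} may in turn be combined as in Tchetgen Tchetgen and Shpitser (2011a) with an existing doubly robust estimator of the g-formula for *S*_{eMe}(*t*), *e* ∈ {0, 1}, to obtain multiply robust estimators of the natural direct and indirect effects in equation (1).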

In this section, we consider the estimation of natural direct effects under two alternative structural models for the total effect of exposure: a Cox proportional hazards model (Cox PH) and an additive hazards model.

The first model posits a Cox PH regression for the average total effect of the exposure, that is

$${\lambda}_{{T}_{e}}(t)={\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}e)$$

where λ_{Te} (*t*) denotes an individual’s average hazard of experiencing an event at time *t*, had possibly contrary to fact, the person been exposed to *E* = *e*, and *β*_{c} encodes on the log-hazards scale, the total causal effect of exposure. As in VanderWeele (2011), one can decompose exp(*β*_{c}) as follows:

$$\frac{{\lambda}_{{T}_{1}}(t)}{{\lambda}_{{T}_{0}}(t)}=\stackrel{\text{total}\hspace{0.17em}\text{effect}}{\overbrace{\frac{{\lambda}_{{T}_{1{M}_{1}}}(t)}{{\lambda}_{{T}_{0{M}_{0}}}(t)}}}=\stackrel{\text{natural}\hspace{0.17em}\text{indirect}\hspace{0.17em}\text{effect}}{\overbrace{\frac{{\lambda}_{{T}_{1{M}_{1}}}(t)}{{\lambda}_{{T}_{1{M}_{0}}}(t)}}}\times \underset{\text{natural}\hspace{0.17em}\text{direct}\hspace{0.17em}\text{effect}}{\underbrace{\frac{{\lambda}_{{T}_{1{M}_{0}}}(t)}{{\lambda}_{{T}_{0{M}_{0}}}(t)}}}$$

(5)

For estimation, whereas VanderWeele (2011) requires an additional assumption that the outcome is rare over the entire follow-up period, here we make no such rare disease assumption. However, it is assumed that the natural direct hazards ratio, and thus the indirect hazards ratio, both agree with the proportional hazards assumption of the total effect; specifically, we assume that

$${\lambda}_{{T}_{e{M}_{0}}}(t)={\lambda}_{{T}_{0{M}_{0}}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e)$$

follows a Cox PH model where ${\beta}_{c}^{\mathit{dir}}$ represents the direct effect of exposure, and similarly

$${\lambda}_{{T}_{1{M}_{e}}}(t)={\lambda}_{{T}_{1{M}_{0}}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{ind}}e)$$

where ${\beta}_{c}^{\mathit{ind}}$ represents the indirect effect of exposure. This is an additional assumption, since although unlikely in practice, in principle both direct and indirect effects could be functions of time in such a way that they combine to produce a time-constant total effect on the hazards ratio scale. Next, we describe some procedures for estimating the direct effect parameter ${\beta}_{c}^{\mathit{dir}}$.

Our first result generalizes the simple weighted strategy that previously gave
${\widehat{\theta}}_{t}^{\mathit{em}}$; it relies on the assumption that {*f̂*_{M|E,X}, *f̂*_{E|X}} converges to the truth, and it does not use *Ŝ*_{T|E,M,X}.

*Theorem 1: Under the consistency, sequential ignorability and positivity assumptions,*
${U}^{w}({\beta}_{c}^{\mathit{dir}})$ *is an unbiased estimating function for*
${\beta}_{c}^{\mathit{dir}}$, *where*

$${U}^{w}({\beta}_{c}^{\mathit{dir}})={U}^{w}({\beta}_{c}^{\mathit{dir}};{f}_{M|E,X},{f}_{E|X})=\int d{N}^{*}(t)W\left[E-\frac{{\xi}_{1}(t;{\beta}_{c}^{\mathit{dir}})}{{\xi}_{2}(t;{\beta}_{c}^{\mathit{dir}})}\right],$$

(6)

*with*

$$\begin{array}{c}{\xi}_{1}(t;{\beta}_{c}^{\mathit{dir}})=\mathbb{E}\{{D}^{*}(t)W\hspace{0.17em}E\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\},\\ {\xi}_{2}(t;{\beta}_{c}^{\mathit{dir}})=\mathbb{E}\{{D}^{*}(t)W\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\},\\ W=\frac{{f}_{M|E,X}(M|E=0,X)}{{f}_{E|X}(E|X){f}_{M|E,X}(M|E,X)}\end{array}$$

*and N** (*t*) = *I*(*T** ≤ *t*, Δ = 1) *is the counting process of an observed failure time. Thus,*
${\beta}_{c}^{\mathit{dir}}$ *is the solution of the equation:*

$$\mathbb{E}\{{U}^{w}({\beta}_{c}^{\mathit{dir}})\}=0$$

The proof of Theorem 1 is provided in the appendix; the result motivates the estimator ${\tilde{\beta}}_{c}^{\mathit{dir}}$ that solves:

$${\mathbb{P}}_{n}\left\{{\widehat{U}}^{w}\left({\tilde{\beta}}_{c}^{\mathit{dir}}\right)\right\}=0$$

where ${\widehat{U}}^{w}({\beta}_{c}^{\mathit{dir}})$ is given by

$$\int d{N}^{*}(t)\widehat{W}\left[E-\frac{{\mathbb{P}}_{n}\left\{{D}^{*}(t)\widehat{W}\hspace{0.17em}E\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\right\}}{{\mathbb{P}}_{n}\left\{{D}^{*}(t)\widehat{W}\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\right\}}\right]$$

(7)

with

$$\begin{array}{c}{\widehat{\xi}}_{1}(t)={\mathbb{P}}_{n}\left\{{D}^{*}(t)\widehat{W}\hspace{0.17em}E\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\right\},\\ {\widehat{\xi}}_{2}(t)={\mathbb{P}}_{n}\left\{{D}^{*}(t)\widehat{W}\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\right\},\end{array}$$

and *Ŵ* defined as *W* under {*f̂*_{M|E,X}, *f̂*_{E|X}}. Thus, under the key assumption that {*f̂*_{M|E,X}, *f̂*_{E|X}} is consistent (and converges in probability at rates faster than *n*^{−1/4}, see Newey (1994)), and under further standard regularity conditions,
${\tilde{\beta}}_{c}^{\mathit{dir}}$ is CAN with asymptotic variance that can be obtained by a standard Taylor expansion, or more conveniently by the nonparametric bootstrap.

This simple weighting strategy for estimating
${\beta}_{c}^{\mathit{dir}}$ holds appeal in that it is easy to implement in standard statistical software packages for survival analysis. This is because equation (7) is a modified score equation for the partial likelihood of a marginal Cox proportional hazards model (which is recovered by setting the weights *Ŵ* = 1), and thus the modification mainly entails setting non-unity weights, which can be done in most software packages for Cox regression analysis. For instance,
${\tilde{\beta}}_{c}^{\mathit{dir}}$ can be obtained using PROC PHREG in SAS with the WEIGHT statement to incorporate the individual specific weight *Ŵ*. However, as we have mentioned before, in the event that either *f̂*_{M|E,X} or *f̂*_{E|X} fails to converge to the truth,
${\tilde{\beta}}_{c}^{\mathit{dir}}$ will generally be biased. Thus we propose an alternative approach to estimate
${\beta}_{c}^{\mathit{dir}}$.
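To make the weighted score in equation (7) concrete, the sketch below solves it directly for a binary exposure. It assumes the weights *Ŵ* have already been computed from fitted exposure and mediator models. The function names are hypothetical, and bisection is used here only for simplicity; it is not a procedure prescribed by the paper.

```python
import math


def weighted_cox_score(beta, T_obs, delta, E, W):
    """Weighted partial-likelihood score of equation (7) for a scalar exposure E."""
    score = 0.0
    for i in range(len(T_obs)):
        if delta[i] != 1:
            continue  # only observed failures contribute through dN*(t)
        num = den = 0.0
        for j in range(len(T_obs)):
            if T_obs[j] >= T_obs[i]:  # subject j is at risk at time T*_i
                r = W[j] * math.exp(beta * E[j])
                num += r * E[j]
                den += r
        score += W[i] * (E[i] - num / den)
    return score


def solve_weighted_cox(T_obs, delta, E, W, lo=-10.0, hi=10.0, n_iter=100):
    """Find the root of the score by bisection; the score is decreasing in beta."""
    if weighted_cox_score(lo, T_obs, delta, E, W) <= 0 or \
            weighted_cox_score(hi, T_obs, delta, E, W) >= 0:
        raise ValueError("score does not change sign on [lo, hi]")
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if weighted_cox_score(mid, T_obs, delta, E, W) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Invented toy data with unit weights, which recovers the ordinary Cox score.
beta_dir = solve_weighted_cox(
    T_obs=[1.0, 2.0, 3.0, 4.0], delta=[1, 1, 1, 1], E=[1, 0, 1, 0], W=[1.0] * 4
)
```

For this toy data set the score reduces to 2/(1 + e^β) − e^β/(e^β + 2), whose root is β = log{(1 + √17)/2}. In practice the same estimate would more typically come from a weighted Cox routine in a statistics package, as the text notes for SAS.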

We proceed by first finding a modification to equation (6) that delivers the desired robustness property. Note that because both quantities *ξ*_{1} (*t*) and *ξ*_{2} (*t*) in equation (6) involve *W*, unbiased estimation (more precisely consistent estimation) of these two functions effectively requires correct models for {*f*_{M|E,X}, *f*_{E|X}}. Thus, a key step in developing a multiply robust estimator of
${\beta}_{c}^{\mathit{dir}}$ involves first finding multiply robust estimators of these two functions. In this vein, to further allow for generality, for any function of *E*, say *H* = *h*(*E*), let

$$\begin{array}{l}R(t,H;{\beta}_{c}^{\mathit{dir}})=R(t,H;{\beta}_{c}^{\mathit{dir}},{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E})\\ =\left\{{D}^{*}(t)-{S}_{C|E}(t|E){S}_{T|E,M,X}(t|E,M,X)\right\}W\hspace{0.17em}h(E)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\\ +\left\{\begin{array}{c}{\sum}_{e}\int {S}_{C|E}(t|E=e){S}_{T|E,M,X}(t|E=e,M=m,X)\\ {f}_{M|E,X}(m|E=0,X)h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e)d\mu (m)\end{array}\right\}\\ +\frac{I(E=0)}{{f}_{E|X}(E|X)}{\sum}_{e}{S}_{C|E}(t|E=e){S}_{T|E,M,X}(t|E=e,M,X)h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e)\\ -\frac{I(E=0)}{{f}_{E|X}(E|X)}\left[\begin{array}{c}{\sum}_{e}\int {S}_{C|E}(t|E=e){S}_{T|E,M,X}(t|E=e,M=m,X)\\ {f}_{M|E,X}(m|E=0,X)h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e)d\mu (m)\end{array}\right]\end{array}$$

and let

$$\begin{array}{lll}{\xi}_{j}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})\hfill & =\hfill & {\xi}_{j}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}},{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E})\hfill \\ \hfill & =\hfill & \mathbb{E}\left\{R(t,{H}_{j};{\beta}_{c}^{\mathit{dir}})\right\},j=1,2;\hfill \end{array}$$

where *H*_{1} = *E* and *H*_{2} = 1.

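Next, define *R*^{‡}(*t*, *H*_{j};
${\beta}_{c}^{\mathit{dir}}$) as *R*(*t*, *H*_{j}; ${\beta}_{c}^{\mathit{dir}}$) evaluated at $\{{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\}$.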

*Theorem 2: Under the consistency, sequential ignorability and positivity assumptions,*
${U}^{\mathit{mr}}({\beta}_{c}^{\mathit{dir}})={U}^{\mathit{mr}}\left({\beta}_{c}^{\mathit{dir}};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E}\right)$ *is an unbiased estimating function for*
${\beta}_{c}^{\mathit{dir}}$*, where*

$$\begin{array}{l}{U}^{\mathit{mr}}({\beta}_{c}^{\mathit{dir}})\\ =\int \left\{\begin{array}{c}d{N}^{*}(t)\\ +{S}_{C|E}(t|E)d{S}_{T|E,M,X}(t|E,M,X)\end{array}\right\}W\left\{E-\frac{{\xi}_{1}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})}{{\xi}_{2}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})}\right\}\\ -\int \int {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}\begin{array}{c}d{S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {S}_{C|E}(t|E=e){f}_{M|E,X}(m|E=0,X)\end{array}\\ \times \left\{e-\frac{{\xi}_{1}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})}{{\xi}_{2}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})}\right\}\end{array}\right]d\mu (m)\\ -\frac{I(E=0)}{{f}_{E|X}(E|X)}\int {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times d{S}_{T|E,M,X}(t|E=e,M,X)\end{array}\right\}\\ \times \left\{e-\frac{{\xi}_{1}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})}{{\xi}_{2}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})}\right\}\end{array}\right]\\ +\frac{I(E=0)}{{f}_{E|X}(E|X)}\int \int {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times d{S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=0,X)\end{array}\right\}\\ \times \left\{e-\frac{{\xi}_{1}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})}{{\xi}_{2}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})}\right\}\end{array}\right]d\mu (m)\end{array}$$

*Furthermore,*

$$\mathbb{E}\left\{{U}^{\mathit{mr}}\left({\beta}_{c}^{\mathit{dir}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0$$

(8)

*if one but not necessarily all three of the following conditions holds: either* {
${S}_{T|E,M,X}^{\ddagger}$,
${f}_{M|E,X}^{\ddagger}$} = {*S*_{T|E,M,X}, *f*_{M|E,X}} *or* {
${S}_{T|E,M,X}^{\ddagger}$,
${f}_{E|X}^{\ddagger}$} = {*S*_{T|E,M,X}, *f*_{E|X}}, *or* {
${f}_{M|E,X}^{\ddagger}$,
${f}_{E|X}^{\ddagger}$} = {*f*_{M|E,X}, *f*_{E|X}}.

According to Theorem 2, a multiply robust estimator ${\widehat{\beta}}_{c}^{\mathit{dir}}$ is obtained by solving the equation:

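$${\mathbb{P}}_{n}\left\{{\widehat{U}}^{\mathit{mr}}\left({\widehat{\beta}}_{c}^{\mathit{dir}};{\widehat{S}}_{T|E,M,X},{\widehat{f}}_{M|E,X},{\widehat{f}}_{E|X},{\widehat{S}}_{C|E}\right)\right\}=0$$

where *Û*^{mr} (·; ·, ·, ·, ·) is obtained by substituting the empirical averages ${\mathbb{P}}_{n}\{\widehat{R}(t,{H}_{j};{\beta}_{c}^{\mathit{dir}})\}$ for ${\xi}_{j}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}})$, *j* = 1, 2, in *U*^{mr}, with *R̂* defined as *R* under the estimated working models.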

To estimate the indirect log hazards ratio
${\beta}_{c}^{\mathit{ind}}$, we observe that by the decomposition given in equation (5),
${\beta}_{c}^{\mathit{ind}}={\beta}_{c}-{\beta}_{c}^{\mathit{dir}}$ where *β*_{c} is the total log hazards ratio, i.e. λ_{Te} (*t*) = λ_{T0} (*t*) exp(*β*_{c}*e*).

*Theorem 3: Suppose that*
${\beta}_{c}^{\mathit{dir}}$ *is known, then under the consistency, sequential ignorability and positivity assumptions,*
${V}^{\mathit{mr}}({\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})={V}^{\mathit{mr}}({\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E})$ *is an unbiased estimating function for*
${\beta}_{c}^{\mathit{ind}}$*, where*

$$\begin{array}{l}{V}^{\mathit{mr}}({\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})\\ =\int \left[\begin{array}{c}\left\{\begin{array}{c}d{N}^{*}(t)\\ +\int \left\{\begin{array}{c}{S}_{C|E}(t|E)\\ \times d{S}_{T|E,M,X}(t|E,M=m,X)\\ \times {f}_{M|E,X}(m|E,X)\end{array}\right\}d\mu (m)\end{array}\right\}\\ \begin{array}{c}\times {f}_{E|X}^{-1}(E|X)\\ \times \left\{E-\frac{{\vartheta}_{1}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})}{{\vartheta}_{2}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})}\right\}\end{array}\end{array}\right]\\ -\iint {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times d{S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=e,X)\\ \times \left\{e-\frac{{\vartheta}_{1}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})}{{\vartheta}_{2}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})}\right\}\end{array}\right]d\mu (m),\end{array}$$

*with*

$$\begin{array}{l}{\vartheta}_{j}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})={\vartheta}_{j}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}},{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E})\\ =\mathbb{E}\{G(t,{H}_{j};{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})\},j=1,2\\ G(t,H;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})=\\ \left\{\begin{array}{c}{D}^{*}(t)\\ -\int \left\{\begin{array}{c}{S}_{C|E}(t|E)\\ \times {S}_{T|E,M,X}(t|E,M=m,X)\\ \times {f}_{M|E,X}(m|E,X)\end{array}\right\}d\mu (m)\end{array}\right\}\\ \times {f}_{E|X}^{-1}(E|X)h(E)\hspace{0.17em}\text{exp}\{({\beta}_{c}^{\mathit{dir}}+{\beta}_{c}^{\mathit{ind}})E\}\\ +{\sum}_{e}\int \left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=e,X)\\ \times h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}\{({\beta}_{c}^{\mathit{dir}}+{\beta}_{c}^{\mathit{ind}})e\}d\mu (m)\end{array}\right\}\end{array}$$

*Furthermore,* (
${\beta}_{c}^{\mathit{dir}}$,
${\beta}_{c}^{\mathit{ind}}$) *solves*

$$\begin{array}{r}\hfill \mathbb{E}\left\{{V}^{\mathit{mr}}\left({\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0\\ \hfill \mathbb{E}\left\{{U}^{\mathit{mr}}\left({\beta}_{c}^{\mathit{dir}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0\end{array}$$

*if one but not necessarily all three of the following conditions holds: either* {
${S}_{T|E,M,X}^{\ddagger}$,
${f}_{M|E,X}^{\ddagger}$} = {*S*_{T|E,M,X}, *f*_{M|E, X}} *or* {
${S}_{T|E,M,X}^{\ddagger}$,
${f}_{E|X}^{\ddagger}$} = {*S*_{T|E,M,X}, *f*_{E|X}}, *or* {
${f}_{M|E,X}^{\ddagger}$,
${f}_{E|X}^{\ddagger}$} = {*f*_{M|E, X}, *f*_{E|X}}.

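According to Theorem 3, a multiply robust estimator ${\widehat{\beta}}_{c}^{\mathit{ind}}$ is obtained by solving the equation:

$${\mathbb{P}}_{n}\left\{{\widehat{V}}^{\mathit{mr}}\left({\widehat{\beta}}_{c}^{\mathit{dir}},{\widehat{\beta}}_{c}^{\mathit{ind}};{\widehat{S}}_{T|E,M,X},{\widehat{f}}_{M|E,X},{\widehat{f}}_{E|X},{\widehat{S}}_{C|E}\right)\right\}=0$$

where *V̂*^{mr} (·, ·; ·, ·, ·, ·) is obtained by substituting the empirical averages ${\mathbb{P}}_{n}\{\widehat{G}(t,{H}_{j};{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})\}$ for ${\vartheta}_{j}^{\mathit{mr}}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})$, *j* = 1, 2, in *V*^{mr}, with *Ĝ* defined as *G* under the estimated working models.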

In some situations, a proportional hazards model may not fit the data well, in which case an additive hazards model may provide a better fit (Lin and Ying, 2004). This alternative model assumes that the average total effect of the exposure is additive on the hazards scale:

$${\lambda}_{{T}_{e}}(t)={\lambda}_{{T}_{0}}(t)+{\beta}_{a}e$$

where *β*_{a} encodes the total causal effect of exposure. As in Lange and Hansen (2011), one can decompose *β*_{a} as follows:

$$\begin{array}{l}{\beta}_{a}={\lambda}_{{T}_{1}}(t)-{\lambda}_{{T}_{0}}(t)\\ =\stackrel{\text{total}\hspace{0.17em}\text{effect}}{\overbrace{{\lambda}_{{T}_{1{M}_{1}}}(t)-{\lambda}_{{T}_{0{M}_{0}}}(t)}}=\stackrel{\text{natural}\hspace{0.17em}\text{indirect}\hspace{0.17em}\text{effect}}{\overbrace{{\lambda}_{{T}_{1{M}_{1}}}(t)-{\lambda}_{{T}_{1{M}_{0}}}(t)}}+\underset{\text{natural}\hspace{0.17em}\text{direct}\hspace{0.17em}\text{effect}}{\underbrace{{\lambda}_{{T}_{1{M}_{0}}}(t)-{\lambda}_{{T}_{0{M}_{0}}}(t)}}\end{array}$$

(9)

We further assume that the natural direct effect, and thus the indirect effect, agrees with the assumption of additive hazards, and thus

$${\lambda}_{{T}_{e{M}_{0}}}(t)={\lambda}_{{T}_{0{M}_{0}}}(t)+{\beta}_{a}^{\mathit{dir}}e$$

where ${\beta}_{a}^{\mathit{dir}}$ represents the direct effect of the exposure, and similarly

$${\lambda}_{{T}_{1{M}_{e}}}(t)={\lambda}_{{T}_{1{M}_{0}}}(t)+{\beta}_{a}^{\mathit{ind}}e$$

where ${\beta}_{a}^{\mathit{ind}}$ represents the indirect effect of the exposure. As was the case for the Cox PH model, this is an assumption, because although unlikely in practice, in principle both direct and indirect effects could be functions of time in such a way that they combine to produce a constant additive total effect. Next we discuss a variety of estimating approaches for the direct effect parameter ${\beta}_{a}^{\mathit{dir}}$.

The next result gives a weighted approach analogous to that proposed for the Cox PH model.

*Theorem 4: Under the consistency, sequential ignorability and positivity assumptions,*
${Z}^{w}({\beta}_{a}^{\mathit{dir}})$ *is an unbiased estimating function for*
${\beta}_{a}^{\mathit{dir}}$*, where*

$${Z}^{w}({\beta}_{a}^{\mathit{dir}})=\int \{d{N}^{*}(t)-E{\beta}_{a}^{\mathit{dir}}{D}^{*}(t)\mathit{dt}\}W\left[E-\frac{{\varpi}_{1}(t)}{{\varpi}_{2}(t)}\right],$$

(10)

*with*

$$\begin{array}{l}{\varpi}_{1}(t)=\mathbb{E}\{{D}^{*}(t)WE\},\\ {\varpi}_{2}(t)=\mathbb{E}\{{D}^{*}(t)W\}\end{array}$$

*Thus,*
${\beta}_{a}^{\mathit{dir}}$ *is the solution of the equation:*

$$\mathbb{E}\{{Z}^{w}({\beta}_{a}^{\mathit{dir}})\}=0$$

The theorem implies that
${\tilde{\beta}}_{a}^{\mathit{dir}}$ is CAN provided $\{{\widehat{f}}_{M|E,X},{\widehat{f}}_{E|X}\}$ is consistent, where
${\tilde{\beta}}_{a}^{\mathit{dir}}$ solves

$${\mathbb{P}}_{n}\left\{{\widehat{Z}}^{w}\left({\tilde{\beta}}_{a}^{\mathit{dir}}\right)\right\}=0$$

with

$${\widehat{Z}}^{w}(\beta )=\int \{d{N}^{*}(t)-E\beta {D}^{*}(t)dt\}\widehat{W}\left[E-\frac{{\widehat{\varpi}}_{1}(t)}{{\widehat{\varpi}}_{2}(t)}\right]$$

(11)

an empirical version of ${Z}^{w}$. Note that
${\tilde{\beta}}_{a}^{\mathit{dir}}$ is available in closed form

$${\tilde{\beta}}_{a}^{\mathit{dir}}=\frac{{\mathbb{P}}_{n}\int d{N}^{*}(t)\widehat{W}\left[E-\frac{{\widehat{\varpi}}_{1}(t)}{{\widehat{\varpi}}_{2}(t)}\right]}{{\mathbb{P}}_{n}\int E{D}^{*}(t)dt\widehat{W}\left[E-\frac{{\widehat{\varpi}}_{1}(t)}{{\widehat{\varpi}}_{2}(t)}\right]}$$
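As a rough illustration, the closed form above can be evaluated directly on right-censored data. The sketch below is not the author's implementation. It assumes that *N** (*t*) counts an observed event by time *t* and that *D** (*t*) indicates being at risk at *t* (both are defined earlier in the paper, so this reading is an assumption here), and that the weights have already been estimated and are passed in as `w`. The function name and its arguments are hypothetical. Because the ratio of the two weighted risk-set averages is piecewise constant between observed follow-up times, the time integral in the denominator reduces to a finite sum.

```python
import numpy as np

def weighted_additive_direct_effect(y, delta, e, w):
    """Closed-form weighted estimate of the additive direct-effect parameter.

    y     : observed follow-up times (event or censoring)
    delta : 1 if the event was observed, 0 if censored
    e     : binary exposure
    w     : estimated weights W-hat (assumed positive, supplied by the user)
    """
    y, delta, e, w = (np.asarray(a, dtype=float) for a in (y, delta, e, w))
    n = y.size

    grid = np.unique(y)                                # sorted distinct times
    widths = np.diff(np.concatenate(([0.0], grid)))    # lengths of (t_{k-1}, t_k]
    at_risk = y[:, None] >= grid[None, :]              # D*_i(t) on each interval

    # Weighted mean exposure among subjects at risk: varpi_1(t) / varpi_2(t).
    varpi1 = (at_risk * (w * e)[:, None]).sum(axis=0)
    varpi2 = (at_risk * w[:, None]).sum(axis=0)
    resid = e[:, None] - (varpi1 / varpi2)[None, :]    # E_i - ratio, per interval

    # Numerator: sum over observed events of W_i * {E_i - ratio(Y_i)}.
    own = np.searchsorted(grid, y)                     # interval ending at Y_i
    num = np.sum(delta * w * resid[np.arange(n), own])

    # Denominator: sum of E_i * W_i * integral_0^{Y_i} {E_i - ratio(t)} dt.
    integral = (at_risk * resid * widths[None, :]).sum(axis=1)
    den = np.sum(e * w * integral)
    return num / den
```

The common factor 1/*n* cancels between numerator and denominator, and multiplying all weights by a constant leaves the estimate unchanged. The sketch builds an *n* by *K* risk-set matrix, so it is meant for small illustrative data sets rather than large cohorts.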

But ${\tilde{\beta}}_{a}^{\mathit{dir}}$ is not multiply robust. The next theorem provides a multiply robust estimating function of ${\beta}_{a}^{\mathit{dir}}$. First, we introduce some additional notation and let

$$\begin{array}{l}{\varpi}_{j}^{\mathit{mr}}(t)={\varpi}_{j}^{\mathit{mr}}\left(t;{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E}\right)\\ =\mathbb{E}[\{{D}^{*}(t)-{S}_{C|E}(t|E){S}_{T|E,M,X}(t|E,M,X)\}W\hspace{0.17em}{h}_{j}(E)\\ +\int {\sum}_{e\in \{0,1\}}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=0,X){h}_{j}(e)\end{array}\right\}d\mu (m)\\ +\frac{I(E=0)}{{f}_{E|X}(E|X)}{\sum}_{e\in \{0,1\}}{S}_{C|E}(t|E=e){S}_{T|E,M,X}(t|E=e,M,X){h}_{j}(e)\\ -\frac{I(E=0)}{{f}_{E|X}(E|X)}\int {\sum}_{e\in \{0,1\}}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=0,X){h}_{j}(e)\end{array}\right\}d\mu (m)]\end{array}$$

*Theorem 5: Under the consistency, sequential ignorability and positivity assumptions,*
${Z}^{\mathit{mr}}({\beta}_{a}^{\mathit{dir}})={Z}^{\mathit{mr}}\left({\beta}_{a}^{\mathit{dir}};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E}\right)$ *is an unbiased estimating function for*
${\beta}_{a}^{\mathit{dir}}$, *where*

$$\begin{array}{l}{Z}^{\mathit{mr}}({\beta}_{a}^{\mathit{dir}})\\ =\int \left\{\begin{array}{c}d{N}^{*}(t)-E{\beta}_{a}^{\mathit{dir}}{D}^{*}(t)dt\\ +{S}_{C|E}(t|E)d{S}_{T|E,M,X}(t|E,M,X)\\ +E{\beta}_{a}^{\mathit{dir}}{S}_{C|E}(t|E){S}_{T|E,M,X}(t|E,M,X)dt\end{array}\right\}\hspace{0.17em}W\left\{E-\frac{{\varpi}_{1}^{\mathit{mr}}(t)}{{\varpi}_{2}^{\mathit{mr}}(t)}\right\}\\ +\iint {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}\left\{\begin{array}{c}-\left[\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times d{S}_{T|E,M,X}(t|E=e,M=m,X)\end{array}\right]\\ -\left[\begin{array}{c}e{\beta}_{a}^{\mathit{dir}}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M=m,X)dt\end{array}\right]\end{array}\right\}\\ \times \left\{e-\frac{{\varpi}_{1}^{\mathit{mr}}(t)}{{\varpi}_{2}^{\mathit{mr}}(t)}\right\}{f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right]\\ +\frac{I(E=0)}{{f}_{E|X}(E|X)}\int {\sum}_{e\in \{0,1\}}\left\{\begin{array}{c}\left\{\begin{array}{c}-\left[\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times d{S}_{T|E,M,X}(t|E=e,M,X)\end{array}\right]\\ -\left[\begin{array}{c}e{\beta}_{a}^{\mathit{dir}}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M,X)dt\end{array}\right]\end{array}\right\}\\ \times \left\{e-\frac{{\varpi}_{1}^{\mathit{mr}}(t)}{{\varpi}_{2}^{\mathit{mr}}(t)}\right\}\end{array}\right\}\\ -\frac{I(E=0)}{{f}_{E|X}(E|X)}\iint {\sum}_{e\in \{0,1\}}\left\{\begin{array}{c}\left\{\begin{array}{c}-\left[\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times d{S}_{T|E,M,X}(t|E=e,M=m,X)\end{array}\right]\\ -\left[\begin{array}{c}e{\beta}_{a}^{\mathit{dir}}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M=m,X)dt\end{array}\right]\end{array}\right\}\\ \times \left\{e-\frac{{\varpi}_{1}^{\mathit{mr}}(t)}{{\varpi}_{2}^{\mathit{mr}}(t)}\right\}{f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right\}\end{array}$$

*Furthermore,*

$$\mathbb{E}\left\{{Z}^{\mathit{mr}}\left({\beta}_{a}^{\mathit{dir}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0$$

(12)

*if at least one, but not necessarily all three, of the following conditions holds: either* $\{{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger}\}=\{{S}_{T|E,M,X},{f}_{M|E,X}\}$, *or* $\{{S}_{T|E,M,X}^{\ddagger},{f}_{E|X}^{\ddagger}\}=\{{S}_{T|E,M,X},{f}_{E|X}\}$, *or* $\{{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger}\}=\{{f}_{M|E,X},{f}_{E|X}\}$.

By Theorem 5, a multiply robust estimator ${\widehat{\beta}}_{a}^{\mathit{dir}}$ is obtained by solving the equation:

$${\mathbb{P}}_{n}\left\{{\widehat{Z}}^{\mathit{mr}}\left({\widehat{\beta}}_{a}^{\mathit{dir}};{\widehat{S}}_{T|E,M,X},{\widehat{f}}_{M|E,X},{\widehat{f}}_{E|X},{\widehat{S}}_{C|E}\right)\right\}=0$$

so that under standard regularity conditions,
${\widehat{\beta}}_{a}^{\mathit{dir}}$ is CAN in model ${\mathcal{M}}_{\mathit{union}}$. The nonparametric bootstrap can be used to compute standard errors for inference.
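A minimal sketch of the nonparametric bootstrap mentioned above is given here for concreteness. The function name `bootstrap_se` and the convention that the data are stored one subject per row are assumptions made for illustration. The key point is that `estimator` should refit every working model (exposure, mediator, outcome and censoring models) on each resample, so that the variability from estimating the nuisance quantities is carried into the standard error.

```python
import numpy as np

def bootstrap_se(estimator, data, n_boot=500, seed=0):
    """Nonparametric bootstrap standard error of a scalar estimator.

    estimator : callable mapping a 2-D array (rows = subjects) to a float;
                it should refit all working models on the rows it receives
    data      : array with one row per subject
    """
    data = np.asarray(data)
    n = data.shape[0]
    rng = np.random.default_rng(seed)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)    # resample subjects with replacement
        draws[b] = estimator(data[rows])
    return draws.std(ddof=1)
```

A Wald-type interval can then be formed from the point estimate and this standard error, or percentile intervals can be read off the bootstrap draws if the function is adapted to return them.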

Suppose now that one wishes to estimate the indirect hazards difference
${\beta}_{a}^{\mathit{ind}}$. By the decomposition given in equation (9),
${\beta}_{a}^{\mathit{ind}}={\beta}_{a}-{\beta}_{a}^{\mathit{dir}}$ where
${\beta}_{a}$ is the total hazards difference, i.e. ${\lambda}_{{T}_{1}}(t)-{\lambda}_{{T}_{0}}(t)={\beta}_{a}$. This decomposition immediately gives a simple estimator of the indirect effect based on a weighting scheme. The approach entails first estimating

*Theorem 6: Suppose*
${\beta}_{a}^{\mathit{dir}}$ *is known, then under the consistency, sequential ignorability and positivity assumptions,*
${P}^{\mathit{mr}}\left({\beta}_{a}^{\mathit{dir}},{\beta}_{a}^{\mathit{ind}}\right)={P}^{\mathit{mr}}({\beta}_{a}^{\mathit{dir}},{\beta}_{a}^{\mathit{ind}};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E})$ *is an unbiased estimating function for*
${\beta}_{a}^{\mathit{ind}}$*, where*

$$\begin{array}{l}{P}^{\mathit{mr}}({\beta}_{a}^{\mathit{dir}},{\beta}_{a}^{\mathit{ind}})\\ =\int \left\{\begin{array}{c}\begin{array}{c}d{N}^{*}(t)-E({\beta}_{a}^{\mathit{dir}}+{\beta}_{a}^{\mathit{ind}}){D}^{*}(t)dt\\ +\int {S}_{C|E}(t|E)d{S}_{T|E,M,X}(t|E,m,X){f}_{M|E,X}(m|E,X)d\mu (m)\end{array}\\ +\int \left[\begin{array}{c}E({\beta}_{a}^{\mathit{dir}}+{\beta}_{a}^{\mathit{ind}}){S}_{C|E}(t|E)\\ \times {S}_{T|E,M,X}(t|E,m,X){f}_{M|E,X}(m|E,X)d(\mu (m),t)\end{array}\right]\end{array}\right\}\\ \times {f}_{E|X}^{-1}(E|X)\left\{E-\frac{{\phi}_{1}^{\mathit{mr}}(t)}{{\phi}_{2}^{\mathit{mr}}(t)}\right\}\\ +\int \int {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}\left\{\begin{array}{c}-{S}_{C|E}(t|E=e)d{S}_{T|E,M,X}(t|E=e,M=m,X)\\ -e\left({\beta}_{a}^{\mathit{dir}}+{\beta}_{a}^{\mathit{ind}}\right){S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M=m,X)d(\mu (m),t)\end{array}\right\}\\ \times \left\{e-\frac{{\phi}_{1}^{\mathit{mr}}(t)}{{\phi}_{2}^{\mathit{mr}}(t)}\right\}{f}_{M|E,X}(m|E=e,X)\end{array}\right]\end{array}$$

with ${\phi}_{j}^{\mathit{mr}}(t)={\phi}_{j}^{\mathit{mr}}(t;{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E})$

$$\begin{array}{l}=\mathbb{E}\left[\begin{array}{c}\left\{\begin{array}{c}{D}^{*}(t)\\ -\int \left[\begin{array}{c}{S}_{C|E}(t|E)\\ \times {S}_{T|E,M,X}(t|E,m,X){f}_{M|E,X}(m|E,X)d\mu (m)\end{array}\right]\end{array}\right\}\\ \times {f}_{E|X}^{-1}(E|X){h}_{j}(E)\end{array}\right]\\ +\int {\sum}_{e\in \{0,1\}}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \left[\begin{array}{c}{S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=e,X){h}_{j}(e)\end{array}\right]\end{array}\right\}d\mu (m)\end{array}$$

*Furthermore,*

$$\begin{array}{r}\hfill \mathbb{E}\left\{{P}^{\mathit{mr}}\left({\beta}_{a}^{\mathit{dir}},{\beta}_{a}^{\mathit{ind}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0\\ \hfill \mathbb{E}\left\{{Z}^{\mathit{mr}}\left({\beta}_{a}^{\mathit{dir}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0\end{array}$$

(13)

*if at least one, but not necessarily all three, of the following conditions holds: either* $\{{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger}\}=\{{S}_{T|E,M,X},{f}_{M|E,X}\}$, *or* $\{{S}_{T|E,M,X}^{\ddagger},{f}_{E|X}^{\ddagger}\}=\{{S}_{T|E,M,X},{f}_{E|X}\}$, *or* $\{{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger}\}=\{{f}_{M|E,X},{f}_{E|X}\}$.

According to Theorem 6, a multiply robust estimator ${\widehat{\beta}}_{a}^{\mathit{ind}}$ is obtained by solving the equation:

$${\mathbb{P}}_{n}\left\{{\widehat{P}}^{\mathit{mr}}\left({\widehat{\beta}}_{a}^{\mathit{dir}},{\widehat{\beta}}_{a}^{\mathit{ind}};{\widehat{S}}_{T|E,M,X},{\widehat{f}}_{M|E,X},{\widehat{f}}_{E|X},{\widehat{S}}_{C|E}\right)\right\}=0$$

where ${\widehat{P}}^{\mathit{mr}}$ (·, ·; ·, ·, ·, ·) is obtained by substituting

We briefly consider how to modify the proposed methods if the previous assumption that censoring is independent of (*M*, *X*, *T*) conditional on *E* is believed not to hold, but censoring is instead known to be independent of *T* given (*E*, *M*, *X*). For brevity, we restrict attention to the Cox proportional hazards model, but the approach is easily adapted to the additive hazards model. Consider the simple weighted estimating equation given in Theorem 1. The modification entails simply replacing the weight *W* in equation (6) with the time-dependent weight:

$${W}_{t}^{*}=W\times \frac{1}{{S}_{C|E,M,X}(t|E,M,X)}$$

where *S*_{C|E,M,X} is the conditional survival curve of censoring, so that equation (6) is replaced by

$$\begin{array}{lll}{U}^{w*}({\beta}_{c}^{\mathit{dir}})\hfill & =\hfill & {U}^{w*}({\beta}_{c}^{\mathit{dir}};{S}_{C|E,M,X},{f}_{M|E,X},{f}_{E|X})\hfill \\ \hfill & =\hfill & \int d{N}^{*}(t){W}_{t}^{*}\left[E-\frac{{\xi}_{1}^{*}(t;{\beta}_{c}^{\mathit{dir}})}{{\xi}_{2}^{*}(t;{\beta}_{c}^{\mathit{dir}})}\right],\hfill \end{array}$$

with

$${\xi}_{1}^{*}(t;{\beta}_{c}^{\mathit{dir}})=\mathbb{E}\{{D}^{*}(t){W}_{t}^{*}E\hspace{0.17em}\text{exp}({\beta}_{c}^{\mathit{dir}}E)\},$$

$${\xi}_{2}^{*}(t;{\beta}_{c}^{\mathit{dir}})=\mathbb{E}\{{D}^{*}(t){W}_{t}^{*}\hspace{0.17em}\text{exp}({\beta}_{c}^{\mathit{dir}}E)\},$$

then one can show that
${U}^{w*}({\beta}_{c}^{\mathit{dir}})$ is an unbiased estimating function under the assumption that censoring is independent of *T* given (*E*, *M*, *X*), provided a standard positivity assumption holds for the censoring mechanism (van der Laan and Robins, 2003). A feasible estimator is obtained by replacing all unknown quantities by empirical versions, using parametric or semiparametric models for (*S*_{C|E,M,X}, *f*_{M|E,X}, *f*_{E|X}). The multiply robust estimating function given in Theorem 2 can similarly be modified to accommodate this particular form of dependent censoring. Details are relegated to the appendix.

In this section, we extend the semiparametric sensitivity analysis technique proposed by Tchetgen Tchetgen and Shpitser (2011a,b), to assess the extent to which a violation of the ignorability assumption for the mediator might alter inferences about natural direct or indirect effects in the survival context. Let

$$\begin{array}{ll}\hfill & \gamma (t,e,m,x)\hfill \\ =\hfill & {\lambda}_{{T}_{1,m}|E,M,X}(t|E=e,M=m,X=x)-{\lambda}_{{T}_{1,m}|E,M,X}(t|E=e,M\ne m,X=x)\hfill \end{array}$$

then

$${T}_{{e}^{\prime},m}\not\!\perp\!\!\!\perp M|E=e,X$$

i.e. a violation of the ignorability assumption for the mediator variable, generally implies that *γ* (*t*, *e*, *m*, *x*) ≠ 0 for some (*t*, *e*, *m*, *x*). Suppose *M* is binary and larger values of *T* are beneficial for health. If *γ* (*t*, *e*, 1, *x*) < 0 but *γ* (*t*, *e*, 0, *x*) > 0 for all *t*, then on average, individuals with {*E* = *e*, *X* = *x*} and mediator value {*M* = 0} have a higher hazard function for each of the potential outcomes {*T*_{11}, *T*_{10}} than individuals with {*E* = *e*, *X* = *x*} but {*M* = 1}; i.e. healthier individuals are more likely to receive the mediator. On the other hand, *γ* (*t*, *e*, 0, *x*) < 0 but *γ* (*t*, *e*, 1, *x*) > 0 for all *t* suggests confounding by indication for the mediator variable; i.e. unhealthier individuals are more likely to receive the mediating factor.

We proceed as in Robins et al (1999) who proposed using a selection bias function for the purposes of conducting a sensitivity analysis for total effects, and Tchetgen Tchetgen and Shpitser (2011a,b) who adapted the approach for assessing the impact of unmeasured confounding on the estimation of average natural direct and indirect effects. Here we propose to recover inferences about natural direct effects on the hazard function, under either an additive or a proportional hazards model, by assuming that the selection bias function *γ* (*t*, *e*, *m*, *x*), which encodes the magnitude and direction of the unmeasured confounding for the mediator, is known. In the following, the support $\mathcal{S}$ of *M* is assumed to be finite. To motivate the proposed approach, suppose for the moment that *f*_{M|E,X} is known. Then, under the assumption that the exposure is ignorable given *X*, we show in the appendix that the following lemma holds:

*Lemma 1: Let*

$$\begin{array}{l}\delta (t,e,m,x)=\delta (t,e,m,x;{f}_{M|E,X})\\ =\frac{{f}_{M|E,X}(m|E=e,X=x)+\{1-{f}_{M|E,X}(m|E=e,X=x)\}\hspace{0.17em}\text{exp}\left\{{\int}_{0}^{t}\gamma (u,e,m,x)du\right\}}{{f}_{M|E,X}(m|E=0,X=x)+\{1-{f}_{M|E,X}(m|E=0,X=x)\}\hspace{0.17em}\text{exp}\left\{{\int}_{0}^{t}\gamma (u,0,m,x)du\right\}}\end{array}$$

*and*

$$\dot{\delta}(t,1,m,x)=\frac{\partial \hspace{0.17em}\text{log}\hspace{0.17em}\delta (u,1,m,x)}{\partial u}{|}_{u=t}$$

*Under the consistency assumption and the first part of the sequential ignorability assumption (2),*

$$\begin{array}{ll}\hfill & {S}_{{T}_{1,{M}_{0}}|{M}_{0},X}(t|{M}_{0}=m,X=x)\hfill \\ =\hfill & {S}_{{T}_{1,m}|E,M,X}(t|E=0,M=m,X=x)\hfill \\ =\hfill & {S}_{T|E,M,X}(t|E=1,M=m,X=x)\times \delta (t,1,m,x)\hfill \end{array}$$

*Furthermore,*

$$\begin{array}{ll}\hfill & {\lambda}_{{T}_{1,{M}_{0}}|{M}_{0},X}(t|{M}_{0}=m,X=x)\hfill \\ =\hfill & {\lambda}_{{T}_{1,m}|E,M,X}(t|E=0,M=m,X=x)\hfill \\ =\hfill & {\lambda}_{T|E,M,X}(t|E=1,M=m,X=x)-\dot{\delta}(t,1,m,x)\hfill \end{array}$$

Lemma 1 implies that ${S}_{{T}_{1,{M}_{0}}}(t)$ is identified by:

$$\mathbb{E}\left(\sum _{m\in \mathcal{S}}{S}_{T|E,M,X}(t|E=1,M=m,X)\delta (t,1,m,X){f}_{M|E,X}(m|E=0,X)\right)$$

(14)
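To make formula (14) concrete, the following sketch evaluates its plug-in version as an empirical average over the observed covariates when the support of *M* is finite. The three callables `surv`, `f_m` and `delta` are hypothetical placeholders for fitted working models and for the function *δ* implied by a chosen selection bias function; nothing here prescribes how they are obtained.

```python
import numpy as np

def survival_t1_m0(t, x, surv, f_m, delta, support=(0, 1)):
    """Plug-in version of (14): mean over X of sum_m S(t|1,m,X) * delta * f(m|0,X).

    surv(t, e, m, x)  : working model for S_{T|E,M,X}
    f_m(m, e, x)      : working model for f_{M|E,X}
    delta(t, e, m, x) : the function delta implied by the chosen gamma
    All three callables are hypothetical and must return one value per row of x.
    """
    x = np.asarray(x, dtype=float)
    total = np.zeros(x.shape[0])
    for m in support:
        total += surv(t, 1, m, x) * delta(t, 1, m, x) * f_m(m, 0, x)
    return total.mean()
```

When *δ* is identically one, which corresponds to *γ* = 0, the expression reduces to the usual mediation formula for the survival function of the cross-world outcome.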

Below, we use this result to obtain consistent estimators of {
${\beta}_{j}^{\mathit{dir}}$,
${\beta}_{j}^{\mathit{ind}}$: *j* = *a*, *c*} assuming *γ* (·, ·, ·, ·) is known. A sensitivity analysis is then obtained as in Tchetgen Tchetgen and Shpitser (2011a,b) by repeating this process and by reporting inferences for each choice of *γ* (·, ·, ·, ·) in a finite set of user-specified functions Γ = {*γ*_{α} (·, ·, ·, ·) :

For the Cox PH model, we propose to use the following modified estimating function for estimating the direct effect under ${\mathcal{M}}_{1}$, which carefully incorporates the selection bias function:

$$\begin{array}{lll}{U}^{w}({\beta}_{c}^{\mathit{dir}},{\alpha}^{*})\hfill & =\hfill & \int {\delta}_{{\alpha}^{*}}(t,E,M,X)\left\{d{N}^{*}(t)-{\dot{\delta}}_{{\alpha}^{*}}(t,M,X){D}^{*}(t)dt\right\}W\hfill \\ \hfill & \hfill & \times \left\{E-\frac{\mathbb{E}\{{D}^{*}(t)WE{\delta}_{{\alpha}^{*}}(t,E,M,X)\hspace{0.17em}\text{exp}({\beta}_{c}^{\mathit{dir}}E)\}}{\mathbb{E}\{{D}^{*}(t)W{\delta}_{{\alpha}^{*}}(t,E,M,X)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\}}\right\}\hfill \end{array}$$

where *δ*_{α*} (·, ·, ·, ·) is defined as *δ* (·, ·, ·, ·) under *γ*_{α*} (·, ·, ·, ·). For the additive model, one can use the following modified estimating function under ${\mathcal{M}}_{1}$:

$$\begin{array}{lll}{Z}^{w}({\beta}_{a}^{\mathit{dir}},{\alpha}^{*})\hfill & =\hfill & \int \left\{\begin{array}{c}d{N}^{*}(t)-{\dot{\delta}}_{{\alpha}^{*}}(t,E,M,X){D}^{*}(t)dt\\ -E{\beta}_{a}^{\mathit{dir}}{D}^{*}(t)\hspace{0.17em}{\dot{\delta}}_{{\alpha}^{*}}(t,E,M,X)dt\end{array}\right\}{\delta}_{{\alpha}^{*}}(t,E,M,X)W\hfill \\ \hfill & \hfill & \times \left\{E-\frac{\mathbb{E}\{{D}^{*}(t)W\hspace{0.17em}E{\delta}_{{\alpha}^{*}}(t,E,M,X)\}}{\mathbb{E}\{{D}^{*}(t)W{\delta}_{{\alpha}^{*}}(t,E,M,X)\}}\right\}\hfill \end{array}$$

In the appendix, we show the following result holds:

*Theorem 7: Suppose* *γ* (·, ·, ·, ·) = *γ*_{α*} (·, ·, ·, ·), *then under the consistency and positivity assumptions, and the ignorability assumption for the exposure, and under the Cox PH model,*
${\beta}_{c}^{\mathit{dir}}={\beta}_{c}^{\mathit{dir}}({\alpha}^{*})$ *solves the equation*

$$\mathbb{E}\{{U}^{w}({\beta}_{c}^{\mathit{dir}},{\alpha}^{*})\}=0$$

(15)

*Similarly, under the additive hazards model,*
${\beta}_{a}^{\mathit{dir}}={\beta}_{a}^{\mathit{dir}}({\alpha}^{*})$ *solves the equation*

$$\mathbb{E}\{{Z}^{w}({\beta}_{a}^{\mathit{dir}},{\alpha}^{*})\}=0$$

Thus, under model ${\mathcal{M}}_{1}$ and the Cox PH assumption, a sensitivity analysis then entails reporting the set
$\left\{{\widehat{\beta}}_{c}^{\mathit{dir}}(\alpha ):\alpha \right\}$ (and the associated confidence intervals) which summarizes how sensitive inferences are to a deviation from the ignorability assumption *α* = 0, where
${\widehat{\beta}}_{c}^{\mathit{dir}}(\alpha )$ solves an empirical version of equation (15) with unknown quantities estimated under the model. A sensitivity analysis is similarly obtained for the additive hazards model, and inferences about indirect effects are obtained as in Section 3, upon substituting
$\left\{{\widehat{\beta}}_{c}^{\mathit{dir}}(\alpha ),{\widehat{\beta}}_{a}^{\mathit{dir}}(\alpha ):\alpha \right\}$ for
$\left\{{\widehat{\beta}}_{c}^{\mathit{dir}},{\widehat{\beta}}_{a}^{\mathit{dir}}\right\}$. In the appendix, we describe a doubly robust sensitivity analysis technique which further extends these results by recovering correct sensitivity analyses under a union model in which ${\widehat{f}}_{M|E,X}$ is assumed to be consistent, but only one, not necessarily both, of *f*_{T|M,E,X} and *f*_{E|X} needs to be consistently estimated.

It is helpful for practice to briefly describe possible functional forms for the selection bias function *γ*_{α}(·, ·, ·, ·). In the simple case where

$$\begin{array}{ll}{\gamma}_{\alpha ,1}(t,e,m,x)=\alpha t(2m-1)\hfill & {\gamma}_{\alpha ,2}(t,e,m,x)=\alpha tm\hfill \\ {\gamma}_{\alpha ,3}(t,e,m,x)=\alpha t(2m-1)e\hfill & {\gamma}_{\alpha ,4}(t,e,m,x)=\alpha tme\hfill \\ {\gamma}_{\alpha ,5}(t,e,m,x)=\alpha t(2m-1)e{x}_{1}\hfill & {\gamma}_{\alpha ,6}(t,e,m,x)=\alpha tme{x}_{1}\hfill \end{array}$$

where for each of the above functional forms, the scalar parameter *α* encodes the magnitude and direction of unmeasured confounding for the mediator.

The functions *γ*_{α,3}, *γ*_{α,4}, *γ*_{α,5} and *γ*_{α,6} model interactions with the exposure variable and a component *X*_{1} of *X*, thus allowing for heterogeneity in the selection bias function over time. Since the functional form of *γ*_{α} is not identified from the observed data, we generally recommend reporting results for a variety of functional forms.
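For a given functional form, the quantities *δ* and its log-derivative from Lemma 1 are available in closed form. The sketch below works them out for the first form, *γ*_{α,1}(*t*, *e*, *m*, *x*) = *αt*(2*m* − 1), whose integral over [0, *t*] is *αt*²(2*m* − 1)/2. The function names are hypothetical, and the mediator probabilities `f1` and `f0` are treated as known inputs purely for illustration.

```python
import numpy as np

def cum_gamma_1(t, m, alpha):
    """Integral over [0, t] of gamma_{alpha,1}(u, e, m, x) = alpha * u * (2m - 1)."""
    return alpha * (2 * m - 1) * t ** 2 / 2.0

def delta_alpha_1(t, m, f1, f0, alpha):
    """delta(t, 1, m, x) from Lemma 1 under gamma_{alpha,1}.

    f1, f0 : f_{M|E,X}(m | E=1, x) and f_{M|E,X}(m | E=0, x), assumed known here
    """
    g = np.exp(cum_gamma_1(t, m, alpha))
    return (f1 + (1 - f1) * g) / (f0 + (1 - f0) * g)

def delta_dot_alpha_1(t, m, f1, f0, alpha):
    """Time derivative of log delta(t, 1, m, x), obtained by the chain rule."""
    g = np.exp(cum_gamma_1(t, m, alpha))
    rate = alpha * (2 * m - 1) * t             # gamma_{alpha,1} itself
    return rate * g * ((1 - f1) / (f1 + (1 - f1) * g)
                       - (1 - f0) / (f0 + (1 - f0) * g))
```

Setting `alpha = 0` gives *δ* = 1 with zero log-derivative, which recovers the analysis under mediator ignorability. A sensitivity analysis would re-evaluate these functions over a grid of `alpha` values and re-solve the estimating equation at each one.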

It is important to note that the sensitivity analysis technique introduced above appears to be the first of its kind for survival data. While a variety of techniques have previously been proposed for conducting sensitivity analyses for unmeasured confounding in the context of mediation, for example, VanderWeele (2010), Imai et al (2010a), Tchetgen Tchetgen and Shpitser (2011a,b), none of the existing techniques apply to mediation in the survival context under either a Cox PH model or an additive hazards model. It is also important to note that by concisely encoding a possible violation of the ignorability assumption for the mediator through a selection bias function the proposed approach avoids having to spell out in detail, the possible nature of the unmeasured confounding; although in practice, as illustrated above, a parsimonious model must be used for the selection bias function. A further appeal of the approach is that it may be used to perform a sensitivity analysis, in settings where the ignorability violation arises due to a confounder of the mediator-outcome relationship that is also an effect of the exposure variable; in which case, as observed in Section 2, such a variable even when observed, cannot be used towards identification of natural direct and indirect effects without additional assumptions.

Finally, we note that while in this section, the support of *M* was finite, the proposed sensitivity analysis methodology can be extended to accommodate a continuous mediator by further adapting the approach of Robins et al (1999) to the present setting.

The current paper makes a number of contributions to the study of statistical methods for causal mediation analysis. Focusing on survival data, we have proposed a number of new estimators of natural direct and indirect effects for the Cox PH and the additive hazards models. The weighted approach developed in section 3 is appealing for its simplicity and because it is easy to implement in existing software, provided individual-specific weights are accommodated. We should note that, whereas it is common practice when estimating total effects via inverse-probability-weights to report conservative standard errors based on the sandwich variance formula, this ignores the first stage estimation of the treatment weights. Results by Tchetgen Tchetgen and Shpitser (2011a) imply that such a practice gives the wrong answer for natural direct and indirect effects. However, a standard bootstrap may be used for inference. We also note that, in general, the more involved multiply robust approach of Section 3 should be preferred to the simpler weighted approach on theoretical grounds, because the former delivers valid inferences under weaker assumptions than the latter. However, implementing these improved methods for routine application presents a significant challenge that we plan to take on elsewhere. In addition, as pointed out by a referee, we note that in the setting of a randomized trial, the exposure mechanism is known by design and therefore, the multiply robust estimator described above becomes doubly robust in the sense that for correct inferences, one only needs either *f*_{M|E,X} or *S*_{T|E,M,X} to be correctly specified, but not necessarily both. Finally, we emphasize that the proposed multiply robust strategies should not be viewed as a substitute for sound model checking, and therefore, we encourage users of these methods to treat any multiply robust analysis they conduct with the same level of model scrutiny as they would apply to a non-robust approach.

Under the consistency, sequential ignorability and positivity assumptions,

$$\begin{array}{l}\mathbb{E}\left\{{D}^{*}(t)\frac{{f}_{M|E,X}(M|E=0,X)}{{f}_{E|X}(E|X){f}_{M|E,X}(M|E,X)}h(E)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}E)\right\}\\ =\mathbb{E}\left\{\begin{array}{c}{S}_{T|E,M,X}(t|E,M,X){S}_{C|E}(t|E)\\ \times \frac{{f}_{M|E,X}(M|E=0,X)}{{f}_{E|X}(E|X){f}_{M|E,X}(M|E,X)}h(E)\text{exp}({\beta}_{c}^{\mathit{dir}}E)\end{array}\right\}\\ ={\sum}_{e}\int \mathbb{E}\left\{\begin{array}{c}{S}_{T|E,M,X}(t|E=e,M=m,X){S}_{C|E}(t|E=e)\\ \times {f}_{M|E,X}(m|E=0,X)\\ \times h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e)d\mu (m)\end{array}\right\}\\ =\left\{\begin{array}{c}{\sum}_{e}{S}_{C|E}(t|E=e)h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e)\\ \times \mathbb{E}\left[\begin{array}{c}\int {S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right]\end{array}\right\}\\ =\left\{{\sum}_{e}{S}_{C|E}(t|E=e)h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e){S}_{{T}_{e,{M}_{0}}}(t)\right\}\\ =\left\{{\sum}_{e}{S}_{C|E}(t|E=e)h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e){S}_{{T}_{e,{M}_{0}}}(t)\right\}\end{array}$$

and $\mathbb{E}\{d{N}^{*}(t)Wh(E)\}$

$$\begin{array}{l}=\mathbb{E}\left\{\begin{array}{c}{\lambda}_{T|E,M,X}(t|E,M,X){S}_{T|E,M,X}(t|E,M,X)\\ \times {S}_{C|E}(t|E)Wh(E)dt\end{array}\right\}\\ =\mathbb{E}\{{f}_{T|E,M,X}(t|E,M,X){S}_{C|E}(t|E)Wh(E)\}dt\\ ={\sum}_{e}\left[\begin{array}{c}{S}_{C|E}(t|E=e)h(e)dt\\ \times \int \mathbb{E}\left\{\begin{array}{c}{f}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right\}\end{array}\right]\\ ={\sum}_{e}{f}_{{T}_{e,{M}_{0}}}(t){S}_{C|E}(t|E=e)h(e)dt\\ ={\sum}_{e}{S}_{C|E}(t|E=e)h(e){\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}^{\mathit{dir}}e){S}_{{T}_{e,{M}_{0}}}(t)dt\end{array}$$

Therefore $\mathbb{E}\left\{\int d{N}^{*}(t)W\left[E-\frac{{\xi}_{1}(t;{\beta}_{c}^{\mathit{dir}})}{{\xi}_{2}(t;{\beta}_{c}^{\mathit{dir}})}\right]\right\}$

$$\begin{array}{l}=\mathbb{E}\left\{\int d{N}^{*}(t)W\left[\begin{array}{c}\begin{array}{c}E\\ -\frac{\mathbb{E}\left\{\begin{array}{c}{D}^{*}(t)\frac{{f}_{M|E,X}(M|E=0,X)}{{f}_{E|X}(E|X){f}_{M|E,X}(M|E,X)}\\ {h}_{1}(E)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}E\right)\end{array}\right\}}{\mathbb{E}\left\{\begin{array}{c}{D}^{*}(t)\frac{{f}_{M|E,X}(M|E=0,X)}{{f}_{E|X}(E|X){f}_{M|E,X}(M|E,X)}\\ {h}_{2}(E)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}E\right)\end{array}\right\}}\end{array}\end{array}\right]\right\}\\ =\mathbb{E}\left\{\int d{N}^{*}(t)W\left[E-\frac{\left\{{\sum}_{e}{S}_{C|E}(t|E=e)e\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right){S}_{{T}_{e,{M}_{0}}}(t)\right\}}{\left\{{\sum}_{e}{S}_{C|E}(t|E=e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right){S}_{{T}_{e,{M}_{0}}}(t)\right\}}\right]\right\}\\ =\int \left[\begin{array}{c}\mathbb{E}\left\{d{N}^{*}(t)W\hspace{0.17em}{h}_{1}(E)\right\}\\ -\frac{\mathbb{E}\{d{N}^{*}(t)W\hspace{0.17em}{h}_{2}(E)\}\left\{{\sum}_{e}{S}_{C|E}(t|E=e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right){S}_{{T}_{e,{M}_{0}}}(t)\right\}}{\left\{{\sum}_{e}{S}_{C|E}(t|E=e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right){S}_{{T}_{e,{M}_{0}}}(t)\right\}}\end{array}\right]\\ =\int dt\left[\begin{array}{c}{\sum}_{e}{S}_{C|E}(t|E=e)e{\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right){S}_{{T}_{e,{M}_{0}}}(t)\\ -\frac{{\sum}_{e}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ {\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right)\\ {S}_{{T}_{e,{M}_{0}}}(t)\end{array}\right\}\times {\sum}_{e}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)e\\ \text{exp}\left({\beta}_{c}^{\mathit{dir}}e\right)\\ 
{S}_{{T}_{e,{M}_{0}}}(t)\end{array}\right\}}{\left\{{\sum}_{e}{S}_{C|E}(t|E=e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right){S}_{{T}_{e,{M}_{0}}}(t)\right\}}\end{array}\right]\\ =\int dt\left[\begin{array}{c}{\sum}_{e}{S}_{C|E}(t|E=e)e{\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right){S}_{{T}_{e,{M}_{0}}}(t)\\ -\frac{{\sum}_{e}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ {\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right)\\ {S}_{{T}_{e,{M}_{0}}}(t)\end{array}\right\}\times {\sum}_{e}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)e\\ {\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right)\\ {S}_{{T}_{e,{M}_{0}}}(t)\end{array}\right\}}{\left\{{\sum}_{e}{S}_{C|E}(t|E=e){\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right){S}_{{T}_{e,{M}_{0}}}(t)\right\}}\end{array}\right]\\ =0\end{array}$$

The following lemma will be used repeatedly to establish multiple robustness of a given estimating function.

Given i.i.d. data (*O*, *M*, *E*, *X*), define the weighted functional *κ* (*l*) with weight *L* = *l*(*E*) as:

$$\kappa (l)={\sum}_{e=0}^{1}L(e)\mathbb{E}\int \left\{\begin{array}{c}\mathbb{E}(O|M=m,E=e,X)\\ \times {f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right\}$$

Let $B(m,e,x)=\mathbb{E}(O|M=m,E=e,X=x)$. Then, the random variable *J* = *J*(*B*, *f*_{M|E,X}, *f*_{E|X}) satisfies the triply robust unbiasedness property

$$\mathbb{E}\left\{J({B}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger})\right\}=\kappa (l)$$

if at least one but not necessarily all of the following conditions hold: either $\{{B}^{\ddagger},{f}_{M|E,X}^{\ddagger}\}=\{B,{f}_{M|E,X}\}$, or $\{{B}^{\ddagger},{f}_{E|X}^{\ddagger}\}=\{B,{f}_{E|X}\}$, or $\{{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger}\}=\{{f}_{M|E,X},{f}_{E|X}\}$; where

$$\begin{array}{l}J({B}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger})=\frac{{f}_{M|E,X}^{\ddagger}(M|E=0,X)}{{f}_{E|X}^{\ddagger}(E|X){f}_{M|E,X}^{\ddagger}(M|E,X)}\left\{O-{B}^{\ddagger}(M,E,X)\right\}L(E)\\ +{\sum}_{e=0}^{1}\int L(e){B}^{\ddagger}(m,e,X){f}_{M|E,X}^{\ddagger}(m|E=0,X)d\mu (m)\\ +\frac{I(E=0)}{{f}_{E|X}^{\ddagger}(0|X)}\left\{{\sum}_{e=0}^{1}L(e)\left[\begin{array}{c}{B}^{\ddagger}(M,e,X)\\ -\int {B}^{\ddagger}(m,e,X){f}_{M|E,X}^{\ddagger}(m|E=0,X)d\mu (m)\end{array}\right]\right\}\end{array}$$

By Theorem 1 of Tchetgen Tchetgen and Shpitser (2011a), the random variable $J({B}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger})-\kappa (l)$ is the efficient influence function of *κ* (*l*) and thus the result follows from Theorem 2 of their paper. For an alternative proof consider the bias of $J({B}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger})$:

$$\begin{array}{l}\mathbb{E}\left\{J({B}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger})\right\}-\kappa (l)\\ ={\sum}_{e=0}^{1}\int \mathbb{E}[\begin{array}{c}\frac{{f}_{E|X}(e|X){f}_{M|E,X}(m|E=e,X){f}_{M|E,X}^{\ddagger}(m|E=0,X)}{{f}_{E|X}^{\ddagger}(e|X){f}_{M|E,X}^{\ddagger}(m|E=e,X)}\\ \left\{B(m,e,X)-{B}^{\ddagger}(m,e,X)\right\}L(e)d\mu (m)\end{array}\\ +{\sum}_{e=0}^{1}\int L(e){B}^{\ddagger}(m,e,X){f}_{M|E,X}^{\ddagger}(m|E=0,X)d\mu (m)\\ -{\sum}_{e=0}^{1}\int L(e)B(m,e,X){f}_{M|E,X}(m|E=0,X)d\mu (m)\\ +\frac{{f}_{E|X}(0|X)}{{f}_{E|X}^{\ddagger}(0|X)}{\sum}_{e=0}^{1}\int \left\{\begin{array}{c}L(e)\\ \times \left[\begin{array}{c}{B}^{\ddagger}(m,e,X){f}_{M|E,X}(m|E=0,X)\\ -{B}^{\ddagger}(m,e,X){f}_{M|E,X}^{\ddagger}(m|E=0,X)\end{array}\right]d\mu (m)\end{array}\right\}]\\ =\mathbb{E}[\begin{array}{c}{\sum}_{e=0}^{1}\int \left\{\begin{array}{c}\frac{{f}_{E|X}(e|X){f}_{M|E,X}(m|E=e,X)}{{f}_{E|X}^{\ddagger}(e|X){f}_{M|E,X}^{\ddagger}(m|E=e,X)}\\ -1\end{array}\right\}\\ \times \left\{\begin{array}{c}B(m,e,X)\\ -{B}^{\ddagger}(m,e,X)\end{array}\right\}{f}_{M|E,X}^{\ddagger}(m|E=0,X)L(e)\end{array}\\ -{\sum}_{e=0}^{1}\int \left\{\begin{array}{c}{f}_{M|E,X}(m|E=0,X)\\ -{f}_{M|E,X}^{\ddagger}(m|E=0,X)\end{array}\right\}\left\{\begin{array}{c}B(m,e,X)\\ -{B}^{\ddagger}(m,e,X)\end{array}\right\}L(e)d\mu (m)\\ +{\sum}_{e=0}^{1}\int \left\{\begin{array}{c}{f}_{M|E,X}(m|E=0,X)\\ -{f}_{M|E,X}^{\ddagger}(m|E=0,X)\end{array}\right\}\left\{\frac{{f}_{E|X}(e|X)}{{f}_{E|X}^{\ddagger}(e|X)}-1\right\}B(m,e,X)L(e)d\mu (m)]\end{array}$$

= 0 if at least one of the three conditions of the Lemma holds, proving the result.
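The triple robustness property of the Lemma can also be checked numerically in a toy setting. The sketch below assumes binary *M*, *E* and *X*, so that every expectation is an exact finite sum, and computes the mean of *J* under the true law when some of the three working components are replaced by arbitrary wrong ones. All function names and the array layout (`B[m, e, x]`, `fM[m, e, x]`, `fE[e, x]`, `pX[x]`) are assumptions made for this illustration only.

```python
import numpy as np

def kappa(L, B, fM, pX):
    """Target: sum_e L(e) E[ sum_m B(m, e, X) f_M(m | E=0, X) ]."""
    return np.einsum("e,mex,mx,x->", L, B, fM[:, 0, :], pX)

def mean_J(L, B, fM, fE, pX, B_w, fM_w, fE_w):
    """Exact E{J(B_w, fM_w, fE_w)} under the true law (B, fM, fE, pX)."""
    joint = np.einsum("x,ex,mex->mex", pX, fE, fM)          # P(M=m, E=e, X=x)
    weight = fM_w[:, [0], :] / (fE_w[None, :, :] * fM_w)    # working weight
    term1 = weight * (B - B_w) * L[None, :, None]           # E(O | m,e,x) = B
    eta = np.einsum("e,mex,mx->x", L, B_w, fM_w[:, 0, :])   # working integral term
    inner = np.einsum("e,mex->mx", L, B_w)                  # sum_e L(e) B_w(m,e,x)
    is_e0 = np.array([1.0, 0.0])[None, :, None]             # indicator I(E = 0)
    term3 = is_e0 / fE_w[0][None, None, :] * (inner[:, None, :] - eta[None, None, :])
    return np.sum(joint * (term1 + eta[None, None, :] + term3))

def random_components(rng):
    """Draw an arbitrary (B, fM, fE) for binary M, E, X; used only for the check."""
    p = rng.uniform(0.2, 0.8, size=(2,))        # P(E=1 | X=x)
    q = rng.uniform(0.2, 0.8, size=(2, 2))      # P(M=1 | E=e, X=x)
    B = rng.uniform(size=(2, 2, 2))             # E(O | M=m, E=e, X=x)
    return B, np.stack([1 - q, q]), np.stack([1 - p, p])
```

Drawing one set of true components and one set of wrong ones with `random_components`, the value of `mean_J` equals `kappa` whenever any two of the three working components are the true ones, in line with the three conditions of the Lemma.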

Under the consistency, sequential ignorability and positivity assumptions, in the proof of Theorem 1 we showed

$$\begin{array}{l}{\xi}_{1}\left(t;{\beta}_{c}^{\mathit{dir}}\right)\\ =\left\{\begin{array}{c}{\sum}_{e}h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right)\\ \times \mathbb{E}\left[\begin{array}{c}\int {S}_{C|E}(t|E=e){S}_{T|E,M,X}(t|E=e,M=m,X)\\ {f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right]\end{array}\right\}\end{array}$$

which is of the form *κ*(*l*), with
$L(e)=h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right)$ and *O* = *D**(*t*). Therefore, by Lemma A.1, *R*^{‡}(*t*, *H*;
${\beta}_{c}^{\mathit{dir}}$) has the desired triply robust unbiasedness property, such that
$L(e)={S}_{C|E}(t|E=e)h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}e\right)$; thus we have
$\mathbb{E}\left\{{R}^{\ddagger}(t,H;{\beta}_{c}^{\mathit{dir}})\right\}={\xi}_{1}(t;{\beta}_{c}^{\mathit{dir}})$ under the conditions of the Theorem. Similarly, we have previously established in the proof of Theorem 1 that

$$\begin{array}{ll}\hfill & \hspace{1em}\hspace{1em}\mathbb{E}\left\{d{N}^{*}(t)W\hspace{0.17em}h(E)\right\}\hfill \\ =\hfill & {\sum}_{e}\left[\begin{array}{c}h(e)\\ \times \mathbb{E}\left\{\begin{array}{c}\int {S}_{C|E}(t|E=e)\\ \times {f}_{T|E,M,X}(t|E=e,M=m,X)dt\\ {f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right\}\end{array}\right]\hfill \end{array}$$

which is of the form *κ*(*l*), with *L*(*e*) = *h*(*e*) and *O* = *dN**(*t*). Therefore, by Lemma A.1, the theorem holds upon setting
$h(E)=\left\{E-\frac{{\xi}_{1}^{\mathit{mr},\ddagger}\left(t;{\beta}_{c}^{\mathit{dir}}\right)}{{\xi}_{2}^{\mathit{mr},\ddagger}(t)}\right\}$.

The following lemma, Lemma A.2, is key to proving Theorem 3.

Define the weighted functional *σ* (*l*) with weight *L* = *l*(*E*) as:

$$\begin{array}{lll}\sigma (l)\hfill & =\hfill & \sum _{e=0}^{1}L(e)\mathbb{E}\{\mathbb{E}(O|E=e,X)\}\hfill \\ \hfill & =\hfill & \sum _{e=0}^{1}L(e)\mathbb{E}\left\{\begin{array}{c}\int \mathbb{E}(O|M=m,\hspace{0.17em}E=e,X)\\ \times {f}_{M|E,X}(m|E=e,X)d\mu (m)\end{array}\right\}\hfill \end{array}$$

The random variable *A* = *A*(*B*, *f*_{M|E,X}, *f*_{E|X}) satisfies the doubly robust unbiasedness property, which states that
$\mathbb{E}\left\{A({B}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger})\right\}=\sigma (l)$ if at least one, but not necessarily both, of the following conditions holds: either
$\left\{{B}^{\ddagger},{f}_{M|E,X}^{\ddagger}\right\}=\left\{B,{f}_{M|E,X}\right\}$ or
${f}_{E|X}^{\ddagger}={f}_{E|X}$; where

$$\begin{array}{l}A({B}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger})\\ =\frac{1}{{f}_{E|X}^{\ddagger}(E|X)}\left\{O-\int {B}^{\ddagger}(m,E,X){f}_{M|E,X}^{\ddagger}(m|E,X)d\mu (m)\right\}L(E)\\ +{\sum}_{e=0}^{1}\int L(e){B}^{\ddagger}(m,e,X){f}_{M|E,X}^{\ddagger}(m|E=e,X)d\mu (m).\end{array}$$

$$\begin{array}{l}\mathbb{E}\left\{A({B}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger})\right\}-\sigma (l)\\ ={\sum}_{e=0}^{1}\mathbb{E}\left[\begin{array}{c}\frac{{f}_{E|X}(e|X)}{{f}_{E|X}^{\ddagger}(e|X)}\int \left\{\begin{array}{c}B(m,e,X){f}_{M|E,X}(m|E=e,X)\\ -{B}^{\ddagger}(m,e,X){f}_{M|E,X}^{\ddagger}(m|E=e,X)\end{array}\right\}d\mu (m)L(e)\\ +\int L(e){B}^{\ddagger}(m,e,X){f}_{M|E,X}^{\ddagger}(m|E=e,X)d\mu (m)\\ -L(e)\int B(m,e,X){f}_{M|E,X}(m|E=e,X)d\mu (m)\end{array}\right]\\ ={\sum}_{e=0}^{1}\mathbb{E}\left[\begin{array}{c}\left\{\frac{{f}_{E|X}(e|X)}{{f}_{E|X}^{\ddagger}(e|X)}-1\right\}\\ \times \left\{\begin{array}{c}\int B(m,e,X){f}_{M|E,X}(m|E=e,X)d\mu (m)\\ -\int {B}^{\ddagger}(m,e,X){f}_{M|E,X}^{\ddagger}(m|E=e,X)d\mu (m)\end{array}\right\}L(e)\end{array}\right]\end{array}$$

= 0 under the assumptions of the lemma, since the first factor vanishes when ${f}_{E|X}^{\ddagger}={f}_{E|X}$ and the second factor vanishes when $\left\{{B}^{\ddagger},{f}_{M|E,X}^{\ddagger}\right\}=\left\{B,{f}_{M|E,X}\right\}$.
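As a hedged illustration of Lemma A.2, the sketch below evaluates the mean of *A* exactly on a toy law with binary *X*, *E* and *M*. The data law, the misspecified working models, and all function names are invented for this example and are not taken from the paper. Since *A* is linear in *O*, its mean is obtained by replacing *O* with *B*(*m*, *e*, *x*).

```python
from itertools import product

# Hypothetical true law for binary (X, E, M); every number is illustrative.
P_X = {0: 0.6, 1: 0.4}

def bern(p1, v):
    """Probability that a Bernoulli(p1) variable equals v."""
    return p1 if v == 1 else 1.0 - p1

def f_E(e, x): return bern(0.3 + 0.4 * x, e)               # f_{E|X}
def f_M(m, e, x): return bern(0.2 + 0.3 * e + 0.2 * x, m)  # f_{M|E,X}
def B(m, e, x): return 0.1 + 0.2 * m + 0.3 * e + 0.1 * x   # E(O | M, E, X)
def L(e): return 1.0 + 2.0 * e                             # weight l(e)

# Deliberately misspecified working models.
def f_E_bad(e, x): return 0.5
def f_M_bad(m, e, x): return 0.5
def B_bad(m, e, x): return 0.5

def sigma():
    """sigma(l) = sum_e L(e) E[ sum_m B(m, e, X) f_M(m | E=e, X) ]."""
    return sum(P_X[x] * L(e) * B(m, e, x) * f_M(m, e, x)
               for x, e, m in product((0, 1), repeat=3))

def mean_A(Bw, fMw, fEw):
    """Exact mean of A(Bw, fMw, fEw) under the true law."""
    total = 0.0
    for x, e, m in product((0, 1), repeat=3):
        weight = P_X[x] * f_E(e, x) * f_M(m, e, x)
        inner = sum(Bw(k, e, x) * fMw(k, e, x) for k in (0, 1))
        ipw = (B(m, e, x) - inner) * L(e) / fEw(e, x)
        plug_in = sum(L(a) * Bw(k, a, x) * fMw(k, a, x)
                      for a in (0, 1) for k in (0, 1))
        total += weight * (ipw + plug_in)
    return total

print("f_E misspecified only:   ", mean_A(B, f_M, f_E_bad) - sigma())
print("B and f_M misspecified:  ", mean_A(B_bad, f_M_bad, f_E) - sigma())
print("everything misspecified: ", mean_A(B_bad, f_M_bad, f_E_bad) - sigma())
```

The first two differences are zero up to rounding, matching the two conditions of the lemma, while the third is not.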

We note that ${\vartheta}_{j}^{\mathit{mr}}\left(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}}\right)$

$$\begin{array}{l}={\vartheta}_{j}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})={\vartheta}_{j}\left(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}},{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E}\right)\\ =\mathbb{E}\left[{\sum}_{e}\int \left\{\begin{array}{c}{S}_{C|E}(t|E=e){S}_{T|E,M,X}(t|E=e,M=m,X)\\ {f}_{M|E,X}(m|E=e,X)h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left\{\left({\beta}_{c}^{\mathit{dir}}+{\beta}_{c}^{\mathit{ind}}\right)e\right\}d\mu (m)\end{array}\right\}\right]\end{array}$$

is of the form of the weighted functional *σ*(*l*) with
$L(e)=h(e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left\{\left({\beta}_{c}^{\mathit{dir}}+{\beta}_{c}^{\mathit{ind}}\right)e\right\}$ and *O* = *D**(*t*); thus, by Lemma A.2,

$$\mathbb{E}\left[G(t,{H}_{j};{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}},{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E})\right]={\vartheta}_{j}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})$$

if either ${f}_{E|X}^{\ddagger}={f}_{E|X}$ or $\left\{{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger}\right\}=\left\{{S}_{T|E,M,X},{f}_{M|E,X}\right\}$, but not necessarily both. Furthermore, note that

$$=\mathbb{E}\left\{\int {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {f}_{T|E,M,X}(t|E=e,M,X)\\ \times {f}_{M|E,X}(m|E=e,X)\\ \times \left\{e-\frac{{\vartheta}_{1}^{\mathit{mr}}(t;{\beta}_{1},{\beta}_{2})}{{\vartheta}_{2}^{\mathit{mr}}(t;{\beta}_{1},{\beta}_{2})}\right\}dt\end{array}\right]d\mu (m)\right\}$$

is of the form of the weighted functional *σ*(*l*) with
$L(e)=\left\{e-\frac{{\vartheta}_{1}^{\mathit{mr}}(t;{\beta}_{1},{\beta}_{2})}{{\vartheta}_{2}^{\mathit{mr}}(t;{\beta}_{1},{\beta}_{2})}\right\}dt$ and *O* = *dN**(*t*); therefore, by Lemma A.2, if either
${f}_{E|X}^{\ddagger}={f}_{E|X}$ or
$\left\{{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger}\right\}=\left\{{S}_{T|E,M,X},{f}_{M|E,X}\right\}$,

$$\begin{array}{l}\mathbb{E}\left\{{V}^{\mathit{mr}}({\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}};{S}_{T|E,M,X}^{\u2021},{f}_{M|E,X}^{\u2021},{f}_{E|X}^{\u2021},{S}_{C|E})\right\}\\ =\mathbb{E}\left\{\int \int {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {f}_{T|E,M,X}(t|E=e,M,X)\\ \times {f}_{M|E,X}(m|E=e,X)\\ \times \left\{e-\frac{{\vartheta}_{1}(t;{\beta}_{c}^{\mathit{dir}},{\beta}_{c}^{\mathit{ind}})}{{\vartheta}_{2}(t;{\beta}_{1},{\beta}_{2})}\right\}dt\end{array}\right]d\mu (m)\right\}\\ =\mathbb{E}\left\{\int {\sum}_{e\in \{0,1\}}\left[\begin{array}{c}{S}_{C|E}(t|E=e){S}_{{T}_{e}}(t){\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}e)\\ \times \left\{\begin{array}{c}e\\ -\frac{{\sum}_{e\in \{0,1\}}{S}_{C|E}(t|E=e)e{S}_{{T}_{e}}(t){\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}e)}{{\sum}_{e\in \{0,1\}}{S}_{C|E}(t|E=e){S}_{{T}_{e}}(t){\lambda}_{{T}_{0}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}({\beta}_{c}e)}\end{array}\right\}dt\end{array}\right]\right\}\\ =0\end{array}$$
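The final equality in the display above holds pointwise in *t*: for any positive weights *w*_e(*t*), the centred sum Σ_e *w*_e(*t*){*e* − Σ_e *e* *w*_e(*t*)/Σ_e *w*_e(*t*)} is identically zero. The short sketch below checks this numerically. The censoring and failure survival curves, the baseline hazard and the value of *β*_c are arbitrary illustrative choices, not quantities from the paper.

```python
import math

def score_integrand(t, beta, s_c, s_t, lam0):
    """Integrand of the final display at time t for a binary exposure."""
    # w_e(t) = S_{C|E}(t|e) * S_{T_e}(t) * lambda_{T_0}(t) * exp(beta * e)
    w = {e: s_c(t, e) * s_t(t, e) * lam0(t) * math.exp(beta * e)
         for e in (0, 1)}
    e_bar = sum(e * w[e] for e in w) / sum(w.values())
    return sum(w[e] * (e - e_bar) for e in w)

# Hypothetical ingredients, chosen only for illustration.
beta_c = 0.7
lam0 = lambda t: 0.2                                    # baseline hazard
s_c = lambda t, e: math.exp(-(0.10 + 0.05 * e) * t)     # censoring survival
s_t = lambda t, e: math.exp(-0.2 * math.exp(beta_c * e) * t)  # Cox survival

for t in (0.5, 1.0, 2.0, 5.0):
    print(t, score_integrand(t, beta_c, s_c, s_t, lam0))
```

Because the integrand vanishes at every *t*, the integral over *t* is zero regardless of the shape of the weights.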

The result then follows by noting that ${\beta}_{c}^{\mathit{dir}}$ solves

$$\mathbb{E}\left\{{U}^{\mathit{mr}}\left({\beta}_{c}^{\mathit{dir}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0$$

which is triply robust by Theorem 2.

It is straightforward to verify that ${\varpi}_{j}(t)$, *j* = 1, 2, are given by

$$\begin{array}{l}{\varpi}_{1}(t)={\sum}_{e=0}^{1}\int e\mathbb{E}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M,X)\\ \times {f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right\}\\ {\varpi}_{2}(t)={\sum}_{e=0}^{1}\int \mathbb{E}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M,X)\\ \times {f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right\}\end{array}$$

Under the assumed structural model, and the consistency, sequential ignorability and positivity assumptions,

$$\begin{array}{l}\mathbb{E}\left[\left\{d{N}^{*}(t)-E{\beta}_{a}^{\mathit{dir}}{D}^{*}(t)dt\right\}W\hspace{0.17em}{h}_{j}(E)\right]\\ ={\sum}_{e=0}^{1}{h}_{j}(e){\lambda}_{{T}_{0}}(t)dt\mathbb{E}\int \left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M,X)\\ \times {f}_{M|E,X}(m|E=0,X)d\mu (m)\end{array}\right\}\end{array}$$

proving the result.

The proof is similar to that of Theorem 2, by applying Lemma A.1 to the three functionals ${\varpi}_{1}(t)={\varpi}_{1}^{\mathit{mr}}(t)$, ${\varpi}_{2}(t)={\varpi}_{2}^{\mathit{mr}}(t)$ and $\mathbb{E}\left[\left\{d{N}^{*}(t)-E{\beta}_{a}^{\mathit{dir}}{D}^{*}(t)dt\right\}W\hspace{0.17em}{h}_{j}(E)\right]$.

The proof is similar to that of Theorem 3, by applying Lemma A.2 to the four key functionals:

$$\begin{array}{l}{\varphi}_{j}^{\mathit{mr}}(t)\\ ={\sum}_{e\in \{0,1\}}{h}_{j}(e)\int \mathbb{E}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M=m,X)\\ \times {f}_{M|E,X}(m|E=e,X)\end{array}\right\}d\mu (m),\\ j=1,\hspace{0.17em}2\end{array}$$

and thus *L*(*e*) = *h*_{j}(*e*). Moreover,

$$\begin{array}{l}\mathbb{E}\left[\left\{\begin{array}{c}d{N}^{*}(t)\\ -E({\beta}_{a}^{\mathit{dir}}+{\beta}_{a}^{\mathit{ind}}){D}^{*}(t)dt\end{array}\right\}{f}_{E|X}^{-1}(E|X)\left\{E-\frac{{\varphi}_{1}^{\mathit{mr}}(t)}{{\varphi}_{2}^{\mathit{mr}}(t)}\right\}\right]\\ =\mathbb{E}\left[\left\{\begin{array}{c}d{N}^{*}(t)\\ -E{\beta}_{a}{D}^{*}(t)dt\end{array}\right\}{f}_{E|X}^{-1}(E|X)\left\{E-\frac{{\varphi}_{1}^{\mathit{mr}}(t)}{{\varphi}_{2}^{\mathit{mr}}(t)}\right\}\right]\\ =\int {\sum}_{e\in \{0,1\}}\left\{e-\frac{{\varphi}_{1}^{\mathit{mr}}(t)}{{\varphi}_{2}^{\mathit{mr}}(t)}\right\}\left[\begin{array}{c}\left\{\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {f}_{T|E,M,X}(t|E=e,M=m,X)\end{array}\right\}\\ \times {f}_{M|E,X}(m|E=e,X)d\mu (m)\end{array}\right]\\ -\int {\sum}_{e\in \{0,1\}}e\left(\begin{array}{c}{\beta}_{a}^{\mathit{dir}}\\ +{\beta}_{a}^{\mathit{ind}}\end{array}\right)\left\{\begin{array}{c}e\\ -\frac{{\varphi}_{1}^{\mathit{mr}}(t)}{{\varphi}_{2}^{\mathit{mr}}(t)}\end{array}\right\}\left[\begin{array}{c}\begin{array}{c}{S}_{C|E}(t|E=e)\\ \times {S}_{T|E,M,X}(t|E=e,M=m,X)\end{array}\\ \times {f}_{M|E,X}(m|E=e,X)d\mu (m)\end{array}\right]\end{array}$$

is a difference of two *σ*(*l*)-functionals with, respectively,
$L(e)=\left\{e-\frac{{\varphi}_{1}^{\mathit{mr}}(t)}{{\varphi}_{2}^{\mathit{mr}}(t)}\right\}$ and *O* = *dN**(*t*) for the first functional, and *L*(*e*)

$$=e\left({\beta}_{a}^{\mathit{dir}}+{\beta}_{a}^{\mathit{ind}}\right)\left\{e-\frac{{\varphi}_{1}^{\mathit{mr}}(t)}{{\varphi}_{2}^{\mathit{mr}}(t)}\right\}$$

and *O* = *D**(*t*) for the second functional. This implies that if either
${f}_{E|X}^{\ddagger}={f}_{E|X}$ or
$\left\{{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger}\right\}=\left\{{S}_{T|E,M,X},{f}_{M|E,X}\right\}$, then

$$\begin{array}{l}\mathbb{E}\left\{{P}^{\mathit{mr}}({\beta}_{a}^{\mathit{dir}},{\beta}_{a}^{\mathit{ind}};{S}_{T|E,M,X}^{\u2021},{f}_{M|E,X}^{\u2021},{f}_{E|X}^{\u2021},{S}_{C|E})\right\}\\ =\int {\lambda}_{{T}_{0}}(t){\sum}_{e=0}^{1}{S}_{C|E}(t|E=e){S}_{{T}_{e}}(t)\left\{e-\frac{{\varphi}_{1}^{\mathit{mr}}(t)}{{\varphi}_{2}^{\mathit{mr}}(t)}\right\}dt\\ =\int {\lambda}_{{T}_{0}}(t){\sum}_{e=0}^{1}{S}_{C|E}(t|E=e){S}_{{T}_{e}}(t)\left\{e-\frac{{\sum}_{e=0}^{1}e{S}_{C|E}(t|E=e){S}_{{T}_{e}}(t)}{{\sum}_{e=0}^{1}{S}_{C|E}(t|E=e){S}_{{T}_{e}}(t)}\right\}dt\end{array}$$

= 0, and therefore ${P}^{\mathit{mr}}\left({\beta}_{a}^{\mathit{dir}},{\beta}_{a}^{\mathit{ind}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)$ is an unbiased estimating function under either condition.

The result then follows by noting that ${\beta}_{a}^{\mathit{ind}}={\beta}_{a}-{\beta}_{a}^{\mathit{dir}}$ and ${\beta}_{a}^{\mathit{dir}}$ solves

$$\mathbb{E}\left\{{Z}^{\mathit{mr}}\left({\beta}_{a}^{\mathit{dir}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0$$

which is triply robust by Theorem 5, and thus, ${\beta}_{a}^{\mathit{ind}}$ solves

$\mathbb{E}\left\{{P}^{\mathit{mr}}\left({\beta}_{a}^{\mathit{dir}},{\beta}_{a}^{\mathit{ind}};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X}^{\ddagger},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0$ provided one of the three conditions given in the theorem holds.

We observe that

$$\begin{array}{l}{S}_{{T}_{1,m}|E,X}(t|E=e,X=x)=\\ {S}_{{T}_{1,m}|E,M,X}(t|E=e,M=m,X=x){f}_{M|E,X}(m|E=e,X=x)\\ {+S}_{{T}_{1,m}|E,M,X}(t|E=e,M\ne m,X=x)\{1-{f}_{M|E,X}(m|E=e,X=x)\}\\ =\text{exp}\left\{-{\int}_{0}^{t}({\lambda}_{{T}_{1,m}|E,M,X}(u|E=e,M=m,X=x))du\right\}\\ \times {f}_{M|E,X}(m|E=e,X=x)\\ +\text{exp}\left\{-{\int}_{0}^{t}({\lambda}_{{T}_{1,m}|E,M,X}(u|E=e,M\ne m,X=x)du)\right\}\\ \times \left\{1-{f}_{M|E,X}(m|E=e,X=x)\right\}\\ =\text{exp}\left\{-{\int}_{0}^{t}({\lambda}_{{T}_{1,m}|E,M,X}(u|E=e,M=m,X=x)du)\right\}\\ \times \left[\begin{array}{c}{f}_{M|E,X}(m|E=e,X=x)\\ +\text{exp}\left\{{\int}_{0}^{t}\gamma (u,e,m,x)du\right\}\\ \times \left\{1-{f}_{M|E,X}(m|E=e,X=x)\right\}\end{array}\right]\end{array}$$

Thus, by ignorability of *E*, we obtain

$$\begin{array}{l}\text{exp}\left\{-{\int}_{0}^{t}\left({\lambda}_{{T}_{1,m|E,M,X}}(u|E=0,M=m,X=x)du\right)\right\}\\ =\text{exp}\left\{-{\int}_{0}^{t}\left({\lambda}_{{T}_{1,m}|E,M,X}(u|E=1,M=m,X=x)du\right)\right\}\\ \times \frac{\left[{f}_{M|E,X}(m|E=1,X=x)+\text{exp}\left\{{\int}_{0}^{t}\gamma (u,1,m,x)du\right\}\left\{1-{f}_{M|E,X}(m|E=1,X=x)\right\}\right]}{\left[{f}_{M|E,X}(m|E=0,X=x)+\text{exp}\left\{{\int}_{0}^{t}\gamma (u,0,m,x)du\right\}\left\{1-{f}_{M|E,X}(m|E=0,X=x)\right\}\right]}\\ =\hspace{0.17em}\text{exp}\left\{-{\int}_{0}^{t}\left({\lambda}_{T|E,M,X}(u|E=1,M=m,X=x)du\right)\right\}\\ \times \frac{\left[{f}_{M|E,X}(m|E=1,X=x)+\text{exp}\left\{{\int}_{0}^{t}\gamma (u,1,m,x)du\right\}\left\{1-{f}_{M|E,X}(m|E=1,X=x)\right\}\right]}{\left[{f}_{M|E,X}(m|E=0,X=x)+\text{exp}\left\{{\int}_{0}^{t}\gamma (u,0,m,x)du\right\}\left\{1-{f}_{M|E,X}(m|E=0,X=x)\right\}\right]}\end{array}$$

proving the first result by consistency.

Furthermore, by differentiating with respect to *t*:

$$\begin{array}{l}-{\lambda}_{{T}_{1,m}|E,M,X}(t|E=0,M=m,X=x)\\ \times \hspace{0.17em}\text{exp}\left\{-{\int}_{0}^{t}{\lambda}_{{T}_{1,m}|E,M,X}(u|E=0,M=m,X=x)du\right\}\\ =-{\lambda}_{T|E,M,X}(t|E=1,M=m,X=x)\\ \times \hspace{0.17em}\text{exp}\left\{-{\int}_{0}^{t}{\lambda}_{T|E,M,X}(u|E=1,M=m,X=x)du\right\}\\ \times \delta (t,e,m,x)+\dot{\delta}(t,e,m,x)\times \delta (t,e,m,x)\\ \times \hspace{0.17em}\text{exp}\left\{-{\int}_{0}^{t}{\lambda}_{T|E,M,X}(u|E=1,M=m,X=x)du\right\}\\ \iff -{\lambda}_{{T}_{1,m}|E,M,X}(t|E=0,M=m,X=x)\\ \times \hspace{0.17em}\text{exp}\left\{-{\int}_{0}^{t}{\lambda}_{T|E,M,X}(u|E=1,M=m,X=x)du\right\}\\ \times \delta (t,e,m,x)=-{\lambda}_{T|E,M,X}(t|E=1,M=m,X=x)\\ \times \hspace{0.17em}\text{exp}\left\{-{\int}_{0}^{t}{\lambda}_{T|E,M,X}(u|E=1,M=m,X=x)du\right\}\\ \times \delta (t,e,m,x)+\dot{\delta}(t,e,m,x)\times \delta (t,e,m,x)\\ \times \hspace{0.17em}\text{exp}\left\{-{\int}_{0}^{t}{\lambda}_{T|E,M,X}(u|E=1,M=m,X=x)du\right\}\\ \iff {\lambda}_{{T}_{1,m}|E,M,X}(t|E=0,M=m,X=x)\\ ={\lambda}_{T|E,M,X}(t|E=1,M=m,X=x)-\dot{\delta}(t,e,m,x)\end{array}$$

proving the second part of the Lemma.
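A small numerical sketch can make the two parts of the lemma concrete. It takes *δ*(*t*) to be the ratio of bracketed terms in the first display, reads the dotted quantity as the time derivative of log *δ*(*t*) (as the differentiation step implies), and assumes constant hazards and a time-constant selection-bias function *γ* at each exposure level. These simplifications, and all names and numbers below, are illustrative assumptions only.

```python
import math

def delta(t, f1, f0, g1, g0):
    """Ratio of bracketed terms: numerator uses E=1, denominator uses E=0."""
    num = f1 + math.exp(g1 * t) * (1.0 - f1)
    den = f0 + math.exp(g0 * t) * (1.0 - f0)
    return num / den

def dlog_delta(t, f1, f0, g1, g0):
    """Analytic time derivative of log delta(t) for constant gamma."""
    a = g1 * math.exp(g1 * t) * (1.0 - f1) / (f1 + math.exp(g1 * t) * (1.0 - f1))
    b = g0 * math.exp(g0 * t) * (1.0 - f0) / (f0 + math.exp(g0 * t) * (1.0 - f0))
    return a - b

def surv0(t, lam1, f1, f0, g1, g0):
    """First part of the lemma: survival at E=0 is exp(-lam1 * t) * delta(t)."""
    return math.exp(-lam1 * t) * delta(t, f1, f0, g1, g0)

def hazard0_fd(t, lam1, f1, f0, g1, g0, h=1e-5):
    """Hazard implied by surv0, via a central finite difference of -log S."""
    up = math.log(surv0(t + h, lam1, f1, f0, g1, g0))
    dn = math.log(surv0(t - h, lam1, f1, f0, g1, g0))
    return -(up - dn) / (2.0 * h)

# Hypothetical values: observed hazard 0.3 at E=1, mediator probabilities
# f1 = 0.6 and f0 = 0.4, sensitivity parameters gamma = 0.2 and -0.1.
args = (0.3, 0.6, 0.4, 0.2, -0.1)
t = 1.5
print(hazard0_fd(t, *args), args[0] - dlog_delta(t, *args[1:]))
```

With *γ* set to zero at both exposure levels, *δ*(*t*) equals 1 for all *t* and the two hazards coincide, which corresponds to ignorability of the mediator.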

By Lemma 1 and the assumptions of the theorem,

$$\begin{array}{l}\mathbb{E}\left[{\delta}_{{a}^{*}}(t,E,M,X)\left\{\begin{array}{c}d{N}^{*}(t)\\ -{\dot{\delta}}_{{\alpha}^{*}}(t,E,M,X){D}^{*}(t)dt\end{array}\right\}W\hspace{0.17em}{h}_{j}(E)\right]\\ =\mathbb{E}\left[\left\{\begin{array}{c}{\lambda}_{T|E,M,X}(t|E,M,X)\\ -{\dot{\delta}}_{{\alpha}^{*}}(t,E,M,X)\end{array}\right\}{\delta}_{{\alpha}^{*}}(t,E,M,X){D}^{*}(t)W\hspace{0.17em}{h}_{j}(E)dt\right]\\ =\mathbb{E}\left[\begin{array}{c}\left\{\begin{array}{c}{\lambda}_{T|E,M,X}(t|E,M,X)\\ -{\dot{\delta}}_{{a}^{*}}(t,E,M,X)\end{array}\right\}\\ \begin{array}{c}\times {S}_{C|E}(t|E){S}_{T|E,M,X}(t|E,M,X)\\ \times {\delta}_{{a}^{*}}(t,E,M,X)W\hspace{0.17em}{h}_{j}(E)dt\end{array}\end{array}\right]\\ ={\sum}_{e\in \{0,1\}}{\sum}_{m\in \mathcal{S}}\int {S}_{C|E}(t|e)\mathbb{E}\left[\begin{array}{c}\left\{\begin{array}{c}{\lambda}_{T|E,M,X}(t|e,m,X)\\ -{\dot{\delta}}_{{\alpha}^{*}}(t,e,M,X)\end{array}\right\}\\ \begin{array}{c}\times \left\{\begin{array}{c}{S}_{T|E,M,X}(t|e,m,X)\\ \times {\delta}_{{\alpha}^{*}}(t,e,M,X)\end{array}\right\}\\ {f}_{M|E,X}(m|E=0,X){h}_{j}(e)dt\end{array}\end{array}\right]\\ ={\sum}_{m\in \mathcal{S}}\int {S}_{C|E}(t|e)\mathbb{E}\left[\begin{array}{c}\left\{\begin{array}{c}{\lambda}_{T|E,M,X}(t|1,m,X)\\ -{\dot{\delta}}_{{a}^{*}}(t,1,M,X)\end{array}\right\}\\ \begin{array}{c}\times \left\{\begin{array}{c}{S}_{T|E,M,X}(t|1,m,X)\\ \times {\delta}_{{\alpha}^{*}}(t|1,m,X)\end{array}\right\}\\ {f}_{M|E,X}(m|E=0,X)d\mu (m){h}_{j}(1)dt\end{array}\end{array}\right]+\\ {\sum}_{m\in \mathcal{S}}\int {S}_{C|E}(t|e)\mathbb{E}\left[\begin{array}{c}{\lambda}_{T|E,M,X}(t|0,m,X)\\ \times {S}_{T|E,M,X}(t|0,m,X)\\ \times {f}_{M|E,X}(m|E=0,X){h}_{j}(0)dt\end{array}\right]\\ ={\sum}_{m\in \mathcal{S}}\int {S}_{C|E}(t|e)\mathbb{E}\left[\begin{array}{c}{\lambda}_{{T}_{1,{M}_{0}}|{M}_{0},X}(t|{M}_{0}=m,X)\\ \times {S}_{{T}_{1,{M}_{0}}|{M}_{0},X}(t|{M}_{0}=m,X)\\ \times {f}_{M|E,X}(m|E=0,X){h}_{j}(1)dt\end{array}\right]+\\ {\sum}_{m\in \mathcal{S}}\int 
{S}_{C|E}(t|e)\mathbb{E}\left[\begin{array}{c}{\lambda}_{{T}_{0,{M}_{0}}|{M}_{0},X}(t|{M}_{0}=m,X)\\ \times {S}_{{T}_{0,{M}_{0}}|{M}_{0},X}(t|{M}_{0}=m,X)\\ \times {f}_{M|E,X}(m|E=0,X){h}_{j}(0)dt\end{array}\right]\\ ={\sum}_{e\in \{0,1\}}{S}_{C|E}(t|e){\lambda}_{{T}_{e,{M}_{0}}}(t){S}_{{T}_{e,{M}_{0}}}(t){h}_{j}(e)dt\\ ={\sum}_{e\in \{0,1\}}{S}_{C|E}(t|e){\lambda}_{{T}_{0,{M}_{0}}}(t)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}\right){S}_{{T}_{e,{M}_{0}}}(t){h}_{j}(e)dt\end{array}$$

One can similarly show that

$$\begin{array}{l}\mathbb{E}\left\{{D}^{*}(t)W{h}_{j}(E){\delta}_{{\alpha}^{*}}(t,E,M,X)\hspace{0.17em}\text{exp}\hspace{0.17em}\left({\beta}_{c}^{\mathit{dir}}E\right)\right\}\\ ={\sum}_{e\in \{0,1\}}{S}_{C|E}(t|e)\hspace{0.17em}\text{exp}\hspace{0.17em}\left(e{\beta}_{c}^{\mathit{dir}}\right){S}_{{T}_{e,{M}_{0}}}(t){h}_{j}(e)dt\end{array}$$

which implies the result since

$$\begin{array}{l}\mathbb{E}\left\{{U}^{w}\left({\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)\right\}\\ =\int {\sum}_{e\in \{0,1\}}{S}_{C|E}(t|e){\lambda}_{{T}_{0,}{M}_{0}}(t)\text{exp}\left({\beta}_{c}^{\mathit{dir}}\right){S}_{{T}_{e},{M}_{0}}(t)\\ \times \left\{e-\frac{{\mathrm{\Sigma}}_{{e}^{\prime}\in \left\{0,1\right\}}{S}_{C|E}\left(t|{e}^{\prime}\right)\text{exp}\left({e}^{\prime}{\beta}_{c}^{\mathit{dir}}\right){S}_{{T}_{{e}^{\prime},{M}_{0}}}(t){e}^{\prime}}{{\mathrm{\Sigma}}_{{e}^{\u2033}\in \left\{0,1\right\}}\int {S}_{C|E}\left(t|{e}^{\u2033}\right)\text{exp}\left({e}^{\u2033}{\beta}_{c}^{\mathit{dir}}\right){S}_{{T}_{{e}^{\u2033},{M}_{0}}}(t)dt}\right\}dt\\ =0\end{array}$$

For the case of an additive structural model

$$\begin{array}{l}\int \mathbb{E}\left[W\left\{\begin{array}{c}d{N}^{*}(t)\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right){D}^{*}(t)dt\\ -E{\beta}_{a}^{\mathit{dir}}{D}^{*}(t)dt\end{array}\right\}{\delta}_{{\alpha}^{*}}\left(t,E,M,X\right){h}_{j}\left(E\right)\right]\\ =\mathbb{E}\left[\left\{\begin{array}{c}{\lambda}_{T|E,M,X}\left(t|E,M,X\right)\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right)\\ -E{\beta}_{a}^{\mathit{dir}}\end{array}\right\}{\delta}_{{\alpha}^{*}}\left(t,E,M,X\right){D}^{*}(t)W{h}_{j}\left(E\right)dt\right]\\ =\mathbb{E}\left[\left\{\begin{array}{c}{\lambda}_{T|E,M,X}\left(t|E,M,X\right)\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right)\\ -E{\beta}_{a}^{\mathit{dir}}\end{array}\right\}{\delta}_{{\alpha}^{*}}\left(t,E,M,X\right){D}^{*}(t)W{h}_{j}\left(E\right)dt\right]\\ =\mathbb{E}\left[\begin{array}{c}\left\{\begin{array}{c}{\lambda}_{T|E,M,X}\left(t|E,M,X\right)\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right)\\ -E{\beta}_{a}^{\mathit{dir}}\end{array}\right\}\\ \times {S}_{C|E}\left(t|E\right){S}_{T|E,M,X}\left(t|E,M,X\right){\delta}_{{\alpha}^{*}}\left(t,E,M,X\right)W{h}_{j}\left(E\right)dt\end{array}\right]\\ ={\sum}_{e\in \left\{0,1\right\}}{\sum}_{m\in \mathcal{S}}\int {S}_{C|E}\left(t|e\right)\mathbb{E}\left[\begin{array}{c}\left\{\begin{array}{c}{\lambda}_{T|E,M,X}\left(t|e,m,X\right)\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,e,M,X\right)-e{\beta}_{a}^{\mathit{dir}}\end{array}\right\}\\ \times \left\{\begin{array}{c}{S}_{T|E,M,X}\left(t|e,m,X\right)\\ \times {\delta}_{{\alpha}^{*}}\left(t,e,M,X\right)\end{array}\right\}\\ \times {f}_{M|E,X}\left(m|E=0,X\right){h}_{j}\left(e\right)dt\end{array}\right]\\ ={\sum}_{m\in \mathcal{S}}\int {S}_{C|E}\left(t|0\right)\mathbb{E}\left[\begin{array}{c}{\lambda}_{T|E,M,X}\left(t|0,m,X\right)\left\{{S}_{T|E,M,X}\left(t|0,m,X\right)\right\}\\ \times {f}_{M|E,X}\left(m|E=0,X\right){h}_{j}\left(0\right)dt\end{array}\right]\\ +{\sum}_{m\in \mathcal{S}}\int 
{S}_{C|E}\left(t|1\right)\mathbb{E}\left[\begin{array}{c}\left\{\begin{array}{c}{\lambda}_{T|E,M,X}\left(t|1,m,X\right)\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,1,M,X\right)-{\beta}_{a}^{\mathit{dir}}\end{array}\right\}\\ \times \left\{\begin{array}{c}{S}_{T|E,M,X}\left(t|1,m,X\right)\\ \times {\delta}_{{\alpha}^{*}}\left(t|1,M,X\right)\end{array}\right\}\\ \times {f}_{M|E,X}\left(m|E=0,X\right){h}_{j}\left(1\right)dt\end{array}\right]\\ ={\sum}_{m\in \mathcal{S}}\int {S}_{C|E}\left(t|0\right)\mathbb{E}\left[\begin{array}{c}{\lambda}_{{T}_{0,{M}_{0}}|{M}_{0},X}\left(t|{M}_{0}=m,X\right)\\ \times {S}_{{T}_{0,{M}_{0}}|{M}_{0},X}\left(t|{M}_{0}=m,X\right)\\ \times {f}_{M|E,X}\left(m|E=0,X\right)d\mu \left(m\right){h}_{j}\left(0\right)dt\end{array}\right]\\ +{\sum}_{m\in \mathcal{S}}\int {S}_{C|E}\left(t|1\right)\mathbb{E}\left[\begin{array}{c}{\lambda}_{{T}_{1,{M}_{0}}|{M}_{0},X}\left(t|{M}_{0}=m,X\right)-{\beta}_{a}^{\mathit{dir}}\\ \times \left\{{S}_{{T}_{1},{M}_{0}|{M}_{0},X}\left(t|{M}_{0}=m,X\right)\right\}\\ \times {f}_{M|E,X}\left(m|E=0,X\right){h}_{j}\left(1\right)dt\end{array}\right]\\ ={S}_{C|E}\left(t|0\right)\left[{\lambda}_{{T}_{0,{M}_{0}}}(t){S}_{{T}_{0,{M}_{0}}}(t){h}_{j}\left(0\right)dt\right]\\ +\int {S}_{C|E}\left(t|1\right)\left\{{\lambda}_{{T}_{1,{M}_{0}}}(t)-{\beta}_{a}^{\mathit{dir}}\right\}{S}_{{T}_{1,{M}_{0}}}(t){h}_{j}\left(1\right)dt\\ ={\lambda}_{{T}_{0,{M}_{0}}}(t)\{{S}_{C|E}\left(t|0\right){S}_{{T}_{0,{M}_{0}}}(t){h}_{j}\left(0\right)dt\\ +\int {S}_{C|E}\left(t|1\right){S}_{{T}_{1,{M}_{0}}}(t){h}_{j}\left(1\right)dt\}\\ ={\lambda}_{{T}_{0,{M}_{0}}}(t){\sum}_{e\in \left\{0,1\right\}}{S}_{C|E}\left(t|e\right){S}_{{T}_{e,{M}_{0}}}(t){h}_{j}\left(e\right)dt\end{array}$$

One can similarly show that

$$\mathbb{E}\left\{{D}^{*}(t)W{h}_{j}\left(E\right){\delta}_{{\alpha}^{*}}\left(t,E,M,X\right)\right\}={\sum}_{e\in \left\{0,1\right\}}{S}_{C|E}\left(t|e\right){S}_{{T}_{e,{M}_{0}}}(t){h}_{j}\left(e\right)dt$$

which gives the result.

Let

$$\begin{array}{l}{R}^{*}\left(t,H;{\beta}_{c}^{\mathit{dir}}\right)={R}^{*}\left(t,H;{\beta}_{c}^{\mathit{dir}},{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E,M,X}\right)\hfill \\ =\left\{{D}^{*}(t)-{S}_{C|E,M,X}\left(t|E,M,X\right){S}_{T|E,M,X}\left(t|E,M,X\right)\right\}{W}^{*}h\left(E\right)\text{exp}\left({\beta}_{c}^{\mathit{dir}}E\right)\hfill \\ +\left\{\begin{array}{c}{\sum}_{e}\int {S}_{T|E,M,X}\left(t|E=e,M=m,X\right)\\ {f}_{M|E,X}\left(m|E=0,X\right)h\left(e\right)\text{exp}\left({\beta}_{c}^{\mathit{dir}}e\right)d\mu \left(m\right)\end{array}\right\}\hfill \\ +\frac{I\left(E=0\right)}{f\left(E|X\right)}{\sum}_{e}{S}_{T|E,M,X}\left(t|E=e,M,X\right)h\left(e\right)\text{exp}\left({\beta}_{c}^{\mathit{dir}}e\right)\hfill \\ -\frac{I\left(E=0\right)}{f\left(E|X\right)}\left[\begin{array}{c}{\sum}_{e}\int {S}_{T|E,M,X}\left(t|E=e,M=m,X\right)\\ {f}_{M|E,X}\left(m|E=0,X\right)h\left(e\right)\text{exp}\left({\beta}_{c}^{\mathit{dir}}e\right)d\mu \left(m\right)\end{array}\right]\hfill \end{array}$$

and let
${\xi}_{j}^{\mathit{mr}}\left(t;{\beta}_{c}^{\mathit{dir}}\right)={\xi}_{j}^{\mathit{mr},*}\left(t;{\beta}_{c}^{\mathit{dir}},{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E,M,X}\right)=\mathbb{E}\left\{{R}^{*}\left(t,{H}_{j};{\beta}_{c}^{\mathit{dir}}\right)\right\}$, *j* = 1, 2.

Then,
${\xi}_{j}^{\mathit{mr}*}\left(t;{\beta}_{c}^{\mathit{dir}}\right)={\xi}_{j}^{*}\left(t;{\beta}_{c}^{\mathit{dir}}\right)$, *j* = 1, 2, and in fact,
${\xi}_{j}^{\mathit{mr}*\dagger}\left(t;{\beta}_{c}^{\mathit{dir}}\right)=\mathbb{E}\left\{{R}^{*\dagger}\left(t,{H}_{j};{\beta}_{c}^{\mathit{dir}}\right)\right\}={\xi}_{j}\left(t;{\beta}_{c}^{\mathit{dir}}\right)$ provided that
${S}_{C|E,M,X}^{\dagger}={S}_{C|E,M,X}$ and at least one of the following three conditions holds: either
$\left\{{S}_{T|E,M,X}^{\dagger},{f}_{M|E,X}^{\dagger}\right\}=\left\{{S}_{T|E,M,X},{f}_{M|E,X}\right\}$ or
$\left\{{S}_{T|E,M,X}^{\dagger},{f}_{E|X}^{\dagger}\right\}=\left\{{S}_{T|E,M,X},{f}_{E|X}\right\}$, or
$\left\{{f}_{M|E,X}^{\dagger},{f}_{E|X}^{\dagger}\right\}=\left\{{f}_{M|E,X},{f}_{E|X}\right\}$. One may use this result to establish the following theorem:

*Theorem A.1: Under the consistency, sequential ignorability and positivity assumptions*,
${U}^{\mathit{mr}*}\left({\beta}_{c}^{\mathit{dir}}\right)={U}^{\mathit{mr}*}\left({\beta}_{c}^{\mathit{dir}};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E,M,X}\right)$ *is an unbiased estimating function for*
${\beta}_{c}^{\mathit{dir}}$*, where*

$$\begin{array}{l}{U}^{\mathit{mr}*}\left({\beta}_{c}^{\mathit{dir}}\right)\\ =\int \left\{\begin{array}{c}d{N}^{*}(t)\\ +{S}_{C|E}\left(t|E\right)\\ \times d{S}_{T|E,M,X}\left(t|E,M,X\right)\end{array}\right\}{W}^{*}\left\{E-\frac{{\xi}_{1}^{\mathit{mr}}\left(t;{\beta}_{c}^{\mathit{dir}}\right)}{{\xi}_{2}^{\mathit{mr}}(t)}\right\}\\ -\iint {\sum}_{e\in \left\{0,1\right\}}\left[\begin{array}{c}d{S}_{T|E,M,X}\left(t|E=e,m,X\right)\\ {f}_{M|E,X}\left(m|E=0,X\right)\\ \times \left\{e-\frac{{\xi}_{1}^{\mathit{mr}}\left(t;{\beta}_{c}^{\mathit{dir}}\right)}{{\xi}_{2}^{\mathit{mr}}(t)}\right\}\end{array}\right]d\mu \left(m\right)\\ -\frac{I\left(E=0\right)}{{f}_{E|X}\left(E|X\right)}\int {\sum}_{e\in \left\{0,1\right\}}\left[\begin{array}{c}\left\{d{S}_{T|E,M,X}\left(t|E=e,M,X\right)\right\}\\ \times \left\{e-\frac{{\xi}_{1}^{\mathit{mr}}\left(t;{\beta}_{c}^{\mathit{dir}}\right)}{{\xi}_{2}^{\mathit{mr}}\left(t;{\beta}_{c}^{\mathit{dir}}\right)}\right\}\end{array}\right]\\ +\frac{I\left(E=0\right)}{{f}_{E|X}\left(E|X\right)}\iint {\mathrm{\Sigma}}_{e\in \left\{0,1\right\}}\left[\begin{array}{c}\left\{\begin{array}{c}d{S}_{T|E,M,X}\left(t|E=e,M=m,X\right)\\ \times {f}_{M|E,X}\left(m|E=0,X\right)\end{array}\right\}\\ \times \left\{e-\frac{{\xi}_{1}^{\mathit{mr}}\left(t;{\beta}_{c}^{\mathit{dir}}\right)}{{\xi}_{2}^{\mathit{mr}}\left(t;{\beta}_{c}^{\mathit{dir}}\right)}\right\}\end{array}\right]d\mu \left(m\right)\end{array}$$

*Furthermore,*

$$\mathbb{E}\left\{{U}^{\mathit{mr}*}\left({\beta}_{c}^{\mathit{dir}};{S}_{T|E,M,X}^{\dagger},{f}_{M|E,X}^{\dagger},{f}_{E|X}^{\dagger},{S}_{C|E,M,X}^{\dagger}\right)\right\}=0\qquad (16)$$

*if*
${S}_{C|E,M,X}^{\dagger}={S}_{C|E,M,X}$ *and at least one, but not necessarily all three, of the following conditions holds: either*
$\left\{{S}_{T|E,M,X}^{\dagger},{f}_{M|E,X}^{\dagger}\right\}=\left\{{S}_{T|E,M,X},{f}_{M|E,X}\right\}$ *or*
$\left\{{S}_{T|E,M,X}^{\dagger},{f}_{E|X}^{\dagger}\right\}=\left\{{S}_{T|E,M,X},{f}_{E|X}\right\}$, *or*
$\left\{{f}_{M|E,X}^{\dagger},{f}_{E|X}^{\dagger}\right\}=\left\{{f}_{M|E,X},{f}_{E|X}\right\}$.

The proof of this theorem is similar to that of Theorem 2.

We propose a sensitivity analysis that is doubly robust under the Cox PH model. For each fixed *α*=*α**, consider the following modified estimating function for
${\beta}_{c}^{\mathit{dir}}$:

$$\begin{array}{l}{U}^{w,dr}\left({\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)={U}^{w,dr}\left({\beta}_{c}^{\mathit{dir}},{\alpha}^{*};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E}\right)\\ =\int \{\begin{array}{c}{\delta}_{{\alpha}^{*}}\left(t,E,M,X\right)\left\{\begin{array}{c}d{N}^{*}(t)\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right){D}^{*}(t)dt\\ -E{\beta}_{a}^{\mathit{dir}}{D}^{*}(t){\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right)dt\end{array}\right\}W\\ \times \left\{E-\frac{{\chi}_{1}\left(t;{\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)}{{\chi}_{2}\left(t;{\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)}\right\}\end{array}\\ \times \begin{array}{c}-{\delta}_{{\alpha}^{*}}\left(t,E,M,X\right){S}_{C|E}\left(t|E\right)\\ \left\{\begin{array}{c}{f}_{T|E,M,X}\left(t|E,M,X\right)dt\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right){S}_{T|E,M,X}\left(t|E,M,X\right)dt\\ -E{\beta}_{a}^{\mathit{dir}}{S}_{T|E,M,X}\left(t|E,M,X\right)dt{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right)dt\end{array}\right\}W\end{array}\}\\ \times \left\{E-\frac{{\chi}_{1}\left(t;{\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)}{{\chi}_{2}\left(t;{\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)}\right\}\\ \times \left\{\begin{array}{c}+{\sum}_{m\in \mathcal{S}}{\sum}_{e\in \left\{0,1\right\}}{\delta}_{{\alpha}^{*}}\left(t,e,m,X\right){S}_{C|E}\left(t|e\right)\\ {f}_{T|E,M,X}\left(t|e,m,X\right)dt\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,e,m,X\right){S}_{T|E,M,X}\left(t|e,m,X\right)dt\\ -e{\beta}_{a}^{\mathit{dir}}{\dot{\delta}}_{{\alpha}^{*}}\left(t,e,m,X\right){S}_{T|E,M,X}\left(t|e,m,X\right)dt\\ \times \left\{e-\frac{{\chi}_{1}\left(t;{\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)}{{\chi}_{2}\left(t;{\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)}{f}_{M|E,X}\left(m|E=0,X\right)\right\}\end{array}\right\}\end{array}$$

with

$$\begin{array}{l}{\chi}_{j}\left(t;{\beta}_{c}^{\mathit{dir}},{\alpha}^{*}\right)=\\ \mathbb{E}\left\{\begin{array}{c}\left(\begin{array}{c}{D}^{*}(t)\\ -{S}_{C|E}\left(t|E\right)\\ \times {S}_{T|E,M,X}\left(t|E,M,X\right)dt\end{array}\right)\\ \times W{h}_{j}\left(E\right){\delta}_{{\alpha}^{*}}\left(t,E,M,X\right)\text{exp}\left({\beta}_{c}^{\mathit{dir}}E\right)\end{array}\right\}\\ +{\sum}_{m\in \mathcal{S}}{\sum}_{e\in \left\{0,1\right\}}\left({S}_{C|E}\left(t|e\right){S}_{T|E,M,X}\left(t|e,m,X\right)dt\right)\\ \times {f}_{M|E,X}\left(m|E=0,X\right){h}_{j}\left(e\right){\delta}_{{\alpha}^{*}}\left(t,e,m,X\right)\text{exp}\left({\beta}_{c}^{\mathit{dir}}e\right)\end{array}$$

One can then easily verify that ${U}^{w,dr}\left({\beta}_{c}^{\mathit{dir}},{\alpha}^{*};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E}\right)$ is doubly robust in the sense that

$\mathbb{E}\left\{{U}^{w,dr}\left({\beta}_{c}^{\mathit{dir}},{\alpha}^{*};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0$ if either ${S}_{T|E,M,X}^{\ddagger}={S}_{T|E,M,X}$ or ${f}_{E|X}^{\ddagger}={f}_{E|X}$.

For the additive hazards model, we propose to use the following modified estimating function of ${\beta}_{a}^{\mathit{dir}}$:

$$\begin{array}{l}{Z}^{w,dr}\left({\beta}_{a}^{\mathit{dir}},{\alpha}^{*}\right)\\ ={Z}^{w,dr}\left({\beta}_{a}^{\mathit{dir}},{\alpha}^{*};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E}\right)\\ =\int \{\begin{array}{c}{\delta}_{{\alpha}^{*}}\left(t,E,M,X\right)\\ \times \left\{\begin{array}{c}d{N}^{*}(t)\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right){D}^{*}(t)dt\end{array}\right\}W\\ \times \left\{E-\frac{{\zeta}_{1}\left(t,{\alpha}^{*}\right)}{{\zeta}_{2}\left(t;{\alpha}^{*}\right)}\right\}\end{array}\\ \begin{array}{c}-{\delta}_{{\alpha}^{*}}\left(t,E,M,X\right){S}_{C|E}\left(t|E\right)\\ \times \left\{\begin{array}{c}{f}_{T|E,M,X}\left(t|E,M,X\right)dt\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,E,M,X\right){S}_{T|E,M,X}\left(t|E,M,X\right)dt\end{array}\right\}W\end{array}\}\\ \times \left\{E-\frac{{\zeta}_{1}\left(t,{\alpha}^{*}\right)}{{\zeta}_{2}\left(t;{\alpha}^{*}\right)}\right\}\\ \hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}+{\sum}_{m\in \mathcal{S}}{\sum}_{e\in \left\{0,1\right\}}{\delta}_{{\alpha}^{*}}\left(t,e,m,X\right){S}_{C|E}\left(t|e\right)\\ \times \left\{\begin{array}{c}{f}_{T|E,M,X}\left(t|e,m,X\right)dt\\ -{\dot{\delta}}_{{\alpha}^{*}}\left(t,e,m,X\right){S}_{T|E,M,X}\left(t|e,m,X\right)dt\\ \times \left\{e-\frac{{\zeta}_{1}\left(t,{\alpha}^{*}\right)}{{\zeta}_{2}\left(t;{\alpha}^{*}\right)}\right\}{f}_{M|E,X}\left(m|E=0,X\right)\end{array}\right\}\end{array}$$

with

$$\begin{array}{l}{\zeta}_{j}\left(t,{\alpha}^{*}\right)=\\ \mathbb{E}\left\{\begin{array}{c}\left(\begin{array}{c}{D}^{*}(t)\\ -{S}_{C|E}\left(t|E\right){S}_{T|E,M,X}\left(t|E,M,X\right)dt\end{array}\right)\\ \times W{h}_{j}\left(E\right){\delta}_{{\alpha}^{*}}\left(t,E,M,X\right)\end{array}\right\}\\ +{\sum}_{m\in \mathcal{S}}{\sum}_{e\in \left\{0,1\right\}}\left({S}_{C|E}\left(t|e\right){S}_{T|E,M,X}\left(t|e,m,X\right)dt\right)\\ \times {f}_{M|E,X}\left(m|E=0,X\right){h}_{j}\left(e\right){\delta}_{{\alpha}^{*}}\left(t,e,m,X\right)\end{array}$$

One can easily verify that ${Z}^{w,dr}\left({\beta}_{a}^{\mathit{dir}},{\alpha}^{*};{S}_{T|E,M,X},{f}_{M|E,X},{f}_{E|X},{S}_{C|E}\right)$ is doubly robust in the sense that

$\mathbb{E}\left\{{Z}^{w,dr}\left({\beta}_{a}^{\mathit{dir}},{\alpha}^{*};{S}_{T|E,M,X}^{\ddagger},{f}_{M|E,X},{f}_{E|X}^{\ddagger},{S}_{C|E}\right)\right\}=0$ if either ${S}_{T|E,M,X}^{\ddagger}={S}_{T|E,M,X}$ or ${f}_{E|X}^{\ddagger}={f}_{E|X}$.

Articles from The International Journal of Biostatistics are provided here courtesy of **Berkeley Electronic Press**
