Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Stat Med. Author manuscript; available in PMC 2017 December 20.
Published in final edited form as:
Stat Med. 2016 December 20; 35(29): 5430–5447.
Published online 2016 August 16. doi:  10.1002/sim.7069
PMCID: PMC5118169

Instrumental variable analysis of multiplicative models with potentially invalid instruments


Instrumental variable (IV) methods have potential to consistently estimate the causal effect of an exposure on an outcome in the presence of unmeasured confounding. However, validity of IV methods relies on strong assumptions, some of which cannot be conclusively verified from observational data. One such assumption is that the effect of the proposed instrument on the outcome is completely mediated by the exposure. We consider the situation where this assumption is violated, but the remaining IV assumptions hold; that is, the proposed IV 1) is associated with the exposure and 2) has no unmeasured causes in common with the outcome. We propose a method to estimate multiplicative structural mean models of binary outcomes in this scenario in the presence of unmeasured confounding. We also extend the method to address multiple extensions, including mediation analysis. The method adapts the asymptotically efficient G-estimation approach that was previously proposed for additive structural mean models, and it can be carried out using off-the-shelf software for generalized method of moments. Monte Carlo simulation studies show that the method has low bias and accurate coverage. We applied the methods to a case study of circulating vitamin D and depressive symptoms using season of blood collection as a (potentially invalid) instrumental variable. Potential applications of the proposed method include randomized intervention studies as well as Mendelian randomization studies with genetic variants that affect multiple phenotypes, possibly including the outcome.

Keywords: Causal inference, generalized method of moments, instrumental variables, structural mean models

1. Introduction

Instrumental variable (IV) methods have gained considerable traction in observational epidemiology over the past decade [1]. This attention is due to the potential of IVs to overcome bias from unmeasured confounding. However, researchers are wary of IV methods because 1) their validity depends on strong unverifiable assumptions, and 2) violating IV assumptions can lead to larger bias than that due to unmeasured confounding [1, 2, 3, 4, 5].

IV assumptions can be conveyed using directed acyclic graphs (DAGs). Figure 1A is a proto-typical DAG encoding the three IV assumptions (possibly after conditioning on a set of measured covariates C) [5]:

Figure 1
Directed acyclic graphs conveying IV Assumptions 1–3 (A) and violation of IV Assumption 3 (B). Letters in parentheses denote variables. [rightwards dashed arrow] indicates possible satisfaction of assumptions conditioned on C.
  • IV Assumption 1: There are no unmeasured common causes of proposed instrument Z and outcome Y,
  • IV Assumption 2: Proposed instrument Z is associated with exposure X, and
  • IV Assumption 3: Effect of the proposed instrument Z on outcome Y is completely mediated by exposure X.

Although Figure 1 A shows Z having a causal effect on X, IV Assumption 2 also holds when Z and X are associated because they share an unmeasured common cause, as shown in [5]. When these assumptions hold, unbiased estimation of the causal effect of exposure on the outcome may be possible in the presence of unmeasured exposure-outcome confounders. Recent work has advocated for critical evaluation of IV assumptions [2, 3] owing to the potential for huge bias when IV assumptions are violated. In this paper, we describe statistical methods to overcome violation of IV Assumption 3, as shown in the DAG in Figure 1B. Once again, Z need not cause X, but an unmeasured common cause of Z and X must not also cause Y to ensure that IV Assumption 1 holds.

Methods to estimate additive linear structural models, typically of continuous outcomes, have been proposed when IV Assumptions 1 and 2 hold, but IV Assumption 3 may be violated [6, 7, 8, 9]. Most of these proposed methods [6, 8, 9] were motivated by mediation analysis, often in the context of randomized trials, rather than IV analysis. However, Joffe et al [7] considered the scenario where the proposed instrument is completely mediated by measured variables, including the exposure. The authors’ [6, 7, 8, 9] proposed estimation techniques include G-estimation and two-stage least squares. However, none of these models and methods explicitly addressed IV estimation of multiplicative structural models with binary outcomes when IV Assumption 3 is violated.

In this paper; we propose, evaluate, and illustrate a method to perform IV estimation of multiplicative structural mean models when IV Assumption 3 is possibly violated. The paper is organized as follows: Section 2 describes the models, assumptions, and IV estimators. Section 3 is a Monte Carlo simulation study to examine the finite-sample performance of the estimators. Section 4 is a case study illustrating the estimators when estimating the causal effect of circulating vitamin D on depressive symptoms in an observational cohort study of older adults using season of blood collection as the (possibly invalid) proposed instrumental variable. Lastly, Section 5 summarizes the findings and contextualizes them in the current literature.

2. Proposed Method

2.1. Data and Definitions

Consider a study of n independent observations. Let Xi and Yi denote the observed exposure and binary outcome, respectively, of the ith observation, where the objective of the study is to estimate the causal effect of of X on Y. Let Ci denote a set of baseline covariates. Let Ui denote unmeasured XY confounders in the sense that if Ui were observed, then adjustment for {Ui, Ci} would be sufficient to adjust for confounding. Lastly, let Zi denote a (possibly invalid) instrumental variable. Given that Z may have a direct effect on Y, we consider counterfactuals of Z and X, Yi (Zi = z, Xi = x). These counterfactuals denote the outcome that would have been observed if, possibly counter to fact, Zi were set to z and Xi were set to x. We propose a method to estimate a conditional causal risk ratio (CRR) of the form:


That is, the causal risk ratio comparing X set to x with X set to 0, when Z is set to z, conditioned on observed values of Z, X, and C. The method also estimates


the causal risk ratio comparing Z set to z with Z set to 0, when X is set to x, conditioned on observed values of Z, X, and C.

Below, we describe the model, identifying assumptions, and estimation of these causal risk ratios.

2.2. Model

We posit the following log-linear structural mean model (SMM) [10] relating Z and X to Y :


Under this model, exp{ψxx} is CRRz(x|Zi = z, Xi = x, Ci). Model (1) presumes no structural interaction between Z and X; therefore CRRz(x|Zi = z, Xi = x, Ci} is presumed constant in z. We consider extensions to relax this modeling assumption below. Furthermore, under a certain set of untestable “no effect modification” assumptions analogous to those in [5], exp{ψx} may also have a population causal risk ratio interpretation (see Appendix A). Similarly, exp{ψzz} is CRRx(z|Zi = z, Xi = x, Ci), which is presumed constant in x under Model (1).

2.3. Identifying Assumptions

Multiple assumptions are needed to identify the parameters ψ = {ψz, ψx} in Equation (1). First, we make the stable unit treatment value assumption (Identifying Assumption 1), which means that the levels of the proposed instrument and exposure for one participant do not influence the outcomes of others and that there are not multiple versions of the proposed instrument or exposure. This identifying assumption allows us to denote potential outcomes with scalar indices Yi(z, x) rather than vector indices Yi(z, x). See [11] for a review.

Second, we assume consistency (Identifying Assumption 2), which allows us to link observables with counterfactuals. That is, Yi = Yi (z, x) if Zi = z and Xi = x. Under the consistency assumption, we can rewrite the left hand side of Model (1) as log{E[Yi|Zi = z, Xi = x, Ci]}.

Third, we assume conditional mean independence of Y(z, 0) on Z, possibly only after conditioning on C (Identifying Assumption 3). This assumption is implied by the stronger IV Assumption 1 and can be formally written as


Fourth, we assume that Z is associated with X conditional on C (Identifying Assumption 4; i.e., IV Assumption 2).


This assumption may hold either because Z causally affects X (as shown in Figure 1 ) or because Z is associated with some unmeasured variable Z* that affects X as described in [5]. However, if the latter is true, Z* must not affect outcomes to avoid violating Equation (2).

Fifth, we assume correct model specification (Identifying Assumption 5). For example, Model (1) specifies no structural interaction between Z, X, or any component of C. Results will be biased if fitting Model (1) when these interactions exist. We consider extensions to the model below to allow for possible structural interactions. Furthermore, when Z or X are continuous, the functional form of the effect of these variables on outcomes must be correctly specified.

Sixth, we assume that there exists a set of functions d(Z, C) with mean 0 of length dim(ψ)with the form g(Z, C) − E[g(Z, C)|C] that are linearly independent, i.e., rank (d) = dim(ψ) (Identifying Assumption 6). The purpose of this assumption is to satisfy a technical criterion, described in further detail below, for achieving unique identifying estimating equations of each parameter of ψ.

2.4. Estimation

We can estimate ψ by solving the following set of G-estimation equations:


where dit(Zi,Ci)={di1(Zi,Ci),di2(Zi,Ci)} is a set of weights with mean 0 of the form g(Zi, Ci) − E[g(Zi, Ci)|Ci] for some function g. The term qi(Ci) is an arbitrary function of Ci. Solving Equation (4) minimizes the objective function


and it performs G-estimation by finding the value of ψ that makes Yiexp{−ψzZiψxXi}, the predicted “treatment-free” counterfactuals Yi(0, 0), independent of Z (Identifying Assumption 3, implied by IV Assumption 1). As shown by Robins [10], the functions g and q affect efficiency of estimation, but not consistency. The optimal choice for qi in terms of asymptotic efficiency is E[Yi exp{−ψzZiψxXi}|Ci] (see [10] and Appendices B and C). Motivated by the nomenclature of Joffe et al [7], we refer to the solution to Equation (4) as the extended instrumental variables (EIV) estimator.

The estimating equation in Equation (4) is the multiplicative analog to the additive model proposed by Ten Have et al. [6] for mediation analysis. In additive models, the optimal choice for d is


Satisfying Identifying Assumption 4 (3) (i.e., IV Assumption 2), ensures that the second component of Equation (5) is not trivially zero. When Z is binary, Equation (5) simplifies to −(ZiE[Zi|Ci]){1, (E[Xi|Zi = 1, Ci] − E[Xi|Zi = 0, Ci])}t. The quantity E[Xi|Zi = 1, Ci] − E[Xi|Zi = 0, Ci] has been called the “compliance score,” a term motivated by the context of clinical trials, where Z is the randomized treatment assignment and X is the treatment received [12, 13]. To satisfy Identifying Assumption 6 and identify ψ using compliance-score weights, the compliance score must not be constant in C to ensure that the two estimating equations in Equation (4) are linearly independent. Thus, when Z is binary, interactions between Z and at least one component of C are necessary for estimating ψ using compliance-score weights. Others have noted that this set-up is tantamount to using Z × C interactions as instrumental variables and including a Z × CX pathway in Figure 1B [8, 9]. However, when Z is continuous, Identifying Assumption 6 may be satisfied with empty C so long as non-linear transformations of Z relate to X. Although the weights in Equation (5) have a satisfying interpretation, they are not asymptotically efficient for multiplicative models.

Using the unconfoundedness of Z and the arguments in Robins [10], we demonstrate in Appendix B and Appendix C that the optimal d are




Verifying Identifying Assumption 6 using optimal weights in Equation (6) is more difficult than when using the weights in Equation (5) because, in general, there is little simplification even when Z and X are both binary and C is a binary scalar; more importantly, Equation (6) is a function of unknown parameters ψ. Thus, as a practical approach, researchers can assess the plausibility of satisfying Identifying Assumption 6 by computing the rank of d(Z, C) over a grid of values for ψ.

Denote Yψ(0,0)=Yexp{-ψzZ-ψxX}, the predicted treatment-free counterfactuals for some candidate ψ. Then the estimating equations (4) are unbiased when ψ* equals the true value ψ0:


The third equality follows from identifying Assumption 5 (correct model specification), the fourth equality follows from Identifying Assumption 3 (conditional mean independence, implied by IV Assumption 1), and the last equality follows from the fact that d(Z, C) has mean 0 .

The estimates of ψ, denoted [psi], can be computed using generalized method of moments (GMM) with moment conditions (6) [14]. We provide sample code for R software [15] in Appendix D using the gmm() function in the gmm package for estimation [16, 17].

Asymptotic standard errors can be computed from the square-root of the diagonal of the sandwich estimator for the variance-covariance matrix,


2.5. Extensions and Special Cases

We consider some extensions to the model as well as some relevant special cases that lead to simplified estimation. The extensions are motivated by those that have been considered for additive models with continuous outcomes [7, 8, 9]. Where relevant, we present the weights d that would be optimal for additive models when Z is binary and that are analogous to compliance-score weights. We refer the reader to Appendix C for a general approach to compute optimal weights for multiplicative models.

2.5.1. Structural Interactions

Thus far, we have specified causal effects that are additive on the log scale. This constraint can be relaxed by modeling structural interactions. We first consider interactions between the (possibly invalid) instrument Z and exposure X as in the model


To implement a weighting scheme analogous to compliance-score weighting in additive models with binary Z [9], the weights would be


One caveat about these weights with binary Z is that two covariates of C must modify the effect of Z on X to ensure that the estimating equations are linearly independent. Optimal weights for multiplicative models can be found using the formulas in Appendix C.

Next we consider an interaction between between Z and baseline covariates C as in the model


In this case, the weighting scheme analogous to the optimal additive-model weights with binary Z is


To prevent linear dependence when Z is binary, the compliance scores E[Xi | Zi = 1,Ci] − E[Xi | Zi = 0,Ci] must involve higher-order terms or interactions. Note that this criterion cannot be satisfied with a single binary covariate. See Zheng and Zhou [9] for additional guidance. See Appendix C for a general approach for finding optimal weights.

Lastly, we consider an interaction between X and C as in the model


In this case, the weighting scheme analogous to the optimal additive-model weights when Z is binary is


Once again, when Z is binary, linear independence cannot be achieved with a single binary covariate [9], and optimal weights for multiplicative models can be found by applying the formulas in Appendix C.

2.5.2. Mediation Analysis

The proposed model and EIV estimation can also be extended to perform mediation analysis. Consider a variable M that may mediate the effect of X on Y. We now consider counterfactuals of the form Y (Z = z,X = x,M = m). Unmeasured MY confounders may be of concern as shown in Figure 2A; and if

Figure 2
Directed acyclic graphs conveying IV Assumptions 1–2 (violation of IV Assumption 3) and mediation with (A) and without (B) both unmeasured M-Y confounding and a direct effect of Z on Y. Letters in parentheses denote variables. [rightwards dashed arrow] indicates ...


then the unconfoundedness of Z can be further leveraged. In particular, consider the model


where Mi is the observed value of the potential mediator for the ith observation. In this case, exp{ψx} is interpreted as a controlled direct effect of X on Y with M set to 0 [18]. The weighting scheme analogous to the optimal additive-model weights when Z is binary is


To identify ψ, the functional form of E[Xi | Zi,Ci] must be linearly independent of the functional form of E[Mi | Zi,Ci]. Appendix C can be used to find the optimal multiplicative-model weights.

Joffe et al [7] considered the special case of additive models where X and M are assumed to completely mediate the effect of Z on Y. In this case, we consider counterfactuals of the form Y (X = x,M = m). Adapting the authors’ model [7] to multiplicative models results in


G-estimation can be carried out with compliance-score weights


Joffe et al [7] also proposed an alternative two-stage least squares estimator for additive models. Once again, optimal weights for multiplicative models can be derived using the formulation in Appendix C. Alternatively, if one is concerned about a possible ZMY pathway, but expects both unmeasuredMY confounding and no direct effect of Z on Y as shown in Figure 2B, then a simple two-step sequential G-estimation [19] using conventional methods can be performed:

  • Step 1: Remove the effect of M on Y, and obtain the residual of Y :
    Estimate the expected outcome by regressing Y on X, Z,M, and C. Additional measuredMY confounders, denoted L, including those induced by X can be included here. That is, fit the model
    using, for example, Poisson regression, and save the estimated value of ψm, [psi]m. Remove the effect of M on Y by computing the residual outcome,
    This step analytically ‘converts’ Z into an IV by removing the causal pathway from Z to Y that does not go through X. Note that ψm has a causal interpretation owing to the assumption of no unmeasured MY confounders.
  • Step 2: Perform conventional IV estimation using the residual Y exp{−[psi]mM} as the outcome:
    IV estimation using G-estimation with the residual outcome involves solving Σi=1 di(Zi,Ci)(Yi exp{−[psi]mMiψxXi} − qi(Zi,Ci)) with, for example, di(Zi,Ci) = (ZiE[Zi | Ci]). More efficient weights can be derived as shown in Appendix C. This two-step approach with optimal d produces the most efficient [psi]x if ψm were known [20].

An alternative one-step approach to parameter and asymptotic standard error estimation can also be performed by solving the score equations, Si(ψ, δ), where δ = (δ0, δz, δc, δx, δl) and


3. Simulation Studies

We carried out Monte Carlo simulation studies to evaluate the performance of EIV estimation. Specifically, we aimed to compare EIV estimation with compliance-score weights (EIV, simple) and optimal weights (EIV, optimal) to conventional IV estimation when estimating the causal effect of X on Y, with and without ZY. For all simulations, 5,000 estimates were computed from simulated data sets of sizes n=10,000; 5,000; 2,000; and 1,000. All variables were binary; {Z,U,C,X, Y } were simulated from Bernoulli distributions assuming multiplicative models (additive on the log scale). All analyses were performed with R software version 3.2.0 [15]. G-estimation was carried out using GMM via the gmm() function within the gmm R package [16]; conventional IV estimation was carried out without covariates using previously published code via gmm() [16, 17].

We simulated Z and U from independent Bernoulli(0.5) distributions. Next, we simulated C from a Bernoulli(pC|U) distribution where log(pC|U) = log(0.75) + U log(0.6). We then simulated X from a Bernoulli(pX|Z,U,C) distribution where log(pX|Z,U,C) = log(0.2) + Z log(1.75) + U log(1.5) − C log(1.5) + Z × CaX|Z×C, where aX|Z×C [set membership] {log(1.75), log(1.25)}. Lastly, we simulated Y from a Bernoulli(pY |Z,U,C,X) distribution where log(pY|Z,U,C,X) = log(0.2) + ZaY |Z + U log(1.4) − C log(1.5) + X log(2.0), and where aY |Z [set membership] {log(1.5), log(1.25), log(1)}, indicating strong, weak, or no violation of IV Assumption 3. Thus, six different model specifications at four different sample sizes were assessed in this simulation study.

Results (Table 1) demonstrated that whenever IV Assumption 3 was violated, aY |Z ≠ log(1), conventional IV estimation had the greatest bias of the three methods, irrespective of sample size or the effect of Z × C on X, with greater bias at stronger effects of Z on Y. In contrast, when IV Assumption 3 holds, conventional IV estimation almost always had the lowest bias and most accurate coverage. When aX|Z×C = log(1.75), EIV with optimal weights had low bias and good coverage at large sample sizes (n = 10, 000 or 5, 000) irrespective of the effect of Z on Y ; bias also tended to be low when n = 2, 000, but with lower than nominal coverage and increasing non-convergence. Additional analyses of EIV with simple compliance-score weights revealed that estimates had a right-skewed distribution, and that the median bias was low at large sample sizes. Furthermore, empirical standard deviations and mean standard errors of estimates using the optimal weights were smaller than those using the compliance-score weights. When aX|Z×C = log(1.25), neither EIV method tended to perform well, even at large sample sizes, regardless of the strength of the effect of Z on Y.

Table 1
Simulation study of IV estimation of causal effects with a potentially invalid IV based on 5,000 iterations.

To understand the variation in performance of EIV estimators, especially that of EIV with optimal weights when the effect of Z × C on X is strong versus weak, we additionally assessed Identifying Assumption 6 for all six model specifications. Specifically, we computed the mean rank of weight matrix d(Z,C) and the mean correlation of its components. Since optimal weights depend on unknown parameters ψ, at each iteration, we computed the rank and correlation of d(Z,C) over a grid of values ranging from −log(5) to log(5) for each component of ψ. We found that d(Z,C) was always of full rank for both compliance-score weights and optimal weights over the grid of ψ. However, the average Pearson correlation coefficients for compliance-score weights were 0.971 and 0.988 when the effect of Z × C on X was log(1.75) and log(1.25), respectively. The average Pearson correlation coefficients for optimal weights at ψ = 0 (the starting value used to implement the algorithm) ranged from 0.965 for aX|Z×C = log(1.75) and aY |Z = log(1) to 0.992 when aX|Z×C = log(1.25) and aY |Z = log(1.5). Furthermore, some correlation coefficients over the grid of ψ exceeded 0.9999 for all six model specifications. For each value of aY |Z, the correlations were higher when aX|Z×C = log(1.25) than when aX|Z×C = log(1.75).

These results suggest that in the context of binary data, if Z × C has a strong effect on X, then the bias and efficiency benefits of EIV with optimal weights relative to conventional IV estimation when IV Assumption 3 is violated may outweigh its limitations when IV Assumption 3 is satisfied. In contrast, a Z × C with a weak effect on X is tantamount to weak instrument bias, especially when using compliance-score weights, as demonstrated by the higher correlations of d(Z,C). A caveat at smaller sample sizes is the much greater percent non-convergence with EIV versus conventional IV. This phenomenon is not unique to EIV and has been shown in other published implementations of G-estimation, even when observed differences from true data generating mechanisms were minor [21]. Bootstrap standard errors should also be considered at smaller sample sizes, even for conventional IV when IV Assumption 3 is satisfied.

4. Data Application

We applied the proposed EIV estimators to a study of circulating vitamin D and depressive symptoms among participants enrolled in “Aging in Chianti” (InCHIANTI), a study of older adults residing in the Chianti region of Italy (near Tuscany) [22]. To be concrete, we focus on estimating the effect of serum concentrations of 25-hydroxyvitamin D [25(OH)D], the gold-standard measure of vitamin D body stores, on severity of depressive symptoms in community-dwelling older adults [23]. The body produces 25(OH)D in response to sunlight exposure, thus 25(OH)D exhibits seasonal variation with peaks in summer and troughs in winter [24]. Therefore, we propose season of blood collection for 25(OH)D measurement as an IV. Violation of IV Assumption 3 is plausible because depressive symptoms may also exhibit seasonal variation, manifesting as “winter blues,” or more severely, seasonal affective disorder. Figure 3 displays the hypothesized causal pathways.

Figure 3
Directed acyclic graph conveying hypothesized causal pathways in the InCHIANTI data application. Letters in parentheses denote variables.

The analysis included 3,147 observations from the InCHIANTI study. Depressive symptoms were measured using the Center for Epidemiologic Studies Depression (CES-D) Scale [25], a measurement scale comprising 20 items resulting in a score ranging from 0 to 60, with higher scores indicating more severe depressive symptoms. A CES-D exceeding 16 was used to indicate depressed mood [26]. Binary season was operationalized as April to September (Northern Hemisphere Spring and Summer) versus October to March (Northern Hemisphere Autumn and Winter); binary 25(OH)D was dichotomized at 20 ng/mL based on Institute of Medicine recommended concentrations [27]. The covariate combined information on age and sex, specifically it was an indicator for being male and older than 65 years.

We estimated causal risk ratios of depressed mood comparing low to high vitamin D (i.e, 25(OH)D < 20 ng/mL versus 25(OH)D ≥ 20) using EIV estimation with both simple compliance-score weights and optimal weights and using conventional IV analysis. We additionally estimated the causal effect of season of blood collection on depressed mood (Spring and Summer versus Autumn and Winter) using EIV. We note that without additional assumptions, the EIV causal risk ratio for vitamin D is among those with low vitamin D conditioned on season of blood collection, age, and sex. If we make the ‘no effect modification’ assumptions described in Appendix A, then the causal risk ratio for vitamin D is just conditioned on age and sex.

Table 2 shows that the estimated causal risk ratio for depressed mood is nearly 5 comparing low to high vitamin D using EIV, regardless of weights. Furthermore, EIV estimates a lower risk of depressed mood comparing Spring and Summer with Autumn and Winter, which is consistent with a violation of Assumption 3. Confidence intervals for EIV using the simple compliance-score weights were estimated with 500 bootstrap samples owing to near infinite estimated asymptotic standard errors. The confidence intervals estimated using the optimal weights were more narrow than those estimated using the simple compliance-score weights. In contrast to the proposed EIV method, the conventional IV method estimated a qualitatively different causal risk ratio, suggesting that the risk of depressed mood with low vitamin D was only 26% of that for high vitamin D. A possible reason for this discrepancy is the violation of Assumption 3.

Table 2
Analysis of Vitamin D and Depressed Mood in the InCHIANTI Study (n=3,147)

As per published guidelines [1], we empirically evaluated the IV assumptions. The evaluation demonstrated that season of blood collection was indeed strongly associated with vitamin D status (Risk Ratio=1.37; 95% confidence interval 1.24 to 1.52). However season of blood collection was also strongly associated with depressed mood after accounting for vitamin D status (Risk Ratio=0.78; 95% confidence interval 0.71 to 0.86), which suggests violation of IV Assumption 3. Measured potential confounders age and sex were more strongly related to vitamin D status than to season of blood collection. These observations are consistent with, but do not conclusively prove, IV Assumptions 1 and 2 (and Identifying Assumptions 3 and 4) for vitamin D. Furthermore, relevant for Identifying Assumptions 5 and 6, we assessed the association of vitamin D status and depressed mood with the interaction term between season and the covariate after accounting for season and covariate main effects. The chi-square statistics for the interaction terms were 17.6 and 0.99, for associations of the interaction term with vitamin D status (binomial model with identity link, consistent with compliance score) and depressed mood (binomial model with log link, consistent with multiplicative structural model), respectively. The strong association of the interaction term with vitamin D is consistent with Identifying Assumption 6, and the weak association with depressed mood is consistent with Identifying Assumption 5; however this latter assumption cannot be conclusively proven with observational data. Table 3 additionally shows the frequencies of all combinations of the covariate, proposed instrument, exposure, and outcome.

Table 3
Observational Data from the InCHIANTI Study (n=3,147)

We further assessed the identifiability of both the EIV and IV models by computing the rank and correlation for d(Z,C) for EIV using both compliance-score and optimal weights. We also evaluated the GMM objective function over a a grid of ψ. The rank and correlation of d(Z,C) for EIV using compliance-score weights was 2 and 0.961, respectively. Over a grid of 200 values each for ψz and ψx ranging from −log(8) to log(8), the rank of d(Z,C) using optimal weights was always 2; however, correlations exceeding 0.999 were observed for for some values of ψ. The objective function using optimal weights at [psi] was evaluated to be 4.62 × 10−17; the only solution over the grid with an objective function less than 10−9 was exp{ψz} = 0.62 and exp{ψx} = 4.73, and the only solutions over the grid with an objective function less than 10−8 had exp{ψz} ranging from 0.57 to 0.63 and exp{ψx} ranging from 4.26 to 8.00. Similarly, the objective function using compliance-score weights at [psi] was evaluated to be 5.28 × 10−8; and the only solutions with an objective function less than 6.00 × 10−8 had exp{ψz} ranging from 0.59 to 0.67 and exp{ψx} ranging from 3.24 to 6.35. These findings are consistent with EIV parameters being identified in this application. The objective function for conventional IV using previously published code [17] with an intercept term ψ0 was evaluated to be 2.93 × 10−9 at [psi]; the only solution over the grid with an objective function less than 10−7 was exp{ψx} = 0.26, and the only solutions over the grid with an objective function less than 10−5 had exp{ψx} ranging from 0.25 to 0.26. These findings are consistent with the IV parameter being identified in this application.

5. Discussion

Unbiased IV estimation depends on three assumptions that cannot be conclusively proven from observational data. Therefore, the rationale for applying IV estimation to overcome bias from unmeasured confounding is based on the plausibility that IV assumptions hold. In this paper, we proposed IV estimators for multiplicative models of binary outcomes when Assumption 3, “the effect of the IV on the outcome is completely mediated by the exposure,” is false. Thus, the proposed EIV estimators are valid when Assumptions 1 and 2 hold, regardless of Assumption 3. The proposed estimators are adaptations of previously published estimators that were originally motivated by mediation analysis of additive models [6, 8, 9] to address invalid instruments for multiplicative models.

The major advantage of EIV estimation (with optimal weights) is that it performs well whether or not Assumption 3 holds. Therefore, IV estimation is expanded to a broader, more realistic class of scenarios. We were motivated by an exposure with known seasonal variation when we could not rule out seasonality of the outcome. EIV is also potentially advantageous for Mendelian randomization studies, where the proposed IV is a genetic variant, because some genetic variants are known to affect multiple phenotypes, possibly including the outcome [28]. EIV estimation may enable researchers to apply Mendelian randomization in the presence of multiple phenotypes even if Assumption 3 is violated, thereby potentially increasing the pool of genetic variants that can be used in Mendelian randomization studies. EIV can also be applied when cases where Z is of particular scientific interest, such as randomized trials. Furthermore, we focus on estimation of SMM using GMM, which does not require correct specification of the effect of C on Y.

Despite the method’s advantages, a major challenge is finding Z and C that satisfy the assumptions to identify model parameters. Previous work on instrumental variables in multiplicative models found that weak instruments can lead to zero or multiple solutions [14, 21, 29]. Researchers focusing on Mendelian randomization have noted that individual genetic variants already tend to have weak associations with phenotypes (exposures) of interest; interactions between genetic variants and covariates may have even weaker associations, thereby potentially exacerbating the problem of weak instrument bias [30]. The problem of not finding a single solution may be mitigated through alternative estimation procedures. We plan to explore alternatives in future research.

Other methods have also been proposed to address violation of Assumption 3, but have focused on additive models. Joffe et al [7] considered the case where the effect of the proposed IV is completely mediated by measured variables, including the exposure. Other proposed approaches can only be performed with multiple instruments and depend on assumptions about the nature of bias. These methods include an application of meta-analysis methods and a bias-corrected two-stage least squares estimator, both of which depend on the assumption that the strength of an instrument is independent of the strength of its effect on the outcome [31, 32]. Additional methods require at least half of the instruments to be valid [33, 34]. In the context of investigating mediation, Burgess et al [30] proposed regression-based methods and structural equation modeling using separate valid instruments for the exposure and mediator. In contrast, EIV estimation can be applied to multiplicative models using one or more instruments regardless of the proportion of valid instruments and regardless of the relationship between their strength as instruments and the strength of their effects on the outcome. Although not required for EIV estimation, using multiple candidate instruments can be beneficial, especially to enhance the potential for identifiability and mitigate problems from weak instrument bias. The EIV estimators can easily handle estimation with multiple IVs. Indeed, the asymptotically efficient weights are the most efficient function of the proposed IVs [35], some of which may be invalid.

Another advantage of EIV estimation is that it can be applied to mediation analysis in the presence of unmeasured exposure-outcome and mediator-outcome confounders. Furthermore, EIV be performed as an adaptation of sequential G-estimation [19] combining generalized linear models with conventional IV estimation when unmeasured mediator-outcome confounding is thought to be negligible. EIV can also handle structural interactions between the invalid instrument and exposures as well as pre-exposure covariates.

The three IV assumptions are not sufficient to identify exposure-outcome causal effects in the presence of unmeasured confounders [4, 5, 36, 37]. Additional assumptions, such as “no effect modification” are needed to identify population causal effects; alternative monotonicity assumptions have also been proposed to estimate local causal effects [38]. Analogous identifying assumptions are also needed to identify a causal effect when using EIV.

Statisticians have noted that G-estimation of structural mean models is underutilized in applied research relative to other methods such as marginal structural modeling when assuming no unmeasured confounding or two-stage least squares under IV Assumptions 1–3 [39, 40]. We aimed to enhance accessibility of G-estimation by describing a scenario when assumptions of the aforementioned competing estimators do not hold, demonstrating the steps to enhance efficiency (Appendices B and C), and including R code for estimation (Appendix D).

Lastly, sensitivity analysis has been advocated as part of a critical evaluation to uncover violations that may account for estimated IV effects [2, 3]. The proposed EIV estimators are flexible enough to be applied as part of a sensitivity analysis. For example, researchers can apply recommended empirical approaches [2] as a way to falsify IV Assumptions 1 and 2. Additional theoretical work and simulations are needed to extend and evaluate EIV to other models, including estimation of causal odds ratios and analysis of case-control data [35].


Contract/grant sponsor: National Institute on Aging Intramural Program

Appendix A

Equation (1) explicitly models conditional causal effects; that is, effects conditional on Z and X as well as C. We describe a set of “no effect modification” assumptions that are sufficient to identify population causal effects conditioned on C. Specifically, the causal effects of Z and X must not differ according to observed values of Z and X conditioned on C:



The left hand side of Equation (1) can be written as a telescoping sum,


The second and third lines of Equation (11) equal zero, because Z is not confounded. The fourth line of (11) equals the causal effect of Z when X is set to 0, which is log{E[Y(z, 0) | C]} − log{E[Y (0, 0) | C]} because Z is not confounded (Identifying Assumption 3, implied by IV assumption 1). If Equation (9) holds then this expression equals log{E[Y(z, 0) | Z = z, X = 0, C]} − log{E[Y (0, 0) | Z = z, X = 0, C]}, which equals ψzz in Equation (1). The fifth and sixth lines of Equation (11) reflect the degree of unmeasured XY confounding, and the last line equals ψxx, which equals log{E[Y (z, x) | C]} − log{E[Y (z, 0) | C]} if Equation (10) holds.

Appendix B

To estimate ψ we consider the class of estimation equations of the form


The asymptotic variance of [psi] is proportional to the variance of (12), where


and where


Because Z is unconfounded, E[Y exp{−ψzZψxX} | Z,C] = E[Y exp{−ψzZψxX} | C]. Therefore, (13) equals


and is hence minimized when q(C) = E[Y exp{−ψzZψxX} | C].

To derive the most efficient form of d, let the observed-data likelihood for ψ under a parametric submodel be denoted




Given that Z is unconfounded, the following must hold:


which can be re-written as


Taking the derivative of both sides of Equation (14) with respect to ψ yields



where S(ψ) = {(S1(ψ), S2(ψ)}t are the set of scores for ψ, where S1(ψ) = [partial differential]log{L(ψ)}/[partial differential]ψz, S2(ψ) = [partial differential]log{L(ψ)}/[partial differential]ψx, and E[S(ψ) | X,Z,C] = 0.

Following the theory in Tsiatis [41], we find the most efficient form of d by projecting S(ψ) onto Equation (12) obtained by varying d. Recall that d(Z,C) = g(Z,C) − E[g(Z,C) | C]. Consider g such that E[g(Z,C) | C] = 0, so that d(Z,C) = g(Z,C). Then the projection of S(ψ) onto Equation (12), denoted


must satisfy


From here, we plug the constraints in Equations (15) and (16) into Equation (17) to obtain the result in Equation (6).

Appendix C

The procedure in Appendix B can be generalized to handle extensions to the multiplicative model like those described in Section 2.5. Following the notation of Robins [10], let Hi(ψ) denote the residual outcome, interpreted as the predicted treatment-free counterfactual for observation i. For example, in Equation (6), Hi(ψ) = Yi exp{−ψzZiψxXi}, whereas in Equation (8), Hi(ψ) = Yi exp{−ψzZiψxXiψzxZi × Xi}.

In general, the optimal value for qi(Ci) is


which can be estimated by regressing Hi(ψ) on Ci using the current estimates of ψ in iterative estimation (e.g., GMM).

Next, let Di(Zi,Ci) denote the expected gradient of Hi(ψ),


which, in Equation (6), equals


The expected value can be estimated by regressing Hi(ψ)ψt on Zi and Ci.

Next, let wi(Zi, Ci) denote an inverse variance of Hi(ψ),


which can be estimated by regressing (Hi(ψ) − qi(Ci))2 on Zi and Ci and taking the reciprocal of the fitted values. The optimal function gi is computed as


where wi = wi(Zi,Ci) and Di = Di(Zi,Ci). Since E[gi] = 0, gi = di. The expected values in Equation (18) can be estimated by regressing wiDi and wi on Ci.

Appendix D

We include R code using the gmm() function in the gmm package to compute asymptotically efficient estimators of causal risk ratios. This code adapts that published by Clarke et al [17].

###load required package
library(gmm) <- data.frame(cbind(Z,X,Y,C))
###moment functions (score equations) for G-estimation
SMM.moments <- function(psi,x){
             C <- x[,“C”] ###covariate
             Y <- x[,“Y”] ###outcome
             X <- x[,“X”] ###exposure
             Z <- x[,“Z”] ###(possibly invalid) instrument
             ###predicted treatment-free counterfactual
             H <- Y*exp(-psi[1]*Z-psi[2]*X)
             ###q = E[H | C]
             q <- lm(H~C)$fitted.values
             dH.dpsi.1 <- -Y*Z*exp(-psi[1]*Z-psi[2]*X)
             ###D1 = E[dH.dpsi.1 | Z, C]
             D1 <- lm(dH.dpsi.1~Z*C)$fitted.values
             dH.dpsi.2 <- -Y*X*exp(-psi[1]*Z-psi[2]*X)
             ###D2 = E[dH.dpsi.2 | Z, C]
             D2 <- lm(dH.dpsi.2~Z*C)$fitted.values
    <- (H-q)^2
             ###w = 1/E[(H-q)^2 | Z, C]
             w <- 1/(lm(*C)$fitted.values)
             ###Ew = E[w | C]
             Ew <- lm(w~C)$fitted.values
             wD1 <- w*D1
             ###EwD1 = E[wD1 | C]
             EwD1 <- lm(wD1~C)$fitted.values
             wD2 <- w*D2
             ###EwD2 = E[wD2 | C]
             EwD2 <- lm(wD2~C)$fitted.values
             ###Optimal weights for the estimating equations
             dopt.1 <- w*(D1 - EwD1/Ew)
             dopt.2 <- w*(D2 - EwD2/Ew)
             m1 <- dopt.1*(H - q)
             m2 <- dopt.2*(H - q)
SMM.pre <- gmm(SMM.moments,, t0=c(0,0), vcov=“iid”)
###update results
SMM <- gmm(SMM.moments,, t0=c(coef(SMM.pre)), vcov=“iid”)
###psi.z = psi[1]; psi.x = psi[2]
psi <- coef(SMM)
###SE(psi.z) = SE[1]; SE(psi.x) = SE[2]
SE <- sqrt(diag(SMM$vcov))


1. Davies NM, Davey Smith G, Windmeijer F, Martin RM. Issues in the reporting and conduct of instrumental variable studies: a systematic review. Epidemiology. 2013;24(3):363–369. doi: 10.1097/EDE.0b013e31828abafb. [PubMed] [Cross Ref]
2. Glymour MM, Tchetgen Tchetgen EJ, Robins JM. Credible Mendelian randomization studies: approaches for evaluating the instrumental variable assumptions. American Journal of Epidemiology. 2012;175(4):332–339. doi: 10.1093/aje/kwr323. [PMC free article] [PubMed] [Cross Ref]
3. VanderWeele TJ, Tchetgen Tchetgen EJ, Cornelis M, Kraft P. Methodological challenges in Mendelian randomization. Epidemiology. 2014;25(3):427–435. doi: 10.1097/EDE.0000000000000081. [PMC free article] [PubMed] [Cross Ref]
4. Greenland S. An introduction to instrumental variables for epidemiologists. International Journal of Epidemiology. 2000;29(4):722–729. doi: 10.1093/ije/29.4.722. [PubMed] [Cross Ref]
5. Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology. 2006;17(4):360–372. doi: 10.1097/01.ede.0000222409.00878.37. [PubMed] [Cross Ref]
6. Ten Have TR, Joffe MM, Lynch KG, Brown GK, Maisto SA, Beck AT. Causal mediation analyses with rank preserving models. Biometrics. 2007;63(3):926–934. doi: 10.1111/j.1541-0420.2007.00766.x. [PubMed] [Cross Ref]
7. Joffe MM, Small D, Ten Have T, Brunelli S, Feldman HI. Extended instrumental variables estimation for overall effects. International Journal of Biostatistics. 2008;4(1) doi: 10.22202/1557-4679.1082. Article 4. [PMC free article] [PubMed] [Cross Ref]
8. Small DS. Mediation analysis without sequential ignorability: using baseline covariates interacted with random assignment as instrumental variables. Journal of Statistical Research. 2012;46:89–101. [PMC free article] [PubMed]
9. Zheng C, Zhou X-H. Causal mediation analysis in the multilevel intervention and multicomponent mediator case. Journal of the Royal Statistical Society Series B. 2015;77(3):581–615. doi: 10.1111/rssb.12082. [Cross Ref]
10. Robins JM. Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics-Theory and Methods. 1994;23:2379–2412. doi: 10.1080/03610929408831393. [Cross Ref]
11. Rubin DB. Causal inference using potential outcomes: design, modeling, decisions. Journal of the American Statistical Association. 2005;100:322331. doi: 10.1198/016214504000001880. [Cross Ref]
12. Follmann D. On the effect of treatment among would-be treatment compliers: an analysis of the multiple risk factor intervention trial. Journal of the American Statistical Association. 2000;95:1101–1109. doi: 10.1080/01621459.2000.10474306. [Cross Ref]
13. Joffe MM, Ten Have TR, Brensinger C. The compliance score as a regressor in randomized trials. Biostatistics. 2003;4(3):327–340. doi: 10.1093/biostatistics/4.3.327. [PubMed] [Cross Ref]
14. Palmer TM, Sterne JA, Harbord RM, Lawlor DA, Sheehan NA, Meng S, Granell R, Smith GD, Didelez V. Instrumental variable estimation of causal risk ratios and causal odds ratios in Mendelian randomization analyses. American Journal of Epidemiology. 2011;173(12):1392–1403. doi: 10.1093/aje/kwr026. [PubMed] [Cross Ref]
15. Ihaka R, Gentleman R. R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics. 1996;5:299–314. doi: 10.1080/10618600.1996.10474713. [Cross Ref]
16. Chaussé P. Computing generalized method of moments and generalized empirical likelihood with R. Journal of Statistical Software. 2010;34:1–35. doi: 10.18637/jss.v034.i11. [Cross Ref]
17. Clarke PS, Palmer TM, Windmeijer F. Estimating structural mean models with multiple instrumental variables using the generalised method of moments. Statistical Science. 2015;30(1):96–117. doi: 10.1214/14-STS503. [Cross Ref]
18. Pearl J. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. San Francisco: Morgan Kaufmann; 2001. Direct and indirect effects; p. 411420.
19. Vansteelandt S. Estimating direct effects in cohort and case-control studies. Epidemiology. 2009;20(6):851–860. doi: 10.1097/EDE.0b013e3181b6f4c9. [PubMed] [Cross Ref]
20. Vansteelandt S, Goetghebeur E. Causal inference with generalized structural mean models. Journal of the Royal Statistical Society Series B. 2003;65(4):817–835. doi: 10.1046/j.1369-7412.2003.00417.x. [Cross Ref]
21. Brumback BA, He Z, Prasad M, Freeman MC, Rheingans R. Using structural-nested models to estimate the effect of cluster-level adherence on individual-level outcomes with a three-armed cluster-randomized trial. Statistics in Medicine. 2014;33:1490–1502. doi: 10.1002/sim.6049. [PubMed] [Cross Ref]
22. Ferrucci L, Bandinelli S, Benvenuti E, Di Iorio A, Macchi C, Harris TB, Guralnik JM. Subsystems contributing to the decline in ability to walk: bridging the gap between epidemiology and geriatric practice in the InCHIANTI study. Journal of the American Geriatrics Society. 2000;48(12):1618–1625. doi: 10.1111/j.1532-5415.2000.tb03873.x. [PubMed] [Cross Ref]
23. Milaneschi Y, Shardell M, Corsi AM, Vazzana R, Bandinelli S, Guralnik JM, Ferrucci L. Serum 25-hydroxyvitamin D and depressive symptoms in older women and men. Journal of Clinical Endocrinology and Metabolism. 2010;95(7):3225–3233. doi: 10.1210/jc.2010-0347. [PubMed] [Cross Ref]
24. Holick MF. Vitamin D deficiency. New England Journal of Medicine. 2007;357(3):266–281. doi: 10.1056/NEJMra070553. [PubMed] [Cross Ref]
25. Radloff LS. The CES-D Scale: a self-report depression scale for research in the general population. Applied Psychological Measurement. 1977;1:385–401.
26. Lewinsohn PM, Seeley JR, Roberts RE, Allen NB. Center for Epidemiologic Studies Depression Scale (CES-D) as a screening instrument for depression among community-residing older adults. Psychology and Aging. 1997;12(2):277–287. [PubMed]
27. Institute of Medicine. Dietary Reference Intakes for Calcium and Vitamin D. Washington, DC: The National Academies Press; 2011.
28. Ebrahim S, Davey Smith G. Mendelian randomization: can genetic epidemiology help redress the failures of observational epidemiology? Human Genetics. 2008;123(1):15–33. doi: 10.1007/s00439-007-0448-6. [PubMed] [Cross Ref]
29. Burgess S, Granell R, Palmer TM, Sterne JAC, Didelez V. Lack of identification in semiparametric instrumental variable models with binary outcomes. American Journal of Epidemiology. 2014;180(1):111–119. doi: 10.1093/aje/kwu107. [PMC free article] [PubMed] [Cross Ref]
30. Burgess S, Daniel RM, Butterworth AS, Thompson SG. EPIC-InterAct Consortium. Network Mendelian randomization: using genetic variants as instrumental variables to investigate mediation in causal pathways. International Journal of Epidemiology. 2015;44(2):484–495. doi: 10.1093/ije/dyu176. [PMC free article] [PubMed] [Cross Ref]
31. Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. International Journal of Epidemiology. 2015;44(2):512–525. doi: 10.1093/ije/dyv080. [PMC free article] [PubMed] [Cross Ref]
32. Kolesár M, Chetty R, Friedman J, Glaeser E, Imbens G. Identification and inference with many invalid instruments. Cambridge, MA: National Bureau of Economic Research; 2014. NBER Working Paper 17519. [Cross Ref]
33. Kang H, Zhang A, Cai T, Small D. Instrumental variable estimation with some invalid instruments, and its application to Mendelian randomisation. Journal of the American Statistical Association. 2016;111(513):132–144. doi: 10.1080/01621459.2014.994705. [Cross Ref]
34. Han C. Detecting invalid instruments using L1-GMM. Economics Letters. 2008;101:285–287. doi: 10.1016/j.econlet.2008.09.004. [Cross Ref]
35. Bowden J, Vansteelandt S. Mendelian randomization analysis of case-control data using structural mean models. Statistics in Medicine. 2011;30(6):678–694. doi: 10.1016/j.econlet.2008.09.004. [PubMed] [Cross Ref]
36. Manski CF. Nonparametric bounds on treatment effects. American Economics Review. 1990;80(2):319–323.
37. Balke A, Pearl J. Bounds on treatment effects from studies with imperfect compliance. Journal of the American Statistical Association. 1997;92(439):1171–1176.
38. Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables (with discussion) Journal of the American Statistical Association. 1996;91(434):444–472. doi: 10.1080/01621459.1996.10476902. [Cross Ref]
39. Vansteelandt S, Joffe M. Structural nested models and G-estimation: the partially realized promise. Statistical Science. 2015;29(4):707–731. doi: 10.1214/14-STS493. [Cross Ref]
40. Joffe MM. Structural nested models, G-estimation, and the healthy worker effect: the promise (mostly unrealized) and the pitfalls. Epidemiology. 2012;23(2):220–222. doi: 10.1097/EDE.0b013e318245f798. [PubMed] [Cross Ref]
41. Tsiatis AA. Semiparametric Theory and Missing Data. New York: Springer; 2006.