Let us examine the simple IV model depicted in , assuming a zero-mean, unit-variance standardization. If we retrace the derivation of the association between *X* and *Y* conditional on *Z*, we find that this formula holds not only for a perfect IV but also for a near-IV, such as the one depicted in (see my previous article (3)). Allowing a confounding path to extend from *Z* to *Y* will only change the crude association, which will increase to reflect the added confounding path *X* ← *Z* → *Y*.
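This amplification can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, not the article's own computation; the coefficient names (`a0` for *Z* → *X*, `a1` for *U* → *X*, `b1` for *U* → *Y*) and their values are assumptions chosen for the example.

```python
# Minimal numeric sketch of bias amplification in a standardized linear
# IV model:  X = a0*Z + a1*U + eps_x,  Y = beta*X + b1*U + eps_y,
# with Z and U independent, zero mean, unit variance.
# All coefficient names and values here are illustrative assumptions.

a0, a1, b1, beta = 0.8, 0.3, 0.5, 1.0

# Crude bias of the Y-on-X slope: Cov(X, U) * b1 = a1 * b1.
B0 = a1 * b1

# Conditioning on the instrument Z leaves Cov(X, U) unchanged but shrinks
# Var(X) from 1 to 1 - a0**2, so the bias is amplified by 1/(1 - a0**2).
Bz = a1 * b1 / (1 - a0 ** 2)

print(B0, Bz)  # Bz is larger in magnitude than B0
```

The stronger the instrument (larger `a0`), the larger the amplification factor, even though *Z* itself confounds nothing.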

Now consider a system of multiple confounders, such as the one depicted in , where each covariate *Z*_{k} (k = 1, …, K) intercepts a distinct confounding path between *X* and *Y* and for which the crude bias (without any conditioning) is

*B*_{0} = α_{1}β_{1} + α_{2}β_{2} + … + α_{K}β_{K}.

If we condition on *Z*_{1}, two modifications are required. First, the path containing *Z*_{1} will no longer contribute to confounding and, second, whatever bias is contributed by the remaining paths, namely α_{2}β_{2} + … + α_{K}β_{K}, will be amplified by a factor of 1/(1 − α_{1}²), reflecting the decreased variance of *X* due to fixing *Z*_{1}. Overall, the bias remaining after conditioning on *Z*_{1} will read

*B*(*Z*_{1}) = (α_{2}β_{2} + … + α_{K}β_{K})/(1 − α_{1}²).

Further conditioning on *Z*_{2} will remove the factor α_{2}β_{2} from the numerator (deactivating the path *X* ← *Z*_{2} → *Y*) and will replace the denominator by the factor (1 − α_{1}² − α_{2}²), representing the reduced variance of *X* due to fixing both *Z*_{1} and *Z*_{2}. The resulting bias will be

*B*(*Z*_{1}, *Z*_{2}) = (α_{3}β_{3} + … + α_{K}β_{K})/(1 − α_{1}² − α_{2}²).
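The two conditioning steps can be sketched numerically. A minimal illustration, assuming K = 3 independent confounders and made-up coefficients (`alpha` for the Z_k → X paths, `betas` for the Z_k → Y paths):

```python
# Sequential conditioning with K = 3 independent confounders Z_1..Z_K,
# each intercepting a distinct path X <- Z_k -> Y with coefficients
# alpha_k (Z_k -> X) and beta_k (Z_k -> Y).  Values are made up.

alpha = [0.5, 0.4, 0.3]
betas = [0.4, 0.3, 0.2]

def bias_after(k):
    """Bias left after conditioning on the first k covariates:
    the remaining path contributions, amplified by the reduced Var(X)."""
    numer = sum(a * b for a, b in zip(alpha[k:], betas[k:]))
    denom = 1.0 - sum(a ** 2 for a in alpha[:k])
    return numer / denom

B0 = bias_after(0)    # crude bias: 0.2 + 0.12 + 0.06 = 0.38
B1 = bias_after(1)    # (0.12 + 0.06) / (1 - 0.25) = 0.24
B12 = bias_after(2)   # 0.06 / (1 - 0.25 - 0.16) ~= 0.102
```

With these (all-positive) coefficients each conditioning step helps; the denominator's growth only partially offsets the removed confounding.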

We see the general pattern that characterizes sequential conditioning on sets of covariates, organized as in . The bias *B*(*Z*) remaining after conditioning on a set *Z* = (*Z*_{1}, *Z*_{2}, …, *Z*_{k−1}, *Z*_{k}) is given by the formula

*B*(*Z*) = (α_{k+1}β_{k+1} + … + α_{K}β_{K})/(1 − α_{1}² − α_{2}² − … − α_{k}²),   (5)

which reveals 2 distinct patterns of progression, one representing confounding reduction (shown in the numerator) and one representing IV amplification (shown in the denominator). The latter increases monotonically, while the former progresses nonmonotonically, since the signs of the added terms may alternate. Thus, the cumulative effect of sequential conditioning has a built-in slant toward bias amplification as compared with confounding reduction; the latter is tempered by sign cancellations, while the former is not.
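This behavior — monotone amplification in the denominator, sign-canceling reduction in the numerator — can be made concrete, and the closed form cross-checked against direct Gaussian conditioning. The sketch below uses invented coefficients with alternating signs; it illustrates the formula's behavior, it is not a derivation from the article.

```python
# Cross-check the closed-form bias against direct conditioning on the
# joint covariances.  Model (illustrative numbers): Z_1..Z_K independent
# standard normal, X = sum(alpha_k Z_k) + eps_x (Var(X) = 1),
# Y = beta*X + sum(beta_k Z_k) + eps_y.

beta = 1.0
alpha = [0.6, 0.5]
betas = [-0.3, 0.4]          # note the alternating signs

def bias_closed_form(k):
    # B(Z_1..Z_k) = (sum_{j>k} alpha_j*beta_j) / (1 - sum_{j<=k} alpha_j**2)
    return (sum(a * b for a, b in zip(alpha[k:], betas[k:]))
            / (1 - sum(a * a for a in alpha[:k])))

def bias_by_conditioning(k):
    # Partial out Z_1..Z_k directly: Cov(X,Y) = beta + sum(alpha_j*beta_j),
    # Cov(X,Z_j) = alpha_j, Cov(Y,Z_j) = beta*alpha_j + beta_j,
    # and the Z_j are mutually uncorrelated.
    cov_xy = beta + sum(a * b for a, b in zip(alpha, betas))
    cov_xy_given = cov_xy - sum(alpha[j] * (beta * alpha[j] + betas[j])
                                for j in range(k))
    var_x_given = 1 - sum(alpha[j] ** 2 for j in range(k))
    return cov_xy_given / var_x_given - beta

for k in range(3):
    assert abs(bias_closed_form(k) - bias_by_conditioning(k)) < 1e-12

# Sign cancellation makes the crude bias tiny, yet conditioning on Z_1
# removes the negative term AND shrinks Var(X), inflating the residual bias:
B0, B1 = bias_closed_form(0), bias_closed_form(1)
```

Here the crude bias is 0.02, but after conditioning on Z_1 the residual bias jumps to 0.3125 — the slant toward amplification in action.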

In deriving equation 5, we assumed that no *Z*_{k} is a collider, that each *Z*_{k} has a distinct path characterized by α_{k}, and that the *Z*_{k}'s are not correlated. In a general graph, where multiple paths may traverse each *Z*_{k}, *B*(*Z*) will read

*B*(*Z*) = *B**_{k}/(1 − α′_{1}² − α′_{2}² − … − α′_{k}²),   (6)

where *B**_{k} represents the crude bias *B*_{0} modified by conditioning on (*Z*_{1}, *Z*_{2}, …, *Z*_{k−1}, *Z*_{k}), and α′_{k} is the coefficient of *Z*_{k} in the regression of *X* on (*Z*_{1}, *Z*_{2}, …, *Z*_{k−1}, *Z*_{k}). For example, in model 5 of Myers et al. (4) (shown in ), the crude bias is

*B*_{0} = α_{1}β_{1} + γ_{1}α_{2}β_{1},   (7)

while the bias remaining after conditioning on *Z* reads

*B*_{Z} = α_{1}β_{1}(1 − γ_{1}²)/(1 − α′²).

The numerator is obtained by setting α_{2} = 0 in equation 7 and multiplying the remaining term by (1 − γ_{1}²), to account for the effect that conditioning on *Z* has on the path *X* ← *U* → *Y*. The denominator invokes the factor α′ = (α_{2} + γ_{1}α_{1}), which is the regression coefficient of *X* on *Z*.
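These two expressions can be verified numerically. The sketch below assumes the structural reading of model 5 given above (*U* → *Z* with γ_{1}, *U* → *X* with α_{1}, *Z* → *X* with α_{2}, *U* → *Y* with β_{1}, all variables standardized) and uses made-up coefficient values.

```python
# Myers et al. model 5, as read off the text: U -> Z (gamma1), U -> X
# (alpha1), Z -> X (alpha2), U -> Y (beta1), standardized variables.
# Coefficient values are illustrative.

alpha1, alpha2, gamma1, beta1 = 0.4, 0.5, 0.6, 0.7

# Crude bias: beta1 * Cov(X, U) = beta1 * (alpha1 + gamma1*alpha2).
B0 = beta1 * (alpha1 + gamma1 * alpha2)

# Regression coefficient of X on Z.
alpha_prime = alpha2 + gamma1 * alpha1

# Bias after conditioning on Z.
BZ = alpha1 * beta1 * (1 - gamma1 ** 2) / (1 - alpha_prime ** 2)

# Cross-check by conditioning the covariances on Z directly:
cov_xu_given_z = (alpha1 + gamma1 * alpha2) - alpha_prime * gamma1
var_x_given_z = 1 - alpha_prime ** 2
assert abs(BZ - beta1 * cov_xu_given_z / var_x_given_z) < 1e-12
```

With these values the confounding removed via γ_{1} outweighs the amplification, so conditioning helps; raising α_{2} relative to α_{1} tips the balance the other way.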

We see that, in this model, γ_{1} controls simultaneously the reduction of confounding bias and the amplification of residual bias, both caused by conditioning on *Z*. Myers et al. (4) assumed that γ_{1} controls the former only.

In examining the extent to which these results are generalizable to nonlinear models, it has been shown (3) that, while in linear systems conditioning on an IV always amplifies confounding bias (if such exists), bias in nonlinear systems may be amplified as well as attenuated. Additionally, an IV may introduce new bias where none exists. This can be demonstrated if we introduce an interaction term into the model of , to read

With this modification, equation 1 becomes

while the crude association becomes

The resulting *z*-adjusted bias therefore reads

where *B*_{0} is the unadjusted bias.

We see that, if *B*_{0} ≥ 0 and α_{0}α_{1}δ_{z} > 0, we can get |*B*_{z}| < |*B*_{0}|. This means that conditioning on *Z* may reduce confounding bias, even though *Z* is a perfect instrument and both *Y* and *X* are linear in *U*. Note that, owing to the nonlinearity of *Y*(*x*, *u*), the conditional bias depends on the value of *Z* and, moreover, for *Z* = 0 we obtain the same bias amplification as in the linear case (equation 1).

We also see that conditioning on *Z* can introduce bias where none exists. However, this occurs only for a specific value of *X*,

a condition that yields *B*_{0} = 0 and |*B*_{z}| > 0.
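Since the article's interaction model is not reproduced here, the sketch below assumes a hypothetical form, Y = βX + gU + dZU, chosen only to illustrate the qualitative phenomena: the conditional bias varies with z, z = 0 recovers the linear-case amplification, a suitable z attenuates the bias, and with no confounding at all (g = 0) conditioning can still create bias.

```python
# Hypothetical interaction model (an assumption -- not the article's
# exact equation):
#   X = a0*Z + a1*U + eps_x,   Y = beta*X + g*U + d*Z*U,
# with Z, U independent standard normal.  Conditional on Z = z, the slope
# of Y on X is beta + a1*(g + d*z)/(1 - a0**2), so the z-adjusted bias is
# Bz(z) = a1*(g + d*z)/(1 - a0**2); the crude (unconditional) bias is
# B0 = a1*g, because E[Z*U*X] = 0 under independence of Z and U.

a0, a1, beta, g, d = 0.8, 0.5, 1.0, 0.4, -0.6

def Bz(z):
    return a1 * (g + d * z) / (1 - a0 ** 2)

B0 = a1 * g

# z = 0 reproduces the linear-case amplification factor 1/(1 - a0**2):
amplified = B0 / (1 - a0 ** 2)

# A suitable z makes the adjusted bias smaller than the crude bias,
# and with g = 0 (no confounding) conditioning still creates bias:
B_attenuated = Bz(0.5)                      # |B_attenuated| < |B0|
B_created = a1 * d * 1.0 / (1 - a0 ** 2)    # bias at z = 1 when g = 0
```

The bias is z-dependent only through the interaction coefficient `d`; setting `d = 0` collapses every quantity back to the linear case.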