Let us examine the simple IV model, assuming a zero-mean, unit-variance standardization. If we retrace the derivation of the association between $X$ and $Y$ conditional on $Z$ (equation 1), we find that the resulting formula holds not only for a perfect IV but also for a near-IV (see my previous article (3)). Allowing a confounding path to extend from $Z$ to $Y$ changes only the crude association, which increases by the contribution of the added confounding path $X \leftarrow Z \rightarrow Y$.
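The claim above can be checked numerically. The following is a minimal sketch, not the article's own derivation: it assumes a standardized linear model $X = \alpha_0 Z + \alpha_1 U + e_X$, $Y = \beta_0 X + \beta_1 U + c Z + e_Y$, where the coefficient labels (in particular $\beta_0$, $\beta_1$, and $c$) and all numeric values are hypothetical. It builds the covariance matrix implied by the model and compares the crude bias with the bias after adjusting for $Z$, for a perfect IV ($c = 0$) and a near-IV ($c \neq 0$).

```python
import numpy as np

def implied_covariance(B, noise_var):
    """Covariance implied by the linear model v = B v + e with independent errors."""
    n = B.shape[0]
    total = np.linalg.inv(np.eye(n) - B)  # (I - B)^{-1}
    return total @ np.diag(noise_var) @ total.T

def slope_on_x(cov, y, x, adjust=()):
    """Coefficient of x in the least-squares regression of y on x and `adjust`."""
    preds = [x, *adjust]
    s_pp = cov[np.ix_(preds, preds)]
    s_py = cov[preds, y]
    return np.linalg.solve(s_pp, s_py)[0]

def iv_biases(a0, a1, b0, b1, c=0.0):
    """Return (crude bias, Z-adjusted bias) under the assumed model.

    Assumed (hypothetical) labels: a0: Z -> X, a1: U -> X,
    b0: X -> Y (causal effect), b1: U -> Y, c: Z -> Y (near-IV path).
    """
    Z, U, X, Y = range(4)
    B = np.zeros((4, 4))
    B[X, Z], B[X, U] = a0, a1
    B[Y, X], B[Y, U], B[Y, Z] = b0, b1, c
    var_ex = 1 - a0**2 - a1**2  # keeps Var(X) = 1
    explained_y = implied_covariance(B, [1, 1, var_ex, 0])[Y, Y]
    cov = implied_covariance(B, [1, 1, var_ex, 1 - explained_y])  # Var(Y) = 1
    crude = slope_on_x(cov, Y, X) - b0
    adjusted = slope_on_x(cov, Y, X, adjust=[Z]) - b0
    return crude, adjusted

# Made-up coefficient values, chosen only so that all variances stay positive.
perfect = iv_biases(0.6, 0.5, 0.3, 0.4)        # perfect IV
near = iv_biases(0.6, 0.5, 0.3, 0.4, c=0.2)    # near-IV
print("perfect IV:", perfect)
print("near-IV:   ", near)
```

Under these assumptions the adjusted bias is identical in the two cases and larger than the crude bias of the perfect-IV model, while the crude bias of the near-IV model grows by the product of the two coefficients on the added path.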
Now consider a system of multiple confounders in which each covariate intercepts a distinct confounding path between $X$ and $Y$, and for which the crude bias (without any conditioning) is $B_0$. If we condition on $Z_1$, two modifications are required. First, the path containing $Z_1$ will no longer contribute to confounding and, second, whatever bias is contributed by the remaining paths, namely $B_0 - \alpha_1\beta_1$, will be amplified by a factor of $(1 - \alpha_1^2)^{-1}$, reflecting the decreased variance of $X$ due to fixing $Z_1$. Overall, the bias remaining after conditioning on $Z_1$ will read

$$B(Z_1) = \frac{B_0 - \alpha_1\beta_1}{1 - \alpha_1^2} \tag{3}$$

Further conditioning on $Z_2$ will remove the term $\alpha_2\beta_2$ from the numerator (deactivating the path $X \leftarrow Z_2 \rightarrow Y$) and will replace the denominator by the factor $(1 - \alpha_1^2 - \alpha_2^2)$, representing the reduced variance of $X$ due to fixing both $Z_1$ and $Z_2$. The resulting bias will be

$$B(Z_1, Z_2) = \frac{B_0 - \alpha_1\beta_1 - \alpha_2\beta_2}{1 - \alpha_1^2 - \alpha_2^2} \tag{4}$$
We see the general pattern that characterizes sequential conditioning on sets of covariates organized in this way. The bias $B(Z)$ remaining after conditioning on a set $Z = (Z_1, Z_2, \ldots, Z_{k-1}, Z_k)$ is given by the formula

$$B(Z) = \frac{B_0 - \alpha_1\beta_1 - \alpha_2\beta_2 - \cdots - \alpha_k\beta_k}{1 - \alpha_1^2 - \alpha_2^2 - \cdots - \alpha_k^2} \tag{5}$$

which reveals 2 distinct patterns of progression, one representing confounding reduction (shown in the numerator) and one representing IV amplification (shown in the denominator). The latter increases monotonically while the former progresses nonmonotonically, since the signs of the added terms may alternate. Thus, the cumulative effect of sequential conditioning has a built-in slant towards bias amplification as compared with confounding reduction; the latter is tempered by sign cancellations, the former is not.
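The two progressions can be traced with a few lines of arithmetic. The sketch below is illustrative only: the function name `sequential_bias` and every numeric value are made up, and it assumes the setting described in the text (standardized variables, uncorrelated covariates, one distinct path per covariate). The crude bias of 0.22 is taken to include a contribution from a path that is never conditioned on.

```python
def sequential_bias(b0, alphas, betas):
    """Bias remaining after conditioning on Z_1, then Z_1..Z_2, and so on.

    b0     : crude bias before any conditioning
    alphas : coefficients of each Z_k on X
    betas  : coefficients carrying each Z_k's path into Y
    """
    numerator, denominator = b0, 1.0
    trace = []
    for a, b in zip(alphas, betas):
        numerator -= a * b        # path X <- Z_k -> Y is deactivated
        denominator -= a ** 2     # variance of X shrinks further
        if denominator <= 0:
            raise ValueError("squared alphas must sum to less than 1")
        trace.append(numerator / denominator)
    return trace

# Hypothetical values; note the negative beta, which makes the numerator
# move nonmonotonically while the denominator only ever decreases.
trace = sequential_bias(0.22, [0.4, 0.3, 0.5], [0.5, -0.6, 0.2])
for k, bias in enumerate(trace, start=1):
    print(f"after conditioning on Z_1..Z_{k}: bias = {bias:.4f}")
```

With these values the bias first drops almost to zero, then rises above the crude value once the second covariate is added, which is the sign-cancellation effect the formula describes.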
In deriving equation 5, we assumed that no $Z_k$ is a collider, that each $Z_k$ has a distinct path characterized by $\alpha_k$, and that the $Z_k$'s are not correlated. In a general graph, where multiple paths may traverse each $Z_k$, $B(Z)$ is expressed in terms of two quantities: the crude bias $B_0$ as modified by conditioning on $(Z_1, Z_2, \ldots, Z_{k-1}, Z_k)$, and, for each $Z_k$, its coefficient in the regression of $X$ on $(Z_1, Z_2, \ldots, Z_{k-1}, Z_k)$. For example, in model 5 of Myers et al. (4), the crude bias is given by equation 7, and the bias remaining after conditioning on $Z$ is a ratio. The numerator is obtained by setting $\alpha_2 = 0$ in equation 7 and multiplying the remaining term by a factor that accounts for the effect that conditioning on $Z$ has on the path $X \leftarrow U \rightarrow Y$. The denominator invokes the factor $\alpha' = (\alpha_2 + \gamma_1\alpha_1)$, which is the regression coefficient of $X$ on $Z$.
We see that, in this model, $\gamma_1$ controls simultaneously the reduction of confounding bias and the amplification of residual bias, both caused by conditioning on $Z$. Myers et al. (4) assumed that $\gamma_1$ controls the former only.
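The dual role of $\gamma_1$ can be made concrete with a worked sketch. The model structure used here is an assumption inferred from the path descriptions above, not a reproduction of the original figure or equations: a standardized linear model in which $U$ affects $Z$ (coefficient $\gamma_1$), $X$ (coefficient $\alpha_1$), and $Y$ (coefficient $\beta_1$, a hypothetical label), and $Z$ affects $X$ (coefficient $\alpha_2$). Under those assumptions one would obtain:

```latex
% Assumed structure (hypothetical): U -> Z (gamma_1), U -> X (alpha_1),
% Z -> X (alpha_2), U -> Y (beta_1); all variables have unit variance.
\begin{align*}
B_0 &= \beta_1(\alpha_1 + \alpha_2\gamma_1), \\
\operatorname{Cov}(X, U \mid Z)
    &= (\alpha_1 + \alpha_2\gamma_1) - (\alpha_2 + \gamma_1\alpha_1)\gamma_1
     = \alpha_1(1 - \gamma_1^2), \\
\operatorname{Var}(X \mid Z) &= 1 - (\alpha_2 + \gamma_1\alpha_1)^2, \\
B(Z) &= \frac{\alpha_1\beta_1(1 - \gamma_1^2)}{1 - (\alpha_2 + \gamma_1\alpha_1)^2}.
\end{align*}
```

In this sketch $\gamma_1$ appears both in the numerator, where $(1 - \gamma_1^2)$ shrinks the residual confounding, and in the denominator, where it enters through $\alpha'$ and inflates whatever bias remains.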
As for the extent to which these results generalize to nonlinear models, it has been shown (3) that, while in linear systems conditioning on an IV always amplifies confounding bias (if such exists), bias in nonlinear systems may be amplified as well as attenuated. Additionally, an IV may introduce new bias where none exists. This can be demonstrated by introducing an interaction term into the simple IV model. With this modification, both equation 1 and the crude association change, and the resulting $z$-adjusted bias, $B_z$, can be expressed in terms of the unadjusted bias, $B_0$.
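One way to carry out this demonstration is sketched below. The specific model is an assumption made for illustration, not necessarily the one used in the original: $X = \alpha_0 Z + \alpha_1 U + \varepsilon_X$ and $Y = \beta_0 X + \beta_1 U + \delta X U + \varepsilon_Y$, with $Z$, $U$, and the error terms independent and jointly normal, unit variances for $Z$, $U$, and $X$, and $\beta_0$, $\beta_1$ as hypothetical labels. The causal effect is then $\beta_0$, since $E(U) = 0$.

```latex
% Assumed model (hypothetical labels beta_0, beta_1):
%   X = alpha_0 Z + alpha_1 U + eps_X,   Y = beta_0 X + beta_1 U + delta X U + eps_Y
\begin{align*}
E(U \mid x, z) &= \frac{\alpha_1 (x - \alpha_0 z)}{1 - \alpha_0^2},
\qquad E(U \mid x) = \alpha_1 x, \\
\frac{\partial}{\partial x} E(Y \mid x, z)
  &= \beta_0 + \frac{\alpha_1(\beta_1 + 2\delta x) - \alpha_0\alpha_1\delta z}{1 - \alpha_0^2}, \\
\frac{\partial}{\partial x} E(Y \mid x)
  &= \beta_0 + \alpha_1(\beta_1 + 2\delta x), \\
B_0 &= \alpha_1(\beta_1 + 2\delta x),
\qquad
B_z = \frac{B_0 - \alpha_0\alpha_1\delta z}{1 - \alpha_0^2}.
\end{align*}
```

Setting $\delta = 0$ in this sketch recovers the linear amplification factor $(1 - \alpha_0^2)^{-1}$.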
We see that, if $B_0 \geq 0$ and $\alpha_0\alpha_1\delta z > 0$, we can get $|B_z| < |B_0|$. This means that conditioning on $Z$ may reduce confounding bias, even though $Z$ is a perfect instrument and both $Y$ and $X$ are linear in $U$. Note that, owing to the nonlinearity of $Y(x, u)$, the conditional bias depends on the value of $Z$ and, moreover, for $Z = 0$ we obtain the same bias amplification as in the linear case (equation 1).
We also see that conditioning on $Z$ can introduce bias where none exists. However, this occurs only for a specific value of $X$, a condition that yields $B_0 = 0$ and $|B_z| > 0$.
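Both phenomena can be illustrated with a short numeric sketch. It assumes, purely for illustration, the interaction model $Y = \beta_0 X + \beta_1 U + \delta X U + \varepsilon_Y$ with $X = \alpha_0 Z + \alpha_1 U + \varepsilon_X$ and jointly normal, standardized variables, under which the crude bias would be $\alpha_1(\beta_1 + 2\delta x)$ and the $z$-adjusted bias would be $(B_0 - \alpha_0\alpha_1\delta z)/(1 - \alpha_0^2)$. The function names, the label $\beta_1$, and all numeric values are hypothetical.

```python
def crude_bias(a1, b1, delta, x):
    """Unadjusted bias B0 at X = x under the assumed interaction model."""
    return a1 * (b1 + 2 * delta * x)

def z_adjusted_bias(a0, a1, b1, delta, x, z):
    """Bias Bz at X = x after conditioning on Z = z (assumed model)."""
    b0 = crude_bias(a1, b1, delta, x)
    return (b0 - a0 * a1 * delta * z) / (1 - a0 ** 2)

# Made-up coefficients: a0: Z -> X, a1: U -> X, b1: U -> Y, delta: X*U interaction.
a0, a1, b1, delta = 0.6, 0.5, 0.4, 0.2

# Case 1: at x = 0, conditioning on Z = 0 amplifies, conditioning on Z = 2 attenuates.
b0_at_x0 = crude_bias(a1, b1, delta, x=0.0)
bz_at_z0 = z_adjusted_bias(a0, a1, b1, delta, x=0.0, z=0.0)
bz_at_z2 = z_adjusted_bias(a0, a1, b1, delta, x=0.0, z=2.0)
print("x=0:", b0_at_x0, bz_at_z0, bz_at_z2)

# Case 2: at the specific x where the crude bias vanishes, adjustment creates bias.
x_star = -b1 / (2 * delta)
b0_at_xstar = crude_bias(a1, b1, delta, x=x_star)
bz_at_xstar = z_adjusted_bias(a0, a1, b1, delta, x=x_star, z=2.0)
print("x*:", x_star, b0_at_xstar, bz_at_xstar)
```

The first case shows the same instrument amplifying bias at one value of $Z$ and attenuating it at another; the second shows a nonzero adjusted bias where the crude bias is zero.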