We consider the logistic model for a binary outcome and a binary exposure. The continuous exposure case is discussed in Section 1 of the supplementary material. Three estimates of confounding bias are compared in terms of bias and variability: the simple estimate, which includes the nonlinearity effect, and 2 estimates of the corrected measure of confounding that do not include the nonlinearity effect, the standardization estimate (3.4) and the IPW estimate (3.6).
We simulate 5000 data sets with 1000 exposed and 1000 unexposed subjects under the following model: [...] with [...] = −3, α0 = 0, and a modest conditional exposure effect, exp(β1) = 2.0. The parameters α1 [...] are varied to explore different scenarios. We consider the causal log odds ratio, T [...], the confounding bias Δ = T [...], and the size of the nonlinearity effect, Δnl [...].
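As an illustration of the simulation design, one data set could be generated along the following lines. This is a minimal sketch, not the authors' code: the functional form (Z given X normal with mean α0 + α1 X and unit variance, and a logistic outcome model with intercept β0 and confounder coefficient β2) is an assumption, as are the names `simulate_dataset` and `crude_log_odds_ratio` and the reading of the value −3 as the outcome-model intercept.

```python
import math
import random


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def simulate_dataset(alpha1, beta2, n_per_group=1000, beta0=-3.0, alpha0=0.0,
                     beta1=math.log(2.0), sigma=1.0, rng=None):
    """Draw one data set of (x, z, y) triples under an ASSUMED model:
         Z | X ~ Normal(alpha0 + alpha1 * X, sigma^2)
         logit P(Y = 1 | X, Z) = beta0 + beta1 * X + beta2 * Z
    beta0 = -3 and the normal form of Z | X are assumptions of this sketch.
    """
    if rng is None:
        rng = random.Random()
    data = []
    for x in (0, 1):                      # fixed numbers of unexposed/exposed
        for _ in range(n_per_group):
            z = rng.gauss(alpha0 + alpha1 * x, sigma)
            p = expit(beta0 + beta1 * x + beta2 * z)
            y = 1 if rng.random() < p else 0
            data.append((x, z, y))
    return data


def crude_log_odds_ratio(data):
    """Log odds ratio of Y on X ignoring Z, computed from the 2x2 table."""
    counts = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
    for x, _, y in data:
        counts[(x, y)] += 1
    # 0.5 is added to every cell so that an empty cell cannot break the log.
    a, b = counts[(1, 1)] + 0.5, counts[(1, 0)] + 0.5
    c, d = counts[(0, 1)] + 0.5, counts[(0, 0)] + 0.5
    return math.log(a * d / (b * c))


# Example: one data set with illustrative (hypothetical) values alpha1 = 0.5, beta2 = 1
data = simulate_dataset(alpha1=0.5, beta2=1.0, rng=random.Random(2024))
print(len(data), round(crude_log_odds_ratio(data), 3))
```

Apart from the 0.5 cell correction, the crude log odds ratio from the 2x2 table matches the exposure coefficient of a logistic regression of Y on X alone, which is one of the two coefficients contrasted by the simple estimate.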
Each scenario is displayed graphically using a plot of E(Y|X,Z) versus Z, for each value of X (see Table 1). These plots visually display the size of the exposure effect, the amount of confounding, and the size of the nonlinearity effect. The length of the lines is governed by the interquartile range of Z|X, and the dot on each line is at the marginal mean E(Z|X). The vertical separation between the lines for X = 0 and X = 1 displays the size of the conditional exposure effect. The horizontal separation between the lines is due to differences in the Z-distribution between exposed and unexposed populations, and thus represents the amount of confounding. Finally, the curvature of the lines measures the degree of nonlinearity of E(Y|X,Z) as a function of Z and therefore captures the nonlinearity effect.
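The three quantities described above can also be computed at the population level by numerical integration. The sketch below rests on stated assumptions rather than the paper's own definitions: the outcome model is logit P(Y = 1|X, Z) = β0 + β1 X + β2 Z with Z|X normal with mean α1 X and unit variance, the causal log odds ratio is standardized to the pooled Z-distribution of equal-sized exposure groups, and the sign conventions (crude minus causal for confounding, causal minus conditional for the nonlinearity effect) are chosen for illustration. The function names are hypothetical.

```python
import math


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def logit(p):
    return math.log(p / (1.0 - p))


def normal_mean(f, mu, sigma=1.0, n=4000, width=8.0):
    """Approximate E[f(Z)] for Z ~ Normal(mu, sigma^2) with a midpoint rule."""
    lo = mu - width * sigma
    h = 2.0 * width * sigma / n
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        total += f(z) * math.exp(-0.5 * ((z - mu) / sigma) ** 2) / norm * h
    return total


def decompose(alpha1, beta2, beta0=-3.0, beta1=math.log(2.0), alpha0=0.0):
    """Population-level log odds ratios under the ASSUMED model of this sketch."""
    def risk(x):
        return lambda z: expit(beta0 + beta1 * x + beta2 * z)

    mu = {0: alpha0, 1: alpha0 + alpha1}

    # Crude: each exposure group is averaged over its own Z-distribution.
    crude = logit(normal_mean(risk(1), mu[1])) - logit(normal_mean(risk(0), mu[0]))

    # Causal (standardized): both exposure levels are averaged over the pooled
    # Z-distribution of two equal-sized groups (an assumption of this sketch).
    def pooled(x):
        return 0.5 * (normal_mean(risk(x), mu[0]) + normal_mean(risk(x), mu[1]))

    causal = logit(pooled(1)) - logit(pooled(0))

    return {
        "crude": crude,
        "causal": causal,
        "conditional": beta1,
        "confounding": crude - causal,    # corrected measure (assumed sign)
        "nonlinearity": causal - beta1,   # non-collapsibility of the odds ratio
        "simple": crude - beta1,          # contrast of unadjusted and adjusted
    }


for a1, b2 in [(0.0, 2.0), (1.0, 0.0), (1.0, 1.0)]:
    print(a1, b2, {k: round(v, 4) for k, v in decompose(a1, b2).items()})
```

By construction the simple measure equals the sum of the confounding and nonlinearity terms, so with α1 = 0 (no confounding) any nonzero simple measure is due to the nonlinearity effect alone.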
Table 1. 5000 simulations under model (5.2) to evaluate the performance of the 3 estimates of confounding bias for a binary exposure. In each scenario, E(Y|X, Z) is plotted against Z for each value of X. In all scenarios, exp(β1) = 2.0. The mean Neuhaus and ...
The performances of the estimators of confounding bias are displayed in Table 1. When both the confounding and the nonlinearity effect are negligible (both < 1% of the size of the exposure effect; Scenarios A and B), the standardization and IPW estimators tend to be less biased but more variable than the simple estimator. As the amount of confounding bias increases (holding the nonlinearity effect under 1% of the size of the exposure effect; Scenarios C, D, and E), the simple and corrected estimates are all relatively unbiased, but the corrected estimates are substantially more variable. Note that the IPW estimator, in particular, performs poorly when the Z–X association is strong (Scenarios D and E), as P(X = 1|Z) is close to 0 or 1 and the weights are extreme. When there is a large nonlinearity effect (> 25% of the size of the exposure effect) but negligible confounding bias (< 7% of the size of the exposure effect; Scenarios G and H), the simple estimator is substantially biased, and the corrected estimators are relatively unbiased and less variable. Finally, when there is large confounding bias (> 30% of the size of the exposure effect; Scenarios I, J, and K), adding a nonlinearity effect makes the simple estimator biased; the corrected estimators are relatively unbiased but tend to be more variable.
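The instability of the IPW estimator under a strong Z–X association can be seen directly from the weights. The following sketch assumes, for illustration only, equal numbers of exposed and unexposed subjects and Z|X normal with mean α1 X and unit variance, in which case the probability of exposure given Z = z is expit(α1 z − α1²/2). The function names are hypothetical.

```python
import math


def propensity(z, alpha1):
    """P(X = 1 | Z = z) under the ASSUMED design of this sketch:
    equal-sized exposure groups and Z | X ~ Normal(alpha1 * X, 1)."""
    return 1.0 / (1.0 + math.exp(-(alpha1 * z - 0.5 * alpha1 ** 2)))


def ipw_weight(x, z, alpha1):
    """Inverse probability weight: 1/p for exposed, 1/(1 - p) for unexposed."""
    p = propensity(z, alpha1)
    return 1.0 / p if x == 1 else 1.0 / (1.0 - p)


# Weight of an exposed subject whose Z falls at the unexposed-group mean (z = 0),
# for increasingly strong Z-X association.
for alpha1 in (0.5, 1.0, 2.0, 3.0):
    print(alpha1, round(ipw_weight(1, 0.0, alpha1), 2))
```

Under these assumptions the weight of such a subject is 1 + exp(α1²/2), so it grows rapidly with α1, and a few subjects in the region of poor overlap can dominate the weighted estimate.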
The approximation of Neuhaus and others (1991) given in (4.2) is also used to estimate the nonlinearity effect in each scenario (Table 1). This approximation produces a reasonably unbiased estimate of the nonlinearity effect and can be used as a simple diagnostic to determine whether the nonlinearity effect is large enough to merit using a bias-corrected estimate of confounding.
We conclude that in scenarios with very small nonlinearity effects, the simple estimate of confounding bias, obtained by contrasting coefficients for the risk factor of interest from regression models with and without the confounders, is appropriate. Correcting for the small amount of bias comes at too high a cost in terms of extra variability. On the other hand, in circumstances where the nonlinearity effect is sizeable, the corrected estimates of confounding are less biased, although this reduction in bias may come with increased variability of the confounding estimate. Varying the size of the conditional exposure effect leads to similar conclusions.