Ad-hoc causal diagrams have been used to encode investigators’ knowledge about systems of variables in epidemiology and the biologic sciences for decades (eg,5,8). Pearl formalized causal diagrams as directed acyclic graphs (DAGs), providing investigators with powerful tools for bias assessment. A set of rules for causal diagrams is succinctly described by Greenland et al10 and in the appendix of Hernán et al.11
Briefly, causal diagrams link variables by single-headed (ie, directed) arrows that represent direct causal effects. To represent chains of causation in time, Pearl’s formalization of causal diagrams does not allow a directed path9 (ie, a trail of arrows) to point back to a prior variable (ie, the diagrams are acyclic). For a diagram to represent a causal system, all shared causes of any pair of variables included on the graph must also be included on the graph. The absence of an arrow between 2 variables is a strong claim of no direct effect of the former variable on the latter. We denote control (eg, regression adjustment, stratification, restriction) by placing a box around the controlled variable.
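As a minimal sketch of the acyclicity rule, a diagram can be stored as a mapping from each variable to its direct causes and checked for directed cycles. The helper name `is_acyclic` and the two toy diagrams are illustrative assumptions, not part of the original article.

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(parents):
    """Return True if the diagram (variable -> set of direct causes) has no directed cycle."""
    try:
        # static_order() raises CycleError when a directed path points back on itself.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

# Hypothetical diagram: E -> M -> D and E -> D.
dag = {"E": set(), "M": {"E"}, "D": {"E", "M"}}

# Adding D -> E would let a directed path return to a prior variable.
cyclic = {"E": {"D"}, "M": {"E"}, "D": {"E", "M"}}

print(is_acyclic(dag), is_acyclic(cyclic))
```

Only the first diagram qualifies as a DAG in the sense used here.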
Causal Diagrams for Overadjustment Bias
We define overadjustment bias as control for an intermediate variable (or a descending proxy for an intermediate variable) on a causal path from exposure to outcome. DAG 1 provides a causal diagram representing the simplest case of overadjustment bias. For example, Bodnar et al12 evaluated the mediating role of triglycerides (M in our notation) in the association between prepregnancy body mass index (E in our notation) and preeclampsia (D in our notation), which is consistent with this causal diagram.
In this scenario, one can consistently estimate the total causal effect of exposure E on outcome D using common regression techniques by ignoring the intermediate variable M. However, if one controls (ie, adjusts, stratifies, restricts) for the intermediate variable M, which is on a causal pathway between exposure and outcome, the total causal effect of the exposure on the outcome cannot be consistently estimated. Yet, as Cole and Hernán et al11 and others have noted, such adjustment can provide correct estimates of the controlled direct causal effect with added assumptions.13–16
With control for M, the observed association between the exposure E and outcome D will typically be a null-biased estimate of the total causal effect. In cases where the only causal path between exposure E and outcome D is that path mediated through M (ie, no direct effect of E on D, which requires a perturbation of DAG 1), the observed association between exposure E and outcome D will typically be null in expectation, conditional on the intermediate M.
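The behavior described for DAG 1 can be illustrated with a small linear simulation. This is only a sketch under assumed values: the coefficients (0.5 and 0.3), the standard normal errors, the sample size, and the helper `ols` are all illustrative choices, not taken from the article.

```python
import numpy as np

def ols(y, *covariates):
    """Least-squares coefficients of y on an intercept plus the given covariates."""
    X = np.column_stack([np.ones_like(y), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 200_000
E = rng.normal(size=n)
M = 0.5 * E + rng.normal(size=n)             # E -> M
D = 0.5 * M + 0.3 * E + rng.normal(size=n)   # M -> D and E -> D (assumed direct effect)

total_effect = 0.3 + 0.5 * 0.5               # direct + indirect = 0.55
crude = ols(D, E)[1]                         # ignores M: estimates the total effect
adjusted = ols(D, E, M)[1]                   # controls for M: leaves only the direct part

# Perturbation of DAG 1 with no direct effect of E on D.
D0 = 0.5 * M + rng.normal(size=n)
adjusted_no_direct = ols(D0, E, M)[1]        # close to zero, conditional on M

print(crude, adjusted, adjusted_no_direct)
```

In this setup the crude coefficient recovers the total effect, whereas the M-adjusted coefficient is pulled toward the null and is approximately zero when the only path runs through M.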
DAG 2 provides a second causal diagram representing perhaps a more common case of overadjustment bias. This diagram encodes the assumption that exposure E and unmeasured intermediate U both affect the outcomes D and M. Weinberg17 described this case in her example of adjusting for prior history of spontaneous abortion (M). An underlying abnormality in the endometrium (U) is the unmeasured intermediate caused by smoking (E), and is a cause of prior (M) and current (D) spontaneous abortion.
Note that in DAG 2 the measured variable M is a “descending” proxy for the intermediate variable U, which itself is typically unmeasured; one can think of M as a mismeasured version of U under a classic measurement error model, or as an event caused by U. One can again consistently estimate the total causal effect of exposure E on outcome D using common regression techniques by ignoring M, the imperfect proxy for the unmeasured intermediate variable U. However, if one controls (ie, adjusts, stratifies, restricts) for the variable M in DAG 2, which is a proxy for variable U (on a causal pathway between exposure E and outcome D), the total causal effect of the exposure on the outcome again cannot be consistently estimated. In such cases, one could place a half-box around U to denote the partial adjustment for the unmeasured U that occurs with adjustment for the measured M.
In the cases described by DAG 2, the observed association between the exposure E and outcome D will typically be biased toward the null with respect to the total causal effect. But in such cases, the null-bias will be attenuated compared with DAG 1. Even in the (extreme) case where the only causal path between exposure E and outcome D is mediated through the unmeasured intermediate U (ie, no direct effect of E on D, a perturbation of DAG 2), the observed association between exposure and outcome will not be completely negated in expectation. Intuitively, one can see that adjustment for M, where M is an imperfect measure of U, leaves a partially open pathway from E through U to D. Because mismeasurement of exposures is ubiquitous in general18,19 and with pathway markers in particular, it has become popular practice to try to adjust for a proxy variable of the unmeasured intermediate variable in an attempt to decompose the effect measure into direct and indirect components. Investigators employing such approaches should be wary of the inability of proxies to completely close pathways for which they proxy.11
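The partially open pathway can be seen in a short simulation of DAG 2 with no direct effect of E on D. This is a sketch under assumed values: all coefficients, the unit error variances, and the helper `ols` are illustrative and not drawn from the article.

```python
import numpy as np

def ols(y, *covariates):
    """Least-squares coefficients of y on an intercept plus the given covariates."""
    X = np.column_stack([np.ones_like(y), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(2)
n = 200_000
E = rng.normal(size=n)
U = 0.5 * E + rng.normal(size=n)   # intermediate, unmeasured in practice
M = 1.0 * U + rng.normal(size=n)   # descending proxy: U measured with error
D = 0.5 * U + rng.normal(size=n)   # outcome, affected by E only through U

crude = ols(D, E)[1]               # total effect, 0.5 * 0.5 = 0.25
adjusted_for_M = ols(D, E, M)[1]   # proxy adjustment: about half the total effect here
adjusted_for_U = ols(D, E, U)[1]   # infeasible full adjustment: about zero

print(crude, adjusted_for_M, adjusted_for_U)
```

Adjusting for the proxy M attenuates the estimate but does not drive it to zero, whereas adjusting for U itself (if it could be measured) would close the pathway completely.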
Figure 1 quantifies the overadjustment bias (ie, bias in the total effect estimate) under general linear model assumptions (the direction of the bias will be the same under generalized linear model assumptions, but the magnitude may differ depending on the link function), where we define (assuming no direct effect of E on D):
where U is the unmeasured intermediate variable, M is the measured descending proxy of U, E is the exposure of interest, and D is the outcome of interest.
FIGURE 1. Bias estimating the total effect of exposure of interest E on the outcome D as a function of the direct effect of the unmeasured intermediate (U) variable (βD) on the outcome D and the direct effect of the unmeasured intermediate (U) variable ...
Estimating the unknown parameters in (1) is equivalent to estimating:
However, if one adjusts for M in estimating the effect of E, one would obtain approximately:
Therefore, bias arises from using model (2) to estimate the total effect of E. One can see in Figure 1 that the bias is a linear function of βD and a quadratic function of βM. In the case of jointly continuously distributed variables, the bias for the total causal effect of exposure E on disease D, conditioning on the measured proxy M, is simply the difference between the partial Pearson correlation between exposure E and disease D, controlling for M, and the simple Pearson correlation between exposure E and disease D.
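For orientation, the form of this bias can be worked out under one simple set of linear structural equations. The equations below are an assumed sketch and are not necessarily the article's own equations (1) and (2): the coefficient β_U (effect of E on U), the mutually independent errors, and the error variances σ_U² and σ_M² are assumptions introduced here.

```latex
% Assumed structural model for DAG 2 with no direct effect of E on D;
% \varepsilon_U, \varepsilon_M, \varepsilon_D are mutually independent and independent of E.
\begin{align*}
U &= \beta_U E + \varepsilon_U, \qquad
M = \beta_M U + \varepsilon_M, \qquad
D = \beta_D U + \varepsilon_D, \\[4pt]
% Crude regression of D on E recovers the total effect.
\text{crude coefficient of } E &= \beta_D \beta_U, \\[4pt]
% Regression of D on E and M: M absorbs only part of the variation in U.
\text{$M$-adjusted coefficient of } E
  &= \beta_D \beta_U \,
     \frac{\sigma_M^2}{\beta_M^2 \sigma_U^2 + \sigma_M^2}, \\[4pt]
% Bias of the adjusted coefficient for the total effect.
\text{bias}
  &= -\,\beta_D \beta_U \,
     \frac{\beta_M^2 \sigma_U^2}{\beta_M^2 \sigma_U^2 + \sigma_M^2}.
\end{align*}
```

Under these assumptions the bias is linear in β_D and depends on β_M only through β_M². It vanishes when M carries no information about U (β_M = 0) and approaches the full total effect as the measurement error variance σ_M² goes to zero.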
DAG 3 is a duplicate of DAG 2, except that the proxy variable M for unmeasured U is now an “ascending” rather than a “descending” proxy.
In DAG 3, adjustment for M will not block the path from exposure E to outcome D, even partially. This is because holding M constant does not alter the effect of exposure E on outcome D through intermediate U. Therefore, ascending proxies should not be used as markers of pathways when attempting to decompose total causal effects. DAG 3 could depict an alternate conception of the study of the mediating effect of triglycerides20; here M would represent some other cause of change in triglycerides, such as dietary or lifestyle factors. There is a lack of bias under general or generalized linear model assumptions, where we define (assuming no direct effect of E on D):
In the M-adjusted model evaluating the effects of E on D, we would estimate:
and in the crude model, we would estimate:
Therefore, the bias of using the M-adjusted model to estimate the effect of E is the difference between these two estimates, which is zero. Bias is absent regardless of the magnitude of βUM.
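A brief simulation can illustrate the absence of bias for an ascending proxy. This is a sketch under assumed values: the coefficients, the unit error variances, the two trial values of βUM, and the helper names are illustrative, and M is assumed independent of E.

```python
import numpy as np

def ols(y, *covariates):
    """Least-squares coefficients of y on an intercept plus the given covariates."""
    X = np.column_stack([np.ones_like(y), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def dag3_estimates(beta_um, n=200_000, seed=3):
    """Crude and M-adjusted coefficients of E when M is a cause (ascending proxy) of U."""
    rng = np.random.default_rng(seed)
    E = rng.normal(size=n)
    M = rng.normal(size=n)                          # assumed independent of E
    U = 0.5 * E + beta_um * M + rng.normal(size=n)  # E -> U <- M
    D = 0.5 * U + rng.normal(size=n)                # U -> D, no direct effect of E
    return ols(D, E)[1], ols(D, E, M)[1]

weak = dag3_estimates(beta_um=0.2)
strong = dag3_estimates(beta_um=2.0)
print(weak, strong)
```

In both settings the crude and M-adjusted coefficients are close to the total effect of 0.25, in line with the statement that the size of βUM does not matter.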
One may note the similarity of DAG 3 to standard representations of instrumental variables.21,22 Indeed, in DAG 3, M is an instrument for the effect of intermediate U on outcome D. However, our focus here is the effect of exposure E on outcome D.
DAG 4 is a generalization of DAG 2. This illustrates a general problem with the control of variables affected by exposure,13,16 such as U or M.
Adjusting for a descending proxy M of an unmeasured intermediate variable U (or U itself, if it were measured) is susceptible to collider-stratification bias.23 In this instance, the unmeasured common cause V of the proxy variable M and the outcome variable D causes additional bias in the association between exposure E and outcome D within levels of M. DAG 4 is one of 10 possible cases extending DAG 2 to allow unmeasured common causes of any pair or triad of variables on DAG 2.
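The collider mechanism can be isolated in a simulation in which exposure has no effect on the outcome at all, so that any adjusted association is pure bias. This is a sketch under assumed values: setting the effect of U on D to zero is a simplifying assumption (a perturbation of DAG 4), and the coefficients and the helper `ols` are illustrative.

```python
import numpy as np

def ols(y, *covariates):
    """Least-squares coefficients of y on an intercept plus the given covariates."""
    X = np.column_stack([np.ones_like(y), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(4)
n = 200_000
E = rng.normal(size=n)
V = rng.normal(size=n)              # unmeasured common cause of M and D
U = 0.5 * E + rng.normal(size=n)    # E -> U
M = U + V + rng.normal(size=n)      # U -> M <- V, so M is a collider
D = V + rng.normal(size=n)          # V -> D; U -> D set to zero (assumption)

crude = ols(D, E)[1]                # about zero: E truly has no effect on D
adjusted = ols(D, E, M)[1]          # nonzero: conditioning on M opens E -> U -> M <- V -> D

print(crude, adjusted)
```

With these values the M-adjusted coefficient is about -0.17 even though the true total effect is zero, showing that the bias introduced within levels of M need not be toward the null.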
We define unnecessary adjustment as control for a variable whose control does not affect the expectation of the estimate of the total causal effect between exposure and outcome. Unnecessary adjustment occurs in 4 primary cases represented in DAG 5, namely: (a) adjusting for a variable completely outside the system of interest (C1), (b) adjusting for a variable that causes the exposure only (C2), (c) adjusting for a variable whose only causal association with variables of interest is as a descendant of the exposure and not in the causal pathway (C3), and (d) adjusting for a variable whose only causal association with variables of interest is as a cause of the outcome (C4). The result of adjustment for such variables is that the total causal effect of exposure on outcome remains unchanged (in expectation). We denote these cases as “bias-neutral adjustment.” However, there may be a gain or loss in precision, which depends on the relationship between the exposure of interest (E), the unnecessary adjustment variable (C1–C4), the outcome of interest, and the given sample size. Adjustment for these types of variables could harm rather than improve one’s estimate in terms of the combination of bias and variance.
We performed a simulation study in the linear setting to evaluate this trade-off when estimating the total effect of the exposure variable E on the outcome variable D. We adjusted for 5 factors (C1–C5), including a variable whose only causal association with variables of interest is as a descendant of the outcome (C5). The causal relations among these 7 variables are depicted in DAG 5. For simplicity, we assumed that C1, C2, and C4 follow standard normal distributions and that E is normally distributed with a mean of 10 and a standard deviation of 1. Moreover, we assumed that all the relationships in this system are linear, and we set the coefficients for all these associations at 0.5.
We set C1–C5 to appear in the linear models one at a time, corresponding to lines C1–C5 in the accompanying figure. We varied the sample size of the study (shown on the x axis) from 100 to 100,000 by orders of magnitude. For each sample size, we ran 1000 iterations and estimated the Monte Carlo mean and variance.
Large and small sample size properties on Monte Carlo relative bias and variance of total effect estimates after adjusting for unnecessary variables.
The figure depicts the relative bias and variance of total effect estimates after adjusting for unnecessary variables (C1–C4). The relative bias is null for both large and small samples. In the specific simulated situation described here, we observed a small reduction in variance (gain in efficiency) for the estimated total effect when adjusting for C1–C4.
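A scaled-down version of such a simulation might look like the following. This is a sketch and not the authors' code: the exact data-generating equations are assumptions (in particular, E is generated here as 10 + 0.5·C2 plus standard normal noise, so its standard deviation is slightly above 1), all error terms are assumed standard normal, and the sample sizes and iteration count are reduced to keep the run short.

```python
import numpy as np

def coef_of_E(D, E, extra=None):
    """Coefficient of E from a linear model of D on E, optionally adjusting for one covariate."""
    cols = [np.ones_like(E), E] + ([] if extra is None else [extra])
    return np.linalg.lstsq(np.column_stack(cols), D, rcond=None)[0][1]

def simulate(n, iterations, rng):
    """Monte Carlo mean and variance of the E coefficient, adjusting for one C at a time."""
    names = ["crude", "C1", "C2", "C3", "C4", "C5"]
    estimates = {k: [] for k in names}
    for _ in range(iterations):
        C1, C2, C4 = rng.normal(size=(3, n))         # standard normal, as in the text
        E = 10 + 0.5 * C2 + rng.normal(size=n)       # C2 -> E (assumed form)
        C3 = 0.5 * E + rng.normal(size=n)            # E -> C3, off the causal pathway
        D = 0.5 * E + 0.5 * C4 + rng.normal(size=n)  # E -> D <- C4, true effect 0.5
        C5 = 0.5 * D + rng.normal(size=n)            # D -> C5, descendant of the outcome
        covariates = {"crude": None, "C1": C1, "C2": C2, "C3": C3, "C4": C4, "C5": C5}
        for k in names:
            estimates[k].append(coef_of_E(D, E, covariates[k]))
    return {k: (np.mean(v), np.var(v, ddof=1)) for k, v in estimates.items()}

rng = np.random.default_rng(1)
results = {n: simulate(n, 500, rng) for n in (100, 1000)}
for n, res in results.items():
    for name, (mean, var) in res.items():
        print(n, name, round(mean, 3), round(var, 5))
```

Under these assumptions, adjustment for C1 through C4 leaves the mean estimate near the true value of 0.5, adjustment for C4 reduces the variance relative to the crude model, and adjustment for C5 pulls the mean estimate toward the null.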
The pursuit of unbiased effect estimates is the primary concern when evaluating the presence or absence of overadjustment bias. In the case of unnecessary adjustment, on the other hand, effect estimates are unbiased, and the focus turns to the effect of adjustment on precision. These scenarios have been studied extensively in the literature, especially the case of C4. As noted above, the gain or loss of efficiency depends on the type of model (linear or nonlinear). In the linear setting (shown above), the inclusion of extraneous determinants of the outcome (a predictor of the outcome, but not a confounder) will result in gains in efficiency for the estimation of the association of interest. This gain is caused by a reduction in the residual sum of squares after the extraneous variable is accounted for. In particular, a strong association of C4 with outcome (D) improves precision, whereas in the confounding case (not shown here) a strong association of C4 with exposure (E) alone may have a detrimental effect.
Robinson and Jewell showed no similar practical gain in nonlinear settings, and one must pay for the inclusion of the extraneous determinant with (at least) 1 degree of freedom.24 Specifically, in the logistic model, associations of both exposure and outcome with C4 have detrimental effects on precision for logistic regression estimators of the total effect. Thus, although adjustment for predictive covariates in classic linear regression can result in either increased or decreased precision, adjustment by logistic regression will result in a loss of precision.
In addition, when the measure of association is noncollapsible (eg, odds25 or hazard ratios6), analyses adjusted for C4 may provide different results compared with crude analyses and thus confuse interpretation, because the conditional and marginal causal effects may differ in nonlinear models due to noncollapsibility. Robinson and Jewell24 discussed this topic in detail. They demonstrated that the asymptotic relative precision of β* to the adjusted estimator is less than or equal to 1, where β* (estimator of the total effect) is the estimator of the β coefficient from a crude logistic regression and the adjusted estimator is the analogous estimator from a model adjusting for a determinant of outcome that is unassociated with exposure. Therefore, in this case the standard error for the crude association is smaller than or equal to the adjusted. However, the adjusted point estimate (an estimate of the covariate-conditional effect) will be at least as far from the null as the crude (an estimate of the marginal effect) and biased (very slightly if the disease is rare). The relatively small increase in the standard error with adjustment is typically outweighed by the relatively large increase in the point estimate with adjustment. Thus, statistical power to test β = 0 using the adjusted estimator is increased, but it is for a different estimand (ie, the covariate-conditional effect rather than the marginal effect).
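Noncollapsibility of the odds ratio can be checked with exact arithmetic, with no sampling involved. The numbers below are a hypothetical illustration: a binary covariate C4 with prevalence 0.5 that is independent of a binary exposure E, and an assumed logistic model with intercept -1, log odds ratio 1 for E, and log odds ratio 2 for C4.

```python
import math

def expit(x):
    """Inverse logit: converts log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def odds(p):
    return p / (1.0 - p)

# Assumed logistic model: logit P(D=1 | E, C4) = -1 + 1*E + 2*C4.
def risk(e, c):
    return expit(-1.0 + 1.0 * e + 2.0 * c)

# Conditional (C4-specific) odds ratio is the same in both strata.
conditional_or = odds(risk(1, 0)) / odds(risk(0, 0))

# Marginal risks average over C4, which is independent of E with prevalence 0.5.
marginal_risk_exposed = 0.5 * risk(1, 0) + 0.5 * risk(1, 1)
marginal_risk_unexposed = 0.5 * risk(0, 0) + 0.5 * risk(0, 1)
marginal_or = odds(marginal_risk_exposed) / odds(marginal_risk_unexposed)

print(round(conditional_or, 3), round(marginal_or, 3))
```

Here the conditional odds ratio is about 2.72 and the marginal odds ratio about 2.23, although C4 is not a confounder. The two values answer different questions, which is why crude and C4-adjusted logistic analyses can disagree without either being confounded.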
In the linear model, and as depicted in the simulation results, adjustment for C5 introduces bias when the association between the outcome and the extraneous variable is strong relative to the error in the extraneous variable; adjustment in this case is unwarranted and in extreme cases can cause large bias and loss of precision. Furthermore, this scenario is especially susceptible to collider stratification by an unmeasured variable.27
Example: The Effect of Maternal Smoking on Neonatal Mortality
As an example to illustrate overadjustment bias, we examine the often-studied relation between birth weight and neonatal mortality. Investigators have speculated for decades on possible causes of neonatal mortality, and have consistently demonstrated that birth weight is a strong predictor of neonatal and infant mortality.28,29 When assessing the effect of possible risk factors for neonatal and infant mortality (eg, maternal smoking28), birth weight stratification or adjustment is frequently undertaken. We follow the premises for a causal diagram as proposed by Basso et al.32 They demonstrated that it is plausible for the observed association between birth weight and neonatal mortality to be due to an unmeasured confounder. Under this conjecture, adjustment for birth weight in the study of neonatal mortality would represent overadjustment.
We identified all infants born alive in the United States in 1999–2001 (n = 11,597,620) through the national linked birth/infant death data sets assembled by the National Center for Health Statistics.33
These records contain information on dates and causes of death, birth weight, maternal smoking, and other medical and sociodemographic characteristics systematically recorded on the US birth certificates. Neonatal mortality rates (denoted by the variable D in the causal diagram) were defined as the number of deaths within the first 28 days of life per 100,000 live births. Maternal smoking (denoted by the variable E in the causal diagram) was defined by self-report of prepregnancy smoking. Unmeasured fetal development during pregnancy is denoted by the variable U in the causal diagram. Birth weight can be thought of as a descending proxy for the unmeasured fetal development measured with error (denoted by the variable M in the causal diagrams). Following Basso et al,28 we assume that the relation between birth weight and infant mortality is due to unmeasured confounding by a condition such as malformation, fetal or placental aneuploidy, infection, or imprinting disorder (denoted by the variable V in DAG 6).
E = Prepregnancy maternal smoking
D = Neonatal mortality
U = Unmeasured fetal development during pregnancy
V = Unmeasured confounder, such as imprinting disorder
M = Birth weight
Our analysis excluded data from California (due to lack of smoking information), as well as data from infants with missing information on birth weight or maternal smoking, resulting in 10,035,444 live births. We used risk ratios and differences to quantify the association between maternal smoking and neonatal mortality, and 95% confidence intervals (CIs) to quantify precision.
The neonatal mortality rate was 219 per 100,000 live births, and 12% of mothers reported smoking. The unadjusted risk ratio for the association between maternal smoking and neonatal mortality was 2.49 (95% CI = 2.41–2.56). Adjustment for birth weight (M in the graph) by stratification attenuated the risk ratio to 2.03 (1.97–2.09). Therefore, control (ie, adjustment) for birth weight resulted in a risk ratio 18% smaller (1–2.03/2.49) than the unadjusted risk ratio.
The unadjusted risk difference for the association between maternal smoking and neonatal mortality was 274 per 100,000 (95% CI = 262–287). Adjustment for birth weight by stratification attenuated the risk difference to 228 per 100,000 (216–247). Therefore, control (ie, adjustment) for birth weight resulted in a risk difference 17% smaller (1–228/274) than the unadjusted risk difference. This difference in the measure of association is likely due to the fact that smoking causes changes in U (changes in fetal growth), which affect birth weight and neonatal mortality separately. Under empirical (change-in-estimate) methods for confounder selection, the differences between the crude and adjusted risk ratios and risk differences in these data would support adjusting for birth weight when estimating the total causal effect of smoking on neonatal mortality. However, the data and prior knowledge are consistent with the change in estimate being due to overadjustment bias, and therefore adjustment may be unwise. Instead, clearly stating the causal question to be addressed, depicting the possible data-generating mechanisms using causal diagrams, and measuring the indicated confounders (or conducting a sensitivity analysis) are paramount in such cases.
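The percentage attenuations quoted for the risk ratio and the risk difference follow directly from the reported estimates. The helper name `percent_attenuation` is an illustrative choice.

```python
def percent_attenuation(crude, adjusted):
    """Percentage by which the adjusted estimate is smaller than the crude estimate."""
    return 100 * (1 - adjusted / crude)

rr_attenuation = percent_attenuation(2.49, 2.03)  # risk ratios reported in the text
rd_attenuation = percent_attenuation(274, 228)    # risk differences per 100,000

print(round(rr_attenuation), round(rd_attenuation))  # 18 17
```

Note that the risk ratio calculation compares the ratios themselves, as in the text, rather than their distance from the null value of 1.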
One potential source of confusion is that adjustment for birth weight would not be considered overadjustment if the goal were to estimate direct and indirect effects. Such a goal calls for redrawing DAG 6 to allow a direct causal effect from birth weight to neonatal mortality. In summary, the data alone do not identify causal relationships.