Abbreviations: CDM, counterfactual disparity measure; MSM, marginal structural model; TMLE, targeted minimum loss-based estimation.
Social epidemiologists often seek to determine the mechanisms that underlie health disparities. This work is typically based on mediation procedures that may not be justified with exposures of common interest in social epidemiology. In this analysis, we explored the consequences of using standard approaches, referred to as the difference and generalized product methods, when mediator-outcome confounders are associated with the exposure. We compared these with inverse probability-weighted marginal structural models, the structural transformation method, doubly robust g-estimation of a structural nested model, and doubly robust targeted minimum loss-based estimation. We used data on 900,726 births from 2003 to 2011 in the Penn Moms study, conducted in Pennsylvania, to assess the extent to which breastfeeding prior to hospital discharge explained the racial disparity in infant mortality. Overall, for every 1,000 births, 3.36 more infant deaths occurred among non-Hispanic black women relative to all other women (95% confidence interval: 2.78, 3.93). Using the difference and generalized product methods to assess the disparity that would remain if everyone breastfed prior to discharge suggested a complete elimination of the disparity (risk difference = −0.87 per 1,000 births; 95% confidence interval: −1.39, −0.35). In contrast, doubly robust methods suggested a reduction in the disparity to 2.45 (95% confidence interval: 2.20, 2.71) more infant deaths per 1,000 births among non-Hispanic black women. Standard approaches for mediation analysis in health disparities research can yield misleading results.
Mediation analysis has long been of interest in medical research. Often, the approach is used to assess the extent to which an exposure-outcome relationship is attributable to a third variable. Such questions are routine in social epidemiology. Commonly used racial/ethnic classifications, measures of socioeconomic position, or characterizations of the neighborhood environment are all associated with several health outcomes throughout the life course (1, 2). Variables representing these constructs are often taken to designate “fundamental” (3) or “upstream” (4) causes that shape the distribution of more proximal risk factors leading to health disparities.
Many studies have sought to determine how these upstream determinants of health are “explained” by more proximal risk factors. Examples are numerous and include serum potassium concentrations in the relationship between race and diabetes (5), cancer stage at diagnosis in the relationship between socioeconomic position and mortality (6), and tobacco consumption in the relationship between neighborhood socioeconomic status and lung cancer (7). Most of these studies rely on a procedure for mediation (the difference method) that yields valid inferences under rather strict conditions (8, 9). Moreover, the additional assumptions required for causal inference are difficult to justify with exposures of typical interest in social epidemiology (10). This has led to some debate on the causal status of “nonmanipulable” exposures, such as race, sex, or socioeconomic position (10–15).
From this debate, some progress has emerged. Recent work has shown how mediation methods can be used to assess the magnitude of the disparity that would remain if a downstream risk factor were changed (16). This work focused on key conceptual and analytical issues when race is the exposure of interest. Here, we review several methods for mediation analysis and discuss other complications that can arise in social epidemiology. In particular, we focus on the interpretation of parameter estimates from different approaches when mediator-outcome confounders are associated with the exposure. We provide an overview of 6 approaches, and we present technical details and annotated software code in Web Appendices 1 and 2 (available at http://aje.oxfordjournals.org/).
We explore 2 broad classes of methods. The first is more commonly used but applicable in a narrower range of settings. This class includes the difference method (17) and a generalization of the product method (18, 19). The second is more general in that methods in this class can accommodate exposure-mediator interactions and mediator-outcome confounders associated with the exposure. These include inverse probability-weighted marginal structural models (MSMs) (20), the structural transformation method (also known as sequential g-estimation) (21), g-estimation of a structural nested model (22), and targeted minimum loss-based estimation (TMLE) (23).
We present an example in which we determine the proportion of the racial disparity in infant mortality explained by breastfeeding prior to hospital discharge. We explain how different methods can be used to better exploit existing knowledge about the particular mediator and outcome under study, and we show how these methods yield different results in an empirical analysis of data on 900,726 births in Pennsylvania between 2003 and 2011.
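Figures 1A and 1B are causal diagrams (24), where X represents an exposure, M a mediator, and Y an outcome of interest. We represent exposure-outcome confounders as CXY and mediator-outcome confounders as CMY. Four assumptions may be required to estimate direct causal exposure effects (18): 1) no uncontrolled exposure-outcome confounding, 2) no uncontrolled mediator-outcome confounding, 3) no mediator-outcome confounders associated with the exposure, and 4) no exposure-mediator interaction.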
Assumptions 1–3 are encoded in Figure 1A, which shows that adjusting for CXY and CMY leaves no open back-door path from X to Y (assumptions 1 and 2), and where there is no arrow from X to CMY (assumption 3). If the stable unit treatment value assumption (25) is met for both the exposure and the mediator, if there is no selection or information bias, and if assumptions 1–4 hold, one can use any of the mediation methods we subsequently discuss to estimate the direct causal effect of the exposure. If assumption 4 is violated, the difference method can no longer be used (8, 9). If assumption 3 is violated (as in Figure 1B), neither the difference method nor the generalized product method will yield valid causal inferences.
Fewer assumptions are required to estimate counterfactual disparity measures (CDMs) (16). These might be of interest in cases where X is a social exposure of interest, such as an indicator of maternal race (1 if non-Hispanic black, 0 otherwise), M denotes whether a woman breastfed her infant prior to discharge from the hospital (1 if no, 0 if yes), and Y denotes whether an infant born alive died within the first year of life (1 if yes, 0 if no). We define a CDM of this association on the difference scale as
$$\mathrm{CDM}(m = 0) = E[Y(m = 0) \mid X = 1] - E[Y(m = 0) \mid X = 0], \tag{1}$$
where Y(m = 0) is the potential outcome that would be observed if, possibly contrary to fact, a woman began breastfeeding her infant prior to discharge from the place of delivery (26). In equation 1, CDM(m = 0) represents the magnitude of the racial disparity in infant mortality that would be observed if all women breastfed prior to discharge. To identify and estimate the CDM defined in equation 1, fewer assumptions are required for the exposure, but one still requires assumptions for causal inference to hold for the mediator.
Because interest lies in the association between race and the outcome, assumption 1 is not required to estimate the CDM defined in equation 1, because part of the association may be due to exposure-outcome confounding (16). In fact, if nonmanipulable exposures cannot be construed as counterfactually defined causes, variables denoted CXY cannot represent confounders in the strict causal sense (27), and our use of arrows emanating from and entering into X in Figure 1 represents a misuse of causal diagrams (24). However, one may want to standardize the disparity measure in equation 1 by the distribution of certain covariates associated with the exposure and the outcome. Furthermore, accounting for the uncertain causal status of race (designated X) in Figure 1 is beyond the scope of this article, and thus we maintain this incorrect usage throughout.
We additionally denote confounders of the relationship between breastfeeding prior to discharge and infant mortality as CMY, which includes markers of 1) year of birth, 2) urbanicity, 3) maternal education, 4) paternal education, 5) maternal marital status at pregnancy, 6) maternal participation in the Special Supplemental Nutrition Program for Women, Infants, and Children, 7) birth weight (kg), 8) gestational age at birth (weeks), 9) the interaction between birth weight and gestational age at birth, 10) 5-minute Apgar score, 11) parity, 12) prepregnancy smoking status, 13) gestational smoking status, 14) week at first prenatal visit, 15) total number of prenatal visits, 16) maternal age, and 17) paternal age. These will be used as confounders of the relationship between breastfeeding status prior to discharge and infant mortality in our empirical analysis. Importantly, the manner in which these variables relate to race determines whether standard approaches or more general approaches should be used to estimate the CDM.
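If there is no association between any variables in CMY and race and there is no interaction between race and breastfeeding status, then one may use the difference method to estimate the CDM. To do this, we fit a model for the relationship between race and infant mortality with adjustment for the variables in CXY and CMY as
$$E(Y \mid X, C_{XY}, C_{MY}) = \alpha_0 + \alpha_1 X + \boldsymbol{\alpha}_2^{\top} C_{XY} + \boldsymbol{\alpha}_3^{\top} C_{MY}. \tag{2}$$
In the absence of exposure-mediator interactions, adding the mediator to this model yields
$$E(Y \mid X, M, C_{XY}, C_{MY}) = \beta_0 + \beta_1 X + \beta_2 M + \boldsymbol{\beta}_3^{\top} C_{XY} + \boldsymbol{\beta}_4^{\top} C_{MY}. \tag{3}$$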
It is easy to show (Appendix) that under certain additional assumptions required for causal inference (counterfactual consistency (28), no interference (29), correct model specification (30), positivity (31)), β1 = CDM(m = 0), and thus one can calculate the proportion of the disparity that would be eliminated as (α1 − β1)/α1 (32). In the medical literature, this measure is often interpreted as the extent to which the disparity can be prevented or reduced if the mediator is changed (5). However, if race is associated with confounders of the mediator-outcome relationship (as in Figure 1B), then adjusting for CMY in model 2 will yield a parameter α1 that does not meaningfully correspond to a total association, because part of the association between race and the outcome occurs through CMY.
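If none of the variables in CMY are associated with race but there is an interaction between race and breastfeeding status, then one may use a generalization of the product method (18, 19) to estimate the CDM. To implement the approach, we fit a linear regression model as
$$E(Y \mid X, M, C_{XY}, C_{MY}) = \gamma_0 + \gamma_1 X + \gamma_2 M + \gamma_3 XM + \boldsymbol{\gamma}_4^{\top} C_{XY} + \boldsymbol{\gamma}_5^{\top} C_{MY}. \tag{4}$$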
The parameter estimate for race (γ1) from this model can then be interpreted as the CDM(m = 0). However, as with the difference method, if race is associated with confounders of the mediator-outcome relationship (as in Figure 1B), then adjusting for CMY in model 4 will yield a parameter γ1 that must be interpreted as the association that would remain if all women breastfed and if all the variables in CMY were set to their referent levels.
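The interpretive problem described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' analysis code (which is provided in Web Appendix 2). It assumes a hypothetical data-generating process with a single binary mediator-outcome confounder `l` that depends on the exposure, no exposure-outcome confounders, and a linear risk model; all variable names and parameter values are illustrative assumptions. Under these assumptions the true CDM(m = 0) is 0.11, whereas the exposure coefficient from the generalized product model captures only the part of the association that bypasses `l` (0.05).

```python
import numpy as np

rng = np.random.default_rng(2016)
n = 200_000

# Hypothetical data-generating process (illustrative values, not study data).
x = rng.binomial(1, 0.3, n)                       # exposure
l = rng.binomial(1, 0.2 + 0.3 * x)                # mediator-outcome confounder affected by x
m = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.8 * x + 1.0 * l))))  # mediator
risk = 0.10 + 0.05 * x + 0.10 * m + 0.20 * l + 0.05 * x * m
y = rng.binomial(1, risk).astype(float)           # binary outcome, linear risk model

# True CDM(m = 0): the direct x -> y term plus the path through l,
# because E[l | x = 1] - E[l | x = 0] = 0.5 - 0.2.
true_cdm = 0.05 + 0.20 * (0.5 - 0.2)

# Generalized product method: regress y on x, m, x*m and the confounder l.
design = np.column_stack([np.ones(n), x, m, x * m, l])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
gamma1 = coef[1]

print(f"true CDM(m = 0):               {true_cdm:.3f}")
print(f"generalized product (gamma_1): {gamma1:.3f}")
```

In this sketch the regression model is correctly specified, so the gap between `gamma1` and `true_cdm` reflects conditioning on a confounder that lies on a path from the exposure to the outcome, not model misspecification.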
As with many variables of common interest in social epidemiology, race will often be associated with variables that act as mediator-outcome confounders. In particular, birth weight and gestational age, which are included in CMY, are strongly associated with race and are also strong determinants of the outcome in our empirical example. Thus, the CDM(m = 0) can be thought of as the combined magnitude of the arrows in Figure 1B from X to Y if M is set to its referent level. When mediator-outcome confounders are associated with the exposure, more general approaches that include the methods we subsequently discuss must be used. To estimate the CDM, these methods require that there be no uncontrolled confounding of the mediator-outcome relationship. When controlled direct effects are of interest, no exposure-outcome confounding is also a requirement.
We next discuss several methods for dealing with such issues in the context of health disparities research. Following Robins et al. (33), we comment on how the variation in the assumptions underlying these methods might be used to take advantage of prior knowledge on the particular mediator and outcome under study. In effect, we consider when to adjust for confounding by either 1) modeling the mediator, 2) modeling the outcome, or 3) modeling both the mediator and the outcome to obtain a doubly robust estimate of the CDM.
Inverse probability-weighted MSMs (20) can be used to estimate CDMs. The approach proceeds by modeling the mediator and exposure, generating inverse probability weights from these models, and fitting a weighted regression model of the outcome against the exposure, the mediator, and their interaction. The weights can be obtained as
$$sw = \frac{f_X(X)}{f_X(X \mid C_{XY})} \times \frac{f_M(M \mid X)}{f_M(M \mid X, C_{XY}, C_{MY})}, \tag{5}$$
where fX(·) and fM(·) are the probability density functions for X and M, respectively. As explained elsewhere (20, 34), correct specification of the models for the denominator of sw yields an unbiased estimate of the parameter of interest. To correctly estimate the CDM, this approach relies on the assumption that the mediator model is correctly specified as a function of all mediator-outcome confounders.
Conceptually, inverse probability weighting removes the association between CXY and X, and between CMY and M in Figure 1B, enabling one to “erase” arrows, yielding Figure 1C. Using this approach, one estimates the combined magnitude of dashed arrows in Figure 1C. This approach enables estimation of the CDM on the risk difference, risk ratio, or odds ratio scale using weighted binomial regression models with the appropriate link function (20).
Problems may be encountered if the predicted probabilities of the observed exposure or mediator are small. This may occur if, for example, the mediator is continuous, or if very few (or very many) individuals in a given stratum of CMY have M = 1 (or M = 0), leading to inverse probability weights for these individuals that would be very large. Other authors have discussed ways to manage such positivity violations when possible (35–37).
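The weighting steps can be sketched on simulated data. This is a hedged illustration under a hypothetical data-generating process (one binary mediator-outcome confounder `l` affected by the exposure, no exposure-outcome confounders, true CDM(m = 0) = 0.11), not the code from Web Appendix 2. Because every covariate here is binary, the mediator models are fitted as stratum proportions, which is an assumption of convenience that stands in for the regression models one would use in practice; the helper `stratum_mean` is our own name.

```python
import numpy as np

def stratum_mean(target, *keys):
    """Mean of `target` within each cell of the binary `keys` (a saturated model)."""
    idx = np.zeros(len(target), dtype=int)
    for k in keys:
        idx = 2 * idx + k
    n_cells = 2 ** len(keys)
    totals = np.bincount(idx, weights=target, minlength=n_cells)
    counts = np.bincount(idx, minlength=n_cells)
    return (totals / counts)[idx]

rng = np.random.default_rng(2016)
n = 200_000
x = rng.binomial(1, 0.3, n)
l = rng.binomial(1, 0.2 + 0.3 * x)
m = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.8 * x + 1.0 * l))))
y = rng.binomial(1, 0.10 + 0.05 * x + 0.10 * m + 0.20 * l + 0.05 * x * m).astype(float)

# Stabilized weights. With no exposure-outcome confounders, the exposure
# factor of the weight equals 1, so only the mediator factor is needed.
p_m_x = stratum_mean(m, x)          # numerator model: P(M = 1 | X)
p_m_xl = stratum_mean(m, x, l)      # denominator model: P(M = 1 | X, L)
numerator = np.where(m == 1, p_m_x, 1 - p_m_x)
denominator = np.where(m == 1, p_m_xl, 1 - p_m_xl)
sw = numerator / denominator

# Weighted least squares for the MSM with exposure, mediator, and interaction.
design = np.column_stack([np.ones(n), x, m, x * m])
root_w = np.sqrt(sw)
coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
cdm_ipw = coef[1]                   # exposure coefficient = CDM(m = 0)

print(f"mean stabilized weight: {sw.mean():.4f}")
print(f"IPW MSM estimate of CDM(m = 0): {cdm_ipw:.3f}")
```

Checking that the stabilized weights average close to 1 and inspecting their range is a common diagnostic for the positivity problems mentioned above.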
Goetgeluk et al. (38) introduced an approach we call the structural transformation method. To estimate the CDM on the risk difference scale using this approach, one proceeds by modeling the outcome using an identity link function, as
Using parameter estimates from this model, one creates a transformed outcome as
Finally, one regresses this transformed outcome against the exposure and exposure-outcome confounders:
The parameter estimate for the exposure from this model can be interpreted as the CDM defined in equation 1. To estimate its standard error, one can use a modified sandwich variance estimator (21, 39) or the bootstrap. The approach can be used with linear or log-linear risk models, but it cannot be used with logistic models due to noncollapsibility (21).
Conceptually, the structural transformation method subtracts the effect of the mediator from the outcome and regresses the remainder against the exposure (39). The approach yields an estimate of the combined magnitude of the dashed arrows in Figure 1D (40). Vansteelandt (21) discusses use of this approach (termed “sequential g-estimation”) in different settings, including with case-control designs. Unlike inverse probability-weighted MSMs, this approach relies on correct specification of the outcome model (models 6 and 7). This distinction serves as a basis for choosing whether to use inverse probability weighting (i.e., modeling the mediator) or the structural transformation method (modeling the outcome) to estimate the CDM (33). For example, because infant mortality is rare, nonconvergence may be an issue with a flexible linear model for the outcome. In this case, it may be preferable to adjust for confounding using inverse probability-weighted MSMs. In contrast, if specifying a flexible model for the mediator led to highly variable weights, use of the structural transformation method might be preferred.
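The three steps of the structural transformation method (outcome model, transformed outcome, second regression) can be sketched as follows. This is an illustration under an assumed, hypothetical data-generating process with true CDM(m = 0) = 0.11, not the authors' code; the variable names and the specific outcome model are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(2016)
n = 200_000
x = rng.binomial(1, 0.3, n)                       # exposure
l = rng.binomial(1, 0.2 + 0.3 * x)                # mediator-outcome confounder affected by x
m = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.8 * x + 1.0 * l))))
y = rng.binomial(1, 0.10 + 0.05 * x + 0.10 * m + 0.20 * l + 0.05 * x * m).astype(float)

# Step 1: identity-link outcome model with exposure, mediator, their
# interaction, and the mediator-outcome confounder.
design1 = np.column_stack([np.ones(n), x, m, x * m, l])
b, *_ = np.linalg.lstsq(design1, y, rcond=None)
b_m, b_xm = b[2], b[3]

# Step 2: transformed outcome with the estimated mediator effect removed.
y_tilde = y - b_m * m - b_xm * x * m

# Step 3: regress the transformed outcome on the exposure (C_XY is empty here).
design2 = np.column_stack([np.ones(n), x])
phi, *_ = np.linalg.lstsq(design2, y_tilde, rcond=None)
cdm_st = phi[1]

print(f"structural transformation estimate of CDM(m = 0): {cdm_st:.3f}")
```

The point estimate from step 3 is usable as is, but its naive standard error ignores the estimation in step 1, which is why the text recommends a modified sandwich estimator or the bootstrap.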
Alternatively, one may choose to posit a structural nested mean model (22) for the CDM as
where, for a binary x, ψ1 represents the CDM(m = 0). These parameters can be estimated using a modified doubly robust g-estimator (33, 41). The approach can be divided into 2 stages, each with 2 steps. The first stage is to estimate γ2 and γ3 in the structural nested model. In step 1, we obtain predicted mediator probabilities from, for example, a logistic model such as
which can be used to create mediator residuals (the observed mediator minus its predicted probability). Step 2 is to regress the outcome against the mediator residuals, the interaction between the mediator residuals and the exposure, and confounders CMY and CXY using linear regression:
where γ2 and γ3 correspond to the desired parameters for the mediator's effect in the structural nested mean model above.
Once γ2 and γ3 are estimated, one can proceed with the second stage to estimate ψ1 using the same procedure. In step 1, estimate
Create a transformed outcome as
which is the outcome with the mediator effect removed. Finally, regress the transformed outcome against the exposure residuals and confounders of the exposure-outcome relationship in a linear regression model. Parameters for this model can be obtained using, for example, a standard ordinary least squares estimator:
By regressing the transformed outcome against the residual exposure and confounders of the exposure-outcome relationship, one can implement a modified doubly robust g-estimator using standard software routines (21, 33).
In large data sets, the standard (i.e., uncorrected) robust variance estimator (42, 43) will yield conservative confidence intervals for ψ1 (44), but the bootstrap may also be used. In contrast to inverse probability-weighted MSMs or the structural transformation method, the ordinary least squares-based g-estimator we present is doubly robust. This means that the CDM(m = 0) will be consistently estimated if either 1) the model used to generate the predicted mediator probabilities is correctly specified, 2) the models that regress the outcome against the residuals and other covariates are correctly specified, or 3) both conditions (1 and 2) hold (45).
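The two-stage procedure can be sketched with ordinary least squares. The code below is a simplified illustration under a hypothetical data-generating process (true CDM(m = 0) = 0.11, no exposure-outcome confounders), not the authors' implementation. The mediator model is a saturated stratum proportion and the exposure model is an intercept only because CXY is empty; both are assumptions of this sketch, as are the names `psi_m`, `psi_xm`, and `cdm_g`.

```python
import numpy as np

rng = np.random.default_rng(2016)
n = 200_000
x = rng.binomial(1, 0.3, n)
l = rng.binomial(1, 0.2 + 0.3 * x)
m = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.8 * x + 1.0 * l))))
y = rng.binomial(1, 0.10 + 0.05 * x + 0.10 * m + 0.20 * l + 0.05 * x * m).astype(float)

# Stage 1, step 1: predicted mediator probabilities P(M = 1 | X, L)
# from stratum proportions, then mediator residuals.
cell = 2 * x + l
p_m = (np.bincount(cell, weights=m, minlength=4) / np.bincount(cell, minlength=4))[cell]
r_m = m - p_m

# Stage 1, step 2: regress the outcome on the mediator residual, its
# interaction with the exposure, the exposure, and the confounder.
design1 = np.column_stack([r_m, r_m * x, np.ones(n), x, l])
g, *_ = np.linalg.lstsq(design1, y, rcond=None)
psi_m, psi_xm = g[0], g[1]          # mediator effect and its interaction with x

# Stage 2: remove the mediator effect, then regress the transformed
# outcome on the exposure residual.
y_tilde = y - psi_m * m - psi_xm * x * m
r_x = x - x.mean()                  # exposure residual under an intercept-only model
design2 = np.column_stack([np.ones(n), r_x])
psi, *_ = np.linalg.lstsq(design2, y_tilde, rcond=None)
cdm_g = psi[1]

print(f"mediator effect: {psi_m:.3f}, interaction: {psi_xm:.3f}")
print(f"g-estimate of CDM(m = 0): {cdm_g:.3f}")
```

Note that the stage 1 outcome regression omits an `x * l` term and is therefore slightly misspecified for the covariates, yet the mediator-effect coefficients remain on target because the mediator residuals are orthogonal to any function of `x` and `l`. This is the doubly robust property in miniature.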
A final approach for estimating the CDM is TMLE (23). One can implement this procedure by first fitting the following regression models:
Once these models have been fitted, we obtain predicted values from each model under X = 1 and M = 0 for all individuals. Using the outcome model predictions (equation 15) under X = x and M = m, we estimate a “fluctuation parameter,” denoted ϵ2, from a no-intercept logistic regression model:
where the initial outcome prediction is entered as an offset in the model. The fluctuation parameter, ϵ2, captures the degree of residual confounding in these initial predictions. If the predictions are unbiased, ϵ2 will be zero (46). The covariate in this model is an inverse probability weight; the predictions in its denominator are obtained from the models for M and X, and I(●) denotes an indicator function equal to 1 if ● is true (0 otherwise). Once the model is fitted, we generate updated predictions from model 16, setting X = 1 and M = 0 for all subjects. We regress these updated predictions against X and CXY using logistic regression:
and we obtain predicted values for each individual from this model under X = 1. We then estimate a second fluctuation parameter, denoted ϵ1, from the following model:
and produce predicted values from this model under X = 1. Taking the average of these predicted values over all individuals yields an estimate of E[Y(0)|X = 1]. To obtain an estimate of E[Y(0)|X = 0], we repeat the entire process, replacing all instances of X = 1 in the above equations with X = 0. We can then take the difference between these two averages as an estimate of the CDM(m = 0). Analytical standard errors can be obtained by means of the delta method using the equations provided in section 1 of Web Appendix 1. Alternatively, the bootstrap may be used.
Conceptually, TMLE starts by obtaining initial outcome predictions under the desired exposure-mediator combination, such as X = 1 and M = 0. However, these predictions may be biased. The next TMLE step is to allow the outcome predictions to “fluctuate” against a variable that enables us to target the same outcome prediction in another way. This variable, referred to as a “clever covariate” (47, p. 73), is an inverse probability weight that serves as an independent variable in regression models 16 and 18. This clever covariate confers TMLE with a doubly robust property, potentially reducing the bias of the initial predictions. Like g-estimation, TMLE is a consistent (doubly robust) estimator of the CDM if either model 17 (for the mediator) or model 15 (for the outcome) is correct.
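The targeting idea can be sketched in a deliberately simplified form. The code below is not the two-fluctuation procedure described above: it assumes CXY is empty, so it applies a single fluctuation within each exposure stratum, and it uses a crude initial outcome prediction that ignores the confounder in order to show the clever covariate correcting it. The data-generating process (true CDM(m = 0) = 0.11) and all function names are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

def fit_epsilon(y, offset, h, max_iter=50, tol=1e-12):
    """No-intercept logistic regression of y on h with an offset (Newton-Raphson)."""
    eps = 0.0
    for _ in range(max_iter):
        p = expit(offset + eps * h)
        step = np.sum(h * (y - p)) / np.sum(h ** 2 * p * (1.0 - p))
        eps += step
        if abs(step) < tol:
            break
    return eps

def tmle_mean_y0(y, m, l):
    """Targeted estimate of E[Y(m = 0)] within one exposure stratum."""
    # Initial outcome prediction under M = 0: deliberately crude (ignores l).
    q_init = np.full(len(y), y[m == 0].mean())
    # Mediator model: P(M = 0 | L) from stratum proportions.
    p_m0 = np.where(l == 1, 1 - m[l == 1].mean(), 1 - m[l == 0].mean())
    h = (m == 0) / p_m0                       # clever covariate
    eps = fit_epsilon(y, logit(q_init), h)
    # Updated prediction with M set to 0 for everyone in the stratum.
    q_star = expit(logit(q_init) + eps / p_m0)
    return q_star.mean()

rng = np.random.default_rng(2016)
n = 200_000
x = rng.binomial(1, 0.3, n)
l = rng.binomial(1, 0.2 + 0.3 * x)
m = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.8 * x + 1.0 * l))))
y = rng.binomial(1, 0.10 + 0.05 * x + 0.10 * m + 0.20 * l + 0.05 * x * m).astype(float)

ey0_x1 = tmle_mean_y0(y[x == 1], m[x == 1], l[x == 1])
ey0_x0 = tmle_mean_y0(y[x == 0], m[x == 0], l[x == 0])
cdm_tmle = ey0_x1 - ey0_x0

print(f"E[Y(0) | X = 1]: {ey0_x1:.3f}, E[Y(0) | X = 0]: {ey0_x0:.3f}")
print(f"targeted estimate of CDM(m = 0): {cdm_tmle:.3f}")
```

Even though the initial outcome prediction is wrong, the fluctuation step pulls the estimate back toward the truth because the mediator model is correct, which illustrates the doubly robust property described in the text.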
Several techniques are available for estimating the conditional probabilities from models 13, 14, and 15. We used standard regression. However, the cross-validated Super Learner (48) can combine parametric and nonparametric (e.g., machine learning) methods into a single algorithm to generate the predicted values required for all TMLE steps.
It is important to note that the logistic regression models 16, 17, and 18 should be estimated by solving the score of the binomial likelihood function (49). Epidemiologists regularly accomplish this when fitting logistic regression models via standard software routines, such as PROC LOGISTIC in SAS (SAS Institute, Inc., Cary, North Carolina). However, these routines cannot be used for fitting models 17 or 18, because the dependent variables can take on values other than 0 or 1. Thus, to solve the binomial likelihood function for models 17 or 18, less standard routines, such as PROC MODEL or PROC NLMIXED, must be used.
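To illustrate what solving the score of the binomial likelihood means when the dependent variable is not restricted to 0 or 1, the following sketch fits a no-intercept logistic model with an offset to a fractional outcome by Newton-Raphson. It is written in Python rather than with the SAS procedures named above, and the simulated inputs and the function name `fit_fluctuation` are assumptions made for illustration only.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_fluctuation(y_frac, offset, covariate, max_iter=50, tol=1e-12):
    """Solve sum(covariate * (y - expit(offset + eps * covariate))) = 0 for eps.

    This is the binomial score equation; it is well defined for any
    outcome in [0, 1], not only for 0/1 outcomes.
    """
    eps = 0.0
    for _ in range(max_iter):
        p = expit(offset + eps * covariate)
        score = np.sum(covariate * (y_frac - p))
        information = np.sum(covariate ** 2 * p * (1.0 - p))
        step = score / information
        eps += step
        if abs(step) < tol:
            break
    return eps

rng = np.random.default_rng(0)
n = 5_000
offset = rng.normal(-1.0, 0.5, n)           # e.g., logit of an initial prediction
covariate = rng.uniform(1.0, 3.0, n)        # e.g., an inverse probability weight
true_eps = 0.3
y_frac = expit(offset + true_eps * covariate)   # fractional, noise-free outcome

eps_hat = fit_fluctuation(y_frac, offset, covariate)
final_score = np.sum(covariate * (y_frac - expit(offset + eps_hat * covariate)))
print(f"estimated fluctuation parameter: {eps_hat:.6f}")
```

Because the outcome here is generated without noise, the solver should recover the fluctuation parameter used to generate it, which makes the sketch easy to check.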
Table 1 presents a brief summary and comparison of these general methods. Additional details are available in section 2 of Web Appendix 1. Tables 2 and 3 present an outline of the steps required for g-estimation and TMLE.
To facilitate implementation, we provide example software code and simulated data to fit each of the 4 general methods to estimate the CDM on the difference scale in Web Appendix 2. In these simulated data, the true CDM(m = 0) = 0.05 on the difference scale. When the exposure is manipulable and counterfactual outcomes are well-defined, this same code can be used to estimate the controlled direct effect on the difference scale (50).
To examine how different methods affect interpretation in an empirical setting, we estimated the magnitude of the racial disparity in infant mortality that would remain if every woman breastfed her infant prior to discharge from the place of birth. Our sample consisted of 900,726 live-born singletons from Pennsylvania. Data were obtained from linked birth/infant-death records from 2003 to 2011. We used complete case analysis to deal with all missing data. Previous reports have shown high sensitivity and moderate false discovery rates for the birth certificate–based measure of breastfeeding status (51). Additional cohort details are available elsewhere (52). In these analyses, CXY was the empty set. We adjusted for confounders of the mediator-outcome relationship outlined in the section on the definition of the CDM. We adjusted for continuous confounders with restricted quadratic splines and for categorical confounders with dummy variables.
Of the 900,726 children born in our sample, 3,555 infants died between 2003 and 2011. There were 84,405 births to non-Hispanic black women (590 infant deaths) and 816,321 births among women of all other races/ethnicities (2,965 infant deaths). The overall relationship between race and infant mortality was 3.36 (95% confidence interval: 2.78, 3.93) more infant deaths for every 1,000 births among non-Hispanic black women compared with all other women. This estimate was used as our measurement of the total association for all proportion-eliminated calculations.
Table 4 presents the results of our mediation analyses. These values are estimates of the CDM(m = 0) obtained using each of the 6 different methods. When using the difference and generalized product methods, we estimated that a hypothetical intervention in which all women breastfed prior to discharge would completely eliminate the disparity in infant mortality between non-Hispanic black and all other women (Table 4). In contrast, inverse probability-weighted MSM estimates suggested that the disparity would be reduced to 2.24 (95% confidence interval: 1.41, 3.07) more infant deaths per 1,000 livebirths among non-Hispanic black women. Use of the structural transformation method suggested a reduction in the disparity to 3.08 (95% confidence interval: 2.51, 3.66) more infant deaths per 1,000 livebirths. Both doubly robust approaches yielded estimates that were similar to the inverse probability-weighted MSM estimate.
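The proportion of the disparity eliminated follows directly from these estimates and the total association of 3.36 per 1,000 births. The short calculation below uses only the rounded point estimates quoted in the text and abstract, so the results may differ slightly from values computed from unrounded estimates; the dictionary labels are our own shorthand.

```python
# Total association: excess infant deaths per 1,000 births (from the text).
total = 3.36

# Point estimates of CDM(m = 0) per 1,000 births, as quoted in the text and abstract.
cdm_per_1000 = {
    "difference / generalized product": -0.87,
    "inverse probability-weighted MSM": 2.24,
    "structural transformation": 3.08,
    "doubly robust (abstract)": 2.45,
}

# Proportion eliminated = (total association - CDM) / total association.
proportion_eliminated = {
    method: (total - cdm) / total for method, cdm in cdm_per_1000.items()
}

for method, proportion in proportion_eliminated.items():
    print(f"{method}: {100 * proportion:.1f}% of the disparity eliminated")
```

A value above 100% for the standard methods corresponds to the negative CDM estimate, that is, an apparent reversal of the disparity.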
Social epidemiologists have long been interested in elucidating mechanisms underlying health disparities. Methods for mediation analysis are often used for this purpose. Using the racial disparity in infant mortality as an example, we examined complications that can arise when quantifying the extent to which a health disparity is explained by a risk factor of interest using standard regression approaches for mediation analysis. In an analysis of 900,726 births in Pennsylvania, use of standard methods suggested that a complete reduction of the disparity would be observed if every woman in the population were to breastfeed prior to discharge from the place of birth. This is because the parameter estimate for race from both standard approaches is obtained from a model that simultaneously conditions on mediator-outcome confounders. These confounders include gestational age at birth and birth weight, which are strong determinants of the outcome and are also strongly correlated with race. This accounts for most of the elimination of the racial disparity in our example.
Mishandling of mediator-outcome confounders associated with the exposure has important implications for social epidemiology. Typical exposures in this field are associated with myriad downstream risk factors that may confound a mediator-outcome relationship, making assumption 3, as outlined in the Introduction, unrealistic in most settings. More general approaches separate the adjustment for mediator-outcome confounders from the estimation of the parameter for race. Among them, inverse probability-weighted MSMs and the structural transformation method are easier to use. However, they are singly robust, in that consistent estimation depends on a single regression model for either the outcome or the mediator. When choosing between them, one should first determine whether background knowledge and the available data would allow one to more confidently specify a model for the mediator or for the outcome.
Often, however, the trade-offs between modeling either the mediator or the outcome will be difficult to gauge. G-estimation of a structural nested mean model and TMLE are doubly robust because these estimators are consistent if either the outcome model or the mediator model is correctly specified. In our example analysis, because the 95% confidence intervals for both doubly robust approaches fell within the range of the corresponding confidence intervals for the inverse probability-weighted MSM parameter, we surmise that the mediator model is driving the consistency of the doubly robust approaches (45). Thus, our analysis also demonstrates the benefits of using doubly robust techniques in addition to more commonly employed singly robust methods.
In this work, we defined a CDM rather than a causal effect measure for the exposure of interest. This measure is counterfactual insofar as the assumptions required for causal inference for the mediator are met. Furthermore, it is based on the fact that the statistical association between race and the outcome carries relevant disparity-related (but not necessarily causal) information. CDMs correspond directly to research questions about how an intervention might affect a health disparity (16, 53). When the exposure of interest is not subject to counterfactual consistency violations and other assumptions required for causal inference are also met, CDMs are controlled direct effects in the literature on causal mediation analysis (50).
Our example analysis is subject to limitations in that we used a complete case analysis to deal with missing data, which may have biased our results (54). Additionally, we did not have information on postdischarge breastfeeding practices. However, while important, these limitations are tangential to our primary purpose. We sought to illustrate different methods for mediation analysis and to demonstrate potential problems with using standard approaches for effect decomposition in social epidemiology.
A more controversial issue involves our interpretation of the parameter for maternal race/ethnicity, measured via self-report. Use of such exposure variables to infer causality has been the source of some debate (13, 16), and some have questioned the validity of claiming that such racial/ethnic classification schemes are nonmanipulable or noncausal (14, 15). We take this self-reported racial classification scheme to reflect historical and contemporary features of the social, political, and economic systems that characterize the lives of the women in our cohort rather than innate biological characteristics (55–57). In this framework, differences in the probability of breastfeeding prior to discharge (our mediator) and the probability of infant mortality (our outcome) between racial/ethnic groups can be construed as biological expressions of long-standing race relations (58). Such race relations have deep and complicated historical roots (59, 60), making it difficult to identify the infant mortality risk in, for example, non-Hispanic black women that would have been observed under a different set of race relations throughout history. These complications underlie much of the debate on the causal status of race (13). However, by expressing interest in the associations for race, CDMs avoid these issues, while enabling causal inference in a counterfactual framework.
While methods for mediation have proliferated in the last decade, use of standard approaches continues in situations where they may not be applicable. This has important consequences for interpretation. Importantly, the presence of mediator-outcome confounders associated with the exposure precludes the use of standard approaches for mediation analysis. This issue is particularly relevant for research on health disparities.
Author affiliations: Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania (Ashley I. Naimi, Lisa M. Bodnar); Faculty of Pharmacy, Université de Montréal, Montreal, Quebec, Canada (Mireille E. Schnitzer); and Department of Epidemiology, Biostatistics, and Occupational Health, Faculty of Medicine, McGill University, Montreal, Quebec, Canada (Erica E. M. Moodie).
This project was partially supported by the National Institutes of Health (grant R21 HD065807) and the Thrasher Research Fund (grant 9181).
Conflict of interest: none declared.
Assume that Figure 1A in the main text holds and that the functional form of model 3 in the main text is correct. Then, under counterfactual consistency (cc) and conditional exchangeability (e):
Thus, it follows that:
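The displayed steps of this argument are not reproduced here. A hedged sketch of how the derivation can proceed is given below, assuming that model 3 is linear in X and M with no interaction, writing C = (CXY, CMY), and letting g(c) stand for the covariate terms of model 3; the symbol g is our own notation.

$$
\begin{aligned}
E[Y(m = 0) \mid X = x, C = c] &= E[Y(m = 0) \mid X = x, M = 0, C = c] && \text{(e)} \\
&= E[Y \mid X = x, M = 0, C = c] && \text{(cc)} \\
&= \beta_0 + \beta_1 x + g(c) && \text{(model 3 with } M = 0\text{)}
\end{aligned}
$$

$$
E[Y(m = 0) \mid X = 1, C = c] - E[Y(m = 0) \mid X = 0, C = c] = \beta_1 .
$$

Because this contrast equals β1 at every covariate value c, it also equals β1 after standardizing over any common distribution of C, which is the sense in which β1 corresponds to the CDM(m = 0) under these assumptions.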