Recently, researchers have used a potential-outcome framework to estimate causally interpretable direct and indirect effects of an intervention or exposure on an outcome. One approach to causal-mediation analysis uses the so-called mediation formula to estimate the natural direct and indirect effects. This approach generalizes classical mediation estimators and allows for arbitrary distributions for the outcome variable and mediator. A limitation of the standard (parametric) mediation formula approach is that it requires a specified mediator regression model and distribution; such a model may be difficult to construct and may not be of primary interest. To address this limitation, we propose a new method for causal-mediation analysis that uses the empirical distribution function, thereby avoiding parametric distribution assumptions for the mediator. In order to adjust for confounders of the exposure-mediator and exposure-outcome relationships, inverse-probability weighting is incorporated based on a supplementary model of the probability of exposure. This method, which yields estimates of the natural direct and indirect effects for a specified reference group, is applied to data from a cohort study of dental caries in very-low-birth-weight adolescents to investigate the oral-hygiene index as a possible mediator. Simulation studies show low bias in the estimation of direct and indirect effects in a variety of distribution scenarios, whereas the standard mediation formula approach can be considerably biased when the distribution of the mediator is incorrectly specified.
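The core idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes a binary exposure, omits covariates from the outcome model, and takes the inverse-probability weights as already computed. The names `expected_outcome`, `natural_effects`, and `mu`, and the toy numbers, are hypothetical.

```python
from typing import Callable, Sequence


def expected_outcome(
    mu: Callable[[int, float], float],
    a: int,
    mediators: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Weighted average of mu(a, m) over observed mediator values.

    `mediators` are the values observed in the group with exposure a*, and
    `weights` are their (assumed, precomputed) inverse-probability weights.
    The result approximates E[Y(a, M(a*))] using the weighted empirical
    mediator distribution instead of a parametric mediator model.
    """
    total_weight = sum(weights)
    return sum(w * mu(a, m) for m, w in zip(mediators, weights)) / total_weight


def natural_effects(mu, m_unexposed, w_unexposed, m_exposed, w_exposed):
    """Return (natural direct effect, natural indirect effect)."""
    y_0_m0 = expected_outcome(mu, 0, m_unexposed, w_unexposed)
    y_1_m0 = expected_outcome(mu, 1, m_unexposed, w_unexposed)
    y_1_m1 = expected_outcome(mu, 1, m_exposed, w_exposed)
    return y_1_m0 - y_0_m0, y_1_m1 - y_1_m0


# Hypothetical fitted outcome regression (stands in for any outcome model).
def mu(a: int, m: float) -> float:
    return 1.0 + 2.0 * a + 3.0 * m


# Toy mediator values and unit weights for the unexposed and exposed groups.
nde, nie = natural_effects(
    mu,
    [0.0, 1.0, 2.0], [1.0, 1.0, 1.0],
    [1.0, 2.0, 3.0], [1.0, 1.0, 1.0],
)
```

Because the mediator enters only through its observed values, replacing `mu` with a nonlinear outcome model would not require any change to the mediator side of the calculation.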
Recent theory in causal inference has provided concepts for mediation analysis and effect decomposition that allow one to decompose a total effect into a direct and an indirect effect. Here, it is shown that what is often taken as an indirect effect can in fact be further decomposed into a “pure” indirect effect and a mediated interactive effect, thus yielding a three-way decomposition of a total effect (direct, indirect, and interactive). This three-way decomposition applies to difference scales and also to additive ratio scales and additive hazard scales. Assumptions needed for the identification of each of these three effects are discussed and simple formulae are given for each when regression models allowing for interaction are used. The three-way decomposition is illustrated by examples from genetic and perinatal epidemiology, and discussion is given to what is gained over the traditional two-way decomposition into simply a direct and an indirect effect.
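As a worked illustration of the three-way decomposition on the difference scale, the sketch below assumes linear models with exposure-mediator interaction, E[Y | a, m] = θ0 + θ1·a + θ2·m + θ3·a·m and E[M | a] = β0 + β1·a, a change in exposure from 0 to 1, and no covariates. The parameterization, the symbol names, and the function `three_way` are assumptions made for this example rather than notation taken from the abstract.

```python
from typing import NamedTuple


class Decomposition(NamedTuple):
    pure_direct: float
    pure_indirect: float
    mediated_interaction: float

    @property
    def total(self) -> float:
        # The three components sum to the total effect.
        return self.pure_direct + self.pure_indirect + self.mediated_interaction


def three_way(theta1: float, theta2: float, theta3: float,
              beta0: float, beta1: float) -> Decomposition:
    """Three-way decomposition for exposure 0 -> 1 under the assumed models."""
    return Decomposition(
        # E[Y(1, M(0))] - E[Y(0, M(0))]
        pure_direct=theta1 + theta3 * beta0,
        # E[Y(0, M(1))] - E[Y(0, M(0))]
        pure_indirect=theta2 * beta1,
        # Remainder attributable to exposure-mediator interaction.
        mediated_interaction=theta3 * beta1,
    )


# Hypothetical coefficient values.
effects = three_way(theta1=1.0, theta2=0.5, theta3=0.2, beta0=2.0, beta1=3.0)
```

When `theta3` is zero the mediated interaction vanishes and the decomposition reduces to the familiar two-way split into a direct and an indirect effect.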
When identification of causal effects relies on untestable assumptions regarding nonidentified parameters, sensitivity of causal effect estimates is often questioned. For proper interpretation of causal effect estimates in this situation, deriving bounds on causal parameters or exploring the sensitivity of estimates to scientifically plausible alternative assumptions can be critical. In this paper, we propose a practical way of bounding and sensitivity analysis, where multiple identifying assumptions are combined to construct tighter common bounds. In particular, we focus on the use of competing identifying assumptions that impose different restrictions on the same non-identified parameter. Since these assumptions are connected through the same parameter, direct translation across them is possible. Based on this cross-translatability, various information in the data, carried by alternative assumptions, can be effectively combined to construct tighter bounds on causal effects. Flexibility of the suggested approach is demonstrated focusing on the estimation of the complier average causal effect (CACE) in a randomized job search intervention trial that suffers from noncompliance and subsequent missing outcomes.
alternative assumptions; bounds; causal inference; missing data; noncompliance; principal stratification; sensitivity analysis
In this commentary, structural equation models (SEMs) are discussed as a tool for epidemiologic analysis. Such models are related to and compared with other analytic approaches often used in epidemiology, including regression analysis, causal diagrams, causal mediation analysis, and marginal structural models. Several of these other approaches in fact developed out of the SEM literature. However, SEMs themselves tend to make much stronger assumptions than these other techniques. SEMs estimate more types of effects than do these other techniques, but this comes at the price of additional assumptions. Many of these assumptions have often been ignored and not carefully evaluated when SEMs have been used in practice. In light of the strong assumptions employed by SEMs, the author argues that they should be used principally for the purposes of exploratory analysis and hypothesis generation when a broad range of effects are potentially of interest.
causal inference; causality; causal modeling; confounding factors (epidemiology); epidemiologic methods; regression analysis; structural equation model
It is common to present multiple adjusted effect estimates from a single model in a single table. For example, a table might show odds ratios for one or more exposures and also for several confounders from a single logistic regression. This can lead to mistaken interpretations of these estimates. We use causal diagrams to display the sources of the problems. Presentation of exposure and confounder effect estimates from a single model may lead to several interpretative difficulties, inviting confusion of direct-effect estimates with total-effect estimates for covariates in the model. These effect estimates may also be confounded even though the effect estimate for the main exposure is not confounded. Interpretation of these effect estimates is further complicated by heterogeneity (variation, modification) of the exposure effect measure across covariate levels. We offer suggestions to limit potential misunderstandings when multiple effect estimates are presented, including precise distinction between total and direct effect measures from a single model, and use of multiple models tailored to yield total-effect estimates for covariates.
causal diagrams; causal inference; confounding; direct effects; epidemiologic methods; mediation analysis; regression modeling
A fundamental assumption usually made in causal inference is that of no interference between individuals (or units); that is, the potential outcomes of one individual are assumed to be unaffected by the treatment assignment of other individuals. However, in many settings, this assumption obviously does not hold. For example, in the dependent happenings of infectious diseases, whether one person becomes infected depends on who else in the population is vaccinated. In this article, we consider a population of groups of individuals where interference is possible between individuals within the same group. We propose estimands for direct, indirect, total, and overall causal effects of treatment strategies in this setting. Relations among the estimands are established; for example, the total causal effect is shown to equal the sum of direct and indirect causal effects. Using an experimental design with a two-stage randomization procedure (first at the group level, then at the individual level within groups), unbiased estimators of the proposed estimands are presented. Variances of the estimators are also developed. The methodology is illustrated in two different settings where interference is likely: assessing causal effects of housing vouchers and of vaccines.
Group-randomized trials; Potential outcomes; Stable unit treatment value assumption; SUTVA; Vaccine
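The relation between the estimands can be checked numerically. The sketch below assumes two treatment strategies, here called `phi` and `psi`, and takes as input the population-average potential outcomes of untreated and treated individuals under each strategy. The sign conventions, the coverage-based formula for the overall effect (which assumes every individual has the same marginal probability of treatment under a strategy), and all numbers are assumptions made for illustration.

```python
def interference_effects(y0_phi: float, y1_phi: float,
                         y0_psi: float, y1_psi: float,
                         coverage_phi: float, coverage_psi: float) -> dict:
    """Direct, indirect, total, and overall effects for strategies phi vs psi.

    y0_* / y1_* are average outcomes of untreated / treated individuals
    under each strategy; coverage_* is the assumed marginal probability
    of treatment under that strategy.
    """
    # Marginal average outcome under each strategy.
    y_phi = coverage_phi * y1_phi + (1 - coverage_phi) * y0_phi
    y_psi = coverage_psi * y1_psi + (1 - coverage_psi) * y0_psi
    return {
        "direct": y0_psi - y1_psi,      # untreated vs treated, same strategy
        "indirect": y0_phi - y0_psi,    # untreated under phi vs untreated under psi
        "total": y0_phi - y1_psi,       # untreated under phi vs treated under psi
        "overall": y_phi - y_psi,       # strategy phi vs strategy psi as a whole
    }


# Hypothetical infection risks: phi = 30% vaccine coverage, psi = 70% coverage.
effects = interference_effects(
    y0_phi=0.40, y1_phi=0.20, y0_psi=0.25, y1_psi=0.10,
    coverage_phi=0.3, coverage_psi=0.7,
)
```

With these definitions the total effect equals the sum of the direct and indirect effects by construction, whatever outcome values are supplied.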
In this issue of the Journal, VanderWeele and Vansteelandt (Am J Epidemiol. 2010;172(12):1339–1348) provide simple formulae for estimation of direct and indirect effects using standard logistic regression when the exposure and outcome are binary, the mediator is continuous, and the odds ratio is the chosen effect measure. They also provide concisely stated lists of assumptions necessary for estimation of these effects, including various conditional independencies and homogeneity of exposure and mediator effects over covariate strata. They further suggest that this will allow effect decomposition in case-control studies if the sampling fractions and population outcome prevalence are known with certainty. In this invited commentary, the author argues that, in a well-designed case-control study in which the sampling fraction is known, it should not be necessary to rely on the odds ratio. The odds ratio has well-known deficiencies as a causal parameter, and its use severely complicates evaluation of confounding and effect homogeneity. Although VanderWeele and Vansteelandt propose that a rare disease assumption is not necessary for estimation of controlled direct effects using their approach, collapsibility concerns suggest otherwise when the goal is causal inference rather than merely measuring association. Moreover, their clear statement of assumptions necessary for the estimation of natural/pure effects suggests that these quantities will rarely be viable estimands in observational epidemiology.
causal inference; conditional independence; confounding; decomposition; estimation; interaction; logistic regression; odds ratio
The ‘birthweight paradox’ describes the phenomenon whereby birthweight-specific mortality curves cross when stratified on other exposures, most notably cigarette smoking. The paradox has been noted widely in the literature and numerous explanations and corrections have been suggested. Recently, causal diagrams have been used to illustrate the possibility for collider-stratification bias in models adjusting for birthweight. When two variables share a common effect, stratification on the variable representing that effect induces a statistical relation between otherwise independent factors. This bias has been proposed to explain the birthweight paradox.
Causal diagrams may illustrate sources of bias, but are limited to describing qualitative effects. In this paper, we provide causal diagrams that illustrate the birthweight paradox and use a simulation study to quantify the collider-stratification bias under a range of circumstances. Considered circumstances include exposures with and without direct effects on neonatal mortality, as well as with and without indirect effects acting through birthweight on neonatal mortality. The results of these simulations illustrate that when the birthweight-mortality relation is subject to substantial uncontrolled confounding, the bias on estimates of effect adjusted for birthweight may be sufficient to yield opposite causal conclusions, i.e. a factor that poses increased risk appears protective. Effects on stratum-specific birthweight-mortality curves were considered to illustrate the connection between collider-stratification bias and the crossing of the curves. The simulations demonstrate the conditions necessary to give rise to empirical evidence of the paradox.
collider-stratification bias; birthweight; directed acyclic graphs; neonatal mortality
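The mechanism can be reproduced with an exact calculation on a toy model, used here in place of the paper's simulation study. The sketch assumes a binary exposure (smoking), an unmeasured binary cause of both low birthweight and mortality (labelled `defect`), and mortality that depends on smoking and the unmeasured cause but not on birthweight itself. All probabilities and names are hypothetical.

```python
# Hypothetical model: all probabilities are illustrative assumptions.
P_DEFECT = 0.10  # prevalence of the unmeasured cause, independent of smoking

# P(low birthweight | smoking, defect), keyed by (smoking, defect)
P_LBW = {(0, 0): 0.05, (1, 0): 0.25, (0, 1): 0.80, (1, 1): 0.90}

# P(death | smoking, defect): smoking is harmful in every defect stratum
P_DEATH = {(0, 0): 0.010, (1, 0): 0.012, (0, 1): 0.30, (1, 1): 0.32}


def mortality_risk(smoking: int, among_lbw: bool) -> float:
    """Exact mortality risk by smoking status, overall or among LBW infants."""
    numerator = 0.0
    denominator = 0.0
    for defect, p_defect in ((0, 1 - P_DEFECT), (1, P_DEFECT)):
        # Restricting to LBW infants reweights the defect strata (the collider).
        p_stratum = p_defect * (P_LBW[(smoking, defect)] if among_lbw else 1.0)
        numerator += p_stratum * P_DEATH[(smoking, defect)]
        denominator += p_stratum
    return numerator / denominator


overall = {s: mortality_risk(s, among_lbw=False) for s in (0, 1)}
lbw_only = {s: mortality_risk(s, among_lbw=True) for s in (0, 1)}
```

In this toy model smoking raises mortality overall (about 0.043 versus 0.039), yet among low-birthweight infants smokers' babies appear to fare better (0.10 versus about 0.196), because low birthweight in the non-smoking group is more often due to the unmeasured cause.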
It is frequently of interest to estimate the intervention effect that adjusts for post-randomization variables in clinical trials. In the recently completed HPTN 035 trial, there is differential condom use between the three microbicide gel arms and the No Gel control arm, so that intention to treat (ITT) analyses only assess the net treatment effect that includes the indirect treatment effect mediated through differential condom use. Various statistical methods in causal inference have been developed to adjust for post-randomization variables. We extend the principal stratification framework to time-varying behavioral variables in HIV prevention trials with a time-to-event endpoint, using a partially hidden Markov model (pHMM). We formulate the causal estimand of interest, establish assumptions that enable identifiability of the causal parameters, and develop maximum likelihood methods for estimation. Application of our model to the HPTN 035 trial reveals an interesting pattern of prevention effectiveness among different condom-use principal strata.
microbicide; causal inference; posttreatment variables; direct effect
Development of graphical/visual presentations of cancer etiology caused by environmental stressors is a process that requires combining the complex biological interactions between xenobiotics in the living and occupational environment and genes (gene-environment interaction) with genomic and non-genomic disease-specific mechanisms in living organisms. Traditionally, presentation of causal relationships includes the statistical association between exposure to one xenobiotic and the disease, corrected for the effect of potential confounders.
Within the FP6 project HENVINET, we aimed at considering together all known agents and mechanisms involved in development of selected cancer types. Selection of cancer types for causal diagrams was based on the corpus of available data and reported relative risk (RR). In constructing causal diagrams the complexity of the interactions between xenobiotics was considered a priority in the interpretation of cancer risk. Additionally, gene-environment interactions were incorporated such as polymorphisms in genes for repair and for phase I and II enzymes involved in metabolism of xenobiotics and their elimination. Information on possible age or gender susceptibility is also included. Diagrams are user friendly thanks to multistep access to information packages and the possibility of referring to related literature and a glossary of terms. Diagrams cover both chemical and physical agents (ionizing and non-ionizing radiation) and provide basic information on the strength of the association between type of exposure and cancer risk reported by human studies and supported by mechanistic studies. Causal diagrams developed within the HENVINET project represent a valuable source of information for professionals working in the field of environmental health and epidemiology, and as educational material for students.
Cancer risk results from a complex interaction of environmental exposures with inherited gene polymorphisms, genetic burden collected during development and non-genomic capacity of response to environmental insults. In order to adopt effective preventive measures and the associated regulatory actions, a comprehensive investigation of cancer etiology is crucial. Variations and fluctuations of cancer incidence in human populations do not necessarily reflect environmental pollution policies or population distribution of polymorphisms of genes known to be associated with increased cancer risk. Tools which may be used in such comprehensive research, including molecular biology applied to field studies, require a methodological shift from the reductionism that has been used until recently as a basic axiom in interpretation of data. The complexity of the interactions between cells, genes and the environment, i.e. the resonance of the living matter with the environment, can be synthesized by systems biology. Within the HENVINET project this philosophy was followed in order to develop interactive causal diagrams for the investigation of cancers with possible etiology in environmental exposure.
Causal diagrams represent integrated knowledge and a seed tool for their further development and for the development of similar diagrams for other environmentally related diseases such as asthma or sterility. In this paper, the development and application of causal diagrams for cancer are presented and discussed.
Treatment noncompliance and missing outcomes at posttreatment assessments are common problems in field experiments in naturalistic settings. Although the two complications often occur simultaneously, statistical methods that address both complications have not been routinely considered in data analysis practice in the prevention research field. This paper shows that identification and estimation of causal treatment effects considering both noncompliance and missing outcomes can be relatively easily conducted under various missing data assumptions. We review a few assumptions on missing data in the presence of noncompliance, including the latent ignorability proposed by Frangakis and Rubin (Biometrika 86:365–379, 1999), and show how these assumptions can be used in the parametric complier average causal effect (CACE) estimation framework. As an easy way of sensitivity analysis, we propose the use of alternative missing data assumptions, which will provide a range of causal effect estimates. In this way, we are less likely to settle with a possibly biased causal effect estimate based on a single assumption. We demonstrate how alternative missing data assumptions affect identification of causal effects, focusing on the CACE. The data from the Johns Hopkins School Intervention Study (Ialongo et al., Am J Community Psychol 27:599–642, 1999) will be used as an example.
Causal inference; Complier average causal effect; Latent ignorability; Missing at random; Missing data; Noncompliance
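For orientation, the simplest CACE estimator is the moment-based ratio of the intention-to-treat effect on the outcome to the intention-to-treat effect on treatment receipt. The sketch below uses this ratio in place of the parametric framework discussed in the paper. It assumes complete outcome data, randomized assignment, the exclusion restriction, and monotonicity. The function name and the toy data are hypothetical.

```python
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def cace_ratio(assigned: Sequence[int], received: Sequence[int],
               outcome: Sequence[float]) -> float:
    """Ratio estimator: ITT effect on outcome / ITT effect on treatment receipt."""
    y_treat = [y for z, y in zip(assigned, outcome) if z == 1]
    y_ctrl = [y for z, y in zip(assigned, outcome) if z == 0]
    d_treat = [d for z, d in zip(assigned, received) if z == 1]
    d_ctrl = [d for z, d in zip(assigned, received) if z == 0]
    itt_outcome = _mean(y_treat) - _mean(y_ctrl)
    itt_receipt = _mean(d_treat) - _mean(d_ctrl)  # estimated complier proportion
    return itt_outcome / itt_receipt


# Toy trial: half of those assigned to treatment comply, nobody in control is treated.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [3.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
cace = cace_ratio(assigned, received, outcome)
```

When outcomes are missing, each missing-data assumption (for example missing at random or latent ignorability) implies a different way of filling in or reweighting the outcome means in this ratio, which is why alternative assumptions yield a range of CACE estimates.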
Network meta-analysis synthesizes direct and indirect evidence in a network of trials that compare multiple interventions and has the potential to rank the competing treatments according to the studied outcome. Despite its usefulness network meta-analysis is often criticized for its complexity and for being accessible only to researchers with strong statistical and computational skills. The evaluation of the underlying model assumptions, the statistical technicalities and presentation of the results in a concise and understandable way are all challenging aspects in the network meta-analysis methodology. In this paper we aim to make the methodology accessible to non-statisticians by presenting and explaining a series of graphical tools via worked examples. To this end, we provide a set of STATA routines that can be easily employed to present the evidence base, evaluate the assumptions, fit the network meta-analysis model and interpret its results.
Effective interventions require evidence on how individual causal pathways jointly determine disease. Based on the concept of systems epidemiology, this paper develops Diagram-based Analysis of Causal Systems (DACS) as an approach to analyze complex systems, and applies it by examining the contributions of proximal and distal determinants of childhood acute lower respiratory infections (ALRI) in sub-Saharan Africa.
Diagram-based Analysis of Causal Systems combines the use of causal diagrams with multiple routinely available data sources, using a variety of statistical techniques. In a step-by-step process, the causal diagram evolves from conceptual (based on a priori knowledge and assumptions), through operational (informed by data availability and then subjected to empirical testing), to integrated (synthesizing information from multiple datasets). In our application, we apply different regression techniques to Demographic and Health Survey (DHS) datasets for Benin, Ethiopia, Kenya and Namibia and a pooled World Health Survey (WHS) dataset for sixteen African countries. Explicit strategies are employed to make transparent the decisions about the inclusion/omission of arrows, the sign and strength of the relationships and homogeneity/heterogeneity across settings.
Findings about the current state of evidence on the complex web of socio-economic, environmental, behavioral and healthcare factors influencing childhood ALRI, based on DHS and WHS data, are summarized in an integrated causal diagram. Notably, solid fuel use is structured by socio-economic factors and increases the risk of childhood ALRI mortality.
Diagram-based Analysis of Causal Systems is a means of organizing the current state of knowledge about a specific area of research, and a framework for integrating statistical analyses across a whole system. This partly a priori approach is explicit about causal assumptions guiding the analysis and about researcher judgment, and wrong assumptions can be reversed following empirical testing. This approach is well-suited to dealing with complex systems, in particular where data are scarce.
Africa; Children; Acute lower respiratory infections; Pneumonia; Health determinants; Causal diagrams; Multi-factorial causality; Systems epidemiology; Social epidemiology; Environmental epidemiology
Summary. Time dynamics are often ignored in causal modelling. Clearly, causality must operate in time and we show how this corresponds to a mechanistic, or system, understanding of causality. The established counterfactual definitions of direct and indirect effects depend on an ability to manipulate the mediator which may not hold in practice, and we argue that a mechanistic view may be better. Graphical representations based on local independence graphs and dynamic path analysis are used to facilitate communication as well as providing an overview of the dynamic relations ‘at a glance’. The relationship between causality as understood in a mechanistic and in an interventionist sense is discussed. An example using data from the Swiss HIV Cohort Study is presented.
Causal inference; Dynamic path analysis; Granger causality; Local independence; Mediation
In this paper we compare several methods for estimating population disease prevalence from data collected by two-phase sampling when there is non-response at the second phase. The traditional weighting type estimator requires the missing completely at random assumption and may yield biased estimates if the assumption does not hold. We review two approaches and propose one new approach to adjust for non-response assuming that the non-response depends on a set of covariates collected at the first phase: an adjusted weighting type estimator using estimated response probability from a response model; a modelling type estimator using predicted disease probability from a disease model; and a regression type estimator combining the adjusted weighting type estimator and the modelling type estimator. These estimators are illustrated using data from an Alzheimer’s disease study in two populations. Simulation results are presented to investigate the performances of the proposed estimators under various situations.
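The adjusted weighting idea can be sketched as follows. This is a generic inverse-probability-weighted ratio estimator and not necessarily the exact estimator studied in the paper. It assumes that each second-phase respondent carries a known phase-two sampling probability and a response probability already estimated from a response model. The function name and the toy data are hypothetical.

```python
from typing import Sequence


def weighted_prevalence(disease: Sequence[int],
                        sampling_prob: Sequence[float],
                        response_prob: Sequence[float]) -> float:
    """Weighted prevalence among phase-two respondents.

    Each respondent is weighted by 1 / (sampling probability * response
    probability), so respondents stand in for similar subjects who were
    not sampled or did not respond.
    """
    weights = [1.0 / (s * r) for s, r in zip(sampling_prob, response_prob)]
    weighted_cases = sum(w * y for w, y in zip(weights, disease))
    return weighted_cases / sum(weights)


# Toy respondents: disease status, sampling probability, estimated response probability.
disease = [1, 0, 0]
sampling_prob = [1.0, 0.5, 0.25]
response_prob = [0.5, 1.0, 0.5]
prevalence = weighted_prevalence(disease, sampling_prob, response_prob)
```

Setting every response probability to the same constant recovers the traditional weighting estimator, which is appropriate only when non-response is completely at random.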
Microarray experiments generate vast amounts of data. The functional context of differentially expressed genes can be assessed by querying the Gene Ontology (GO) database via GoMiner. Directed acyclic graph representations, which are used to depict GO categories enriched with differentially expressed genes, are difficult to interpret and, depending on the particular analysis, may not be well suited for formulating new hypotheses. Additional graphical methods are therefore needed to augment the GO graphical representation.
We present an alternative visualization approach, area-proportional Euler diagrams, showing set relationships with semi-quantitative size information in a single diagram to support biological hypothesis formulation. The cardinalities of sets and intersection sets are represented by area-proportional Euler diagrams and their corresponding graphical (circular or polygonal) intersection areas. Optimally proportional representations are obtained using swarm and evolutionary optimization algorithms.
VennMaster's area-proportional Euler diagrams effectively structure and visualize the results of a GO analysis by indicating to what extent flagged genes are shared by different categories. In addition to reducing the complexity of the output, the visualizations facilitate generation of novel hypotheses from the analysis of seemingly unrelated categories that share differentially expressed genes.
We propose a new criterion for confounder selection when the underlying causal structure is unknown and only limited knowledge is available. We assume all covariates being considered are pretreatment variables and that for each covariate it is known (i) whether the covariate is a cause of treatment, and (ii) whether the covariate is a cause of the outcome. The causal relationships the covariates have with one another are assumed unknown. We propose that control be made for any covariate that is either a cause of treatment or of the outcome or both. We show that irrespective of the actual underlying causal structure, if any subset of the observed covariates suffices to control for confounding then the set of covariates chosen by our criterion will also suffice. We show that other, commonly used, criteria for confounding control do not have this property. We use formal theory concerning causal diagrams to prove our result but the application of the result does not rely on familiarity with causal diagrams. An investigator simply need ask, “Is the covariate a cause of the treatment?” and “Is the covariate a cause of the outcome?” If the answer to either question is “yes” then the covariate is included for confounder control. We discuss some additional covariate selection results that preserve unconfoundedness and that may be of interest when used with our criterion.
Causal inference; confounding; covariate selection; directed acyclic graphs
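The selection rule is simple enough to state as code. The sketch below assumes the investigator has already answered the two questions for each pretreatment covariate. The `Covariate` class, the function name, and the example covariates are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Covariate:
    name: str
    causes_treatment: bool  # "Is the covariate a cause of the treatment?"
    causes_outcome: bool    # "Is the covariate a cause of the outcome?"


def select_for_adjustment(covariates: Iterable[Covariate]) -> List[str]:
    """Keep every pretreatment covariate that causes treatment, outcome, or both."""
    return [c.name for c in covariates if c.causes_treatment or c.causes_outcome]


# Hypothetical pretreatment covariates with the investigator's answers.
candidates = [
    Covariate("age", causes_treatment=True, causes_outcome=True),
    Covariate("insurance status", causes_treatment=True, causes_outcome=False),
    Covariate("family history", causes_treatment=False, causes_outcome=True),
    Covariate("zodiac sign", causes_treatment=False, causes_outcome=False),
]
adjustment_set = select_for_adjustment(candidates)
```

Note that the rule needs no information about how the covariates relate to one another, which is the situation the criterion is designed for.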
The literature on exposure to lipophilic agents such as polychlorinated biphenyls (PCBs) is conflicting, posing challenges for the interpretation of potential human health risks. Laboratory variation in quantifying PCBs may account for some of the conflicting study results. For example, for quantification purposes, blood is often used as a proxy for adipose tissue, which makes it necessary to model serum lipids when assessing health risks of PCBs. Using a simulation study, we evaluated four statistical models (unadjusted, standardized, adjusted, and two-stage) for the analysis of PCB exposure, serum lipids, and health outcome risk (breast cancer). We applied eight candidate true causal scenarios, depicted by directed acyclic graphs, to illustrate the ramifications of misspecification of underlying assumptions when interpreting results. Statistical models that deviated from underlying causal assumptions generated biased results. Lipid standardization, or the division of serum concentrations by serum lipids, was observed to be highly prone to bias. We conclude that investigators must consider biology, biologic medium (e.g., nonfasting blood samples), laboratory measurement, and other underlying modeling assumptions when devising a statistical plan for assessing health outcomes in relation to environmental exposures.
causal modeling; directed acyclic graphs; organochlorines; polychlorinated biphenyls; risk estimation; serum lipids
Past literature on exposure to lipophilic agents such as organochlorines (OCs) is conflicting, posing challenges for the interpretation of their potential human health risks. Since blood is often used as a proxy for adipose tissue, it is necessary to model serum lipids when assessing health risks of OCs. Using a simulation study, we evaluated four statistical models (unadjusted, standardized, adjusted, and two-stage) for the analysis of polychlorinated biphenyls (PCBs) exposure, serum lipids, and health outcome risk. Eight candidate true causal scenarios, depicted by directed acyclic graphs, were used to illustrate the ramifications of misspecification of underlying assumptions when interpreting results. Biased results were produced when statistical models that deviated from the underlying causal assumptions were used with the lipid standardization method found to be particularly prone to bias. We concluded that investigators must consider biology, biological medium, laboratory measurement, and other underlying modeling assumptions when devising a statistical model for assessing health outcomes in relation to environmental exposures.
Causal modeling; Directed acyclic graphs; Risk estimation; Serum lipids; Organochlorines; Polychlorinated biphenyls
Studying medical cases is an effective way to enhance clinical reasoning skills and reinforce clinical knowledge. An Ishikawa diagram, also known as a cause-and-effect diagram or fishbone diagram, is often used in quality management in manufacturing industries.
In this report, an Ishikawa diagram is used to demonstrate how to relate potential causes of a major presenting problem in a clinical setting. This tool can be used by teams in problem-based learning or in self-directed learning settings.
An Ishikawa diagram annotated with references to relevant medical cases and literature can be continually updated and can assist memory and retrieval of relevant medical cases and literature. It could also be used to cultivate a lifelong learning habit in medical professionals.
A number of results concerning attributable fractions for sufficient cause interactions are given. Results are given both for etiologic fractions (i.e. the proportion of the disease due to a particular sufficient cause) and for excess fractions (i.e. the proportion of disease that could be eliminated by removing a particular sufficient cause). Results are given both with and without assumptions of monotonicity. Under monotonicity assumptions, exact formulas can be given for the excess fraction. When etiologic fractions are of interest or when monotonicity assumptions do not hold for excess fractions then only lower bounds can be given. The interpretations of the results in this paper and of those in a proposal by Hoffmann et al. (2006) are discussed and compared. A method is described to estimate the lower bounds on attributable fractions using marginal structural models. Identification is discussed in settings in which time-dependent confounding may be present.
attributable fraction; interaction; marginal structural models; sufficient cause; synergism
This paper considers the problem of estimation in a general semiparametric regression model when error-prone covariates are modeled parametrically while covariates measured without error are modeled nonparametrically. To account for the effects of measurement error, we apply a correction to a criterion function. The specific form of the correction proposed allows Monte Carlo simulations in problems for which the direct calculation of a corrected criterion is difficult. Therefore, in contrast to methods that require solving integral equations of possibly multiple dimensions, as in the case of multiple error-prone covariates, we propose methodology which offers a simple implementation. The resulting methods are functional: they make no assumptions about the distribution of the mismeasured covariates. We utilize profile kernel and backfitting estimation methods and derive the asymptotic distribution of the resulting estimators. Through numerical studies we demonstrate the applicability of the proposed methods to Poisson, logistic and multivariate Gaussian partially linear models. We show that the performance of our methods is similar to that of a computationally demanding alternative. Finally, we demonstrate the practical value of our methods when applied to Nevada Test Site (NTS) Thyroid Disease Study data.
Generalized estimating equations; generalized linear mixed models; kernel method; measurement error; Monte Carlo Corrected Score; semiparametric regression
The biological mechanisms in the association between the metabolic syndrome (MS) and various biomarkers, such as 25-hydroxyvitamin D (vit D) and magnesium, are not fully understood. Several of the proposed predictors of MS are also possible predictors of parathyroid hormone (PTH). We aimed to explore whether PTH is a possible mediator between MS and various possible explanatory variables in morbidly obese patients.
Fasting serum levels of PTH, vit D and magnesium were assessed in a cross-sectional study of 1,017 consecutive morbidly obese patients (68% women). Dependencies between MS and a total of seven possible explanatory variables as suggested in the literature, including PTH, vit D and magnesium, were specified in a path diagram, including both direct and indirect effects. Possible gender differences were also included. Effects were estimated using Bayesian path analysis, a multivariable regression technique, and expressed using standardized regression coefficients.
Sixty-eight percent of the patients had MS. In addition to type 2 diabetes and age, both PTH and serum phosphate had significant direct effects on MS: 0.36 (95% Credibility Interval (CrI) [0.15, 0.57]) and 0.28 (95% CrI [0.10, 0.47]), respectively. However, due to significant gender differences, an increase in either PTH or phosphate corresponded to an increased OR for MS in women only. All proposed predictors of MS had significant direct effects on PTH, with vit D and phosphate the strongest: -0.27 (95% CrI [-0.33, -0.21]) and -0.26 (95% CrI [-0.32, -0.20]), respectively. Though neither vit D nor magnesium had significant direct effects on MS, for women they both affected MS indirectly, due to the strong direct effect of PTH on MS. For phosphate, the indirect effect on MS, mediated through serum calcium and PTH, had the opposite sign to the direct effect, so the total effect on MS was somewhat attenuated compared with the direct effect alone.
Our results indicate that for women PTH is a plausible mediator in the association between MS and a range of explanatory variables, including vit D, magnesium and phosphate.
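The attenuation described in the results (an indirect effect of opposite sign partly cancelling a direct effect) can be illustrated with a small path-analysis sketch. It is an illustration under stated assumptions only: it uses ordinary least squares rather than the Bayesian path analysis of the study, a single mediator rather than the calcium and PTH pathway, and simulated data with hypothetical coefficients that are not the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Hypothetical path coefficients (NOT the study's estimates).
a_true, b_true, direct_true = -0.5, 0.4, 0.3

x = rng.normal(size=n)                                # exposure
m = a_true * x + rng.normal(size=n)                   # mediator
y = direct_true * x + b_true * m + rng.normal(size=n) # outcome


def ols(response, *predictors):
    """Least-squares slopes of response on predictors (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones(len(response)), *predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef[1:]


(a_hat,) = ols(m, x)              # path exposure -> mediator
direct_hat, b_hat = ols(y, x, m)  # direct path and path mediator -> outcome
(total_hat,) = ols(y, x)          # total effect, mediator left out

indirect_hat = a_hat * b_hat      # product-of-coefficients indirect effect

print(f"direct={direct_hat:.3f} indirect={indirect_hat:.3f} total={total_hat:.3f}")
```

For linear models fitted by least squares, the total effect equals the direct effect plus the product-of-coefficients indirect effect, so a negative indirect effect pulls the total below the direct effect.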
Epidemiologic research is often devoted to etiologic investigation, and so techniques that may facilitate mechanistic inferences are attractive. Some of these techniques rely on rigid and/or unrealistic assumptions, making the biologic inferences tenuous. The methodology investigated here is effect decomposition: the contrast between effect measures estimated with and without adjustment for one or more variables hypothesized to lie on the pathway through which the exposure exerts its effect. This contrast is typically used to distinguish the exposure's indirect effect, through the specified intermediate variables, from its direct effect, transmitted via pathways that do not involve the specified intermediates.
We apply a causal framework based on latent potential response types to describe the limitations inherent in effect decomposition analysis. For simplicity, we assume three measured binary variables with monotonic effects and randomized exposure, and use difference contrasts as measures of causal effect. Previous authors showed that confounding between the intermediate and the outcome threatens the validity of the decomposition strategy, even if exposure is randomized. We define exchangeability conditions for absence of confounding of causal effects of exposure and intermediate, and generate two example populations in which the no-confounding conditions are satisfied. In one population we impose an additional prohibition against unit-level interaction (synergism). We evaluate the performance of the decomposition strategy against true values of the causal effects, as defined by the proportions of latent potential response types in the two populations.
We demonstrate that even when there is no confounding, partition of the total effect into direct and indirect effects is not reliably valid. Decomposition is valid only with the additional restriction that the population contain no units in which exposure and intermediate interact to cause the outcome. This restriction implies homogeneity of causal effects across strata of the intermediate.
Reliable effect decomposition requires not only absence of confounding, but also absence of unit-level interaction and use of linear contrasts as measures of causal effect. Epidemiologists should be wary of etiologic inference based on adjusting for intermediates, especially when using ratio effect measures or when absence of interacting potential response types cannot be confidently asserted.
effect decomposition; causality; confounding; counterfactual models; bias
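The role of unit-level interaction described above can be sketched with two small populations of latent response types. This is a hypothetical illustration, not the example populations of the paper: the type proportions are invented, the response types of the intermediate and the outcome are taken to be independent (standing in for the no-confounding conditions), and exposure is treated as randomized. The function computes the total effect and the effect of exposure with the intermediate held fixed at each level.

```python
from fractions import Fraction as F
from itertools import product

# M response types: (M(x=0), M(x=1)).
# Y response types: (Y(0,0), Y(0,1), Y(1,0), Y(1,1)), indexed as Y(x, m).
# All proportions are hypothetical; every type is monotonic.
M_TYPES = {(0, 0): F(1, 4), (0, 1): F(1, 2), (1, 1): F(1, 4)}

NO_INTERACTION = {
    (0, 0, 0, 0): F(1, 4),  # never develops the outcome
    (0, 0, 1, 1): F(1, 4),  # outcome caused by exposure alone
    (0, 1, 0, 1): F(1, 4),  # outcome caused by intermediate alone
    (1, 1, 1, 1): F(1, 4),  # always develops the outcome
}
WITH_SYNERGISM = {
    (0, 0, 0, 0): F(1, 4),
    (0, 0, 0, 1): F(1, 4),  # outcome needs BOTH exposure and intermediate
    (0, 1, 0, 1): F(1, 4),
    (1, 1, 1, 1): F(1, 4),
}


def effects(m_types, y_types):
    """Return (total effect, effect with M fixed at 0, effect with M fixed at 1)."""
    def y(y_type, x, m):
        return y_type[2 * x + m]

    def mean(contrast):
        # Independent M and Y types: joint proportion is the product.
        return sum(pm * py * contrast(mt, yt)
                   for (mt, pm), (yt, py) in product(m_types.items(), y_types.items()))

    total = mean(lambda mt, yt: y(yt, 1, mt[1]) - y(yt, 0, mt[0]))
    fixed = {m: mean(lambda mt, yt, m=m: y(yt, 1, m) - y(yt, 0, m)) for m in (0, 1)}
    return total, fixed[0], fixed[1]


total_a, cde0_a, cde1_a = effects(M_TYPES, NO_INTERACTION)
total_b, cde0_b, cde1_b = effects(M_TYPES, WITH_SYNERGISM)

print("no interaction:", total_a, cde0_a, cde1_a)
print("with synergism:", total_b, cde0_b, cde1_b)
```

Without interacting types, the effect of exposure is the same in both strata of the intermediate, so subtracting it from the total effect gives a single value. With the synergistic type present, the stratum-specific effects differ, and the "indirect" remainder depends on which stratum is used.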
To improve quality of life (QOL) in patients with multiple sclerosis (MS), it is important to decrease disability and prevent relapse. The aim of this study was to examine the causal and mutual relationships contributing to QOL in Japanese patients with MS, develop path diagrams, and explore interventions with the potential to improve patient QOL.
Data from 163 Japanese MS patients were obtained using the Functional Assessment of MS (FAMS) and the Nottingham Adjustment Scale-Japanese version (NAS-J), together with four additional factors that affect QOL (employment status, change of income, availability of disease information, and communication with medical staff). The data were then used in structural equation modeling to develop path diagrams for factors contributing to QOL.
The Expanded Disability Status Scale (EDSS) score had a significant effect on the total FAMS score. Although EDSS negatively affected the FAMS symptom score, NAS-J subscale scores of anxiety/depression and acceptance were positively related to the FAMS symptom score. Changes in employment status after MS onset negatively affected all NAS-J scores. Knowledge of disease information improved the total NAS-J score, which in turn improved many FAMS subscale scores. Communication with doctors and nurses directly and positively affected some FAMS subscale scores.
Disability and change in employment status decrease patient QOL. However, the present findings suggest that other factors, such as acquiring information on MS and communicating with medical staff, can compensate for the worsening of QOL.
Multiple sclerosis; Quality of life; Structural equation modeling; Severity; Treatment; Intervention