Related Articles

1.  Overadjustment Bias and Unnecessary Adjustment in Epidemiologic Studies 
Epidemiology (Cambridge, Mass.)  2009;20(4):488-495.
Overadjustment is defined inconsistently. This term is meant to describe control (eg, by regression adjustment, stratification, or restriction) for a variable that either increases net bias or decreases precision without affecting bias. We define overadjustment bias as control for an intermediate variable (or a descending proxy for an intermediate variable) on a causal path from exposure to outcome. We define unnecessary adjustment as control for a variable that does not affect bias of the causal relation between exposure and outcome but may affect its precision. We use causal diagrams and an empirical example (the effect of maternal smoking on neonatal mortality) to illustrate and clarify the definition of overadjustment bias, and to distinguish overadjustment bias from unnecessary adjustment. Using simulations, we quantify the amount of bias associated with overadjustment. Moreover, we show that this bias is based on a different causal structure from confounding or selection biases. Overadjustment bias is not a finite sample bias, while inefficiencies due to control for unnecessary variables are a function of sample size.
doi:10.1097/EDE.0b013e3181a819a1
PMCID: PMC2744485  PMID: 19525685
2.  The performance of different propensity-score methods for estimating differences in proportions (risk differences or absolute risk reductions) in observational studies 
Statistics in Medicine  2010;29(20):2137-2148.
Propensity score methods are increasingly being used to estimate the effects of treatments on health outcomes using observational data. There are four methods for using the propensity score to estimate treatment effects: covariate adjustment using the propensity score, stratification on the propensity score, propensity-score matching, and inverse probability of treatment weighting (IPTW) using the propensity score. When outcomes are binary, the effect of treatment on the outcome can be described using odds ratios, relative risks, risk differences, or the number needed to treat. Several clinical commentators suggested that risk differences and numbers needed to treat are more meaningful for clinical decision making than are odds ratios or relative risks. However, there is a paucity of information about the relative performance of the different propensity-score methods for estimating risk differences. We conducted a series of Monte Carlo simulations to examine this issue. We examined bias, variance estimation, coverage of confidence intervals, mean-squared error (MSE), and type I error rates. A doubly robust version of IPTW had superior performance compared with the other propensity-score methods. It resulted in unbiased estimation of risk differences, treatment effects with the lowest standard errors, confidence intervals with the correct coverage rates, and correct type I error rates. Stratification, matching on the propensity score, and covariate adjustment using the propensity score resulted in minor to modest bias in estimating risk differences. Estimators based on IPTW had lower MSE compared with other propensity-score methods. Differences between IPTW and propensity-score matching may reflect that these two methods estimate the average treatment effect and the average treatment effect for the treated, respectively. Copyright © 2010 John Wiley & Sons, Ltd.
doi:10.1002/sim.3854
PMCID: PMC3068290  PMID: 20108233
propensity score; observational study; binary data; risk difference; number needed to treat; matching; IPTW; inverse probability of treatment weighting; propensity-score matching
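As a rough illustration of the IPTW approach to a risk difference described in this abstract, the sketch below weights treated subjects by 1/e and untreated subjects by 1/(1 - e), where e is the propensity score, and then takes the difference of the weighted outcome proportions. It is a plain normalized IPTW estimator, not the doubly robust version the abstract found to perform best. The function name `iptw_risk_difference` and the toy data are hypothetical, and the propensity scores are assumed known rather than estimated.

```python
def iptw_risk_difference(treated, outcome, ps):
    """Normalized IPTW estimate of the marginal risk difference.

    treated, outcome: sequences of 0/1 values; ps: propensity scores in (0, 1).
    """
    w1 = y1 = w0 = y0 = 0.0
    for t, y, e in zip(treated, outcome, ps):
        if t:
            w = 1.0 / e            # treated subjects are weighted by 1/e
            w1 += w
            y1 += w * y
        else:
            w = 1.0 / (1.0 - e)    # untreated subjects by 1/(1 - e)
            w0 += w
            y0 += w * y
    # Difference of the weighted outcome proportions in the two arms.
    return y1 / w1 - y0 / w0


# Hypothetical toy data: tuples of (treated, outcome, propensity score).
# One binary covariate defines two strata with propensity 0.25 and 0.75.
rows = (
    [(1, 1, 0.25)] * 1 + [(1, 0, 0.25)] * 1
    + [(0, 1, 0.25)] * 2 + [(0, 0, 0.25)] * 4
    + [(1, 1, 0.75)] * 4 + [(1, 0, 0.75)] * 2
    + [(0, 1, 0.75)] * 1 + [(0, 0, 0.75)] * 1
)
treated = [r[0] for r in rows]
outcome = [r[1] for r in rows]
ps = [r[2] for r in rows]

n_treated = sum(treated)
n_untreated = len(treated) - n_treated
crude_rd = (
    sum(y for t, y in zip(treated, outcome) if t) / n_treated
    - sum(y for t, y in zip(treated, outcome) if not t) / n_untreated
)
iptw_rd = iptw_risk_difference(treated, outcome, ps)
print(crude_rd, iptw_rd)
```

In this toy data the risk difference is 1/6 within each stratum. The crude contrast is 0.25, while the weighted contrast recovers 1/6 up to floating-point rounding.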
3.  Propensity Score-based Sensitivity Analysis Method for Uncontrolled Confounding 
American Journal of Epidemiology  2011;174(3):345-353.
The authors developed a sensitivity analysis method to address the issue of uncontrolled confounding in observational studies. In this method, the authors use a 1-dimensional function of the propensity score, which they refer to as the sensitivity function (SF), to quantify the hidden bias due to unmeasured confounders. The propensity score is defined as the conditional probability of being treated given the measured covariates. Then the authors construct SF-corrected inverse-probability-weighted estimators to draw inference on the causal treatment effect. This approach allows analysts to conduct a comprehensive sensitivity analysis in a straightforward manner by varying sensitivity assumptions on both the functional form and the coefficients in the 1-dimensional SF. Furthermore, 1-dimensional continuous functions can be well approximated by low-order polynomial structures (e.g., linear, quadratic). Therefore, even if the imposed SF is practically certain to be incorrect, one can still hope to obtain valuable information on treatment effects by conducting a comprehensive sensitivity analysis using polynomial SFs with varying orders and coefficients. The authors demonstrate the new method by implementing it in an asthma study which evaluates the effect of clinician prescription patterns regarding inhaled corticosteroids for children with persistent asthma on selected clinical outcomes.
doi:10.1093/aje/kwr096
PMCID: PMC3202161  PMID: 21659349
confounding factors (epidemiology); inverse probability weighting; propensity score; sensitivity analysis; sensitivity function; uncontrolled confounding
4.  Weight Trimming and Propensity Score Weighting 
PLoS ONE  2011;6(3):e18174.
Propensity score weighting is sensitive to model misspecification and outlying weights that can unduly influence results. The authors investigated whether trimming large weights downward can improve the performance of propensity score weighting and whether the benefits of trimming differ by propensity score estimation method. In a simulation study, the authors examined the performance of weight trimming following logistic regression, classification and regression trees (CART), boosted CART, and random forests to estimate propensity score weights. Results indicate that although misspecified logistic regression propensity score models yield increased bias and standard errors, weight trimming following logistic regression can improve the accuracy and precision of final parameter estimates. In contrast, weight trimming did not improve the performance of boosted CART and random forests. The performance of boosted CART and random forests without weight trimming was similar to the best performance obtainable by weight trimmed logistic regression estimated propensity scores. While trimming may be used to optimize propensity score weights estimated using logistic regression, the optimal level of trimming is difficult to determine. These results indicate that although trimming can improve inferences in some settings, in order to consistently improve the performance of propensity score weighting, analysts should focus on the procedures leading to the generation of weights (i.e., proper specification of the propensity score model) rather than relying on ad-hoc methods such as weight trimming.
doi:10.1371/journal.pone.0018174
PMCID: PMC3069059  PMID: 21483818
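Weight trimming as described in this abstract can be sketched in a few lines: weights above a chosen cap are reduced to the cap. The cap value and the helper names `trim_weights` and `effective_sample_size` are hypothetical. The effective sample size (Kish's approximation, (sum of w)^2 / sum of w^2) is an addition not mentioned in the abstract, included only to show how a single outlying weight dominates.

```python
def trim_weights(weights, cap):
    """Reduce every weight above `cap` down to `cap`."""
    return [min(w, cap) for w in weights]


def effective_sample_size(weights):
    """Kish's approximate effective sample size for a set of weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical weights with one outlying value.
raw = [1.0, 1.0, 2.0, 20.0]
trimmed = trim_weights(raw, cap=5.0)

ess_raw = effective_sample_size(raw)
ess_trimmed = effective_sample_size(trimmed)
print(trimmed, ess_raw, ess_trimmed)
```

Trimming raises the effective sample size here, which reflects lower variance, but the abstract's caution still applies: the trimmed weights no longer correspond to the fitted propensity model, so bias may be introduced.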
5.  ESTIMATING TREATMENT EFFECTS ON HEALTHCARE COSTS UNDER EXOGENEITY: IS THERE A ‘MAGIC BULLET’? 
Methods for estimating average treatment effects, under the assumption of no unmeasured confounders, include regression models; propensity score adjustments using stratification, weighting, or matching; and doubly robust estimators (a combination of both). Researchers continue to debate about the best estimator for outcomes such as health care cost data, as they are usually characterized by an asymmetric distribution and heterogeneous treatment effects. Challenges in finding the right specifications for regression models are well documented in the literature. Propensity score estimators are proposed as alternatives to overcoming these challenges. Using simulations, we find that in moderate size samples (n = 5000), balancing on propensity scores that are estimated from saturated specifications can balance the covariate means across treatment arms but fails to balance higher-order moments and covariances amongst covariates. Therefore, unlike regression models, even if a formal model for outcomes is not required, propensity score estimators can be inefficient at best and biased at worst for health care cost data. Our simulation study, designed to take a ‘proof by contradiction’ approach, proves that no one estimator can be considered the best under all data generating processes for outcomes such as costs. The inverse-propensity weighted estimator is most likely to be unbiased under alternate data generating processes but is prone to bias under misspecification of the propensity score model and is inefficient compared to an unbiased regression estimator. Our results show that there are no ‘magic bullets’ when it comes to estimating treatment effects in health care costs. Care should be taken before naively applying any one estimator to estimate average treatment effects in these data. We illustrate the performance of alternative methods in a cost dataset on breast cancer treatment.
doi:10.1007/s10742-011-0072-8
PMCID: PMC3244728  PMID: 22199462
Propensity score; non-linear regression; average treatment effect; health care costs
6.  Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies 
Pharmaceutical Statistics  2010;10(2):150-161.
In a study comparing the effects of two treatments, the propensity score is the probability of assignment to one treatment conditional on a subject's measured baseline covariates. Propensity-score matching is increasingly being used to estimate the effects of exposures using observational data. In the most common implementation of propensity-score matching, pairs of treated and untreated subjects are formed whose propensity scores differ by at most a pre-specified amount (the caliper width). There has been little research into the optimal caliper width. We conducted an extensive series of Monte Carlo simulations to determine the optimal caliper width for estimating differences in means (for continuous outcomes) and risk differences (for binary outcomes). When estimating differences in means or risk differences, we recommend that researchers match on the logit of the propensity score using calipers of width equal to 0.2 of the standard deviation of the logit of the propensity score. When at least some of the covariates were continuous, then either this value, or one close to it, minimized the mean square error of the resultant estimated treatment effect. It also eliminated at least 98% of the bias in the crude estimator, and it resulted in confidence intervals with approximately the correct coverage rates. Furthermore, the empirical type I error rate was approximately correct. When all of the covariates were binary, then the choice of caliper width had a much smaller impact on the performance of estimation of risk differences and differences in means. Copyright © 2010 John Wiley & Sons, Ltd.
doi:10.1002/pst.433
PMCID: PMC3120982  PMID: 20925139
propensity score; observational study; binary data; risk difference; propensity-score matching; Monte Carlo simulations; bias; matching
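The caliper recommendation in this abstract (match on the logit of the propensity score, with a caliper of 0.2 standard deviations of that logit) can be sketched as a greedy 1:1 nearest-neighbour match without replacement. Several details below are assumptions rather than the authors' procedure: the standard deviation is taken over all subjects combined, treated subjects are processed in input order, and ties go to the lowest control index. The name `caliper_match` and the scores are hypothetical.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def caliper_match(ps_treated, ps_control, caliper_sd=0.2):
    """Greedy 1:1 matching on the logit of the propensity score.

    Returns (pairs, width): pairs holds (treated_index, control_index) tuples
    and width is the caliper expressed on the logit scale.
    """
    lt = [logit(p) for p in ps_treated]
    lc = [logit(p) for p in ps_control]
    # Assumption: sample SD of the logit over treated and control combined.
    width = caliper_sd * statistics.stdev(lt + lc)
    available = list(range(len(lc)))
    pairs = []
    for i, x in enumerate(lt):
        if not available:
            break
        # Nearest still-unmatched control; min() keeps the first on ties.
        j = min(available, key=lambda k: abs(lc[k] - x))
        if abs(lc[j] - x) <= width:
            pairs.append((i, j))
            available.remove(j)
    return pairs, width


# Hypothetical propensity scores.
ps_treated = [0.30, 0.60, 0.95]
ps_control = [0.31, 0.58, 0.20, 0.50]
pairs, width = caliper_match(ps_treated, ps_control)
print(pairs, width)
```

With these scores the first two treated subjects find controls inside the caliper, while the third (score 0.95) has no control close enough and is left unmatched, which is the trade-off a caliper makes: less bias at the cost of discarding some treated subjects.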
7.  Insights into Different Results from Different Causal Contrasts in the Presence of Effect-Measure Modification 
Purpose
Both propensity score (PS) matching and inverse probability of treatment weighting (IPTW) allow causal contrasts, albeit different ones. In the presence of effect-measure modification, different analytic approaches produce different summary estimates.
Methods
We present a spreadsheet example that assumes a dichotomous exposure, covariate, and outcome. The covariate can be a confounder or not and a modifier of the relative risk (RR) or not. Based on expected cell counts, we calculate RR estimates using five summary estimators: Mantel-Haenszel (MH), maximum likelihood (ML), the standardized mortality ratio (SMR), PS matching, and a common implementation of IPTW.
Results
Without effect-measure modification, all approaches produce identical results. In the presence of effect-measure modification and regardless of the presence of confounding, results from the SMR and PS are identical, but IPTW can produce strikingly different results (e.g. RR=0.83 vs. RR=1.50). In such settings, MH and ML do not estimate a population parameter and results for those measures fall between PS and IPTW.
Conclusions
Discrepancies between PS and IPTW reflect different weighting of stratum specific effect estimates. SMR and PS matching assign weight according to the distribution of the effect-measure modifier in the exposed subpopulation, whereas IPTW assigns weights according to the distribution of the entire study population. In pharmacoepidemiology, contraindications to treatment that also modify the effect might be prevalent in the population, but would be rare among the exposed. In such settings, estimating the effect of exposure in the exposed rather than the whole population is preferable.
doi:10.1002/pds.1231
PMCID: PMC1581494  PMID: 16528796
epidemiologic methods; confounding factors (epidemiology); bias (epidemiology); effect measure modification; interaction; propensity score; inverse probability of treatment weighting; standardized mortality ratio; Mantel-Haenszel; maximum likelihood
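The difference in weighting described in the Conclusions can be made concrete with a small calculation. The sketch below standardizes stratum-specific risks either to the covariate distribution of the exposed (as the SMR and PS matching do) or to that of the whole study population (as IPTW does). All counts and risks are hypothetical and are not the spreadsheet values from the article; `standardized_rr` is an illustrative name.

```python
def standardized_rr(strata, standard):
    """Relative risk standardized to a chosen covariate distribution.

    strata: list of dicts with n1/n0 (exposed/unexposed counts) and
            r1/r0 (risk of the outcome among exposed/unexposed).
    standard: "exposed" weights strata by n1; "total" weights by n1 + n0.
    """
    num = den = 0.0
    for s in strata:
        if standard == "exposed":
            w = s["n1"]
        elif standard == "total":
            w = s["n1"] + s["n0"]
        else:
            raise ValueError("standard must be 'exposed' or 'total'")
        num += w * s["r1"]   # expected cases if everyone in the standard were exposed
        den += w * s["r0"]   # expected cases if everyone were unexposed
    return num / den


# Hypothetical strata: the covariate modifies the RR (0.5 vs 2.0) and the
# stratum with the harmful effect is rare among the exposed.
strata = [
    {"n1": 900, "r1": 0.10, "n0": 100, "r0": 0.20},  # stratum RR = 0.5
    {"n1": 100, "r1": 0.40, "n0": 900, "r0": 0.20},  # stratum RR = 2.0
]
rr_exposed = standardized_rr(strata, "exposed")
rr_total = standardized_rr(strata, "total")
print(rr_exposed, rr_total)
```

Here the exposed-standardized RR is about 0.65 and the total-population RR about 1.25, so the two summaries fall on opposite sides of the null even though they are computed from the same stratum-specific risks.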
8.  Analytic Strategies to Adjust Confounding Using Exposure Propensity Scores and Disease Risk Scores: Nonsteroidal Antiinflammatory Drugs (NSAID) and Short-term Mortality in the Elderly 
American Journal of Epidemiology  2005;161(9):891-898.
Little is known about optimal application and behavior of exposure propensity scores (EPS) in small studies. Based on a cohort of 103,133 elderly Medicaid beneficiaries, the effect of nonsteroidal anti-inflammatory drug (NSAID) use on 1-year all-cause mortality was assessed based on the assumption that there is no protective effect, and the preponderance of any observed effect would be confounded. To study the comparative behavior of EPS, disease risk scores (DRS), and ‘traditional’ disease models, we randomly re-sampled 1,000 subcohorts of 10,000, 1,000 and 500 people. The number of variables was limited in disease models, but not EPS and DRS. Estimated EPS were used to adjust for confounding by matching, inverse probability of treatment weighting (IPTW), stratification, and modeling. The crude rate ratio (RR) of death for NSAID users was 0.68. ‘Traditional’ adjustment resulted in a RR of 0.80 (95% confidence interval:0.77–0.84). The RR closest to 1 was achieved by IPTW (0.85;0.82–0.88). With decreasing study size, estimates remained further from the null, which was most pronounced for IPTW (N=500: RR=0.72;0.26–1.68). In this setting, analytic strategies using EPS or DRS were not generally superior to ‘traditional’. Various ways to use EPS and DRS behaved differently with smaller study size.
doi:10.1093/aje/kwi106
PMCID: PMC1407370  PMID: 15840622
epidemiologic methods; research design; confounding factors (epidemiology); bias (epidemiology); cohort studies; nonsteroidal anti-inflammatory drugs; AUC, area under the receiver operating characteristic curve; CI, confidence interval; EPS, exposure propensity score; DRS, disease risk score; IPTW, inverse probability of treatment weighting; NSAID, nonsteroidal antiinflammatory drug; OR, odds ratio; RR, relative risk
9.  An Application of Collaborative Targeted Maximum Likelihood Estimation in Causal Inference and Genomics 
A concrete example of the collaborative double-robust targeted likelihood estimator (C-TMLE) introduced in a companion article in this issue is presented, and applied to the estimation of causal effects and variable importance parameters in genomic data. The focus is on non-parametric estimation in a point treatment data structure. Simulations illustrate the performance of C-TMLE relative to current competitors such as the augmented inverse probability of treatment weighted estimator that relies on an external non-collaborative estimator of the treatment mechanism, and inefficient estimation procedures including propensity score matching and standard inverse probability of treatment weighting. C-TMLE is also applied to the estimation of the covariate-adjusted marginal effect of individual HIV mutations on resistance to the anti-retroviral drug lopinavir. The influence curve of the C-TMLE is used to establish asymptotically valid statistical inference. The list of mutations found to have a statistically significant association with resistance is in excellent agreement with mutation scores provided by the Stanford HIVdb mutation scores database.
doi:10.2202/1557-4679.1182
PMCID: PMC3126668  PMID: 21731530
causal effect; cross-validation; collaborative double robust; double robust; efficient influence curve; penalized likelihood; penalization; estimator selection; locally efficient; maximum likelihood estimation; model selection; super efficiency; super learning; targeted maximum likelihood estimation; targeted nuisance parameter estimator selection; variable importance
10.  An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies 
Multivariate Behavioral Research  2011;46(3):399-424.
The propensity score is the probability of treatment assignment conditional on observed baseline characteristics. The propensity score allows one to design and analyze an observational (nonrandomized) study so that it mimics some of the particular characteristics of a randomized controlled trial. In particular, the propensity score is a balancing score: conditional on the propensity score, the distribution of observed baseline covariates will be similar between treated and untreated subjects. I describe 4 different propensity score methods: matching on the propensity score, stratification on the propensity score, inverse probability of treatment weighting using the propensity score, and covariate adjustment using the propensity score. I describe balance diagnostics for examining whether the propensity score model has been adequately specified. Furthermore, I discuss differences between regression-based methods and propensity score-based methods for the analysis of observational data. I describe different causal average treatment effects and their relationship with propensity score analyses.
doi:10.1080/00273171.2011.568786
PMCID: PMC3144483  PMID: 21818162
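The abstract mentions balance diagnostics without naming one. A widely used choice is the standardized difference of a covariate between treated and untreated subjects, sketched below for a continuous covariate; whether this is the exact diagnostic the article describes is an assumption. The function name and data are hypothetical.

```python
import math
import statistics


def standardized_difference(x_treated, x_control):
    """Standardized mean difference for a continuous covariate.

    Difference in group means divided by the square root of the average
    of the two sample variances.
    """
    m1 = statistics.mean(x_treated)
    m0 = statistics.mean(x_control)
    v1 = statistics.variance(x_treated)
    v0 = statistics.variance(x_control)
    return (m1 - m0) / math.sqrt((v1 + v0) / 2.0)


# Hypothetical covariate values (e.g. a rescaled age) in each group.
d = standardized_difference([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(d)
```

Because the measure is scaled by the pooled spread rather than by a standard error, it does not shrink or grow with sample size, which makes it convenient for comparing balance before and after matching or weighting.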
11.  A Tutorial and Case Study in Propensity Score Analysis: An Application to Estimating the Effect of In-Hospital Smoking Cessation Counseling on Mortality 
Multivariate Behavioral Research  2011;46(1):119-151.
Propensity score methods allow investigators to estimate causal treatment effects using observational or nonrandomized data. In this article we provide a practical illustration of the appropriate steps in conducting propensity score analyses. For illustrative purposes, we use a sample of current smokers who were discharged alive after being hospitalized with a diagnosis of acute myocardial infarction. The exposure of interest was receipt of smoking cessation counseling prior to hospital discharge and the outcome was mortality within 3 years of hospital discharge. We illustrate the following concepts: first, how to specify the propensity score model; second, how to match treated and untreated participants on the propensity score; third, how to compare the similarity of baseline characteristics between treated and untreated participants after stratifying on the propensity score, in a sample matched on the propensity score, or in a sample weighted by the inverse probability of treatment; fourth, how to estimate the effect of treatment on outcomes when using propensity score matching, stratification on the propensity score, inverse probability of treatment weighting using the propensity score, or covariate adjustment using the propensity score. Finally, we compare the results of the propensity score analyses with those obtained using conventional regression adjustment.
doi:10.1080/00273171.2011.540480
PMCID: PMC3266945  PMID: 22287812 CAMSID: cams1834
13.  Model Averaging Methods for Weight Trimming 
Journal of Official Statistics  2008;24(4):517-540.
In sample surveys where sampled units have unequal probabilities of inclusion, associations between the inclusion probabilities and the statistic of interest can induce bias. Weights equal to the inverse of the probability of inclusion are often used to counteract this bias. Highly disproportional sample designs have highly variable weights, which can introduce undesirable variability in statistics such as the population mean or linear regression estimates. Weight trimming reduces large weights to a fixed maximum value, reducing variability but introducing bias. Most standard approaches are ad-hoc in that they do not use the data to optimize bias-variance tradeoffs. This manuscript develops variable selection models, termed “weight pooling” models, that extend weight trimming procedures in a Bayesian model averaging framework to produce “data driven” weight trimming estimators. We develop robust yet efficient models that approximate fully-weighted estimators when bias correction is of greatest importance, and approximate unweighted estimators when variance reduction is critical.
PMCID: PMC2783643  PMID: 19946471
Sample survey; sampling weights; Bayesian population inference; weight pooling; variable selection; fractional Bayes Factors
14.  A comparison of perioperative outcomes of Video-Assisted Thoracic Surgical (VATS) Lobectomy with open thoracotomy and lobectomy: Results of an analysis using propensity score based weighting 
Background
Randomized trials comparing VATS lobectomy to open lobectomy are of small size. We analyzed a case-control series using propensity score-weighting to adjust for important covariates in order to compare the clinical outcomes of the two techniques.
Methods
We compared patients undergoing lobectomy for clinical stage I lung cancer (NSCLC) by either VATS or open (THOR) methods. Inverse probability of treatment weighted estimators, with weights derived from propensity scores, were used to adjust cohorts for determinants of perioperative morbidity and mortality including age, gender, preop FEV1, ASA class, and Charlson Comorbidity Index (CCI). Bootstrap methods provided standard errors. Endpoints were postoperative stay (LOS), chest tube duration, complications, and lymph node retrieval.
Results
We analyzed 136 consecutive lobectomy patients. Operative mortality was 1/62 (1.6%) for THOR and 1/74 (1.4%) for VATS, P = 1.00. 5/74 (6.7%) VATS were converted to open procedures. Adjusted median LOS was 7 days (THOR) versus 4 days (VATS), P < 0.0001, HR = 0.33. Adjusted median chest tube duration (days) was 5 (THOR) versus 3 (VATS), P < 0.0001, HR = 0.42. Complication rates were 39% (THOR) versus 34% (VATS), P = 0.61. Adjusted mean number of lymph nodes dissected per patient was 18.1 (THOR) versus 14.8 (VATS), p = 0.17.
Conclusions
After balancing covariates that affect morbidity, mortality and LOS in this case-control series using propensity-weighting, the results confirm that VATS lobectomy is associated with a statistically significant shorter LOS, similar mortality and complication rates and similar rates of lymph node removal in patients with clinical stage I NSCLC.
doi:10.1186/1750-1164-4-1
PMCID: PMC2848683  PMID: 20307297
15.  Model Averaging Methods for Weight Trimming in Generalized Linear Regression Models 
In sample surveys where units have unequal probabilities of inclusion, associations between the inclusion probability and the statistic of interest can induce bias in unweighted estimates. This is true even in regression models, where the estimates of the population slope may be biased if the underlying mean model is misspecified or the sampling is nonignorable. Weights equal to the inverse of the probability of inclusion are often used to counteract this bias. Highly disproportional sample designs have highly variable weights; weight trimming reduces large weights to a maximum value, reducing variability but introducing bias. Most standard approaches are ad hoc in that they do not use the data to optimize bias-variance trade-offs. This article uses Bayesian model averaging to create “data driven” weight trimming estimators. We extend previous results for linear regression models (Elliott 2008) to generalized linear regression models, developing robust models that approximate fully-weighted estimators when bias correction is of greatest importance, and approximate unweighted estimators when variance reduction is critical.
PMCID: PMC3530169  PMID: 23275683
Sample survey; sampling weights; weight winsorization; Bayesian population inference; weight pooling; variable selection; fractional Bayes Factors
16.  Constructing Inverse Probability Weights for Marginal Structural Models 
American Journal of Epidemiology  2008;168(6):656-664.
The method of inverse probability weighting (henceforth, weighting) can be used to adjust for measured confounding and selection bias under the four assumptions of consistency, exchangeability, positivity, and no misspecification of the model used to estimate weights. In recent years, several published estimates of the effect of time-varying exposures have been based on weighted estimation of the parameters of marginal structural models because, unlike standard statistical methods, weighting can appropriately adjust for measured time-varying confounders affected by prior exposure. As an example, the authors describe the last three assumptions using the change in viral load due to initiation of antiretroviral therapy among 918 human immunodeficiency virus-infected US men and women followed for a median of 5.8 years between 1996 and 2005. The authors describe possible tradeoffs that an epidemiologist may encounter when attempting to make inferences. For instance, a tradeoff between bias and precision is illustrated as a function of the extent to which confounding is controlled. Weight truncation is presented as an informal and easily implemented method to deal with these tradeoffs. Inverse probability weighting provides a powerful methodological tool that may uncover causal effects of exposures that are otherwise obscured. However, as with all methods, diagnostics and sensitivity analyses are essential for proper use.
doi:10.1093/aje/kwn164
PMCID: PMC2732954  PMID: 18682488
bias (epidemiology); causality; confounding factors (epidemiology); probability weighting; regression model
17.  Estimating the treatment effect from non-randomized studies: The example of reduced intensity conditioning allogeneic stem cell transplantation in hematological diseases 
BMC Blood Disorders  2012;12:10.
Background
In some clinical situations, for which RCT are rare or impossible, the majority of the evidence comes from observational studies, but standard estimations could be biased because they ignore covariates that confound treatment decisions and outcomes.
Methods
Three observational studies were conducted to assess the benefit of Allo-SCT in hematological malignancies of multiple myeloma, follicular lymphoma and Hodgkin’s disease. Two statistical analyses were performed: the propensity score (PS) matching approach and the inverse probability weighting (IPW) approach.
Results
Based on PS-matched samples, a survival benefit in MM patients treated by Allo-SCT, as compared to similar non-allo treated patients, was observed with an HR of death at 0.35 (95%CI: 0.14-0.88). Similar results were observed in HD, 0.23 (0.07-0.80) but not in FL, 1.28 (0.43-3.77). Estimated benefits of Allo-SCT for the original population using IPW were erased in HR for death at 0.72 (0.37-1.39) for MM patients, 0.60 (0.19-1.89) for HD patients, and 2.02 (0.88-4.66) for FL patients.
Conclusion
Differences in estimated benefits rely on whether the underlying population to which they apply is an ideal randomized experimental population (PS) or the original population (IPW). These useful methods should be employed when assessing the effects of innovative treatment in non-randomized experiments.
doi:10.1186/1471-2326-12-10
PMCID: PMC3532369  PMID: 22898556
Propensity score; Allogeneic stem cell transplantation; Treatment effect; Non-randomized studies
18.  The Use of Propensity Scores in Mediation Analysis 
Multivariate Behavioral Research  2011;46(3):425-452.
Mediation analysis uses measures of hypothesized mediating variables to test theory for how a treatment achieves effects on outcomes and to improve subsequent treatments by identifying the most efficient treatment components. Most current mediation analysis methods rely on untested distributional and functional form assumptions for valid conclusions, especially regarding the relation between the mediator and outcome variables. Propensity score methods offer an alternative whereby the propensity score is used to compare individuals in the treatment and control groups who would have had the same value of the mediator had they been assigned to the same treatment condition. This article describes the use of propensity score weighting for mediation with a focus on explicating the underlying assumptions. Propensity scores have the potential to offer an alternative estimation procedure for mediation analysis with alternative assumptions from those of standard mediation analysis. The methods are illustrated investigating the mediational effects of an intervention to improve sense of mastery to reduce depression using data from the Job Search Intervention Study (JOBS II). We find significant treatment effects for those individuals who would have improved sense of mastery when in the treatment condition but no effects for those who would not have improved sense of mastery under treatment.
doi:10.1080/00273171.2011.576624
PMCID: PMC3293166  PMID: 22399826
19.  Estimating Heterogeneous Treatment Effects with Observational Data 
Sociological Methodology  2012;42(1):314-347.
Individuals differ not only in their background characteristics, but also in how they respond to a particular treatment, intervention, or stimulation. In particular, treatment effects may vary systematically by the propensity for treatment. In this paper, we discuss a practical approach to studying heterogeneous treatment effects as a function of the treatment propensity, under the same assumption commonly underlying regression analysis: ignorability. We describe one parametric method and two non-parametric methods for estimating interactions between treatment and the propensity for treatment. For the first method, we begin by estimating propensity scores for the probability of treatment given a set of observed covariates for each unit and construct balanced propensity score strata; we then estimate propensity score stratum-specific average treatment effects and evaluate a trend across them. For the second method, we match control units to treated units based on the propensity score and transform the data into treatment-control comparisons at the most elementary level at which such comparisons can be constructed; we then estimate treatment effects as a function of the propensity score by fitting a non-parametric model as a smoothing device. For the third method, we first estimate non-parametric regressions of the outcome variable as a function of the propensity score separately for treated units and for control units and then take the difference between the two non-parametric regressions. We illustrate the application of these methods with an empirical example of the effects of college attendance on women's fertility.
PMCID: PMC3591476  PMID: 23482633
causal effects; treatment effects; heterogeneity; propensity scores; matching
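The first (stratification) method in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names `stratum_effects` and `linear_trend` are hypothetical, the strata are equal-width bins on [0, 1] rather than the balanced strata the abstract describes, and the trend is an unweighted least-squares slope across stratum indices.

```python
from statistics import mean


def stratum_effects(scores, treated, outcomes, n_strata=5):
    """Return (stratum index, treated-minus-control mean difference) pairs.

    Assumption: equal-width strata on [0, 1]; `treated` holds 0/1 integers.
    """
    groups = {}
    for p, t, y in zip(scores, treated, outcomes):
        k = min(int(p * n_strata), n_strata - 1)  # keep p == 1.0 in the top stratum
        groups.setdefault(k, {0: [], 1: []})[t].append(y)
    effects = []
    for k in sorted(groups):
        g = groups[k]
        if g[0] and g[1]:  # skip strata that lack either arm
            effects.append((k, mean(g[1]) - mean(g[0])))
    return effects


def linear_trend(effects):
    """Unweighted least-squares slope of stratum effect on stratum index."""
    if len(effects) < 2:
        raise ValueError("need at least two strata with both arms present")
    xs = [k for k, _ in effects]
    ys = [e for _, e in effects]
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx


if __name__ == "__main__":
    # Toy data (made up for illustration): effect shrinks as propensity rises.
    scores = [0.1, 0.1, 0.5, 0.5, 0.9, 0.9]
    treated = [1, 0, 1, 0, 1, 0]
    outcomes = [3, 1, 4, 3, 5, 5]
    eff = stratum_effects(scores, treated, outcomes, n_strata=3)
    print(eff, linear_trend(eff))
```

A negative slope would suggest that units least likely to be treated benefit most; a real analysis would also check covariate balance within strata and weight the trend by stratum precision.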
20.  Estimating Causal Effects in Mediation Analysis using Propensity Scores 
Mediation is usually assessed by a regression-based or structural equation modeling (SEM) approach that we will refer to as the classical approach. This approach relies on the assumption that there are no confounders that influence both the mediator, M, and the outcome, Y. This assumption holds if individuals are randomly assigned to levels of M, but random assignment is generally not possible. We propose the use of propensity scores to help remove the selection bias that may result when individuals are not randomly assigned to levels of M. The propensity score is the probability that an individual receives a particular level of M. Results from a simulation study are presented to demonstrate this approach, referred to as the Classical + Propensity Model (C+PM), confirming that the population parameters are recovered and that selection bias is successfully dealt with. Comparisons are made to the classical approach that does not include propensity scores. Propensity scores were estimated by a logistic regression model. If all confounders are included in the propensity model, then the C+PM is unbiased. If some, but not all, of the confounders are included in the propensity model, then the C+PM estimates are biased, although not as severely as those of the classical approach (i.e., with no propensity model included).
doi:10.1080/10705511.2011.582001
PMCID: PMC3212948  PMID: 22081755
21.  The role of the c-statistic in variable selection for propensity score models 
The applied literature on propensity scores has often cited the c-statistic as a measure of the ability of the propensity score to control confounding. However, a high c-statistic in the propensity model is neither necessary nor sufficient for control of confounding. Moreover, use of the c-statistic as a guide in constructing propensity scores may result in less overlap in propensity scores between treated and untreated subjects; this may require the analyst to restrict populations for inference. Such restrictions may reduce precision of estimates and change the population to which the estimate applies. Variable selection based on prior subject matter knowledge, empirical observation, and sensitivity analysis is preferable and avoids many of these problems.
doi:10.1002/pds.2074
PMCID: PMC3081361  PMID: 21351315
Propensity scores; c-statistic; variable selection; confounding
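The point made in the abstract above, that a high c-statistic can go hand in hand with poor overlap, can be seen in a small sketch. The helper names `c_statistic` and `overlap_interval` are hypothetical, the c-statistic is computed by brute-force pairwise concordance, and the overlap region is taken as the simple range of scores common to both groups, which is only one of several possible definitions.

```python
def _split(scores, treated):
    """Split propensity scores into treated (1) and untreated (0) lists."""
    pos = [s for s, t in zip(scores, treated) if t == 1]
    neg = [s for s, t in zip(scores, treated) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one treated and one untreated unit")
    return pos, neg


def c_statistic(scores, treated):
    """Share of treated/untreated pairs where the treated unit scores higher.

    Ties count as one half, so 0.5 means no discrimination and 1.0 means
    perfect separation of the two groups.
    """
    pos, neg = _split(scores, treated)
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def overlap_interval(scores, treated):
    """Range of scores seen in both groups, or None if the ranges are disjoint."""
    pos, neg = _split(scores, treated)
    lo = max(min(pos), min(neg))
    hi = min(max(pos), max(neg))
    return (lo, hi) if lo <= hi else None


if __name__ == "__main__":
    # Toy scores (made up): perfect discrimination, hence no common support.
    scores = [0.9, 0.8, 0.2, 0.1]
    treated = [1, 1, 0, 0]
    print(c_statistic(scores, treated), overlap_interval(scores, treated))
```

In the toy data the groups are perfectly separated, so no treated unit has a comparable untreated unit; this is the situation in which the analyst would have to restrict the population for inference.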
22.  Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations 
We consider nonparametric regression of a scalar outcome on a covariate when the outcome is missing at random (MAR) given the covariate and other observed auxiliary variables. We propose a class of augmented inverse probability weighted (AIPW) kernel estimating equations for nonparametric regression under MAR. We show that AIPW kernel estimators are consistent when the probability that the outcome is observed, that is, the selection probability, is either known by design or estimated under a correctly specified model. In addition, we show that a specific AIPW kernel estimator in our class that employs the fitted values from a model for the conditional mean of the outcome given covariates and auxiliaries is double-robust, that is, it remains consistent if this model is correctly specified even if the selection probabilities are modeled or specified incorrectly. Furthermore, when both models happen to be right, this double-robust estimator attains the smallest possible asymptotic variance of all AIPW kernel estimators and maximally extracts the information in the auxiliary variables. We also describe a simple correction to the AIPW kernel estimating equations that, while preserving double-robustness, ensures efficiency improvement over nonaugmented IPW estimation when the selection model is correctly specified, regardless of the validity of the second model used in the augmentation term. We perform simulations to evaluate the finite sample performance of the proposed estimators, and apply the methods to the analysis of the AIDS Costs and Services Utilization Survey data. Technical proofs are available online.
doi:10.1198/jasa.2010.tm08463
PMCID: PMC3491912  PMID: 23144520
Asymptotics; Augmented kernel estimating equations; Double robustness; Efficiency; Inverse probability weighted kernel estimating equations; Kernel smoothing
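The double-robustness idea in the abstract above is easiest to see in a simpler setting than kernel regression. The sketch below is a deliberately swapped-in simplification: it applies the standard AIPW form to the overall mean of an outcome that is missing for some units, not to the authors' kernel estimating equations. The function name `aipw_mean` and its argument names are hypothetical.

```python
def aipw_mean(y, observed, pi, m):
    """Augmented inverse probability weighted estimate of E[Y].

    y        : outcomes (any placeholder, e.g. None, where missing)
    observed : 1 if the outcome was observed, else 0
    pi       : modeled probability that each outcome is observed (must be > 0)
    m        : modeled conditional mean of the outcome for each unit

    Each unit contributes R*Y/pi - (R - pi)/pi * m, so the IPW term is
    corrected by an augmentation term built from the outcome model.
    """
    total = 0.0
    for yi, ri, pi_i, mi in zip(y, observed, pi, m):
        ipw = yi / pi_i if ri else 0.0  # missing outcomes contribute nothing here
        total += ipw - (ri - pi_i) / pi_i * mi
    return total / len(observed)


if __name__ == "__main__":
    # Toy data (made up): two outcomes observed, two missing.
    y = [2.0, None, 4.0, None]
    observed = [1, 0, 1, 0]
    m = [2.0, 2.0, 4.0, 4.0]          # outcome model that fits exactly
    print(aipw_mean(y, observed, [0.5] * 4, m))   # selection model as intended
    print(aipw_mean(y, observed, [0.8] * 4, m))   # deliberately wrong selection model
    print(aipw_mean(y, observed, [0.5] * 4, [0.0] * 4))  # reduces to plain IPW
```

In this toy example the outcome model reproduces the observed outcomes exactly, so the estimate is unchanged when the selection probabilities are misspecified; setting `m` to zero removes the augmentation and leaves the ordinary IPW estimate.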
23.  Pharmacologic Boosting of Atazanavir in Maintenance HIV-1 Therapy: The COREYA Propensity-Score Adjusted Study 
PLoS ONE  2012;7(11):e49289.
Background
Among HIV-1 infected patients who achieved virologic suppression, the use of atazanavir without pharmacologic boosting is debated. We evaluated the efficacy and tolerance of maintenance therapy with unboosted atazanavir in clinical practice.
Methods and Results
This multicenter retrospective cohort study evaluated the efficacy of switching HIV-1-infected patients controlled on triple therapy to unboosted (ATV0, n = 98) versus ritonavir-boosted atazanavir (ATV/r, n = 254) plus 2 nucleos(t)ide reverse transcriptase inhibitors. The primary endpoint was time to virologic failure (VF, >200 copies/mL). ATV groups were compared, controlling for potential confounding bias by inverse probability weighted Cox analysis and propensity-score matching. Overall and adjusted VF rates were similar for both strategies. Both strategies improved dyslipidemia and creatininemia, with less jaundice in the ATV0 group.
Conclusion
In previously well-suppressed patients, within an observational cohort setting, ATV0-based triple therapy appeared as effective as ATV/r-based triple therapy in maintaining virologic suppression, even when co-administered with TDF, but was better tolerated.
doi:10.1371/journal.pone.0049289
PMCID: PMC3494679  PMID: 23152890
24.  Evaluating bias correction in weighted proportional hazards regression 
Lifetime Data Analysis  2008;15(1):120-146.
Often in observational studies of time to an event, the study population is a biased (i.e., unrepresentative) sample of the target population. In the presence of biased samples, it is common to weight subjects by the inverse of their respective selection probabilities. Pan and Schaubel (2008) recently proposed inference procedures for an inverse selection probability weighted (ISPW) Cox model, applicable when selection probabilities are not treated as fixed but estimated empirically. The proposed weighting procedure requires auxiliary data to estimate the weights and is computationally more intensive than unweighted estimation. The ignorability of the sample selection process in terms of parameter estimators and predictions is often of interest, from several perspectives: e.g., to determine whether weighting makes a significant difference to the analysis at hand, which would in turn address whether the collection of auxiliary data would be required in future studies; or to evaluate previous studies that did not correct for selection bias. In this article, we propose methods to quantify the degree of bias corrected by the weighting procedure in the partial likelihood and Breslow-Aalen estimators. Asymptotic properties of the proposed test statistics are derived. The finite-sample significance level and power are evaluated through simulation. The proposed methods are then applied to data from a national organ failure registry to evaluate the bias in a post kidney transplant survival model.
doi:10.1007/s10985-008-9102-4
PMCID: PMC3367517  PMID: 18958616
Confidence bands; Inverse-selection-probability weights; Observational studies; Proportional hazards model; Selection bias; Wald test
25.  Adjusting effect estimates for unmeasured confounding with validation data using propensity score calibration 
American journal of epidemiology  2005;162(3):279-289.
Often important confounders are not available in studies. Sensitivity analyses based on the relation of single, but not multiple, unmeasured confounders with an exposure of interest in a separate validation study have been proposed. The authors controlled for measured confounding in the main cohort using propensity scores (PS) and addressed unmeasured confounding by estimating two additional PS in a validation study. The ‘error-prone’ PS exclusively used information available in the main cohort. The ‘gold-standard’ PS additionally included covariates available only in the validation study. Based on these two PS in the validation study, regression calibration was applied to adjust regression coefficients. This propensity score calibration (PSC) adjusts for unmeasured confounding in cohort studies with validation data under certain, usually untestable, assumptions. PSC was used to assess nonsteroidal antiinflammatory drugs (NSAID) and 1-year mortality in a large cohort of elderly. ‘Traditional’ adjustment resulted in a relative risk (RR) in NSAID users of 0.80 (95% confidence interval: 0.77–0.83) compared to an unadjusted RR of 0.68 (0.66–0.71). Application of PSC resulted in a more plausible RR of 1.06 (1.00–1.12). Until validity and limitations of PSC have been assessed in different settings, the method should be seen as a sensitivity analysis.
doi:10.1093/aje/kwi192
PMCID: PMC1444885  PMID: 15987725
epidemiologic methods; research design; confounding factors (epidemiology); bias (epidemiology); cohort studies; propensity score calibration
Abbreviations: AUC, area under the receiver operating characteristic curve; CI, confidence interval; NSAID, nonsteroidal antiinflammatory drug; OR, odds ratio; PS, propensity score; PSC, propensity score calibration; RR, relative risk
