Related Articles

1.  A Tutorial and Case Study in Propensity Score Analysis: An Application to Estimating the Effect of In-Hospital Smoking Cessation Counseling on Mortality 
Multivariate Behavioral Research  2011;46(1):119-151.
Propensity score methods allow investigators to estimate causal treatment effects using observational or nonrandomized data. In this article we provide a practical illustration of the appropriate steps in conducting propensity score analyses. For illustrative purposes, we use a sample of current smokers who were discharged alive after being hospitalized with a diagnosis of acute myocardial infarction. The exposure of interest was receipt of smoking cessation counseling prior to hospital discharge and the outcome was mortality within 3 years of hospital discharge. We illustrate the following concepts: first, how to specify the propensity score model; second, how to match treated and untreated participants on the propensity score; third, how to compare the similarity of baseline characteristics between treated and untreated participants after stratifying on the propensity score, in a sample matched on the propensity score, or in a sample weighted by the inverse probability of treatment; fourth, how to estimate the effect of treatment on outcomes when using propensity score matching, stratification on the propensity score, inverse probability of treatment weighting using the propensity score, or covariate adjustment using the propensity score. Finally, we compare the results of the propensity score analyses with those obtained using conventional regression adjustment.
PMCID: PMC3266945  PMID: 22287812
2.  The use of propensity score methods with survival or time-to-event outcomes: reporting measures of effect similar to those used in randomized experiments 
Statistics in Medicine  2013;33(7):1242-1258.
Propensity score methods are increasingly being used to estimate causal treatment effects in observational studies. In medical and epidemiological studies, outcomes are frequently time-to-event in nature. Propensity-score methods are often applied incorrectly when estimating the effect of treatment on time-to-event outcomes. This article describes how two different propensity score methods (matching and inverse probability of treatment weighting) can be used to estimate the measures of effect that are frequently reported in randomized controlled trials: (i) marginal survival curves, which describe survival in the population if all subjects were treated or if all subjects were untreated; and (ii) marginal hazard ratios. The use of these propensity score methods allows one to replicate the measures of effect that are commonly reported in randomized controlled trials with time-to-event outcomes: both absolute and relative reductions in the probability of an event occurring can be determined. We also provide guidance on variable selection for the propensity score model, highlight methods for assessing the balance of baseline covariates between treated and untreated subjects, and describe the implementation of a sensitivity analysis to assess the effect of unmeasured confounding variables on the estimated treatment effect when outcomes are time-to-event in nature. The methods in the paper are illustrated by estimating the effect of discharge statin prescribing on the risk of death in a sample of patients hospitalized with acute myocardial infarction. In this tutorial article, we describe and illustrate all the steps necessary to conduct a comprehensive analysis of the effect of treatment on time-to-event outcomes. © 2013 The authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
PMCID: PMC4285179  PMID: 24122911
propensity score; observational study; propensity score matching; inverse probability of treatment weighting; survival analysis; event history analysis; confounding; marginal effects
3.  The performance of different propensity score methods for estimating marginal hazard ratios 
Statistics in Medicine  2012;32(16):2837-2849.
Propensity score methods are increasingly being used to reduce or minimize the effects of confounding when estimating the effects of treatments, exposures, or interventions when using observational or non-randomized data. Under the assumption of no unmeasured confounders, previous research has shown that propensity score methods allow for unbiased estimation of linear treatment effects (e.g., differences in means or proportions). However, in biomedical research, time-to-event outcomes occur frequently. There is a paucity of research into the performance of different propensity score methods for estimating the effect of treatment on time-to-event outcomes. Furthermore, propensity score methods allow for the estimation of marginal or population-average treatment effects. We conducted an extensive series of Monte Carlo simulations to examine the performance of propensity score matching (1:1 greedy nearest-neighbor matching within propensity score calipers), stratification on the propensity score, inverse probability of treatment weighting (IPTW) using the propensity score, and covariate adjustment using the propensity score to estimate marginal hazard ratios. We found that both propensity score matching and IPTW using the propensity score allow for the estimation of marginal hazard ratios with minimal bias. Of these two approaches, IPTW using the propensity score resulted in estimates with lower mean squared error when estimating the effect of treatment in the treated. Stratification on the propensity score and covariate adjustment using the propensity score result in biased estimation of both marginal and conditional hazard ratios. Applied researchers are encouraged to use propensity score matching and IPTW using the propensity score when estimating the relative effect of treatment on time-to-event outcomes. Copyright © 2012 John Wiley & Sons, Ltd.
PMCID: PMC3747460  PMID: 23239115
propensity score; survival analysis; inverse probability of treatment weighting (IPTW); Monte Carlo simulations; observational study; time-to-event outcomes
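The abstract above names 1:1 greedy nearest-neighbor matching within propensity score calipers. A minimal sketch of that idea follows, under stated assumptions: the function name `greedy_caliper_match` is hypothetical, treated subjects are processed in the order given, the caliper is applied on whatever scale the scores are supplied in, and matching is without replacement. The paper may differ on each of these points.

```python
def greedy_caliper_match(treated_scores, control_scores, caliper):
    """Greedy 1:1 matching without replacement (illustrative sketch).

    Each treated subject, in the order given, takes the closest still-unused
    control whose score lies within `caliper`. Treated subjects with no
    eligible control stay unmatched. Returns (treated_index, control_index)
    pairs.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        best, best_dist = None, None
        for c_idx in available:
            dist = abs(t_score - control_scores[c_idx])
            if dist > caliper:
                continue  # outside the caliper, not eligible
            # Keep the nearest control; break exact ties by lower index.
            if best is None or dist < best_dist or (dist == best_dist and c_idx < best):
                best, best_dist = c_idx, dist
        if best is not None:
            pairs.append((t_idx, best))
            available.remove(best)  # each control is used at most once
    return pairs


matches = greedy_caliper_match([0.30, 0.31, 0.90], [0.32, 0.29, 0.50], caliper=0.05)
```

In this toy call the third treated subject (score 0.90) has no control within the caliper and is dropped, which is how caliper matching trades sample size for closer matches.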
4.  An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies 
Multivariate Behavioral Research  2011;46(3):399-424.
The propensity score is the probability of treatment assignment conditional on observed baseline characteristics. The propensity score allows one to design and analyze an observational (nonrandomized) study so that it mimics some of the particular characteristics of a randomized controlled trial. In particular, the propensity score is a balancing score: conditional on the propensity score, the distribution of observed baseline covariates will be similar between treated and untreated subjects. I describe 4 different propensity score methods: matching on the propensity score, stratification on the propensity score, inverse probability of treatment weighting using the propensity score, and covariate adjustment using the propensity score. I describe balance diagnostics for examining whether the propensity score model has been adequately specified. Furthermore, I discuss differences between regression-based methods and propensity score-based methods for the analysis of observational data. I describe different causal average treatment effects and their relationship with propensity score analyses.
PMCID: PMC3144483  PMID: 21818162
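Inverse probability of treatment weighting, mentioned in the abstract above, can be sketched in a few lines. This is an illustration only: the function name `iptw_weights` and the `estimand` argument are hypothetical, and the propensity scores are assumed to have been estimated already by some model. The formulas are the standard ones: for the average treatment effect (ATE), 1/e for treated and 1/(1 - e) for untreated subjects; for the effect in the treated (ATT), 1 for treated and e/(1 - e) for untreated subjects.

```python
def iptw_weights(treated, propensity, estimand="ATE"):
    """Return one weight per subject (illustrative sketch).

    treated    : iterable of 0/1 treatment indicators
    propensity : iterable of estimated propensity scores, each in (0, 1)
    estimand   : "ATE" or "ATT" (hypothetical argument name)
    """
    weights = []
    for z, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if estimand == "ATE":
            # Weight each subject by the inverse probability of the
            # treatment actually received.
            w = 1.0 / e if z == 1 else 1.0 / (1.0 - e)
        elif estimand == "ATT":
            # Treated subjects keep weight 1; untreated subjects are
            # reweighted by the odds of treatment.
            w = 1.0 if z == 1 else e / (1.0 - e)
        else:
            raise ValueError("estimand must be 'ATE' or 'ATT'")
        weights.append(w)
    return weights


ate_weights = iptw_weights([1, 0], [0.25, 0.25])
```

A treated subject with a low propensity score receives a large weight, which is why extreme scores can make weighted estimates unstable.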
5.  Estimating Effects of Nursing Intervention via Propensity Score Analysis 
Nursing research  2008;57(6):444-452.
Lack of randomization of nursing intervention in outcome effectiveness studies may lead to imbalanced covariates. Consequently, estimation of nursing intervention effect can be biased as in other observational studies. Propensity score analysis is an effective statistical method to reduce such bias and further derive causal effects in observational studies.
To illustrate the use of propensity score analysis in quantitative nursing research through an example of pain management effect on length of hospital stay.
Propensity scores are generated through a regression model treating the nursing intervention as the dependent variable and all confounding covariates as predictor variables. Then propensity scores are used to adjust for this nonrandomized assignment of nursing intervention through three approaches: regression covariance adjustment, stratification, and matching in the predictive outcome model for nursing intervention.
Propensity score analysis reduces the confounding covariates into a single variable of propensity score. After stratification and matching on propensity scores, observed covariates between nursing intervention groups are more balanced within each stratum or in the matched samples. The likelihood of receiving pain management is accounted for in the outcome model through the propensity scores. Both regression covariance adjustment and matching methods report a significant pain management effect on length of hospital stay in this example. The pain management effect can be regarded as causal when the strongly ignorable treatment assignment assumption holds.
Propensity score analysis provides an alternative statistical approach to the classical multivariate regression, stratification and matching techniques for examining the effects of nursing intervention with a large number of confounding covariates in the background. It can be used to derive causal effects of nursing intervention in observational studies under certain circumstances.
PMCID: PMC2778306  PMID: 19018219
matching; nursing effectiveness research; nursing interventions; propensity score
6.  The performance of different propensity-score methods for estimating differences in proportions (risk differences or absolute risk reductions) in observational studies 
Statistics in Medicine  2010;29(20):2137-2148.
Propensity score methods are increasingly being used to estimate the effects of treatments on health outcomes using observational data. There are four methods for using the propensity score to estimate treatment effects: covariate adjustment using the propensity score, stratification on the propensity score, propensity-score matching, and inverse probability of treatment weighting (IPTW) using the propensity score. When outcomes are binary, the effect of treatment on the outcome can be described using odds ratios, relative risks, risk differences, or the number needed to treat. Several clinical commentators suggested that risk differences and numbers needed to treat are more meaningful for clinical decision making than are odds ratios or relative risks. However, there is a paucity of information about the relative performance of the different propensity-score methods for estimating risk differences. We conducted a series of Monte Carlo simulations to examine this issue. We examined bias, variance estimation, coverage of confidence intervals, mean-squared error (MSE), and type I error rates. A doubly robust version of IPTW had superior performance compared with the other propensity-score methods. It resulted in unbiased estimation of risk differences, treatment effects with the lowest standard errors, confidence intervals with the correct coverage rates, and correct type I error rates. Stratification, matching on the propensity score, and covariate adjustment using the propensity score resulted in minor to modest bias in estimating risk differences. Estimators based on IPTW had lower MSE compared with other propensity-score methods. Differences between IPTW and propensity-score matching may reflect that these two methods estimate the average treatment effect and the average treatment effect for the treated, respectively. Copyright © 2010 John Wiley & Sons, Ltd.
PMCID: PMC3068290  PMID: 20108233
propensity score; observational study; binary data; risk difference; number needed to treat; matching; IPTW; inverse probability of treatment weighting; propensity-score matching
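For binary outcomes, the risk difference and number needed to treat discussed in the abstract above can be computed from weighted data as sketched here. The function names are hypothetical, and this is the plain normalized weighted estimator, not the doubly robust version that the abstract reports as performing best. The weights are assumed to come from a previously fitted propensity score model.

```python
def weighted_risk_difference(treated, outcome, weights):
    """Weighted risk in the treated minus weighted risk in the untreated.

    treated, outcome : iterables of 0/1 values
    weights          : one non-negative weight per subject (e.g. IPTW weights)
    """
    # arm -> [weighted number of events, total weight]
    sums = {1: [0.0, 0.0], 0: [0.0, 0.0]}
    for z, y, w in zip(treated, outcome, weights):
        sums[z][0] += w * y
        sums[z][1] += w
    risk_treated = sums[1][0] / sums[1][1]
    risk_untreated = sums[0][0] / sums[0][1]
    return risk_treated - risk_untreated


def number_needed_to_treat(risk_difference):
    """Reciprocal of the absolute risk difference."""
    return 1.0 / abs(risk_difference)


rd = weighted_risk_difference([1, 1, 0, 0], [0, 1, 1, 1], [2.0, 2.0, 1.0, 3.0])
nnt = number_needed_to_treat(rd)
```

Dividing by the total weight in each arm, rather than by the sample size, keeps each estimated risk between 0 and 1 even when the weights are large.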
Methods for estimating average treatment effects, under the assumption of no unmeasured confounders, include regression models; propensity score adjustments using stratification, weighting, or matching; and doubly robust estimators (a combination of both). Researchers continue to debate the best estimator for outcomes such as health care cost data, as they are usually characterized by an asymmetric distribution and heterogeneous treatment effects. Challenges in finding the right specifications for regression models are well documented in the literature. Propensity score estimators are proposed as alternatives to overcoming these challenges. Using simulations, we find that in moderate size samples (n = 5000), balancing on propensity scores that are estimated from saturated specifications can balance the covariate means across treatment arms but fails to balance higher-order moments and covariances amongst covariates. Therefore, unlike regression models, even if a formal model for outcomes is not required, propensity score estimators can be inefficient at best and biased at worst for health care cost data. Our simulation study, designed to take a ‘proof by contradiction’ approach, proves that no one estimator can be considered the best under all data generating processes for outcomes such as costs. The inverse-propensity weighted estimator is most likely to be unbiased under alternate data generating processes but is prone to bias under misspecification of the propensity score model and is inefficient compared to an unbiased regression estimator. Our results show that there are no ‘magic bullets’ when it comes to estimating treatment effects in health care costs. Care should be taken before naively applying any one estimator to estimate average treatment effects in these data. We illustrate the performance of alternative methods in a cost dataset on breast cancer treatment.
PMCID: PMC3244728  PMID: 22199462
Propensity score; non-linear regression; average treatment effect; health care costs
8.  Bias associated with using the estimated propensity score as a regression covariate 
Statistics in medicine  2013;33(1):74-87.
The use of propensity score methods to adjust for selection bias in observational studies has become increasingly popular in public health and medical research. A substantial portion of studies using propensity score adjustment treat the propensity score as a conventional regression predictor. Through a Monte Carlo simulation study, Austin and colleagues investigated the bias associated with treatment effect estimation when the propensity score is used as a covariate in nonlinear regression models, such as logistic regression and Cox proportional hazards models. We show that the bias exists even in a linear regression model when the estimated propensity score is used and derive the explicit form of the bias. We also conduct an extensive simulation study to compare the performance of such covariate adjustment with propensity score stratification, propensity score matching, inverse probability of treatment weighted method, and nonparametric functional estimation using splines. The simulation scenarios are designed to reflect real data analysis practice. Instead of specifying a known parametric propensity score model, we generate the data by considering various degrees of overlap of the covariate distributions between treated and control groups. Propensity score matching excels when the treated group is contained within a larger control pool, while the model-based adjustment may have an edge when treated and control groups do not have too much overlap. Overall, adjusting for the propensity score through stratification or matching followed by regression or using splines, appears to be a good practical strategy.
PMCID: PMC4004383  PMID: 23787715
observational studies; matching; stratification; weighting
9.  Using Ensemble-Based Methods for Directly Estimating Causal Effects: An Investigation of Tree-Based G-Computation 
Multivariate behavioral research  2012;47(1):115-135.
Researchers are increasingly using observational or nonrandomized data to estimate causal treatment effects. Essential to the production of high-quality evidence is the ability to reduce or minimize the confounding that frequently occurs in observational studies. When using the potential outcome framework to define causal treatment effects, one requires the potential outcome under each possible treatment. However, only the outcome under the actual treatment received is observed, whereas the potential outcomes under the other treatments are considered missing data. Some authors have proposed that parametric regression models be used to estimate potential outcomes. In this study, we examined the use of ensemble-based methods (bagged regression trees, random forests, and boosted regression trees) to directly estimate average treatment effects by imputing potential outcomes. We used an extensive series of Monte Carlo simulations to estimate bias, variance, and mean squared error of treatment effects estimated using different ensemble methods. For comparative purposes, we compared the performance of these methods with inverse probability of treatment weighting using the propensity score when logistic regression or ensemble methods were used to estimate the propensity score. Using boosted regression trees of depth 3 or 4 to impute potential outcomes tended to result in estimates with bias equivalent to that of the best performing methods. Using an empirical case study, we compared inferences on the effect of in-hospital smoking cessation counseling on subsequent mortality in patients hospitalized with an acute myocardial infarction.
PMCID: PMC3293511  PMID: 22419832 CAMSID: cams2143
10.  Using imputed pre-treatment cholesterol in a propensity score model to reduce confounding by indication: results from the multi-ethnic study of atherosclerosis 
Studying the effects of medications on endpoints in an observational setting is an important yet challenging problem due to confounding by indication. The purpose of this study is to describe methodology for estimating such effects while including prevalent medication users. These techniques are illustrated in models relating statin use to cardiovascular disease (CVD) in a large multi-ethnic cohort study.
The Multi-Ethnic Study of Atherosclerosis (MESA) includes 6814 participants aged 45-84 years free of CVD. Confounding by indication was mitigated using a two-step approach: First, the untreated values of cholesterol were treated as missing data and the values imputed as a function of the observed treated value, dose and type of medication, and participant characteristics. Second, we constructed a propensity score modeling the probability of medication initiation as a function of measured covariates and estimated pre-treatment cholesterol value. The effect of statins on CVD endpoints was assessed using weighted Cox proportional hazard models using inverse probability weights based on the propensity score.
Based on a meta-analysis of randomized controlled trials (RCT) statins are associated with a reduced risk of CVD (relative risk ratio = 0.73, 95% CI: 0.70, 0.77). In an unweighted Cox model adjusting for traditional risk factors we observed little association of statins with CVD (hazard ratio (HR) = 0.97, 95% CI: 0.60, 1.59). Using weights based on a propensity model for statins that did not include the estimated pre-treatment cholesterol we observed a slight protective association (HR = 0.92, 95% CI: 0.54-1.57). Results were similar using a new-user design where prevalent users of statins are excluded (HR = 0.91, 95% CI: 0.45-1.80). Using weights based on a propensity model with estimated pre-treatment cholesterol the effects of statins (HR = 0.74, 95% CI: 0.38, 1.42) were consistent with the RCT literature.
The imputation of pre-treatment cholesterol levels for participants on medication at baseline in conjunction with a propensity score yielded estimates that were consistent with the RCT literature. These techniques could be useful in any example where inclusion of participants exposed at baseline in the analysis is desirable, and reasonable estimates of pre-exposure biomarker values can be obtained.
PMCID: PMC3694006  PMID: 23800038
Multiple imputation; Confounding by indication; Propensity score; Inverse probability of treatment weights; Statins
11.  Estimating Heterogeneous Treatment Effects with Observational Data* 
Sociological methodology  2012;42(1):314-347.
Individuals differ not only in their background characteristics, but also in how they respond to a particular treatment, intervention, or stimulation. In particular, treatment effects may vary systematically by the propensity for treatment. In this paper, we discuss a practical approach to studying heterogeneous treatment effects as a function of the treatment propensity, under the same assumption commonly underlying regression analysis: ignorability. We describe one parametric method and two non-parametric methods for estimating interactions between treatment and the propensity for treatment. For the first method, we begin by estimating propensity scores for the probability of treatment given a set of observed covariates for each unit and construct balanced propensity score strata; we then estimate propensity score stratum-specific average treatment effects and evaluate a trend across them. For the second method, we match control units to treated units based on the propensity score and transform the data into treatment-control comparisons at the most elementary level at which such comparisons can be constructed; we then estimate treatment effects as a function of the propensity score by fitting a non-parametric model as a smoothing device. For the third method, we first estimate non-parametric regressions of the outcome variable as a function of the propensity score separately for treated units and for control units and then take the difference between the two non-parametric regressions. We illustrate the application of these methods with an empirical example of the effects of college attendance on women's fertility.
PMCID: PMC3591476  PMID: 23482633
causal effects; treatment effects; heterogeneity; propensity scores; matching
12.  Bias and variance trade-offs when combining propensity score weighting and regression: with an application to HIV status and homeless men 
The quality of propensity scores is traditionally measured by assessing how well they make the distributions of covariates in the treatment and control groups match, which we refer to as “good balance”. Good balance guarantees less biased estimates of the treatment effect. However, the cost of achieving good balance is that the variance of the estimates increases due to a reduction in effective sample size, either through the introduction of propensity score weights or dropping cases when propensity score matching. In this paper, we investigate whether it is best to optimize the balance or to settle for a less than optimal balance and use double robust estimation to adjust for remaining differences. We compare treatment effect estimates from regression, propensity score weighting, and double robust estimation with varying levels of effort expended to achieve balance using data from a study about the differences in outcomes by HIV status in heterosexually active homeless men residing in Los Angeles. Because of how costly data collection efforts are for this population, it is important to find an alternative estimation method that does not reduce effective sample size as much as methods that aggressively aim to optimize balance. Results from a simulation study suggest that there are instances in which we can obtain more precise treatment effect estimates without increasing bias too much by using a combination of regression and propensity score weights that achieve a less than optimal balance. There is a bias-variance tradeoff at work in propensity score estimation; every step toward better balance usually means an increase in variance and at some point a marginal decrease in bias may not be worth the associated increase in variance.
PMCID: PMC3433039  PMID: 22956891
Propensity score; Double robust estimation; HIV status; Homeless men
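The abstract above attributes the rise in variance to a reduction in effective sample size once propensity score weights are introduced. One common way to quantify that reduction is Kish's approximation, sketched below. This choice is an assumption: the abstract does not say which definition the authors use, and the function name `effective_sample_size` is hypothetical.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights).

    Equals the number of subjects when all weights are equal and shrinks as
    the weights become more variable.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


equal = effective_sample_size([1.0, 1.0, 1.0, 1.0])
skewed = effective_sample_size([1.0, 1.0, 1.0, 5.0])
```

With four equal weights the effective sample size is 4; when one subject carries a weight of 5, it falls to 64/28, roughly 2.3, which illustrates the bias-variance trade-off the abstract describes.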
13.  Type I Error Rates, Coverage of Confidence Intervals, and Variance Estimation in Propensity-Score Matched Analyses* 
Propensity-score matching is frequently used in the medical literature to reduce or eliminate the effect of treatment selection bias when estimating the effect of treatments or exposures on outcomes using observational data. In propensity-score matching, pairs of treated and untreated subjects with similar propensity scores are formed. Recent systematic reviews of the use of propensity-score matching found that the large majority of researchers ignore the matched nature of the propensity-score matched sample when estimating the statistical significance of the treatment effect. We conducted a series of Monte Carlo simulations to examine the impact of ignoring the matched nature of the propensity-score matched sample on Type I error rates, coverage of confidence intervals, and variance estimation of the treatment effect. We examined estimating differences in means, relative risks, odds ratios, rate ratios from Poisson models, and hazard ratios from Cox regression models. We demonstrated that accounting for the matched nature of the propensity-score matched sample tended to result in type I error rates that were closer to the advertised level compared to when matching was not incorporated into the analyses. Similarly, accounting for the matched nature of the sample tended to result in confidence intervals with coverage rates that were closer to the nominal level, compared to when matching was not taken into account. Finally, accounting for the matched nature of the sample resulted in estimates of standard error that more closely reflected the sampling variability of the treatment effect compared to when matching was not taken into account.
PMCID: PMC2949360  PMID: 20949126
propensity score; matching; propensity-score matching; variance estimation; coverage; simulations; type I error; observational studies
14.  Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples 
Statistics in Medicine  2009;28(25):3083-3107.
The propensity score is a subject's probability of treatment, conditional on observed baseline covariates. Conditional on the true propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Propensity-score matching is a popular method of using the propensity score in the medical literature. Using this approach, matched sets of treated and untreated subjects with similar values of the propensity score are formed. Inferences about treatment effect made using propensity-score matching are valid only if, in the matched sample, treated and untreated subjects have similar distributions of measured baseline covariates. In this paper we discuss the following methods for assessing whether the propensity score model has been correctly specified: comparing means and prevalences of baseline characteristics using standardized differences; ratios comparing the variance of continuous covariates between treated and untreated subjects; comparison of higher order moments and interactions; five-number summaries; and graphical methods such as quantile–quantile plots, side-by-side boxplots, and non-parametric density plots for comparing the distribution of baseline covariates between treatment groups. We describe methods to determine the sampling distribution of the standardized difference when the true standardized difference is equal to zero, thereby allowing one to determine the range of standardized differences that are plausible with the propensity score model having been correctly specified. We highlight the limitations of some previously used methods for assessing the adequacy of the specification of the propensity-score model. In particular, methods based on comparing the distribution of the estimated propensity score between treated and untreated subjects are uninformative. Copyright © 2009 John Wiley & Sons, Ltd.
PMCID: PMC3472075  PMID: 19757444
balance; goodness-of-fit; observational study; propensity score; matching; propensity-score matching; standardized difference; bias
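Two of the balance diagnostics listed in the abstract above, the standardized difference and the variance ratio for a continuous covariate, can be sketched as follows. The function names are hypothetical. The standardized difference here uses the usual definition for continuous variables: the difference in means divided by the square root of the average of the two sample variances.

```python
from statistics import mean, variance


def standardized_difference(treated_values, control_values):
    """Difference in means in units of the pooled standard deviation."""
    pooled_sd = ((variance(treated_values) + variance(control_values)) / 2.0) ** 0.5
    return (mean(treated_values) - mean(control_values)) / pooled_sd


def variance_ratio(treated_values, control_values):
    """Sample variance in the treated divided by that in the controls."""
    return variance(treated_values) / variance(control_values)


d = standardized_difference([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
r = variance_ratio([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Because the standardized difference does not depend on sample size, it can be compared before and after matching; a variance ratio near 1 suggests similar spread in the two groups.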
15.  Major Medical Outcomes with Spinal Augmentation versus Conservative Therapy 
JAMA internal medicine  2013;173(16):1514-1521.
The symptomatic benefits of spinal augmentation (vertebroplasty or kyphoplasty) for the treatment of osteoporotic vertebral compression fractures are controversial. Recent population-based studies using medical billing claims have reported significant reductions in mortality with spinal augmentation compared to conservative therapy, but in non-randomized settings such as these, there is the potential for selection bias to influence results.
To compare major medical outcomes following treatment of osteoporotic vertebral fractures with spinal augmentation or conservative therapy. Additionally, we will evaluate the role of selection bias using pre-procedure outcomes and propensity score analysis.
Design, Setting, and Participants
Retrospective cohort analysis of Medicare claims for 2002–2006. We compared 30-day and 1-year outcomes in patients with newly-diagnosed vertebral fractures treated with spinal augmentation (augmented; n=10 541) or conservative therapy (control; n=115 851). Outcomes were compared using traditional multivariate analyses adjusted for patient demographics and comorbid conditions. We also used propensity score matching to select 9017 pairs from the initial groups to compare the same outcomes.
Main Outcomes and Measures
Mortality, major complications, and healthcare utilization.
Using traditional covariate adjustments, mortality was significantly lower in the augmented group compared to controls (5.2% vs 6.7% at one year; hazard ratio, 0.83; 95% CI, 0.75–0.92). However, patients in the augmented group who had not yet undergone augmentation (pre-procedure subgroup) had lower rates of medical complications 30 days post-fracture compared to controls (6.5% vs 9.5%; odds ratio, 0.66; 95% CI, 0.57–0.78), suggesting that the augmented group was less medically ill. After propensity score matching to better account for selection bias, one-year mortality was not significantly different between groups. Furthermore, one-year major medical complications were also similar between groups, and the augmented group had higher rates of healthcare utilization, including hospital and intensive care unit admissions and discharges to skilled nursing facilities.
Conclusions and Relevance
After accounting for selection bias, spinal augmentation did not improve mortality or major medical outcomes and was associated with greater healthcare utilization compared to conservative therapy. Our results also highlight how analyses of claims-based data that do not adequately account for unrecognized confounding can arrive at misleading conclusions.
PMCID: PMC4023124  PMID: 23836009
16.  Carboplatin and Paclitaxel with vs without Bevacizumab in Older Patients with Advanced Non-Small-Cell Lung Cancer 
A randomized trial demonstrates that adding bevacizumab to carboplatin and paclitaxel improves survival in advanced non-small-cell lung cancer (NSCLC).
To examine whether adding bevacizumab to carboplatin and paclitaxel chemotherapy is associated with improved survival in the NSCLC Medicare population.
Design, Setting and Participants
Retrospective cohort study of Medicare beneficiaries aged 65 and older with stage IIIB or IV non-squamous NSCLC diagnosed in 2002–2007 in a Surveillance, Epidemiology and End Results (SEER) region. Patients were categorized into three cohorts based on diagnosis year and the type of initial chemotherapy administered within 4 months of diagnosis: 1) bevacizumab-carboplatin-paclitaxel (BCP) diagnosed 2006–7; 2) carboplatin-paclitaxel diagnosed 2006–7 (CP 2006–7); and, 3) CP diagnosed 2002–5 (CP 2002–5). The effects of BCP and CP on overall survival were compared using Cox proportional hazards models and propensity score analyses including information about patient characteristics recorded in SEER-Medicare.
Main Outcome Measure
Overall survival measured from the first date of chemotherapy treatment until death or the censoring date of December 31, 2009.
4,168 patients had either BCP or CP chemotherapy. The median survival (interquartile range) estimates were 9.7 (4.4–18.6) months, 8.9 (3.5–19.3) months, and 8.0 (3.7–17.2) months for BCP, CP 2006–7, and CP 2002–5 recipients, respectively. One-year survival probabilities (95% confidence interval [CI]) were 39.6% (34.6%–45.4%) for BCP, versus 40.1% (37.4%–43.0%) for CP 2006–7 and 35.6% (33.8%–37.5%) for CP 2002–5. Neither multivariable nor propensity score-adjusted Cox models demonstrated a survival advantage for BCP compared to CP cohorts. In propensity score-stratified models, the hazard ratio (HR) for overall survival for BCP compared with CP 2006–7 was 1.01 (95% CI, 0.89–1.16; P=.85); and compared with CP 2002–5 was 0.93 (95% CI, 0.83–1.06; P=.28). The propensity score-weighted model and propensity score-matching model similarly failed to demonstrate a statistically significant superiority for BCP. Subgroup and sensitivity analyses for key variables did not change these findings.
Adding bevacizumab to carboplatin and paclitaxel was not associated with better survival among Medicare patients with advanced NSCLC.
PMCID: PMC3418968  PMID: 22511687
17.  Model Feedback in Bayesian Propensity Score Estimation 
Biometrics  2013;69(1):263-273.
Methods based on the propensity score comprise one set of valuable tools for comparative effectiveness research and for estimating causal effects more generally. These methods typically consist of two distinct stages: 1) a propensity score stage where a model is fit to predict the propensity to receive treatment (the propensity score), and 2) an outcome stage where responses are compared in treated and untreated units having similar values of the estimated propensity score. Traditional techniques conduct estimation in these two stages separately; estimates from the first stage are treated as fixed and known for use in the second stage. Bayesian methods have natural appeal in these settings because separate likelihoods for the two stages can be combined into a single joint likelihood, with estimation of the two stages carried out simultaneously. One key feature of joint estimation in this context is “feedback” between the outcome stage and the propensity score stage, meaning that quantities in a model for the outcome contribute information to posterior distributions of quantities in the model for the propensity score. We provide a rigorous assessment of Bayesian propensity score estimation to show that model feedback can produce poor estimates of causal effects absent strategies that augment propensity score adjustment with adjustment for individual covariates. We illustrate this phenomenon with a simulation study and with a comparative effectiveness investigation of carotid artery stenting vs. carotid endarterectomy among 123,286 Medicare beneficiaries hospitalized for stroke in
PMCID: PMC3622139  PMID: 23379793
Bayesian estimation; causal inference; comparative effectiveness; model feedback; propensity score
18.  Prehospital Lactated Ringer's Solution Treatment and Survival in Out-of-Hospital Cardiac Arrest: A Prospective Cohort Analysis 
PLoS Medicine  2013;10(2):e1001394.
In a cohort of more than 500,000 individuals who experienced out-of-hospital cardiac arrest in Japan, Akihito Hagihara and colleagues studied whether administration of lactated Ringer's solution was associated with survival and functional outcomes.
No studies have evaluated whether administering intravenous lactated Ringer's (LR) solution to patients with out-of-hospital cardiac arrest (OHCA) improves their outcomes, to our knowledge. Therefore, we examined the association between prehospital use of LR solution and patients' return of spontaneous circulation (ROSC), 1-month survival, and neurological or physical outcomes at 1 month after the event.
Methods and Findings
We conducted a prospective, non-randomized, observational study using national data of all patients with OHCA from 2005 through 2009 in Japan. We performed a propensity analysis and examined the association between prehospital use of LR solution and short- and long-term survival. The study patients were ≥18 years of age, had an OHCA before arrival of EMS personnel, were treated by EMS personnel, and were then transported to hospitals. A total of 531,854 patients with OHCA met the inclusion criteria. Among propensity-matched patients, compared with those who did not receive pre-hospital intravenous fluids, prehospital use of LR solution was associated with an increased likelihood of ROSC before hospital arrival (odds ratio [OR] adjusted for all covariates [95% CI] = 1.239 [1.146–1.339] [p<0.001]), but with a reduced likelihood of 1-month survival with minimal neurological or physical impairment (cerebral performance category 1 or 2, OR adjusted for all covariates [95% CI] = 0.764 [0.589–0.992] [p = 0.04]; and overall performance category 1 or 2, OR adjusted for all covariates [95% CI] = 0.746 [0.573–0.971] [p = 0.03]). There was no association between prehospital use of LR solution and 1-month survival (OR adjusted for all covariates [95% CI] = 0.960 [0.854–1.078]).
In Japanese patients experiencing OHCA, the prehospital use of LR solution was independently associated with a decreased likelihood of a good functional outcome 1 month after the event, but with an increased likelihood of ROSC before hospital arrival. Prehospital use of LR solution was not associated with 1-month survival. Further study is necessary to verify these findings.
Please see later in the article for the Editors' Summary
Editors' Summary
Cardiac arrest, a condition in which the heart suddenly stops pumping, is caused by problems with the heart's internal electrical system, which controls the rate and rhythm of the heart contractions that pump blood around the body. If this electrical system malfunctions, an abnormal heartbeat or “arrhythmia” develops that, in some cases, causes cardiac arrest. Because blood is no longer being pumped around the body, the organs and tissues of the body do not receive the oxygen they need to function. Consciousness is lost immediately and, if medical attention is not provided quickly, death follows within a few minutes—about 95% of people who have a cardiac arrest die before they reach hospital or emergency medical help. Moreover, survivors of cardiac arrest are often left with permanent damage to the brain and other organs. Early cardiopulmonary resuscitation (CPR; chest compression to pump the heart and mouth-to-mouth resuscitation to inflate the lungs) and early defibrillation (delivery of an electric shock to the heart to restore its normal rhythm) reduce the risk of death and permanent organ damage after cardiac arrest.
Why Was This Study Done?
Another procedure that is sometimes used during pre-hospital resuscitation of cardiac arrest cases is intravenous fluid administration—delivering liquid into a vein through an intravenous needle. A solution that is often used for this purpose is lactated Ringer's (LR) solution, a mixture of inorganic salts and sodium lactate. However, the effects of intravenous LR solution on the outcomes of patients who have an out-of-hospital cardiac arrest have not been studied. In this prospective cohort analysis, the researchers examine the association between the pre-hospital use of LR solution and the return of spontaneous circulation, one-month survival, and neurological and physical outcomes at one month after cardiac arrest among patients in Japan who have had an out-of-hospital cardiac arrest. A prospective cohort analysis identifies a group of patients with a specific condition and examines how they subsequently fare.
What Did the Researchers Do and Find?
In Japan, the Fire and Disaster Management Agency records all out-of-hospital cardiac arrest cases in a nationwide database. The researchers used this database to identify more than half a million out-of-hospital cardiac arrest cases that occurred in Japan between 2005 and 2009. To examine the association between pre-hospital use of LR solution and short- and long-term survival, the researchers used a statistical technique called propensity analysis. This technique is used in observational studies to control for confounding—unknown differences between people who receive an intervention and those who do not receive an intervention that might affect outcomes and thus make it hard to draw conclusions about the intervention's true effects. By examining a large number of variables (for example, age, sex, and time taken for help to arrive), the researchers gave every patient a propensity score that indicated their probability of receiving pre-hospital LR solution, and then used this score to match each patient who received LR solution with a similar patient who did not receive LR solution. Among propensity-matched patients, pre-hospital use of LR solution was associated with a slightly increased chance of return of spontaneous circulation before arrival at a hospital and with a decreased chance of 1-month survival with minimal neurological or physical impairment. Among the whole cohort, pre-hospital use of LR solution was not associated with overall one-month survival.
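The matching step described above can be sketched roughly as follows. This is only an illustration and not the study's actual procedure: the greedy nearest-neighbor rule, the caliper of 0.05, the function name `greedy_match`, and the toy propensity scores are all assumptions made for the example.

```python
def greedy_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores, without replacement.

    treated, control: dicts mapping a patient id to a propensity score.
    caliper: largest score difference accepted for a pair (hypothetical default).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    for t_id in sorted(treated):  # fixed order keeps the result deterministic
        if not available:
            break
        t_score = treated[t_id]
        # Closest remaining control; ties are broken by id.
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is matched at most once
    return pairs


# Toy scores (hypothetical, not taken from the study).
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.90}
control = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}
pairs = greedy_match(treated, control)
print(pairs)
```

In this toy run the patient with score 0.90 stays unmatched because no control falls within the caliper, which is one reason matched samples are usually smaller than the full cohort.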
What Do These Findings Mean?
These findings suggest that, among Japanese patients with out-of-hospital cardiac arrest, pre-hospital use of LR solution was associated with less chance of good functional outcomes at one month. However, the present study has several limitations. For example, data on in-hospital treatment following out-of-hospital cardiac arrest were not available, so the outcome differences between patients receiving and not receiving LR solution potentially could reflect differences in their in-hospital treatment. Moreover, although the researchers undertook a propensity analysis, this study, like all observational studies, can only partly control for selection bias and confounding factors. Thus, the observed associations between LR solution use and short- and long-term outcomes may actually reflect the effects of some unknown characteristic shared by the patients who received LR solution. Because of these and other limitations, it is essential that the findings of this study are verified before recommendations about the pre-hospital use of LR solution in patients with out-of-hospital cardiac arrest are made.
Additional Information
Please access these Web sites via the online version of this summary at
The US National Heart Lung and Blood Institute provides information on sudden cardiac arrest and on heart arrhythmias
The American Heart Association also provides information in several languages on sudden cardiac death and on arrhythmias; a selection of personal stories about arrhythmia and cardiac arrest is also available
The not-for-profit Sudden Cardiac Arrest Foundation provides information on all aspects of cardiac arrest, including survivor stories
MedlinePlus provides links to other resources about cardiac arrest (in English and Spanish)
PMCID: PMC3576391  PMID: 23431275
19.  Comparative Effectiveness of Linezolid and Vancomycin among a National Cohort of Patients Infected with Methicillin-Resistant Staphylococcus aureus 
Antimicrobial Agents and Chemotherapy  2010;54(10):4394-4400.
While newer antibiotics play a key role in treating methicillin-resistant Staphylococcus aureus (MRSA) infections, knowledge of their real-world clinical impact is limited. We sought to quantify the effectiveness of linezolid compared to that of vancomycin among MRSA-infected patients. This national retrospective cohort study included adult patients admitted to all Veterans Affairs hospitals between January 2002 and June 2008, infected with MRSA, and treated with either linezolid (oral or intravenous [i.v.]) or vancomycin (i.v.). Patients were followed from their treatment initiation date until the event of interest, discharge, death, or December 2008. Utilizing propensity score methods, we estimated the treatment effects of linezolid primarily on time to discharge and secondarily on time to all-cause in-hospital mortality, therapy discontinuation, and all-cause 90-day readmission with Cox proportional-hazard models. We identified 20,107 patients treated with linezolid (3.2%) or vancomycin (96.8%). Baseline covariates were well balanced by treatment group within propensity score quintiles and between propensity score matched patients (626 pairs). The discharge rate was significantly higher among patients treated with linezolid, representing a decreased length of stay, in both the propensity score adjusted (hazard ratio [HR], 1.38; 95% confidence interval [95% CI], 1.27 to 1.50) and matched (HR, 1.70; 95% CI, 1.44 to 2.00) analyses. A significantly decreased rate of therapy discontinuation, indicating longer therapy duration, was observed in the linezolid group (adjusted HR, 0.64; 95% CI, 0.54 to 0.75; matched HR, 0.49; 95% CI, 0.36 to 0.65). In this clinical population of MRSA-infected patients, linezolid therapy was as effective as vancomycin therapy with respect to in-hospital survival and readmission.
PMCID: PMC2944576  PMID: 20660681
20.  Analysis of Observational Studies in the Presence of Treatment Selection Bias: Effects of Invasive Cardiac Management on AMI Survival Using Propensity Score and Instrumental Variable Methods 
Comparisons of outcomes between patients treated and untreated in observational studies may be biased due to differences in patient prognosis between groups, often because of unobserved treatment selection biases.
To compare 4 analytic methods for removing the effects of selection bias in observational studies: multivariable model risk adjustment, propensity score risk adjustment, propensity-based matching, and instrumental variable analysis.
Design, Setting, and Patients
A national cohort of 122 124 patients who were elderly (aged 65–84 years), receiving Medicare, and hospitalized with acute myocardial infarction (AMI) in 1994–1995, and who were eligible for cardiac catheterization. Baseline chart reviews were taken from the Cooperative Cardiovascular Project and linked to Medicare health administrative data to provide a rich set of prognostic variables. Patients were followed up for 7 years through December 31, 2001, to assess the association between long-term survival and cardiac catheterization within 30 days of hospital admission.
Main Outcome Measure
Risk-adjusted relative mortality rate using each of the analytic methods.
Patients who received cardiac catheterization (n=73 238) were younger and had lower AMI severity than those who did not. After adjustment for prognostic factors by using standard statistical risk-adjustment methods, cardiac catheterization was associated with a 50% relative decrease in mortality (for multivariable model risk adjustment: adjusted relative risk [RR], 0.51; 95% confidence interval [CI], 0.50–0.52; for propensity score risk adjustment: adjusted RR, 0.54; 95% CI, 0.53–0.55; and for propensity-based matching: adjusted RR, 0.54; 95% CI, 0.52–0.56). Using regional catheterization rate as an instrument, instrumental variable analysis showed a 16% relative decrease in mortality (adjusted RR, 0.84; 95% CI, 0.79–0.90). The survival benefits of routine invasive care from randomized clinical trials are between 8% and 21%.
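The "relative decrease" figures quoted above follow from the reported relative risks as (1 − RR) × 100. A small sketch of that arithmetic, using the RR values from the abstract; the function name `relative_reduction_pct` is an assumption introduced for illustration.

```python
def relative_reduction_pct(rr):
    """Convert a relative risk into a percent relative reduction: (1 - RR) * 100."""
    return (1.0 - rr) * 100.0


# Adjusted relative risks as reported in the abstract.
estimates = {
    "multivariable model risk adjustment": 0.51,
    "propensity score risk adjustment": 0.54,
    "propensity-based matching": 0.54,
    "instrumental variable analysis": 0.84,
}

for method, rr in estimates.items():
    print(f"{method}: RR {rr:.2f} -> {relative_reduction_pct(rr):.0f}% relative decrease")
```

The three standard methods give reductions of 46% to 49%, which the abstract summarizes as roughly 50%, while the instrumental variable estimate gives 16%.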
Estimates of the observational association of cardiac catheterization with long-term AMI mortality are highly sensitive to analytic method. All standard risk-adjustment methods have the same limitations regarding removal of unmeasured treatment selection biases. Compared with standard modeling, instrumental variable analysis may produce less biased estimates of treatment effects, but is more suited to answering policy questions than specific clinical questions.
PMCID: PMC2170524  PMID: 17227979
21.  Utilization of the propensity score method: an exploratory comparison of proxy-completed to self-completed responses in the Medicare Health Outcomes Survey 
This research examined the use of the propensity score method to compare proxy-completed responses to self-completed responses in the first three baseline cohorts of the Medicare Health Outcomes Survey, administered in 1998, 1999, and 2000, respectively. A proxy is someone other than the respondent who completes the survey for the respondent.
The propensity score method of matched sampling was used to compare proxy and self-completed responses. A propensity score is a value that equals the estimated probability of a given individual belonging to a treatment group given the observed background characteristics of that individual. Proxy and self-completed responses were compared on demographics, the SF-36, chronic conditions, activities of daily living, and depression-screening questions. For each individual survey respondent, logistic regression was used to calculate the probability that this individual belonged to the proxy respondent group (propensity score). Pre and post adjustment comparisons were tested by calculating effect sizes.
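A pre- and post-adjustment comparison of this kind might look like the sketch below. The abstract does not say which effect size was used, so the standardized mean difference here is an assumption, as are the function name `standardized_difference` and the toy age values.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(group_a, group_b):
    """Standardized mean difference: difference in means over the pooled SD."""
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2.0)
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical ages, for illustration only.
proxy_ages = [78, 82, 85, 80, 88]           # proxy-completed responses
self_ages_all = [70, 72, 75, 68, 74]        # self-completed, before matching
self_ages_matched = [79, 81, 84, 80, 86]    # self-completed, after matching

d_before = standardized_difference(proxy_ages, self_ages_all)
d_after = standardized_difference(proxy_ages, self_ages_matched)
print(f"before matching: {d_before:.2f}, after matching: {d_after:.2f}")
```

A smaller absolute value after matching indicates that the two groups have become more similar on that characteristic.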
Differences between self and proxy-completed responses were substantially reduced with the use of the propensity score method. However, differences were still found in the SF-36, several demographics, several impaired activities of daily living, several chronic conditions, and one depression-screening question.
The propensity score method helped to reduce differences between proxy-completed and self-completed survey responses, thereby providing an approximation to a randomized controlled experiment of proxy-completed versus self-completed survey responses.
PMCID: PMC222919  PMID: 14570594
Propensity score; Medicare Health Outcomes Survey; elderly; proxy
22.  The use of propensity scores to assess the generalizability of results from randomized trials 
Randomized trials remain the most accepted design for estimating the effects of interventions, but they do not necessarily answer a question of primary interest: Will the program be effective in a target population in which it may be implemented? In other words, are the results generalizable? There has been very little statistical research on how to assess the generalizability, or “external validity,” of randomized trials. We propose the use of propensity-score-based metrics to quantify the similarity of the participants in a randomized trial and a target population. In this setting the propensity score model predicts participation in the randomized trial, given a set of covariates. The resulting propensity scores are used first to quantify the difference between the trial participants and the target population, and then to match, subclassify, or weight the control group outcomes to the population, assessing how well the propensity score-adjusted outcomes track the outcomes actually observed in the population. These metrics can serve as a first step in assessing the generalizability of results from randomized trials to target populations. This paper lays out these ideas, discusses the assumptions underlying the approach, and illustrates the metrics using data on the evaluation of a schoolwide prevention program called Positive Behavioral Interventions and Supports.
PMCID: PMC4051511  PMID: 24926156
Causal inference; External validity; Positive Behavioral Interventions and Supports; Research synthesis
23.  Causal Inference in Longitudinal Comparative Effectiveness Studies With Repeated Measures of A Continuous Intermediate Variable 
Statistics in medicine  2014;33(20):3509-3527.
We propose a principal stratification approach to assess causal effects in non-randomized longitudinal comparative effectiveness studies with a binary endpoint outcome and repeated measures of a continuous intermediate variable. Our method is an extension of the principal stratification approach by Lin et al. [10,11], originally proposed for a longitudinal randomized study to assess the treatment effect of a continuous outcome adjusting for the heterogeneity of a repeatedly measured binary intermediate variable. Our motivation for this work comes from a comparison of the effect of two glucose-lowering medications on a clinical cohort of patients with type 2 diabetes. Here we consider a causal inference problem assessing how well the two medications work relative to one another on two binary endpoint outcomes: cardiovascular disease related hospitalization and all-cause mortality. Clinically, these glucose-lowering medications can have differential effects on the intermediate outcome, glucose level over time. Ultimately we want to compare medication effects on the endpoint outcomes among individuals in the same glucose trajectory stratum while accounting for the heterogeneity in baseline covariates (i.e., to obtain “principal effects” on the endpoint outcomes). The proposed method involves a 3-step model estimation procedure. Step 1 identifies principal strata associated with the intermediate variable using hybrid growth mixture modeling analyses [13]. Step 2 obtains the stratum membership using the pseudoclass technique [17,18], and derives propensity scores for treatment assignment. Step 3 obtains the stratum-specific treatment effect on the endpoint outcome weighted by inverse propensity probabilities derived from Step 2.
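The weighting idea in Step 3 can be illustrated with a simple inverse-probability-weighted comparison of a binary outcome within one stratum. This is a generic sketch, not the authors' model: the normalized-weight estimator, the function name `ipw_risk_difference`, and the toy records are assumptions.

```python
def ipw_risk_difference(records):
    """Inverse-probability-weighted risk difference (treated minus comparison).

    records: iterable of (treated, outcome, propensity) tuples, where
    treated is a bool, outcome is 0 or 1, and propensity is the estimated
    probability of receiving the treatment (strictly between 0 and 1).
    """
    num_t = den_t = num_c = den_c = 0.0
    for treated, outcome, ps in records:
        if treated:
            weight = 1.0 / ps            # treated units weighted by 1 / e(x)
            num_t += weight * outcome
            den_t += weight
        else:
            weight = 1.0 / (1.0 - ps)    # comparison units by 1 / (1 - e(x))
            num_c += weight * outcome
            den_c += weight
    return num_t / den_t - num_c / den_c


# Toy records for a single stratum (hypothetical values).
stratum = [
    (True, 1, 0.8),
    (True, 0, 0.4),
    (False, 1, 0.2),
    (False, 0, 0.5),
    (False, 1, 0.5),
]
rd = ipw_risk_difference(stratum)
print(f"weighted risk difference: {rd:.3f}")
```

Repeating the calculation separately for each stratum would give stratum-specific estimates in the spirit of the procedure described above.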
PMCID: PMC4122661  PMID: 24577715
Causal inference; Comparative effectiveness studies; Growth mixture model; Principal stratification; Propensity score
24.  Propensity scores in the presence of effect modification: A case study using the comparison of mortality on hemodialysis versus peritoneal dialysis 
To control for confounding bias from non-random treatment assignment in observational data, both traditional multivariable models and more recently propensity score approaches have been applied. Our aim was to compare a propensity score-stratified model with a traditional multivariable-adjusted model, specifically in estimating survival of hemodialysis (HD) versus peritoneal dialysis (PD) patients.
Using the Dutch End-Stage Renal Disease Registry, we constructed a propensity score, predicting PD assignment from age, gender, primary renal disease, center of dialysis, and year of first renal replacement therapy. We developed two Cox proportional hazards regression models to estimate survival on PD relative to HD, a propensity score-stratified model stratifying on the propensity score and a multivariable-adjusted model, and tested several interaction terms in both models.
The propensity score performed well: it showed a reasonable fit, had a good c-statistic, calibrated well and balanced the covariates. The main-effects multivariable-adjusted model and the propensity score-stratified univariable Cox model resulted in similar relative mortality risk estimates of PD compared with HD (0.99 and 0.97, respectively) with fewer significant covariates in the propensity model. After introducing the missing interaction variables for effect modification in both models, the mortality risk estimates for both main effects and interactions remained comparable, but the propensity score model had nearly as many covariates because of the additional interaction variables.
Although the propensity score performed well, it did not alter the treatment effect in the outcome model and lost its advantage of parsimony in the presence of effect modification.
PMCID: PMC2890634  PMID: 20459823
25.  Effect of hospital level variation in the use of carotid artery stenting versus carotid endarterectomy on perioperative stroke and death in asymptomatic patients 
Journal of vascular surgery  2013;57(3):627-634.
Perioperative stroke and death (PSD) is more common after carotid artery stenting (CAS) than after carotid endarterectomy (CEA) in symptomatic patients, but it is unclear if this is also true in asymptomatic patients. Further, use of both CEA and CAS varies geographically, suggesting possible variation in outcomes. We compared odds of PSD after CAS and CEA in asymptomatic patients to determine the impact of this variation.
Design of Study
We identified CAS and CEA procedures and hospitals where they were performed in 2005–2009 California hospital discharge data. Preoperative symptom status and medical comorbidities were determined using administrative codes. We compared PSD rates after CAS and CEA using logistic regression and propensity score matching. We quantified hospital level variation in the relative utilization of CAS by calculating hospital-specific probabilities of CAS use among propensity score matched patients. We then calculated a weighted average for each hospital and used this as a predictor of PSD.
We identified 6,053 CAS and 36,524 CEA procedures that treated asymptomatic patients in 278 hospitals. PSD occurred in 250 CAS and 660 CEA patients, yielding unadjusted PSD rates of 4.1% and 1.8%, respectively (P<.001). Compared with CAS patients, CEA patients were more likely to be older than 70 (66% vs. 62%, P<.001), but less likely to have 3 or more Elixhauser comorbidities (37% vs. 39%, P<.001). Multivariate models demonstrated that CAS was associated with increased odds of PSD (OR 1.865, 95% CI 1.373–2.534, P<.001). Estimation of average treatment effects based on propensity scores also demonstrated 1.9% increased probability of PSD with CAS (P<.001). The average probability of receiving CAS across all hospitals and strata was 13.8%, but the inter-quartile range was 0.9%–21.5%, suggesting significant hospital level variation. In univariate analysis, patients treated at hospitals with higher CAS utilization had higher odds of PSD as compared to patients in hospitals that performed CAS less (OR 2.141, 95% CI 1.328–3.454, P=.002). Multivariate analysis did not demonstrate this effect, but again demonstrated higher odds of PSD after CAS (OR 1.963, 95% CI 1.393–2.765, P<.001).
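The unadjusted PSD rates above can be reproduced directly from the reported counts. The crude odds ratio in this sketch is computed here for illustration and is not a figure reported in the abstract; the variable names are assumptions.

```python
# Counts as reported in the abstract.
cas_events, cas_total = 250, 6053
cea_events, cea_total = 660, 36524

cas_rate = cas_events / cas_total
cea_rate = cea_events / cea_total

# Crude (unadjusted) odds ratio of PSD for CAS versus CEA.
crude_or = (cas_events / (cas_total - cas_events)) / (
    cea_events / (cea_total - cea_events)
)

print(f"CAS: {cas_rate:.1%}, CEA: {cea_rate:.1%}")
print(f"risk difference: {(cas_rate - cea_rate) * 100:.1f} percentage points")
print(f"crude odds ratio: {crude_or:.2f}")
```

The crude odds ratio is larger than the adjusted estimates reported in the abstract (1.865 and 1.963), which is consistent with part of the raw difference reflecting differences in patient characteristics.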
CEA has lower odds of PSD compared to CAS in asymptomatic patients. Increased utilization of CAS at the hospital level is associated with increased odds of PSD among asymptomatic patients, but this effect appears to be related to generally worse outcomes after CAS as compared to CEA.
PMCID: PMC3692978  PMID: 23312937
