Related Articles

1.  The performance of different propensity score methods for estimating marginal hazard ratios 
Statistics in Medicine  2012;32(16):2837-2849.
Propensity score methods are increasingly being used to reduce or minimize the effects of confounding when estimating the effects of treatments, exposures, or interventions when using observational or non-randomized data. Under the assumption of no unmeasured confounders, previous research has shown that propensity score methods allow for unbiased estimation of linear treatment effects (e.g., differences in means or proportions). However, in biomedical research, time-to-event outcomes occur frequently. There is a paucity of research into the performance of different propensity score methods for estimating the effect of treatment on time-to-event outcomes. Furthermore, propensity score methods allow for the estimation of marginal or population-average treatment effects. We conducted an extensive series of Monte Carlo simulations to examine the performance of propensity score matching (1:1 greedy nearest-neighbor matching within propensity score calipers), stratification on the propensity score, inverse probability of treatment weighting (IPTW) using the propensity score, and covariate adjustment using the propensity score to estimate marginal hazard ratios. We found that both propensity score matching and IPTW using the propensity score allow for the estimation of marginal hazard ratios with minimal bias. Of these two approaches, IPTW using the propensity score resulted in estimates with lower mean squared error when estimating the effect of treatment in the treated. Stratification on the propensity score and covariate adjustment using the propensity score result in biased estimation of both marginal and conditional hazard ratios. Applied researchers are encouraged to use propensity score matching and IPTW using the propensity score when estimating the relative effect of treatment on time-to-event outcomes. Copyright © 2012 John Wiley & Sons, Ltd.
doi:10.1002/sim.5705
PMCID: PMC3747460  PMID: 23239115
propensity score; survival analysis; inverse probability of treatment weighting (IPTW); Monte Carlo simulations; observational study; time-to-event outcomes
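The abstract above contrasts IPTW with matching and singles out the effect of treatment in the treated. As a minimal sketch, and not the authors' code, the textbook weights for the two estimands can be written as follows. The function name `iptw_weights` and its arguments are hypothetical, and the propensity scores are assumed to have been estimated already. Fitting the weighted survival model that would use these weights is not shown.

```python
def iptw_weights(treated, propensity, estimand="ATE"):
    """Return one inverse-probability-of-treatment weight per subject.

    treated    -- list of 0/1 treatment indicators
    propensity -- list of estimated P(treated = 1 | covariates), each in (0, 1)
    estimand   -- "ATE" (whole population) or "ATT" (effect in the treated)
    """
    weights = []
    for z, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if estimand == "ATE":
            # Treated subjects get 1/e, untreated subjects get 1/(1 - e).
            w = 1.0 / e if z == 1 else 1.0 / (1.0 - e)
        elif estimand == "ATT":
            # Treated subjects keep weight 1, untreated subjects get the odds e/(1 - e).
            w = 1.0 if z == 1 else e / (1.0 - e)
        else:
            raise ValueError("estimand must be 'ATE' or 'ATT'")
        weights.append(w)
    return weights
```

With ATT weights the untreated subjects are reweighted to resemble the treated group, which is the same target population that 1:1 matching addresses.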
2.  ESTIMATING TREATMENT EFFECTS ON HEALTHCARE COSTS UNDER EXOGENEITY: IS THERE A ‘MAGIC BULLET’? 
Methods for estimating average treatment effects, under the assumption of no unmeasured confounders, include regression models; propensity score adjustments using stratification, weighting, or matching; and doubly robust estimators (a combination of both). Researchers continue to debate the best estimator for outcomes such as health care cost data, as they are usually characterized by an asymmetric distribution and heterogeneous treatment effects. Challenges in finding the right specifications for regression models are well documented in the literature. Propensity score estimators are proposed as alternatives for overcoming these challenges. Using simulations, we find that in moderate size samples (n = 5000), balancing on propensity scores that are estimated from saturated specifications can balance the covariate means across treatment arms but fails to balance higher-order moments and covariances amongst covariates. Therefore, unlike regression models, even if a formal model for outcomes is not required, propensity score estimators can be inefficient at best and biased at worst for health care cost data. Our simulation study, designed to take a ‘proof by contradiction’ approach, proves that no one estimator can be considered the best under all data generating processes for outcomes such as costs. The inverse-propensity weighted estimator is most likely to be unbiased under alternate data generating processes but is prone to bias under misspecification of the propensity score model and is inefficient compared to an unbiased regression estimator. Our results show that there are no ‘magic bullets’ when it comes to estimating treatment effects in health care costs. Care should be taken before naively applying any one estimator to estimate average treatment effects in these data. We illustrate the performance of alternative methods in a cost dataset on breast cancer treatment.
doi:10.1007/s10742-011-0072-8
PMCID: PMC3244728  PMID: 22199462
Propensity score; non-linear regression; average treatment effect; health care costs
3.  The performance of different propensity-score methods for estimating differences in proportions (risk differences or absolute risk reductions) in observational studies 
Statistics in Medicine  2010;29(20):2137-2148.
Propensity score methods are increasingly being used to estimate the effects of treatments on health outcomes using observational data. There are four methods for using the propensity score to estimate treatment effects: covariate adjustment using the propensity score, stratification on the propensity score, propensity-score matching, and inverse probability of treatment weighting (IPTW) using the propensity score. When outcomes are binary, the effect of treatment on the outcome can be described using odds ratios, relative risks, risk differences, or the number needed to treat. Several clinical commentators suggested that risk differences and numbers needed to treat are more meaningful for clinical decision making than are odds ratios or relative risks. However, there is a paucity of information about the relative performance of the different propensity-score methods for estimating risk differences. We conducted a series of Monte Carlo simulations to examine this issue. We examined bias, variance estimation, coverage of confidence intervals, mean-squared error (MSE), and type I error rates. A doubly robust version of IPTW had superior performance compared with the other propensity-score methods. It resulted in unbiased estimation of risk differences, treatment effects with the lowest standard errors, confidence intervals with the correct coverage rates, and correct type I error rates. Stratification, matching on the propensity score, and covariate adjustment using the propensity score resulted in minor to modest bias in estimating risk differences. Estimators based on IPTW had lower MSE compared with other propensity-score methods. Differences between IPTW and propensity-score matching may reflect that these two methods estimate the average treatment effect and the average treatment effect for the treated, respectively. Copyright © 2010 John Wiley & Sons, Ltd.
doi:10.1002/sim.3854
PMCID: PMC3068290  PMID: 20108233
propensity score; observational study; binary data; risk difference; number needed to treat; matching; IPTW; inverse probability of treatment weighting; propensity-score matching
4.  Assessing Causality in the Association between Child Adiposity and Physical Activity Levels: A Mendelian Randomization Analysis 
PLoS Medicine  2014;11(3):e1001618.
Here, Timpson and colleagues performed a Mendelian Randomization analysis to determine whether childhood adiposity causally influences levels of physical activity. The results suggest that increased adiposity causes a reduction in physical activity in children; however, this study does not exclude lower physical activity also leading to increasing adiposity.
Please see later in the article for the Editors' Summary
Background
Cross-sectional studies have shown that objectively measured physical activity is associated with childhood adiposity, and a strong inverse dose–response association with body mass index (BMI) has been found. However, few studies have explored the extent to which this association reflects reverse causation. We aimed to determine whether childhood adiposity causally influences levels of physical activity using genetic variants reliably associated with adiposity to estimate causal effects.
Methods and Findings
The Avon Longitudinal Study of Parents and Children collected data on objectively assessed activity levels of 4,296 children at age 11 y with recorded BMI and genotypic data. We used 32 established genetic correlates of BMI combined in a weighted allelic score as an instrumental variable for adiposity to estimate the causal effect of adiposity on activity.
In observational analysis, a 3.3 kg/m² (one standard deviation) higher BMI was associated with 22.3 (95% CI, 17.0, 27.6) movement counts/min less total physical activity (p = 1.6×10⁻¹⁶), 2.6 (2.1, 3.1) min/d less moderate-to-vigorous-intensity activity (p = 3.7×10⁻²⁹), and 3.5 (1.5, 5.5) min/d more sedentary time (p = 5.0×10⁻⁴). In Mendelian randomization analyses, the same difference in BMI was associated with 32.4 (0.9, 63.9) movement counts/min less total physical activity (p = 0.04) (∼5.3% of the mean counts/minute), 2.8 (0.1, 5.5) min/d less moderate-to-vigorous-intensity activity (p = 0.04), and 13.2 (1.3, 25.2) min/d more sedentary time (p = 0.03). There was no strong evidence for a difference between instrumental variable estimates and observational estimates. Similar results were obtained using fat mass index. Low power and poor instrumentation of activity limited causal analysis of the influence of physical activity on BMI.
Conclusions
Our results suggest that increased adiposity causes a reduction in physical activity in children and support research into the targeting of BMI in efforts to increase childhood activity levels. Importantly, this does not exclude lower physical activity also leading to increased adiposity, i.e., bidirectional causation.
Editors' Summary
Background
The World Health Organization estimates that globally at least 42 million children under the age of five are obese. The World Health Organization recommends that all children undertake at least one hour of physical activity daily, on the basis that increased physical activity will reduce or prevent excessive weight gain in children and adolescents. In practice, while numerous studies have shown that body mass index (BMI) shows a strong inverse correlation with physical activity (i.e., active children are thinner than sedentary ones), exercise programs specifically targeted at obese children have had only very limited success in reducing weight. The reasons for this are not clear, although environmental factors such as watching television and lack of exercise facilities are traditionally blamed.
Why Was This Study Done?
One of the reasons why obese children do not lose weight through exercise might be that being fat in itself leads to a decrease in physical activity. This is termed reverse causation, i.e., obesity causes sedentary behavior, rather than the other way around. The potential influence of environmental factors (e.g., lack of opportunity to exercise) makes it difficult to prove this argument. Recent research has demonstrated that specific genotypes are related to obesity in children. Specific variations within the DNA of individual genes (single nucleotide polymorphisms, or SNPs) are more common in obese individuals and predispose to greater adiposity across the weight distribution. While adiposity itself can be influenced by many environmental factors that complicate the interpretation of observed associations, at the population level, genetic variation is not related to the same factors, and over the life course cannot be changed. Investigations that exploit these properties of genetic associations to inform the interpretation of observed associations are termed Mendelian randomization studies. This research technique is used to reduce the influence of confounding environmental factors on an observed clinical condition. The authors of this study use Mendelian randomization to determine whether a genetic tendency towards high BMI and fat mass is correlated with reduced levels of physical activity in a large cohort of children.
What Did the Researchers Do and Find?
The researchers looked at a cohort of children from a large long-term health research project (the Avon Longitudinal Study of Parents and Children). BMI and total body fat were recorded. Total daily activity was measured via a small movement-counting device. In addition, the participants underwent genotyping to detect the presence of several SNPs known to be linked to obesity. For each child a total BMI allelic score was determined based on the number of obesity-related genetic variants carried by that individual. The association between obesity and reduced physical activity was then studied in two ways. Direct correlation between actual BMI and physical activity was measured (observational data). Separately, the link between BMI allelic score and physical activity was also determined (Mendelian randomization or instrumental variable analysis). The observational data showed that boys were more active than girls and had lower BMI. Across both sexes, a higher-than-average BMI was associated with lower daily activity. In genetic analyses, allelic score had a positive correlation with BMI, with one particular SNP being most strongly linked to high BMI and total fat mass. A high allelic score for BMI was also correlated with lower levels of daily physical activity. The authors conclude that children who are obese and have an inherent predisposition to high BMI also have a propensity to reduced levels of physical activity, which may compound their weight gain.
What Do These Findings Mean?
This study provides evidence that being fat is in itself a risk factor for low activity levels, separately from external environmental influences. This may be an example of “reverse causation,” i.e., high BMI causes a reduction in physical activity. Alternatively, there may be a bidirectional causality, so that those with a genetic predisposition to high fat mass exercise less, leading to higher BMI, and so on, in a vicious circle. A significant limitation of the study is that validated allelic scores for physical activity are not available. Thus, it is not possible to determine whether individuals with a high allelic score for BMI also have a propensity to exercise less, or whether it is simply the circumstance of being overweight that discourages activity. This study does suggest that trying to persuade obese children to lose weight by exercising more is likely to be ineffective unless additional strategies to reduce BMI, such as strict diet control, are also implemented.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001618.
The US Centers for Disease Control and Prevention provides obesity-related statistics, details of prevention programs, and an overview on public health strategy in the United States
A more worldwide view is given by the World Health Organization
The UK National Health Service website gives information on physical activity guidelines for different age groups
The International Obesity Task Force is a network of organizations that seeks to alert the world to the growing health crisis threatened by soaring levels of obesity
MedlinePlus—which brings together authoritative information from the US National Library of Medicine, National Institutes of Health, and other government agencies and health-related organizations—has a page on obesity
Additional information on the Avon Longitudinal Study of Parents and Children is available
The British Medical Journal has an article that describes Mendelian randomization
doi:10.1371/journal.pmed.1001618
PMCID: PMC3958348  PMID: 24642734
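The abstract above uses an allelic score as an instrumental variable for adiposity but does not say which estimator was fitted. One common choice for a single instrument is the ratio (Wald) estimator, sketched here purely as an illustration and not as the study's method. The names `wald_ratio`, `instrument`, `exposure`, and `outcome` are hypothetical.

```python
def _covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def wald_ratio(instrument, exposure, outcome):
    """Ratio estimator: cov(instrument, outcome) / cov(instrument, exposure).

    instrument -- e.g. an allelic score per child
    exposure   -- e.g. BMI per child
    outcome    -- e.g. activity counts per child
    """
    denom = _covariance(instrument, exposure)
    if denom == 0:
        raise ValueError("instrument is uncorrelated with the exposure")
    return _covariance(instrument, outcome) / denom
```

The estimate is interpretable as a causal effect only if the instrument affects the outcome solely through the exposure and shares no causes with the outcome. Those are assumptions about the data, not something the code checks.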
5.  Using Ensemble-Based Methods for Directly Estimating Causal Effects: An Investigation of Tree-Based G-Computation 
Multivariate behavioral research  2012;47(1):115-135.
Researchers are increasingly using observational or nonrandomized data to estimate causal treatment effects. Essential to the production of high-quality evidence is the ability to reduce or minimize the confounding that frequently occurs in observational studies. When using the potential outcome framework to define causal treatment effects, one requires the potential outcome under each possible treatment. However, only the outcome under the actual treatment received is observed, whereas the potential outcomes under the other treatments are considered missing data. Some authors have proposed that parametric regression models be used to estimate potential outcomes. In this study, we examined the use of ensemble-based methods (bagged regression trees, random forests, and boosted regression trees) to directly estimate average treatment effects by imputing potential outcomes. We used an extensive series of Monte Carlo simulations to estimate bias, variance, and mean squared error of treatment effects estimated using different ensemble methods. For comparative purposes, we compared the performance of these methods with inverse probability of treatment weighting using the propensity score when logistic regression or ensemble methods were used to estimate the propensity score. Using boosted regression trees of depth 3 or 4 to impute potential outcomes tended to result in estimates with bias equivalent to that of the best performing methods. Using an empirical case study, we compared inferences on the effect of in-hospital smoking cessation counseling on subsequent mortality in patients hospitalized with an acute myocardial infarction.
doi:10.1080/00273171.2012.640600
PMCID: PMC3293511  PMID: 22419832 CAMSID: cams2143
6.  Variance reduction in randomised trials by inverse probability weighting using the propensity score 
Statistics in Medicine  2013;33(5):721-737.
In individually randomised controlled trials, adjustment for baseline characteristics is often undertaken to increase precision of the treatment effect estimate. This is usually performed using covariate adjustment in outcome regression models. An alternative method of adjustment is to use inverse probability-of-treatment weighting (IPTW), on the basis of estimated propensity scores. We calculate the large-sample marginal variance of IPTW estimators of the mean difference for continuous outcomes, and risk difference, risk ratio or odds ratio for binary outcomes. We show that IPTW adjustment always increases the precision of the treatment effect estimate. For continuous outcomes, we demonstrate that the IPTW estimator has the same large-sample marginal variance as the standard analysis of covariance estimator. However, ignoring the estimation of the propensity score in the calculation of the variance leads to the erroneous conclusion that the IPTW treatment effect estimator has the same variance as an unadjusted estimator; thus, it is important to use a variance estimator that correctly takes into account the estimation of the propensity score. The IPTW approach has particular advantages when estimating risk differences or risk ratios. In this case, non-convergence of covariate-adjusted outcome regression models frequently occurs. Such problems can be circumvented by using the IPTW adjustment approach. © 2013 The authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
doi:10.1002/sim.5991
PMCID: PMC4285308  PMID: 24114884
variance estimation; baseline adjustment
7.  Overadjustment Bias and Unnecessary Adjustment in Epidemiologic Studies 
Epidemiology (Cambridge, Mass.)  2009;20(4):488-495.
Overadjustment is defined inconsistently. This term is meant to describe control (eg, by regression adjustment, stratification, or restriction) for a variable that either increases net bias or decreases precision without affecting bias. We define overadjustment bias as control for an intermediate variable (or a descending proxy for an intermediate variable) on a causal path from exposure to outcome. We define unnecessary adjustment as control for a variable that does not affect bias of the causal relation between exposure and outcome but may affect its precision. We use causal diagrams and an empirical example (the effect of maternal smoking on neonatal mortality) to illustrate and clarify the definition of overadjustment bias, and to distinguish overadjustment bias from unnecessary adjustment. Using simulations, we quantify the amount of bias associated with overadjustment. Moreover, we show that this bias is based on a different causal structure from confounding or selection biases. Overadjustment bias is not a finite sample bias, while inefficiencies due to control for unnecessary variables are a function of sample size.
doi:10.1097/EDE.0b013e3181a819a1
PMCID: PMC2744485  PMID: 19525685
8.  Weight Trimming and Propensity Score Weighting 
PLoS ONE  2011;6(3):e18174.
Propensity score weighting is sensitive to model misspecification and outlying weights that can unduly influence results. The authors investigated whether trimming large weights downward can improve the performance of propensity score weighting and whether the benefits of trimming differ by propensity score estimation method. In a simulation study, the authors examined the performance of weight trimming following logistic regression, classification and regression trees (CART), boosted CART, and random forests to estimate propensity score weights. Results indicate that although misspecified logistic regression propensity score models yield increased bias and standard errors, weight trimming following logistic regression can improve the accuracy and precision of final parameter estimates. In contrast, weight trimming did not improve the performance of boosted CART and random forests. The performance of boosted CART and random forests without weight trimming was similar to the best performance obtainable by weight trimmed logistic regression estimated propensity scores. While trimming may be used to optimize propensity score weights estimated using logistic regression, the optimal level of trimming is difficult to determine. These results indicate that although trimming can improve inferences in some settings, in order to consistently improve the performance of propensity score weighting, analysts should focus on the procedures leading to the generation of weights (i.e., proper specification of the propensity score model) rather than relying on ad-hoc methods such as weight trimming.
doi:10.1371/journal.pone.0018174
PMCID: PMC3069059  PMID: 21483818
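The trimming described above, pulling large weights downward, can be sketched as capping weights at a chosen percentile. This is an illustrative sketch only: the abstract does not state which trimming levels were studied, so the 99th-percentile default and the name `trim_weights` are assumptions. The percentile is computed with the nearest-rank rule.

```python
import math


def trim_weights(weights, upper_percentile=99.0):
    """Cap every weight above the chosen percentile at that percentile's value."""
    if not 0.0 < upper_percentile <= 100.0:
        raise ValueError("upper_percentile must be in (0, 100]")
    if not weights:
        return []
    ordered = sorted(weights)
    # Nearest-rank percentile: the smallest observed weight such that at least
    # upper_percentile percent of the weights are at or below it.
    rank = max(1, math.ceil(upper_percentile / 100.0 * len(ordered)))
    cap = ordered[rank - 1]
    return [min(w, cap) for w in weights]
```

Lowering `upper_percentile` trims more aggressively. The abstract cautions that the optimal level is hard to determine, so any cutoff should be examined in a sensitivity analysis.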
9.  Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies 
Pharmaceutical Statistics  2010;10(2):150-161.
In a study comparing the effects of two treatments, the propensity score is the probability of assignment to one treatment conditional on a subject's measured baseline covariates. Propensity-score matching is increasingly being used to estimate the effects of exposures using observational data. In the most common implementation of propensity-score matching, pairs of treated and untreated subjects are formed whose propensity scores differ by at most a pre-specified amount (the caliper width). There has been little research into the optimal caliper width. We conducted an extensive series of Monte Carlo simulations to determine the optimal caliper width for estimating differences in means (for continuous outcomes) and risk differences (for binary outcomes). When estimating differences in means or risk differences, we recommend that researchers match on the logit of the propensity score using calipers of width equal to 0.2 of the standard deviation of the logit of the propensity score. When at least some of the covariates were continuous, then either this value, or one close to it, minimized the mean square error of the resultant estimated treatment effect. It also eliminated at least 98% of the bias in the crude estimator, and it resulted in confidence intervals with approximately the correct coverage rates. Furthermore, the empirical type I error rate was approximately correct. When all of the covariates were binary, then the choice of caliper width had a much smaller impact on the performance of estimation of risk differences and differences in means.
doi:10.1002/pst.433
PMCID: PMC3120982  PMID: 20925139
propensity score; observational study; binary data; risk difference; propensity-score matching; Monte Carlo simulations; bias; matching
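The recommendation above, matching on the logit of the propensity score with a caliper of 0.2 standard deviations, can be sketched as a greedy 1:1 matcher. This is a hedged illustration rather than the authors' implementation. The function names are hypothetical, the standard deviation is taken over all subjects (the abstract does not say how it is pooled), and treated subjects are processed in the order given.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def caliper_match(treated_ps, control_ps, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (treated_index, control_index) pairs whose logit propensity
    scores differ by at most caliper_sd standard deviations.
    """
    t_logit = [logit(p) for p in treated_ps]
    c_logit = [logit(p) for p in control_ps]
    # Assumption: SD of the logit computed over the whole sample.
    caliper = caliper_sd * statistics.stdev(t_logit + c_logit)
    available = set(range(len(c_logit)))
    pairs = []
    for i, lt in enumerate(t_logit):
        if not available:
            break
        # Closest still-unmatched control; sorted() makes ties deterministic.
        j = min(sorted(available), key=lambda k: abs(c_logit[k] - lt))
        if abs(c_logit[j] - lt) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs
```

Treated subjects with no control inside the caliper are left unmatched, so the matched sample can be smaller than the treated group.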
10.  Constructing Inverse Probability Weights for Marginal Structural Models 
American Journal of Epidemiology  2008;168(6):656-664.
The method of inverse probability weighting (henceforth, weighting) can be used to adjust for measured confounding and selection bias under the four assumptions of consistency, exchangeability, positivity, and no misspecification of the model used to estimate weights. In recent years, several published estimates of the effect of time-varying exposures have been based on weighted estimation of the parameters of marginal structural models because, unlike standard statistical methods, weighting can appropriately adjust for measured time-varying confounders affected by prior exposure. As an example, the authors describe the last three assumptions using the change in viral load due to initiation of antiretroviral therapy among 918 human immunodeficiency virus-infected US men and women followed for a median of 5.8 years between 1996 and 2005. The authors describe possible tradeoffs that an epidemiologist may encounter when attempting to make inferences. For instance, a tradeoff between bias and precision is illustrated as a function of the extent to which confounding is controlled. Weight truncation is presented as an informal and easily implemented method to deal with these tradeoffs. Inverse probability weighting provides a powerful methodological tool that may uncover causal effects of exposures that are otherwise obscured. However, as with all methods, diagnostics and sensitivity analyses are essential for proper use.
doi:10.1093/aje/kwn164
PMCID: PMC2732954  PMID: 18682488
bias (epidemiology); causality; confounding factors (epidemiology); probability weighting; regression model
11.  Confounding control in a non-experimental study of STAR*D data: Logistic regression balanced covariates better than boosted CART 
Annals of epidemiology  2013;23(4):204-209.
Purpose
Propensity scores, a powerful bias-reduction tool, can balance treatment groups on measured covariates in non-experimental studies. We demonstrate the use of multiple propensity score estimation methods to optimize covariate balance.
Methods
We used secondary data from 1,292 adults with non-psychotic major depressive disorder in the Sequenced Treatment Alternatives to Relieve Depression trial (2001–2004). After initial citalopram treatment failed, patient preference influenced assignment to medication augmentation (n=565) or switch (n=727). To reduce selection bias, we used boosted classification and regression trees (BCART) and logistic regression iteratively to identify two potentially optimal propensity scores. We assessed and compared covariate balance.
Results
After iterative selection of interaction terms to minimize imbalance, logistic regression yielded better balance than BCART (average standardized absolute mean difference across 47 covariates: 0.03 vs. 0.08, matching; 0.02 vs. 0.05, weighting).
Conclusions
Comparing multiple propensity score estimates is a pragmatic way to optimize balance. Logistic regression remains valuable for this purpose. Simulation studies are needed to compare propensity score models under varying conditions. Such studies should consider more flexible estimation methods, such as logistic models with automated selection of interactions or hybrid models using main effects logistic regression instead of a constant log-odds as the initial model for BCART.
doi:10.1016/j.annepidem.2013.01.004
PMCID: PMC3773847  PMID: 23419508
propensity score; statistics as topic; models, statistical; epidemiologic methods; estimation techniques
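The balance metric reported above, the average standardized absolute mean difference across covariates, can be sketched for an unweighted (for example, matched) sample. The abstract does not give the exact formula, so the denominator used here, the square root of the average of the two group variances, is an assumption based on common practice. All names are hypothetical.

```python
import math
import statistics


def standardized_abs_mean_diff(treated_values, control_values):
    """|mean difference| divided by the pooled SD of the two groups."""
    mean_diff = statistics.mean(treated_values) - statistics.mean(control_values)
    pooled_sd = math.sqrt(
        (statistics.variance(treated_values) + statistics.variance(control_values)) / 2.0
    )
    if pooled_sd == 0:
        raise ValueError("covariate has zero variance in both groups")
    return abs(mean_diff) / pooled_sd


def average_asmd(treated_covariates, control_covariates):
    """Average the metric over covariates.

    Each argument maps a covariate name to that group's list of values.
    """
    diffs = [
        standardized_abs_mean_diff(treated_covariates[name], control_covariates[name])
        for name in treated_covariates
    ]
    return sum(diffs) / len(diffs)
```

For a weighted sample the means and variances would need to be weighted as well, which this sketch does not do.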
12.  A Tutorial and Case Study in Propensity Score Analysis: An Application to Estimating the Effect of In-Hospital Smoking Cessation Counseling on Mortality 
Multivariate behavioral research  2011;46(1):119-151.
Propensity score methods allow investigators to estimate causal treatment effects using observational or nonrandomized data. In this article we provide a practical illustration of the appropriate steps in conducting propensity score analyses. For illustrative purposes, we use a sample of current smokers who were discharged alive after being hospitalized with a diagnosis of acute myocardial infarction. The exposure of interest was receipt of smoking cessation counseling prior to hospital discharge and the outcome was mortality within 3 years of hospital discharge. We illustrate the following concepts: first, how to specify the propensity score model; second, how to match treated and untreated participants on the propensity score; third, how to compare the similarity of baseline characteristics between treated and untreated participants after stratifying on the propensity score, in a sample matched on the propensity score, or in a sample weighted by the inverse probability of treatment; fourth, how to estimate the effect of treatment on outcomes when using propensity score matching, stratification on the propensity score, inverse probability of treatment weighting using the propensity score, or covariate adjustment using the propensity score. Finally, we compare the results of the propensity score analyses with those obtained using conventional regression adjustment.
doi:10.1080/00273171.2011.540480
PMCID: PMC3266945  PMID: 22287812 CAMSID: cams1834
13.  A Tutorial and Case Study in Propensity Score Analysis: An Application to Estimating the Effect of In-Hospital Smoking Cessation Counseling on Mortality 
Multivariate Behavioral Research  2011;46(1):119-151.
Propensity score methods allow investigators to estimate causal treatment effects using observational or nonrandomized data. In this article we provide a practical illustration of the appropriate steps in conducting propensity score analyses. For illustrative purposes, we use a sample of current smokers who were discharged alive after being hospitalized with a diagnosis of acute myocardial infarction. The exposure of interest was receipt of smoking cessation counseling prior to hospital discharge and the outcome was mortality within 3 years of hospital discharge. We illustrate the following concepts: first, how to specify the propensity score model; second, how to match treated and untreated participants on the propensity score; third, how to compare the similarity of baseline characteristics between treated and untreated participants after stratifying on the propensity score, in a sample matched on the propensity score, or in a sample weighted by the inverse probability of treatment; fourth, how to estimate the effect of treatment on outcomes when using propensity score matching, stratification on the propensity score, inverse probability of treatment weighting using the propensity score, or covariate adjustment using the propensity score. Finally, we compare the results of the propensity score analyses with those obtained using conventional regression adjustment.
doi:10.1080/00273171.2011.540480
PMCID: PMC3266945  PMID: 22287812
14.  Using imputed pre-treatment cholesterol in a propensity score model to reduce confounding by indication: results from the multi-ethnic study of atherosclerosis 
Background
Studying the effects of medications on endpoints in an observational setting is an important yet challenging problem due to confounding by indication. The purpose of this study is to describe methodology for estimating such effects while including prevalent medication users. These techniques are illustrated in models relating statin use to cardiovascular disease (CVD) in a large multi-ethnic cohort study.
Methods
The Multi-Ethnic Study of Atherosclerosis (MESA) includes 6814 participants aged 45-84 years free of CVD. Confounding by indication was mitigated using a two-step approach. First, the untreated values of cholesterol were treated as missing data and the values imputed as a function of the observed treated value, dose and type of medication, and participant characteristics. Second, we constructed a propensity score modeling the probability of medication initiation as a function of measured covariates and estimated pre-treatment cholesterol value. The effect of statins on CVD endpoints was assessed using weighted Cox proportional hazards models with inverse probability weights based on the propensity score.
Results
Based on a meta-analysis of randomized controlled trials (RCT) statins are associated with a reduced risk of CVD (relative risk ratio = 0.73, 95% CI: 0.70, 0.77). In an unweighted Cox model adjusting for traditional risk factors we observed little association of statins with CVD (hazard ratio (HR) = 0.97, 95% CI: 0.60, 1.59). Using weights based on a propensity model for statins that did not include the estimated pre-treatment cholesterol we observed a slight protective association (HR = 0.92, 95% CI: 0.54-1.57). Results were similar using a new-user design where prevalent users of statins are excluded (HR = 0.91, 95% CI: 0.45-1.80). Using weights based on a propensity model with estimated pre-treatment cholesterol the effects of statins (HR = 0.74, 95% CI: 0.38, 1.42) were consistent with the RCT literature.
Conclusions
The imputation of pre-treatment cholesterol levels for participants on medication at baseline, in conjunction with a propensity score, yielded estimates that were consistent with the RCT literature. These techniques could be useful in any setting where inclusion of participants exposed at baseline in the analysis is desirable and reasonable estimates of pre-exposure biomarker values can be obtained.
doi:10.1186/1471-2288-13-81
PMCID: PMC3694006  PMID: 23800038
Multiple imputation; Confounding by indication; Propensity score; Inverse probability of treatment weights; Statins
15.  Application of a Propensity Score Approach for Risk Adjustment in Profiling Multiple Physician Groups on Asthma Care 
Health Services Research  2005;40(1):253-278.
Objectives
To develop a propensity score-based risk adjustment method to estimate the performance of 20 physician groups and to compare performance rankings using our method to a standard hierarchical regression-based risk adjustment method.
Data Sources/Study Setting
Mailed survey of patients from 20 California physician groups between July 1998 and February 1999.
Study Design
A cross-sectional analysis of physician group performance using patient satisfaction with asthma care. We compared the performance of the 20 physician groups using a novel propensity score-based risk adjustment method. More specifically, by using a multinomial logistic regression model we estimated for each patient the propensity scores, or probabilities, of having been treated by each of the 20 physician groups. To adjust for different distributions of characteristics across groups, patients cared for by a given group were first stratified into five strata based on their propensity of being in that group. Then, strata-specific performance was combined across the five strata. We compared our propensity score method to hierarchical model-based risk adjustment without using propensity scores. The impact of different risk-adjustment methods on performance was measured in terms of percentage changes in absolute and quintile ranking (AR, QR), and weighted κ of agreement on QR.
Results
The propensity score-based risk adjustment method balanced the distributions of all covariates among the 20 physician groups, providing evidence for validity. The propensity score-based method and the hierarchical model-based method without propensity scores provided substantially different rankings (75 percent of groups differed in AR, 50 percent differed in QR, weighted κ=0.69).
Conclusions
We developed and tested a propensity score method for profiling multiple physician groups. We found that our method could balance the distributions of covariates across groups and yielded substantially different profiles compared with conventional methods. Propensity score-based risk adjustment should be considered in studies examining quality comparisons.
doi:10.1111/j.1475-6773.2005.00352.x
PMCID: PMC1361136  PMID: 15663712
Physician group; profiling; propensity score; regression-to-the-mean; risk adjustment
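The stratify-then-combine step described in the study design above can be sketched roughly as follows. This is only an illustration under stated assumptions: strata are equal-size groups formed by ranking patients on the propensity of belonging to one group, strata are combined with equal weights, and strata without any group members are skipped. The abstract does not state how the authors combined strata, and the function name is hypothetical.

```python
def stratified_group_mean(propensity, in_group, outcome, n_strata=5):
    """Average of stratum-specific outcome means for one physician group.

    All patients are ranked by their propensity of belonging to the group
    and cut into `n_strata` equal-size strata; within each stratum only the
    group's own patients contribute to the stratum mean.
    """
    order = sorted(range(len(propensity)), key=lambda i: propensity[i])
    n = len(order)
    stratum_means = []
    for s in range(n_strata):
        members = order[s * n // n_strata:(s + 1) * n // n_strata]
        values = [outcome[i] for i in members if in_group[i]]
        if values:  # assumption: strata without group members are skipped
            stratum_means.append(sum(values) / len(values))
    if not stratum_means:
        raise ValueError("no group members found in any stratum")
    return sum(stratum_means) / len(stratum_means)
```

Because each stratum counts equally, the adjusted mean can differ from the raw group mean when the group's patients are concentrated in a few strata.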
16.  A Tutorial on Propensity Score Estimation for Multiple Treatments Using Generalized Boosted Models 
Statistics in medicine  2013;32(19):3388-3414.
The use of propensity scores to control for pretreatment imbalances on observed variables in non-randomized or observational studies examining the causal effects of treatments or interventions has become widespread over the past decade. For settings with two conditions of interest such as a treatment and a control, inverse probability of treatment weighted (IPTW) estimation with propensity scores estimated via boosted models has been shown in simulation studies to yield causal effect estimates with desirable properties. There are tools (e.g., the twang package in R) and guidance for implementing this method with two treatments. However, there is no such guidance for analyses of three or more treatments. The goals of this paper are two-fold: (1) to provide step-by-step guidance for researchers who want to implement propensity score weighting for multiple treatments and (2) to propose the use of generalized boosted models (GBM) for estimation of the necessary propensity score weights. We define the causal quantities that may be of interest to studies of multiple treatments and derive weighted estimators of those quantities. We present a detailed plan for using GBM to estimate propensity scores and using those scores to estimate weights and causal effects. Tools for assessing balance and overlap of pretreatment variables among treatment groups in the context of multiple treatments are also provided. A case study examining the effects of three treatment programs for adolescent substance abuse demonstrates the methods.
doi:10.1002/sim.5753
PMCID: PMC3710547  PMID: 23508673
Causal Effects; Causal Modeling; GBM; Inverse Probability of Treatment Weighting; TWANG
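For readers who want a concrete picture of weighting with more than two treatments, the sketch below assumes that each subject already has an estimated probability for every treatment (however obtained; the paper proposes GBM, which is not reproduced here). It weights each subject by the inverse of the probability of the treatment actually received and then forms a weighted mean outcome per treatment. The names and data layout are hypothetical.

```python
def multi_treatment_weights(assigned, probabilities):
    """Inverse of the estimated probability of the treatment actually received.

    `assigned[i]` is a treatment label; `probabilities[i]` is a dict mapping
    every treatment label to that subject's estimated probability.
    """
    weights = []
    for label, probs in zip(assigned, probabilities):
        p = probs[label]
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
        weights.append(1.0 / p)
    return weights


def weighted_mean_by_treatment(assigned, outcome, weights):
    """Weighted mean outcome within each treatment group."""
    totals, denominators = {}, {}
    for label, y, w in zip(assigned, outcome, weights):
        totals[label] = totals.get(label, 0.0) + w * y
        denominators[label] = denominators.get(label, 0.0) + w
    return {label: totals[label] / denominators[label] for label in totals}
```

Pairwise differences between the weighted means then give one possible estimate of population-average contrasts, provided overlap and balance look acceptable.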
17.  Comparing paired vs non-paired statistical methods of analyses when making inferences about absolute risk reductions in propensity-score matched samples 
Statistics in Medicine  2011;30(11):1292-1301.
Propensity-score matching allows one to reduce the effects of treatment-selection bias or confounding when estimating the effects of treatments when using observational data. Some authors have suggested that methods of inference appropriate for independent samples can be used for assessing the statistical significance of treatment effects when using propensity-score matching. Indeed, many authors in the applied medical literature use methods for independent samples when making inferences about treatment effects using propensity-score matched samples. Dichotomous outcomes are common in healthcare research. In this study, we used Monte Carlo simulations to examine the effect on inferences about risk differences (or absolute risk reductions) when statistical methods for independent samples are used compared with when statistical methods for paired samples are used in propensity-score matched samples. We found that compared with using methods for independent samples, the use of methods for paired samples resulted in: (i) empirical type I error rates that were closer to the advertised rate; (ii) empirical coverage rates of 95 per cent confidence intervals that were closer to the advertised rate; (iii) narrower 95 per cent confidence intervals; and (iv) estimated standard errors that more closely reflected the sampling variability of the estimated risk difference. Differences between the empirical and advertised performance of methods for independent samples were greater when the treatment-selection process was stronger compared with when treatment-selection process was weaker. We recommend using statistical methods for paired samples when using propensity-score matched samples for making inferences on the effect of treatment on the reduction in the probability of an event occurring. Copyright © 2011 John Wiley & Sons, Ltd.
doi:10.1002/sim.4200
PMCID: PMC3110307  PMID: 21337595
propensity score; propensity-score matching; risk difference; absolute risk reduction; Monte Carlo simulations; statistical inference; hypothesis testing; type I error rate; categorical data analysis
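The contrast between paired and independent-sample inference can be made concrete with a small sketch. It assumes a 1:1 matched sample summarized by four pair counts: `a` (event in both members), `b` (event in the treated member only), `c` (event in the untreated member only), and `d` (event in neither). The paired standard error uses the standard variance of a difference in correlated proportions. Whether this is exactly the estimator evaluated in the article is an assumption, and the function names are hypothetical.

```python
import math


def risk_difference_paired(a, b, c, d):
    """Risk difference and its standard error, respecting the matched pairs."""
    n = a + b + c + d
    diff = (b - c) / n
    variance = ((b + c) - (b - c) ** 2 / n) / n ** 2
    return diff, math.sqrt(variance)


def risk_difference_independent(a, b, c, d):
    """Same point estimate, but treating the two arms as independent samples."""
    n = a + b + c + d
    p_treated = (a + b) / n
    p_untreated = (a + c) / n
    diff = p_treated - p_untreated
    variance = p_treated * (1 - p_treated) / n + p_untreated * (1 - p_untreated) / n
    return diff, math.sqrt(variance)
```

When outcomes within pairs are positively correlated, the paired standard error is smaller than the independent-sample one, which matches the narrower confidence intervals reported in the abstract.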
18.  Propensity score based comparison of long term outcomes with 3D conformal radiotherapy (3DCRT) versus Intensity Modulated Radiation Therapy (IMRT) in the treatment of esophageal cancer 
Purpose
Although 3DCRT is the worldwide standard for the treatment of esophageal cancers, IMRT improves dose conformality and reduces radiation exposure to normal tissues. We hypothesized that the dosimetric advantages of IMRT should translate to substantive benefits in clinical outcomes compared to 3DCRT.
Methods and Materials
Analysis was performed on 676 nonrandomized patients (3DCRT=413, IMRT=263) with stage Ib-IVa (AJCC 2002) esophageal cancers treated with chemoradiation at a single institution from 1998–2008. An inverse probability of treatment weighting (IPW) and inclusion of propensity score (treatment probability) as a covariate were used to compare overall survival (OS) time, time to local failure, and time to distant metastasis, while accounting for effects of other clinically relevant covariates. Propensity scores were estimated using logistic regression.
Results
A fitted multivariate inverse probability weighted (IPW)-adjusted Cox model showed that OS time was significantly associated with several well-known prognostic factors, along with radiation modality (IMRT vs 3DCRT, HR=0.72, p<0.001). Compared with IMRT patients, 3DCRT patients had a significantly greater risk of dying (72.6% vs 52.9%, IPW log rank test: p<0.0001) and of local-regional recurrence (LRR) (p=0.0038). There was no difference in cancer-specific mortality (Gray’s test, p=0.86), or distant metastasis (p=0.99) between the two groups. An increased cumulative incidence of cardiac deaths was seen in the 3DCRT group (p=0.049), but most deaths were undocumented (5 year estimate: 11.7% in 3DCRT vs 5.4% in IMRT, Gray’s test, p=0.0029).
Conclusions
Overall survival, locoregional control, and non-cancer related deaths were significantly better for IMRT compared to 3DCRT. Although these results need confirmation, IMRT should be considered for the treatment of esophageal cancer.
doi:10.1016/j.ijrobp.2012.02.015
PMCID: PMC3923623  PMID: 22867894
IMRT; 3D-conformal radiation therapy; chemoradiation; esophageal cancer; propensity score
19.  Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples 
Statistics in Medicine  2009;28(25):3083-3107.
The propensity score is a subject's probability of treatment, conditional on observed baseline covariates. Conditional on the true propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Propensity-score matching is a popular method of using the propensity score in the medical literature. Using this approach, matched sets of treated and untreated subjects with similar values of the propensity score are formed. Inferences about treatment effect made using propensity-score matching are valid only if, in the matched sample, treated and untreated subjects have similar distributions of measured baseline covariates. In this paper we discuss the following methods for assessing whether the propensity score model has been correctly specified: comparing means and prevalences of baseline characteristics using standardized differences; ratios comparing the variance of continuous covariates between treated and untreated subjects; comparison of higher order moments and interactions; five-number summaries; and graphical methods such as quantile–quantile plots, side-by-side boxplots, and non-parametric density plots for comparing the distribution of baseline covariates between treatment groups. We describe methods to determine the sampling distribution of the standardized difference when the true standardized difference is equal to zero, thereby allowing one to determine the range of standardized differences that are plausible with the propensity score model having been correctly specified. We highlight the limitations of some previously used methods for assessing the adequacy of the specification of the propensity-score model. In particular, methods based on comparing the distribution of the estimated propensity score between treated and untreated subjects are uninformative. Copyright © 2009 John Wiley & Sons, Ltd.
doi:10.1002/sim.3697
PMCID: PMC3472075  PMID: 19757444
balance; goodness-of-fit; observational study; propensity score; matching; propensity-score matching; standardized difference; bias
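Two of the balance diagnostics listed in the abstract above, the standardized difference and the variance ratio, are simple enough to sketch directly. The formulas below are the commonly used definitions: the difference in means (or prevalences) divided by the square root of the average of the two group variances. The function names are hypothetical, and the sample variance is assumed for continuous covariates.

```python
import math
import statistics


def standardized_difference(treated, control):
    """Standardized difference in means for a continuous covariate."""
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


def standardized_difference_binary(p_treated, p_control):
    """Standardized difference for a dichotomous covariate, given prevalences."""
    pooled_sd = math.sqrt(
        (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    )
    return (p_treated - p_control) / pooled_sd


def variance_ratio(treated, control):
    """Ratio of sample variances; values near 1 suggest similar spread."""
    return statistics.variance(treated) / statistics.variance(control)
```

Unlike a significance test, these quantities do not shrink mechanically as the matched sample gets smaller, which is one reason they are preferred for judging balance.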
20.  A comparison of perioperative outcomes of Video-Assisted Thoracic Surgical (VATS) Lobectomy with open thoracotomy and lobectomy: Results of an analysis using propensity score based weighting 
Background
Randomized trials comparing VATS lobectomy to open lobectomy are of small size. We analyzed a case-control series using propensity score-weighting to adjust for important covariates in order to compare the clinical outcomes of the two techniques.
Methods
We compared patients undergoing lobectomy for clinical stage I lung cancer (NSCLC) by either VATS or open (THOR) methods. Inverse probability of treatment weighted estimators, with weights derived from propensity scores, were used to adjust cohorts for determinants of perioperative morbidity and mortality including age, gender, preop FEV1, ASA class, and Charlson Comorbidity Index (CCI). Bootstrap methods provided standard errors. Endpoints were postoperative stay (LOS), chest tube duration, complications, and lymph node retrieval.
Results
We analyzed 136 consecutive lobectomy patients. Operative mortality was 1/62 (1.6%) for THOR and 1/74 (1.4%) for VATS, P = 1.00. 5/74 (6.7%) VATS were converted to open procedures. Adjusted median LOS was 7 days (THOR) versus 4 days (VATS), P < 0.0001, HR = 0.33. Adjusted median chest tube duration (days) was 5 (THOR) versus 3 (VATS), P < 0.0001, HR = 0.42. Complication rates were 39% (THOR) versus 34% (VATS), P = 0.61. Adjusted mean number of lymph nodes dissected per patient was 18.1 (THOR) versus 14.8 (VATS), p = 0.17.
Conclusions
After balancing covariates that affect morbidity, mortality and LOS in this case-control series using propensity-weighting, the results confirm that VATS lobectomy is associated with a statistically significant shorter LOS, similar mortality and complication rates and similar rates of lymph node removal in patients with clinical stage I NSCLC.
doi:10.1186/1750-1164-4-1
PMCID: PMC2848683  PMID: 20307297
21.  Estimating Heterogeneous Treatment Effects with Observational Data* 
Sociological methodology  2012;42(1):314-347.
Individuals differ not only in their background characteristics, but also in how they respond to a particular treatment, intervention, or stimulation. In particular, treatment effects may vary systematically by the propensity for treatment. In this paper, we discuss a practical approach to studying heterogeneous treatment effects as a function of the treatment propensity, under the same assumption commonly underlying regression analysis: ignorability. We describe one parametric method and two non-parametric methods for estimating interactions between treatment and the propensity for treatment. For the first method, we begin by estimating propensity scores for the probability of treatment given a set of observed covariates for each unit and construct balanced propensity score strata; we then estimate propensity score stratum-specific average treatment effects and evaluate a trend across them. For the second method, we match control units to treated units based on the propensity score and transform the data into treatment-control comparisons at the most elementary level at which such comparisons can be constructed; we then estimate treatment effects as a function of the propensity score by fitting a non-parametric model as a smoothing device. For the third method, we first estimate non-parametric regressions of the outcome variable as a function of the propensity score separately for treated units and for control units and then take the difference between the two non-parametric regressions. We illustrate the application of these methods with an empirical example of the effects of college attendance on women's fertility.
PMCID: PMC3591476  PMID: 23482633
causal effects; treatment effects; heterogeneity; propensity scores; matching
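The first method in the abstract above (stratum-specific effects followed by a trend across strata) might be sketched as follows. This is a simplification under stated assumptions: strata are taken as given integer labels, the stratum effect is a plain difference in means, and the trend is an unweighted least-squares slope, whereas the article may weight strata by their precision. All names are hypothetical.

```python
def stratum_effects(stratum, treated, outcome):
    """Treated-minus-control difference in mean outcome within each stratum."""
    cells = {}
    for s, z, y in zip(stratum, treated, outcome):
        # For each stratum keep [sum, count] for controls (0) and treated (1).
        cell = cells.setdefault(s, {0: [0.0, 0], 1: [0.0, 0]})
        cell[z][0] += y
        cell[z][1] += 1
    effects = {}
    for s, cell in sorted(cells.items()):
        if cell[0][1] and cell[1][1]:  # both arms must be present
            effects[s] = cell[1][0] / cell[1][1] - cell[0][0] / cell[0][1]
    return effects


def linear_trend(effects):
    """Unweighted least-squares slope of the effect on the stratum index."""
    xs = list(effects)
    if len(xs) < 2:
        raise ValueError("at least two strata are needed to estimate a trend")
    ys = [effects[x] for x in xs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
```

A positive slope would indicate larger effects among units more likely to be treated; a negative slope would indicate the opposite.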
22.  The use of propensity score methods with survival or time-to-event outcomes: reporting measures of effect similar to those used in randomized experiments 
Statistics in Medicine  2013;33(7):1242-1258.
Propensity score methods are increasingly being used to estimate causal treatment effects in observational studies. In medical and epidemiological studies, outcomes are frequently time-to-event in nature. Propensity-score methods are often applied incorrectly when estimating the effect of treatment on time-to-event outcomes. This article describes how two different propensity score methods (matching and inverse probability of treatment weighting) can be used to estimate the measures of effect that are frequently reported in randomized controlled trials: (i) marginal survival curves, which describe survival in the population if all subjects were treated or if all subjects were untreated; and (ii) marginal hazard ratios. The use of these propensity score methods allows one to replicate the measures of effect that are commonly reported in randomized controlled trials with time-to-event outcomes: both absolute and relative reductions in the probability of an event occurring can be determined. We also provide guidance on variable selection for the propensity score model, highlight methods for assessing the balance of baseline covariates between treated and untreated subjects, and describe the implementation of a sensitivity analysis to assess the effect of unmeasured confounding variables on the estimated treatment effect when outcomes are time-to-event in nature. The methods in the paper are illustrated by estimating the effect of discharge statin prescribing on the risk of death in a sample of patients hospitalized with acute myocardial infarction. In this tutorial article, we describe and illustrate all the steps necessary to conduct a comprehensive analysis of the effect of treatment on time-to-event outcomes. © 2013 The authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
doi:10.1002/sim.5984
PMCID: PMC4285179  PMID: 24122911
propensity score; observational study; propensity score matching; inverse probability of treatment weighting; survival analysis; event history analysis; confounding; marginal effects
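A marginal survival curve of the kind mentioned in the abstract above can be approximated with a weighted Kaplan-Meier estimator, applied separately to treated and untreated subjects with inverse probability of treatment weights. The sketch below is a minimal version that replaces counts of events and subjects at risk with sums of weights. It is offered as an illustration only, the function name is hypothetical, and it does not compute standard errors.

```python
def weighted_kaplan_meier(times, events, weights):
    """Return [(time, survival)] at each distinct event time.

    `events[i]` is 1 if subject i had the event at `times[i]` and 0 if
    censored; `weights[i]` is that subject's weight (1.0 gives the
    ordinary Kaplan-Meier estimate).
    """
    data = sorted(zip(times, events, weights))
    at_risk = float(sum(weights))
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        event_weight = 0.0
        leaving_weight = 0.0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            _, event, w = data[i]
            if event:
                event_weight += w
            leaving_weight += w
            i += 1
        if event_weight > 0:
            survival *= 1.0 - event_weight / at_risk
            curve.append((t, survival))
        at_risk -= leaving_weight
    return curve
```

The difference between the treated and untreated curves at a fixed time gives an absolute risk reduction, which complements a marginal hazard ratio.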
23.  TV Viewing and Physical Activity Are Independently Associated with Metabolic Risk in Children: The European Youth Heart Study 
PLoS Medicine  2006;3(12):e488.
Background
TV viewing has been linked to metabolic-risk factors in youth. However, it is unclear whether this association is independent of physical activity (PA) and obesity.
Methods and Findings
We did a population-based, cross-sectional study in 9- to 10-y-old and 15- to 16-y-old boys and girls from three regions in Europe (n = 1,921). We examined the independent associations between TV viewing, PA measured by accelerometry, and metabolic-risk factors (body fatness, blood pressure, fasting triglycerides, inverted high-density lipoprotein (HDL) cholesterol, glucose, and insulin levels). Clustered metabolic risk was expressed as a continuously distributed score calculated as the average of the standardized values of the six subcomponents. There was a positive association between TV viewing and adiposity (p = 0.021). However, after adjustment for PA, gender, age group, study location, sexual maturity, smoking status, birth weight, and parental socio-economic status, the association of TV viewing with clustered metabolic risk was no longer significant (p = 0.053). PA was independently and inversely associated with systolic and diastolic blood pressure, fasting glucose, insulin (all p < 0.01), and triglycerides (p = 0.02). PA was also significantly and inversely associated with the clustered risk score (p < 0.0001), independently of obesity and other confounding factors.
Conclusions
TV viewing and PA may be separate entities and differently associated with adiposity and metabolic risk. The association between TV viewing and clustered metabolic risk is mediated by adiposity, whereas PA is associated with individual and clustered metabolic-risk indicators independently of obesity. Thus, preventive action against metabolic risk in children may need to target TV viewing and PA separately.
A study of over 1,900 European children showed that TV viewing and physical activity in children are separately associated with obesity and metabolic risk.
Editors' Summary
Background.
Childhood obesity is a rapidly growing problem. Twenty-five years ago, overweight children were rare. Now, 155 million of the world's children are overweight, and 30–45 million are obese. Both conditions are diagnosed by comparing a child's body mass index (BMI; weight divided by height squared) with the average BMI for their age and sex. Being overweight during childhood is worrying because it is one of the so-called metabolic-risk factors that increase the chances of developing diabetes, heart problems, or strokes later in life. Other metabolic-risk factors are fatness around the belly, blood-fat disorders, high blood pressure, and problems with how the body uses insulin and blood sugar. Until recently, like obesity, these other metabolic-risk factors were seen only in adults, but now they are becoming increasingly common in children. In the US, 1 in 20 adolescents has metabolic syndrome—three or more of these risk factors. Environmental and behavioural changes have probably contributed to the increase in metabolic syndrome in children. As a group, they tend to be less physically active nowadays and they eat bigger portions of energy-dense foods more often. Increased TV viewing during childhood (and the use of other media such as computer games) has also been linked to increased obesity and to poorer health as an adult.
Why Was This Study Done?
One popular theory is that TV viewing may affect obesity and other metabolic-risk factors by displacing PA. Instead of playing in the yard after school, the theory suggests, children laze about in front of the TV. However, there is limited evidence to support this idea, and health professionals need to know whether TV viewing and PA are related, and how they affect metabolic-risk factors, in order to improve children's health. In this study, the researchers examined the associations between TV viewing, PA, and metabolic-risk factors in European children.
What Did the Researchers Do and Find?
The researchers enrolled nearly 2,000 children in two age groups from three areas in Europe. They measured the children's height and weight, estimated how fat they were by measuring skin fold thickness, measured their blood pressure, and examined the levels of glucose, insulin, and different fats in their blood. The children completed a computer questionnaire about the lengths of time for which they watched TV and how often they ate while doing so, and their PA was measured using a device called an accelerometer that each child wore for four days. When these data were analyzed statistically, the researchers found that TV viewing was slightly associated with clustered metabolic risk (the average of the individual metabolic-risk factors). This association was due to an association between TV viewing and obesity: the children who watched most TV tended to be the fattest children. However, TV viewing was not related to PA. The most active children were not necessarily those who watched least TV. Most importantly, PA was related to clustered metabolic risk and to all individual risk factors except obesity. These associations were independent of obesity.
What Do These Findings Mean?
These results suggest that TV viewing does not damage children's health by displacing PA as popularly believed. The finding that the association between TV viewing and clustered metabolic-risk factors is mediated by obesity suggests that targeting behaviours like eating while watching TV might be a good way to improve children's health. Indeed, the researchers provide some evidence that eating while watching TV is associated with being overweight, but the results of this post hoc analysis—one that was not planned in advance—need to be confirmed. Another limitation of the study is the possibility that the children inaccurately reported their TV watching habits. Also, because measurements of metabolic-risk factors were made only once, it is impossible to say whether TV viewing or lack of PA actually causes an increase in metabolic-risk factors.
Nevertheless, these results strongly suggest that promoting PA is beneficial in relation to metabolic-risk factors, but less so in relation to obesity in childhood. TV viewing and PA should be treated as separate targets in programs designed to reverse the obesity and metabolic-syndrome epidemic in children.
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/doi:10.1371/journal.pmed.0030488.
US Centers for Disease Control and Prevention, information on overweight and obesity
International Obesity Taskforce, information on obesity and its prevention, particularly in childhood
Global Prevention Alliance, details of international efforts to halt the obesity epidemic and its associated chronic diseases
American Heart Association, information for patients and professionals on metabolic syndrome and children's health
doi:10.1371/journal.pmed.0030488
PMCID: PMC1705825  PMID: 17194189
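Two quantities defined in the entry above lend themselves to a short worked sketch: body mass index (weight divided by height squared) and a clustered risk score formed as the average of standardized component values, with HDL cholesterol inverted so that higher always means greater risk. The sketch assumes standardization by the sample mean and sample standard deviation across all children, which the abstract does not specify. The function and component names are hypothetical.

```python
import statistics


def body_mass_index(weight_kg, height_m):
    """Weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def z_scores(values):
    """Standardize values using the sample mean and sample SD (an assumption)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def clustered_risk(components, inverted=()):
    """Average of standardized components for each child.

    `components` maps a component name to a list of per-child values;
    names listed in `inverted` are sign-flipped after standardization.
    """
    names = list(components)
    standardized = {}
    for name in names:
        z = z_scores(components[name])
        standardized[name] = [-v for v in z] if name in inverted else z
    n_children = len(components[names[0]])
    return [
        sum(standardized[name][i] for name in names) / len(names)
        for i in range(n_children)
    ]
```

For example, `clustered_risk({"sbp": [100, 110, 120], "hdl": [1.0, 1.5, 2.0]}, inverted=("hdl",))` gives each child a score in which high blood pressure and low HDL both push the value upward.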
24.  Analyzing Genetic Association Studies with an Extended Propensity Score Approach 
Statistical applications in genetics and molecular biology  2012;11(5):10.1515/1544-6115.1790.
Propensity scores are commonly used to address confounding in observational studies. However, they have not been previously adapted to deal with bias in genetic association studies. We propose an extension of our previous method (Zhao et al., 2009) that uses a multilevel propensity score approach and allows one to estimate the effect of a genotype under an additive model and also simultaneously adjusts for confounders such as genetic ancestry and patient and disease characteristics. Using simulation studies, we demonstrate that this extended genetic propensity score (eGPS) can adequately adjust and consistently correct for bias due to confounding in a variety of circumstances. Under all simulation scenarios, the eGPS method yields estimates with bias close to 0 (mean=0.018, standard error=0.01). Our method also preserves statistical properties such as coverage probability, Type I error, and power. We illustrate this approach in a population-based genetic association study of testicular germ cell tumors and KITLG and SPRY4 susceptibility genes. We conclude that our method provides a novel and broadly applicable analytic strategy for obtaining less biased and more valid estimates of genetic associations.
doi:10.1515/1544-6115.1790
PMCID: PMC3518898  PMID: 23104843
population-based genetic association; propensity scores; population stratification; confounding; genetic and non-genetic covariates; susceptibility genes
25.  HANDLING MISSING DATA BY DELETING COMPLETELY OBSERVED RECORDS 
When data are missing, analyzing records that are completely observed may cause bias or inefficiency. Existing approaches in handling missing data include likelihood, imputation and inverse probability weighting. In this paper, we propose three estimators inspired by deleting some completely observed data in the regression setting. First, we generate artificial observation indicators that are independent of outcome given the observed data and draw inferences conditioning on the artificial observation indicators. Second, we propose a closely related weighting method. The proposed weighting method has more stable weights than those of the inverse probability weighting method (Zhao and Lipsitz, 1992). Third, we improve the efficiency of the proposed weighting estimator by subtracting the projection of the estimating function onto the nuisance tangent space. When data are missing completely at random, we show that the proposed estimators have asymptotic variances smaller than or equal to the variance of the estimator obtained from using completely observed records only. Asymptotic relative efficiency computation and simulation studies indicate that the proposed weighting estimators are more efficient than the inverse probability weighting estimators under a wide range of practical situations, especially when the missingness proportion is large.
doi:10.1016/j.jspi.2008.10.024
PMCID: PMC2674251  PMID: 20160863
