Sample Description
The study sample for this case study consisted of 2,342 participants, of whom 1,588 received in-patient smoking cessation counseling and 754 did not. The baseline characteristics of exposed and unexposed participants are described in . Patients receiving smoking cessation counseling tended to be younger (p < .001), were less likely to be female (p < .032), tended to have a lower burden of comorbid conditions, and were more likely to receive prescriptions for cardiac medications at hospital discharge compared with patients who did not receive in-patient smoking cessation counseling. There were statistically significant differences in 22 of the 33 baseline characteristics between exposed and unexposed participants in the study sample. Twenty of the variables had standardized differences that exceeded 0.10. Thus, as is typical in observational studies, there were systematic differences in baseline characteristics between treated and untreated patients.
There were no statistically significant differences in basic demographic characteristics (age and sex) and in the probability of death within 3 years of discharge between participants with complete data on baseline covariates and participants who were excluded due to missing data on baseline covariates.
Matching on the Propensity Score
The standard deviation of the logit of the propensity score was equal to 0.7013542. Thus, 0.2 of the standard deviation of the logit of the propensity score was equal to 0.14027084. Therefore, matched treated and untreated participants were required to have logits of the propensity score that differed by at most 0.14027084.
When participants who received in-patient smoking cessation counseling were matched with participants who did not receive smoking cessation counseling on the logit of the initially specified propensity score model, 682 matched pairs were formed. Thus, 90% of patients who did not receive in-patient smoking cessation counseling were successfully matched to a patient who did receive in-patient smoking cessation counseling.
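The caliper rule and the pair-formation step can be sketched in a few lines. This is a minimal illustration, assuming greedy 1:1 nearest-neighbour matching without replacement on the logit of the propensity score; the function names and the dictionary-based data layout are hypothetical and are not taken from the study.

```python
import statistics

def caliper_width(logit_ps, multiplier=0.2):
    """Caliper = multiplier x standard deviation of the logit of the propensity score."""
    return multiplier * statistics.stdev(logit_ps)

def greedy_caliper_match(treated, untreated, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement (an assumed algorithm).

    treated, untreated: dicts mapping participant id -> logit of the propensity score.
    Returns a list of (treated_id, untreated_id) pairs.
    """
    available = dict(untreated)
    pairs = []
    for t_id, t_logit in treated.items():
        if not available:
            break
        # Closest remaining untreated participant on the logit scale.
        u_id = min(available, key=lambda k: abs(available[k] - t_logit))
        if abs(available[u_id] - t_logit) <= caliper:
            pairs.append((t_id, u_id))
            del available[u_id]
    return pairs

# Caliper implied by the reported standard deviation of the logit.
reported_caliper = 0.2 * 0.7013542  # approximately 0.14027084
```

With the reported counts, 682 pairs out of 754 untreated participants corresponds to 682/754, or roughly 90%, of untreated participants being matched.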
Balance Diagnostics
Propensity score matching. The baseline characteristics of patients receiving in-patient smoking cessation counseling and those not receiving counseling in the initial propensity score matched sample are described in Table 2. Across the 33 baseline covariates, the absolute standardized differences ranged from a low of 0 to a high of 0.064, with a median of 0.018, indicating that the means and prevalences of continuous and dichotomous variables were very similar between treatment groups in the matched sample. The variance ratios ranged from a low of 0.81 (admission heart rate) to a high of 1.36 (sodium), indicating that the variance of some continuous variables was different between the two treatment groups in the initial propensity score matched sample.
| TABLE 2. Baseline Characteristics of Treated and Untreated Participants in the First Propensity Score Matched Sample |
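The two balance statistics used here, the standardized difference and the variance ratio, can be computed as in the sketch below. It assumes the commonly used definitions (difference in means or prevalences divided by the square root of the average of the two group variances); this section does not restate the formulas, so the exact form is an assumption, and the function names are hypothetical.

```python
import math
import statistics

def std_diff_continuous(x_treated, x_untreated):
    """Standardized difference for a continuous covariate."""
    m1, m0 = statistics.mean(x_treated), statistics.mean(x_untreated)
    v1, v0 = statistics.variance(x_treated), statistics.variance(x_untreated)
    return (m1 - m0) / math.sqrt((v1 + v0) / 2)

def std_diff_binary(p_treated, p_untreated):
    """Standardized difference for a dichotomous covariate, from group prevalences."""
    pooled = (p_treated * (1 - p_treated) + p_untreated * (1 - p_untreated)) / 2
    return (p_treated - p_untreated) / math.sqrt(pooled)

def variance_ratio(x_treated, x_untreated):
    """Ratio of sample variances (treated / untreated); 1 indicates equal spread."""
    return statistics.variance(x_treated) / statistics.variance(x_untreated)
```

Absolute values of the standardized difference are what the text summarizes (minimum, median, maximum), with 0.10 used earlier in the section as the threshold for a meaningful imbalance.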
In an attempt to further minimize some of the residual differences in the distribution of the baseline covariates between treatment groups, the original specification of the propensity score model was modified. The first modification was to relax the assumption that the continuous variables were each linearly related to the log-odds of exposure. The propensity score model was modified so that restricted cubic smoothing splines with five knots were used to model the relationship between each continuous baseline variable and the log-odds of exposure (Harrell, 2001). The matching process described earlier was repeated and the similarity of the distribution of treated and untreated participants in the resultant matched sample was assessed. Despite modifications of the propensity score model, there remained continuous variables whose variances were greater in one group than in the other group (variance ratios ranging from 0.91 to 1.34). The highest variance ratio was for glucose. The current specification of the propensity score model was then further modified by including interactions between glucose (and the variables required for modeling glucose using restricted cubic smoothing splines) and several of the dichotomous variables.
The resultant matched sample consisted of 646 matched pairs (85.7% of patients not receiving smoking cessation counseling were successfully matched to a patient receiving counseling with a similar value of the logit of the propensity score). The baseline characteristics of treated and untreated participants are described in Table 3. The absolute standardized differences ranged from a low of 0 to a high of 0.055 with a median of 0.014 (25th and 75th percentiles: 0.006 and 0.038, respectively). The variance ratios for continuous variables ranged from 0.86 to 1.15. The absolute standardized differences for all 55 two-way interactions between continuous baseline covariates ranged from 0.001 to 0.076 with a median of 0.016.
| TABLE 3. Baseline Characteristics of Treated and Untreated Participants in the Final Propensity Score Matched Sample |
The aforementioned analyses indicate that the means and variances of continuous variables were similar between treatment groups in the matched sample. Similarly, the prevalence of dichotomous variables was similar between treatment groups. In addition, the mean of two-way interactions between continuous baseline covariates was similar between treatment groups in the propensity score matched sample.
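The interaction check can be illustrated as follows. A count of 55 two-way interactions is what one obtains from all distinct pairs of 11 continuous covariates; the number 11 is inferred here from that count and is not stated in this section. The sketch builds each pairwise product so that it can be compared between treatment groups like any other continuous variable; the function name and data layout are hypothetical.

```python
from itertools import combinations
import math

def interaction_terms(covariates):
    """Build all two-way product terms between continuous covariates.

    covariates: dict mapping covariate name -> list of values (one per participant).
    Returns a dict mapping (name_a, name_b) -> list of products.
    """
    terms = {}
    for a, b in combinations(sorted(covariates), 2):
        terms[(a, b)] = [x * y for x, y in zip(covariates[a], covariates[b])]
    return terms

# Number of distinct pairs among 11 covariates.
n_pairs = math.comb(11, 2)
```

Each product term would be computed separately in treated and untreated participants and then summarized with an absolute standardized difference.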
reports empirical cumulative distribution plots and quantile-quantile plots for four continuous baseline covariates: age, systolic blood pressure, creatinine, and glucose. These plots indicate that the distribution of each of these four continuous variables was very similar between treatment groups in the propensity score matched sample. Similar plots could be produced for the remaining continuous baseline covariates.
Taken together, the aforementioned analyses indicate that the modified propensity score model appears to have been adequately specified. After matching on the estimated propensity score, observed systematic differences between treated and untreated participants appear to have been greatly reduced or eliminated.
The final specification of the propensity score model is used for the remainder of the case study. In a particular application of propensity score methods, one would typically optimize the specification of the propensity score to the particular propensity score method that is being employed. We have elected to use the current specification of the propensity score for all four propensity score methods for two reasons. First, it allows readers to compare the relative performance of different propensity score methods with a uniform specification of the propensity score model. Second, modifying the specification of the propensity score model across different propensity score methods appears to be at odds with the conceptual perspective that there is one true propensity score model.
Among the 754 participants in the study sample who did not receive smoking cessation counseling, there were substantial differences in baseline characteristics between the 646 participants included in the matched sample and the 108 participants who were not included in the matched sample (because no appropriate participant who received smoking cessation counseling could be identified). There were statistically significant differences in 24 of the 33 baseline covariates between matched and unmatched participants. Furthermore, 29 of the baseline covariates had standardized differences that exceeded 0.10 between matched and unmatched participants. For instance, the mean ages of matched and unmatched participants were 58.7 years and 70.9 years, respectively.
Stratification on the propensity score. The quintiles of the estimated propensity score were 0.55243, 0.67427, 0.75205, and 0.82271, respectively. The proportion of participants within each stratum who received smoking cessation counseling ranged from a low of 39.1% in the stratum with the lowest propensity score to a high of 86.1% in the stratum with the highest propensity score. In the stratum of participants with the lowest propensity score, the minimum, 25th percentile, median, 75th percentile, and maximum propensity score for participants who did not receive smoking cessation counseling were 0.005, 0.294, 0.415, 0.484, and 0.551, respectively. In participants who did receive smoking cessation counseling, these statistics were 0.081, 0.381, 0.475, 0.519, and 0.552, respectively. Thus, in this lower stratum, the distribution of the propensity score was shifted modestly lower in untreated participants compared with treated participants. However, overall, there was reasonable overlap in the propensity score between treated and untreated participants. In each of the middle three strata, the distribution of the propensity score was very similar between treated and untreated participants. In the fifth stratum, the maximum propensity score in treated participants was 0.981, whereas it was 0.944 in untreated participants. In some settings, inadequate overlap in the propensity score may be observed between treated and untreated participants within a given propensity score stratum (if this occurs, it often occurs in either the lowest or highest strata). If this occurs, some applied investigators may choose to exclude untreated participants with very low propensity scores or treated participants with very high propensity scores. However, when this is done, one needs to be aware that one is changing the population to which the estimated treatment effect applies.
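Assigning participants to propensity score strata from the reported cut points can be sketched as below. The handling of a score exactly equal to a cut point is an assumption (it is placed in the lower stratum), and the function names are hypothetical.

```python
from bisect import bisect_left

# Cut points reported in the text (20th, 40th, 60th, and 80th percentiles).
QUINTILE_CUTS = [0.55243, 0.67427, 0.75205, 0.82271]

def assign_stratum(ps, cuts=QUINTILE_CUTS):
    """Return a stratum number from 1 (lowest scores) to 5 (highest scores).

    Assumption: a score exactly equal to a cut point goes to the lower stratum.
    """
    return bisect_left(cuts, ps) + 1

def proportion_treated_by_stratum(scores, treated):
    """scores: propensity scores; treated: 0/1 indicators in the same order."""
    counts = {}
    for ps, z in zip(scores, treated):
        s = assign_stratum(ps)
        n, n_treated = counts.get(s, (0, 0))
        counts[s] = (n + 1, n_treated + z)
    return {s: n_treated / n for s, (n, n_treated) in sorted(counts.items())}
```

Within-stratum minimum, quartiles, and maximum of the propensity score, computed separately by treatment group, then give the overlap summary described in the paragraph above.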
For the 33 variables described in , the minimum absolute standardized differences were 0, 0.007, 0.012, 0.002, and 0.008 across the five propensity score strata. The maximum absolute standardized differences were 0.213, 0.221, 0.210, 0.253, and 0.220 across the five strata. The median absolute standardized differences were 0.074, 0.062, 0.077, 0.069, and 0.074 across the five strata. Within-quintile standardized differences were computed for each of the 55 pairwise interactions between continuous variables. The minimum standardized differences were 0.003, 0.001, 0.004, 0.001, and 0.003 across the five propensity score strata. The maximum standardized differences were 0.202, 0.159, 0.230, 0.166, and 0.228 across the five strata. The median standardized differences were 0.090, 0.044, 0.082, 0.082, and 0.072 across the five strata.
The aforementioned sets of balance diagnostics suggest that, on average, treated and untreated participants have similar distributions of measured baseline covariates within strata of the propensity score. One could complement the aforementioned quantitative analyses by graphical analyses comparing the distribution of continuous covariates between treatment groups within each stratum of the propensity score. For instance, one could use within-stratum empirical cumulative distribution plots or quantile-quantile plots to compare the distribution of continuous covariates between treatment groups. Due to space constraints, we omit these analyses from this article.
In comparing the within-quintile balance with that observed in the propensity score matched sample described earlier, one notes that modestly greater imbalance persists when stratifying on the propensity score compared with when matching on the propensity score (e.g., compare the median standardized differences). This is consistent with prior empirical observations (Austin & Mamdani, 2006; Austin, 2009c) and with the results from prior Monte Carlo simulations (Austin, 2009c; Austin, Grootendorst, & Anderson, 2007). Greater residual imbalance tends to be eliminated by matching on the propensity score than by stratifying on the quintiles of the propensity score.
Propensity score weighting. The individual inverse probability of treatment weights ranged from 1.0 to 18.0. The weighted standardized differences were computed for the 33 variables listed in . The absolute standardized differences ranged from 0.001 to 0.031 with a median of 0.010 (the 25th and 75th percentiles were 0.007 and 0.015, respectively). The variance ratios for the continuous variables ranged from 0.36 to 0.50. The absolute standardized differences for the 55 two-way interactions between continuous variables ranged from 0.001 to 0.031 with a median of 0.012 (the 25th and 75th percentiles were 0.007 and 0.018, respectively). Thus, although the means and prevalences of continuous and dichotomous variables were well balanced between treatment groups in the weighted sample, there is some evidence of greater dispersion in untreated patients compared with treated patients.
describes empirical cumulative distribution functions and nonparametric estimates of the density functions for four continuous covariates in treated and untreated participants separately in the sample weighted by the inverse probability of treatment. In examining the eight panels in, one observes that the distribution of each of the four continuous variables was very similar between treated and untreated participants in the weighted sample.
The evidence provided by the empirical cumulative distribution functions and the nonparametric density plots appears to be in conflict with that provided by the ratios of the variances of the continuous variables. The former suggests that the distributions are comparable between treatment groups, whereas the latter suggests that greater variability is found in untreated participants than in treated participants. Upon further examination, it was found that the inverse probability of treatment weights were systematically higher in untreated participants than in treated participants. We hypothesize that a few large weights in the untreated participants may have resulted in inflated variance estimates in this population, resulting in shrunken variance ratios.
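The weights and a weighted variance can be sketched as follows. The weight definition (1/e for treated, 1/(1 - e) for untreated participants, where e is the propensity score) is the standard inverse probability of treatment weight; the weighted variance formula shown is one common choice and is an assumption, since this section does not state which formula was used. Function names are hypothetical.

```python
def ipt_weight(z, e):
    """Inverse probability of treatment weight for treatment indicator z and score e."""
    return 1.0 / e if z == 1 else 1.0 / (1.0 - e)

def weighted_mean(x, w):
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)

def weighted_variance(x, w):
    """Weighted sample variance (assumed formula; equals the usual one for equal weights)."""
    sw = sum(w)
    sw2 = sum(wi * wi for wi in w)
    m = weighted_mean(x, w)
    return sw / (sw * sw - sw2) * sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w))

# An untreated participant at the maximum untreated score reported earlier (0.944).
largest_untreated_weight = ipt_weight(0, 0.944)  # about 17.9
```

The value of about 17.9 is in line with the reported maximum weight of 18.0, and it illustrates how a single untreated participant with a high propensity score can dominate a weighted variance estimate.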
Covariate adjustment using the propensity score. In the full study sample, the weighted conditional absolute standardized differences ranged from 0.001 to 0.194 for the 33 variables listed in . The median weighted conditional absolute standardized difference was 0.062, whereas the first and third quartiles were 0.024 and 0.093, respectively.
displays the graphical balance diagnostics based on quantile regression for age, systolic blood pressure, creatinine, and glucose. The relationship between the quantiles of the baseline variable and the propensity score in treated participants is described using the five solid lines, whereas the relationship between the quantiles of the baseline variable and the propensity score in untreated participants is described using the five dashed lines. In examining , one notes that the distribution of each of the four baseline covariates is approximately similar between treatment groups across the range of the propensity score. However, there was some evidence of differences in the 95th percentile of the conditional distributions between treated and untreated participants for three of the four continuous covariates.
Based on the results of the balance diagnostics described in the preceding sections, we were satisfied that our specification of the propensity score was adequate. We therefore proceeded to estimate the effect of treatment on outcomes using the four different propensity score methods.
Estimated Treatment Effects
Propensity score matching. The matched sample consisted of 646 matched pairs. In this matched sample, 91 treated participants and 103 untreated participants died within 3 years of hospital discharge. The probabilities of death within 3 years of discharge were 0.141 (91/646) and 0.159 (103/646) for treated and untreated participants, respectively. The absolute reduction in the probability of 3-year mortality was 0.0185 (95% confidence interval [-0.018, 0.055]). There was no significant difference in the probability of 3-year mortality between treatment groups (p = .3173). The NNT, the reciprocal of the absolute risk reduction, was 54. Thus, one would need to provide in-patient counseling to 54 smokers in order to avoid one death within 3 years of hospital discharge. The relative risk of death in treated participants compared with untreated participants was 91/103 = 0.88 (95% confidence interval: [0.69, 1.13]). Thus, in-patient smoking cessation counseling reduced the risk of 3-year mortality by 12%. However, the relative risk was not statistically significantly different from unity (p = .3176). Thus, there was no evidence that the provision of smoking cessation counseling reduced the risk of death in current smokers within 3 years of hospital discharge.
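The point estimates in this paragraph follow directly from the reported counts, as the short calculation below shows. It reproduces only the point estimates; the confidence intervals and p values require methods for matched pairs that are not restated here. Rounding the NNT up to the next whole participant is a common convention and is assumed.

```python
import math

deaths_treated, deaths_untreated, n_pairs = 91, 103, 646

risk_treated = deaths_treated / n_pairs       # about 0.141
risk_untreated = deaths_untreated / n_pairs   # about 0.159

# Absolute risk reduction: 12/646, about 0.0186 (the text reports 0.0185).
absolute_risk_reduction = risk_untreated - risk_treated

# Number needed to treat: reciprocal of the absolute risk reduction, rounded up.
nnt = math.ceil(1 / absolute_risk_reduction)

# Relative risk; with equal group sizes this reduces to 91/103.
relative_risk = risk_treated / risk_untreated
```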
The left panel of depicts the Kaplan-Meier survival curves in treated and untreated participants in the propensity score matched sample. The two survival curves were not significantly different from one another (p = .2486). Using a Cox proportional hazards model, the estimated hazard ratio was 0.874 (95% confidence interval: [0.672, 1.136]). Thus, provision of smoking cessation counseling prior to hospital discharge reduced the hazard of subsequent death by 12.6%. However, this effect was not statistically different from the null effect (p = .3130).
Stratification on the propensity score. The probability of 3-year mortality in participants not receiving in-patient smoking cessation counseling was 0.37, 0.16, 0.10, 0.05, and 0.05 in the first through fifth strata of the propensity score, respectively. The probability of 3-year mortality in participants receiving in-patient smoking cessation counseling was 0.25, 0.14, 0.09, 0.04, and 0.04 in the first through fifth strata, respectively. Thus, the absolute reduction in 3-year mortality was 0.126, 0.023, 0.015, 0.007, and 0.001, in the first through fifth propensity score strata, respectively. The mean of these five stratum-specific absolute risk reductions is 0.034. Thus, if all current smokers received in-patient smoking cessation counseling, the probability of 3-year mortality would be reduced by 0.034. The standard error of the pooled risk difference was 0.015. Thus, a 95% confidence interval for the absolute reduction in the probability of mortality within 3 years is (0.006, 0.063). The Mantel-Haenszel estimate of the pooled relative risk across the propensity score strata was 0.75 (95% confidence interval: 0.59–0.96). Provision of smoking cessation counseling significantly reduced the risk of death within 3 years by 25% (p = .0236).
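The pooling step can be checked with a few lines of arithmetic. The sketch assumes equal stratum weights (appropriate when the strata are quintiles of equal size) and a normal-approximation interval using the reported standard error. Because it starts from the rounded values printed above, the interval it produces differs slightly in the third decimal place from the reported one.

```python
# Stratum-specific absolute risk reductions reported in the text.
stratum_risk_reductions = [0.126, 0.023, 0.015, 0.007, 0.001]

# Equal-weight pooling across the five strata.
pooled = sum(stratum_risk_reductions) / len(stratum_risk_reductions)  # about 0.034

reported_se = 0.015
lower = pooled - 1.96 * reported_se   # about 0.005
upper = pooled + 1.96 * reported_se   # about 0.064
```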
depicts the stratum-specific Kaplan-Meier survival curves for treated and untreated subjects across the five propensity-score strata. When using a Cox proportional hazards model that stratified on the five propensity score strata, the estimated hazard ratio was 0.72 (95% confidence interval: 0.57–0.91). Thus, receipt of in-patient smoking cessation counseling reduced the hazard of death by 28%. This effect was statistically significant (p = .0065).
As a sensitivity analysis, we repeated the aforementioned analyses using 10 strata rather than 5 strata. The estimated absolute risk reduction was 0.027 (95% confidence interval: [-0.002, 0.056]), whereas the pooled relative risk was 0.805 (95% confidence interval: 0.628–1.031). For time to death, the estimated hazard ratio was 0.775 (95% confidence interval: 0.608–0.989). Thus, smoking cessation counseling decreased the hazard of death by 22.5% (p = .04).
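Two conversions recur throughout these results: a ratio measure is restated as a percentage reduction, and significance at the 5% level is read off from whether the 95% confidence interval excludes the null value. A minimal sketch, with hypothetical function names:

```python
def percent_reduction(ratio):
    """Percentage reduction implied by a relative risk, odds ratio, or hazard ratio."""
    return (1 - ratio) * 100

def excludes_null(lower, upper, null=1.0):
    """True when the interval [lower, upper] does not contain the null value."""
    return not (lower <= null <= upper)
```

For the 10-stratum analysis, the hazard ratio interval (0.608 to 0.989) excludes 1, whereas the relative risk interval (0.628 to 1.031) and the risk difference interval (-0.002 to 0.056, null value 0) do not exclude their null values.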
Propensity score weighting. Using the first weighted estimate, counseling reduced the probability of death within 3 years by 0.020 (95% confidence interval: -0.008 to 0.047), which was not statistically significantly different from 0 (p = .1558). The doubly robust estimate of the absolute reduction in the probability of mortality due to counseling was 0.025 (95% confidence interval: -0.002 to 0.052). Counseling did not significantly reduce the probability of mortality within 3 years of discharge (p = .0689).
Using the first weighted estimator, the relative risk of death within 3 years in treated patients compared with untreated patients was 0.86 (95% confidence interval: 0.66–1.10). Using the doubly robust estimator, the relative risk was 0.82 (95% confidence interval: 0.66–1.03).
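This section does not restate the formula behind the first weighted estimator. The sketch below assumes it is the standard inverse probability of treatment weighted estimator of the two marginal risks, from which both the risk difference and the relative risk follow; the function name and list-based inputs are hypothetical, and the doubly robust estimator is not shown.

```python
def iptw_marginal_risks(z, y, e):
    """Assumed standard IPTW estimator of marginal risks.

    z: 0/1 treatment indicators; y: 0/1 outcomes; e: estimated propensity scores.
    Returns (risk if all treated, risk if all untreated).
    """
    n = len(z)
    risk_treated = sum(zi * yi / ei for zi, yi, ei in zip(z, y, e)) / n
    risk_untreated = sum((1 - zi) * yi / (1 - ei) for zi, yi, ei in zip(z, y, e)) / n
    return risk_treated, risk_untreated

# Toy data for illustration only (not study data).
toy_treated_risk, toy_untreated_risk = iptw_marginal_risks(
    z=[1, 0, 1, 0], y=[0, 1, 0, 1], e=[0.5, 0.5, 0.5, 0.5]
)
```

The risk difference is the second value minus the first, and the relative risk is the first divided by the second.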
Using logistic regression in the weighted sample, the resultant odds ratio was 0.84 (95% confidence interval: 0.63–1.11). When the logistic regression model was modified by adjusting for the 33 variables in , the estimated odds ratio was attenuated to 0.98 (95% confidence interval: 0.95–1.00). In neither case was the estimated odds ratio statistically significantly different from 1 (p = .2193 and .0998, respectively).
When a Cox proportional hazards model was used in the weighted sample, the estimated hazard ratio for counseling was 0.850 (95% confidence interval: 0.655 to 1.102). Thus, counseling did not significantly reduce the hazard of subsequent death (p = .2203). The right panel of displays the estimates of the Kaplan-Meier survival curves in the sample weighted by the inverse probability of treatment. One observes that in-patient smoking cessation counseling improved survival postdischarge. At 3 years, the probability of survival was 0.880 and 0.860 in those who did and did not receive counseling, respectively. The absolute reduction in the probability of death within 3 years due to smoking cessation counseling was 0.020. However, there was no evidence that the two survival curves were different from one another (p = .2107).
Covariate adjustment using the propensity score. When we used logistic regression to regress the odds of survival to 3 years on an indicator variable for treatment status and the propensity score, we inferred that receipt of in-patient smoking cessation counseling reduced the odds of death within 3 years of discharge by 20.8% (odds ratio: 0.792; 95% confidence interval: 0.600–1.046). Similarly, treatment reduced the hazard of postdischarge mortality by 20.2% (hazard ratio: 0.798; 95% confidence interval: 0.624–1.022). Neither the odds ratio nor the hazard ratio was statistically significantly different from the null treatment effect (p = .100 and .074, respectively). As noted in the Methods section, use of these approaches is discouraged as they have been shown to lead to biased estimation of odds ratios and hazard ratios.
When using the method based on that described by Imbens (2004), we estimated that the probability of death within 3 years if all participants were untreated was 0.144, whereas the probability of death if all participants were treated was 0.121. Thus, treatment reduced the population probability of death within 3 years by 0.023 (95% confidence interval: [-0.005, 0.052]). Similarly, the relative risk was 0.84 (16% relative reduction in the probability of death within 3 years of hospital discharge; 95% confidence interval: 0.68–1.04). Thus, with covariate adjustment using the propensity score, neither the absolute nor the relative reduction in the probability of mortality due to counseling was statistically significantly different from the null effect.
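The two marginal probabilities can be thought of as averages of model-based predictions with treatment set to each value in turn. The sketch below assumes an outcome model that is logistic in treatment status and the propensity score; that model form, the coefficient arguments, and the function names are illustrative assumptions rather than the fitted model from the study. The last lines reproduce the reported risk difference and relative risk from the two reported probabilities.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_risks(propensity_scores, b0, b_treat, b_ps):
    """Average predicted risk with everyone treated and with everyone untreated.

    b0, b_treat, b_ps are hypothetical coefficients of an assumed logistic model.
    """
    n = len(propensity_scores)
    risk_if_treated = sum(expit(b0 + b_treat + b_ps * e) for e in propensity_scores) / n
    risk_if_untreated = sum(expit(b0 + b_ps * e) for e in propensity_scores) / n
    return risk_if_treated, risk_if_untreated

# Reported marginal probabilities of death within 3 years.
reported_untreated, reported_treated = 0.144, 0.121
risk_difference = reported_untreated - reported_treated   # about 0.023
relative_risk = reported_treated / reported_untreated     # about 0.84
```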
Regression adjustment. When logistic regression was used to regress an indicator variable denoting survival to 3 years postdischarge on an indicator variable denoting receipt of smoking cessation counseling and the 33 baseline covariates listed in , the adjusted odds ratio for smoking cessation counseling was 0.73 (95% confidence interval: 0.54–0.98). Thus, smoking cessation counseling reduced the odds of mortality (p = .0371). When the logistic regression model was modified by using restricted cubic smoothing splines to model the relationship between continuous baseline covariates and the log-odds of mortality, the resultant odds ratio for counseling was 0.77 (95% confidence interval: 0.56–1.05). Thus, smoking cessation counseling did not significantly reduce the odds of mortality (p = .0942).
When we used a Cox proportional hazards model to regress survival time on treatment status and the 33 baseline covariates listed in , the adjusted hazard ratio for smoking cessation counseling was 0.72 (95% confidence interval: 0.57–0.92). Thus, smoking cessation counseling reduced the hazard of postdischarge mortality (p = .0080). When we modified the Cox proportional hazards model by using restricted cubic smoothing splines to model the relationship between continuous baseline covariates and the log-hazard of mortality, the resultant hazard ratio for counseling was 0.78 (95% confidence interval: 0.61–0.99). Thus, counseling significantly reduced the hazard of death (p = .0441).