As described in Cropsey and colleagues,[20] 71 individuals started the intervention immediately and 289 provided waitlist control data. Of these, 33 were excluded from this study due to missing baseline data and 66 were excluded due to missing follow-up data. This left us with 58 individuals in the intervention group and 203 individuals in the control group. The control group was younger than the intervention group (p=0.013 by unpaired t-test), but the magnitude of the difference was small (3 years). There were no other statistically significant differences between the two groups in baseline characteristics.

In the last two rows of Table 3, we present the follow-up characteristics of the sample. The rate of CO-confirmed abstinence changed significantly from baseline to follow-up in the intervention group (3% to 47%, p<0.001 by McNemar's test) but not in the control group (6% to 6%, p=1.00 by McNemar's test). Although the abstinence rate did not change in the control group (12 participants were abstinent at each time period), only five participants had evidence of abstinence at both periods. In this table, statistically significant differences in withdrawal symptoms between the two groups emerge. Withdrawal symptoms increased significantly from baseline within both groups (p<0.003 by paired t-tests). However, the increase was larger in the control group than in the intervention group (p<0.001 comparing the change scores between the two groups using an unpaired t-test).
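The control-group result can be checked from the counts given above: 12 participants were abstinent at each period and 5 at both, so 7 were abstinent only at baseline and 7 only at follow-up. The following is a minimal sketch of an exact (binomial) McNemar test on those discordant pairs. The text does not say whether the exact or chi-squared form of the test was used, so the exact form here is an assumption, and the function name `mcnemar_exact` is illustrative.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Lower-tail probability of Binomial(n, 0.5) at k, doubled and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Control-group counts reported in the text.
abstinent_baseline = 12
abstinent_followup = 12
abstinent_both = 5

# Discordant pairs: abstinent at one period but not the other.
baseline_only = abstinent_baseline - abstinent_both
followup_only = abstinent_followup - abstinent_both

p_value = mcnemar_exact(baseline_only, followup_only)
print(baseline_only, followup_only, p_value)  # 7 7 1.0
```

Because the two discordant counts are equal, the test finds no evidence of a net change even though most of the abstinent individuals differ between the two periods.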

| **Table 3** Characteristics of data sample. Higher withdrawal symptom scores indicate more symptoms. |

We can use Table 3 to estimate the proportion in each principal stratum. All of those who abstain while in the control condition are in the principal stratum that always abstains. Hence, an estimated 6% are in the stratum that always abstains. Likewise, all of those who do not abstain while in the intervention condition never abstain. Thus, an estimated 53% (100% minus the 47% who abstain in the intervention arm) never abstain. This leaves 41% (47% minus 6%) in the principal stratum that only abstains in the intervention arm.
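The stratum proportions follow directly from the two follow-up abstinence rates. A minimal sketch of the arithmetic, using the rounded percentages reported in the text (the variable names are illustrative):

```python
# Follow-up CO-confirmed abstinence rates reported in the text (rounded).
p_abstain_control = 0.06
p_abstain_intervention = 0.47

# Abstainers under control are all always-abstainers.
always = p_abstain_control
# Non-abstainers under intervention are all never-abstainers.
never = 1.0 - p_abstain_intervention
# The remainder abstain only when assigned to the intervention.
intervention_only = p_abstain_intervention - p_abstain_control

for name, value in [
    ("always abstain", always),
    ("never abstain", never),
    ("abstain only under intervention", intervention_only),
]:
    print(f"{name}: {value:.0%}")
```

The three proportions sum to one by construction, which is a quick consistency check on the reported 6%, 53%, and 41%.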

In Table 4, we present the estimated effect of the intervention on withdrawal symptoms in different groups, beginning with the effect in the entire sample. The intervention produced a statistically significant reduction in withdrawal symptoms in the entire sample (p<0.005), both without and with regression adjustment. The magnitude of the effect in the entire sample was large relative to the standard deviation of the baseline withdrawal scores (greater than 0.5 standard deviations).

| **Table 4** Estimated effect of intervention on withdrawal symptoms in select subgroups. |

Table 4 also presents estimates of the effect in the subgroup observed to abstain in the sample, as per the recommended subgroup analysis of Shiffman and colleagues.[22] The effect of the intervention was not statistically significant either without or with adjustment for baseline covariates. The magnitude of the adjusted estimate in those observed to abstain in the study was very small (a change in withdrawal symptoms of −0.5 units). However, estimating an intervention effect in this subgroup is not meaningful, since it represents the effect in a population that is a mixture of principal strata.

We also present estimates of the effect in groups categorized by potential abstention status at plausible values of the sensitivity parameters *u* and *v*. In those who abstain regardless of intervention assignment, the estimate of the intervention effect is a function of the sensitivity parameter *u* only. In this group, the intervention changes symptoms by an estimated −5.5 to −11.8 points on the Minnesota Withdrawal Scale as *u* ranges from 1 to 3; the absolute magnitude of the estimated effect increases monotonically over this range. The results are statistically significant (p<0.015) over the range of *u* presented. When *u* is equal to 1, we assume that the average withdrawal symptoms in the intervention arm of the group that can abstain only when assigned to the intervention are equal to those of the group that can always abstain. When *u* is equal to 3, we assume that the average withdrawal symptoms when assigned to the intervention are much lower for those who can always abstain. Since the baseline standard deviation of the Minnesota Withdrawal Scores is approximately 5.0 with a range of 0 to 32, an estimated effect of at least −5.5 (greater than 1.0 standard deviation) is clinically relevant.
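To illustrate why this estimate depends only on *u*, the sketch below treats abstainers in the intervention arm as a mixture of always-abstainers and intervention-only abstainers, weighted by the stratum proportions estimated above (6% and 41%). It rests on two assumptions that are not taken from this section: that *u* is the ratio of the mean outcome under intervention in the intervention-only stratum to that in the always-abstain stratum, and the observed means `M1` and `M0`, which are purely hypothetical placeholders. It is not the estimator used in the paper.

```python
# Stratum proportions estimated in the text.
PI_ALWAYS = 0.06
PI_INTERVENTION_ONLY = 0.41


def always_abstainer_effect(mean_abstain_intervention, mean_abstain_control, u):
    """Effect among always-abstainers under the ASSUMED ratio definition of u."""
    total = PI_ALWAYS + PI_INTERVENTION_ONLY
    w_always = PI_ALWAYS / total
    w_int_only = PI_INTERVENTION_ONLY / total
    # Mixture identity for abstainers in the intervention arm:
    #   observed mean = w_always * mu + w_int_only * (u * mu)
    # Solve for mu, the always-abstainer mean under intervention.
    mu_always = mean_abstain_intervention / (w_always + u * w_int_only)
    # Abstainers in the control arm are all always-abstainers, so their
    # observed mean needs no sensitivity parameter.
    return mu_always - mean_abstain_control


# HYPOTHETICAL observed means among abstainers (not values from the study).
M1 = 10.0  # intervention arm
M0 = 15.0  # control arm

effects = {u: always_abstainer_effect(M1, M0, u) for u in (1.0, 2.0, 3.0)}
for u, effect in effects.items():
    print(f"u={u}: estimated effect {effect:.2f}")
```

Under these assumptions the estimated effect becomes more negative as *u* grows, which matches the direction of the trend reported above. The parameter *v* never enters, because it concerns outcomes in the control arm of strata that contain no control-arm abstainers.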

The effect estimate in the group that would never have CO-confirmed abstention from smoking regardless of intervention assignment varies with the sensitivity parameter *v*. The point estimates for the effect of the intervention range from −0.8 to −7.4 over the range of the sensitivity parameter. The absolute magnitudes of the point estimates are not as great as those in the group that always abstains regardless of intervention assignment. Here, the inferences depend on the choice of the sensitivity parameter: the estimates are statistically significant (p<0.05) only when *v* is greater than 1.4. When *v* is equal to 1, we assume that the average withdrawal symptoms in the control arm of the group that can abstain only when assigned to the intervention are equal to those of the group that can never abstain. When *v* is equal to 3, we assume that the average withdrawal symptoms when assigned to the control are higher for those who can never abstain.

The estimates of the intervention effect in those who would only abstain in the intervention arm are a function of both sensitivity parameters *u* and *v*. The effect estimates in this group range from −6.3 to 3.2. The estimates are statistically significant except in a band where *v* is approximately 2. Further, as *v* increases above 2, the estimated effect changes sign, indicating that the intervention worsens symptoms.
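To compare the three strata on a common scale, the reported ranges of point estimates can be divided by the baseline standard deviation of approximately 5.0 quoted above. A minimal sketch, treating that rounded standard deviation as exact (the dictionary keys are illustrative labels):

```python
BASELINE_SD = 5.0  # approximate baseline SD reported in the text

# Reported ranges of point estimates over the sensitivity analysis.
effect_ranges = {
    "always abstain": (-5.5, -11.8),
    "never abstain": (-0.8, -7.4),
    "abstain only under intervention": (-6.3, 3.2),
}

# Express each end of each range in baseline standard deviation units.
standardized = {
    stratum: tuple(estimate / BASELINE_SD for estimate in bounds)
    for stratum, bounds in effect_ranges.items()
}

for stratum, (first, second) in standardized.items():
    print(f"{stratum}: {first:+.2f} to {second:+.2f} SD")
```

On this scale the always-abstain stratum spans roughly −1.1 to −2.4 standard deviations, while the intervention-only stratum spans both a large benefit and a moderate harm, which is why its interpretation depends so heavily on *u* and *v*.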

We examined pairwise comparisons of effect estimates among principal strata for evidence that potential abstention group membership moderated effects. The p-values for tests of moderation were dependent on the sensitivity parameters and ranged from <0.001 to 0.99. As an example of how the p-values vary over the sensitivity analysis, the p-value comparing the effect in those who always abstain versus those who abstain only when assigned to the intervention was 0.684 when both sensitivity parameters were equal to 1. However, the p-value was 0.035 when both sensitivity parameters were equal to 1.33.

We examined compliance, calculated as the percentage of used patches returned among those in the intervention group during the week of follow-up (week 4 or week 5). Forty-three of the 58 intervention participants had compliance data. Overall compliance was 87%, with 93% compliance among those with CO-confirmed abstention and 83% compliance among those without abstention (p=0.234 by t-test).

In comparison, Zhang and Rubin's[5] estimates, which do not make assumptions such as ours, give bounds on the treatment effect among those who always abstain of −26 to −2.2.