We have presented longitudinal models for binary daily abstinence data in smoking cessation trials. Our approach uses the finest-level data on quit history commonly available and provides a nuanced description of the evolution of the outcomes. By comparison, the standard approach of summarizing the point-prevalence abstinence at a single designated time ignores the bulk of the available information, calling into question the common practice of collecting daily smoking data [24].
The standard analysis of point-prevalence abstinence is subject to the arbitrary choice of assessment time. Although most studies conduct the assessment at EOT, the length of the recommended treatment phase varies by drug [25], and therefore the reported point-prevalence abstinence could refer to 8, 10 or 12 weeks of treatment. Although meta-analysis suggests that the treatment effect is insensitive to study duration [26], quit rates generally decline over time, so this practice can diminish cross-trial comparability. In contrast, longitudinal analysis models the daily quit probability, rendering the interpretation independent of other elements of the study design.
As the longitudinal analysis uses all time points in the treatment period, the drug effect OR represents an effect averaged across time, whereas the standard analysis reflects the effect only at EOT. Our data analysis revealed higher ORs with the longitudinal models, suggesting some variation of drug effect over time. Regardless of the estimated ORs, the corresponding CIs from longitudinal models are generally narrower than those from simple logistic models, with the size of the difference depending on the within-subject correlation [20].
A randomized longitudinal study with repeated outcomes collected at k > 1 times can reduce sample size requirements compared to a design with the outcome collected at a single time. Neuhaus and Segal [20] found that to achieve the same power for detecting a designated OR, the sample size N1 in a longitudinal design is smaller than the sample size N0 in a conventional design, in the ratio N1/N0 = [1 + (k − 1)ρ]/k, where ρ is the within-subject correlation. The saving is greatest when ρ = 0 and declines with increasing ρ, until there is no saving at all when ρ = 1.
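As a rough numerical illustration of this sample size argument, the following sketch assumes the ratio takes the familiar design-effect form N1/N0 = [1 + (k − 1)ρ]/k, with a common (exchangeable) within-subject correlation ρ. The function name and the example values of k and ρ are hypothetical and are not taken from our trial.

```python
def sample_size_ratio(k: int, rho: float) -> float:
    """Return N1/N0 under the assumed form [1 + (k - 1) * rho] / k.

    k   : number of repeated outcomes per subject (k >= 1)
    rho : assumed common within-subject correlation, between 0 and 1
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return (1.0 + (k - 1) * rho) / k


# Hypothetical example: 49 daily outcomes (7 weeks) at several correlations.
for rho in (0.0, 0.5, 0.9, 1.0):
    print(f"rho = {rho:.1f}: N1/N0 = {sample_size_ratio(49, rho):.3f}")
```

Under this assumed form, the ratio equals 1/k when ρ = 0 and rises to 1 when ρ = 1, matching the qualitative behavior described above.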
A longitudinal analysis of daily smoking status offers an even greater potential efficiency gain when some observations are missing, because it can include the available data from all randomized subjects, even those lost to follow-up, whereas the standard approach has to either exclude the drop-outs or assume that they continued to smoke. Although such an assumption is held to be conservative, there is an increasing awareness that its indiscriminate use may lead to bias and lack of comparability between studies [26]. Fortunately, our example had few missing observations.
An important advantage of longitudinal modeling is its ability to incorporate time-varying predictors. In some studies, treatment (such as drug dose) changes over time by design [28]. Even if the treatment is constant, including the treatment-by-time interaction allows us to test whether its effects change over time, as in our example. We investigated the influence of smoking history on later success by coding summary measures of history as time-varying predictors. Our results suggest that history is an important independent predictor of future outcome. Generally, there is concern that including outcome history may over-adjust and thereby attenuate treatment main effects. In our results, the drug effect OR in the ME model was 2.30 before adjusting for history, declining to 2.14 after adjustment. Possibly the history variable absorbed some of the treatment effect, resulting in the decreased OR. Still, the large size and strong statistical significance of the adjusted effect suggest that bupropion continues to have an effect no matter how long one has been taking it and regardless of its effect to date.
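To make the idea of coding outcome history as time-varying predictors concrete, the sketch below derives two illustrative summaries from one subject's daily abstinence series: the proportion of prior days abstinent and the length of the abstinent streak entering each day. These two measures and the function name are assumptions chosen for illustration; they are not necessarily the summary measures used in our analysis.

```python
from typing import List, Tuple


def history_covariates(abstinent: List[int]) -> List[Tuple[float, int]]:
    """Build per-day history predictors from a 0/1 daily abstinence series.

    For each day t, returns (proportion of days before t that were abstinent,
    number of consecutive abstinent days immediately preceding t).
    Only outcomes strictly before day t are used, so the predictors never
    include the outcome they are meant to predict.
    """
    rows = []
    total_abstinent = 0  # abstinent days seen so far
    streak = 0           # current run of abstinent days
    for day, y in enumerate(abstinent):
        prior_prop = total_abstinent / day if day > 0 else 0.0
        rows.append((prior_prop, streak))
        total_abstinent += y
        streak = streak + 1 if y == 1 else 0
    return rows


# Hypothetical series: abstinent on days 1, 2 and 4; smoked on day 3.
for day, (prop, streak) in enumerate(history_covariates([1, 1, 0, 1]), start=1):
    print(day, round(prop, 3), streak)
```

Each resulting row can be attached to the corresponding subject-day record and entered in the model alongside treatment and time.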
Although the ME and GEE models work similarly in many cases, one must bear in mind that their regression coefficients have different interpretations [17]. The drug effect OR in the ME model compares the odds of the outcome for a person taking the drug with the odds for the same person not taking the drug. In contrast, the GEE OR represents the average OR of the drug group compared to the placebo group. Estimates from GEE are generally smaller than those from ME, and the attenuation increases with the between-subjects variability. After applying the scale factor as we illustrated, the two estimates are comparable [18].
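The relationship between subject-specific and population-averaged estimates can be sketched numerically. The code below assumes a logistic model with a normally distributed random intercept and uses the well-known approximation in which the marginal log OR equals the conditional log OR divided by sqrt(1 + c²σ²), with c = 16√3/(15π). This approximation, the function name, and the variance values in the example are assumptions for illustration and may differ from the scale factor applied in our analysis.

```python
import math

# Squared constant from the logistic-normal approximation, about 0.346.
C_SQUARED = (16.0 * math.sqrt(3.0) / (15.0 * math.pi)) ** 2


def marginal_or(conditional_or: float, random_intercept_var: float) -> float:
    """Approximate the population-averaged OR from a subject-specific OR.

    conditional_or       : OR from a random-intercept logistic (ME) model
    random_intercept_var : estimated variance of the random intercept
    """
    if conditional_or <= 0:
        raise ValueError("an odds ratio must be positive")
    if random_intercept_var < 0:
        raise ValueError("a variance cannot be negative")
    scale = math.sqrt(1.0 + C_SQUARED * random_intercept_var)
    return math.exp(math.log(conditional_or) / scale)


# Hypothetical variances: attenuation toward 1 grows with heterogeneity.
for var in (0.0, 1.0, 4.0, 9.0):
    print(f"variance = {var:.1f}: marginal OR = {marginal_or(2.30, var):.2f}")
```

With zero between-subject variance the two ORs coincide; as the variance grows, the marginal OR shrinks toward 1, which is the attenuation described above.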
A marginal approach like GEE is problematic when one wishes to model time-varying effects [21]. As shown in our example, GEE is unreliable when one includes the outcome histories as predictors. Moreover, ME is preferable when there is substantial dropout, because its parameters can be estimated consistently under weaker assumptions about the missing-data mechanism [29].
A potential limitation of our analysis is that we used treatment phase data only. Given the logarithmic nature of relapse curves [30], outcomes at later follow-ups such as 6 or 12 months are considered superior indicators of treatment success. Commonly, however, daily smoking data are either not collected or are unreliable after EOT, limiting the use of daily data beyond that point. Nevertheless, the analysis we presented provides a way to evaluate the dynamic process in the treatment period using all the available information; one may conduct separate analyses on later outcomes using standard methods.
Another limitation of our analysis is its use of TLFB data, which are self-reported and thus may be inaccurate for the fraction of subjects who falsely report abstinence [31]. Nevertheless, our data suggest that the drug effect is similar whether we use self-reported or verified EOT point-prevalence abstinence as the outcome. One possible reason is that subjects in both arms are equally likely to report false abstinence and therefore any biases are offsetting. Models for daily abstinence would not be biased by subjects who under-report cigarette counts, as long as they do not claim abstinence on smoking days.
TLFB data are typically collected every one or two weeks and are therefore potentially subject to recall bias, which may affect estimates of within-subject correlation if the subjects incorrectly report the same or very similar counts for all the days being recorded at each visit. To address this possibility, we repeated the analysis using only data from the last day of each week (results not shown), giving us a single day’s data from each weekly TLFB questionnaire. We observed the same large correlation, suggesting that it is real and not simply an artifact of recall bias. Moreover, the drug effect OR was similar to that from the daily data, with a slightly wider CI. This is expected because the efficiency gain from using daily vs. weekly data is substantial only if the correlation is small. Even if the efficiency gain is minimal, analyzing daily data reveals changes occurring on time scales shorter than a week, as illustrated in our example. The advent of electronic diaries, allowing collection of cigarette counts by ecological momentary assessment, may obviate the need for daily summarization of cigarette counts.
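The weekly subsampling and the accompanying efficiency argument can be sketched as follows. The first function keeps the last day of each complete week from a daily series. The second compares required sample sizes for daily versus weekly outcomes, assuming the design-effect form [1 + (k − 1)ρ]/k with an exchangeable correlation ρ. Both function names, the 7-week study length, and the values of ρ are hypothetical.

```python
from typing import List, Sequence


def last_day_of_each_week(daily: Sequence[int], days_per_week: int = 7) -> List[int]:
    """Keep the outcome from the final day of every complete week."""
    return list(daily[days_per_week - 1::days_per_week])


def relative_sample_size(k_daily: int, k_weekly: int, rho: float) -> float:
    """Sample size needed with daily data relative to weekly data.

    Assumes each design has ratio [1 + (k - 1) * rho] / k versus a
    single-outcome design; values near 1 mean daily data add little.
    """
    def ratio(k: int) -> float:
        return (1.0 + (k - 1) * rho) / k

    return ratio(k_daily) / ratio(k_weekly)


# Hypothetical 7-week treatment phase: 49 daily or 7 weekly outcomes.
for rho in (0.0, 0.5, 0.9):
    print(f"rho = {rho:.1f}: daily/weekly = {relative_sample_size(49, 7, rho):.3f}")
```

Under these assumptions the daily design needs only one seventh of the weekly sample size when ρ = 0, but nearly the same sample size when ρ is large, consistent with the slightly wider CI we observed for the weekly data.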
One could in principle obtain a more efficient analysis from the series of daily cigarette counts. Because such data are subject to severe heaping, in the form of over-reporting of round numbers of cigarettes smoked [5], there is substantial potential for bias in such analyses, and it has been considered preferable to use the daily abstinence indicators. As we have suggested, another approach is to model the duration of abstinent and smoking episodes, incorporating the possibility of permanent recovery and relapse [6]. Such models, when estimated exclusively from treatment-period data, can make excellent predictions for remote long-term outcomes [32]. With current types of data, the day is still the smallest time interval, but as electronic recording devices become more common, it will be possible to measure inter-cigarette intervals to the second, providing for an even finer analysis.