Various statistical methods have been used for data analysis in alcohol treatment studies. Trajectory analyses can better capture differences in treatment effects and may provide insight on the optimal duration of future clinical trials and grace periods. This improves on the limitation of commonly used parametric (e.g., linear) methods that cannot capture non-linear temporal trends in the data.
We propose an exploratory approach that uses more flexible smoothing mixed effects models to characterize the temporal patterns of the drinking data more accurately. We estimated the trajectories of the treatment arms for data sets from two sources: a multi-site topiramate study, and the Combined Pharmacotherapies (acamprosate and naltrexone) and Behavioral Interventions study.
Our methods illustrate that drinking outcomes of both the topiramate and placebo arms declined over the entire course of the trial but with a greater rate of decline for the topiramate arm. By the point-wise confidence intervals, the heavy drinking probabilities for the topiramate arm might differ from those of the placebo arm as early as week 2. Furthermore, the heavy drinking probabilities of both arms seemed to stabilize at the end of the study. Overall, naltrexone was better than placebo in reducing drinking over time, yet was not different from placebo for subjects receiving the combination of a brief medical management and an intensive combined behavioral intervention.
The estimated trajectory plots clearly showed non-linear temporal trends of the treatment with different medications on drinking outcomes and offered more detailed interpretation of the results. This trajectory analysis approach is proposed as a valid exploratory method for evaluating efficacy in pharmacotherapy trials in alcoholism.
Recently, there has been substantial research interest in the use of various pharmacological agents as promising adjuncts to psychosocial treatment to improve drinking outcomes among alcohol dependent individuals. The efficacy of these pharmacological agents, such as naltrexone and topiramate, has been studied in randomized double-blind clinical trials, with outcome measures obtained through the retrospective collection of person-centered drinking data using the timeline follow-back (TLFB) method (Sobell and Sobell, 1992). This method requires subjects to recollect their daily drinking pattern over a specified period of time prior to the present visit. Important events during the period for which the drinking data are collected are used to increase the accuracy of data obtained using TLFB, which has been validated as reliable in several clinical studies (Sobell and Sobell, 1992).
Various statistical methods have been used to analyze TLFB data. For instance, Johnson and colleagues (Johnson et al., 2003; 2007) condensed the daily drinking records into drinking summary data in both single-site and multi-site topiramate trials, the efficacy of which was evaluated through linear regression methods. To evaluate the efficacy of acamprosate and naltrexone for the treatment of alcohol dependence, Anton et al. (2006) used similar data reduction techniques to analyze the drinking outcomes in the Combined Pharmacotherapies and Behavioral Interventions for Alcohol Dependence (COMBINE) study. Furthermore, data sets from the multi-site topiramate study and the COMBINE study were used to derive a new endpoint, percentage of subjects with no heavy drinking days, and the efficiency of this outcome was compared with that of other traditional outcomes derived from averaged values of the drinking data (Falk et al., 2010). In both studies, drinking outcomes were summarized for the time period(s) of interest, e.g., across the whole treatment assessment period (Johnson et al., 2003), or by individual months (Anton et al., 2006; Falk et al., 2010) or weeks (Johnson et al., 2007). However, such analyses are not as efficient as those that use the original daily drinking records, which can elicit meaningful patterns of alcohol use over time and can increase the statistical power for detecting treatment effects. For example, Gueorguieva et al. (2007) used a latent class growth model to capture the heterogeneity in daily drinking outcome and re-evaluate the efficacy of naltrexone in two clinical trials that were reported previously as being negative. They identified three latent drinking groups and showed a positive treatment effect of naltrexone in both trials. Similarly, to test the sensitivity of the statistical analyses to detect a treatment effect, Liu et al. (2008) proposed a multi-level two-part random effects model to analyze the daily drinking record in a single-site topiramate study (Johnson et al., 2003), which improved the efficiency of the data analysis.
A novel exploratory method to reduce potential biases from the different data analytic approaches that have been used to analyze drinking data in pharmacotherapy trials would be to take into account the trajectory of the quantified amount of alcohol consumption. Such trajectories could provide a better illustration of the change in drinking level over time and thus would be more clinically meaningful in capturing differences in treatment effects between comparison groups. Also, in studies with a substantial amount of dropout, the trajectory method can reduce potential bias caused by dropout and improve the efficiency of the statistical analyses by using all the available data, provided the data are “missing at random” (Little and Rubin, 2006).
Trajectory analyses also can offer important insights as to the optimal duration of future clinical trials. For example, if the drinking outcomes of the two arms differed evidently at an early time and then reached a plateau, we could shorten the follow-up of such a clinical trial considerably, thereby resulting in a more efficient trial design. Furthermore, such analyses would be more closely related to the new Food and Drug Administration guidelines for including a “grace period” wherein the data are only analyzed for efficacy after sufficient time has elapsed for the medication to achieve its full pharmacological effects (Falk et al., 2010; Food and Drug Administration, 2006). Notably, however, to date there has been no pre-specified length of the grace period in pharmacotherapy trials, and its onset and duration could vary for different putative therapeutic medications. In this regard, a trajectory analysis would be helpful because a preliminary trial could be conducted before the main study to provide an informative estimate of the optimal length of the grace period.
In the literature for pharmacological treatments for alcoholism, previous analytic approaches have usually made parametric (e.g., linear) assumptions for trajectory analysis (Verbeke and Molenberghs, 2000; Fitzmaurice et al., 2004). Because the longitudinal drinking outcome could vary over time in a non-linear manner, which would be unknown in advance, it could be quite difficult to model reliably its time trend using a simple parametric function. Here, we propose more advanced and flexible models, e.g., smoothing mixed effects models, to characterize more accurately the temporal patterns in drinking outcomes, whilst accounting for the correlation among observations within the same subject. We note that Anton et al. (1998) and Kelly et al. (2008) applied the robust non-linear regression methods in alcohol studies. However, none of these papers studied the drinking trajectories for alcohol treatment trials.
Our data come from alcohol trials where subjects were treated under two different drinking paradigms – that is, those drinking up to the time of randomization (i.e., multi-site topiramate trial) and those who entered the study after a brief period of abstinence (i.e., COMBINE study). The two data sets also compared the effects of a medication with efficacy ascribed to its multiple neuropharmacological effects (i.e., topiramate) with those of 2 agents, each of which had an effect at a single neurotransmitter system (i.e., naltrexone and acamprosate in the COMBINE trial). Taken together, our choice of data sets offered us an opportunity to test different trajectories of drinking in different experimental paradigms and for various putative treatment medications.
The multi-site topiramate study was a double-blind, randomized, placebo-controlled, 14-week clinical trial of 371 men and women aged between 18 and 65 years (mean age, 47 years) who were diagnosed with alcohol dependence and drank 35 or more (men) or 28 or more (women) standard drinks per week (see Table 1 for more detail). A standard drink contains about 14 g of pure alcohol (Miller et al., 1991). The clinical trial was conducted between January 27, 2004 and August 4, 2006 at 17 sites in the U.S. (Johnson et al., 2007). In this cohort, all subjects were enrolled whilst still drinking heavily. Of them, 27% were women, and ethnic minorities constituted 15% of the sample. The study design comprised two treatment arms (i.e., topiramate or placebo), and all patients received a brief weekly psychosocial intervention, Brief Behavioral Compliance Enhancement Treatment. The medication dose was titrated from the beginning of week 0 to the end of week 5 and was maintained from the end of week 5 to the beginning of week 14. Participants had to achieve a minimum topiramate dose of 50 mg/d or the placebo equivalent to remain in the trial. Retention rates at study end among those randomized were 61.2% for the topiramate group and 76.6% for the placebo group.
The COMBINE study was a randomized, double-blind, placebo-controlled, 16-week trial of 1383 alcohol-dependent individuals aged between 19 and 80 years (median age, 44 years) (Table 1). The study was conducted at 11 academic sites in the U.S. between January 2001 and January 2004 (Anton et al., 2006). In this cohort, all subjects were enrolled following a brief period of abstinence (4 to 21 days). Of them, 31% were women, and ethnic minorities constituted 23% of the sample. There were four medication conditions (placebo, acamprosate 3 g/d, naltrexone 100 mg/d, and acamprosate plus naltrexone). Eight groups (n=1226) received medical management (MM), a 9-session intervention focused on enhancing medication adherence and abstinence. Four of these groups (n=619) also received more intensive counseling (Combined Behavioral Intervention; CBI) delivered by alcoholism treatment specialists, resulting in a 2 (acamprosate/placebo) × 2 (naltrexone/placebo) × 2 (CBI/no CBI) factorial design. A ninth group (n=157) received CBI alone, without pills or MM, and was included to address the separate question of “non-specific” effects. Each participant in the pill-taking groups took the same number of pills (up to 8) of active medication or placebo daily for 16 weeks. There were no statistically significant differences in study retention between treatment groups; although a number of people did not complete one or more aspects of treatment, 94% (group range, 92%–96%) provided complete within-treatment (weeks 1–16) drinking data.
We were interested in capturing the possible non-linear trajectories of the daily drinking outcomes, using smoothing (semi-parametric) regression methods (Wood, 2006; Wu and Zhang, 2006). In such models, the shape of the functional relationship is not predetermined but can be automatically adjusted to capture unusual or unexpected features of the data.
Specifically, the following two models were applied to compare the trajectories between treatment arms over the study period: (1) for the binary outcomes, and (2) for continuous outcomes. Below, we only show the detailed form for the first model and relegate that for the second model to supplemental materials. We only considered two treatment arms without higher order interactions (e.g., the treatment interactions in the COMBINE study) for two reasons: (1) to avoid the “curse of dimensionality” problem (Hastie and Tibshirani, 1986; 1990); and (2) to provide a clear and parsimonious illustration in the figures for comparing treatment effects over time.
In the first model, we were interested in the trajectory analysis of the binary outcomes, e.g., the probability of daily drinking, the probability of daily heavy drinking (defined as ≥4 and ≥5 standard drinks/day for women and men, respectively), and the probability of daily ‘safe’ drinking (defined as ≤1 and ≤2 standard drinks/day for women and men, respectively) (Ma et al., 2006). The model was:
$$\operatorname{logit}(p_{ij}) = f_p(t_{ij})\, I(\text{treat}_i = 0) + f_t(t_{ij})\, I(\text{treat}_i = 1) + a_i + b_i t_{ij} \tag{1}$$

where logit(·) was the logit function; I(A) was an indicator function that takes value 1 (0) if the event A is true (false); $p_{ij}$ was the subject-specific probability of the drinking measure for the ith subject on the jth day, given all observed and unobserved characteristics for subject i; “treat” was the indicator variable (0 for placebo and 1 for medication); $t_{ij} = j$ was the day (since randomization) for the ith subject; $f_t(\cdot)$ and $f_p(\cdot)$ were two unspecified smooth functions of time for the medication and placebo groups, respectively; and $a_i$ and $b_i$ were the random intercept and slope, respectively, for the ith subject. We assumed that $a_i$ and $b_i$ follow a bivariate normal distribution with mean 0 and covariance matrix Σ. We would like to emphasize that model (1) is a subject-specific model, the interpretation of which is conditional on the subject-specific random effects $a_i$ and $b_i$.
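To make the structure of model (1) concrete, the following is a minimal simulation sketch in base R. It assumes the linear predictor is the arm-specific smooth function of time plus the subject's random intercept and random slope in time. The curve shapes, the covariance matrix, the sample sizes, and the rescaling of time to (0, 1] are all hypothetical choices made for illustration; none of them are estimates from the trials.

```r
set.seed(1)

n_subj <- 40                      # hypothetical number of subjects
n_day  <- 98                      # hypothetical 14-week follow-up
treat  <- rep(c(0, 1), each = n_subj / 2)   # 0 = placebo, 1 = medication

# Hypothetical arm-specific smooth functions on the logit scale.
f_p <- function(t) 1 - 1 * t          # placebo: slow decline
f_t <- function(t) 1 - 3 * sqrt(t)    # medication: fast early decline, then flattening

# Random intercept a_i and slope b_i: bivariate normal, mean 0, covariance Sigma.
Sigma <- matrix(c(1.0, -0.2,
                  -0.2, 0.5), nrow = 2)
re <- matrix(rnorm(2 * n_subj), ncol = 2) %*% chol(Sigma)

# One row per subject-day; time is rescaled to (0, 1] for convenience.
sim <- expand.grid(day = 1:n_day, id = 1:n_subj)
sim$t     <- sim$day / n_day
sim$treat <- treat[sim$id]

# Subject-specific linear predictor and probability.
eta <- ifelse(sim$treat == 1, f_t(sim$t), f_p(sim$t)) +
  re[sim$id, 1] + re[sim$id, 2] * sim$t
sim$p     <- plogis(eta)                      # inverse logit
sim$heavy <- rbinom(nrow(sim), size = 1, prob = sim$p)
```

Because the probabilities are conditional on each subject's own random effects, averaging `sim$p` within an arm and day gives a population-level curve that generally differs from `plogis(f_p(t))` or `plogis(f_t(t))`, which is the sense in which the model is subject-specific.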
The unknown functions ft(·) and fp(·) in model (1) were estimated by splines, which are special piecewise polynomials (usually of order 3) used to approximate complex shapes through smooth curve fitting to a set of noisy observations. There are various forms of splines, such as regression splines, penalized splines, and smoothing splines (Ruppert et al., 2003; Wahba, 1990; Wu and Zhang, 2006). Good performance of regression splines depends strongly on the location of knots and the number of knots. Smoothing splines overcome these drawbacks via taking all of the distinct time points as knots but at the price of an increase of computational burden. Penalized splines pre-specify the number of knots, and use a roughness penalty to control the smoothness of resulting estimated functions. The penalty term has a smoothing parameter that controls the trade-off between fidelity to the data and smoothness of the fitted spline. Selection of the penalty term and smoothing parameter is important for good curve fitting (Wood, 2006). Ruppert et al. (2003) showed that penalized splines perform well in various settings. Therefore, we used the penalized spline method for estimation of the smooth functions ft(·) and fp(·).
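The trade-off controlled by the smoothing parameter can be seen in a small base R sketch. This is not the basis or penalty used by mgcv; it assumes a simple truncated-line basis with a ridge penalty on the knot coefficients, in the spirit of Ruppert et al. (2003), and uses simulated data. The knot count, the two values of `lambda`, and the helper name `fit_pspline` are hypothetical.

```r
set.seed(2)

# Simulated noisy observations of a smooth curve.
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)

# Pre-specified knots, as in a penalized spline.
knots <- seq(0.05, 0.95, length.out = 20)

# Design matrix: intercept, linear term, and one truncated line per knot.
X <- cbind(1, x, outer(x, knots, function(x, k) pmax(x - k, 0)))

# Roughness penalty: only the knot coefficients are penalized.
D <- diag(c(0, 0, rep(1, length(knots))))

# Penalized least squares fit for a given smoothing parameter lambda.
fit_pspline <- function(lambda) {
  beta <- solve(crossprod(X) + lambda * D, crossprod(X, y))
  drop(X %*% beta)
}

rough  <- fit_pspline(1e-4)  # small lambda: follows the data closely
smooth <- fit_pspline(1e2)   # large lambda: shrinks towards a straight line

rss <- c(rough = sum((y - rough)^2), smooth = sum((y - smooth)^2))
```

A small `lambda` gives a lower residual sum of squares but a wigglier curve, and a large `lambda` gives the reverse. In practice the smoothing parameter is chosen by a data-driven criterion rather than fixed by hand as it is here.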
Model (1) belongs to the class of generalized additive mixed models (Lin and Zhang, 1999), which can be fitted by the gamm function in the R package mgcv (R Development Core Team, 2010). The mgcv package also provides auxiliary functions for extracting individual additive effects and computing point-wise confidence intervals. The confidence intervals shown in the figure are “point-wise” intervals; that is, they are valid (given appropriate assumptions) for each time point considered in isolation. However, even when there appears to be a separation of drug vs. placebo confidence intervals over several time points, it is not appropriate to conclude that drug and placebo differ significantly; simultaneous confidence bands valid across all time points would be wider than those shown. Replication is a good method for establishing the validity of exploratory analyses showing apparent differences at some time points.
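The difference between point-wise and simultaneous intervals can be illustrated with a short base R sketch. It is not part of the analysis reported here. It assumes a fitted curve on the logit scale and a covariance matrix for that fit over a time grid, both hypothetical, and uses a common simulation-based approach: the simultaneous critical value is taken as the 95% quantile of the maximum absolute standardized deviation across the grid.

```r
set.seed(3)

grid <- seq(0, 1, length.out = 50)   # rescaled time grid

# Hypothetical fitted logit-scale curve and covariance of the fit across the grid.
fit <- 1 - 2 * grid
V   <- 0.04 * exp(-abs(outer(grid, grid, "-")) / 0.2)
se  <- sqrt(diag(V))

# Point-wise critical value: valid at one pre-specified time point.
crit_pw <- qnorm(0.975)

# Simultaneous critical value: simulate deviations of the whole curve and
# take the 95% quantile of the maximum absolute standardized deviation.
B <- 5000
Z <- matrix(rnorm(B * length(grid)), nrow = B) %*% chol(V)
max_abs  <- apply(abs(sweep(Z, 2, se, "/")), 1, max)
crit_sim <- unname(quantile(max_abs, 0.95))

# Back-transform both bands to the probability scale.
pw_band  <- plogis(cbind(lower = fit - crit_pw  * se, upper = fit + crit_pw  * se))
sim_band <- plogis(cbind(lower = fit - crit_sim * se, upper = fit + crit_sim * se))
```

Under these assumptions `crit_sim` exceeds `qnorm(0.975)`, so the simultaneous band contains the point-wise band at every grid point. Non-overlap of point-wise intervals at several time points is therefore weaker evidence than non-overlap of simultaneous bands.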
We analyzed four drinking outcomes: daily drinking, heavy drinking, safe drinking, and the log number of drinks on a drinking day. The first three were binary outcomes, while the last one was a continuous variable. For illustration, we only show the estimated trajectories of the probability of heavy drinking for topiramate and naltrexone. Other results are relegated to the supplemental materials.
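As an illustration of how the four outcomes can be derived from daily TLFB records, the sketch below applies the sex-specific cut-offs stated earlier to a toy data frame. The data and column names are hypothetical. It also assumes that abstinent days count as ‘safe’ drinking days, which follows from the ≤ cut-off as written but is not stated explicitly.

```r
# Hypothetical toy TLFB records: one row per subject-day.
tlfb <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2),
  sex    = c("F", "F", "F", "M", "M", "M"),
  day    = c(1, 2, 3, 1, 2, 3),
  drinks = c(0, 1, 4, 2, 5, 3)       # standard drinks on that day
)

# Sex-specific cut-offs (standard drinks per day).
heavy_cut <- ifelse(tlfb$sex == "F", 4, 5)  # heavy: >= 4 (women), >= 5 (men)
safe_cut  <- ifelse(tlfb$sex == "F", 1, 2)  # 'safe': <= 1 (women), <= 2 (men)

# Three binary outcomes.
tlfb$any_drinking   <- as.integer(tlfb$drinks > 0)
tlfb$heavy_drinking <- as.integer(tlfb$drinks >= heavy_cut)
tlfb$safe_drinking  <- as.integer(tlfb$drinks <= safe_cut)

# Continuous outcome: log number of drinks, defined on drinking days only.
tlfb$log_drinks <- ifelse(tlfb$drinks > 0, log(tlfb$drinks), NA)
```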
For the topiramate study, there were 183 subjects in the topiramate group and 188 subjects in the placebo group.
For the COMBINE study, we analyzed the data based on three scenarios.
Case 1: Subjects who received MM but no CBI. The naltrexone group (n=302) included subjects who took naltrexone only (n=154) plus those who received naltrexone and acamprosate (n=148), whereas the corresponding placebo group (n=305) included subjects who took acamprosate only (n=152) plus placebo recipients (n=153).
Case 2: Subjects who received both MM and CBI. The naltrexone group (n=312) included subjects who took naltrexone only (n=155) plus those who got naltrexone and acamprosate (n=157), whereas the corresponding placebo group (n=307) included subjects who received acamprosate only (n=151) plus placebo recipients (n=156).
Case 3: All the subjects in the 8 combinations of the factorial design. The naltrexone group had 614 subjects, and the corresponding placebo group had 612 subjects.
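The group sizes in the three cases can be checked with a few lines of R, using only the counts reported above. The object names are illustrative.

```r
# Case 1: MM but no CBI.
case1 <- c(naltrexone = 154 + 148, placebo = 152 + 153)

# Case 2: MM and CBI.
case2 <- c(naltrexone = 155 + 157, placebo = 151 + 156)

# Case 3: all 8 pill-taking combinations, i.e., Case 1 plus Case 2.
case3 <- case1 + case2

total_pill_takers <- sum(case3)
```

The totals agree with the reported arm sizes (614 and 612) and with the 1226 subjects in the eight pill-taking groups.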
In Figure 1, the top-left panel shows the estimated trajectories of the probability of daily heavy drinking in the topiramate study. The top-right and bottom-left panels show naltrexone versus placebo in the COMBINE study among subjects who received MM and CBI and among subjects who received MM but no CBI, respectively. The bottom-right panel shows naltrexone versus placebo among all subjects who took pills (all 8 combinations) in the COMBINE study.
For the topiramate study, it can be seen that the probability of heavy drinking for both treatment and placebo decreased over the entire course of the trial, but it decreased more rapidly in the treatment arm. The large apparent differences shown in the graphs suggested large effect sizes at various points. There was a separation in the trajectory of the estimated heavy drinking probabilities (i.e., separation of 95% confidence intervals at each time point) between the two arms as early as week 2, despite the fact that topiramate was not fully titrated until week 6. The drinking trajectories of the topiramate arm generally stabilized around week 11; in contrast, the trajectories of the placebo were maintained throughout the trial. Such information would suggest that 14 weeks is a sufficient duration to show an effect of topiramate compared with placebo in future pharmacotherapy trials for alcoholism. Note that the confidence intervals in the plots are point-wise rather than simultaneous. Therefore, they can be used only to infer the presence or absence of a statistically significant effect at a time point of interest that is fixed a priori.
As shown in the top-right plot of the figure, for the subjects who took both MM and CBI, the estimated trajectories for the two arms had similar trends with respect to the daily heavy drinking outcomes. We could not distinguish between the estimated trajectories of the two arms. On the other hand, for the subjects who received MM but no CBI, the two arms separated at each time point very early during the treatment period (within a day or two) and achieved most of their separation around week 7, with some additional separation continuing until the end of the trial. The estimated trajectory for the probability of heavy drinking in the naltrexone arm was flat after week 7, while it showed a slightly increasing trend in the placebo arm.
For the subjects who took pills in the COMBINE study, it can be seen that the estimated trajectories for naltrexone and placebo started to separate at each time point at about week 3. The differences in location and trend between the two treatment arms suggested that naltrexone paired with MM was efficacious at reducing the daily probability of heavy drinking.
We proposed smoothing mixed models as a flexible exploratory method to characterize the trajectories between the treatment arms over the entire study period. These models described more accurately the non-linear temporal trend in the data compared with traditional parametric models.
It seemed that topiramate reduced the probability of daily heavy drinking more rapidly than placebo. Such results were consistent with classical linear mixed effects model findings (Johnson et al., 2007) that drinking decreased more rapidly in topiramate recipients than in placebo recipients, and the spline results provided a reasonable estimate of how that difference changed over time. In the COMBINE study, naltrexone appeared to have a beneficial effect in reducing heavy drinking if the patients were not receiving CBI (i.e., they were receiving MM alone). However, such a beneficial effect was not observed for subjects receiving CBI, suggesting a naltrexone-CBI interaction. We also applied nonparametric models to other daily drinking outcomes and obtained similar patterns. Here we presented only the results for daily heavy drinking data to illustrate the power and practical value of this new modeling technique.
Trajectory plots showed clearly the temporal trend of the treatment with medications on daily heavy drinking. This provided insight on the expectation of pharmacological effects over time. For example, the daily heavy drinking probability in the topiramate arm improved over time and continued to do so even at the end of the trial. In contrast, we observed that the treatment effect of naltrexone reached its asymptote around week 7 and, thereafter, showed relative deterioration. There are two plausible explanations for these observations. First, this might be because of the difference in the trial design. Whilst those who received naltrexone entered the trial after a brief period (about 4 to 21 days) of abstinence, those in the topiramate study were enrolled when they were still drinking heavily. Hence, those who got naltrexone might have been more prone to regress to the mean of some drinking whilst those in the topiramate group were still showing improvement. Second, unlike naltrexone and acamprosate, which work through single, albeit different, neurotransmitter systems (i.e., opioid and glutamate, respectively), topiramate has effects on multiple neuronal systems. As a result, the treatment effect of topiramate may be less likely to wane due to neuroadaptation and, therefore, more likely to be sustained or enhanced over time (Johnson, 2008). The clinical implication of this finding is that the longer the clinical trial, the greater would be the effect of topiramate compared with naltrexone on drinking outcomes.
From the point of consideration of “grace periods” after which to measure efficacy in the design of future clinical trials, the effect of naltrexone reached its peak at about week 7; hence, measuring outcome from this period up to 16 weeks probably would optimize the likelihood of finding a positive treatment effect. This is consistent with Falk et al. (2010), who found that a 2-month grace period resulted in a relatively high and stable naltrexone treatment effect. In contrast, topiramate’s treatment effect was evident at about 2 weeks, and measuring outcome from that time point forward would optimize the likelihood of finding a positive treatment effect. Whilst the current presentation provides estimates of potential grace periods prior to the evaluation of efficacy, there are two considerations that should be borne in mind. First, depending upon the pharmacological effects of different putative therapeutic medications, estimates of the projected grace period could vary. Second, because efficacy studies are typically of short duration, longer-term trials are needed to improve the validity of the grace period estimates. Nevertheless, our paper provides a reasonable method by which estimations of the grace period for use in pharmacotherapy trials for alcohol dependence could be explored.
It also would be of interest to see whether the differences between treatment arms would still persist if we shortened the follow-up of such clinical trials. For this reason, we conducted the trajectory analysis with a shortened study period for some of the comparisons (results available upon request). For naltrexone among all subjects, significant results for the point-wise comparison of the probability of heavy drinking were achieved at the end of follow-up using model (1) if we shortened the follow-up to 5 weeks, i.e., if we used only the first 5 weeks of data. The results provided insights on the optimal duration in future clinical trial design.
In our trajectory analysis, we only assumed random intercept and slope in our models. However, just like the fixed effect (mean function) of treatment, such random effects also could vary with time in an unspecified way. More advanced models, such as functional mixed models (Guo, 2004; Zhang, 2004), could be used to give a more flexible subject-specific fit to allow for individual curves.
It would be straightforward to include other independent variables in our trajectory analysis, such as age, sex, and baseline drinking, among many others. In the current studies, all these variables were balanced by the randomization so that such an adjustment was unnecessary. We currently are examining other data sets with unbalanced potential confounding variables.
It often is valuable to conduct an inferential statistical test to compare the response profiles between the medication and placebo groups. The null hypothesis is that there would be no difference between medication and placebo in drinking outcome after the grace period. Specifically, we considered testing the equivalence of two nonlinear functions in models (1) and (2), e.g., testing the hypothesis H0:fp(t) = ft(t) for t after the grace period in model (1). Zhang and Lin (2003) had developed an inferential procedure for testing a similar hypothesis. However, ready-to-use software is not available for large data sets such as those in our studies. Developing software for this inferential test is one goal of our ongoing research.
Missing data are a major issue in longitudinal clinical studies. In this paper, we used random effects models, which can still yield consistent results when the data are missing at random. If the data are not missing at random, more advanced models, e.g., joint random effects models (Liu, 2009; Johnson et al., 2011), could be used to investigate the relation between dropout and longitudinal drinking outcomes.
We acknowledge that further replicative studies using different samples could be used to strengthen our exploratory analysis. Indeed, such studies are planned in the future using a data set from a recently completed pharmacogenetic trial of ondansetron for the treatment of alcohol dependence (Johnson et al., 2011). Perhaps, however, a more definitive prospective validation could come from a comparison of a holdout sample with that of subjects who remained in the study to determine whether, and by how much, there was either convergence or divergence in the spline estimates.
This research was supported by NIAAA grant RC1 AA019274 and AHRQ grant R01 HS020263. Dr. Zhang also was supported by NIH grant R01 CA085848-11, R37 AI031789-20, and R01 MH084022-02. The authors thank Drs. Daniel Falk and Raye Z. Litten for helpful discussions and comments, and Dr. Simon Wood for help in using the mgcv package. Robert H. Cormier, Jr. BA, provided assistance in preparing the manuscript.
A. R Code for Fitting Model (1) Using mgcv Package

    library(mgcv)
    # "binary" is the binary response; "dayp" is the time for the placebo group;
    # "dayt" is the time for the medication group; "day" is the time for all groups;
    # "id" is the subject.
    # Random intercept and slope in "day" for each subject, as in model (1).
    fitm <- gamm(binary ~ treatment + s(dayp, bs = "ps") + s(dayt, bs = "ps"),
                 family = binomial(link = "logit"), data = datas,
                 random = list(id = ~ 1 + day))
    dat <- c(1:112) / 112  # transform the duration of 16 weeks (days 1 to 112) to the (0, 1] interval
    newp <- data.frame(dayp = dat, dayt = 0, treatment = 0)
    estp <- predict(fitm$gam, newp, se.fit = TRUE)
    fit.p <- estp$fit        # prediction for the placebo group (logit scale)
    fit.p.se <- estp$se.fit  # standard error of the prediction for the placebo group
    newt <- data.frame(dayp = 0, dayt = dat, treatment = 1)
    estt <- predict(fitm$gam, newt, se.fit = TRUE)
    fit.t <- estt$fit        # prediction for the medication group (logit scale)
    fit.t.se <- estt$se.fit  # standard error of the prediction for the medication group
    # plot the estimated trajectories for two treatment arms with 95% point-wise confidence limits;
    # plogis() is the inverse logit, exp(x) / (1 + exp(x))
    week <- dat * 112 / 7
    z <- qnorm(0.975)
    plot(week, plogis(fit.p), type = "l", lty = 2, xlab = "Week ", ylab = "Probability of ")
    lines(week, plogis(fit.p + z * fit.p.se), lty = 3)
    lines(week, plogis(fit.p - z * fit.p.se), lty = 3)
    lines(week, plogis(fit.t))
    lines(week, plogis(fit.t + z * fit.t.se), lty = 3)
    lines(week, plogis(fit.t - z * fit.t.se), lty = 3)

B. R Code for Fitting Model (2) Using mgcv Package

    # "continous" is the continuous response; all other variables are defined in part A.
    # A Gaussian family is used because the response is continuous.
    fitm <- gamm(continous ~ treatment + s(dayp, bs = "ps") + s(dayt, bs = "ps"),
                 family = gaussian(link = "identity"), data = datas,
                 random = list(id = ~ 1 + day))
Dr. Johnson has served as a consultant to Johnson & Johnson (Ortho-McNeil Janssen Scientific Affairs, LLC), Transcept Pharmaceuticals, Inc., D&A Pharma, Organon, ADial Pharmaceuticals, LLC, Psychological Education Publishing Company (PEPCo), LLC, and Eli Lilly and Company. Drs. Chen, O’Quigley, Isaac, Zhang, and Liu and Mr. Wang reported no biomedical financial interests or potential conflicts of interest.
Clinical Trials Registration
The ClinicalTrials.gov identifiers were NCT00006206 for the COMBINE study and NCT00210925 for the multi-site topiramate study.