The purpose of this paper is to explore more comprehensive methods to analyze antiretroviral non-adherence data. Using illustrative data and simulations, we investigated the value of using binary logistic regression (LR; dichotomized at 0% non-adherence) versus a hurdle model (combination of LR plus generalized linear model for >0% non-adherence) versus a zero-inflated negative binomial (ZINB) model (simultaneously modeling 0% non-adherence and >0% non-adherence). In simulation studies, the hurdle and ZINB models had similar power, and both had higher power than LR alone. The hurdle model had higher power than ZINB in settings where covariate effects were restricted to one or the other part of the model (0% non-adherence or degree of non-adherence). The hurdle and ZINB models are powerful and valuable approaches for analyzing adherence data and yield a more complete picture than LR alone. We recommend adoption of this methodology for future antiretroviral adherence research.
Near-perfect adherence to antiretroviral (ARV) therapy is necessary to successfully suppress HIV viral load, improve immunologic response, decrease HIV-associated morbidity and mortality, and reduce risk of HIV transmission. [1–4] The assessment of adherence has long been the focus of many studies. [5–7] Numerous direct measurement methods (such as quantification of concentrations of active drug or metabolites in blood, urine, and hair) and indirect methods (such as patient self-report, [5–7] pharmacy refill records, [11, 12] pill counts, and use of MEMS caps [5, 7]) have been used, but uncertainty remains over the best approach to assess medication adherence. While considerable effort has been expended on establishing superior means of medication adherence assessment, the ideal method to analyze adherence data has received far less attention and there is little consensus regarding a standard method to operationalize this measure.
As a continuous variable, adherence data are typically highly left-skewed and have a large percentage of values at 100%. Therefore, to date, the main methods of analyzing this outcome variable have consisted of dichotomizing at arbitrary cutoffs of 90%, 95%, or 100% and using binary logistic regression (LR), or categorizing subjectively at various cut-points, such as < 70%, 70–90%, and > 90%, and analyzing with ordinal LR. These cut-points are usually determined post hoc, and the resulting analyses may lead to loss of power, loss of valuable information, and misleading conclusions. Additionally, the notion of grouping and ultimately equating an individual with 99% adherence, in an example where the cutoff is 100%, to someone with 0% adherence has little value in a clinical setting.
A closer look at the distribution of adherence data based on self-report often reveals a substantial proportion of subjects with 100% adherence and a gradually decreasing spread of individuals with less than 100% adherence. The distribution of this gradually decreasing spread of < 100% adherence data can often be described approximately as the mirror image of a gamma distribution. Generalized Linear Models using a gamma distribution and log link (GLM-GL) can be used when the outcome of interest is continuous, non-negative, and right-skewed. The main assumption for these models is that the standard deviation of the outcome is proportional to the mean. Medication adherence data are also continuous and non-negative, and can be transformed to become right-skewed by subtracting percent adherence from 100%, describing the outcome in terms of percent non-adherence. This transformation results in a large proportion of values at zero percent and an approximate gamma distribution for those with > 0% non-adherence. Therefore, it is reasonable to analyze these data using a two-component, complementary approach: initially, dichotomizing non-adherence at a specific cut-off (such as 0%), creating a binary variable, and utilizing binary LR; then, employing GLM-GL to analyze the non-zero portion of the data. This approach is analogous to the use of hurdle models, which are two-part processes. Generally, in the first part a binary variable that measures whether a response falls above or at/below a certain limit is modeled (in this case ARV non-adherence dichotomized at 0% is modeled using binary LR) and the second part explains the observations that fall above that limit (in this case non-adherence > 0% is modeled using GLM-GL). The use of GLM-GL plus LR will be referred to as the hurdle model throughout the text.
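The transformation and the two-part split described above can be sketched in a few lines. This is only an illustration of the data preparation, not the authors' code: the function names and the example values are hypothetical, and the actual model fitting (LR and GLM-GL) is left to a statistics package.

```python
def to_non_adherence(adherence_pct):
    """Convert percent adherence (0-100) to percent non-adherence."""
    return [100.0 - a for a in adherence_pct]


def split_two_part(non_adherence_pct, cutoff=0.0):
    """Prepare the two outcomes used by a two-part (hurdle) analysis.

    zero_indicator[i] is 1 when the value is at or below the cutoff
    (the coding 1 = 0% non-adherence), otherwise 0.
    positive_part keeps only the values above the cutoff; these would
    be the input to the gamma/log-link component.
    """
    zero_indicator = [1 if y <= cutoff else 0 for y in non_adherence_pct]
    positive_part = [y for y in non_adherence_pct if y > cutoff]
    return zero_indicator, positive_part


# Made-up adherence percentages for six hypothetical participants.
adherence = [100.0, 100.0, 95.0, 100.0, 66.7, 0.0]
non_adherence = to_non_adherence(adherence)
is_zero, positives = split_two_part(non_adherence)
```

The `cutoff` argument is included because the text notes that a cut-off other than 0% could be chosen; with the default, the binary outcome is exactly "perfectly adherent or not".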
Alternatives to the hurdle model include the zero-inflated Poisson and negative-binomial (ZINB) models. Using percent non-adherence as the outcome, both accommodate excess zeroes and extreme right skewness in the non-zero part of the data. The Poisson model assumes variance equal to the mean, as determined by covariates, while the negative binomial model posits a quadratic variance-to-mean relationship. Although conventionally used for counts, both models can be used for nonnegative outcomes taking on non-integer values. This approach differs from the hurdle model in modeling the observed zeroes as a latent or unobservable mixture of so-called “structural zeroes” (the zero-inflated part of the distribution, representing patients who have 0% non-adherence under any circumstances) and zeroes arising randomly from the Poisson or negative binomial distribution (representing individuals who sometimes have 0% non-adherence and sometimes not). In the Discussion section of this paper, we consider whether structural zeroes make sense in this context.
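The difference between the two treatments of zeroes can be written out explicitly. The notation below is assumed for illustration and does not come from the source: \pi is the hurdle probability of a zero, f the density of the positive values, \psi the probability of a structural zero, g the Poisson or negative binomial probability function with mean \mu, and \alpha the negative binomial dispersion parameter. The zero-inflated form is written for count-valued y for simplicity.

```latex
% Hurdle model: every zero comes from the binary part
P(Y = 0) = \pi, \qquad
p(y \mid y > 0) = f(y), \qquad
E[Y] = (1 - \pi)\, E[Y \mid Y > 0].

% Zero-inflated model: zeroes are a mixture of structural zeroes
% and zeroes arising from the count distribution
P(Y = 0) = \psi + (1 - \psi)\, g(0), \qquad
P(Y = y) = (1 - \psi)\, g(y) \quad (y > 0).

% Variance functions of the count component with mean \mu
\text{Poisson: } \operatorname{Var}(Y) = \mu, \qquad
\text{negative binomial: } \operatorname{Var}(Y) = \mu + \alpha \mu^{2}.
```

In the hurdle form the zero probability is modeled directly, whereas in the zero-inflated form the observed zero probability depends on both \psi and the count distribution, which is why the two models can attribute covariate effects on zeroes differently.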
In this study we applied LR, hurdle (using GLM-GL in addition to binary LR), and ZINB models to self-reported ARV non-adherence baseline data from a randomized controlled trial, then utilized simulated data with known population parameters to explore situations in which one or both of the two-component models might be preferred.
We analyzed baseline data from the Healthy Living Project (HLP), a multi-site randomized controlled trial to determine the effect of a behavioral intervention on sexual risk behaviors in HIV-positive participants. Baseline data were collected on 2,845 HIV-positive individuals who were on ARVs. Mean ARV adherence was assessed using the AIDS Clinical Trials Group (ACTG) measure of three-day adherence for each ARV and averaged for a participant's complete ARV regimen. For the purpose of demonstrating an example with the use of LR, hurdle, and ZINB models, age (categorized at ≤ 34, 35–44, and ≥ 45) was used as a historical predictor of ARV non-adherence, where higher age is associated with lower levels of non-adherence. [1, 16, 17] We utilized Stata, version 11 (StataCorp, College Station, TX, USA) for all analyses.
Percent ARV adherence was transformed to percent ARV non-adherence by subtraction from 100%. In assessing age effects, data were dichotomized at a cut-off of 0% (1 = 0% non-adherence; 0 = non-adherence greater than 0%) and binary LR using the Stata command -logistic- was employed to determine the odds of 0% non-adherence between age categories in comparison to the reference category (i.e., 35–44). A 2-degree-of-freedom (2-df) Wald chi-square test with a two-sided alpha of 0.05 was used.
Similarly, for the hurdle model, LR was used to analyze the odds of 0% non-adherence using the -logistic- command; additionally, for the subset of participants who had > 0% non-adherence we used GLM-GL via the Stata command -glm-. GLM regression was conducted using a gamma distribution and a log link function relating mean non-adherence to the predictor (i.e., age) in order to determine the fold-change in mean non-adherence between various categories of the predictor in comparison to the reference category (i.e., 35–44). We used a 2-df Wald chi-square test in each of the LR and GLM-GL components of the overall model, rejecting the null hypothesis if either p-value was less than a Bonferroni-corrected 0.025, in order to correct for multiple testing. In addition, we used an omnibus 4-df test for the combined effect of age in both components of the model using the -suest- command.
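The Bonferroni decision rule for the hurdle model amounts to comparing each component's overall p-value with alpha divided by two. A minimal sketch follows; the function name is hypothetical and the p-values would come from the fitted LR and GLM-GL components.

```python
def hurdle_rejects(p_logistic, p_gamma, alpha=0.05):
    """Bonferroni rule for the two-component hurdle test.

    Rejects the null hypothesis when either component's overall
    p-value falls below alpha / 2 (0.025 for alpha = 0.05).
    """
    threshold = alpha / 2  # two tests, so the alpha level is halved
    return p_logistic < threshold or p_gamma < threshold


# Overall p-values of the kind reported for the HLP age analysis:
# 0.002 for the LR component and 0.05 for the GLM-GL component.
reject_hlp = hurdle_rejects(0.002, 0.05)
```

With these inputs the rule rejects on the strength of the LR component alone, since 0.05 does not clear the 0.025 threshold.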
Lastly, for the ZINB model, we utilized the Stata command -zinb-. We rejected the null hypothesis if the omnibus 4-df Wald chi-square p-value for the combined effect of age in both parts of the model was less than 0.05. Throughout this paper, p-values where we compare across categories of the predictor (e.g., age) will be referred to as the “overall p-value” (2-df) and p-values where we combine across components of various models will be referred to as the “omnibus p-value” (4-df).
In order to further investigate conditions where the hurdle (using LR plus GLM-GL) or ZINB models would perform more or less well in comparison to binary LR alone, we conducted a simulation study using three scenarios. Our outcome variable consisted of ARV non-adherence and our predictor was a hypothetical variable with four categories (0, 1, 2, and 3). The allocation of observations in categories 0–3 was 40%, 35%, 19%, and 6%, respectively. This was based on other predictors of ARV non-adherence, such as depression. Model parameters (signifying 0% non-adherence and degree of non-adherence) were selected jointly to yield the desired distribution of data under both the hurdle and ZINB models. The zeroes were generated as a Bernoulli random draw for each simulated participant and had the expected variability across runs. Since simulated values from a gamma distribution are not constrained to be less than 100%, all values of non-adherence greater than 100% were set to 100%. This small collection of data points with 100% non-adherence was also noted in the actual HLP data because a small group of individuals had reported that they had not taken a single dose of their ARVs within the past three days.
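The data-generating steps for the hurdle case can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' simulation code: the function name, the gamma shape, and the per-category parameters `p_zero` and `mean_pos` are hypothetical, and only the category allocation and the capping at 100% are taken from the text.

```python
import random

# Allocation of observations to predictor categories 0-3, from the text.
CATEGORY_PROBS = [0.40, 0.35, 0.19, 0.06]


def simulate_hurdle(n, p_zero, mean_pos, shape=2.0, seed=0):
    """Simulate (category, percent non-adherence) pairs under a hurdle model.

    p_zero[k]   -- probability of 0% non-adherence in category k (hypothetical)
    mean_pos[k] -- mean of the positive values in category k (hypothetical)
    shape       -- gamma shape parameter (hypothetical)
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        k = rng.choices(range(4), weights=CATEGORY_PROBS)[0]
        if rng.random() < p_zero[k]:
            y = 0.0  # Bernoulli draw: this participant is perfectly adherent
        else:
            # gammavariate(alpha, beta) has mean alpha * beta,
            # so the scale is chosen to give the requested mean.
            y = rng.gammavariate(shape, mean_pos[k] / shape)
            y = min(y, 100.0)  # cap values above 100% non-adherence at 100%
        data.append((k, y))
    return data


# Example: no covariate effect, 65% zeroes, mean of 25% among non-zeroes.
data = simulate_hurdle(2000, [0.65] * 4, [25.0] * 4, seed=1)
zero_fraction = sum(1 for _, y in data if y == 0.0) / len(data)
```

Setting all four categories to the same parameters, as in the example, corresponds to a null scenario of the kind used to estimate type-I error; varying `p_zero` or `mean_pos` across categories would introduce an effect in one or both components.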
For each of the scenarios described below, we generated 1,000 datasets under the hurdle model and 1,000 datasets under the ZINB model, then applied LR, hurdle, and ZINB analysis models to each of the resulting 2,000 datasets. We estimated power and type-I error by the proportion of datasets in which each of the models rejected the null hypothesis, using the testing procedures previously described. With 1,000 simulated datasets, margins of simulation error are approximately 1.4–3.1 percentage points.
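The quoted margins of simulation error are consistent with the usual normal-approximation half-width for a proportion estimated from 1,000 independent datasets. The sketch below assumes, as our reading rather than something stated in the source, that the range corresponds to true rejection rates between 0.05 and 0.5; the function name is hypothetical.

```python
import math


def simulation_margin(p, n_datasets=1000, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a
    rejection rate p estimated from n_datasets simulated datasets."""
    return 100 * z * math.sqrt(p * (1 - p) / n_datasets)


margin_at_nominal = simulation_margin(0.05)  # rejection rate near type-I error
margin_at_half = simulation_margin(0.5)      # widest case, power near 50%
```

Rounded to one decimal place, these evaluate to about 1.4 and 3.1 percentage points.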
In Scenario 1, also called the Sample Size Scenario, we compared LR, hurdle, and ZINB models while focusing on the effect of increasing sample size from 200 to 3,000. Type-I error was estimated under the case of no differences with increasing sample size. For the power evaluation, we selected a situation in which the categorical predictor had moderate effects in both components of the data-generating hurdle or ZINB models (Table 1).
In Scenarios 2 (the Gamma Distribution Scenario) and 3 (the Binomial Distribution Scenario), we assessed the power of the LR, hurdle, and ZINB models under nine settings in which the predictor had small, moderate, or large effects in one or both components (i.e., 0% non-adherence or the degree of non-adherence) of the data-generating model. In these simulations, sample size was fixed at 2,850, approximately the size of the HLP sample.
Data from the HLP included 2,844 subjects with mean age of 42 years, who were 74% male, with mean CD4+ cell count of 427 cells/mm³ and mean HIV RNA of 2.57 log10 copies/mL. These data were analyzed for the association between ARV non-adherence and age (≤ 34, 35–44, and ≥ 45 years). Age was not available for one subject. In this sample, 65% of the participants had 0% non-adherence (perfectly adherent) and among the remaining 35%, non-adherence ranged from 1 to 100%.
The LR, hurdle, and ZINB models gave very similar estimates of the effects of age on ARV non-adherence (Table 1). In the LR model, the odds of 0% non-adherence for individuals ≤ 34 and ≥ 45 years of age were 1.11 (95% CI = 0.88–1.39, p = 0.39) and 1.37 (95% CI = 1.15–1.62, p < 0.001) times those of individuals 35–44 years, respectively. There was statistically significant heterogeneity across the three age groups (Wald test overall p-value = 0.002).
Expanding on this analysis by adding GLM-GL in the hurdle model for those with > 0% non-adherence, we observed that the ratio of mean values of non-adherence in individuals who were ≤ 34 and ≥ 45 years of age was 1.22 (95% CI = 1.03–1.45; p-value = 0.02; mean predicted ARV non-adherence = 30%) and 1.01 (95% CI = 0.88–1.16; p-value = 0.87; mean predicted ARV non-adherence = 25%) times that of those 35–44 (mean predicted ARV non-adherence = 25%; Wald test overall p-value = 0.05; omnibus p-value = 0.001), respectively.
In the ZINB model, in addition to the interpretation stated for the LR model, we demonstrated that the mean non-adherence ratio in those aged ≤ 34 and ≥ 45 years was 1.23 (95% CI = 1.04–1.45; p-value = 0.02) and 1.01 (95% CI = 0.87–1.16; p-value = 0.87) times that of those 35–44 (Wald test overall p-value = 0.05; omnibus p-value = 0.001), respectively. Both the hurdle and ZINB models reported essentially identical results and both suggested that the effects of age groups on 0% non-adherence and degree of non-adherence are somewhat different.
Figure 1 demonstrates the probability of type-I error with LR, hurdle, and ZINB models with varying sample sizes. In these simulated data, the probability of type-I error was generally near the nominal 0.05 level, with the exception of the omnibus test in the hurdle model, which was very liberal with sample sizes of less than 1,000. Figure 2 shows power for the same range of sample sizes. The hurdle and ZINB models had higher power in comparison to LR. The hurdle model with the omnibus test had the highest power, but at the price of inflated probability of type-I error in small to moderate samples, as shown in Figure 1. All three models performed slightly better when the analysis and the data-generating models matched.
In these scenarios, sample size (n = 2,850) was unchanged for all settings. Table 2 shows simulated power for nine settings in which the covariate effects in each component (i.e., 0% non-adherence and degree of non-adherence) of the data-generating model are small, moderate, or large. As expected, LR was considerably less powerful than either hurdle or ZINB models when the covariate had small effects on 0% non-adherence but moderate or large effects on the degree of non-adherence. In contrast, the hurdle and ZINB models were sensitive to moderate or large covariate effects on either dimension of adherence. In the upper right and lower left corners of the table, settings where covariate effects were large in one component of the model but small in the other, the hurdle model was slightly more powerful than ZINB; in particular for the scenario with small effects on 0% non-adherence and large effects on degree of non-adherence (upper right).
Our simulations also showed that the proportions with 0% non-adherence are estimated with negligible (0–3%) downward bias by both the hurdle and ZINB models. However, they did suggest slightly greater downward biases (as much as 10%) in estimating mean levels of non-adherence, in particular when the ZINB model was used with data generated under a hurdle model.
Based on this methodological study, the hurdle model (GLM-GL plus LR) using the Bonferroni alpha correction and ZINB models for the analysis of ARV non-adherence are powerful and valuable tools. This is particularly true when the covariate effect is small on the 0% non-adherence. Furthermore, these models may distinctly be advantageous when the covariate has moderate or large effects on the degree of non-adherence because in this case LR is a less powerful test. In our analysis of baseline non-adherence in the Healthy Living Project, LR, hurdle, and ZINB models gave similar results, but the hurdle and ZINB models showed an effect of age on the degree of non-adherence in addition to its effect on the odds of 0% non-adherence.
An advantage of these two-component analyses is that they use the actual values of all available data, which often makes them statistically more powerful. The combination of the two tests gives more information about the data and isolates the source of variability in the data (i.e., within the dichotomized outcome, in the degree of non-adherence, or both). Both methods are easy to use, available in most statistical software, and allow for the selection of a specific adherence cut-off (e.g., 90%, 95%, or 100%). It is important to note that, given the elevated probability of type-I error with the use of the omnibus test, we recommend using the Bonferroni-corrected alpha in the hurdle model.
Zero-inflated models might be preferred over hurdle models on substantive grounds in cases where structural zeroes are interpretable. For example, the number of fish caught by a sample of people spending the afternoon at a riverside will include structural zeroes if some people in the sample do not spend any time fishing. However, medication non-adherence does not appear to be structural in this sense, in that all individuals have the opportunity of being non-adherent and no one is structurally forced to be perfectly adherent. While the interpretations of age effects on perfect adherence in HLP using hurdle and ZINB models differ slightly, the inferences and broader interpretations are essentially indistinguishable. Therefore, even in cases where we do not believe that structural zeroes exist, the ZINB model is a valuable tool that can be used for explaining zeroes that cannot be explained by the distribution of the continuous observations. Other fields where ZINB models have successfully been used include research on drug and alcohol use.
The use of two-component models has some drawbacks. They are somewhat more complex than LR alone. Data cannot be summarized in a single finding and therefore may require a lengthier interpretation. In addition, the two-component models may be less powerful than a simple LR for 0% non-adherence when there is little variation in the degree of non-adherence. However, our simulations showed that in more realistic scenarios where covariate effects on perfect adherence were large but effects on level of non-adherence were small, the power of the hurdle and ZINB models was comparable to the power of LR.
It is also important to note that a major disadvantage of self-reported medication non-adherence data is that they tend to contain a larger proportion of values at zero percent than data from other assessment methods, meaning that respondents tend to report perfect adherence at a higher frequency. This decreases the utility of two-component models because most of the variability in the data may be explained using the dichotomized outcome variable. Future research should focus on determining the utility of the hurdle and ZINB models in analyzing non-adherence data assessed using multiple methods (such as pharmacy refill records, pill counts, etc.) where the percentage of individuals with 0% non-adherence may be less pronounced.
Another hypothesis generated from this discussion is that the discordant results of studies of the predictors of non-adherence may be due to dichotomization and the lack of analysis of the full range of non-adherence data. For example, younger age has been found to be a statistically significant predictor of non-adherence in many studies [1, 16, 17], yet other research has failed to show an association [20, 21]. Re-inspection and analysis of these data using the hurdle or ZINB models may result in more uniform results.
Although novel to medication adherence data, the application of two-component approaches has been utilized in many other fields of research. Models such as the zero-inflated Poisson and other models on count data [23, 24] have been well-described and their use in clinical research has also been assessed [25–29]. Variants of the hurdle model that preserve the information contained in the degree of non-adherence, such as log-normal or beta models in place of the gamma model, are easily implemented and should be considered, although our results with the HLP data suggest that estimates of covariate effects may be insensitive to these choices. Proportional odds models for categorized adherence data potentially summarize covariate effects in a single set of odds ratios, without completely wasting information about degrees of non-adherence. However, addressing violations of the proportional odds assumption, which are common in our experience, usually entails results at least as hard to interpret as those given by the two-component models we considered.
When the association of medication adherence with other predictors is the central aim of a study, it is important and ideal to assess this measure using multiple methods. [5, 7] As such, it is also critical to analyze these data by several approaches in order to get a fuller picture of the variability in the data. Hurdle (GLM-GL plus LR) and zero-inflated models are powerful tools that allow for the analysis of the full spectrum of ARV adherence data and should be adopted in future medication adherence research.
We thank Estie Hudes, PhD for content review of an earlier version of this study, Samantha Dilworth, MS for data management and data support, and Rafael Dumett for his translation of the study abstract to Spanish. The project described was supported by award numbers F32MH086323, K24MH087220, U10MH057616, CA82370, and P30MH62246. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.