In any data-analysis situation, there is a multitude of decisions that must be made, from determining a statistical model for the distribution of the data to choosing the structural models that best answer the research questions. The study of problem behaviors can be a particular challenge; the behaviors represent developmental processes, requiring longitudinal analyses, and because the behaviors are rare, their distributions are highly skewed. In addition, most data measuring these behaviors are discrete (binary, ordinal, or count). We have presented a selection of alternative models for analyzing longitudinal categorical data. These approaches assume more appropriate distributions for problem-behavior data than the traditional continuous-data models, permitting researchers to have more faith in the results.
Starting with the concepts involved in the generalized linear model (GLM) for categorical data, we discussed longitudinal extensions, including the hierarchical (or longitudinal) version of the generalized linear model (HGLM), and several specifications of mixture models. While there has been a great deal of excitement recently over using mixture models to find homogeneous unobserved subgroups in the population, we showed how such approaches can also be viewed as extensions of the random-effects model (Bauer & Curran, 2003; B. Muthén & Asparouhov, 2006), which relax some of its strong assumptions. Because of this, they might be the most appropriate choices for certain types of data distributions, irrespective of any question of subpopulations.
Our goal in this paper was to present a group of related models for longitudinal categorical data and to offer substantive researchers a useful guide to testing and selecting among the alternative models. This is a particularly difficult issue with categorical data and complicated longitudinal models, such as mixture models, because, as we find here, there are no completely reliable fit statistics for these types of models, and no readily available measures of absolute fit. We suggested several considerations to use in model selection. The first concern convergence properties, measures of comparative fit, and residual analysis. We dismissed models that did not converge well (including both improper solutions and failure to replicate the best loglikelihood value). From the remaining models, we chose the most plausible ones based on the fit statistics, and then ruled out several more because they failed to adequately reproduce the observed cell frequencies or were equivalent in fit to more parsimonious models. Next, we included substantive considerations in model selection. For instance, we rejected a model because its substantive meaning was obscure, rendering the model less useful, and because it was not a reasonable fit to theory or to prior research. We also selected one of two similar models, despite poorer fit, because it was more parsimonious. Of course, we tested a limited number of models and only a single covariate. The choice and specification of covariates are important factors in model selection, and differences in these could potentially lead to different model choices.
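The first screening steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the Bayesian information criterion (BIC) as the measure of comparative fit, which is one common choice and not necessarily the only statistic a researcher would consult, and the model names, loglikelihood values, parameter counts, and sample size are made-up placeholders, not results from our analyses. Residual analysis and the substantive considerations are not automated here.

```python
import math


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; smaller values indicate better comparative fit."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def screen_models(candidates, n_obs):
    """Drop models with convergence problems, then rank the rest by BIC.

    Each candidate is a dict with hypothetical keys:
      name       -- label for the model
      loglik     -- best loglikelihood value found
      n_params   -- number of free parameters
      converged  -- True if the solution is proper (no inadmissible estimates)
      replicated -- True if the best loglikelihood was reached from several start values
    """
    kept = [m for m in candidates if m["converged"] and m["replicated"]]
    scored = [(m["name"], bic(m["loglik"], m["n_params"], n_obs)) for m in kept]
    return sorted(scored, key=lambda pair: pair[1])


# Placeholder inputs for illustration only (not values from the paper).
candidates = [
    {"name": "model_a", "loglik": -1500.0, "n_params": 8,
     "converged": True, "replicated": True},
    {"name": "model_b", "loglik": -1495.0, "n_params": 10,
     "converged": True, "replicated": True},
    {"name": "model_c", "loglik": -1480.0, "n_params": 15,
     "converged": True, "replicated": False},  # best loglikelihood not replicated
]

for name, value in screen_models(candidates, n_obs=1000):
    print(f"{name}: BIC = {value:.2f}")
```

In this toy input, model_c is dropped before any fit comparison because its best loglikelihood was not replicated, even though its loglikelihood is the highest. The ranked list is a starting point for the residual checks and substantive judgments, not a substitute for them.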
Through this process of elimination, we selected the two models that offered the most reasonable combination of parsimony, fit to the data, and fit to theory: the hierarchical generalized linear model (HGLM) and the 3-class latent class growth analysis (LCGA). The two models also use roughly the same number of parameters (11 and 13, respectively), and their substantive findings were similar. Both showed that, while adolescents are increasingly likely to drink as they approach the end of high school, those who reported in seventh grade having more friends who were drinking were themselves more likely to be drinking throughout the rest of the secondary school years. The choice between these two models, based on the data we have presented, is, in our opinion, largely a matter of ease of interpretation, although further investigation could turn up substantive or statistical reasons to choose one over the other, or even to select a different model over these two.
This study has the strengths and the weaknesses that inevitably result from using actual data as an example. On the one hand, it shows how the models might look in an actual research situation. On the other hand, with simulated data, we could compare the model results with a known population to discover whether there are any systematic biases in the analyses. It is also possible that different approaches to measuring alcohol use could yield different results (Feldman & Masyn, 2008). Finally, there is one more critical piece that warrants further study. Often the purpose of longitudinal studies is to predict what will happen to the young people later in life. The models may not be of much use if they fail in that respect, no matter how well they appear to characterize the data at hand. Future research needs to take distal outcomes into account in assessing the adequacy of the models.
In this paper we present and compare a fairly large set of models, but it is not an exhaustive list of potential models. There is an infinite set of possible models and, with empirical data, we cannot know what the true model is. For instance, there are models outside of the GLM framework that could be considered when growth is not linear or curvilinear, as well as two-part models, in which behavior is modeled as two related processes: (1) exhibiting the behavior versus not doing so, and (2) intensity or extent of the behavior (Olsen & Schafer, 2001; Tooze et al., 2002). However, although there remains some uncertainty with respect to model choice, we feel that all of the methods shown here may offer improvement over approaches that treat all data, irrespective of their actual distributions, as continuous and normally distributed.
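To make the two-part idea concrete, the following Python sketch gives the loglikelihood contribution of a single observation when the two processes are combined. It is a simplified illustration, not the specification used in the cited papers: it assumes a lognormal distribution for the positive values, omits the random effects that link the two parts in longitudinal applications, and uses hypothetical parameter names (p_any, mu_log, sigma_log).

```python
import math


def two_part_loglik(y, p_any, mu_log, sigma_log):
    """Loglikelihood of one observation under a simple two-part model.

    Part 1: the behavior occurs (y > 0) with probability p_any.
    Part 2: given that it occurs, its intensity y is assumed lognormal
            with log-scale mean mu_log and standard deviation sigma_log
            (an assumption made for this sketch).
    """
    if y < 0:
        raise ValueError("y must be non-negative")
    if y == 0:
        # Only part 1 contributes: the behavior was not exhibited.
        return math.log(1.0 - p_any)
    # Part 1 (behavior exhibited) plus part 2 (lognormal log-density of the intensity).
    z = (math.log(y) - mu_log) / sigma_log
    log_density = -math.log(y * sigma_log * math.sqrt(2.0 * math.pi)) - 0.5 * z * z
    return math.log(p_any) + log_density


# Example call with placeholder parameter values.
print(two_part_loglik(0.0, p_any=0.3, mu_log=0.0, sigma_log=1.0))
print(two_part_loglik(2.5, p_any=0.3, mu_log=0.0, sigma_log=1.0))
```

In practice, p_any and mu_log would each depend on covariates and on time, for example through a logit link for the first part and an identity link on the log scale for the second.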
In addition, because comparing longitudinal categorical models is not always straightforward, we have offered a systematic approach to assessing and selecting models from among several competing, equally appropriate statistical options. It is our belief that adopting this approach as the new business as usual will improve our ability to understand trajectories of problem behaviors over childhood and adolescence. With improved models, we can better determine risk and protective factors, and this information could potentially be used for early identification of those most at risk, helping to select the best candidates for targeted preventive interventions.