Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2766273

Formats

Article sections

- Summary
- 1. Introduction
- 2. Review of Marginalized Transition Model
- 3. Marginalized Transition Models for Longitudinal Ordinal Data
- 4. Example
- 5. Conclusions and Discussion
- 6. Supplementary Materials
- Supplementary Material
- References

Authors

Related links

Biometrics. Author manuscript; available in PMC 2009 October 23.

Published in final edited form as:

PMCID: PMC2766273

NIHMSID: NIHMS143241

Keunbaik Lee, Department of Statistics, University of Florida, Gainesville, Florida 32611, U.S.A;

The publisher's final edited version of this article is available at Biometrics

See other articles in PMC that cite the published article.

Generalized linear models with serial dependence are often used for short longitudinal series. Heagerty (2002, *Biometrics* **58,** 342–351) has proposed marginalized transition models for the analysis of longitudinal binary data. In this article, we extend this work to accommodate longitudinal ordinal data. Fisher-scoring algorithms are developed for estimation. Methods are illustrated on quality-of-life data from a recent colorectal cancer clinical trial.

Quality-of-life (QOL) data in clinical trials is generally obtained repeatedly over time using questionnaires that can be completed by the patient without direct supervision. We will analyze QOL data from a recent colorectal cancer clinical trial (Goldberg et al., 2004). The main objective of this trial was to find a “better” treatment for colorectal cancer. However, given that the toxicity profiles were quite different on the three treatment arms, it was of interest to see if there was a negative impact of “better” treatments on patients’ QOL. In this article we compare the QOL on the three treatments. We will introduce a new regression model to answer this question using a full likelihood-based approach.

Generalized linear models for longitudinal data can be classified into directly specified (marginal) models and indirectly specified (conditional) models. In marginal models, the population-averaged effect of covariates on the longitudinal response is directly specified (Liang and Zeger, 1986; Fitzmaurice and Laird, 1993). In conditional models, the effect of covariates on responses is specified conditional on random effects or previous history of responses. Therefore, the population-averaged effect of covariates is indirectly specified (Zeger and Karim, 1991; Breslow and Clayton, 1993). Here, we focus on marginal models.

Marginalized likelihood-based models describe the correlation among observations by embedding the marginal mean structure within a complete multivariate probability model with dependence, modeled via random effects or a Markov structure. We will extend the previous work using a Markov structure (Azzalini, 1994; Heagerty, 2002) to ordinal categorical data. Related models for binary data can be found in Heagerty (1999) and Miglioretti and Heagerty (2004). This class of likelihood-based directly specified marginal models has several advantages over conditional models. First, the interpretation of regression coeEcients is invariant with respect to specification of the dependence in the model unlike in conditional models. In addition, they can be much less susceptible to bias resulting from random effects model misspecification (Heagerty and Zeger, 2000; Heagerty and Kurland, 2001).

Next, we provide a brief review of models for ordinal data on which our work will build. McCullagh (1980) proposed the cumulative logit model for independent ordinal data. A general overview of models for ordinal categorical data can be found in Liu and Agresti (2005). Most of the models for correlated ordinal data fall into two classes: those modeling dependence using the global odds ratio (GOR) (Dale, 1986) and those modeling dependence via random effects. The GOR is an extension of the simple odds ratio for a 2 × 2 contingency table formed when adjacent rows and columns of a *K* × *K* contingency table (*K* > 2) are collapsed into a 2 × 2 table. Molenberghs and Lesaffre (1994) specified the joint probability of multivariate observations through the first-and higher-order marginal parameters. Williamson, Kim, and Lipsitz (1995) developed marginal mean regression techniques based on the GORs as a measure of association and also considered the latent variable models and the maximum likelihood estimation (MLE) for bivariate ordinal responses. Heagerty and Zeger (1996) proposed marginal regression models for clustered ordinal data by specifying marginal means and marginal pairwise GORs and used estimating equations and alternating logistic regressions for inference. In terms of random effects, Gibbons and Hedeker (1997) developed random-effects ordinal regression models for the probit and logistic links. Todem, Kim, and Lesaffre (2006) and Liu and Hedeker (2006) proposed models for multiple ordinal outcomes in longitudinal settings using random effects.

All these approaches to modeling dependence have a goal of reducing the number of dependence parameters, thereby implicitly reducing the number of probabilities that need to be estimated. For example, a saturated model (with no covariates) for a five-category ordinal response measured at six occasions would require the estimation of 15,625 multinomial probabilities. Our approach to reducing this estimation problem will be to introduce a proportional odds model for the marginal means (probabilities) and a Markov transition structure for the temporal dependence. The use of a likelihood-based approach will have advantages for longitudinal data that are missing at random (MAR) and in particular, ignorable. Likelihood-based inference is valid under ignorability and the missing data mechanism (mdm) need not be explicitly specified. Semiparametric generalized estimating equation (GEE) approaches require explicit specification of the mdm for MAR. In addition, the re-weighting based approaches (based on the mdm) to handle MAR in GEEs only “impute” missing values at the observed data points. Likelihood-based approaches do not have this restriction and allow “imputation” of values outside the range of the observed data points via parametric distributional assumptions.

The article is arranged as follows. In Section 2, we briefly review marginalized transition models (MTMs) for longitudinal binary data. In Section 3, we consider two MTMs for the longitudinal ordinal data. Methods are illustrated in the QOL data in Section 4.

In this section, we review Heagerty’s marginalized model for the analysis of longitudinal binary data (Heagerty, 2002). Let *Y _{i}* = (

$$\text{mean}\phantom{\rule{0.16667em}{0ex}}\text{model}:\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\text{logit}\{{\mu}_{it}^{M}\}={x}_{it}^{T}\beta ,$$

(1)

$$\text{dependence}\phantom{\rule{0.16667em}{0ex}}\text{model}:\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\text{logit}\{{\mu}_{it}^{c}\}={\mathrm{\Delta}}_{it}+\sum _{j=1}^{p}{\gamma}_{\mathit{itj}}{y}_{it-j},$$

(2)

where
${\mu}_{it}^{c}=E({Y}_{it}\mid {Y}_{it-1},\dots ,{Y}_{it-p},{X}_{it})$ is the conditional mean,
${\gamma}_{\mathit{itj}}={z}_{it}^{T}{\alpha}_{j}$ are dependence parameters describing serial dependence by quantifying how strongly the immediate past predicts the present, *z _{it}* is a subset of

The mean parameter *β* describes changes in the average response as a function of covariates. Using a logistic link for the transition model, (2), implies that the parameters *α _{j}* are unconstrained. For a given mean model, (1), and dependence model, (2), the intercept

This model has several desirable features. First, the mean model is specified separately from the dependence model. As a result, the interpretation of the regression parameter *β* is invariant as we modify assumptions regarding the dependence in equation (2). This is not true for classical transition models, which parameterize
${\mu}_{it}^{c}$ directly as a function of covariates. Second, the MTM can be used with data where subjects have variable lengths of follow-up, permitting likelihood analysis in settings where data may be MAR.

Given *β* and *α*, *Δ _{it}* for an MTM(

$${\mu}_{it}^{M}=\sum _{{j}_{1},\dots ,{j}_{p}}{\mu}_{it}^{c}{\pi}_{i{j}_{1},\dots ,{j}_{p}}^{(t)},$$

where ${\pi}_{i{j}_{1},\dots ,{j}_{p}}^{(t)}=P({Y}_{it-1}={j}_{1},\dots ,{Y}_{it-p}={j}_{p})$. Further details are given in Heagerty (2002).

In this section, we propose two MTMs for longitudinal ordinal data, a simple extension of the MTM (OMTM) and an improved extension of the MTM (IOMTM).

Let *Y _{i}* = (

The *simple extension of MTM of order p*, an OMTM(*p*), is specified using the following two regressions,

$$\text{mean}\phantom{\rule{0.16667em}{0ex}}\text{model}:\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}log\frac{P({Y}_{it}\le k\mid {x}_{it})}{1-P({Y}_{it}\le k\mid {x}_{it})}={\beta}_{0k}+{\mathbf{x}}_{it}^{T}\beta ,$$

(3)

$$\begin{array}{l}\text{dependence}\phantom{\rule{0.16667em}{0ex}}\text{model}:\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}log\frac{P({Y}_{it}=k\mid {Y}_{it-1},\dots ,{Y}_{it-p},{x}_{it})}{P({Y}_{it}=K\mid {Y}_{it-1},\dots ,{Y}_{it-p},{x}_{it})}\\ ={\mathrm{\Delta}}_{\mathit{itk}}+\sum _{m=1}^{p}\sum _{l=1}^{K-1}{\gamma}_{\mathit{itml}}^{(k)}{Y}_{it-m,l},\end{array}$$

(4)

where *β* is the *r ×* 1 vector of regression coefficients, *β*_{01} < *β*_{02} < … < *β*_{0}_{K}_{−1},
${\gamma}_{\mathit{itml}}^{(k)}={z}_{it}{\alpha}_{ml}^{(k)}$, *z _{it}* is a

The dependence model (4) based on the multinomial logit link function models the probability of moving from category *k* at time *t* to category *k*′ at time *t* + 1 under *p* = 1. This is a natural and interpretable way to model the dependence for longitudinal data. An alternative approach would be to model dependence using GORs (Dale, 1986; Heagerty and Zeger, 1996) as discussed in the introduction, which also allows the interpretation of marginal mean parameters to be invariant with respect to specification of the dependence and allows for parsimonious specification of dependence. However, GOR does not directly represent the change over time for a subject, unlike the OMTM, and that is why we choose to specify the dependence using a Markov transition structure.

Because this model is a straightforward extension of the MTM for binary data, it shares many of the same advantages including having the attractive characterization of serial dependence that a transition model provides combined with a marginal regression structure. However, it does have some drawbacks. First, there are a lot of dependence parameters. For example, if *p* = 2, *K* = 4, and *q* = 1 (*z _{it}* = 1), we need

$$\begin{array}{l}log\frac{P({Y}_{it}=k\mid {Y}_{it-1},\dots ,{Y}_{it-p},{x}_{it})}{P({Y}_{it}=K\mid {Y}_{it-1},\dots ,{Y}_{it-p},{x}_{it})}\\ ={\mathrm{\Delta}}_{\mathit{itk}}+c({y}_{it-1},\dots ,{y}_{it-p};{\alpha}^{(k)}),\end{array}$$

(5)

where *c*(·) is a smooth function of *y _{it}*

$$\begin{array}{l}log\frac{P({Y}_{it}=k\mid {Y}_{i,t-1},\dots ,{Y}_{i,t-p},{x}_{it})}{P({Y}_{it}=K\mid {Y}_{i,t-1},\dots ,{Y}_{i,t-p},{x}_{it})}\\ ={\mathrm{\Delta}}_{\mathit{itk}}+\sum _{m=1}^{p}{\gamma}_{\mathit{itm}1}^{(k)}{Y}_{i,t-m}+{\gamma}_{\mathit{itm}2}^{(k)}{Y}_{i,t-m}^{2},\end{array}$$

(6)

where
${\gamma}_{\mathit{itml}}^{(k)}={z}_{it}^{T}{\alpha}_{ml}^{(k)}$ and
$\alpha ={({\alpha}_{11}^{T},{\alpha}_{12}^{T},\dots ,{\alpha}_{p1}^{T},{\alpha}_{p2}^{T})}^{T}$. We call the model given by equations (3) and (6) an *improved extension of MTM of order p*, an IOMTM(*p*). The IOMTM(*p*) has fewer dependence parameters than the OMTM(*p*). For example, an OMTM has 18 dependence parameters for *K* = 4, *p* = 2, and *q* = 1, whereas an IOMTM(2) has *p ×* 2 *×* (*K* − 1) = 12 dependence parameters. In general, for an OMTM(*p*), the number of dependence parameters increases quadratically with *K*, the number of categories, whereas in an IOMTM, the number increases linearly in *K*. The IOMTM also exploits the ordering of the categorical responses in the conditional model (6).

Clearly, for models (3)–(6) to form a coherent probability model, the *Δ _{itk}* will be constrained. Constraints are needed for (3), (4), and (6) to be a valid probability model. Specifically,

$$\begin{array}{l}P({Y}_{it}=k\mid {x}_{it})\\ =\sum _{{g}_{1}=1}^{K}\cdots \sum _{{g}_{p}=1}^{K}P({Y}_{it}=k\mid {Y}_{it-1}={g}_{1},\dots ,{Y}_{it-p}={g}_{p},{x}_{it})\\ \times \phantom{\rule{0.16667em}{0ex}}P({Y}_{it-1}={g}_{1},\dots ,{Y}_{it-p}={g}_{p}\mid {x}_{it-1}).\end{array}$$

(7)

In this article, we consider only first- and second-order (I)OMTM’s. Higher-order models are possible but require more dependence parameters and larger sample sizes.

In this subsection, we provide some details for OMTM(1) and IOMTM(1), which are often sufficient for short time series. The dependence models for marginalized first-order transition models [OMTM(1) and IOMTM(1)] are specified using the following two regressions,

$$\begin{array}{l}\text{OMTM}(1):\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}log\frac{P({Y}_{it}=k\mid {Y}_{it-1},{x}_{it})}{P({Y}_{it}=K\mid {Y}_{it-1},{x}_{it})}\\ ={\mathrm{\Delta}}_{\mathit{itk}}+\sum _{l=1}^{K-1}{\gamma}_{it1l}^{(k)}{Y}_{it-1,l},\end{array}$$

(8)

$$\begin{array}{l}\text{IOMTM}(1):\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}log\frac{P({Y}_{it}=k\mid {Y}_{i,t-1},{x}_{it})}{P({Y}_{it}=K\mid {Y}_{i,t-1},{x}_{it})}\\ ={\mathrm{\Delta}}_{\mathit{itk}}+{\gamma}_{it11}^{(k)}{Y}_{i,t-1}+{\gamma}_{it12}^{(k)}{Y}_{i,t-1}^{2}.\end{array}$$

(9)

Note that (9) is only valid for *t ≥* 2 because there are no history data available for the initial state (*t* = 1).

For both the IOMTM(1) and OMTM(1), mean and dependence parameters are orthogonal as given in the following theorem and corollary.

The marginal mean parameters, (*β*_{0}, *β*) are orthogonal to the dependence parameters, *α* in the OMTM(1).

The marginal mean parameters, (*β*_{0}, *β*), are orthogonal to the dependence parameters, *α* in the IOMTM(1) when c(y; *α*) is a polynomial in y.

Fitzmaurice and Laird (1993) showed consistency of marginal mean parameters regardless of the dependence structure. This was used by Heagerty (2002) in binary MTM’s. The following theorem gives the equivalent result for ordinal MTM’s.

The maximum likelihood estimate, (_{0,} ), remains consistent for (*β*_{0}, *β*) even if the dependence model is incorrectly specified.

Proofs of these results can be found in the Web Appendix.

The maximum likelihood algorithm for these models is described next. Define the marginal cumulative probabilities,
${P}_{\mathit{itk}}^{M}=P({Y}_{it}\le k\mid {x}_{it})$ for *i* = 1, …, *N; t* = 1, …, *n _{i}* ;

$$L(\theta ;y)=\prod _{i=1}^{N}\prod _{t=2}^{{n}_{i}}\prod _{k=1}^{K}{({P}_{\mathit{itk}}^{c})}^{{y}_{\mathit{itk}}}\times \prod _{i=1}^{N}\prod _{k=1}^{K}{({\pi}_{ik}^{(1)})}^{{y}_{i1k}},$$

where
${\pi}_{ik}^{(1)}={P}_{i1k}^{M}-{P}_{i1k-1}^{M}$. We propose a Fisher-scoring algorithm to find the MLE of the parameters of interest in the OMTM(1). For estimation of the *K* − 1 intercepts *β*_{0}* _{k}*(

$$\begin{array}{l}logL(\theta ;y)=\sum _{i=1}^{N}\sum _{t=2}^{{n}_{i}}\{\sum _{k=1}^{K-1}{y}_{\mathit{itk}}({\mathrm{\Delta}}_{\mathit{itk}}+{\gamma}_{it11}^{(k)}{y}_{it-11}\\ +\cdots +{\gamma}_{it1K-1}^{(k)}{y}_{it-1K-1})+log{P}_{\mathit{itk}}^{c}\}\\ +\sum _{i=1}^{N}\sum _{k=1}^{K-1}\{{R}_{i1k}{\phi}_{i1k}-{R}_{i1k+1}g({\phi}_{i1k})\},\\ \stackrel{\text{let}}{=}log{L}^{(2)}(\theta ;y)+log{L}^{(1)}(\theta ;y),\end{array}$$

(10)

where *R _{i1k}* =

To evaluate the likelihood, we need to evaluate both the contribution from the initial state, *L*^{(1)}, and the subsequent contribution from each transition probability, *L*^{(2)}. Let *θ* = (*β*_{0}, *β*, *α*) be the vector of parameters. We obtain

$$\frac{\partial logL}{\partial \theta}=\frac{\partial log{L}^{(2)}}{\partial \theta}+\frac{\partial log{L}^{(1)}}{\partial \theta}.$$

The *K* − 1 + *r* + (*K* − 1)^{2}*q*-dimensional likelihood equations and details on the Fisher-scoring algorithm used to solve the likelihood equations can be found in Supplementary Materials.

Estimation for the IOMTM(1) is similar.

For the initial states, we assumed a marginal model for *t* = 1, as we did for the first order models, and an OMTM(1) and IOMTM(1) for *t* = 2. The marginal models for OMTM(2) and IOMTM(2) are the same as (3) and the dependence models for the OMTM(2) and the IOMTM(2) are given by (4) and (6), respectively, with *p* = 2. Unlike the first-order OMTM and IOMTM, the orthogonality of marginal and dependence parameters no longer holds (i.e., cannot write it in canonical log-linear form). Thus, consistent estimation of the marginal mean parameters requires correct modeling of the dependence (though marginalized models of this form often offer some robustness; see Heagerty and Kurland, 2001). Details on MLE for the second-order models can be found in Supplementary Materials.

When the data are incomplete due to dropout that is associated with observed outcomes (i.e., MAR), the orthogonality of marginal mean and dependence parameters in the (I)OMTM(1) no longer holds. Thus, the consistency of the marginal mean parameters under MAR relies on correctly specifying the dependence structure; otherwise, the score functions for the mean parameters can be biased.

We conducted several simulation studies to examine robustness to misspecification of dependence model when there is ignorable (MAR) dropout. We set *T* = 8 and *N* = 500 with an approximately equal sample size for each of the three (treatment) groups. We considered two different marginal mean models, (3), one with an unstructured mean over time with design vector of the form,
${x}_{it}^{T}=(0,\dots ,1,\dots ,0,\text{group}{1}_{i},\text{group}{2}_{i})$, where the 1 is in the *t*th slot, and one with a linear trend over time,
${x}_{it}^{T}=({\text{Visit}}_{it},\text{group}{1}_{i},\text{group}{2}_{i})$, where groupj* _{i}* is an indicator of whether the subject was in group

For each simulation, we considered the following MAR dropout model,

$$\text{logit}P(\text{dropout}=t\mid \text{dropout}\ge t)=-2.5+0.6{Y}_{it-1}+0.3{Y}_{it-2}.$$

We simulated data using (9) with
${\gamma}_{it1m}^{(k)}={\alpha}_{m0}+{\alpha}_{m1}\times {\text{Visit}}_{it}$, *m* = 1, 2, (*α*_{10}*, α*_{11}*, α*_{20}*, α*_{21}) = (−1.0, 0.5, −0.5, 0.1) and fit the model assuming
${\gamma}_{it1m}^{(k)}={\alpha}_{m0}$.

Table 1 presents the point estimates and the percent relative bias. Percent relative biases in the coefficients were small, the largest being about 3%. The coefficients of time-varying covariates in the structured mean model appeared to be more robust to misspecification of dependence and MAR missingness than in the unstructured mean model.

Bias of the IOMTM(1) MLEs when data are MAR. Displayed are the average regression coefficient estimates and the percent relative bias, 100 × ( − β)/β

We performed several similar simulations using the same models for the marginal means, but simulating data under an IOMTM(2) and then fitting an IOMTM(1) with and without MAR dropout. In the presence of dependence misspecification and MAR dropout, we saw considerable bias, with relative biases as large as 74% for the unstructured mean (see Supplementary Materials for the results of these simulations). Under dependence misspecification with no dropout, the biases were quite small despite the orthogonality of the mean and dependence parameters being lost in the IOMTM(2). These simulations further emphasize the importance of correctly specifying the dependence in the presence of missing data. We also did simulations (not shown) to examine small sample bias in the MLE’s with complete data. The bias in the mean parameters was negligible. These results can be found in Lee (2007).

We use the models proposed in Section 3 to analyze QOL data from the clinical trial of metastatic colorectal cancer introduced in Section 1. A total of 795 patients with colorectal cancer were randomly assigned to one of the three treatments [FOLFOX, IFL(Control), IROX] between May 1999 and April 2001 (Goldberg et al., 2004). We focus on one QOL measure, fatigue. Fatigue is measured on a five-point ordinal scale (1: I am usually not tired at all; 2: I am occasionally rather tired; 3: There are frequently periods when I am quite tired; 4: I am usually very tired; 5: I feel exhausted most of the time). Because very few patients reported category 5, we collapsed category 4 and 5 into one category. For numerical reasons, we rescaled the response as −0.9, −0.3, 0.3, and 0.9. in the dependence model; the re-scaling was not needed for the marginal mean model.

To examine treatment differences in fatigue levels, we included type of treatment (ARM1 = 1 for FOLFOX; ARM2 = 1 for IROX) and visit number (TIME = 0.0, 0.1, …, 0.5) again re-scaled. The Akaike Information Criteria (AIC; Akaike, 1974) and likelihood ratio tests were used as the model selection criteria.

We assumed the missing responses (mostly due to dropout) were MAR in our initial analysis. In Section 4.2, we more carefully handled the dropouts related to the reason for dropping out (including death).

The Fisher-scoring algorithm is not trivial computationally due to the need to obtain an estimate of *Δ _{it}* for all subjects and at all times, within each Fisher-scoring step. However, in our data set, with 795 patients, a four-category ordinal response and six visit times, the computational burden was not high. Each Fisher-scoring step (in which all the

We fitted and compared seven models. Three were IOMTM(1)’s, three were IOMTM(2)’s, and the other was an OMTM(1). IOMTM(1)-1 and IOMTM(2)-1 were the simplest models, with time homogeneous dependence,
${\gamma}_{\mathit{itj}}^{(k)}={\alpha}_{j0}^{(k)}$. IOMTM(1)-2 and IOMTM(2)-2 allowed the dependence coefficients to depend on Time,
${\gamma}_{\mathit{itj}}^{(k)}={\alpha}_{j0}^{(k)}+{\alpha}_{j1}^{(k)}\times {\text{visit}}_{it}$. IOMTM(1)-3 and IOMTM(2)-3 had dependence parameters depending on treatment,
${\gamma}_{\mathit{itj}}^{(k)}={\alpha}_{j0}^{(k)}+{\alpha}_{j2}^{(k)}\times \text{Arm}{1}_{i}+{\alpha}_{j3}^{(k)}\times \text{Arm}{2}_{i}$. The OMTM was an OMTM(1) with time homogeneous dependence model,
${\gamma}_{\mathit{itj}}^{(k)}={\alpha}_{j0}^{(k)}$ for *j* = 1,2,3.

Table 2 presents MLEs for all seven models. Point estimates and standard errors for marginal mean parameters for the OMTM were similar to those for the IOMTM(1) and IOMTM(2). To compare the fit of the models, in particular, the dependence structure, we computed the AIC. The AIC for OMTM was 3800.700 and for IOMTM(1)-1 3876.306, indicating that IOMTM(1)-1 fit better than OMTM. Comparison of deviances for IOMTM(1)-1, IOMTM(1)-2, and IOMTM(1)-3, which are nested, yields
$\mathrm{\Delta}{D}_{12}^{(1)}=2\times (1924.153-1916.225)=15.856$, *p*-value = 0.015 on 6 d.f and
$\mathrm{\Delta}{D}_{13}^{(1)}=2\times (1924.153-1916.690)=14.926$, *p*-value = 0.246 on 12 d.f. The AIC’s for IOMTM(1)-2 and IOMTM(1)-3, which were not nested, were 3872.450 and 3885.380, respectively. These comparisons indicated that IOMTM(1)-2 was the better-fitting model among IOMTM(1)’s and OMTM. Similarly, we had IOMTM(2)-2 as better fitting among IOMTM(2)’s by comparison of deviances and the AIC’s (
$\mathrm{\Delta}{D}_{12}^{(2)}=22.156$, *p*-value = 0.036 on 12 d.f.;
$\mathrm{\Delta}{D}_{13}^{(2)}=15.123$, *p*-value = 0.917 on 24 d.f.; AIC for IOMTM(2)-2 = 3825.910, AIC for IOMTM(2)-3 = 3861.820). Again, we compared deviances for IOMTM(1)-2 and IOMTM(2)-2 (*ΔD*_{22} = 54.540, *p*-value < 0.0001 on 12 d.f.) and concluded that IOMTM(2)-2 was the best-fitting model.

Figure 1 presents conditional probabilities given previous values for the OMTM(1) given in (8). We arbitrarily chose one subject and visit because the shapes of conditional probabilities are the same across subjects (the intercepts *Δ _{itk}* only shift the trajectories up or down). The trajectories of conditional probabilities change quadratically with previous responses. Thus, the quadratic model given in (9) appears adequate to explain the change in the conditional probabilities as a function of the previous response. We were unable to get an OMTM(2) to converge to obtain a similar plot for first- and second-order dependence.

In the IOMTM(2)-2, the *β* coefficient of Visit * Arm2 was significant with an estimate of −1.847 and SE of 0.874. This indicated that the level of fatigue increased over time on the IROX arm and this was significantly greater than for patients on the IFL (control) arm. The *β* coefficient of Visit * Arm1 was not significant. This indicated that the level of fatigue did not change over time on the FOLFOX arm and there was no significant difference of fatigue over time between the IFL and the FOLFOX arms. The cumulative probabilities for the IROX arm decreased more steeply over time compared to the FOLFOX arm and the IFL arm. Thus, for this particular measure of QOL, fatigue, the worst treatment was not the one that provided the longest time to progression and best overall survival (FOLFOX).

A large number of subjects dropped out due to tumor progression or death (40.5%). Thus, the analysis in Section 4.1 is problematic as among other things, it implicitly imputes missing values for the dropouts due to death. Imputing such values for subjects who died is not a sensible approach. We would argue for QOL data, that imputing these QOL values for those who dropped out due to their disease progressing is not sensible either.

Table 3 summarizes the reasons for dropout in the three treatment arms. The dropout rates due to progression/death were marginally higher in the IFL and the IROX arms. To deal with events due to progression/death, we used a mixture model for the joint distribution of longitudinal measures and progression/death times (Hogan and Laird, 1997) similar to previous work by Pauler, McCoy, and Moinpour (2003) and Kurland and Heagerty (2005). For this data set, we had the actual progression/death times for subjects who dropped out due to progression/death and also for those who dropped out for other reasons. Table 4 summarizes the visit window in which patients progressed/died for all three treatments. Note, for some subjects, this is past the time they actually dropped out.

Define *S* to be the progression/death time. We specify the ordinal MTM models conditional on progression/death time, *S*. Due to the small sample sizes when conditioning on individual times, we assumed the parameters were the same for *S* = 1, …, 5 (those who progressed/died before the end of the study) but different for those who did not, *S* = 6. This approach implicitly assumes that for a given progression/death time (pattern), that missing data before the progression/death time is MAR (conditional on pattern); this situation occurs if dropout is not due to progression/death.

Within this framework, the probability of fatigue for patients who progressed/died beyond time *k* is given by

$$\begin{array}{l}P({Y}_{t}>c\mid S>k,\text{Tx})\\ =\frac{{\displaystyle \sum _{j>k}}P({Y}_{t}>c\mid S=j,\text{Tx})P(S=j\mid \text{Tx})}{{\displaystyle \sum _{j>k}}P(S=j\mid \text{Tx})}.\end{array}$$

(11)

where Tx is treatment and *P*(*Y _{t}* >

We proposed two MTMs for longitudinal ordinal data that directly model marginal probabilities as a function of covariates while accounting for the longitudinal correlation via a Markov structure. The IOMTM has fewer dependence parameters than the OMTM and exploits the ordering of previous responses. Parameter estimation was based on maximum likelihood using a Fisher-scoring method. We showed that the ML estimates for the (I)OMTM(1) were consistent regardless of specification of (8) when there was no missing data. Simulation studies indicated that marginal mean parameter estimates were robust to the dependence model being incorrectly specified in (I)OMTM(1)’s under MAR dropout. However, IOMTM(2)’s were much less robust to dependence model misspecification under MAR dropout.

Calculations and analyses in this article are based on first-and second-order (I)OMTMs. Extension to higher orders is also possible. However, such models introduce more dependence parameters and more complex computing and constraints; such models are workable for larger data sets. An alternative formulation would be to introduce random effects instead of, or in addition to, previous response to model the longitudinal dependence. This is ongoing work.

As discussed in Section 4, about 40% of the patients dropped out due to tumor progression or death. We adjusted for this using a pattern mixture approach with patterns defined based on the observed progression/death times. To obtain the causal effect of treatment while accounting for progression/death, a principal stratification approach (Frangakis and Rubin, 2002) could be developed, which would condition on progressing/dying past a given time on all three treatment arms. This is also ongoing work.

In (5), we specified a very general formulation for *p*th-order Markov dependence and then simplified it to an additive structure with quadratics for the individual components. Using the more general formulation is possible and would allow for more flexible dependence; for example, removing the additive restriction and allowing interactions. Computationally, this would create no additional burden. We are currently exploring such generalizations.

Web Appendices, detailed calculations, tables, and figures referenced in Section 3 are available under the Paper Information link at the *Biometrics* website http://www.tibs.org/biometrics.

Click here to view.^{(110K, pdf)}

Click here to view.^{(125K, pdf)}

We would like to thank Dr Daniel Sargent and Ms. Erin Green of the Mayo Clinic for providing the data and for their help in data collection and clarifying issues with data and the study, including dropout. The authors gratefully acknowledge the associate editor and a referee for helpful comments on the manuscript. This project was supported by NIH grants CA85295 and HL079457.

Keunbaik Lee, Department of Statistics, University of Florida, Gainesville, Florida 32611, U.S.A.

Michael J. Daniels, Department of Epidemiology and Biostatistics, Department of Statistics, University of Florida, Gainesville, Florida 32611, U.S.A.

- Akaike H. A new look at the statistical model identification. IEEE Transactions Automatic Control. 1974;19:716–723.
- Azzalini A. Logistic regression for autocorrelated data with application to repeated measures. Biometrika. 1994;81:767–775.
- Breslow NE, Clayton DG. Approximate inference in generalized linear mixed models. Journal of the American Statistical Association. 1993;88:125–134.
- Dale JR. Global cross-ratio models for bivariate, discrete, ordered responses. Biometrics. 1986;42:909–917. [PubMed]
- Fitzmaurice GM, Laird NM. A likelihood-based method for analysing longitudinal binary responses. Biometrika. 1993;80:141–151.
- Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. [PubMed]
- Gibbons R, Hedeker D. Random effects probit and logistic regression models for three-level data. Biometrics. 1997;53:1527–1537. [PubMed]
- Goldberg RM, Sargent DJ, Morton RF, Fuchs CS, Ramanathan RK, Williamson SK, Findlay BP, Pitot HC, Albets SR. A randomized controlled trial of fluorouracil plus leucovorin, irinotecan, and oxaliplatin combinations in patients with previously untreated metastatic colorectal cancer. Journal of Clinical Oncology. 2004;22:23–30. [PubMed]
- Heagerty PJ. Marginally specified logistic-normal models for longitudinal binary data. Biometrics. 1999;55:688–698. [PubMed]
- Heagerty PJ. Marginalized transition models and likelihood inference for longitudinal categorical data. Biometrics. 2002;58:342–351. [PubMed]
- Heagerty PJ, Kurland BF. Misspecified maximum likelihood estimates and generalised linear mixed models. Biometrika. 2001;88:973–985.
- Heagerty PJ, Zeger SL. Marginal regression models for clustered ordinal measurements. Journal of the American Statistical Association. 1996;91:1024–1036.
- Heagerty PJ, Zeger SL. Marginalized multilevel models and likelihood inference (with discussion) Statistical Science. 2000;15:1–26.
- Hogan JW, Laird NM. Mixture models for the joint distribution of repeated measures and event times. Statistics in Medicine. 1997;16:239–257. [PubMed]
- Kurland BF, Heagerty PJ. Directly parameterized regression conditioning on being alive: Analysis of longitudinal data truncated by deaths. Biostatistics. 2005;6:241–258. [PubMed]
- Lee K. PhD thesis. University of Florida; 2007. Marginalized regression models for longitudinal categorical data.
- Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13–22.
- Liu I, Agresti A. The analysis of ordered categorical data: An overview and a survey of recent developments. Test. 2005;14:1–73.
- Liu LC, Hedeker D. A mixed-effects regression model for longitudinal multivariate ordinal data. Biometrics. 2006;62:261–268. [PubMed]
- McCullagh P. Regression models for ordinal data. Journal of the Royal Statistical Society, Series B. 1980;42:109–142.
- Miglioretti D, Heagerty P. Marginal modeling of multilevel binary data with time-varying covariates. Biostatistics. 2004;5:381–398. [PubMed]
- Molenberghs G, Lesaffre E. Marginal modeling of correlated ordinal data using a multivariate Plackett distribution. Journal of the American Statistical Association. 1994;89:633–644.
- Pauler DK, McCoy S, Moinpour C. Pattern mixture models for longitudinal quality of life studies in advanced stage disease. Statistics in Medicine. 2003;22:795–809. [PubMed]
- Todem D, Kim K, Lesaffre E. Latent-variable models for longitudinal data with bivariate ordinal outcomes. Statistics in Medicine. 2006;26:1034–1054. [PubMed]
- Williamson JM, Kim K, Lipsitz SR. Analyzing bivariate ordinal data using a global odds ratio. Journal of the American Statistical Association. 1995;90:1432–1437.
- Zeger SL, Karim MR. Generalized linear model with random effects: A Gibbs sampling approach. Journal of the American Statistical Association. 1991;86:79–86.

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |