
Test (Madr). Author manuscript; available in PMC 2011 September 22.

Published in final edited form as:

PMCID: PMC3178337

NIHMSID: NIHMS192624

The publisher's final edited version of this article is available at Test (Madr)


First, we would like to thank Joe and Geert for a carefully written review paper on longitudinal data. We would like to expand on several points discussed in this paper: 1) the interpretation of covariate effects and the use of identifying restrictions with covariates in mixture models (Section 4.2.2), and 2) issues with sensitivity analyses in parametric models for the full data and in selection models in general (Section 4.2.3).

In the following, we focus on the setting of covariates that are collected at baseline with no missingness.

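In longitudinal studies, as discussed here, the main focus of inference is usually on the marginal distribution, *p*(**y**). In mixture models, the full-data model is factored by missingness pattern *S*, so the marginal distribution is a mixture over patterns,

$$p(\mathbf{y}) = \sum_{s} p(\mathbf{y} \mid S = s)\, P(S = s).$$

Similarly, *E*[**Y**|**x**] is an average of the pattern-specific means,

$$E[\mathbf{Y} \mid \mathbf{x}] = \sum_{s} E[\mathbf{Y} \mid \mathbf{x}, S = s]\, P(S = s \mid \mathbf{x}).$$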

So, assessing the covariate effects on the marginal mean has to be done by averaging over patterns and needs to consider (1) whether the mean is linear in the covariates; (2) whether the marginal distribution of missingness depends on the covariates; and (3) whether the covariate effects are time-invariant. In this discussion, we will focus on the first issue.

For mixture models with an identity link, the averaged covariate effects for the full-data distribution take a simple form, a weighted average of the pattern-specific covariate effects, and have a straightforward interpretation (Fitzmaurice et al., 2001). As an example, consider the full-data response **Y** = (

When the link function, denoted by *g*, is non-linear, and the within-pattern *s* (*S* = *t _{s}*) mean model is

and we have in general

It can therefore be difficult to capture the covariate effects compactly (Fitzmaurice et al., 2001; Wilkins and Fitzmaurice, 2006). Roy and Daniels (2008) proposed specifying marginalized models and imposing constraints on the conditional mean, in the spirit of earlier work by Azzalini (1994) and Heagerty (1999). A simple version of the model in Roy and Daniels is illustrated below.
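Before turning to that model, the contrast between the identity-link and non-linear-link cases can be checked numerically. The following is a minimal sketch, not taken from the paper: the two patterns, their probabilities, and the pattern-specific intercepts and slopes are made-up values, and the pattern probabilities are assumed not to depend on the covariate.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def identity(eta):
    return eta

# Hypothetical two-pattern mixture (all numbers are illustrative assumptions).
pattern_probs = [0.3, 0.7]   # P(S = s), assumed free of x
intercepts = [0.0, 0.5]      # pattern-specific intercepts
slopes = [1.0, 2.0]          # pattern-specific covariate effects

def marginal_mean(x, inverse_link):
    """E[Y | x] = sum_s P(S = s) * g^{-1}(intercept_s + slope_s * x)."""
    return sum(p * inverse_link(a + b * x)
               for p, a, b in zip(pattern_probs, intercepts, slopes))

xs = [0.0, 1.0, 2.0]

# Identity link: a one-unit change in x shifts the marginal mean by the
# probability-weighted average of the pattern-specific slopes, at every x.
weighted_slope = sum(p * b for p, b in zip(pattern_probs, slopes))
identity_effects = [marginal_mean(x + 1, identity) - marginal_mean(x, identity)
                    for x in xs]

# Logit link: the same contrast, on the logit scale of the marginal mean,
# changes with x, so no single coefficient summarizes it.
logit_effects = [logit(marginal_mean(x + 1, expit)) - logit(marginal_mean(x, expit))
                 for x in xs]

print("weighted slope:", weighted_slope)
print("identity-link effects:", identity_effects)
print("logit-link effects:", logit_effects)
```

Under the identity link every entry of `identity_effects` matches `weighted_slope`, whereas the entries of `logit_effects` differ from one another, which is the difficulty the marginalized approach is meant to avoid.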

First, the marginal mean is specified as

Second, a conditional model is specified to account for within-subject correlation and dependencies between the response and missingness pattern. We assume *Y _{ij}*, conditional on random effects

where

The conditional model has to be compatible with the marginal model. In particular, the intercepts Δ_{ij} are determined by the relationship

and are functions of other parameters in the model, including **β**. Note that this is marginalized over both missingness patterns and subject-specific random effects. Serial correlation within pattern can be addressed by augmenting the conditional model with a Markov component (Heagerty, 2002).
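How such an intercept can be obtained numerically is sketched below under stated assumptions. The conditional model is taken to be logit *P*(*Y*_{ij} = 1 | *b*_{i}, *S*_{i} = *s*) = Δ_{ij} + *b*_{i} + α_{s} with *b*_{i} normal with mean 0 and standard deviation σ; this specific form, the pattern shifts α_{s}, and all numeric values are hypothetical, and the simple grid integration stands in for whatever quadrature an actual implementation would use.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def normal_grid(sigma, n_nodes=801, width=8.0):
    """Grid nodes and normalized weights approximating a N(0, sigma^2) density."""
    step = 2.0 * width * sigma / (n_nodes - 1)
    nodes = [-width * sigma + i * step for i in range(n_nodes)]
    dens = [math.exp(-0.5 * (b / sigma) ** 2) for b in nodes]
    total = sum(dens)
    return nodes, [d / total for d in dens]

def implied_marginal_mean(delta, pattern_probs, pattern_shifts, nodes, weights):
    """Average the conditional mean over random effects, then over patterns."""
    total = 0.0
    for prob, shift in zip(pattern_probs, pattern_shifts):
        over_b = sum(w * expit(delta + shift + b) for b, w in zip(nodes, weights))
        total += prob * over_b
    return total

def solve_intercept(target, pattern_probs, pattern_shifts, sigma,
                    lo=-30.0, hi=30.0, tol=1e-10):
    """Bisection for the intercept whose implied marginal mean equals target.

    The implied marginal mean is increasing in delta, so bisection applies.
    """
    nodes, weights = normal_grid(sigma)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if implied_marginal_mean(mid, pattern_probs, pattern_shifts,
                                 nodes, weights) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical marginal model: logit E[Y | x] = 0.3 + 0.5 * x, evaluated at x = 1.
target = expit(0.3 + 0.5 * 1.0)
pattern_probs = [0.6, 0.4]    # assumed P(S = s)
pattern_shifts = [0.0, -1.0]  # assumed alpha_s
sigma = 1.5                   # assumed random-effect standard deviation

delta = solve_intercept(target, pattern_probs, pattern_shifts, sigma)
print("marginal linear predictor:", 0.8, "conditional intercept:", delta)
```

The solved intercept differs from the marginal linear predictor because it has to offset both the pattern shifts and the attenuation caused by averaging over the random effect.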

Identifying restrictions can be problematic in pattern mixture models that include baseline covariates with time-invariant coefficients. We focus here on the available case missing value (ACMV) restriction (Little, 1993; Molenberghs et al., 1998), which corresponds to MAR. Missing at random (MAR) is often taken as a starting point for the analysis of incomplete data (Troxel et al., 2004; Zhang and Heitjan, 2006).

To illustrate, consider ** Y** = (

where

for *r* = 0, 1. For the bivariate case, the ACMV restriction is

$$[Y_2 \mid Y_1, R = 0] \overset{d}{=} [Y_2 \mid Y_1, R = 1],$$

where $\overset{d}{=}$ denotes equality in distribution. This requires that for all *Y*_{1},

which in turn implies that

This restriction identifies the full-data response distribution.

When there are baseline covariates with time-invariant coefficients, we have that

for *r* = 0, 1, where ** x** does not contain an intercept.

The MAR assumption requires that for all *X* and *Y*_{1},

By simple algebra, we can see that this restricts β^{(0)} to be equal to β^{(1)}. Note that both β^{(0)} and β^{(1)} are *identified* from the observed data. Therefore, the ACMV restriction (MAR assumption) causes over-identification and affects the model fit to the observed data. This is against the principle of applying identifying restrictions (Little, 1994). Ways to remedy this (and associated problems) are explored in Wang and Daniels (working paper).
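The algebra can be spelled out under one assumed parameterization, which may differ in detail from the authors' specification: within pattern *r*, suppose (*Y*_{1}, *Y*_{2}) given **x** is bivariate normal with means $\mu_j^{(r)} + \mathbf{x}^\top \boldsymbol{\beta}^{(r)}$ for *j* = 1, 2 and covariance matrix with entries $\sigma_{jk}^{(r)}$. The symbols $\mu_j^{(r)}$, $\sigma_{jk}^{(r)}$ and $\gamma^{(r)}$ are introduced here for illustration only.

```latex
\begin{align*}
E[Y_2 \mid Y_1, \mathbf{x}, R = r]
  &= \mu_2^{(r)} + \mathbf{x}^\top \boldsymbol{\beta}^{(r)}
     + \gamma^{(r)} \bigl( Y_1 - \mu_1^{(r)} - \mathbf{x}^\top \boldsymbol{\beta}^{(r)} \bigr),
     \qquad \gamma^{(r)} = \sigma_{12}^{(r)} / \sigma_{11}^{(r)}, \\
  &= \bigl( \mu_2^{(r)} - \gamma^{(r)} \mu_1^{(r)} \bigr)
     + \gamma^{(r)} Y_1
     + \bigl( 1 - \gamma^{(r)} \bigr)\, \mathbf{x}^\top \boldsymbol{\beta}^{(r)} .
\end{align*}
% MAR requires the r = 0 and r = 1 versions to agree for all Y_1 and all x.
% Matching the coefficients of Y_1 and of x gives
\begin{align*}
\gamma^{(0)} = \gamma^{(1)} = \gamma,
\qquad
(1 - \gamma)\, \boldsymbol{\beta}^{(0)} = (1 - \gamma)\, \boldsymbol{\beta}^{(1)}
\;\Longrightarrow\;
\boldsymbol{\beta}^{(0)} = \boldsymbol{\beta}^{(1)} \quad \text{whenever } \gamma \neq 1 .
\end{align*}
```

Because *Y*_{1} is observed in both patterns, each $\boldsymbol{\beta}^{(r)}$ is already determined by the regression of *Y*_{1} on **x** within pattern *r*, so in this sketch the equality is a constraint on identified parameters rather than on the extrapolation alone.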

Sensitivity analysis is critical in the longitudinal analysis of incomplete data with informative drop-out, as stated in this paper. In the setting of missing data, the full-data model can be factored into an extrapolation model and an observed data model,

$$p(\mathbf{y}, \mathbf{r} \mid \boldsymbol{\omega}) = p(\mathbf{y}_{mis} \mid \mathbf{y}_{obs}, \mathbf{r}, \boldsymbol{\omega}_E)\, p(\mathbf{y}_{obs}, \mathbf{r} \mid \boldsymbol{\omega}_I),$$

where **ω**_{E} are parameters indexing the extrapolation model and **ω**_{I} are parameters indexing the observed data model, which are identifiable from the observed data (Daniels and Hogan, 2008). Full-data model inference requires unverifiable assumptions about the extrapolation model *p*(*y*_{mis}|*y*_{obs}, **r**, **ω**_{E}).

Unfortunately, fully parametric selection models and shared parameter models do *not* allow sensitivity analysis, as sensitivity parameters cannot be found (Daniels and Hogan, 2008, Chapter 8). Examining sensitivity to distributional assumptions, e.g., for the random effects, yields *different* fits to the observed data, (*y*_{obs}, **r**). In such cases, a sensitivity analysis cannot be done, since varying the distributional assumptions does not provide equivalent fits to the observed data (Daniels and Hogan, 2008); it becomes a problem of model selection. Next, we provide an example of the inability to find sensitivity parameters in a simple parametric selection model for binary data.

As an example, consider the situation when *Y* = (*Y*_{1}, *Y*_{2}) is a bivariate binary response with missing data only in *Y*_{2}. Let *R* = 1 if *Y*_{2} is observed and *R* = 0 otherwise.

Let $\omega^{(r)}_{y_1 y_2}$ be *P*(*Y*_{1} = *y*_{1}, *Y*_{2} = *y*_{2}, *R* = *r*) and $\omega^{(0)}_{y_1 +}$ be *P*(*Y*_{1} = *y*_{1}, *R* = 0). A multinomial parameterization of the full-data model of *Y* and *R* is shown in Table 1.

In this example, the set of parameters

$$\boldsymbol{\omega}_I = \bigl\{ \omega^{(1)}_{y_1 y_2},\ \omega^{(0)}_{y_1 +} : y_1, y_2 \in \{0, 1\} \bigr\}$$

is identified by the observed data without any modeling assumption. When a selection model is fully parametric, all its parameters can be identified by the observed data. To see this, we specify a parametric model for the bivariate binary example:

Note that we assume

(1)

and

(2)

We will show that the full-data model is identified under the parametric assumptions by showing that all parameters (β_{0}, β_{1}, ϕ_{0}, τ) can be written as functions of **ω**_{I}, the identified ω’s.

First, note that

This gives

(3)

As a consequence, since τ has the interpretation that

thus it is identified by

where are identified by (3).

Further, since τ can also be expressed as

hence we have that are identified as

Therefore, in this parametric selection model, the parameters are all identified (as opposed to their sums, ).

Finally, we can show that

and

The factorization of a selection model provides a transparent way to understand the missing data mechanism. In Bayesian selection models, an intuitive prior specification assumes independence between the parameters of the missing data mechanism (**ϕ**) and the full-data response (**β**) (Scharfstein et al., 2003).

However, in a Bayesian model under this prior specification, sensitivity parameters in a selection model, denoted by τ, can be (weakly) identified by the observed data, i.e., *p*(τ|*y*_{obs}, **r**) ≠ *p*(τ).

In general, a semi-parametric selection model might specify the full data response distribution nonparametrically (or saturated if a categorical response), p(** y**;

for *j* = 1, …, *J*, where *h _{j}* is an arbitrary smooth function, and

To see the cause of the weak identification, let **θ** = {**ϕ**, **β**} and let **ω**_{I} be the identified parameters. By re-parameterizing the model, we may find a mapping, indexed by τ, between **θ** and **ω**_{I}.

Due to the mapping, even a priori independence between τ and **θ** will yield a priori dependence between τ and **ω _{I}**, since

(4)

The Jacobian introduces the dependence.

The posterior for the sensitivity parameters **τ** can be expressed as

Thus from (4), *p*(τ|*y*_{obs}, **r**) ≠ *p*(τ).
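One way to write out this argument is sketched below under assumed notation that is not taken from the paper: $g_\tau$ denotes the mapping indexed by τ, so that $\boldsymbol{\theta} = g_\tau(\boldsymbol{\omega}_I)$, it is assumed to be one-to-one and differentiable, and the prior is assumed to factor as $p(\tau, \boldsymbol{\theta}) = p(\tau)\, p_\theta(\boldsymbol{\theta})$.

```latex
\begin{align*}
% Change of variables from theta to omega_I for fixed tau
p(\boldsymbol{\omega}_I \mid \tau)
  &= p_{\theta}\bigl( g_\tau(\boldsymbol{\omega}_I) \bigr)
     \left| \det \frac{\partial g_\tau(\boldsymbol{\omega}_I)}
                      {\partial \boldsymbol{\omega}_I} \right|, \\
% The observed-data likelihood depends on (tau, theta) only through omega_I
p(\tau \mid y_{obs}, \mathbf{r})
  &\propto p(\tau) \int p(y_{obs}, \mathbf{r} \mid \boldsymbol{\omega}_I)\,
     p(\boldsymbol{\omega}_I \mid \tau)\, d\boldsymbol{\omega}_I .
\end{align*}
```

In this form the integral is free of τ only when $p(\boldsymbol{\omega}_I \mid \tau)$ does not depend on τ. Both the argument of $p_\theta$ and the Jacobian generally vary with τ, so the posterior for τ generally departs from its prior.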

As a concrete example, consider a bivariate binary response with missing data only in *Y*_{2} from the previous section. A saturated selection model can be specified as

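and **θ** = {β_{0}, β_{1}, β_{2}, ϕ_{0}, ϕ_{1}}. MAR holds when τ = 0. Note that τ is not identified by the observed data. It can be shown that for any Δ_{τ}, there exists **Δ**_{θ} such that

$$p(\mathbf{y}_{obs}, \mathbf{r} \mid \tau, \boldsymbol{\theta}) = p(\mathbf{y}_{obs}, \mathbf{r} \mid \tau + \Delta_{\tau}, \boldsymbol{\theta} + \boldsymbol{\Delta}_{\theta}),$$

i.e., (τ, **θ**) and (τ + Δ_{τ}, **θ** + **Δ**_{θ}) yield the same law of the observed data.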
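This invariance can be illustrated numerically. The sketch below assumes one natural saturated parameterization, logit *P*(*Y*_{1} = 1) = β_{0}, logit *P*(*Y*_{2} = 1 | *Y*_{1}) = β_{1} + β_{2}*Y*_{1}, and logit *P*(*R* = 1 | *Y*_{1}, *Y*_{2}) = ϕ_{0} + ϕ_{1}*Y*_{1} + τ*Y*_{2}. That form, the function names, and the parameter values are assumptions made for illustration, not the authors' specification.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def observed_law(theta, tau):
    """Return (a, b): a[y1][y2] = P(Y1=y1, Y2=y2, R=1), b[y1] = P(Y1=y1, R=0)."""
    beta0, beta1, beta2, phi0, phi1 = theta
    a = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for y1 in (0, 1):
        p_y1 = expit(beta0) if y1 else 1.0 - expit(beta0)
        for y2 in (0, 1):
            p2 = expit(beta1 + beta2 * y1)
            p_y2 = p2 if y2 else 1.0 - p2
            p_obs = expit(phi0 + phi1 * y1 + tau * y2)
            a[y1][y2] = p_y1 * p_y2 * p_obs
            b[y1] += p_y1 * p_y2 * (1.0 - p_obs)
    return a, b

def theta_given_tau(a, b, tau):
    """Solve for the theta that reproduces the observed-data law at a chosen tau."""
    # Odds of observing Y2: exp(phi0 + phi1*y1 + tau*y2) = a[y1][y2] / c[y1][y2],
    # where c[y1][y2] = P(Y1=y1, Y2=y2, R=0) must sum over y2 to b[y1].
    lin = [math.log((a[y1][0] + a[y1][1] * math.exp(-tau)) / b[y1])
           for y1 in (0, 1)]
    phi0, phi1 = lin[0], lin[1] - lin[0]
    # Full-data cell probabilities: observed part plus implied unobserved part.
    full = [[a[y1][y2] * (1.0 + math.exp(-(lin[y1] + tau * y2)))
             for y2 in (0, 1)] for y1 in (0, 1)]
    p_y1 = [sum(row) for row in full]
    beta0 = logit(p_y1[1])
    beta1 = logit(full[0][1] / p_y1[0])
    beta2 = logit(full[1][1] / p_y1[1]) - beta1
    return [beta0, beta1, beta2, phi0, phi1]

# Hypothetical parameter values.
theta = [0.2, -0.3, 0.8, 0.5, -0.4]
tau = 0.7
a, b = observed_law(theta, tau)

# Pick a different tau and find the theta giving the same observed-data law.
tau_new = -1.0
theta_new = theta_given_tau(a, b, tau_new)
a_new, b_new = observed_law(theta_new, tau_new)

print("theta     :", theta)
print("theta_new :", theta_new)
print("observed law unchanged:",
      all(abs(a[i][j] - a_new[i][j]) < 1e-10 for i in (0, 1) for j in (0, 1))
      and all(abs(b[i] - b_new[i]) < 1e-10 for i in (0, 1)))
```

In this sketch β_{0} is unchanged because *Y*_{1} is always observed, while the remaining components of **θ** move with τ to keep the observed-data probabilities fixed.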

Let **θ**^{*} = {*e*^{β0}, *e*^{β1}, *e*^{β2}, *e*^{ϕ0}, *e*^{ϕ1}} and τ^{*} = *e*^{τ}. We can derive that

The a priori dependence of *p*(**ω _{I}**|τ) is thus introduced by . This has been pointed out in Scharfstein et al. (2003) and explored further in Wang et al. (working paper).

- Azzalini A. Logistic regression for autocorrelated data with application to repeated measures. Biometrika. 1994;81(4):767–775.
- Daniels MJ, Hogan JW. Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis. Chapman & Hall/CRC; 2008.
- Fitzmaurice GM, Laird NM, Shneyer L. An Alternative Parameterization of the General Linear Mixture Model for Longitudinal Data with Non-ignorable Drop-outs. Statistics in Medicine. 2001;20(7):1009–1021.
- Heagerty PJ. Marginally Specified Logistic-Normal Models for Longitudinal Binary Data. Biometrics. 1999;55(3):688–698.
- Heagerty PJ. Marginalized Transition Models and Likelihood Inference for Longitudinal Categorical Data. Biometrics. 2002;58(2):342–351.
- Little RJA. Pattern-mixture models for multivariate incomplete data. Journal of the American Statistical Association. 1993;88(421):125–134.
- Little RJA. A class of pattern-mixture models for normal incomplete data. Biometrika. 1994;81(3):471–483.
- Molenberghs G, Michiels B, Kenward MG, Diggle PJ. Monotone missing data and pattern-mixture models. Statistica Neerlandica. 1998;52(2):153–161.
- Roy J, Daniels MJ. A general class of pattern mixture models for nonignorable dropout with many possible dropout times. Biometrics. 2008;64:538–545.
- Scharfstein DO, Daniels MJ, Robins JM. Incorporating prior beliefs about selection bias into the analysis of randomized trials with missing outcomes. Biostatistics. 2003;4(4):495.
- Troxel AB, Ma G, Heitjan DF. An Index of Local Sensitivity to Nonignorability. Statistica Sinica. 2004;14(4):1221–1238.
- Wang C, Daniels MJ. A note on identifying restriction in normal mixture models with and without covariates for incomplete data. working paper.
- Wang C, Daniels MJ, Scharfstein DO. Bayesian semiparametric selection model with application to a breast cancer prevention trial. working paper.
- Wilkins KJ, Fitzmaurice GM. A Hybrid Model for Nonignorable Dropout in Longitudinal Binary Responses. Biometrics. 2006;62(1):168–176.
- Zhang J, Heitjan DF. A Simple Local Sensitivity Analysis Tool for Nonignorable Coarsening: Application to Dependent Censoring. Biometrics. 2006;62(4):1260–1268.
