In the current study, we consider a longitudinal data setting with binary outcomes, a repeated binary predictor, a repeated continuous mediator, and a continuous covariate measured at baseline. An example of such a clinical setting would be a prospective cohort study evaluating the impact of heavy alcohol consumption on HIV disease progression, defined as low CD4 cell count (e.g. <350 cells/*μ*L). Heavy alcohol consumption may influence progression of HIV, while also influencing adherence to anti-retroviral therapy (ART). Level of adherence to ART is also a predictor of HIV disease progression. In this setting there is a repeated binary independent variable of primary interest, heavy alcohol consumption (*z*_{j}), and a longitudinal binary outcome, low CD4 cell count (*Y*_{j}) signifying HIV progression. In addition, ART adherence (*M*_{j}), a continuous mediating variable, is measured repeatedly, and age (*w*) is a continuous covariate assessed at baseline. ART adherence is said to be a mediator because the primary independent variable, heavy alcohol use, may affect CD4 count directly as well as indirectly through ART adherence. We arbitrarily assume six time-points at which the predictor, outcome and mediator are measured. Time is represented by *t*_{j }with *j *= 1, 2,..., 6. In this setting, we considered measurement times to be equally spaced and the same for all individuals. We generated data with a mediated non-linear relationship between the predictor (heavy alcohol consumption) and outcome (low CD4 cell count), i.e. we allowed the mediator (ART adherence) to be directly affected by the predictor and the outcome to be directly affected by both the predictor and mediator. Both the probit and logit links were assessed. We also describe the application of these models to data from a prospective cohort study evaluating the impact of heavy alcohol use on HIV disease progression.

As described by Fitzmaurice, Laird, and Ware [18] and others, binary outcome models can be described equivalently in two ways. The first approach would be to define a linear function of an underlying latent continuous variable (*Y**) that when dichotomized represents the observed binary outcome (*Y*). For example, we could define a continuous latent (unobserved) variable *Y** such that observed *Y* is 1 if *Y** > 0 and 0 otherwise and write:

A second approach would be to define a non-linear model of the probability of a binary response. If we consider ϵ ~ N(0,1), the model in Equation 1 defines a univariate probit model that can be equivalently represented using the following non-linear link format:

Likewise, if we consider the errors in Equation 1 to have a standard logistic distribution (mean of 0 and variance of π²/3), the model defines a univariate logistic model that can be represented as:

In more complex situations, such as the longitudinal data we are studying, the same equivalence between model descriptions exists, and we use both model formulations for the NLMMs and SEMs that follow. The convention for binary or categorical outcomes in SEMs has been to describe binary regression models with the latent variable format, while NLMMs are often defined using the non-linear link format.
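The equivalence between the latent-variable and non-linear link formulations can be checked numerically in the univariate case. The sketch below is illustrative only: it assumes a single fixed linear predictor `eta` (a hypothetical stand-in for the right-hand side of the latent-variable model without its error term) and compares the share of simulated latent values above zero with the probability given by the corresponding link function. The function names and the simulation size are assumptions, not part of the original analysis.

```python
import math
import random
from statistics import NormalDist


def probit_prob(eta):
    """P(Y = 1) under the probit link: standard normal CDF at eta."""
    return NormalDist().cdf(eta)


def logit_prob(eta):
    """P(Y = 1) under the logit link: inverse logit of eta."""
    return 1.0 / (1.0 + math.exp(-eta))


def latent_share(eta, error="normal", n=200_000, seed=0):
    """Share of simulated latent values Y* = eta + error that exceed 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        if error == "normal":
            e = rng.gauss(0.0, 1.0)
        else:
            # Standard logistic draw via the inverse CDF, log(u / (1 - u)).
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            e = math.log(u / (1.0 - u))
        hits += (eta + e) > 0
    return hits / n
```

With a standard normal error the simulated share should sit close to `probit_prob(eta)`, and with a standard logistic error close to `logit_prob(eta)`, up to Monte Carlo error.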

SEM

To evaluate the performance of NLMMs in a setting conducive to the use of SEMs, we generated mediated longitudinal binary outcomes using a non-linear SEM. We then fit the data with a NLMM as well as the non-linear SEM to evaluate the performance of the NLMM relative to the SEM. The non-linear SEM used to generate the data and subsequently fit to the generated data is described below.

Following the notation from above, *z*_{j} is the independent variable of primary interest, *M*_{j} is the continuous mediating variable, *w* is a continuous time-invariant covariate, and *t*_{j} represents time-point. Using the latent variable notation, we define a continuous unobserved outcome *Y**_{j} such that the observed outcome *Y*_{j} takes a value of 1 only if *Y**_{j} > 0, for *j* = 1 to 6. This model can be expressed as follows (dropping the subject index *i* for simplicity), where:

Measurement model

$$Y_j^* = U_1 + U_2 t_j + \lambda M_j + \kappa z_j + \epsilon_j \qquad (2)$$

Just as in the simpler models above, if we assume ϵ_{j} ~ N(0,1) this defines a probit model, and if we assume ϵ_{j} ~ Logistic(0, π²/3) this defines a logit model.

Structural model

For *j* = 1 to 6,

$$
\begin{aligned}
U_1 &= \alpha_1 + \gamma_2 w + \zeta_1 && (3)\\
U_2 &= \alpha_2 + \zeta_2 && (4)\\
M_j &= \alpha_3 + \gamma_1 z_j + \zeta_{2+j} && (5)
\end{aligned}
$$

where *U*_{1} represents a latent intercept, *U*_{2} represents a latent slope, *z*_{j} represents the repeated binary predictor and *M*_{j} represents the repeated continuous mediator. The errors in the structural model are normally distributed with cov(*ζ*_{1}, *ζ*_{2}) = Ψ, cov(*ζ*_{3} : *ζ*_{8}) = Θ and *ζ*_{(2+j)} ~ N(0, *θ*). This model can be represented in a path diagram (Figure ), a visual display of the interrelationships between variables typically presented along with SEMs.

The parameters of the SEM defined in Equations 2 - 5 include: λ, which represents the effect of the repeated mediator on the repeated outcome; *γ*_{1}, which represents the effect of the repeated primary independent variable on the repeated mediator; *γ*_{2}, which represents the effect of the continuous covariate on the repeated outcomes; and *κ*, which represents the direct effect of the repeated independent variable on the repeated outcome.

In this simulation study we focused on the total effect of the repeated binary predictor on the repeated binary outcome, which is represented by λ*γ*_{1} + *κ* (the indirect effect through the mediator, λ*γ*_{1}, plus the direct effect, *κ*). The interpretation of the parameters of this model is subject-specific since it represents the effect of a predictor on the outcome when the individual intercept, individual slope and mediator value are held constant.

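When the structural model Equations (3-5) are substituted into the measurement model Equation 2, the full model can be rewritten as:

$$Y_j^* = \omega_j + \zeta_1 + \zeta_2 t_j + \lambda \zeta_{2+j} + \epsilon_j \qquad (6)$$

where *ω*_{j} = (*α*_{1} + λ*α*_{3}) + *γ*_{2}*w* + *α*_{2}*t*_{j} + (*κ* + λ*γ*_{1})*z*_{j}. The following presents the non-linear link formats for the probit and logit SEMs where the structural equations have been substituted into the measurement equation (the subject index *i* has again been dropped for simplicity):

Probit SEM

$$P(Y_j = 1 \mid \zeta_1, \zeta_2, \zeta_{2+j}) = \Phi\left(\omega_j + \zeta_1 + \zeta_2 t_j + \lambda \zeta_{2+j}\right) \qquad (7)$$

Logit SEM

$$P(Y_j = 1 \mid \zeta_1, \zeta_2, \zeta_{2+j}) = \frac{\exp\left(\omega_j + \zeta_1 + \zeta_2 t_j + \lambda \zeta_{2+j}\right)}{1 + \exp\left(\omega_j + \zeta_1 + \zeta_2 t_j + \lambda \zeta_{2+j}\right)} \qquad (8)$$

To fit these models, Mplus uses maximum likelihood estimation when a logit link is used and weighted least squares estimation with a robust estimation of standard errors (WLSMV) when a probit link is used [19].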
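The data-generating process described in this section can be sketched in code. The example below is a minimal illustration, not the simulation code used in the study. It assumes a model consistent with the expression for *ω*_{j} above: a latent intercept *U*_{1} = *α*_{1} + *γ*_{2}*w* + *ζ*_{1}, a latent slope *U*_{2} = *α*_{2} + *ζ*_{2}, a mediator *M*_{j} = *α*_{3} + *γ*_{1}*z*_{j} + *ζ*_{2+j}, and a latent outcome *Y**_{j} = *U*_{1} + *U*_{2}*t*_{j} + λ*M*_{j} + *κz*_{j} + ϵ_{j}. All parameter values, the time coding 0 to 5, the distributions of *w* and *z*_{j}, and the independence of the structural errors (diagonal Ψ and Θ) are assumptions made only for illustration.

```python
import math
import random

TIMES = [0, 1, 2, 3, 4, 5]  # six equally spaced time points (coding assumed)

# Hypothetical parameter values, chosen only for illustration.
PARAMS = {
    "alpha1": -0.5, "alpha2": 0.1, "alpha3": 0.0,
    "gamma1": -0.8, "gamma2": 0.2, "lam": -0.6, "kappa": 0.4,
    "psi1": 0.5, "psi2": 0.05,  # variances of zeta_1, zeta_2 (diagonal Psi assumed)
    "theta": 1.0,               # variance of each mediator residual zeta_{2+j}
}


def draw_error(rng, link):
    """Measurement error: N(0,1) for probit, standard logistic for logit."""
    if link == "probit":
        return rng.gauss(0.0, 1.0)
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return math.log(u / (1.0 - u))


def simulate_subject(p, rng, link="probit"):
    """Return (w, rows) with one (t, z, M, Y) tuple per time point."""
    w = rng.gauss(0.0, 1.0)  # baseline covariate (distribution assumed)
    u1 = p["alpha1"] + p["gamma2"] * w + rng.gauss(0.0, math.sqrt(p["psi1"]))
    u2 = p["alpha2"] + rng.gauss(0.0, math.sqrt(p["psi2"]))
    rows = []
    for t in TIMES:
        z = 1 if rng.random() < 0.5 else 0  # binary predictor (prevalence assumed)
        m = p["alpha3"] + p["gamma1"] * z + rng.gauss(0.0, math.sqrt(p["theta"]))
        y_star = u1 + u2 * t + p["lam"] * m + p["kappa"] * z + draw_error(rng, link)
        rows.append((t, z, m, int(y_star > 0)))
    return w, rows


def total_effect(p):
    """Total effect of z on the latent outcome: indirect plus direct."""
    return p["lam"] * p["gamma1"] + p["kappa"]
```

Calling `simulate_subject(PARAMS, random.Random(1))` yields one subject's six records; repeating the call over many subjects gives a data set of the kind described above.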

Non-linear mixed effects model

The following NLMM was evaluated in comparison to the SEM:

$$Y_j^* = \beta_0 + \beta_1 w + \beta_2 t_j + \beta_3 z_j + b_1 + b_2 t_j + e_j \qquad (9)$$

where *b*_{1} is a random individual intercept and *b*_{2} is a random individual slope. Since the objective is to evaluate the total effect of the main independent variable, the mediator is excluded from this model [20]. The regression coefficient associated with the primary predictor (*β*_{3}) therefore represents its total (i.e. direct plus indirect) effect on the outcome [14].

Probit NLMM

The probit model assumes that *e*_{j} ~ N(0,1) and can be written as:

$$P(Y_j = 1 \mid b_1, b_2) = \Phi\left(\nu_j + b_1 + b_2 t_j\right)$$

where *ν*_{j} = *β*_{0} + *β*_{1}*w* + *β*_{2}*t*_{j} + *β*_{3}*z*_{j}.

Logit NLMM

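The logit model assumes that *e*_{j} ~ Logistic(0, π²/3) and can be written as:

$$P(Y_j = 1 \mid b_1, b_2) = \frac{\exp\left(\nu_j + b_1 + b_2 t_j\right)}{1 + \exp\left(\nu_j + b_1 + b_2 t_j\right)}$$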

These models can be fit with SAS PROC NLMIXED, which estimates parameters via maximum likelihood [21]. We note that the regression coefficients of the NLMM are interpreted conditional on the random individual intercept and random individual slope, but marginal on the residual error of the mediator (since the mediator is not included in the model).
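To make the phrase "conditional on the random individual intercept and random individual slope" concrete, the sketch below evaluates the NLMM response probability and one subject's log-likelihood contribution for fixed values of the random effects. It assumes the linear predictor *ν*_{j} + *b*_{1} + *b*_{2}*t*_{j} with *ν*_{j} as defined above. The function names are hypothetical and this is not PROC NLMIXED code; a full maximum likelihood fit would additionally integrate this conditional likelihood over the distribution of the random effects.

```python
import math
from statistics import NormalDist


def conditional_prob(beta, w, t, z, b_int, b_slope, link="probit"):
    """P(Y_j = 1 | b_1, b_2) for fixed random intercept and slope."""
    beta0, beta1, beta2, beta3 = beta
    nu = beta0 + beta1 * w + beta2 * t + beta3 * z
    eta = nu + b_int + b_slope * t
    if link == "probit":
        return NormalDist().cdf(eta)
    return 1.0 / (1.0 + math.exp(-eta))


def conditional_loglik(beta, w, times, zs, ys, b_int, b_slope, link="probit"):
    """Log-likelihood of one subject's binary outcomes given b_1 and b_2."""
    total = 0.0
    for t, z, y in zip(times, zs, ys):
        p = conditional_prob(beta, w, t, z, b_int, b_slope, link)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total
```

Because the mediator does not appear in the linear predictor, `beta3` here plays the role of the total effect of the predictor, on the scale discussed in the next section.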

Comparing NLMMs to SEMs

As noted previously, the SEM and NLMM condition differently on the mediating variable. Specifically, the SEM conditions on the random intercept and slope as well as on the residual error of the mediating variable, while the NLMM conditions only on the random intercept and random individual slope. Thus estimates from the two types of models are not directly comparable. Instead, to compare parameters from the NLMM to those of the SEM, we must first re-scale the regression coefficient from the NLMM so that it represents the effect of the primary predictor variable *z*_{j} conditional on the mediator. To determine the scaling factor, we rewrite the SEM (for both the probit and logistic models) conditional only on the random intercept and slope to mimic the conditioning in the NLMM.

Comparing probit models

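For the probit SEM, we generated the data according to the model described in Equations 6 and 7. Conditioning only on the random intercept and slope errors (*ζ*_{1} and *ζ*_{2}), but not on the residual error of the mediator (*ζ*_{2+j}), it can be shown that:

$$P(Y_j = 1 \mid \zeta_1, \zeta_2) = P\left(\epsilon_j + \lambda \zeta_{2+j} > -(\omega_j + \zeta_1 + \zeta_2 t_j)\right) \qquad (11)$$

The sum of terms on the left-hand side of the inequality does not have a standard normal distribution since λ*ζ*_{2+j} is added to ϵ_{j}, which itself has a standard normal distribution. In order to express the probability in Equation 11 using the standard normal cumulative probability function, we re-scale the terms on either side of the inequality by the standard deviation of ϵ_{j} + λ*ζ*_{2+j} to create a standard normal random variable:

$$P(Y_j = 1 \mid \zeta_1, \zeta_2) = P\left(\frac{\epsilon_j + \lambda \zeta_{2+j}}{\sqrt{1 + \lambda^2 \theta}} > -\frac{\omega_j + \zeta_1 + \zeta_2 t_j}{\sqrt{1 + \lambda^2 \theta}}\right) = \Phi\left(\frac{\omega_j + \zeta_1 + \zeta_2 t_j}{\sqrt{1 + \lambda^2 \theta}}\right)$$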

Conditioning on only the random individual intercept and slope, all regression coefficients are divided by the factor $\sqrt{1 + \lambda^2 \theta}$. For example, the regression coefficient associated with *z*_{j}, which was *κ* + λ*γ*_{1}, is now $(\kappa + \lambda \gamma_1)/\sqrt{1 + \lambda^2 \theta}$. Thus, the model parameters from the SEM are scaled to the variance of ϵ_{j} + λ*ζ*_{2+j}, which is 1 + λ^{2}*θ*, and the model parameters from the NLMM are scaled to the variance of ϵ_{j}, which is 1, resulting in a scaling factor of $\sqrt{1 + \lambda^2 \theta}$.

Parameter estimates from the SEM and NLMM must be on the same scale before making direct comparisons. For example, the total effect of the main independent variable from the probit NLMM, *β*_{3}, which is also conditioned only on the random individual intercept and slope (Equation 9), should be multiplied by a factor of $\sqrt{1 + \lambda^2 \theta}$ before it is compared to the total effect from the probit SEM, *κ* + λ*γ*_{1}. Direct comparisons of parameter estimates from the NLMM to those from the SEM without first re-scaling would underestimate effects by a factor of $\sqrt{1 + \lambda^2 \theta}$. In the current study we present the conditional total effect estimates from the SEM and compare them to scaled and unscaled NLMM estimates. Note that in the analysis of real (i.e. non-simulated) data, true parameter values are unknown and therefore must be estimated. We describe in the appendix two approaches for rescaling estimates in practice to allow direct comparisons between NLMMs and SEMs or to compute mediated effects via NLMMs only.
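The probit rescaling argument can be checked with a small simulation. The sketch below is an illustration under stated assumptions: it fixes a value `eta` standing in for *ω*_{j} + *ζ*_{1} + *ζ*_{2}*t*_{j}, draws ϵ_{j} ~ N(0,1) and *ζ*_{2+j} ~ N(0, *θ*) independently, and compares the simulated probability that the latent outcome exceeds zero with the closed form obtained by dividing `eta` by the square root of 1 + λ^{2}*θ*. The function names and the values of λ and *θ* in the usage note are hypothetical.

```python
import math
import random
from statistics import NormalDist


def probit_scale(lam, theta):
    """Scaling factor sqrt(1 + lambda^2 * theta) for the probit models."""
    return math.sqrt(1.0 + lam ** 2 * theta)


def marginal_prob_closed(eta, lam, theta):
    """P(Y = 1) after integrating out the mediator residual (probit case)."""
    return NormalDist().cdf(eta / probit_scale(lam, theta))


def marginal_prob_sim(eta, lam, theta, n=200_000, seed=1):
    """Monte Carlo estimate of P(eta + lam * zeta + eps > 0)."""
    rng = random.Random(seed)
    sd_zeta = math.sqrt(theta)
    hits = 0
    for _ in range(n):
        latent = eta + lam * rng.gauss(0.0, sd_zeta) + rng.gauss(0.0, 1.0)
        hits += latent > 0
    return hits / n


def rescale_nlmm_estimate(beta3, lam, theta):
    """Put an NLMM total-effect coefficient on the SEM scale."""
    return beta3 * probit_scale(lam, theta)
```

For example, with λ = 0.75 and *θ* = 1 the factor is 1.25, so an NLMM coefficient of 0.4 would be compared to the SEM total effect as 0.5.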

Comparing logistic models

Unlike the probit model, when the logit SEM is conditioned on only the random intercept and slope, the true relationship between the predictor and the outcome no longer follows a logistic model. That is, the distribution of the terms on the left-hand side of Equation 11 in a logit SEM does not follow a logistic distribution, since the sum of a logistic random variable (ϵ_{j}) and a normal random variable (λ*ζ*_{2+j}) does not follow a logistic distribution. The result of this is that the scaled coefficients from the logit NLMM only approximate the mediated relationship described in a logit SEM. A similar situation occurs, for example, when comparing a non-linear mixed model to a non-linear generalized estimating equation as noted by Fitzmaurice, Laird, and Ware [18].

The scale factor for the logit model is created in the same way as it was for the probit model. The regression coefficient representing the total effect of the main independent variable (*β*_{3}) from the logit NLMM can be multiplied by the standard deviation of ϵ_{j} + λ*ζ*_{2+j} and divided by the standard deviation of ϵ_{j}. The scaling factor for the logit model is therefore:

$$\frac{\sqrt{\pi^2/3 + \lambda^2 \theta}}{\sqrt{\pi^2/3}}$$
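The approximate nature of the logit rescaling can also be illustrated numerically. The sketch below assumes the scaling factor is the ratio of the standard deviation of ϵ_{j} + λ*ζ*_{2+j} (variance π²/3 + λ^{2}*θ*, with independent errors) to that of ϵ_{j} (variance π²/3). It compares a simulated marginal probability with the logistic curve evaluated at the rescaled linear predictor. The function names and parameter values are hypothetical, and the two quantities are expected to be close rather than equal.

```python
import math
import random

LOGISTIC_VAR = math.pi ** 2 / 3  # variance of the standard logistic distribution


def logit_scale(lam, theta):
    """Ratio of SD(eps + lam * zeta) to SD(eps) for a standard logistic eps."""
    return math.sqrt((LOGISTIC_VAR + lam ** 2 * theta) / LOGISTIC_VAR)


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def approx_marginal(eta, lam, theta):
    """Logistic approximation to P(Y = 1) after rescaling the predictor."""
    return expit(eta / logit_scale(lam, theta))


def simulated_marginal(eta, lam, theta, n=200_000, seed=2):
    """Monte Carlo estimate of P(eta + lam * zeta + eps > 0), eps logistic."""
    rng = random.Random(seed)
    sd_zeta = math.sqrt(theta)
    hits = 0
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        eps = math.log(u / (1.0 - u))  # standard logistic draw
        hits += (eta + lam * rng.gauss(0.0, sd_zeta) + eps) > 0
    return hits / n
```

Comparing `simulated_marginal(1.0, 1.0, 1.0)` with `approx_marginal(1.0, 1.0, 1.0)` should show a small but non-zero gap beyond Monte Carlo error, in line with the statement that the scaled logit NLMM coefficients only approximate the logit SEM.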