Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2818753

Formats

Article sections

- SUMMARY
- 1. Introduction
- 2. Models for Bivariate Binary and Continuous Outcomes
- 3. Simulation Study
- 4. Applications
- 5. Discussion
- references

Authors

Related links

Stat Med. Author manuscript; available in PMC 2010 June 15.

Published in final edited form as:

PMCID: PMC2818753

NIHMSID: NIHMS164802

Increasingly multiple outcomes are collected in order to characterize treatment effectiveness or to evaluate the impact of large policy initiatives. Often the multiple outcomes are non-commensurate, e.g., measured on different scales. The common approach to inference is to model each outcome separately ignoring the potential correlation among the responses. We describe and contrast several full likelihood and quasi-likelihood multivariate methods for non-commensurate outcomes. We present a new multivariate model to analyze binary and continuous correlated outcomes using a latent variable. We study the efficiency gains of the multivariate methods relative to the univariate approach. For complete data, all approaches yield consistent parameter estimates. When the mean structure of all outcomes depends on the same set of covariates, efficiency gains by adopting a multivariate approach are negligible. In contrast, when the mean outcomes depend on different covariate sets large efficiency gains are realized. Three real examples illustrate the different approaches.

Often multiple outcomes are collected in health-related studies in order to characterize treatment effectiveness or associations with covariates. This observation is particularly true in psychiatric studies where the primary outcome is an abstract construct that cannot be measured directly. Instead, several variables are measured as proxies of the underlying outcome of interest. For example, in evaluating the effectiveness of a new anti-psychotic, researchers will examine several outcomes such as the positive and negative syndrome scale (PANSS) score, symptom relapse, and quality of life.

Typically the multiple outcomes are *non-commensurate*, i.e., they are measured on different scales such as continuous and binary responses. Although there has been some development of multivariate methods for non-commensurate outcomes, the usual modeling strategy is to consider each outcome separately in a univariate framework. This strategy is less efficient in the sense that such an approach ignores the extra information contained in the correlation among the outcomes. Other advantages of a multivariate setting include better control over the type I error rates in multiple tests and the ability to answer intrinsically multivariate questions. For example, we might be interested in assessing the impact of a policy change on the quality of care (underlying outcome) rather than its impact on each outcome measured as a proxy of quality of care.

The challenge for multivariate methods is the nonexistence of obvious multivariate distributions for non-commensurate variables. Two general likelihood-based multivariate approaches have been proposed to avoid the direct specification of the joint distribution of the outcomes: factorizing the joint distribution of the outcomes and introducing an unobserved (latent) variable to model the correlation among the multiple outcomes.

The main idea of the factorization method is to write the likelihood as the product of the marginal distribution of one outcome and the conditional distribution of the second outcome given the previous outcome. Cox and Wermuth [1] discussed two possible factorizations for modeling a continuous and a binary outcome as functions of covariates. Fitzmaurice and Laird [2], and Catalano and Ryan [3] extended this approach to situations of clustered data.

Several models using latent variables have been proposed to analyze multiple non-commensurate outcomes as functions of covariates. Sammel et al. [4] discussed a model where the outcomes are assumed to be a physical manifestation of a latent variable and conditional on this latent variable, the observed outcomes follow a one-parameter exponential family model. The observed outcomes are modeled as functions of fixed covariates and a subject-specific latent variable. A drawback of this model that was later addressed by the authors, is its non-robustness to misspecification of the covariance because the mean parameters depend heavily on the covariance parameters. For example, if the outcomes are not correlated, the estimates of the covariate effects may be biased [5].

Arminger and Kusters [6] considered each outcome as a manifestation of an underlying continuous latent variable that is normally distributed. Dunson [7] extended this approach to accommodate non-normal latent variables, clustered data, non-linear relationships between the observed outcome and the underlying variables, multiple latent variables for each outcome type and covariate-dependent modifications of the relationship between the latent and underlying variables. Similar approaches were presented in the context of toxicity studies where longitudinal measurements are taken regarding multiple outcomes [8; 9]. Although very general, Dunson’s approach [7] produces a non-identifiable model for the case of a bivariate, binary or continuous outcome. This fact is well known in factor analysis (see for example Reilly [10]) where each factor needs to be a combination of three or more outcomes in order for the model to be identifiable, otherwise the parameter space has to be reduced. Often this is achieved by fixing some parameters to a constant. However, in Dunson’s model, it is not clear how to constrain the parameters to make the model identifiable without misspecifying the model for the mean or covariance. Lin et al. [11] addressed a similar identifiability problem in the context of models for multiple continuous outcomes by scaling the outcomes to have the same variance.

A quasi-likelihood approach was also proposed for non-commensurate outcomes. The generalized estimating equations (GEE) described by Liang and Zeger [12] were extended by Prentice and Zhao [13], and Zhao et al. [14] for mixtures of continuous and discrete outcomes. In their approach separate equations are used for each outcome and a working correlation matrix is used to induce the correlation among the outcomes. A sandwich-type variance can then be computed for the model parameters which is robust to the misspecification of the working correlation matrix. Despite the attractive properties of this approach, we are unaware of its use in practice.

In this paper we review the different approaches to model a binary and a continuous outcome. We introduce a new latent variable model by constraining the parameters of the latent model proposed by Dunson [7] for identifiability without restrictions on the correlation. We show that this latent model is equivalent to the factorization model presented by Catalano and Ryan [3] by demonstrating that they are only different parameterizations of the same model. We also implement the GEE approach proposed by Prentice and Zhao [13]. Simulation studies are used to compare consistency, efficiency, and coverage of the multivariate approach with the univariate approach. Section 2 describes the usual univariate approach, and both likelihood-based and quasi-likelihood multivariate methods to model a continuous and a binary outcome. In Section 3 we compare estimates obtained from the latent variable model, the factorization model, and the GEE with those from the univariate approach in terms of bias and efficiency. Finally in Section 4 three real data sets illustrate our methods.

Let *y _{bi}* denote a binary outcome,

One common approach to model multiple outcomes as functions of covariates is to ignore the correlation between the outcomes and fit a separate model to each response variable. In this setting we use a probit regression for the binary response and a linear regression for the continuous response,

$$\begin{array}{cc}\hfill \mathit{probit}\left(E({y}_{bi}\mid {\mathbf{x}}_{bi})\right)=\mathit{probit}\left({\mu}_{bi}\right)={\mathbf{x}}_{bi}^{T}{\beta}_{b}& \hfill \\ \hfill {y}_{ci}\mid {\mathbf{x}}_{ci}={\mathbf{x}}_{ci}^{T}{\beta}_{c}+{\u220a}_{i}& \hfill \end{array}$$

(1)

where *β _{b}* = (

Fitzmaurice and Laird [2] proposed a model for a correlated binary and a continuous outcome based on the factorization of the joint distribution of the outcomes, *f*(*y _{b}, y_{c}*) =

$$\begin{array}{cc}\hfill \mathit{probit}\left(E({y}_{bi}\mid {\mathbf{x}}_{bi})\right)=\mathit{probit}\left({\mu}_{bi}\right)={\mathbf{x}}_{bi}^{T}{\beta}_{b}& \hfill \\ \hfill {y}_{ci}\mid {y}_{bi},{\mathbf{x}}_{ci},{\mathbf{x}}_{bi}={\mathbf{x}}_{ci}^{T}{\beta}_{c}+\tau ({y}_{bi}-{\mu}_{bi})+{\u220a}_{ci}& \hfill \end{array}$$

(2)

where ${\u220a}_{ci}~N(0,{\sigma}_{c}^{2})$ and τ is the parameter for the regression of *y _{ci}* on

$$\mathit{Corr}({y}_{bi},{y}_{ci}\mid {\mathbf{x}}_{bi},{\mathbf{x}}_{ci})=\{\begin{array}{cc}\frac{\mathit{sign}\left(\tau \right)}{\sqrt{1+\frac{{\sigma}_{c}^{2}}{{\tau}^{2}\mathit{Var}({y}_{bi}\mid {\mathbf{x}}_{bi})}}},\hfill & \text{if}\phantom{\rule{thickmathspace}{0ex}}\tau \ne 0\hfill \\ 0,\phantom{\rule{thinmathspace}{0ex}}\text{if}\phantom{\rule{thickmathspace}{0ex}}\tau =0\hfill & \hfill \end{array}\phantom{\}}.$$

(3)

This factorization of the joint distribution has the convenient property that the model parameters have a marginal interpretation. *β _{bk}* is change of the

Maximum likelihood estimates for the regression parameters of the factorization method can be obtained with commonly used algorithms for maximizing the likelihood. The log-likelihood function under the factorization model (2) is

$$\begin{array}{cc}\hfill l({y}_{b},{y}_{c})=\mathit{log}\prod _{i=1}^{n}f({y}_{bi},{y}_{ci}\mid {\mathbf{x}}_{bi},{\mathbf{x}}_{ci})=\mathit{log}\prod _{i=1}^{n}f({y}_{ci}\mid {y}_{bi},{\mathbf{x}}_{ci},{\mathbf{x}}_{bi})f({y}_{bi}\mid {\mathbf{x}}_{bi})& \hfill \\ \hfill =\sum _{i=1}^{n}\left(-\frac{1}{2}\mathit{log}\left(2\pi {\sigma}_{c}^{2}\right)-\frac{1}{2{\sigma}_{c}^{2}}{({y}_{ci}-{\mu}_{ci}-\tau ({y}_{bi}-\Phi \left({\mu}_{bi}\right)\left)\right)}^{2}\right)+& \hfill \\ \hfill +\sum _{i=1}^{n}\left({y}_{bi}\mathit{log}\right(\Phi \left({\mu}_{bi}\right))+(1-{y}_{bi}\left)\mathit{log}\right(\Phi (1-{\mu}_{bi})\left)\right)& \hfill \end{array}$$

(4)

where ${\mu}_{bi}={\mathbf{x}}_{bi}^{T}{\beta}_{b}$, ${\mu}_{ci}={\mathbf{x}}_{ci}^{T}{\beta}_{c}$ and **Φ**(·) represents the *cdf* of the standard normal distribution.

The factorization of the joint distribution of *y _{bi}* and

$$\begin{array}{cc}\hfill \mathit{probit}\left(E({y}_{bi}\mid {\mathbf{x}}_{ci},{\mathbf{x}}_{bi})\right)=\mathit{probit}\left({\mu}_{bi}\right)={\mathbf{x}}_{bi}^{T}{\beta}_{b}+{\tau}^{\prime}({y}_{ci}-E({y}_{ci}\mid {\mathbf{x}}_{ci}\left)\right)& \hfill \\ \hfill {y}_{ci}\mid {\mathbf{x}}_{ci}={\mathbf{x}}_{ci}^{T}{\beta}_{c}+{\u220a}_{ci}& \hfill \end{array}$$

(5)

where ${\u220a}_{ci}~N(0,{\sigma}_{c}^{2})$ and τ′ is the parameter for the regression of *y _{bi}* on

Sammel et al. [4] presented a latent variable model where it is assumed that the observed outcomes are physical manifestations of a latent variable. Conditional on this latent variable, the outcomes are assumed to be independent, and are modeled as functions of fixed covariates and a subject-specific latent variable. The effect of the covariate(s) is modeled through the latent variable. Let *u _{i}* denote the latent variable and

$$\begin{array}{cc}\hfill \mathit{probit}\left(E\right({y}_{bi}\mid {u}_{i}\left)\right)={\beta}_{b1}+{\beta}_{b2}{u}_{i}& \hfill \\ \hfill {y}_{ci}\mid {u}_{i}={\beta}_{c1}+{\beta}_{c2}{u}_{i}+{\u220a}_{ci}& \hfill \end{array}$$

(6)

and ${\u220a}_{ci}~N(0,{\sigma}_{c}^{2})$. Here, *β _{b}*

Another approach based on latent variables was proposed by Dunson [7]. A major difference between this approach and Sammel’s approach relates to the association between the responses and the covariates. In Dunson’s approach the covariates are not included in the model through the latent variable but rather introduced separately. For the case of a binary and a continuous outcome, Dunson’s model would be written as

$$\begin{array}{cc}\hfill \mathit{probit}\left(E\right({y}_{bi}\mid {\mathbf{x}}_{bi},{u}_{i}\left)\right)={\mathbf{x}}_{bi}^{T}{\beta}_{b}+{\lambda}_{b}{u}_{i}& \hfill \\ \hfill {y}_{ci}\mid {\mathbf{x}}_{ci},{u}_{i}={\mathbf{x}}_{ci}^{T}{\beta}_{c}+{\lambda}_{c}{u}_{i}+{\u220a}_{ci}& \hfill \end{array}$$

(7)

where ${\u220a}_{ci}~N(0,{\sigma}_{c}^{2})$ and ${u}_{i}~N(0,{\sigma}_{u}^{2})$ is a subject-specific latent variable. The means and covariance structure are modeled through difference parameters. The latent variable shared by both outcomes induces the correlation and it is assumed that given the latent variable, the two outcomes are independent. However, λ_{b}, λ_{c}, *σ _{u}* and

$$\begin{array}{cc}\hfill {y}_{1i}\mid {\mathbf{x}}_{1i},{u}_{i}={\mathbf{x}}_{1i}^{T}{\beta}_{1}+{\lambda}_{1}{u}_{i}+{\u220a}_{1i}& \hfill \\ \hfill {y}_{2i}\mid {\mathbf{x}}_{2i},{u}_{i}={\mathbf{x}}_{2i}^{T}{\beta}_{2}+{\lambda}_{2}{u}_{i}+{\u220a}_{2i}& \hfill \end{array}$$

(8)

where ${\u220a}_{1i}~N(0,{\sigma}_{1}^{2})$, ${\u220a}_{2i}~N(0,{\sigma}_{2}^{2})$ and ${u}_{i}~N(0,{\sigma}_{u}^{2})$. The parameters associated with the variance components of the outcomes (λ_{1}, λ_{2}, *σ _{u}*,

To determine appropriate constraints to the parameters in (7) we use an idea similar to the scaled multivariate mixed model proposed by Lin and Ryan [11]. Let *y*_{1} and *y*_{2} be two continuous normally distributed outcomes associated with covariates x_{1} and x_{2}, respectively. Given the covariates, we assume that the two outcomes are correlated. We define ${y}_{1}^{\ast}=\frac{{y}_{1}}{{\sigma}_{1}}$ and ${y}_{2}^{\ast}=\frac{{y}_{2}}{{\sigma}_{2}}$ where *σ*_{1} and *σ*_{2} are scaling parameters such that

$$\begin{array}{cc}\hfill {y}_{1i}^{\ast}\mid {\mathbf{x}}_{1i},{u}_{i}=\frac{{y}_{1}}{{\sigma}_{1}}\mid {\mathbf{x}}_{1i},{\mathbf{x}}_{2i},{u}_{i}={\mathbf{x}}_{1i}^{T}{\beta}_{1}^{\ast}+{u}_{i}+{\u220a}_{1i}^{\ast}& \hfill \\ \hfill {y}_{2i}^{\ast}\mid {\mathbf{x}}_{2i},{u}_{i}=\frac{{y}_{2}}{{\sigma}_{2}}\mid {\mathbf{x}}_{1i},{\mathbf{x}}_{2i},{u}_{i}={\mathbf{x}}_{2i}^{T}{\beta}_{2}^{\ast}+{u}_{i}+{\u220a}_{2i}^{\ast}& \hfill \end{array}$$

(9)

where ${\u220a}_{1i}^{\ast}~N(0,1)$, ${\u220a}_{2i}^{\ast}~N(0,1)$ and ${u}_{i}~N(0,{\sigma}_{u}^{2})$ is a latent variable that induces the correlation between the two variables ${y}_{1i}^{\ast}$ and ${y}_{2i}^{\ast}$. We can rewrite (9) and obtain the final expression for a latent model for two continuous outcomes:

$$\begin{array}{cc}\hfill {y}_{1i}\mid {\mathbf{x}}_{1i},{u}_{i}={\mathbf{x}}_{1i}^{T}{\beta}_{1}+{\sigma}_{1}{u}_{i}+{\u220a}_{1i}& \hfill \\ \hfill {y}_{2i}\mid {\mathbf{x}}_{2i},{u}_{i}={\mathbf{x}}_{2i}^{T}{\beta}_{2}+{\sigma}_{2}{u}_{i}+{\u220a}_{2i}& \hfill \end{array}$$

(10)

where ${\beta}_{1}={\sigma}_{1}{\beta}_{1}^{\ast}$, ${\beta}_{2}={\sigma}_{2}{\beta}_{2}^{\ast}$, ${\u220a}_{1i}~N(0,{\sigma}_{1}^{2})$, ${\u220a}_{2i}~N(0,{\sigma}_{2}^{2})$ and ${u}_{i}~N(0,{\sigma}_{u}^{2})$. The correlation between the two outcomes induced by the model is $\mathit{Corr}({y}_{1},{y}_{2}\mid {\mathbf{x}}_{1},{\mathbf{x}}_{2})=\frac{{\sigma}_{u}^{2}}{1+{\sigma}_{u}^{2}}$. So, the range of correlations that we can model is [0; 1) which requires that the outcomes are positively correlated. In many practical situations the researcher can anticipate the sign of the correlation. If the outcomes are expected to be negatively correlated a possible solution is to invert the coding of the binary outcome or to multiply the continuous outcomes by −1 and this way reverse the sign of the correlation.

These considerations motivate the constraints for the model (7) as follows. Let *y _{b}* and

$${y}_{bi}=\{\begin{array}{cc}0,\hfill & \text{if}\phantom{\rule{thickmathspace}{0ex}}{y}_{bi}^{\ast}\le 0\hfill \\ 1,\hfill & \text{if}\phantom{\rule{thickmathspace}{0ex}}{y}_{bi}^{\ast}>0\hfill \end{array}\phantom{\}}$$

(11)

Define ${y}_{ci}^{\ast}=\frac{{y}_{ci}}{{\sigma}_{c}}$ where *σ _{c}* is a scale parameter for the continuous outcome. The regression equations for the two variables can be written as:

$$\begin{array}{cc}\hfill {y}_{bi}^{\ast}\mid {\mathbf{x}}_{bi},{u}_{i}={\mathbf{x}}_{bi}^{T}{\beta}_{b}^{\ast}+{u}_{i}+{\u220a}_{bi}^{\ast}& \hfill \\ \hfill {y}_{ci}^{\ast}\mid {\mathbf{x}}_{ci},{u}_{i}={\mathbf{x}}_{ci}^{T}{\beta}_{c}^{\ast}+{u}_{i}+{\u220a}_{ci}^{\ast}& \hfill \end{array}$$

(12)

with ${\u220a}_{bi}^{\ast}~N(0,1)$, ${\u220a}_{ci}^{\ast}~N(0,1)$ and ${u}_{i}~N(0,{\sigma}_{u}^{2})$. The variances of the error terms are fixed at 1 by design. This it is just a convenient standardization to obtain a common variance and does not represent a restriction of the model. Any other standardization would work as well. The latent variable *u _{i}* is introduced in both equations to induce the correlation between the outcomes. It is assumed that given

Because $E\left({y}_{ci}\right)={\sigma}_{c}E\left({y}_{ci}^{\ast}\right)$, we can write the equation for the continuous outcome as ${y}_{ci}\mid {\mathbf{x}}_{ci},{u}_{i}={\mathbf{x}}_{ci}^{T}{\beta}_{c}+{\sigma}_{c}{u}_{i}+{\u220a}_{ci}$, where ${\beta}_{c}={\sigma}_{c}{\beta}_{c}^{\ast}$ and ${\u220a}_{ci}~N(0,{\sigma}_{c}^{2})$. The correlation between ${y}_{bi}^{\ast}$ and *y _{ci}* is a function only of

$$\begin{array}{cc}\hfill \mathit{probit}\left(P\right({y}_{bi}=1\mid {\mathbf{x}}_{bi},{u}_{i}\left)\right)={\mathbf{x}}_{bi}^{T}{\beta}_{b}^{\ast}+{u}_{i}& \hfill \\ \hfill {y}_{ci}\mid {\mathbf{x}}_{ci},{u}_{i}={\mathbf{x}}_{ci}^{T}{\beta}_{c}+{\sigma}_{c}{u}_{i}+{\u220a}_{ci}& \hfill \end{array}$$

(13)

The correlation between *y _{bi}* and

$$\mathit{Corr}({y}_{bi},{y}_{ci}\mid {\mathbf{x}}_{bi},{\mathbf{x}}_{ci})=\frac{{\sigma}_{u}^{2}}{\left(1+{\sigma}_{u}^{2}\right)}\frac{\varphi \left(\frac{{\mathbf{x}}_{bi}^{T}{\beta}_{b}^{\ast}}{\sqrt{{\sigma}_{u}^{2}+1}}\right)}{\sqrt{\Phi \left(\frac{{\mathbf{x}}_{bi}^{T}{\beta}_{b}^{\ast}}{\sqrt{{\sigma}_{u}^{2}+1}}\right)\left(1-\Phi \left(\frac{{\mathbf{x}}_{bi}^{T}{\beta}_{b}^{\ast}}{\sqrt{{\sigma}_{u}^{2}+1}}\right)\right)}}.$$

(14)

where *ϕ*(·) is the standard normal density.

The parameters ${\beta}_{b}^{\ast}$ in (13) are interpreted conditional on *u _{i}*. Given

$$P({y}_{bi}\mid {\mathbf{x}}_{bi})=\int P({y}_{bi}\mid {\mathbf{x}}_{bi},{u}_{i})f\left({u}_{i}\right)d{u}_{i}=\Phi \left(\frac{{\mathbf{x}}_{bi}^{T}{\beta}_{b}^{\ast}}{\sqrt{1+{\sigma}_{u}^{2}}}\right).$$

So, ${\beta}_{b}=\frac{{\beta}_{b}^{\ast}}{\sqrt{1+{\sigma}_{u}^{2}}}$ are the marginal effects associated with the covariates. For the continuous outcome, *β _{c}* is interpreted as conditional or marginal effects of the covariates.

The log likelihood for the model is written as:

$$\begin{array}{c}\hfill l({y}_{b},{y}_{c})=\mathit{log}\prod _{i=1}^{n}f({y}_{bi},{y}_{ci}\mid {\mathbf{x}}_{bi},{\mathbf{x}}_{ci})\hfill \\ \hfill =\mathit{log}\prod _{i=1}^{n}\int f({y}_{bi}\mid {u}_{i},{\mathbf{x}}_{bi})f({y}_{ci}\mid {u}_{i},{\mathbf{x}}_{ci})f\left({u}_{i}\right)d{u}_{i}\hfill \\ \hfill =\mathit{log}\prod _{i=1}^{n}{\int}_{-\infty}^{\infty}{\left[\Phi ({\mu}_{bi}+{u}_{i})\right]}^{{y}_{bi}}{\left[1-\Phi ({\mu}_{bi}+{u}_{i})\right]}^{(1-{y}_{bi})}\frac{\mathit{exp}\left(-\frac{{({y}_{ci}-{\mu}_{ci}-{\sigma}_{c}{u}_{i})}^{2}}{2{\sigma}_{c}^{2}}\right)}{\sqrt{2\pi {\sigma}_{c}^{2}}}\frac{\mathit{exp}(-\frac{{u}_{i}^{2}}{2{\sigma}_{u}^{2}})}{\sqrt{2\pi {\sigma}_{u}^{2}}}d{u}_{i}\hfill \end{array}$$

(15)

where ${\mu}_{bi}={\mathbf{x}}_{bi}^{T}{\beta}_{b}^{\ast}$ and *μ _{ci}* = x

The properties of the *probit* link allow a simplification of the likelihood for the latent variable model. The integral in (15) has a closed-form solution and solving this integral (Appendix 1) we obtain the same model as the reverse factorization (5) but with a different parameterization:

$$\begin{array}{cc}\hfill l({y}_{b},{y}_{c})=\mathit{log}\prod _{i=1}^{n}{\left[\Phi \left({\mathbf{x}}_{bi}^{T}{\beta}_{b}+\tau ({y}_{ci}-{\mathbf{x}}_{ci}^{T}{\beta}_{c})\right)\right]}^{{y}_{bi}}& \hfill \\ \hfill \times {\left[1-\Phi \left({\mathbf{x}}_{bi}^{T}{\beta}_{b}+\tau ({y}_{ci}-{\mathbf{x}}_{ci}^{T}{\beta}_{c})\right)\right]}^{1-{y}_{bi}}\varphi \left(\frac{{y}_{ci}-{\mathbf{x}}_{ci}^{T}{\beta}_{c}}{\sqrt{{\sigma}_{c}^{2}({\sigma}_{u}^{2}+1)}}\right)& \hfill \end{array}$$

(16)

where ${\beta}_{b}={\beta}_{b}^{\ast}\sqrt{\frac{{\sigma}_{u}^{2}+1}{2{\sigma}_{u}^{2}+1}}$ and $\tau =\frac{{\sigma}_{u}^{2}}{\sqrt{2{\sigma}_{u}^{2}+1}}$.

Liang and Zeger [12] introduced the methodology of generalized estimating equations (GEE) in the context of longitudinal data. In this methodology, the correlation among measurements on the same individual (or in the same cluster) is treated as a nuisance parameter. A ’working’ correlation matrix is plugged in the equations to obtain estimates for the regression parameters. These estimators are consistent even if the ’working’ correlation matrix is misspecified. The variances of the parameters estimators are obtained by correcting the ’working’ correlation matrix resulting in what became known as the sandwich estimator. The main advantage of the GEE method is this robustness to misspecification of the covariance. Prentice and Zhao[13], and Zhao et al. [14] also proposed an estimation approach for mixed continuous and discrete using the quadratic exponential and the partly exponential families, respectively. For the case of a binary and a continuous outcomes if we assume the following model for the means of the two outcomes,

$$\begin{array}{cc}\hfill \mathit{probit}\left(E({y}_{bi}\mid {\mathbf{x}}_{bi})\right)=\mathit{probit}\left({\mu}_{bi}\right)={\mathbf{x}}_{bi}^{T}{\beta}_{b}& \hfill \\ \hfill E({y}_{ci}\mid {\mathbf{x}}_{ci})={\mu}_{ci}={\mathbf{x}}_{ci}^{T}{\beta}_{c}& \hfill \end{array}$$

(17)

then the estimating equation

$$\sum _{i=1}^{n}{D}_{i}^{T}{V}_{i}^{-1}\left(\begin{array}{c}\hfill {y}_{bi}-{\mu}_{bi}\hfill \\ \hfill {y}_{ci}-{\mu}_{ci}\hfill \end{array}\right)=0$$

(18)

has a solution that is a consistent and asymptotically normal estimator for *β _{b}* and

$$\begin{array}{cc}\hfill {D}_{i}=\left(\begin{array}{cc}\hfill \frac{\partial {\mu}_{bi}}{\partial {\beta}_{b}}\hfill & \hfill \frac{\partial {\mu}_{bi}}{\partial {\beta}_{c}}\hfill \\ \hfill \frac{\partial {\mu}_{ci}}{\partial {\beta}_{b}}\hfill & \hfill \frac{\partial {\mu}_{ci}}{\partial {\beta}_{c}}\hfill \end{array}\right)\hfill & \hfill {V}_{i}=\left(\begin{array}{cc}\hfill {\sigma}_{b}^{2}\hfill & \hfill {\sigma}_{b}{\sigma}_{c}\rho \hfill \\ \hfill {\sigma}_{b}{\sigma}_{c}\rho \hfill & \hfill {\sigma}_{c}^{2}\hfill \end{array}\right)\hfill \\ \multicolumn{2}{c}{\Omega =E\left({D}_{i}^{T}{V}^{-1}\left(\begin{array}{c}\hfill {y}_{bi}-{\mu}_{bi}\hfill \\ \hfill {y}_{ci}-{\mu}_{ci}\hfill \end{array}\right){\left(\begin{array}{c}\hfill {y}_{bi}-{\mu}_{bi}\hfill \\ \hfill {y}_{ci}-{\mu}_{ci}\hfill \end{array}\right)}^{T}{V}^{-1}{D}_{i}\right)}\end{array}$$

(19)

Typically, *D _{i}* is a block-diagonal matrix because the equations for each outcome do not share the regression parameters. The solution for the estimating equation is a consistent estimator of

$${\widehat{\sigma}}_{b}^{2}=\frac{\sum _{i=1}^{n}{({y}_{bi}-{\widehat{\mu}}_{bi})}^{2}}{n}\phantom{\rule{1em}{0ex}}{\sigma}_{c}^{2}=\frac{\sum _{i=1}^{n}{({y}_{ci}-{\widehat{\mu}}_{ci})}^{2}}{n}\phantom{\rule{1em}{0ex}}\widehat{\rho}=\frac{\sum _{i=1}^{n}({y}_{ci}-{\widehat{\mu}}_{ci})({y}_{bi}-{\widehat{\mu}}_{bi})}{\sqrt{\sum _{i=1}^{n}{({y}_{bi}-{\widehat{\mu}}_{bi})}^{2}\sum _{i=1}^{n}{({y}_{ci}-{\widehat{\mu}}_{ci})}^{2}}}$$

(20)

where ${\widehat{\mu}}_{bi}$ can be obtained by, for example, running a *probit* and a linear regression as in (1). Any other consistent estimates could be used instead of (20), for example ${\widehat{\sigma}}_{b}^{2}={\widehat{\mu}}_{bi}(1-{\widehat{\mu}}_{bi})$.

We performed a Monte Carlo simulation study to investigate consistency, efficiency and coverage of 95% confidence intervals for estimates obtained by the univariate model, factorization model, latent variable model and GEE. Two different sets of simulations were considered. In the first set, two outcomes associated with a common covariate (exposure) were simulated. Different effect sizes of the covariate on the outcomes were used to simulate no effect, small effect and large effect. Data were generated from a bivariate normal distribution,

$$\left(\begin{array}{c}\hfill {y}_{bi}^{\ast}\hfill \\ \hfill {y}_{ci}\hfill \end{array}\right)~\mathit{MVN}\left(\left(\begin{array}{c}\hfill -0.5+{\beta}_{b1}{x}_{i}\hfill \\ \hfill 5+{\beta}_{c1}{x}_{i}\hfill \end{array}\right),\left(\begin{array}{cc}\hfill 1\hfill & \hfill 6\rho \hfill \\ \hfill \hfill & \hfill 36\hfill \end{array}\right)\right)$$

(21)

with *x _{i}* generated from a Bernoulli(.5). In the first simulation, the vector of coefficients associated with the covariate was chosen as (

In the second set of simulations, a different covariate was added to each outcome and data were generated from

$$\left(\begin{array}{c}\hfill {y}_{bi}^{\ast}\hfill \\ \hfill {y}_{ci}\hfill \end{array}\right)~\mathit{MVN}\left(\left(\begin{array}{c}\hfill -1+{\beta}_{b1}{x}_{i}+{\beta}_{b2}{x}_{bi}\hfill \\ \hfill 5+{\beta}_{c1}{x}_{i}+{\beta}_{c2}{x}_{ci}\hfill \end{array}\right),\left(\begin{array}{cc}\hfill 1\hfill & \hfill 6\rho \hfill \\ \hfill \hfill & \hfill 36\hfill \end{array}\right)\right)$$

(22)

with *x _{i}* generated from a Bernoulli(.5),

For each simulation, we used (11) to create the binary variable *y _{bi}* from ${y}_{bi}^{\ast}$. The covariates

The data generated from (22) were modeled using the following:

- Univariate approach (ignoring the correlation between the outcomes)$$\begin{array}{cc}\hfill \mathit{probit}\left(P\right({y}_{bi}=1\mid {x}_{i},{x}_{bi}\left)\right)={\alpha}_{b}+{\beta}_{b1}{x}_{i}+{\beta}_{b2}{x}_{bi}& \hfill \\ \hfill {y}_{ci}\mid {x}_{i},{x}_{ci}={\alpha}_{c}+{\beta}_{c1}{x}_{i}+{\beta}_{c2}{x}_{ci}+{\u220a}_{ci},& \hfill \\ \hfill {\u220a}_{ci}~N(0,{\sigma}_{c}^{2})& \hfill \end{array}$$(23)
- Factorization approach$$\begin{array}{cc}\hfill \mathit{probit}\left(P\right({y}_{bi}=1\mid {x}_{i},{x}_{bi}\left)\right)={\alpha}_{b}+{\beta}_{b1}{x}_{i}+{\beta}_{b2}{x}_{bi}& \hfill \\ \hfill {y}_{ci}\mid {x}_{i},{x}_{ci},{x}_{bi}={\alpha}_{c}+{\beta}_{c1}{x}_{i}+{\beta}_{c2}{x}_{ci}+\tau ({y}_{bi}-E({y}_{bi}\mid {x}_{i},{x}_{bi}\left)\right)+{\u220a}_{ci},& \hfill \\ \hfill {\u220a}_{ci}~N(0,{\sigma}_{c}^{2})& \hfill \end{array}$$(24)
- Latent variable approach$$\begin{array}{cc}\hfill \mathit{probit}\left(P\right({y}_{bi}=1\mid {x}_{i},{x}_{bi},{u}_{i}\left)\right)={\alpha}_{b}^{\ast}+{\beta}_{b1}^{\ast}{x}_{i}+{\beta}_{b2}^{\ast}{x}_{bi}+{u}_{i}& \hfill \\ \hfill {y}_{ci}\mid {x}_{i},{x}_{ci},{u}_{i}={\alpha}_{c}+{\beta}_{c1}{x}_{i}+{\beta}_{c2}{x}_{ci}+{\sigma}_{c}{u}_{i}+{\u220a}_{ci},& \hfill \\ \hfill {u}_{i}~N(0,{\sigma}_{u}^{2})\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}{\u220a}_{ci}~N(0,{\sigma}_{c}^{2})& \hfill \end{array}$$(25)
- Generalized estimating equationsand the estimating equation as described in section 2.4.$$\begin{array}{cc}\hfill \mathit{probit}\left({\mu}_{bi}\right)={\alpha}_{b}+{\beta}_{b1}{x}_{i}+{\beta}_{b2}{x}_{bi}& \hfill \\ \hfill {\mu}_{ci}={\alpha}_{c}+{\beta}_{c1}{x}_{i}+{\beta}_{c2}{x}_{ci}& \hfill \end{array}$$(26)

For data generated from (21) similar models were used but without the terms associated with *x _{bi}* and

The models (23), (24), and (26) were fitted using PROC NLMIXED from SAS to assure the same numerical algorithms were used to maximize the likelihoods. An example of the SAS code to fit the latent variable model is presented in the Appendix. The 95% confidence intervals for the parameter estimates were computed as $\widehat{\nu}+{t}_{.975}\widehat{\mathit{SE}}\left(\widehat{\nu}\right)$, where $\widehat{\nu}$ represents the maximum likelihood estimate for parameter of interest. The GEE were solved using a program written in PROC IML from SAS. The nonlinear optimization algorithm by Nelder-Mead simplex method implemented in PROC IML was used because it was the most successful in converging to the solutions in the simulated datasets. Estimates of the parameters of the covariance matrix were obtain using the *probit* and linear regression from (23). The same estimates were used as initial values for the optimization algorithm.

All the settings produced identical point estimates (MLEs) of the parameters, despite the model used to fit the data, the effect size of the covariate or the correlation level between the two outcomes. This indicates that all the models produce consistent estimates of the regression parameters. Coverage of the confidence intervals were also close to the nominal value (95%) in all simulations. The only difference observed between the models was found on the standard errors of the estimates for some settings.

Because the MLEs were identical across all models, the differences in the mean square errors (MSE) observed in some settings are mostly due to the differences on the standard error of the estimates. Tables TablesI,I, ,IIII and III present the ratio of the MSE of the multivariate models (factorization model, latent variable model and GEE) to the univariate model in different settings depending on the effect size of the shared covariate and correlation level between the outcomes.

Mean square errors (MSE) from the simulation study with no effect of the shared covariate (*β*_{b1} = 0) on the binary outcome and a small effect on the continuous outcome (*β*_{c1} = 2; 1/3 of a SD): ratio of the MSE of the multivariate models **...**

Mean square errors (MSE) from the simulation study with small effect of the shared covariate (*β*_{b1} = 0:2; 1/5 of a SD) on the binary outcome and a small effect on the continuous outcome (*β*_{c1} = 2; 1/3 of a SD): ratio of the MSE of the multivariate **...**

Mean square errors (MSE) from the simulation study with large effect of the shared covariate (*β*_{b1} = 1; 1 SD) on the binary outcome and a large effect on the continuous outcome (*β*_{c1} = 6; 1 SD): ratio of the MSE of the multivariate models **...**

The results are summarized as follows. For the estimates of the parameters associated with the covariate shared by the two outcomes, the multivariate models produced estimates with MSE identical to the univariate model. The only exceptions were observed for the *β*_{b1} estimates when the correlation between the outcomes was large. In this case the multivariate models had lower MSE than the univariate model. The latent variable model, in this situation, produced the estimates with lowest MSE. When the true model involved different covariates associated with each outcome, the estimates of the parameters associated with the unshared covariates had a lower MSE for the multivariate models if the outcomes were correlated. For example, the latent variable model produced some estimates with approximately half the MSE than the univariate model, for a high correlation between the outcomes.

The first Example 4.1 illustrates the similar performances of the approaches when the outcomes share the same covariates and the correlation between the outcomes is low. Example 4.2 illustrates a similar situation to Example 4.1 but with strong correlation between the outcomes. Example 4.3 illustrates how inferences can change with a multivariate approach if the outcomes are associated with different covariates.

Dickey et al. [15] conducted a prospective observational study of 420 adults with schizophrenia who sought care for a psychiatric crisis. The main study objective was to compare care for patients who were and were not enrolled in managed care. Advocates for those with mental illness worried that patients who had their care managed may have worse care than those who did not. Two outcomes, one binary (whether the patient was prescribed an atypical anti-psychotic medication) and one continuous (self-reported quality of interpersonal interactions between patient and clinician) were measured for the 197 patients who had their care managed and the 223 patients whose care was not managed. Higher values for the self-reported quality represent higher quality. The means (SD) age of patients were 40 (8.5) and 41 (7.9) in the managed care and not managed care groups respectively. Seventy one percent of the patients in the managed care group received atypical anti-psychotic medication versus 68% in the not managed care group. The means (SD) self-reported quality of interpersonal interactions between patient and clinician appeared similar, 3.20 (0.67) for the managed care group and 3.21 (0.65) for the not managed group. We used the univariate (1), the factorization model(2), the latent variable model (13) and the GEE (as described in section 2.4) to estimate the marginal association of manged care and outcomes. No other covariates were used in the models. Only patients with complete data were included (n=394).

The managed care estimates and the corresponding standard errors on patient/clinician relationship and anti-psychotic prescription were identical for all the models considered (Table IV). In this example, the marginal correlation between the two outcomes was low, 0.06. For the multivariate models, it is easy to test simultaneously for an overall effect of managed care exposure on the outcomes, i.e, *H*_{0} : *β _{b}* =

The data arise from a randomized multi-center clinical trial comparing an experimental treatment (interferon-*α*) to a corresponding placebo in the treatment of patients with age-related macular degeneration. We focus on the comparison between placebo and the highest dose (6 million units daily) of interferon-*α*(Z). The full results of this trial have been reported elsewhere [16]. Patients with macular degeneration progressively loose vision. In the trial, a patients visual acuity was assessed at different time points through their ability to read lines of letters on standardized vision charts. These charts display lines letters of decreasing size which the patient must read from top (largest letters) to bottom (smallest letters). Each line with at least four letters correctly read is called one line of vision. The patients visual acuity is the total number of letters correctly read. The primary endpoint of the trial was a binary outcome defined as the loss of at least three lines of vision at 1 year compared to their baseline performance. We also consider the difference between visual acuity at 6 months and baseline as a secondary endpoint (continuous outcome). We used the univariate, the factorization model, the latent variable model and the GEE to estimate the marginal effect of interferon-*α* treatment on visual performance. Treatment was the only covariate included in the models.

A total of 190 patients (87 in the treatment arm and 103 in the placebo arm) completed the study. The correlation between the two outcomes was 0.63. For patients who received the treatment, 54% lost at least three lines of vision at 1 year versus 38% in the placebo group. The mean (SD) loss of visual acuity at 6 months were 8.4 (11.9) letters for the treatment arm and 5.5 (13.7) for the placebo arm. The results of all approaches were identical despite the high correlation between the outcomes. However, the estimate of treatment effect for the binary outcome, loss of at least three lines of vision at 1 year, was smaller in the latent variable model (Table V). All models lead to the same conclusion regarding the poor performance of the interferon-*α*. The overall effect of treatment (*H*_{0} : *β _{b}* =

Coronary disease results from lesions of fatty plaque that build up within the arterial wall. These plaque lesions may either rupture, causing a heart attack, or gradually obstruct blood flow, causing angina. Coronary stents are thin expandable metallic tubes that are delivered within the coronary artery by a catheter and are then expanded precisely at the site of an obstructive lesion. Typically up to two primary endpoints (measures of restenosis) are measured after coronary stenting. We use data from one arm of a non-inferiority randomized trial of bare-metal coronary stents. The first endpoint obtained from all patients is the incidence of clinically-driven repeat revascularization, denoted the target lesion revascularization (TLR) rate (binary outcome). TLR is designated by a clinical events committee that have access to clinical and angiographic laboratory data. The second endpoint, proportion diameter stenosis (PDS), is the degree of vessel re-narrowing and is quantified by a computer-based system (continuous outcome). The PDS is obtained on a small randomly selected subset of patients. Both TLR and PDS are measured 9 months after coronary stenting. The goal is to estimate restenosis for diabetic patients taking into account potential confounders.

From the 313 patients, 105 had both PSD and TLR measured and included in the analysis. The overall rate of TLR was 14% and the mean (SD) of PDS was 0.43 (0.17). The correlation between the two outcomes was 0.58. Fourteen patients were diabetic. The overall mean (SD) for length of lesion was 12.8 (5.2). Using a univariate approach only history of diabetes mellitus (diabetes) was significantly associated with the outcomes. For the latent model, the lesion length was also associated with TLR but not with PDS (Table VI). Note that inference for diabetic patients is the same as in the univariate approach. If lesion length is included in the equation for the outcome TLR then both outcomes would share the same covariates and the results from the latent model become identical to the univariate model (the association of lesion length would not be significant in the latent model). This is in agreement with the simulations where efficiency gains were realized for estimates of the parameters associated with the ’non-shared’ covariates.

We presented different approaches to model correlated binary and continuous outcomes. We proposed a new multivariate latent variable model that overcomes the identifiability problems of Dunson’s model and the sensitivity to misspecification of the covariance matrix of Sammel’s model. We also implemented a quasi-likelihood approach based on a GEE. Simulation results suggest that the four approaches lead to consistent estimates of the regression parameters. Two findings are noteworthy. First, we demonstrated that if the two outcomes share the same covariates, the results of a multivariate approach are identical to that of a univariate approach that ignores the correlation between the outcomes. Although counterintuitive, this result is consistent with other situations of multivariate data. In the setting of seemingly unrelated regressions with normally distributed outcomes and for the particular case of common set of covariates associated with the outcomes, the ordinary least squares estimate is still the best linear unbiased estimator (see for example Zellner [17] and Rotnitzky [18]), despite the correlation between the outcomes.

Second, we know that for binary outcomes jointly modeled with the same covariates, there is a small gain in efficiency by taking into account the correlation. This only occurs if the outcomes strongly associated. Our result for non-commensurate outcomes is a combination of these two properties. The estimates of the parameters associated with the continuous outcome have the same standard errors as the univariate approach. The estimates of the parameters associated with the binary outcome show a small gain in efficiency when compared with the univariate approach but only for high correlation between the outcomes.

Third, the efficiency gain is higher when the outcomes share a different set of covariates and with higher levels of correlation between the outcomes. This suggests that if one anticipates that different covariates maybe associated with the outcomes, the multivariate approach offers some advantages. Fitzmaurice and Laird [19] have previously shown higher gains in efficiency when compared with the univariate approach than those shown here. However, the efficiency gains observed by the authors were inflated as consequence of heteroscedasticity in the data. If data are generated under the factorization model, the variance depends on the covariate. In this case the univariate approach, assuming homoscedasticity, will lead to less efficient estimates due to misspecification of the variance.

The better performance of the latent variable model over the factorization model in our simulation study was expected because the data was generated from the latent model. Nonetheless, the factorization model was sometimes superior but never inferior to the univariate approach. This suggests that the misspecification of correlation between the outcomes will not be worse than the assumption of independence. In contrast to the factorization approach, the latent variable model presented is easily extended to several continuous and/or several binary outcomes by including additional latent variables as long as the outocomes are positively correlated. However, some of the assumptions of the model, such as the distribution of the latent variables, are not easily assessed. In the presence of missing observations in one of the outcomes, the factorization approach only uses the complete cases or it requires the EM-algorithm to include all the cases in the analysis [19]. This is not the case with the latent model. If the missing data is missing at random or missing completely at random [20] this situation can be easily accommodated due to the conditional independence of the outcomes given the latent variable. Furthermore, the latent variable model is easily fitted using standard software.

We focused on comparing the univariate and multivariate approaches using common operational characteristics such as MSE and coverage of the confidence intervals. We note that these characteristics may not fully capture the benefits of the multivariate models. Research to understand the advantage of adopting a multivariate model for joint inference of the parameters is an important next step. For example, when the outcomes represent an underlying construct and the primary research question relates to an exposure effect, joint inference may be a key task. Such situations occur in clinical trials with more than one primary endpoint or when there is simultaneous concern with safety and efficacy.

This work was supported by Grant R01-MH54693 (Teixeira-Pinto and Normand) and R01-MH61434 (Normand), both from the National Institute of Mental Health. The schizophrenia managed care data were generously provided through the efforts of Barbara Dickey, Ph.D., Harvard Medical School, Boston, MA; the bare-metal stent data by Laura Mauri, M.D., M.Sc., Harvard Clinical Research Institute, Boston, MA; and the macular degeneration data by Geert Molenberghs, Ph.D., Hasselt University, Belgium.

We show that by solving the integral in the likelihood for the latent variable model (15) we get the likelihood of the reverse factorization model (5) but with a different parameterization:

$$\begin{array}{cc}\hfill l({y}_{b},{y}_{c})=\mathit{log}\prod _{i=1}^{n}\int f({y}_{bi}\mid {u}_{i},{\mathbf{x}}_{bi})f({y}_{ci}\mid {u}_{i},{\mathbf{x}}_{ci})f\left({u}_{i}\right)d{u}_{i}& \hfill \\ \hfill \mathit{log}\prod _{i=1}^{n}{\left[\Phi \left({p}_{i}\right)\right]}^{{y}_{bi}}{\left[1-\Phi \left({p}_{i}\right)\right]}^{1-{y}_{bi}}\varphi \left(\frac{{y}_{ci}-{\mu}_{ci}}{\sqrt{{\sigma}_{c}^{2}({\sigma}_{u}^{2}+1)}}\right)& \hfill \end{array}$$

(27)

where,

$${p}_{i}=\frac{{\mu}_{bi}+\frac{{\sigma}_{u}^{2}}{({\sigma}_{u}^{2}+1)}\left(\frac{{y}_{ci}-{\mu}_{ci}}{{\sigma}_{c}}\right)}{\sqrt{\frac{2{\sigma}_{u}^{2}+1}{{\sigma}_{u}^{2}+1}}}$$

(28)

Letting ${\beta}_{b}={\beta}_{b}^{\ast}\sqrt{\frac{{\sigma}_{u}^{2}+1}{2{\sigma}_{u}^{2}+1}}$ and $\tau =\frac{{\sigma}_{u}^{2}}{\sqrt{2{\sigma}_{u}^{2}+1}}$ we get

$$l({y}_{b},{y}_{c})=\mathit{log}\prod _{i=1}^{n}{\left[\Phi \left({\mathbf{x}}_{bi}^{T}{\beta}_{b}+\tau ({y}_{ci}-{\mathbf{x}}_{ci}^{T}{\beta}_{c})\right)\right]}^{{y}_{bi}}$$

(29)

$$\times {\left[1-\Phi \left({\mathbf{x}}_{bi}^{T}{\beta}_{b}+\tau ({y}_{ci}-{\mathbf{x}}_{ci}^{T}{\beta}_{c})\right)\right]}^{1-{y}_{bi}}\varphi \left(\frac{{y}_{ci}-{\mathbf{x}}_{ci}^{T}{\beta}_{c}}{\sqrt{{\sigma}_{c}^{2}({\sigma}_{u}^{2}+1)}}\right)$$

(30)

$$=\mathit{log}\prod _{i=1}^{n}f({y}_{bi}\mid {y}_{ci},{\mathbf{x}}_{bi})f({y}_{ci}\mid {\mathbf{x}}_{ci})$$

(31)

This likelihood is the same likelihood for the reverse factorization model (5), i.e., both approaches are a different parameterization of the same model.

The SAS code below illustrates how to use the procedure PROC NLMIXED in SAS to fit the latent variable model (13) for a binary (*y*_{1}) and a continuous (*y*_{2}) outcomes associates with a common covariate (*x*_{1}).

proc nlmixed data=datasetname; parms a1=1 b1=1 a2=1 b2=1 sigmab=1 sigma2=1; bounds sigma2>0, sigmab>0; ll=y1*log(PROBNORM (a1+b1*x1+u)) +(1−y1)* log(PROBNORM(−a1−b1*x1−u))−log(sigma2)− .5*1/(sigma2**2)*(y2−a2−b2*x1−u*sigma2)**2; model y1 ~ general(ll); random u ~ normal(0,sigmab) subject=id; estimate ’marginal effect of x1’ b1/sqrt(1+sigmab); run;

[1] Cox DR, Wermuth N. Response models for binary and quantitative variables. Biometrika. 1992;79(3):441–461.

[2] Fitzmaurice GM, Laird NM. Regression models for a bivariate discrete and continuous outcome with clustering. Journal of the American Statistical Association. 1995;90:845–852.

[3] Catalano PJ, Ryan LM. Bivariate latent variable models for clustered discrete and continuous outcomes. Journal of the American Statistical Association. 1992;87:651–658.

[4] Sammel MD, Ryan LM, Legler JM. Latent variable models for mixed discrete and continuous outcomes. Journal of the Royal Statistical Society, Series B: Methodological. 1997;59:667–678.

[5] Sammel M, Lin X, Ryan L. Multivariate linear mixed models for multiple outcomes. Statistics in Medicine. 1999;18:2479–2492. [PubMed]

[6] Arminger G, Küsters U. Latent trait and latent class models. Plenum Press; New York, U.S.A.: 1988. chap. Latent trait models with indicators of mixed measurement level; pp. 51–73.

[7] Dunson DB. Bayesian latent variable models for clustered mixed outcomes. Journal of the Royal Statistical Society, Series B: Statistical Methodology. 2000;62(2):355–366.

[8] Dunson DB, Chen Z, Harry J. A Bayesian approach for joint modeling of cluster size and subunit-specific outcomes. Biometrics. 2003;59(3):521–530. [PubMed]

[9] Gueorguieva RV, Agresti A. A correlated probit model for joint modeling of clustered binary and continuous responses. Journal of the American Statistical Association. 2001;96(455):1102–1112.

[10] Reilly T. A necessary and sufficient condition for identification of confirmatory factor analysis models of factor complexity one. Sociological Methods and Research. 1995;23(4):421–441.

[11] Lin X, Ryan L, Sammel M, Zhang D, Padungtod C, Xu X. A scaled linear mixed model for multiple outcomes. Biometrics. 2000;56(2):593–601. [PubMed]

[12] Liang K, Zeger S. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73(1):13–22.

[13] Prentice RL, Zhao LP. Estimating equations for parameters in means and covariances of multivariate discrete and continuous responses. Biometrics. 1991;47(3):825–839. [PubMed]

[14] Zhao LP, Prentice RL, Self SG. Multivariate mean parameter estimation by using a partly exponential model. Journal of the Royal Statistical Society. Series B. 1992;54(3):805–811.

[15] Dickey B, Normand SLT, Hermann RC, Eisen SV, Cortes DE, Cleary PD, Ware N. Guideline recommendations for treatment of schizophrenia: the impact of managed care. Arch Gen Psychiatry. 2003;60(4):340–8. [PubMed]

[16] Pharmacological Therapy for Macular Degeneration Study Group Interferon *α*-iia is ineffective for patients with choroidal neovascularization secondary to age-related macular degeneration: results of a prospective randomized placebo-controlled clinical trial. Archives of Ophthalmology. 1997;115:865–872. [PubMed]

[17] Zellner A. An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Statistical Association. 1962;57(298):348–368.

[18] Rotnitzky A, Holcroft CA, Robins JM. Efficiency comparisons in multivariate multiple regression with missing outcomes. Journal of Multivariate Analysis. 1997;61:102–128.

[19] Fitzmaurice GM, Laird NM. Regression models for mixed discrete and continuous responses with potentially missing values. Biometrics. 1997;53:110–122. [PubMed]

[20] Little RJ, Rubin D. Statistical Analysis with Missing Data. John Wiley and Sons, Inc; Hoboken, New Jersey, U.S.A.: 2002.

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |