
Stat Med. Author manuscript; available in PMC 2013 December 28.

Published in final edited form as:

Published online 2011 November 14. doi: 10.1002/sim.4405

PMCID: PMC3874234

NIHMSID: NIHMS531686

Paul S. Albert, Biostatistics and Bioinformatics Branch, Division of Epidemiology, Statistics, and Prevention Research, *Eunice Kennedy Shriver* National Institute of Child Health and Human Development, Bethesda, MD 20892, USA. Email: albertp@mail.nih.gov

The use of longitudinal data for predicting a subsequent binary event is often the focus of diagnostic studies. This is particularly important in obstetrics, where ultrasound measurements taken during fetal development may be useful for predicting various poor pregnancy outcomes. We propose a modeling framework for predicting a binary event from longitudinal measurements where a shared random effect links the two processes together. Under a Gaussian random effects assumption, the approach is simple to implement with standard statistical software. Using asymptotic and simulation results, we show that estimates of predictive accuracy under a Gaussian random effects distribution are robust to severe misspecification of this distribution. However, under some circumstances, estimates of individual risk may be sensitive to severe random effects misspecification. We illustrate the methodology with data from a longitudinal fetal growth study.

The use of longitudinal data for predicting a subsequent binary event is often the focus of diagnostic studies. Interest is in developing a dynamic predictor which can be repeatedly applied to an individual’s longitudinal profile to predict a subsequent binary event. The monitoring of fetal development with repeated ultrasound measurements is important for the clinical management of pregnant women, and this will serve as the motivating example for the methodological developments in this paper. Of importance in this application is using longitudinal ultrasound measurements to predict a subsequent endpoint at birth.

A natural way to formulate such a predictor is to jointly model the longitudinal measurements and a subsequent binary event with a shared random parameter model. In these models, the dependence between the two data types is induced through random effects which are shared between the two processes. The framework is introduced assuming that the random effects follow a very flexible random effects distribution. However, it is difficult to fit these models with standard statistical software. We therefore propose an approach which is much simpler to implement for the case where the random effects distribution is assumed to be Gaussian. In using the simpler approach, it is important to understand the robustness of the assumed Gaussian random effects distribution to model misspecification.

Examining the robustness of random effects models for longitudinal data to the Gaussian random effects assumption is an active area of research. Various authors have examined the robustness of fixed effects estimation of linear and generalized linear mixed models to the assumed random effects distribution^{1–4}. Others have examined robustness of random effects estimation to the assumed random effects distribution^{5–6}. The robustness of fixed effect estimation to the assumed random effects distribution has been studied for joint models between longitudinal and survival data^{7–8}. The focus of this article is on the robustness of the shared random parameter model under a Gaussian random effects assumption for estimation of the individual predictive probabilities and overall measures of diagnostic accuracy for predicting a binary outcome.

In Section 2, we introduce the joint model for a general case where the random effects distribution follows a mixture of normal distributions. We show that a much simpler two-stage estimation procedure is possible when it is assumed that the random effects follow a Gaussian distribution. In Section 3, we examine the asymptotic bias when the random effects distribution is misspecified as Gaussian when in fact it is truly a two-group mixture of Gaussian distributions. We examine this bias for estimating individual predictive probabilities and for assessing the overall performance of the predictor. In Section 4, we present simulations to examine the finite sample properties of the prediction approaches under random effects misspecification. We illustrate the proposed methodology using data from a fetal growth study in Section 5. A discussion follows in Section 6.

We assume that the longitudinal measurements follow a linear mixed model,

$$y_{ij} = X_{ij}'\boldsymbol{\beta} + Z_{ij}'b_{i} + \epsilon_{ij}, \qquad (1)$$

where *X*_{ij} denotes a vector of fixed effect covariates, **β** denotes corresponding regression parameters, *Z*_{ij} denotes a vector of random effect covariates, and

(2)

where *t _{ij}* is the time of the

The association between an adverse binary event, denoted by *S _{i}*, and the longitudinal process can be introduced by random effects which are shared between the two processes. We assume that

(3)

where *h*(**b**) is a linear function in the elements of **b**, where

(4)

where *t*_{*} is a time point near the time of birth (e.g. at 39 weeks of gestation). In this case . The joint model (2) and (4) relates the longitudinal fetal growth pattern to the probability of an abnormal birth outcome through an individual’s predicted measurement at time *t*_{*}.
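The shared random effect construction can be illustrated with a small simulation. The sketch below is only one possible reading of the joint model: it assumes a quadratic growth curve with a random intercept, slope, and curvature, independent Gaussian random effects, and a probit link in which *h*(**b**) is the subject-specific deviation at *t*_{*}. The function names, the random effect standard deviations, the error standard deviation, α = 0.5, and *t*_{*} = 4 are hypothetical choices made for illustration; only the fixed effect values (0, 3, −0.4) and the intercept −1 echo values quoted elsewhere in the article.

```python
import math
import random


def std_normal_cdf(x):
    """Standard normal CDF, Phi(x), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate_subject(rng, times, beta, sd_b, sd_eps, eta0, alpha, t_star):
    """Simulate one subject's measurements and binary outcome.

    Assumed forms (illustrative, not taken from the paper): independent
    Gaussian random effects and h(b) = b0 + b1 * t_star + b2 * t_star**2.
    """
    b = [rng.gauss(0.0, sd) for sd in sd_b]  # random intercept, slope, curvature
    y = []
    for t in times:
        mean = (beta[0] + b[0]) + (beta[1] + b[1]) * t + (beta[2] + b[2]) * t ** 2
        y.append(mean + rng.gauss(0.0, sd_eps))  # add measurement error
    h = b[0] + b[1] * t_star + b[2] * t_star ** 2  # shared random-effect term
    p_event = std_normal_cdf(eta0 + alpha * h)  # probit link
    s = 1 if rng.random() < p_event else 0
    return y, s


rng = random.Random(1)
subjects = [
    simulate_subject(rng, times=[1, 2, 3], beta=(0.0, 3.0, -0.4),
                     sd_b=(1.0, 0.3, 0.05), sd_eps=0.5,
                     eta0=-1.0, alpha=0.5, t_star=4.0)
    for _ in range(2000)
]
event_rate = sum(s for _, s in subjects) / len(subjects)
```

Because the same draw of the random effects enters both the growth curve and the event probability, subjects whose trajectories sit above the mean curve at *t*_{*} are more likely to have the event whenever α is positive.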

In a general form, the random effects distribution can be assumed to follow a mixture of normal distributions with different means. Slaughter et al.^{10} proposed this random effects structure for fetal growth, in which there are distinct groups of fetuses which grow at different inherent rates with heterogeneity within each group. The random effects distribution can be expressed as , where **μ**_{q} and *p*_{q} are the mean and proportion, respectively, of observations in the

(5)

where *f*(*y*|*b**)* is a univariate normal density and *f _{q}*(

Parameter estimation can be substantially simplified under the assumption that random effects are normally distributed. First note that for any random effects distribution, the likelihood can be written in the form *L* = *L*_{1} × *L*_{2}, where $L_1 = \prod_i f(y_i)$ and $L_2 = \prod_i P(S_i \mid y_i)$. Under a Gaussian random effects assumption, a two-stage pseudo-likelihood approach can be used for parameter estimation. In the first stage, we maximize *L*_{1} with respect to the longitudinal model parameters by fitting the linear mixed model (1) with standard software. In the second stage, we maximize *L*_{2} given the parameter estimates obtained from stage 1. By assuming a probit link function in (3) and by noting that $E[\Phi(a + X)] = \Phi\{(a + \mu)/\sqrt{1 + \sigma^{2}}\}$ when *X* ~ *N*(μ, σ^{2}), we can express *P*(*S*_{i} = 1|

(6)

where $\hat{b}_i$ is the posterior mean of the random effects (empirical Bayes estimator) for the *i*th individual and *var*($\hat{b}_i$ − *b*_{i}) = **Σ**_{b} − *var*($\hat{b}_i$) where , with , and where *I*_{ni} is an identity matrix with dimension

The parameters **η** and α characterizing the probability of the binary outcome can be estimated in the second stage with a regression calibration approach by maximizing *L*_{2}. For the case where the number of follow-up measurements and times are the same across all subjects, maximum-likelihood estimators of **η** and α can be obtained with simple probit regression. Specifically, we fit the probit regression and obtain MLEs $\hat{\boldsymbol{\eta}}^{*}$ and $\hat{\alpha}^{*}$. Maximum-likelihood estimators of **η** and α can then be obtained by noting that and . More generally, *L*_{2} can be maximized with a quasi-Newton-Raphson procedure^{12}. Foulkes et al.^{18} proposed a two-stage model for prediction that does not explicitly account for the calibration error in using $\hat{b}_i$ as a plug-in estimator for *b*_{i}.
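The calibration step rests on the fact that averaging a probit probability over a Gaussian random effect yields another probit probability with a deflated argument. The sketch below checks this numerically for a scalar random effect, which is a simplifying assumption; the function names and parameter values are hypothetical. It also shows, under the same scalar assumption and a common posterior variance `v`, how coefficients from a naive probit regression on the empirical Bayes estimate could be rescaled. This is a derivation for the simplified case and need not match the exact expressions used in the article.

```python
import math
import random


def std_normal_cdf(x):
    """Standard normal CDF, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def calibrated_probability(eta0, alpha, b_hat, v):
    """Closed-form P(S = 1 | y) when b | y ~ N(b_hat, v) and the link is probit."""
    return std_normal_cdf((eta0 + alpha * b_hat) / math.sqrt(1.0 + alpha ** 2 * v))


def monte_carlo_probability(eta0, alpha, b_hat, v, draws, rng):
    """Average Phi(eta0 + alpha * b) over draws of b from N(b_hat, v)."""
    sd = math.sqrt(v)
    total = 0.0
    for _ in range(draws):
        total += std_normal_cdf(eta0 + alpha * rng.gauss(b_hat, sd))
    return total / draws


def rescale_naive_probit(eta_star, alpha_star, v):
    """Recover (eta0, alpha) from attenuated coefficients.

    Uses eta* = eta0 / sqrt(1 + alpha^2 v) and alpha* = alpha / sqrt(1 + alpha^2 v),
    which imply 1 - alpha*^2 v = 1 / (1 + alpha^2 v).
    """
    shrink = math.sqrt(1.0 - alpha_star ** 2 * v)
    return eta_star / shrink, alpha_star / shrink


rng = random.Random(2011)
closed_form = calibrated_probability(-1.0, 0.8, 0.6, 0.5)
simulated = monte_carlo_probability(-1.0, 0.8, 0.6, 0.5, 200_000, rng)
plug_in = std_normal_cdf(-1.0 + 0.8 * 0.6)  # ignores the calibration error

scale = math.sqrt(1.0 + 0.8 ** 2 * 0.5)
recovered = rescale_naive_probit(-1.0 / scale, 0.8 / scale, 0.5)
```

In this example the plug-in probability differs from the calibrated one because it treats the empirical Bayes estimate as if it were the true random effect; the gap grows with α and with the posterior variance.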

The joint model can be used to develop a predictor of the binary outcome from longitudinally collected measurements. For example, in the fetal growth application, we are interested in predicting an adverse pregnancy outcome from a series of ultrasound measurements taken at various gestational ages. Denote *Y*^{P} = (*y*_{s1},*y*_{s2},…,*y _{sL}*)′ as a vector of longitudinal measurements taken at time points

(7)

where *f*(* y*|

(8)

where *b*^{P} = Σ_{b}**Z**^{P}V^{P−1} (*y*^{P} – *X*^{P}**β**), *V ^{P}* is the variance of

We need to assess the predictive ability of the longitudinal classifiers. One such measure is the receiver operating characteristic (ROC) curve or the area under this curve (AUC). Predictors can also be assessed by absolute measures of risk^{19} such as the mean-squared error (MSE) of prediction. For any chosen measure, there are various ways to correctly validate a predictor, which can include cross-validation as well as fitting the model on one set of data (training set) and estimating the predictive ability of the classifier on another set of data (test set). Without loss of generality, we will assume that model parameters are estimated by fitting the joint model to the training set (using data from *I*_{tr} individuals) and that the resulting estimated predictors are validated using a test data set. Denote and as values of the binary outcome and longitudinal measurements for the *i*th subject in the test set, where

The quality of the predictor can be evaluated using the ROC curve. The ROC curve is a plot of 1 − specificity versus sensitivity for multiple cut-off values of the predictor. Specifically, we plot $1 - \widehat{Spec}(c)$ versus $\widehat{Sens}(c)$, where

(9)

(10)

where *I*(*x*) is an indicator function which is equal to 1 if *x* is true and equal to 0 otherwise. Further,

Further, are obtained by fitting the joint model to the training data and by plugging the maximum-likelihood estimators into (9) or (10). Further, we denote , *n _{S}*
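The test-set summaries described above can be sketched with a few lines of code. The example below computes empirical sensitivity and specificity at a cut-off using indicator counts, a Mann-Whitney estimate of the AUC, and the MSE of prediction. The function names and the six predicted probabilities and outcomes are hypothetical and serve only to make the calculations concrete; they are not data from the article.

```python
def sensitivity(probs, outcomes, c):
    """Fraction of subjects with S = 1 whose predicted probability is >= c."""
    events = [p for p, s in zip(probs, outcomes) if s == 1]
    return sum(1 for p in events if p >= c) / len(events)


def specificity(probs, outcomes, c):
    """Fraction of subjects with S = 0 whose predicted probability is < c."""
    non_events = [p for p, s in zip(probs, outcomes) if s == 0]
    return sum(1 for p in non_events if p < c) / len(non_events)


def auc(probs, outcomes):
    """Mann-Whitney estimate: chance an event outranks a non-event (ties count 1/2)."""
    events = [p for p, s in zip(probs, outcomes) if s == 1]
    non_events = [p for p, s in zip(probs, outcomes) if s == 0]
    wins = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                wins += 1.0
            elif pe == pn:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def mse(probs, outcomes):
    """Mean-squared error between predicted probabilities and observed outcomes."""
    return sum((p - s) ** 2 for p, s in zip(probs, outcomes)) / len(probs)


# Hypothetical test-set predictions and outcomes.
probs = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
outcomes = [1, 1, 1, 0, 0, 0]

# Points on the empirical ROC curve: (1 - specificity, sensitivity) per cut-off.
roc_points = [
    (1 - specificity(probs, outcomes, c), sensitivity(probs, outcomes, c))
    for c in (0.0, 0.25, 0.5, 0.75, 1.0)
]
test_auc = auc(probs, outcomes)
test_mse = mse(probs, outcomes)
```

The AUC depends only on the ranking of the predicted probabilities, whereas the MSE depends on their actual values, which is one reason the two summaries can respond differently to miscalibrated individual risks.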

Interest is in examining the asymptotic bias in estimating predictive probabilities and in assessing the overall accuracy of the predictor for the two-stage approach, which assumes Gaussian random effects. Specifically, we examine the bias that arises when we assume that the random effects distribution is normal (model *M*) when the true random effects distribution is a mixture of two normal distributions (model *T*). Asymptotic bias for the estimated predictive probabilities is derived under the assumption that the number of individuals in the training set gets large (*I*_{tr} → ∞). Asymptotic bias for estimating the ROC curve, AUC, and MSE is derived under the additional assumption that the number of individuals in the test set is large (

For the purposes of evaluating asymptotic bias, we will assume a model similar to the one described by (2) and (4). Specifically,

(11)

(12)

where *b*_{i} = (*b _{i}*

Asymptotic bias for model parameter estimators of the second stage model, η_{0} and α, under model misspecification can be obtained by maximizing the expected log-likelihood of model *M* where the expectation is taken with respect to model *T*^{4,20}. Denote *f*_{M}(*S*|*Y*) as the probability of an abnormal outcome

(13)

where , and where . Further, . Finally, under misspecification of the random effects distribution, the estimated predicted value based on a vector of longitudinal measurements *Y*^{P}, $\hat{P}$(*S* = 1|*Y*^{P}), will converge to *P**(*S* = 1|*Y*^{P}), where

(14)

, and ** g**,

Expression (7) is evaluated using multivariate Gaussian quadrature with 50 quadrature points^{21} for each of the three dimensions (corresponding to each of the three random effects). Expression (13) is intractable in closed form because of the difficulty in evaluating the expectation with respect to model *T*. Instead, the expectation is approximated by a Monte Carlo average over *K* = 20,000 realizations simulated under model *T*. The maximization required for evaluating (13) was conducted using a quasi-Newton-Raphson algorithm^{11}.

The asymptotic bias is calculated for different two-group normal mixture random effects distributions with a common variance and with probability of 0.5 of being in each of the two mixture groups. The different two-group mixture models (A) to (D) correspond to a mixture of normals where the separation of the two normals in the mixture is 20%, 40%, 200%, and 400% of the standard deviation, respectively. Figure 1 shows histograms of the random effects distribution under models (A–D). In evaluating asymptotic biases in both individual prediction as well as overall quality of the predictor, we chose values of the fixed effects that result in patterns which are consistent with what would be expected in fetal growth. Specifically, we chose a quadratic mean structure and variance structure parameters which would be consistent with longitudinal biomarkers and imaging data in fetal growth studies. Further, we chose *t*_{j} = *j* and

Figure 1. Histograms for the random effect intercept corresponding to two-group mixture models A, B, C, and D. The differing misspecified two-group mixture models A–D correspond to a separation between the two normals of 20%, 40%, 200%, and 400%, respectively ...

Figure 2. Simulated longitudinal profiles under two-group mixture models (A–D). We assume (11) and (12) with β_{0} = 0, β_{1} = 3, β_{2} = −0.4, η_{0} = −1, and . The true random effects distribution is a two-group mixture ...
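To give a feel for how far mixtures (A) to (D) depart from a single normal distribution, the sketch below draws 20,000 values from each two-group mixture and summarizes them. It assumes that "separation" means the distance between the two component means expressed in units of the common within-component standard deviation, and that the mixture is centred at zero; both are interpretations rather than statements from the article, and the names `SEPARATIONS`, `sample_mixture`, and `sample_moments` are hypothetical.

```python
import random

# Separation between component means, in units of the common SD (assumed meaning).
SEPARATIONS = {"A": 0.2, "B": 0.4, "C": 2.0, "D": 4.0}


def sample_mixture(rng, separation, sd=1.0, p=0.5):
    """Draw one value from a two-group normal mixture (centred at zero when p = 0.5)."""
    delta = separation * sd
    mean = -delta / 2.0 if rng.random() < p else delta / 2.0
    return rng.gauss(mean, sd)


def sample_moments(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var


rng = random.Random(20000)
K = 20000  # number of Monte Carlo draws per mixture
summary = {}
for label, sep in SEPARATIONS.items():
    draws = [sample_mixture(rng, sep) for _ in range(K)]
    summary[label] = sample_moments(draws)
```

With equal mixing proportions and unit within-component variance, the overall variance of the mixture is 1 + (separation/2)^2, so the sample variances should come out close to 1.01, 1.04, 2, and 5 for (A) to (D). Under this reading, (A) and (B) are nearly indistinguishable from a normal distribution, while (C) and (D) are much more dispersed, with (D) clearly bimodal.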

Focus is often on estimating the quality of a predictor based on the receiver operating characteristic (ROC) curve. When the predictor is developed under model (M) while the truth is model (T), the estimated ROC curve as computed by (9) and (10) converges to the curve characterized by plotting 1 – *Spec**(*c*) versus *Sens**(*c*) for all continuous values of *c* between 0 and 1, where *Sens**(*c*) = *E*_{T}[*P**(*S* = 1|*Y*^{P}) ≥ *c*|*S* = 1] and *Spec**(*c*) = *E*_{T}[*P**(*S* = 1|*Y*^{P}) < *c*|*S* = 0], and where *P**(*S* = 1|*Y*^{P}) is computed under the Gaussian random effects assumption as in (14), and the expectation is taken under the true (*T*) joint model (*Y*^{P}, *S*). Under the correct model, the estimators and will converge to

The expected mean-squared error is a measure of the absolute risk of the predictor. In large samples, the MSE converges to *MSE* = *E*_{T}[(*P*(*S* = 1|*Y*^{P}) – *S*)^{2}] under the correct model (T) and *MSE** = *E*_{T}[(*P**(*S* = 1|*Y*^{P}) – *S*)^{2}] under the misspecified Gaussian model (M). Table 2 shows asymptotic bias of the model accuracy for increasing departures from normality (A–D) when *s*_{l} = *l*, *l* = 1, 2, 3 and

We examine the finite sample properties of the predictors under the correctly specified and misspecified models. We simulate under model (11) and (12) with *b*_{i} following a two-group pointwise mixture with mean 0. Specifically, we simulate *b*_{i} to be (*b*_{0}, *b*_{1}, *b*_{2}) with probability *p*_{1} or with probability 1 – *p*_{1} and estimate the joint model under the correctly specified finite mixture model and the misspecified Gaussian random effects model. The two-group pointwise mixture is a special case of the two-group mixture of normals when *V* = 0. We examined a two-group pointwise mixture model rather than the two-group mixture of normal distributions examined in the asymptotic calculations, since estimating the parameters of model (11) and (12) with a two-group mixture model is substantially more computationally intensive than estimating the parameters with a two-group pointwise mixture (i.e., no multivariate numerical integration is required for the pointwise mixture). Further, the pointwise mixture is more of a departure from a Gaussian random effects distribution than the more general two-group mixture of normals, thereby providing a more extreme comparison for demonstrating robustness. The parameters for the two-group pointwise mixture are the same as those specified in (C) and (D) in Figure 2 with no between-subject variation within each of the mixture groups. The simulations were conducted based on a training set sample size of *I*_{tr} = 1000 and a test set sample size of

Table 3 shows the results of simulations which vary the random effects distribution (*b*_{0}, *b*_{1}, *b*_{2}), the mixing proportion *p*_{1}, and the strength of the relationship between the random effects and the probability of the binary endpoint (α). Over a wide range of parameter values, the average estimated AUC under the true mixture model was nearly the same as the average estimated AUC under the misspecified Gaussian random effects model. In addition, the standard errors were very close to each other. The average estimated prediction MSE under the misspecified Gaussian random effects distribution was also very close to the MSE evaluated under the correctly specified two-group pointwise mixture distribution.

An important issue in obstetrics is how to deliver the baby. If the fetus is too large, a cesarean delivery may be necessary. Thus, the development of simple and accurate methods for predicting a large baby using longitudinal ultrasound measurements will be useful. We illustrate the two-stage approach using longitudinal and birth outcome data from the National Institute of Child Health and Human Development (NICHD) study of successive small-for-gestational-age births in Scandinavia^{22}. The study was designed to collect longitudinal ultrasound measurements at 17, 25, 33, and 37 weeks of gestational age.

To illustrate the methodology, we focus on the use of longitudinal ultrasound abdominal diameter in predicting macrosomia (defined as a birthweight greater than 4000 grams). We focus on 1203 subjects who had all four longitudinal ultrasound measurements along with birth outcome data. The median abdominal diameters (1st quartile–3rd quartile) at 17, 25, 33, and 37 weeks of gestational age were 39 mm (36–41), 65 mm (62–67), 92 mm (89–95), and 105 mm (101–108), respectively. The proportion of macrosomia was estimated as 0.175.

We split the data into a training set and a test set of approximately equal size (600 in the training set and 603 in the test set) and (i) fit the two-stage modeling approach to the training set data with log-transformed abdomen diameter, and (ii) validated the resulting predictor using the test set data. All four longitudinal measurements were used in estimating the predictor. We then estimated the ROC curve, AUC, and MSE of prediction with only the test set data. Table 4 shows parameter estimates for the two-stage approach using models (2) and (4) fit to the training set data. Parameters for the slope and quadratic terms characterizing fetal growth are all highly significant, suggesting that there is statistically significant curvature to the mean structure on the log scale. The parameter α, which links the two processes, is positive and highly statistically significant, suggesting that fetal growth is positively related to the probability of macrosomia. Further, the between-subject variation (particularly for the intercept) is relatively high, and there is a sizable negative correlation between the random intercept and slope.

Q-Q plots for the random effects distribution^{23} showed little departure from normality (data not shown). However, caution should be used in interpreting such plots since they may fail to detect non-normality when the error variance is large and can falsely detect normality for the random effects when the error distribution is non-normal^{5}. A scatter plot of predicted individual means versus residuals showed no systematic pattern, suggesting that the quadratic mean structure and random effects provide a reasonable representation of the mean structure (data not shown).

We estimated overall assessments of diagnostic accuracy for prediction using different measurement times from the test set data. Specifically, we estimated AUC (MSE) using all four longitudinal measurements, the last three of these measurements, the last two measurements, and the final measurement as 0.76 (0.12), 0.79 (0.11), 0.79 (0.11), and 0.80 (0.11), respectively. These results suggest that, in terms of overall predictive accuracy, there is little advantage in taking additional observations above the last available measurement. Further, a single measurement at 25 weeks and 33 weeks resulted in an AUC (MSE) of 0.71 (0.12) and 0.64 (0.13), respectively. Thus, taking the single measurement as close to birth as possible is advantageous for overall prediction.

In this article, we have proposed a shared random parameter framework for predicting a binary event from longitudinal measurements. Specifically, we linked the two processes through a set of shared random effects. First, we consider a general model where the random effects distribution is assumed to follow a mixture of normal distributions. Second, a simple two-stage model which can be implemented with standard statistical software was considered. However, this approach can be implemented only under a Gaussian random effects assumption. A probit model is specified for the binary response in order to obtain a closed-form expression to account for calibration error (expression (6)). Alternatively, an approximate version of expression (6) can be developed for the logistic model using a cumulative Gaussian approximation to the logistic function^{24}.
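The cumulative Gaussian approximation to the logistic function mentioned above can be checked numerically. The sketch below compares the logistic function with a rescaled standard normal CDF over a grid of values. The scaling constant 1.702 is a commonly quoted choice and is an assumption here; the article does not state which constant it has in mind.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic(x):
    """Logistic (inverse logit) function."""
    return 1.0 / (1.0 + math.exp(-x))


SCALE = 1.702  # assumed scaling constant: logistic(x) is close to Phi(x / SCALE)

# Evaluate the absolute approximation error on a grid from -8 to 8.
grid = [i / 100.0 for i in range(-800, 801)]
max_abs_error = max(abs(logistic(x) - std_normal_cdf(x / SCALE)) for x in grid)
```

On this grid the two curves differ by less than 0.01 in absolute value, which is why a logistic outcome model can borrow the closed-form calibration available under the probit link at the cost of a small approximation error.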

Through asymptotic bias calculations and simulations, we demonstrate that measures of overall diagnostic accuracy such as the AUC of the ROC curve and MSE are robust to even severe misspecification of the random effects distribution. This result is consistent with McCulloch and Neuhaus^{6} who show that the MSE for random effects estimation is robust to the random effect misspecification. However, individual risk predictions may be sensitive to random effects misspecification. These results have important implications for practitioners. First, even when the random effects distribution is clearly non-normal, the simple two-stage approach will be a useful tool for assessing the overall quality of the predictor based on MSE or summaries of the ROC curve. Second, for predictors that have overall high predictive accuracy, a careful assessment of the random effects distribution is necessary for using the predictor to assess individual risk. This would involve fitting models with richer random effects distributions such as the mixture of normals discussed in Section 3.

The focus of this paper was on examining robustness to the Gaussian random effects assumption when predicting a binary event from longitudinally collected data. Of course, prediction may also be sensitive to departures from the mean structure. An advantage of the two-stage approach is that standard model diagnostic methods for checking the mean and variance structure in linear mixed modeling methodology can be applied. For example, a scatter plot of standardized residuals versus predicted values can easily be constructed to examine mean structure misspecification.

The proposed methodology focuses on predicting a binary outcome from a single longitudinally collected outcome. Although the shared random parameter model with the more general random effects model could be extended to the multivariate case, it is much simpler to extend the two-stage approach to this multivariate setting. This topic is the subject of future research.

I thank the Associate Editor and two reviewers for their constructive comments on this article. I thank Dr. Jun Zhang for helpful comments on this manuscript. I thank the Center for Information Technology, National Institutes of Health, for providing access to the high-performance computational capabilities of the Biowulf cluster computer system. The research was supported by the Intramural Research Program of the National Institutes of Health, *Eunice Kennedy Shriver* National Institute of Child Health and Human Development.

1. Verbeke G, Lesaffre E. The effect of misspecifying the random-effects distribution in linear mixed models for longitudinal data. Computational Statistics & Data Analysis. 1997;23:541–556.

2. Neuhaus JM, Hauck WW, Kalbfleisch JD. The effects of mixture distribution misspecification when fitting mixed-effects logistic models. Biometrika. 1992;79:755–762.

3. Litiere S, Alonso A, Molenberghs G. Type I and Type II error under random-effects misspecification in generalized linear mixed models. Biometrics. 2007;63:1038–1044.

4. Heagerty PJ, Kurland BF. Misspecified maximum-likelihood estimates and generalized linear mixed models. Biometrika. 2001;88:973–985.

5. Verbeke G, Lesaffre E. A linear mixed-effects model with heterogeneity in the random effects population. Journal of the American Statistical Association. 1996;91:217–221.

6. McCulloch CE, Neuhaus JM. Prediction of random effects in linear and generalized linear models under model misspecification. Biometrics. 2010; in press.

7. Hsieh F, Tseng YK, Wang JL. Joint modeling of survival and longitudinal data: likelihood approach revisited. Biometrics. 2006;62:1037–1043.

8. Song XA, Davidian M, Tsiatis AA. A semiparametric likelihood approach to jointly modeling of longitudinal data and time-to-event data. Biometrics. 2002;58:742–753.

9. Deter RL. Individualized growth assessments: evaluation of growth using each fetus as its own control. Seminars in Perinatology. 2004;28:23–32.

10. Slaughter JC, Herring AH, Thorp JM. A Bayesian latent variable mixture model for longitudinal fetal growth. Biometrics. 2009;65:1233–1242.

11. Evans M, Swartz T. Approximating Integrals via Monte Carlo and Deterministic Methods. Oxford: Oxford University Press; 2000.

12. Thisted RA. Elements of Statistical Computing: Numerical Computation. New York: Chapman and Hall; 1988.

13. Muthen B, Brown C, Masyn K, Jo B, Khoo S, Yang C, Wang C, Kellam S, Carlin J, Liao J. General growth mixture modeling for randomized prevention intervention. Biostatistics. 2002;3:459–475.

14. Wang CP, Brown CH, Bandeen-Roche K. Residual diagnostics for growth mixture models: examining the impact of a preventive intervention on multiple trajectories of aggressive behavior. Journal of the American Statistical Association. 2006;100:1054–1076.

15. Laird N. Nonparametric maximum-likelihood estimation of a mixing distribution. Journal of the American Statistical Association. 1978;73:805–811.

16. Carroll RJ, Spiegelman CH, Lan KKG, Abbott RD. On errors-in-variables for binary regression models. Biometrika. 1984;71:19–25.

17. Wang CY, Wang N, Wang S. Regression analysis when covariates are regression parameters of a random effects model for observed longitudinal measurements. Biometrics. 2000;56:487–495.

18. Foulkes AS, Azzoni L, Li X, Johnson M, Smith C, Mounzer K, Montaner LJ. Prediction-based classification for longitudinal biomarkers. Annals of Applied Statistics. 2010;4:1476–1497.

19. Gail M, Pfeiffer RM. On criteria for evaluating models of absolute risk. Biostatistics. 2005;6:227–239.

20. White H. Maximum likelihood estimation of misspecified models. Econometrica. 1982;50:1–26.

21. Abramowitz M, Stegun I. Handbook of Mathematical Functions. New York: Dover; 1974.

22. Bakketeig LS, Jacobsen G, Hoffman HJ, Lindmark G, Bergsjo P, Moline K, Rodsten J. Pre-pregnancy risk factors of small-for-gestational age births among parous women in Scandinavia. Acta Obstetricia et Gynecologica Scandinavica. 1993;72:273–279.

23. Lange N, Ryan L. Assessing normality in random effects models. The Annals of Statistics. 1989;17:624–642.

24. Johnson NL, Kotz S. Distributions in Statistics, Continuous Univariate Distributions. Vol 2. Boston: Houghton-Mifflin; 1970. p. 6.
