Published in final edited form as: Stat Probab Lett. 2010 August; 80(15–16): 1242–1252. doi: 10.1016/j.spl.2010.04.002

Marginal longitudinal semiparametric regression via penalized splines

Abstract

We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.

Keywords: Additive models, Best prediction, Maximum likelihood, Gibbs sampling, Nonparametric regression, Restricted maximum likelihood, Varying coefficient models

1. Introduction

The past decade has seen a great deal of interest and activity in nonparametric regression for longitudinal data. A prominent component of this research is the marginal longitudinal nonparametric regression problem in which the covariance matrix of the responses for each subject is not modelled conditionally, and instead is an unspecified parameter to be estimated.

Ruppert et al. (2009, Section 3.9) provide a summary of research on this problem up until about 2008. Whilst Zeger and Diggle (1994) is an early reference for marginal longitudinal nonparametric regression, the area started to heat up in response to Lin and Carroll (2001), where it was shown that ordinary kernel smoothers are more efficient if so-called working independence is assumed. This spawned a flurry of activity on the problem. Relevant references include: Lin and Carroll (2000, 2006), Welsh et al. (2002), Wang (2003), Linton et al. (2003), Lin et al. (2004), Carroll et al. (2004), Hu et al. (2004), Chen and Jin (2005), Wang et al. (2005) and Fan et al. (2007), Sun et al. (2007), Fan and Wu (2008) and Carroll et al. (2009a,b).

In this article, we describe a relatively simple approach to the marginal longitudinal regression problem and its semiparametric extensions. Our approach is the natural one arising from the mixed model representation of penalized splines (e.g. Brumback et al., 1999; Ruppert et al., 2003) with estimation and inference done using maximum likelihood and best prediction. There is also the option of adopting a Bayesian standpoint and calling upon Markov chain Monte Carlo to achieve approximate inference. An interesting aspect of our marginal longitudinal semiparametric regression models is that Gibbs sampling applies with draws from standard distributions. For the Bayesian version of our model, the BUGS inference engine (Lunn et al., 2000) can be used for fitting, and we provide some illustrative code.

The penalized spline/mixed model approach means that semiparametric extensions of the marginal longitudinal regression problem can be handled straightforwardly. We describe extensions to additive and varying coefficient models, although other extensions can be handled similarly.

Section 2 describes the penalized spline approach and identifies the mixed model structures required to handle marginal longitudinal semiparametric regression problems. In Section 3, we discuss fitting via maximum likelihood and best prediction. Section 4 describes Bayesian inference via Gibbs sampling and BUGS. Illustrations are provided in Section 5 and closing discussion is given in Section 6.

2. Marginal longitudinal nonparametric regression and extensions

For $1 \le i \le m$ subjects, we observe $1 \le j \le n$ ($n \ll m$) scalar responses $y_{ij}$ and predictors $x_{ij}$. Let $y_i$ be the vector of responses for the $i$th subject, with $x_i$ defined similarly. The covariance matrix of a random vector $v$ is denoted by $\operatorname{Cov}(v)$. The marginal longitudinal nonparametric regression model is then

$$E(y_{ij}) = f(x_{ij}), \qquad \operatorname{Cov}\{y_i \mid f(x_i)\} = \Sigma, \qquad 1 \le i \le m,\; 1 \le j \le n \tag{1}$$

for some real-valued smooth function $f$ and $n \times n$ covariance matrix $\Sigma$. The notation $f(x_i)$ means that the function $f$ is applied element-wise to each of the entries of $x_i$. We use $\operatorname{Cov}\{y_i \mid f(x_i)\}$ rather than $\operatorname{Cov}(y_i)$ to allow for the possibility that $f(x_i)$ is random according to the model, although this is not a requirement.

Fig. 1 shows a simulated data set for model (1), with $m = 100$, $n = 5$,

$$f(x) = 1 + \tfrac{1}{2}\,\Phi\!\left(\frac{2x - 36}{5}\right) \quad\text{and}\quad \Sigma = \begin{bmatrix} 0.122 & 0.098 & 0.078 & 0.063 & 0.050 \\ 0.098 & 0.122 & 0.098 & 0.078 & 0.063 \\ 0.078 & 0.098 & 0.122 & 0.098 & 0.078 \\ 0.063 & 0.078 & 0.098 & 0.122 & 0.098 \\ 0.050 & 0.063 & 0.078 & 0.098 & 0.122 \end{bmatrix}, \tag{2}$$

where $\Phi$ is the standard normal distribution function. The main problem is efficient estimation of $f$ from data such as that shown in Fig. 1. Estimation of $\Sigma$ may also be of interest.

Fig. 1
A data set simulated from a version of the marginal longitudinal nonparametric regression model (1) with $m = 100$, $n = 5$ and $f$ and $\Sigma$ as described in the text.

Our approach to function estimation involves spline models for f of the form

$$f(x) = \beta_0 + \beta_1 x + \sum_{k=1}^{K} u_k z_k(x) \tag{3}$$

where $z_1, \ldots, z_K$ is a rich set of spline basis functions. A simple basis arises from setting $z_k(x) = (x - \kappa_k)_+$, where $\kappa_1, \ldots, \kappa_K$ is a dense set of knots placed over the range of the $x_{ij}$s and $r_+ \equiv \max(0, r)$ for any real number $r$. However, we recommend a smoother and more numerically stable choice for the $z_k$, such as those described in Welham et al. (2007) and Wand and Ormerod (2008). The number of basis functions $K$ has a minor effect on the efficacy of (3) and, for most signals arising in practice, $K = 25$ is sufficient. Li and Ruppert (2008) give some interesting asymptotics that provide support for this maxim.
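As a concrete illustration, the following R sketch sets up the truncated-line basis just described with $K = 25$ knots. The variable names and the knot-placement rule (equally spaced sample quantiles) are our own conventions, not prescriptions from the text; in practice one of the smoother bases cited above would be substituted.

## Truncated-line spline basis z_k(x) = (x - kappa_k)_+.
## Knots at equally spaced sample quantiles of the unique x values
## (a common convention; illustrative only).
set.seed(1)
x <- runif(500, 8, 22)          # illustrative predictor values
K <- 25
knots <- quantile(unique(x), seq(0, 1, length = K + 2))[-c(1, K + 2)]
Z <- outer(x, knots, function(x, kappa) pmax(0, x - kappa))
dim(Z)                          # 500 x 25 matrix of basis function evaluations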

To avoid over-fitting, the spline coefficients $u_k$, $1 \le k \le K$, need to be penalized in some way. A convenient penalization mechanism is to treat the $u_k$ as a random sample from a distribution with mean zero and variance $\sigma^2$. This permits the following linear mixed model representation of (1) and (3):

$$y = X\beta + Zu + \varepsilon \tag{4}$$

where

$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}, \quad X = \begin{bmatrix} \mathbf{1} & x_1 \\ \vdots & \vdots \\ \mathbf{1} & x_m \end{bmatrix}, \quad Z = \begin{bmatrix} z_1(x_1) & \cdots & z_K(x_1) \\ \vdots & \ddots & \vdots \\ z_1(x_m) & \cdots & z_K(x_m) \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_m \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} \quad\text{and}\quad u = \begin{bmatrix} u_1 \\ \vdots \\ u_K \end{bmatrix}.$$

The random vectors on the right-hand side of (4) have mean zero and covariance matrix:

$$\operatorname{Cov}\begin{bmatrix} u \\ \varepsilon_1 \\ \vdots \\ \varepsilon_m \end{bmatrix} = \begin{bmatrix} \sigma^2 I & 0 & \cdots & 0 \\ 0 & \Sigma & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma \end{bmatrix} = \begin{bmatrix} \sigma^2 I & 0 \\ 0 & I_m \otimes \Sigma \end{bmatrix},$$

where $\otimes$ denotes the Kronecker product. For fixed values of $\sigma^2$ and $\Sigma$, we can call upon best linear unbiased prediction (e.g. Robinson, 1991) to estimate $\beta$, predict $u$ and, hence, estimate the regression function $f$. In practice, though, both $\sigma^2$ and $\Sigma$ need to be estimated and a convenient assumption for achieving this aim is

$$\begin{bmatrix} u \\ \varepsilon \end{bmatrix} \sim N\!\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2 I & 0 \\ 0 & I_m \otimes \Sigma \end{bmatrix} \right). \tag{5}$$

From now on, we assume the Gaussian linear mixed model given by (4) and (5). Sections 3 and 4 describe two approaches to fitting and inference. Before getting to that, we describe some semiparametric extensions of (1).

2.1. Additive models extension

Suppose now that, corresponding to each $y_{ij}$, several predictor variables are available. There are a number of semiparametric regression extensions of (1) that could be considered. In this section, we focus on the additive model extension. To keep the notation simple, we restrict discussion to the situation where there are two continuous predictors, with the $j$th measurement on subject $i$ denoted by $x_{1ij}$ and $x_{2ij}$. The marginal longitudinal additive model for such data is

$$E(y_{ij}) = \beta_0 + f_1(x_{1ij}) + f_2(x_{2ij}), \qquad \operatorname{Cov}\{y_i \mid f_1(x_{1i}), f_2(x_{2i})\} = \Sigma, \qquad 1 \le i \le m,\; 1 \le j \le n \tag{6}$$

where $f_1$ and $f_2$ are smooth functions. If each of these is modelled as a penalized spline:

$$f_1(x_1) = \beta_{11} x_1 + \sum_{k=1}^{K_1} u_{1k} z_{1k}(x_1) \quad\text{and}\quad f_2(x_2) = \beta_{21} x_2 + \sum_{k=1}^{K_2} u_{2k} z_{2k}(x_2) \tag{7}$$

with coefficients independently distributed subject to

$$u_{1k} \stackrel{\text{i.i.d.}}{\sim} N(0, \sigma_1^2) \quad\text{and}\quad u_{2k} \stackrel{\text{i.i.d.}}{\sim} N(0, \sigma_2^2)$$

then a Gaussian linear mixed model

$$y = X\beta + Zu + \varepsilon$$

arises. The differences between this model and that of Section 2 are that the design matrices are now

$$X = \begin{bmatrix} \mathbf{1} & x_{11} & x_{21} \\ \vdots & \vdots & \vdots \\ \mathbf{1} & x_{1m} & x_{2m} \end{bmatrix}, \qquad Z = \begin{bmatrix} z_{11}(x_{11}) & \cdots & z_{1K_1}(x_{11}) & z_{21}(x_{21}) & \cdots & z_{2K_2}(x_{21}) \\ \vdots & & \vdots & \vdots & & \vdots \\ z_{11}(x_{1m}) & \cdots & z_{1K_1}(x_{1m}) & z_{21}(x_{2m}) & \cdots & z_{2K_2}(x_{2m}) \end{bmatrix}$$

where $x_{1i}$ is the $n \times 1$ vector containing the $x_{1ij}$ measurements and $x_{2i}$ is defined similarly. The coefficient vectors are

$$\beta = \begin{bmatrix} \beta_0 \\ \beta_{11} \\ \beta_{21} \end{bmatrix} \quad\text{and}\quad u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$$

where $u_1$ is the $K_1 \times 1$ vector containing the $u_{1k}$ and $u_2$ is defined similarly. The covariance matrix of the spline coefficients and errors is now

$$\begin{bmatrix} u_1 \\ u_2 \\ \varepsilon \end{bmatrix} \sim N\!\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_1^2 I & 0 & 0 \\ 0 & \sigma_2^2 I & 0 \\ 0 & 0 & I_m \otimes \Sigma \end{bmatrix} \right). \tag{8}$$

Fitting via maximum likelihood and best prediction is analogous to that described in Section 3. The main difference is that there are two variance parameters, $\sigma_1^2$ and $\sigma_2^2$ (in extensions to additive models with $d$ smooth functions there will be $d$ such variance components), as well as the error covariance matrix $\Sigma$. Maximum likelihood fitting, described in Section 3, requires an expression for $V \equiv \operatorname{Cov}(y)$. For the current model, this matrix takes the form

$$V = V(\sigma_1^2, \sigma_2^2, \Sigma) = \sigma_1^2 Z_{[1]} Z_{[1]}^T + \sigma_2^2 Z_{[2]} Z_{[2]}^T + I_m \otimes \Sigma$$

where $Z_{[1]}$ and $Z_{[2]}$ correspond to the column-wise partitioning of $Z$ according to the basis functions for $f_1$ and $f_2$ (i.e. $Z = [Z_{[1]}\; Z_{[2]}]$).
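In R, for instance, this matrix can be formed in one line. In the sketch below, Z1 and Z2 (our own names) hold the two column blocks $Z_{[1]}$ and $Z_{[2]}$, and sigsq1, sigsq2 and Sigma hold current values of $\sigma_1^2$, $\sigma_2^2$ and $\Sigma$:

## V = sigsq1 Z[1] Z[1]^T + sigsq2 Z[2] Z[2]^T + I_m (Kronecker) Sigma
V <- sigsq1 * tcrossprod(Z1) + sigsq2 * tcrossprod(Z2) +
     kronecker(diag(m), Sigma)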

Before closing this section, we briefly mention that the model

$$E(y_{ij}) = \beta_0 + \beta_1 x_{1ij} + f_2(x_{2ij}), \qquad \operatorname{Cov}\{y_i \mid x_{1i}, f_2(x_{2i})\} = \Sigma, \qquad 1 \le i \le m,\; 1 \le j \le n \tag{9}$$

is a simpler type of additive model than (6), since it has only one smooth function component. This is a bona fide semiparametric regression model, since the right-hand side has the effect of the $x_{1ij}$s modelled parametrically and the effect of the $x_{2ij}$s modelled nonparametrically. However, the linear mixed model associated with (9) has the same form as that treated in Section 2. In particular, the random component structure (5) applies to (9).

2.2. Incorporation of interactions

An important extension of (6) is

$$E(y_{ij}) = \beta_0 + f_{x_{3ij}}(x_{1ij}) + f_2(x_{2ij}), \qquad \operatorname{Cov}\{y_i \mid f_{x_{3ij}}(x_{1i}), f_2(x_{2i})\} = \Sigma, \qquad 1 \le i \le m,\; 1 \le j \le n \tag{10}$$

where the $x_{3ij}$ correspond to measurements on a categorical variable $x_3$. The cross-sectional data versions of such models are treated, using penalized splines, by Coull et al. (2001) and Ruppert et al. (2003, Chapter 12).

Interestingly, the Gaussian linear mixed model for fitting (10) takes the same form as that for fitting the additive model (6). In particular, the covariance matrix of the random effects and error vectors takes the same form as that given in (8), but with two additional variance parameters. The only other difference is the form of the design matrices $X$ and $Z$. The aforementioned references provide details on their construction.

2.3. Varying coefficient models

Another type of multiple-predictor semiparametric regression model is that involving varying coefficients. The simplest marginal longitudinal varying coefficient model is

$$E(y_{ij}) = f_0(s_{ij}) + f_1(s_{ij})\, x_{ij}, \qquad \operatorname{Cov}\{y_i \mid f_0(s_i), f_1(s_i)\} = \Sigma, \qquad 1 \le i \le m,\; 1 \le j \le n \tag{11}$$

where the $s_{ij}$ are the longitudinal measurements on a continuous predictor variable $s$ and the $x_{ij}$ are the measurements on a second predictor $x$. The modifying effect of $s$ on the linear relationship between $E(y)$ and $x$ is modelled flexibly through the varying coefficients $f_0(s)$ and $f_1(s)$. Sun et al. (2007) paid particular attention to models of this type.

As with the interaction extension, the Gaussian linear mixed model for fitting the varying coefficient model (11) takes the same form as that for fitting the additive model (6). In particular, the covariance matrix of the random effects and error vectors is exactly the same as that given in (8). Only the design matrices require modification. Section 12.4 of Ruppert et al. (2003) provides the relevant details.

3. Maximum likelihood estimation and best prediction

Each of the marginal longitudinal semiparametric regression models in the previous section, and their extensions to d smooth functions, can be handled using the Gaussian linear mixed model

$$y \mid u \sim N(X\beta + Zu,\; I_m \otimes \Sigma), \qquad u \sim N\!\left(0,\; \operatorname{blockdiag}_{1 \le \ell \le d}(\sigma_\ell^2 I_{K_\ell})\right). \tag{12}$$

Here, $K_\ell$ corresponds to the number of spline basis functions used in the $\ell$th smooth function estimate. Let $\sigma^2 \equiv (\sigma_1^2, \ldots, \sigma_d^2)$ be the vector of variance parameters. Then, the log-likelihood of $y$ under (12) is

$$\ell(\beta, \sigma^2, \Sigma) = -\tfrac{1}{2}\left\{ mn \log(2\pi) + \log|V| + (y - X\beta)^T V^{-1} (y - X\beta) \right\} \tag{13}$$

where

$$V = V(\sigma^2, \Sigma) \equiv \operatorname{Cov}(y) = \sum_{\ell=1}^{d} \sigma_\ell^2 Z_{[\ell]} Z_{[\ell]}^T + I_m \otimes \Sigma$$

and $[Z_{[1]} \cdots Z_{[d]}]$ is the partition of $Z$ corresponding to the basis functions for each smooth function estimate. For any fixed values of $\sigma^2$ and $\Sigma$, the fixed effects solution is

$$\hat{\beta}(\sigma^2, \Sigma) = (X^T V^{-1} X)^{-1} X^T V^{-1} y. \tag{14}$$

On substitution into (13), we obtain the profile log-likelihood for (σ2, Σ) as:

$$\ell_P(\sigma^2, \Sigma) = -\tfrac{1}{2}\left[ \log|V| + y^T V^{-1}\left\{ I - X (X^T V^{-1} X)^{-1} X^T V^{-1} \right\} y \right] - \tfrac{mn}{2} \log(2\pi). \tag{15}$$

However, the restricted log-likelihood (Patterson and Thompson, 1971)

$$\ell_R(\sigma^2, \Sigma) = \ell_P(\sigma^2, \Sigma) - \tfrac{1}{2} \log\left| X^T V^{-1} X \right| \tag{16}$$

is usually preferred, since it accounts for estimation of the fixed effects vector $\beta$. The maximizers of $\ell_R(\sigma^2, \Sigma)$ are often labelled the restricted maximum likelihood, or REML, estimates of $\sigma^2$ and $\Sigma$.

Likelihood-based estimation of the model parameters $\beta$, $\sigma^2$ and $\Sigma$ thus involves:

  1. Obtain the REML estimates $\hat{\sigma}^2$ and $\hat{\Sigma}$ by maximizing $\ell_R(\sigma^2, \Sigma)$.
  2. Obtain the maximum likelihood estimate $\hat{\beta} = \hat{\beta}(\hat{\sigma}^2, \hat{\Sigma})$ according to (14).

Step 1 is by far the more challenging, since it involves multivariate numerical optimization.

Lastly, there is the problem of estimating the spline coefficients $u$. Since $u$ is random, we cannot appeal to maximum likelihood and instead have to rely on best prediction:

$$\tilde{u}(\sigma^2, \Sigma) \equiv E(u \mid y) = G_{\sigma^2} Z^T V(\sigma^2, \Sigma)^{-1} \{ y - X\hat{\beta}(\sigma^2, \Sigma) \}$$

where $G_{\sigma^2} \equiv \operatorname{blockdiag}_{1 \le \ell \le d}(\sigma_\ell^2 I_{K_\ell})$. An appropriate estimator for $u$ in this context is the empirical best predictor

$$\hat{u} = G_{\hat{\sigma}^2} Z^T V(\hat{\sigma}^2, \hat{\Sigma})^{-1} \{ y - X\hat{\beta}(\hat{\sigma}^2, \hat{\Sigma}) \}.$$

It is straightforward to construct estimates of the regression function $f$ at arbitrary locations $x \in \mathbb{R}$ using $\hat{\beta}$ and $\hat{u}$.

Despite (12) being a relatively simple linear mixed model, we have not yet been successful in fitting it with standard mixed model software such as lme() (Pinheiro et al., 2009) in the R computing language (R Development Core Team, 2009). This led us to also consider the Bayesian inference version and implementation via Gibbs sampling, as the next section describes. Note, however, that the approach described in the current section can be implemented via direct programming using either Newton–Raphson or Expectation–Maximization iteration.
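For readers wishing to attempt such direct programming, the following R sketch conveys the main idea for the single smooth function model of Section 2 ($d = 1$). It maximizes the restricted log-likelihood (16) with optim() rather than Newton–Raphson, parameterizing $\Sigma$ through its Cholesky factor to maintain positive definiteness. All function and variable names are our own, and the sketch is illustrative rather than a tuned implementation.

## Negative restricted log-likelihood (16), up to an additive constant,
## for model (12) with d = 1. theta holds log(sigma^2) followed by the
## n(n+1)/2 free elements of the Cholesky factor of Sigma.
negRestrLogLik <- function(theta, y, X, Z, m, n)
{
   sigsq.u <- exp(theta[1])
   L <- matrix(0, n, n)
   L[lower.tri(L, diag = TRUE)] <- theta[-1]
   diag(L) <- exp(diag(L))                 # keep the diagonal positive
   Sigma <- tcrossprod(L)
   V <- sigsq.u * tcrossprod(Z) + kronecker(diag(m), Sigma)
   cholV <- chol(V)
   Vinv <- chol2inv(cholV)
   XtVinvX <- crossprod(X, Vinv %*% X)
   betaHat <- solve(XtVinvX, crossprod(X, Vinv %*% y))   # equation (14)
   r <- y - X %*% betaHat
   as.numeric(sum(log(diag(cholV)))                      # (1/2) log|V|
              + 0.5 * crossprod(r, Vinv %*% r)
              + 0.5 * determinant(XtVinvX)$modulus)
}

## Step 1: REML estimates of (sigma^2, Sigma); Step 2 (beta-hat via (14))
## and the empirical best predictor of u then follow directly.
fit <- optim(rep(0, 1 + n * (n + 1) / 2), negRestrLogLik,
             y = y, X = X, Z = Z, m = m, n = n, method = "BFGS")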

4. Bayesian inference

An alternative inference strategy, which permits more direct implementation in standard software, involves working with a hierarchical Bayesian version of the Gaussian linear mixed model (12). This entails treating $\beta$, $\sigma^2$ and $\Sigma$ as random and setting prior distributions for each of them. The most convenient choices, because of conjugacy properties, are priors of the form:

$$\beta \sim N(0, F), \qquad \sigma_\ell^2 \sim \text{Inverse-Gamma}(A_\ell, B_\ell) \quad\text{and}\quad \Sigma \sim \text{Inverse-Wishart}(a, B) \tag{17}$$

where $A_\ell, B_\ell$, $1 \le \ell \le d$, are positive constants and $F$ and $B$ are both positive definite matrices. Throughout this section, let $[x]$ denote the density function of $x$. Then, the notation $\sigma_\ell^2 \sim \text{Inverse-Gamma}(A_\ell, B_\ell)$ means that

$$[\sigma_\ell^2] = \frac{B_\ell^{A_\ell}}{\Gamma(A_\ell)} \, (\sigma_\ell^2)^{-A_\ell - 1} e^{-B_\ell/\sigma_\ell^2}, \qquad \sigma_\ell^2, A_\ell, B_\ell > 0.$$

The notation Σ ~ Inverse-Wishart(a, B), where Σ is n × n, means that

$$[\Sigma] = C_{n,a}^{-1} \, |B|^{a/2} \, |\Sigma|^{-(a+n+1)/2} \exp\left\{ -\tfrac{1}{2} \operatorname{tr}(B \Sigma^{-1}) \right\}, \qquad a > 0, \; \Sigma, B \text{ both positive definite},$$

where $C_{n,a} \equiv 2^{an/2} \pi^{n(n-1)/4} \prod_{i=1}^{n} \Gamma\!\left(\frac{a + 1 - i}{2}\right)$.

Bayesian inference is based on the posterior density functions:

$$[\beta \mid y], \quad [u \mid y], \quad [\sigma^2 \mid y] \quad\text{and}\quad [\Sigma \mid y]. \tag{18}$$

The probability calculus required to obtain each of these is unwieldy and, in practice, either analytic or Monte Carlo approximations need to be called upon. As shown in Section 4.1, the Markov chain Monte Carlo method Gibbs sampling is straightforward to implement for the Bayesian version of (12) and the priors (17) and, upon convergence, yields samples of arbitrary size from the posterior densities (18). The software package BUGS (Lunn et al., 2000) facilitates this approach to approximate Bayesian inference and illustrative code is given in Section 4.2.

A final, albeit important, aspect of this approach to fitting and inference is the choice of the hyperparameters $F$, $A_\ell$, $B_\ell$, $a$ and $B$. If the analyst has specific prior beliefs about the model parameters, then there is the opportunity to choose the hyperparameters so that the prior densities reflect those beliefs. More often than not, such prior beliefs are absent and vague priors should be used. Reasonable choices for the fixed effects and variance hyperparameters, assuming that the data have been suitably standardized, are:

$$F = 10^8 I \quad\text{and}\quad A_\ell = B_\ell = 0.01. \tag{19}$$

Reasonable choices for the hyperparameters associated with Σ are

$$a = n \quad\text{and}\quad B = 0.01 I_n. \tag{20}$$

We recommend that sensitivity checks be done on these hyperparameters. An example of such a check, for a different Bayesian penalized spline model, is Figure 3 of Zhao et al. (2006). Section 3.2 of Crainiceanu et al. (2007) has detailed discussion of this issue, including the consequences of failing to pre-standardize the data.
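For example, the pre-standardization just mentioned amounts to little more than the following (plain R; yMat and xMat are the response and predictor matrices of Section 4.2, and fitted functions are transformed back to the original units afterwards):

## Standardize the data to have zero mean and unit standard deviation
## before invoking the vague priors (19) and (20).
yMat <- (yMat - mean(yMat)) / sd(as.vector(yMat))
xMat <- (xMat - mean(xMat)) / sd(as.vector(xMat))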

4.1. Gibbs sampling scheme

The hierarchical Bayesian model specified by (12) and (17) can be fitted using a Gibbs sampling scheme with draws from standard distributions. We give the details here.

First, we note (e.g. Robert and Casella, 2004, p. 371) that Gibbs sampling requires successive draws from the full conditional distributions for each member of a particular partition of the parameters in the model. For the present model, we use the partition:

$$\begin{bmatrix} \beta \\ u \end{bmatrix}, \quad \sigma_1^2, \ldots, \sigma_d^2, \quad \Sigma.$$

As an example, the full conditional distribution for $\sigma_1^2$ is

$$\sigma_1^2 \,\Big|\, y, \begin{bmatrix} \beta \\ u \end{bmatrix}, \sigma_2^2, \ldots, \sigma_d^2, \Sigma.$$

We denote this by '$\sigma_1^2 \mid \text{rest}$' for short. Let $C \equiv [X\; Z]$ and, as before, let $G_{\sigma^2} \equiv \operatorname{blockdiag}_{1 \le \ell \le d}(\sigma_\ell^2 I_{K_\ell})$. Then the required full conditionals for Gibbs sampling are:

$$\begin{bmatrix} \beta \\ u \end{bmatrix} \Bigm| \text{rest} \sim N\!\left( \left\{ C^T (I_m \otimes \Sigma^{-1}) C + D_{\sigma^2} \right\}^{-1} C^T (I_m \otimes \Sigma^{-1}) y, \; \left\{ C^T (I_m \otimes \Sigma^{-1}) C + D_{\sigma^2} \right\}^{-1} \right),$$

$$\sigma_\ell^2 \mid \text{rest} \sim \text{Inverse-Gamma}\left( A_\ell + \tfrac{1}{2} K_\ell, \; B_\ell + \tfrac{1}{2} \lVert u_\ell \rVert^2 \right), \qquad 1 \le \ell \le d,$$

where $D_{\sigma^2} \equiv \operatorname{blockdiag}(F^{-1}, G_{\sigma^2}^{-1})$,

and

$$\Sigma \mid \text{rest} \sim \text{Inverse-Wishart}\!\left( a + m, \; B + \sum_{i=1}^{m} e_i e_i^T \right), \qquad e_i \equiv y_i - X_i \beta - Z_i u,$$

where $X_i$ and $Z_i$ denote the rows of $X$ and $Z$ corresponding to the $i$th subject.

Provided that $a$ is an integer, the Inverse-Wishart draws for $\Sigma$ can be achieved by setting

$$\Sigma = \left( \sum_{i=1}^{a+m} v_i v_i^T \right)^{-1}$$

where the $v_i$ are independent $N\!\left(0, \left\{ B + \sum_{i=1}^{m} e_i e_i^T \right\}^{-1}\right)$ random vectors.
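The draws themselves are routine in R. In the sketch below (our own function names), residMat is the $n \times m$ matrix whose $i$th column is $e_i = y_i - X_i\beta - Z_i u$:

## sigma_ell^2 | rest: Inverse-Gamma(A + K/2, B + ||u||^2/2) draw.
drawSigsq <- function(A, B, u)
   1 / rgamma(1, shape = A + length(u)/2, rate = B + 0.5 * sum(u^2))

## Sigma | rest: Inverse-Wishart(a + m, B + sum_i e_i e_i^T) draw via the
## construction above, with a assumed to be an integer.
drawSigma <- function(a, m, B, residMat)
{
   n <- nrow(residMat)
   S <- B + tcrossprod(residMat)            # B + sum_i e_i e_i^T
   RS <- chol(solve(S))                     # t(RS) %*% RS = S^{-1}
   vMat <- matrix(rnorm((a + m) * n), a + m, n) %*% RS  # rows are the v_i
   solve(crossprod(vMat))                   # (sum_i v_i v_i^T)^{-1}
}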

Interestingly, the fact that Σ is unstructured means that Gibbs sampling is exact. This is not the case if Σ is structured (e.g. autoregressive) and more complicated Markov chain Monte Carlo schemes are then required.

4.2. Implementation in BUGS

The BUGS language supports implementation of our Bayesian marginal longitudinal semiparametric regression models. It is recommended that the spline basis functions be set up outside of BUGS. We do this in R and then call BUGS using the BRugs package (Ligges et al., 2009). We pass the regression data to BUGS using matrices. For example, the variable yMat is an m×n matrix with (i, j) entry containing yij. Our BUGS code for fitting the marginal longitudinal nonparametric regression model is:

model
{
 for (i in 1:m)
 {
  for (j in 1:n)
  {
   # Penalized spline model (3) for the mean function
   mu[i,j] <- beta0 + beta1*xMat[i,j] + inprod(u[],Z[(i-1)*n+j,])
  }
  # Multivariate normal response; Omega = Sigma^{-1} is a precision matrix
  yMat[i,1:n] ~ dmnorm(mu[i,],Omega[1:n,1:n])
 }
 # Spline coefficients u[k] ~ N(0,1/tau), i.e. penalization variance 1/tau
 for (k in 1:K)
 {
  u[k] ~ dnorm(0,tau)
 }
 # Diffuse priors (19) on the fixed effects and the spline precision
 beta0 ~ dnorm(0,1.0E-8); beta1 ~ dnorm(0,1.0E-8)
 tau ~ dgamma(0.01,0.01); Omega[1:n,1:n] ~ dwish(R[,],n)
 # Wishart scale matrix corresponding to (20): B = 0.01*I_n
 for (i in 1:n)
 {
  for (j in 1:n)
  {
   R[i,j] <- 0.01*equals(i,j)
  }
 }
 # Recover the standard deviation and covariance matrix scales
 sigma <- 1/sqrt(tau); Sigma[1:n,1:n] <- inverse(Omega[,])
}

Note that BUGS uses precision matrices rather than covariance matrices in its multivariate normal distribution specification. Hence, the above code uses the variable Omega, corresponding to $\Omega = \Sigma^{-1}$. Similarly, the precision parameter tau corresponds to $\tau = 1/\sigma^2$, where $\sigma^2$ is the spline penalization variance component.
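On the R side, fitting then reduces to packaging the data and calling BUGS through BRugs. The sketch below uses our own object names, and the BRugsFit() arguments shown should be checked against the package documentation; it conveys the workflow rather than reproducing our analysis code verbatim:

## Assemble the data expected by the BUGS model above; Z is the (m*n) x K
## spline design matrix set up in R beforehand.
library(BRugs)
dataList <- list(yMat = yMat, xMat = xMat, Z = Z,
                 m = nrow(yMat), n = ncol(yMat), K = ncol(Z))
fit <- BRugsFit(modelFile = "margLongitNpr.txt", data = dataList,
                inits = NULL, numChains = 1,
                parametersToSave = c("beta0", "beta1", "u", "Sigma"),
                nBurnin = 5000, nIter = 5000, nThin = 5)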

5. Illustrations

We tested out BUGS fitting of the four types of models presented in Section 2 on several sets of simulated data, as well as some empirical data. We now present some of these results.

5.1. Nonparametric regression and simulated data

We fitted the penalized spline model (12) to the data of Fig. 1. The $y_{ij}$ were generated according to model (1) with $f$ and $\Sigma$ as in (2). The $x_{ij}$ are equally spaced, but with the starting positions $x_{i1}$ generated uniformly from the interval (8, 12). We used the diffuse priors given by (19) and (20). A burn-in period of 5000 iterations was used, followed by 5000 further iterations with a thinning factor of 5, resulting in samples of size 1000 being retained for inference. The BUGS program took 5.5 min to run on the third author's laptop computer (Mac OS X; 2.33 GHz processor, 3 GBytes of RAM).
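For reference, data with the structure just described can be generated along the following lines. This is our reading of the text: the within-subject spacing of 2 between successive $x_{ij}$ is an assumption (the spacing is not stated), and MASS::mvrnorm supplies the multivariate normal draws.

## Simulate data from model (1) with f and Sigma as in (2).
library(MASS)
m <- 100; n <- 5
f <- function(x) 1 + 0.5 * pnorm((2 * x - 36) / 5)
Sigma <- toeplitz(c(0.122, 0.098, 0.078, 0.063, 0.050))
xMat <- matrix(runif(m, 8, 12), m, n) +
        matrix(2 * (0:(n - 1)), m, n, byrow = TRUE)   # assumed spacing of 2
yMat <- f(xMat) + mvrnorm(m, rep(0, n), Sigma)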

Fig. 2 summarizes the BUGS output for the diagonal entries of $\Sigma$. The chains mix quite well, with no significant autocorrelation. In addition, the true values of $\Sigma_{ii}$ are captured by the 95% credible intervals in four out of the five cases.

Fig. 2
Summary of MCMC-based inference for the diagonal entries of Σ in the fitted marginal longitudinal nonparametric regression model. The columns are: parameter, trace plot of MCMC sample, plot of sample against 1-lagged sample, sample autocorrelation ...

The results for the off-diagonal entries of $\Sigma$ are summarized in Fig. 3. All ten of the true values of $\Sigma_{ij}$ are captured by the 95% credible intervals.

Fig. 3
Summary of MCMC-based inference for the off-diagonal entries of Σ in the fitted marginal longitudinal nonparametric regression model. The columns are: parameter, trace plot of MCMC sample, plot of sample against 1-lagged sample, sample autocorrelation ...

The Bayesian penalized spline estimate of f is shown in Fig. 4, with and without the data. The thick solid curves correspond to the posterior means of (3) over a grid of xs. The dashed curves are corresponding 95% credible sets. The regression function from which the data were generated is shown for comparison. The pointwise posteriors are seen to cover the true f quite well.

Fig. 4
Left panel: estimated regression function f, together with the longitudinal data on which it is based. Dashed curves correspond to pointwise 95% credible sets. The true f is shown as a thin grey curve. Right panel: estimated regression function f, with ...

The good results presented in this section are typical of the performances we observed over several runs, as well as different choices for f and Σ. An interesting future project would be a large scale simulation study that compares this approach with existing methods.

5.2. Additive/interaction model and nutritional epidemiology data

We fitted the Bayesian additive/interaction model (10) to data from a nutritional epidemiology study (source: Kipnis et al., 2003). The sample sizes are m = 294 and n = 2 and each penalized spline fit involved K = 25 basis functions. The BUGS program, with 5000 burn-in samples and 5000 retained samples (thinned by a factor of 5), took 52 min to run on the third author’s laptop computer. In the notation of (10), the variables are

  • y = logarithm of intake of protein as measured by the biomarker urinary nitrogen,
  • x1 = body mass index,
  • x2 = logarithm of intake of protein as measured by a 24-hour recall instrument,
  • x3 = gender.

Fig. 5 shows the estimates of $f_{\text{male}}$, $f_{\text{female}}$ and $f_2$ and corresponding 95% pointwise credible sets.

Fig. 5
Estimated functions for the additive/interaction model (10) for the nutritional epidemiology data described in the text. The dashed curves correspond to 95% pointwise credible sets.

Fig. 6 summarizes the MCMC-based inference for the entries of $\Sigma$. Excellent chain mixing is observed. The within-subject covariance is significantly positive, as indicated by the 95% credible set for $\Sigma_{12}$.

Fig. 6
Summary of MCMC-based inference for the entries of Σ in the fitted additive/interaction model (10). The columns are: parameter, trace plot of MCMC sample, plot of sample against 1-lagged sample, sample autocorrelation function, kernel estimates ...

6. Discussion

It is somewhat of a quirk that the mixed model-based penalized spline approach to marginal longitudinal nonparametric regression has not been explored in depth until now. Nevertheless, as we have illustrated in the previous section, it is a viable approach that is readily implemented in standard software. Another advantage of this approach is that complications such as missingness can be handled within the same likelihood-based or Bayesian frameworks. It would be interesting to see whether the asymptotic efficiency results established for other approaches (e.g. Wang et al., 2005) also apply here. Another interesting line of inquiry is the use of penalized splines in flexible marginal modelling of longitudinal data with binary and count responses. Heagerty (1999), for example, describes parametric models of this type. We believe that such models are amenable to a penalized spline approach, with implementation in BUGS, but we have not explored this deeply.

Acknowledgments

Wand’s research was partially supported by Australian Research Council Discovery Project DP0877055. Carroll’s research was supported by a grant from the US National Cancer Institute (CA57030) and by Award Number KUS-CI-016-04, made by King Abdullah University of Science and Technology, Saudi Arabia.

References

  • Brumback BA, Ruppert D, Wand MP. Comment on paper by Shively, Kohn & Wood. Journal of the American Statistical Association. 1999;94:794–797.
  • Carroll RJ, Hall P, Apanasovich TV, Lin X. Histospline method in nonparametric regression models with application to clustered/longitudinal data. Statistica Sinica. 2004;14:649–674.
  • Carroll RJ, Maity A, Mammen E, Yu K. Nonparametric additive regression for repeatedly measured data. Biometrika. 2009a;96:383–398.
  • Carroll RJ, Maity A, Mammen E, Yu K. Efficient semiparametric marginal estimation for the partially linear additive model for longitudinal/clustered data. Statistics in Biosciences. 2009b;1:10–31.
  • Chen K, Jin Z. Local polynomial regression analysis of clustered data. Biometrika. 2005;92:59–74.
  • Coull BA, Ruppert D, Wand MP. Simple incorporation of interactions into additive models. Biometrics. 2001;57:539–545.
  • Crainiceanu CM, Ruppert D, Carroll RJ, Joshi A, Goodner B. Spatially adaptive Bayesian penalized splines with heteroscedastic errors. Journal of Computational and Graphical Statistics. 2007;16:265–288.
  • Fan J, Huang T, Li R. Analysis of longitudinal data with semiparametric estimation of covariance function. Journal of the American Statistical Association. 2007;102:632–641.
  • Fan J, Wu Y. Semiparametric estimation of covariance matrixes for longitudinal data. Journal of the American Statistical Association. 2008;103:1520–1533.
  • Heagerty PJ. Marginally specified logistic-normal models for longitudinal binary data. Biometrics. 1999;55:688–698.
  • Hu ZH, Wang N, Carroll RJ. Profile-kernel versus backfitting in the partially linear models for longitudinal/clustered data. Biometrika. 2004;91:251–262.
  • Kipnis V, Subar AF, Midthune D, Freedman LS, Ballard-Barbash R, Troiano R, Bingham S, Schoeller DA, Schatzkin A, Carroll RJ. The structure of dietary measurement error: results of the OPEN biomarker study. American Journal of Epidemiology. 2003;158:14–21.
  • Li Y, Ruppert D. On the asymptotics of penalized splines. Biometrika. 2008;95:415–436.
  • Lin X, Carroll RJ. Nonparametric function estimation for clustered data when the predictor is measured without/with error. Journal of the American Statistical Association. 2000;95:520–534.
  • Lin X, Carroll RJ. Semiparametric regression for clustered data using generalized estimating equations. Journal of the American Statistical Association. 2001;96:1045–1056.
  • Lin X, Carroll RJ. Semiparametric estimation in general repeated measures problems. Journal of the Royal Statistical Society, Series B. 2006;68:68–88.
  • Lin X, Wang N, Welsh AH, Carroll RJ. Equivalent kernels of smoothing splines in nonparametric regression for clustered/longitudinal data. Biometrika. 2004;91:177–193.
  • Linton OB, Mammen E, Lin X, Carroll RJ. Accounting for correlation in marginal longitudinal nonparametric regression. In: Lin DY, Heagerty PJ, editors. Proceedings of the Second Seattle Symposium in Biostatistics: Analysis of Correlated Data. Springer; New York: 2003. pp. 23–33.
  • Ligges U, Thomas A, Spiegelhalter D, Best N, Lunn D, Rice K, Sturtz S. BRugs 0.5-3: Analysis of graphical models using MCMC techniques. R package. 2009. http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.10.
  • Lunn DJ, Thomas A, Best N, Spiegelhalter D. WinBUGS–a Bayesian modelling framework: concepts, structure, and extensibility. Statistics and Computing. 2000;10:325–337.
  • Patterson HD, Thompson R. Recovery of inter-block information when block sizes are unequal. Biometrika. 1971;58:545–554.
  • Pinheiro J, Bates D, DebRoy S, Sarkar D, the R Core Team. nlme 3.1: linear and nonlinear mixed effects models. R package. 2009. www.R-project.org.
  • R Development Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing; Vienna, Austria: 2009. www.R-project.org.
  • Robert CP, Casella G. Monte Carlo Statistical Methods. 2. Springer-Verlag; New York: 2004.
  • Robinson GK. That BLUP is a good thing: the estimation of random effects. Statistical Science. 1991;6:15–51.
  • Ruppert D, Wand MP, Carroll RJ. Semiparametric Regression. Cambridge University Press; New York: 2003.
  • Ruppert D, Wand MP, Carroll RJ. Semiparametric regression during 2003–2007. Electronic Journal of Statistics. 2009;3:1193–1256.
  • Sun Y, Zhang W, Tong H. Estimation of the covariance matrix of random effects in longitudinal studies. The Annals of Statistics. 2007;35:2795–2814.
  • Wand MP, Ormerod JT. On semiparametric regression with O’Sullivan penalized splines. Australian and New Zealand Journal of Statistics. 2008;50:179–198.
  • Wang N. Marginal nonparametric kernel regression accounting for within-subject correlation. Biometrika. 2003;90:43–52.
  • Wang N, Carroll RJ, Lin X. Efficient semiparametric marginal estimation for longitudinal/clustered data. Journal of the American Statistical Association. 2005;100:147–157.
  • Welham SJ, Cullis BR, Kenward MG, Thompson R. A comparison of mixed model splines for curve fitting. Australian and New Zealand Journal of Statistics. 2007;49:1–23.
  • Welsh AH, Lin X, Carroll RJ. Marginal longitudinal nonparametric regression: locality and efficiency of spline and kernel methods. Journal of the American Statistical Association. 2002;97:482–493.
  • Zeger S, Diggle PJ. Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters. Biometrics. 1994;50:689–699.
  • Zhao Y, Staudenmayer J, Coull BA, Wand MP. General design Bayesian generalized linear mixed models. Statistical Science. 2006;21:35–51.