In this section, we consider how to improve the approximate model (
8) with a normality assumption. In addition to the previous assumptions concerning model (
3), we assume that the error term
ε is normally distributed. The normality assumption can be checked through model (
9), and it implies that, given (
i,
Wi),
Xi follows a multivariate normal distribution with mean and variance given by (
5) and (
6), respectively. This makes it possible to carry out a maximum likelihood analysis based on the following decomposition:
where
f(·|·) denotes a generic conditional density or mass function. The full likelihood for (
α,
β,
σ2) can be maximized using a numerical integration method (of dimension
m − 1) or a Monte Carlo EM algorithm. Either way, the computation can be challenging when
m is large. Our regression calibration approach based on (
7) has a pseudo-likelihood flavor in that we estimate (
α,
σ2) from the first part of the likelihood (for
i) and then substitute the estimates into the second part of the likelihood (for
Yi). Further, we take a marginal approach to the likelihood for
Yi, working with each individual
Yij and accounting for the correlation using GEEs. The regression calibration approach is computationally simple and, according to
Thoresen and Laake (2000) and
Fearn et al. (2008), may provide a good alternative to the maximum likelihood approach. For our purpose, the normality in model (
3), and hence in
equations (5) and
(6), is mainly to help characterize the calibrated model (
7). This will be done for specific link functions using existing techniques (e.g.,
Carroll et al., 1984;
Burr, 1988;
Liang and Liu, 1991;
Follman et al., 1999).
The idea may be easier to communicate for the log link. In this case,
G is the exponential function, and we can evaluate the integral in (
7) explicitly as
where

is identical to
β except for the intercept:

. This follows from the fact that E{exp(
S)} = exp(
μ +
τ2/2) if
S is normally distributed with mean
μ and variance
τ2. The above expression does resemble model (
8) (with
G = exp), the only difference being the value of the intercept. For the probit link, it can be shown as in
Carroll et al. (1984) that
where

and Φ is the standard normal distribution function. Thus the calibrated model is also a probit regression, and
βprobit is a rescaling of
β with all components attenuated by the same factor. As a result of the attenuation, |
βprobit,x| is bounded from above by {
m/(
m − 1)}
1/2/
σ, and the upper bound decreases to 0 with increasing
σ. The logit link does not admit a simple exact expression for (
7), unfortunately. In this case, the one-dimensional integral in (
7) can be approximated using numerical integration or Monte Carlo techniques, as noted by the Associate Editor. This approach could be applied to any link function and any distribution of calibration errors. An alternative approach, which is specific to the logit link but readily available in standard packages, is based on an approximation of logit
−1, the standard logistic distribution function, by Φ(·/
c) with

(
Johnson and Kotz, 1970;
Zeger et al., 1988;
Liang and Liu, 1991). Under the latter approach, we have
where

. As in the probit case,
βlogit is an attenuation of
β by a factor that further depends on
c. The approximation here is good for small or moderate
σ (
Zeger et al., 1988). For large
σ, it may be preferable to use a numerical interation or Monte Carlo method.
Each of the three models (
11)–(
13) can be fit with GEEs, using either the augmentation method or the imputation method described earlier. In each case, the “working” regression parameter vector
βlink, where the subscript could be any of the three link functions considered above, is zero if and only if
β =
0, and the same equivalence holds between the corresponding sub-vectors of
βlink and
β, except for the intercept in the log link case. Therefore, in most cases, the null hypothesis that some components of
β are zero can be tested (approximately in the logit case) by applying a standard procedure (e.g., Wald test) to the corresponding components of
βlink. Estimation of
β may involve some conversion from a GEE estimate of
βlink. For the log link, this is needed only for

. In the probit case, the conversion consists of

followed by

. Numerically, this requires that
where
probit,x is the GEE estimate and
2 is
m times the estimated error variance for model (
9). When (
14) fails, intuition suggests that one might simply set
x =
probit,x, although there are other possibilities (
Burr, 1988;
Tosteson et al., 1989). This would not alter the consistency or asymptotic normality of
![[beta]](/corehtml/pmc/pmcents/bgrcirc.gif)
. Indeed, under the stated assumptions the probability of (
14) failing tends to 0, and any modification for that case would have no effect asymptotically. The logit case could be handled in a similar way.
Under the assumed models, the estimators derived from models (
11)–(
13) are consistent (approximately in the logit case) and asymptotically normal. For the augmentation method, this follows directly from the delta method. For the imputation method, we could use a joint estimating function argument as in Section 3, followed by the delta method. Robust variance estimates from an augmented GEE analysis are directly applicable to
βz and
βx under the log link, and require some delta-method-type manipulation under the probit or logit link. In general, a bootstrap procedure with poolwise resampling can be used to obtain standard errors and confidence intervals.