Since
James and Stein (1961), many authors have explored better estimators for a covariance matrix. These have been derived from a decision-theoretic perspective (Stein, 1975;
Dey and Srinivasan, 1985;
Lin and Perlman, 1985;
Haff, 1991) or by specifying an appropriate prior for the covariance matrix and choosing an estimator based on a particular loss function (
Yang and Berger, 1994;
Daniels and Kass, 1999). We will take the latter approach in this article and build on the work of
Daniels and Kass (1999). The general approach will be to first generically stabilize an unstructured estimate of the covariance matrix and then to shrink this estimate toward a parsimonious, structured form of the matrix. In this approach, the data will determine how much shrinkage is required.
We will now provide some more details on the approach in
Daniels and Kass (1999); for a similar approach in the context of a covariance function in time series data, see
Daniels and Cressie (2001). To develop more stable estimators for the covariance matrix,
Daniels and Kass (1999) shrink the matrix toward a diagonal structure and obtain estimates (and posterior distributions) using combinations of importance sampling and Markov chain Monte Carlo (MCMC). Here we will extend this work in several ways. First, we consider shrinking toward any structure (not just diagonal), including nonstationary models, such as (structured) antedependence models (
Zimmerman and Nunez-Anton, 1997) for longitudinal data, and stationary models, such as compound symmetry, auto regressive (AR), and auto regressive moving average (ARMA) models (see early work by
Chen, 1979;
Lin and Perlman, 1985). Second, we propose estimators that can be computed by simple approximations, avoiding the fully MCMC approach, which can be computationally intensive for large matrices. These estimators will provide more stability than an unstructured estimator but will still provide robustness to misspecification of the structure. These estimators might be called empirical (or approximate) Bayes estimators (
Kass and Steffey, 1989).
Another approach to inducing stability in estimating a covariance matrix is to shrink the eigenvalues (Stein, 1975;
Dey and Srinivasan, 1985;
Haff, 1991;
Yang and Berger, 1994;
Ledoit, 1996). It is well known that the sample eigenvalues are biased; the smallest is too small and the largest is too large. We will review some of these estimators and propose another, derived from a simple prior distribution on the eigenvalues.
We will also examine the way these estimators influence covariate effects. Consider a fixed effects regression model with correlated errors, a common model for longitudinal and clustered data (
Diggle, Liang, and Zeger, 1994),
where
Yi is a
p × 1 vector and
β is a
q × 1 vector,
i = 1, . . . ,
n. When
n is small relative to
p, estimation of the covariance matrix can be unstable. The most common approach to inducing stability is to assume some true structure and then estimate the relevant parameters, which will be fewer than those in the full covafiance matrix,
p(
p + 1)/2. However, the resulting variance of

will be incorrect if the hypothesized covariance structure is incorrect. To avoid this problem, a sandwich estimator (
Huber, 1967) for the variance of

will result in asymptotically correct, but not efficient, variance estimates for

. As a robust alternative to assuming some structure and as a way to induce stability over the unstructured estimator of the covariance matrix, we will use the covariance matrix estimators proposed above. Our strategy will be to use the following procedure to estimate Σ:
- Step 1. Fit the model (1) using an unstructured covariance matrix,
. - Step 2. Shrink the eigenvalues of the unstructured estimator to obtain a more stable estimate,
. - Step 3. Fit the model (1) assuming some covariance structure,
. This might be chosen, e.g., using the Bayesian information criterion (BIC) (cf., Zimmerman and Nunez-Anton, 1997). - Step 4. Compute estimates of the parameters that determine the amount of shrinkage using
and
. - Step 5. Compute a shrinkage estimator of the covariance matrix,
using the estimates in steps 2–4. - Step 6. Refit model (1) conditional on
.
More details on the shrinkage parameters and shrinkage estimators will be given in Section 3. Steps 1–3 and 6 are easy and can be implemented in SAS proc mixed (
SAS Institute, 1999). Steps 4 and 5 will require a simple macro. In Section 4, we recommend two estimators. Both first shrink the eigenvalues together (step 2 above). Then this modified estimate is shrunk toward a chosen structure under two different parameterizations of the matrix (steps 4 and 5 above).
In Section 2, we will explore estimators that shrink the eigenvalues, and in Section 3, we will explore estimators that shrink an unstructured estimate of the covariance matrix toward some structure. In Section 4, we will do simulations to examine the risk of these estimators for estimating the sample covariance matrix and the mean squared error for estimating the fixed regression coefficients. Section 5 includes a theorem demonstrating the asymptotic properties of these estimators. In Section 6, we will show the importance of covariance matrix estimation on fixed regression coefficients in data from a sleep EEG study. We offer conclusions and extensions in Section 7.