Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2830369

Formats

Article sections

Authors

Related links

Stat Med. Author manuscript; available in PMC 2011 March 15.

Published in final edited form as:

PMCID: PMC2830369

NIHMSID: NIHMS166746

Cronbach Coefficient Alpha (CCA) is a classic measure of item internal consistency of an instrument and is used in a wide range of behavioral, biomedical, psychosocial, and health-care related research. Methods are available for making inference about one CCA or multiple CCAs from correlated outcomes. However, none of the existing approaches effectively address missing data. As longitudinal study designs become increasingly popular and complex in modern-day clinical studies, missing data has become a serious issue, and the lack of methods to systematically address this problem has hampered the progress of research in the aforementioned fields. In this paper, we develop a novel approach to tackle the complexities involved in addressing missing data (at the instrument level due to subject dropout) within a longitudinal data setting. The approach is illustrated with both clinical and simulated data.

Measurement error models play a particularly important role in psychosocial research, as many quantities of interest are latent constructs and not directly observable. Outcomes for measuring such constructs are often derived based on instruments (questionnaires) that delve into multifaceted dimensions of an individual’s mental and physical health. For most instruments, items are scored on a Likert scale such as 0, 1, 2, 3, 4 and then totaled either across all items or a subset of items from a subscale to provide an assessment of the latent construct or a sub-domain of interest. An elegant measure to ensure that the different items in an instrument or a sub-scale capture a large amount of variability of either a uni-dimensioinal construct or multi-dimensional constructs sharing a common source is the Cronbach Coefficient Alpha (CCA). CCA ranges between 0 and 1, with larger values indicating higher degrees of consistency across the different items in terms of capturing a common source of variation [1, 2].

Methods are available for making inference about a single CCA and multiple dependent CCAs, with the latter based on data from the same subjects [3]–[10]. However, as these methods are derived based on joint multivariate normal distributions of item scores, they are sensitive to departures from such parametric distributional assumptions. Since item scores from most instruments in psychosocial research are based on Likert-type scales, they are intrinsically discrete and modeling them using parametric distributions for continuous outcomes such as the multivariate normal is theoretically flawed, especially when modeling subscales of an instrument with a very limited numerical range. Further, as these methods for comparing multiple CCAs from either several instruments when applied to a common group of subjects or subscales of an instrument are developed based on statistics for omnibus tests of overall homogeneity involving all the CCAs, they do not provide explicit asymptotic joint distributions of estimates of individual CCAs. As a result, they cannot be applied to test the equality of some combination of the CCAs, which is particularly common in longitudinal studies. For example, we may want to know whether the CCA for an instrument has dropped by 30% from baseline to some follow-up visit in a longitudinal study. None of these methods can be applied to address such a hypothesis.

As longitudinal study designs become increasingly popular and complex in modern-day studies, missing data are inevitable due to loss to follow up, making inference more challenging for analysis of longitudinal CCAs. Although a variety of methods are available for addressing missing data, none applies directly to the current setting involving inference for comparing multiple CCAs within a longitudinal data setting.

In this paper, we develop a novel approach to tackle the complexities involved in addressing missing data within a longitudinal and clustered data setting. The approach is illustrated with both clinical and simulated data.

Consider an instrument consisting of *K* items, which is administered to *n* subjects in a longitudinal study with *T* assessments. Let *y _{itk}* denote the item score to the

Also, let

where **1** denotes a *K* × 1 column vector of 1’s. The Cronbach Coefficient Alpha ** α** can be expressed as:

Let

(1)

Then, it is readily checked that

(2)

is a one-sample, vector-valued U-statistic and *E* (**) = **** θ** [11, chap. 3]. By applying the theory of multivariate U-statistics [12, chap. 5], and the Delta method,

Let **Σ_{θ}** = 4

**is consistent and asymptotically normal:**(3)where →and →_{p}denote convergence in probability and distribution, respectively._{d}- A consistent estimate of the asymptotic variance is obtained by substituting consistent estimates of the respective components of
**Σ**; if_{α}is a consistent estimate of_{θ}**Σ**, a consistent estimate of_{θ}**Σ**is given by: ._{α}

The proof of Theorem 1 is not provided since these conclusions follow as a special case from the more general results concerning the missing data case to be discussed in Section 2.2.

To find a consistent estimate of **Σ_{θ}**, first note that:

(4)

Thus, a consistent estimate of **Σ_{θ}** is given by:

(5)

where *Ê* (**h*** _{ij}* |

We can use Theorem 1 to test linear contrasts of the form:

where *K* is some *p* × *T* full rank matrix of known constants. For inference, we can use the Wald statistic,
, which has an asymptotic central
distribution with *p* degrees of freedom under the null *H*_{0}.

In the presence of missing data, one approach is to apply the complete-data approach above to the subsample consisting of those subjects with complete data. However, in most longitudinal studies, missing data do not occur completely at random, as they often result from subjects’s deteriorated/improved health and other related conditions in response to the treatment. In such cases, missing data are predicted by observed responses and as a result, the missing completely at random (MCAR) assumption does not apply [13]. Thus, such a complete-data approach not only reduces power, but also likely yields biased estimates. A better alternative is to include all available data and to address the inherent missing data problem.

To this end, we assume that missing data only occurs at the instrument level and define a vector of binary variables for indicating missing (or rather observed) instrument-level response as follows:

(6)

Let

Define a new estimate of ** α** with the form as before

(7)

Note that although **h*** _{ijt}* in (7) is not defined if one of the

First, assume that **Δ*** _{i}* is known. Then, it can be shown that

Let

(8)

Then,

**is consistent and asymptotically normal,**(9)- The asymptotic variance Σ
can be estimated by the Delta method with a consistent estimate_{α}given by:_{θ}

(10)

This case with known **Δ*** _{i}* may arise if data are missing by designs, as in some multi-stage trials in which we may want to over-sample those subjects with larger outcomes at a prior stage to participate in the current stage. For example, in a two-stage study with

(11)

where *r _{i}*

In most applications, **Δ*** _{i}* is unknown and must be estimated. Under MCAR,

Let
denote the observed responses for the *i*th subject. Then, under MAR,

(12)

As
is the sub-vector of **y*** _{i}* containing the non-missing responses, Δ

We assume no missing data at baseline *t* = 1 so that *r _{i}*

(13)

To estimate Δ* _{it}*, we first model the one-step transition probability of the occurrence of missing data using logistic regression:

(14)

In many applications, the limited data may prevent us from reliably estimating ** β**, especially with a large

(15)

Now by invoking MMDP and the assumption of no missing data at *t* = 1, we obtain:

(16)

Note that by setting *β*_{t}_{1} = **0** in (14) or (15), we can use (16) to estimate Δ* _{it}* under the MCAR assumption, which is equivalent to estimating Δ

When **Δ*** _{i}* is estimated by the logistic regression model in (14),

Suppose that ** β** in the logistic model in (14) is estimated by maximum likelihood. Then,

(17)

When using the estimated **, the estimate defined in (7) becomes**

(18)

Thus, we must also include the variability of ** in the asymptotic variance of ****(****). By utilizing the properties of score equations and the projection-based U-statistics asymptotic expansion, we can derive the asymptotic variance to take into account this extra variability. We summarize the results below with a justification given in Appendix A2.**

Let

(19)

Then, under the assumptions of Theorem 2,

**is consistent and asymptotically normal, with the asymptotic variance,**(20)- A consistent estimate of Σ
_{α}_{Φ}is , whereis given in (10) and is obtained by substituting moment estimates in place of_{θ}*H*,*C*, and .

It is seen from (20) that when the weight function is estimated, the asymptotic variance of ** has an additional component Φ, which reflects the sampling variability in ****.**

Although derived from a longitudinal data setting, the results in the theorems can also be applied to cross-sectional study data. For example, most of the available methods have been developed to compare whether several CCAs either from different instruments or from multiple domains, or subscales of an instrument are identical [3]–[10]. By identifying *T* as the number of instruments or domains of an instrument, *y _{itk}* then represents the score from the

We illustrate the approach with both clinical and simulated data. We first present a clinical application and then follow up with investigations of the performance of the approach under small to moderate samples by simulation. In all of the examples, we set the statistical significance at *α* = 0.05. All analyses are carried out using a package we have developed for implementing the proposed approach using the R software platform (R Development Core Team (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org)

A study on the benefits of total knee arthroplasty (TKA) was recently conducted in the Hospital for Special Surgery in 2008 (“*Early clinical and radiographic results of the standard posterior stabilized and high flex polyethylene inserts in the genesis II total knee replacement system: A pilot study*” was approved by the Institutional Review Board of the Hospital for Special Surgery). A modified version of the Knee Society rating System (KSS) [15] was used to report results for patients who underwent TKA at three visits including pre-operation, 6 weeks, and 4 months post-operation. For illustration purposes, we focused on a total of 117 patients who had observed data at the pre-op in the domain of the patient reported portion of the KSS that consists of 5 items measuring knee functions in walking (a 6-level ordinal scale), going up stairs (a 4-level ordinal scale), going down stairs (a 4-level ordinal scale), getting out of chair (a 4-level ordinal scale), and necessity of support for walking (a 3-level ordinal scale), respectively.

In this sample, 20 of the 117 patients were lost to follow-up, yielding a monotone missing data pattern. Let **y*** _{it}* = (

(21)

Shown in Table 1 are the estimates of parameters from the above model, their standard errors, and corresponding p-values. Missingness at 4 months post-op was dependent on the observed KSS at 6 weeks post-op (p-value=0.043). There was no evidence indicating the dependence of missingness at 6 weeks on the observed KSS at pre-op. The occurrence of missing data was not highly associated with any of the baseline covariates. Given these results, we proceeded with all subsequent analyses under MAR with inference based on Theorem 3.

Estimates of logistic regression model for missingness under MMDP for the Total Knee Arthroplasty Study.

Shown in Table 2 are the estimated *α _{t}*’s and their standard errors over time, and the Wald test statistic and associated p-value under

A limited simulation study was conducted to examine the empirical type I error rate for testing the null of equal CCAs over time based on a 5-item, Likert-scale questionnaire under a longitudinal study design with three assessments for four sample sizes–30, 50, 75, and 100–under complete, and missing data with MCAR and MAR. For each sample size, we generated observers’ratings over time **y*** _{it}* = (

We created the data **y*** _{it}* over three assessments in two steps. First, we generated continuous outcomes over time,

In the above, Σ_{1} denotes the within-visit, between-item correlation matrix, *AR _{m}* (1,

where Φ^{−1} denotes the inverse of the CDF of the standard normal.

For a given sample size *n*, let
be the Wald statistic from the *j*th MC replication and *G*_{0.95} the 95th percentile of the
distribution with 2 degrees of freedom. The empirical size is calculated as
.

For the missing data case, we assumed no missing data at baseline *t* = 1 and simulated the missing response according to a MCAR and MAR model. For MAR, we considered the MMDP model, and simulated the missing data indicators at each time *t*, *r _{it}*, from a Bernoulli distribution with the probability of success equal to

where _{i}_{(}_{t}_{−1)} denotes the mean of **y*** _{it}*. We fixed

(22)

To ensure that the missing data indicators *r _{it}* followed the MMDP, we further imposed the following restrictions,

In addition to examining type I error rates, power analysis for testing the null of equal CCAs over time was conducted in a second simulation study within the same longitudinal data setting under complete data, MCAR, and MAR. The true alpha was set at *α*_{1} = 0.7, *α*_{2} = 0.803, and *α*_{3} = 0.898 for times 1, 2, and 3, respectively. The power was estimated by
.

We computed the estimate ** of **** α** = (

Averaged and _{α}_{Φ} over 1,000 MC replications along with estimated empirical variance of and empirical type I error rate under complete data, MCAR, and MAR with the **...**

Shown in Table 4 are the averaged ** over 1,000 Monte Carlo (MC) replications, the empirical variance ***V ar* (**), the model based asymptotic variance **_{α}_{Φ}, and estimated power . Similar patterns in power were found across complete data, MCAR, and MAR. Although the estimated power was low at *n* = 30, it increased rapidly as the sample size increased.

Averaged and _{α}_{Φ} over 1,000 MC replications along with estimated empirical variance of and power under complete data, MCAR, and MAR with the true *α*_{1} = 0.7, **...**

We repeated the simulation for a larger number of items, *k* = 10. Compared to the results for *k* = 5 items, type I error rates are closer to the nominal 0.05 even for the small sample size *n* = 30. Power analysis was also improved significantly by achieving 60% power for sample size *n* = 30, which jumped up to 85% for complete, and MCAR, and 83% for MAR at *n* = 50.

We developed a U-statistics based approach to concurrently address missing data and stringent distribution assumptions when modeling the Cronbach Coefficient Alpha (CCA) within a longitudinal data setting. Unlike available methods, this approach requires no parametric distribution assumption on the item score and thus provides robust inference regardless of the data distribution. As longitudinal studies have become increasing popular in today’s research, the proposed approach fills a critical gap on the extant literature to address this timely data-analytic issue.

To demonstrate the robustness feature, we also conducted a simulation study to compare the U-statistic based approach with a popular method developed by Feldt et al [6] in a cross-sectional setting by considering the hypothesis *H*_{0}: *α* = *α*_{0} vs. *H*_{0}: *α* ≠ *α*_{0} under various sample sizes (*n* =30, 50, 100, and 200). Feldt et al [6] showed that the transformed alpha
follows a *F* distribution under the assumed ANOVA model where items by subjects interaction effects and the residual errors are independent and normally distributed with homogeneous variance. To examine its performance under likert-type item scores, we generated skewed likert-type item scores with true alpha *α*_{0} = 0.693 and *k* = 5 items. Shown in Table 5 are the estimates of CCA and type I error rates for the proposed U-statistic based approach and their F-distribution based counterpart. Although the estimates of alpha, * _{U}* (U-statistics based) and

Estimates of type I error rate and CCA for the U-statistic based and F-distribution based approaches.

Modeling CCA presents a major challenge under the popular mean response based regression paradigm as typified by the weighted generalized estimating equations (WGEE) due to the complexity in the analytic expression of this coefficient involving second-order moments. Although GEE II may be used to model the required second-order moments [12, 16, 17], it leads to more cumbersome algebra. Further, as it requires modeling more parameters, GEE II is generally not as efficient as the proposed U-statistics based approach.

The proposed IPW-based estimate may not be fully efficient since information from the subjects with missing data is only used to estimate the weight function. In many situations, reliable models may also exist for directly modeling the relationship between (missing) response and observed covariates or other auxiliary information. In such cases, we may combine the proposed IPW with this extra model to develop a *double robust* estimate, which not only ensures consistency if only one of the models–the model for missing data or the one for the auxiliary information–is correct, but also improves efficiency [14, 18].

In addition to CCA, the proposed approach is similarly applied to modeling other measures of agreement for continuous outcome such as the closely related intraclass correlation coefficient (ICC) [19]–[22]. In addition, missing data may also occur at the item level which can become complicated depending on the nature of the question and on the design of the study. For example, person-to-person interviews often result in more missing data and/or unreliable responses compared to questionnaires as interviewees may feel uncomfortable and refuse to answer certain types of questions in a face to face meeting. As missing responses arising from such sensitive questions likely follow the non-ignorable missing data rather than the missing at random (MAR) mechanism as in longitudinal studies, they are best addressed up front by study designs, since the former missingness mechanism is more difficult to model as compared to the MAR mechanism. Work is currently underway to generalize the proposed approach to address these limitations as well as to develop double robust estimates within a longitudinal setting.

This research is supported in part by NIH grant U54 RR023480. Dr. Ma was partially supported by the following grants: Center for Education and Research in Therapeutics (CERTs) (AHRQ RFA-HS-05-14) and Clinical Translational Science Center (CTSC) (UL1-RR024996). We sincerely thank Ms. Cheryl Bliss-Clark at the University of Rochester, an Editor and two anonymous reviewers for their constructive comments to improve the presentation of the manuscript.

We first establish a lemma.

It is readily checked by the iterated conditional expectation that [12, chap. 1]

(24)

It follows that *E* (**U*** _{n}*) =

Let . It then follows that

Thus, the projection of **U**_{t}_{,}* _{n}* is given by [12, chap. 5]

Since **Ũ**_{t}_{,}* _{n}* is a sum of independently and identically distributed random variables, it follows from the central limit theorem (CLT) that

(25)

By the theory of U-statistics (e.g. [12, chap. 5], **Ũ**_{t}_{,}* _{n}* and

The lemma follows by applying a similar argument to the vector **U*** _{n}*.

Let
. By an argument similar to (24), we have, *E* (**g*** _{ijt}*) =

Thus, by Slutsky’s theorem, * _{t}* =

Similarly, by considering the vector **, we obtain**

Theorem 2/(a) follows by applying the Delta method to ** = ****f** (**).**

To show (10), first note that

Further, we have

Thus, Theorem 2/(b) follows by taking the covariance between * _{it}* and

As noted in Section 2.2, ** is the solution to the score equations in (17). From the properties of score equations, we have:
**

(26)

where *H* is given in (19) and **o*** _{p}* (1) denotes the stochastic

1. Cronbach LJ, Gleser GC, Nanda H, Rajaratnam N. The Dependability of Behavioral Measurements. Wiley; New York: 1972.

2. Nunnally JC, Bernstein IH. Psychometric Theory. McGraw Hill; New York: 1994.

3. Duhachek A, Iacobucci D. Alpha’s Standard Error (ASE): An accurate and precise confidence interval estimate. Journal of Applied Psychology. 2004;89:792–808. [PubMed]

4. Feldt LS. The approximate sampling distribution of the Kuder-Richardson reliability coefficient twenty. Psychometrika. 1965;34:363–373. [PubMed]

5. Feldt LS. A test of the hypothesis that Cronbach’s alpha reliability coefficient is the same for two tests administered to the same sample. Psychometrika. 1980;45:99–105.

6. Feldt LS, Woodruff DJ, Salih FA. Statistical inference for coefficient alpha. Applied Psychological Measurement. 1987;11:93–103.

7. Feldt LS, Kim S. Testing the difference between two alpha coefficients with small samples of subjects and raters. Educational and Psychological Measurement. 2006;66:589–600.

8. Woodruff DJ, Feldt LS. Tests for equality of several alpha coefficients when their sample estimates are dependent. Psychometrika. 1986;51:393–413.

9. Alsawalmeh YM, Feldt LS. Testing the equality of independent alpha coefficients adjusted for test length. Educational and Psychological Measurement. 1999;59:373–383.

10. van Zyl JM, Neudecker H, Nel DG. On the distribution of the maximum likelihood estimator of Cronbach’s Alpha. Psychometrika. 2000;65:271–280.

11. Randels RH, Wolfe DA. Introduction to the Theory of Nonparametric Statistics. Wiley; New York: 1979.

12. Kowalski J, Tu XM. Modern Applied U Statistics. Wiley; New York: 2007.

13. Little RJA, Rubin DB. Statistical Analysis with Missing Data. Wiley; New York: 1987.

14. Robins JM, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. JASA. 1995;90:106–121.

15. Insall JN, Dorr LD, Scott RD, Scott WN. Rationale of the knee society clinical rating system. Clinical Orthopaedics and Related Research. 1989;248:13–14. [PubMed]

16. Prentice RL. Correlated binary regression with covariates specific to each binary observation. Biometrics. 1988;44:321–327. [PubMed]

17. Prentice RL, Zhao LP. Estimating equations for parameters in means and covariances of multivariate discrete and continuous responses. Biometrics. 1991;47:825–839. [PubMed]

18. Tsiatis AA. Semiparametric Theory and Missing Data. Springer; New York: 2006.

19. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin. 1979;86:420–428. [PubMed]

20. McGraw KO, Wong SP. Forming inferences about some Intraclass Correlation Coefficients. Psychological Methods. 1996;1:30–46.

21. Fleiss JL, Levin B, Paik MC. Statistical Methods for Rates and Proportions. 3. Wiley; New York: 2003.

22. Landis JR, Koch GG. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics. 1977;33:363–374. [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's Canada Institute for Scientific and Technical Information in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |