To re-examine proposed models of cognitive test performance that concluded separate factor structures were required for people with Alzheimer disease (AD) and older adults without dementia.
Five models of cognitive test performance were compared using multistep confirmatory factor analysis in 115 individuals with autopsy-confirmed AD and 191 research participants without clinical dementia from longitudinal studies at the Washington University AD Research Center. The models were then cross-validated using independent samples of 323 people with clinically diagnosed dementia of the Alzheimer type and 212 cognitively healthy older adults.
After controlling for Alzheimer-specific changes in episodic memory, performance on the battery of tests used here was best represented in people both with and without dementia by a single model of one general factor and three specific factors (verbal memory, visuospatial ability, and working memory). People with dementia performed lower on the general factor than those without dementia. Larger variances associated with the specific factors in the group with dementia indicated greater individual differences in the pattern of cognitive deficits in the mild stage of AD.
A hybrid model of general and specific cognitive domains simplifies cognitive research by allowing direct comparison of normal aging and Alzheimer disease performance. The presence of a general factor maximizes detection of dementia, whereas the specific factors reveal the heterogeneity of dementia’s associated cognitive deficits.
We previously reported that cognitive test performance in individuals with and without dementia is best characterized by two distinct factor structures: a single general factor for individuals without dementia compared with three orthogonal factors representing verbal memory, visuospatial ability, and working memory in dementia of the Alzheimer type (DAT).1 Other studies using similar exploratory techniques have found either separate factors (indicating that DAT affects different aspects of cognition independently2–4) or a global deficit that produces a single general factor.5,6 This discrepancy makes it difficult to know how best to compare and contrast decrements in cognitive performance between different patient populations.
Confirmatory factor analysis (CFA) provides the ability to test competing models (i.e., general vs multiple factors) either within a single population or across multiple populations. Previous efforts using CFA in DAT support a multifactor hypothesis of cognitive deficits.7–9 Two CFA studies have compared individuals with and without dementia. One report concluded a general factor was sufficient for people with and without dementia,10 while the other reported a multifactor fit for the two groups.11 There were a number of differences in the samples, measures, and statistical models used in the two studies that may have contributed to the discrepancy. In addition, the groups with dementia in these reports did not include autopsy-confirmed cases to exclude competing causes of dementia. In the current report we used a multigroup confirmatory process to investigate the presence of both multiple domains of cognition and a general factor common to participants without dementia and those with autopsy-confirmed Alzheimer disease (AD).
Archival data from four independent groups of participants (two DAT, two controls without dementia) were selected from volunteers enrolled in a longitudinal study of healthy aging and dementia. Sample 1 with dementia included all individuals with autopsy confirmation of AD (n = 115) without other pathology that could cause dementia enrolled from 1997 through March 2005. All autopsy-confirmed individuals included here also received a clinical diagnosis of CDR 0.5 (n = 17) or greater (n = 98) in life. On average the interval between these participants’ last assessment and death was 1.5 years. Sample 2 with dementia (n = 323) included individuals with clinical diagnoses of DAT enrolled prior to 1997 and includes individuals reported previously.1 Sample 1 without dementia (n = 191) was also enrolled between 1997 and March 2005. Sample 2 without dementia (n = 212) was enrolled prior to 1997 and includes individuals reported previously.1 The Washington University Human Studies Committee approved all procedures.
Experienced clinicians conducted semi-structured interviews with the participant and a knowledgeable collateral source (usually a spouse or adult child) at an initial visit and annually thereafter. The Clinical Dementia Rating (CDR) was used to determine the presence or absence of dementia and, if present, stage its severity.12 The CDR evaluates cognitive function in each of six categories (memory, orientation, judgment and problem solving, performance in community affairs, home and hobbies, and personal care) without reference to psychometric performance or results of previous evaluations. CDR 0 indicates no dementia, and CDR 0.5, 1, 2, or 3 correspond to very mild, mild, moderate, and severe dementia.
The CDR has high interrater reliability,13 is sensitive to clinical progression, and is highly predictive (93%) of autopsy-confirmed AD.14 The clinical diagnostic criteria for DAT used for this study are consistent with probable AD according to the National Institute of Neurological and Communicative Disorders and Stroke and AD and Related Disorders Association.15 Individuals with a CDR greater than 0 but clinical diagnoses of dementias other than DAT were excluded. Individuals with dementia with an initial CDR of 2 or greater were also excluded as these individuals often have difficulty completing psychometric assessment.
A 90-minute battery was administered annually to all participants approximately 2 weeks after clinical evaluation. This battery tests across multiple cognitive domains (i.e., semantic memory, episodic memory, working memory, and visuospatial ability). Tests include Information,16 Associate Learning,17 Boston Naming Test,18 Logical Memory,17 Benton Visual Retention Test: Form d-Copy,19 Digit Symbol,16 Trailmaking A,20 Block Design,16 Word Fluency for S and P,21 Mental Control,17 Digit Span Forward and Backward.17 Psychometricians were not informed of the results of the clinical evaluation.
Psychometric data from a single time of measurement were used for each participant. For those without dementia the first time of assessment was used provided that the CDR was also 0 at the following two assessments; otherwise the participant was excluded so as to eliminate people who potentially had a preclinical dementia. The data from the groups with dementia used in the analyses were from a time of assessment when participants had mild dementia (either their last CDR 0.5 or their first CDR 1). Prior to conducting the CFA all psychometric measures were standardized to z scores (collapsed over the four samples) and Trailmaking A was reverse scored so that a high score on all variables indicated good performance. Because age varied across patient types, every subtest was regressed on age at time of assessment and CFAs were conducted using these age-corrected residuals.
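The preprocessing described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy data, and the use of the sample standard deviation and ordinary least squares are assumptions made for illustration only.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize raw scores to mean 0 and SD 1 over the pooled sample.
    Assumption: the sample (n - 1) standard deviation is used."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]

def age_residuals(scores, ages):
    """Residuals from an ordinary least-squares regression of score on age."""
    mean_age, mean_score = mean(ages), mean(scores)
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (s - mean_score) for a, s in zip(ages, scores))
    slope = sxy / sxx
    intercept = mean_score - slope * mean_age
    return [s - (intercept + slope * a) for a, s in zip(ages, scores)]

# Hypothetical data: ages in years and a timed test in seconds (lower is better).
ages = [68, 72, 75, 79, 83, 88]
trail_a_seconds = [31, 35, 42, 40, 55, 61]

# Reverse-score the timed test so that a high value means good performance,
# then remove the linear effect of age.
trail_a_z = [-z for z in z_scores(trail_a_seconds)]
trail_a_adjusted = age_residuals(trail_a_z, ages)
```

By construction the residuals have mean 0 and are uncorrelated with age, which is what allows groups of different ages to be compared on the same factor model.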
All brains were examined with a standard protocol.22 Following fixation in neutral buffered 10% formalin, tissue blocks were taken from 30 brain regions. Sections (6 μm) from paraffin-embedded tissue blocks were stained with hematoxylin-eosin, Gallyas and modified Bielschowsky silver stains, and immunohistochemical methods.22 Histologic criteria for AD were based on quantification of diffuse and neuritic amyloid deposition in five cortical regions with 10 mm2 microscopic fields in each region22 as well as National Institute on Aging (NIA)-Reagan23 neuropathologic probability estimates of AD. The two sets of criteria have excellent agreement for intermediate and high probability of AD. Cases were screened for Lewy bodies (LBs) with antibodies to alpha-synuclein and were also examined for the presence of cortical and subcortical infarcts and hemorrhages to exclude confounding dementia diagnoses. It was important for the validation sample to only be composed of cases with “pure” AD. The presence of any other neuropathology was exclusionary for this study. Thus, the AD group did not contain any LBs (neocortical, amygdala, or brainstem) or vascular lesions.
CFA (AMOS v7.0, SPSS Inc., Chicago, IL) was conducted in a multistep process. First the relative fit of four candidate hypothetical models (figure 1) based on previous studies was examined to determine domain content, factor loadings, and factor interrelationships. Model A hypothesized a single general factor. Model B tested the hypothesis that there were three independent factors. Model C allowed the three factors to be correlated. A hybrid Model D allowed both three specific factors and a general first-order factor. Its purpose was to determine if a single model would satisfactorily represent performances by people with and without dementia. The factor analyses used a direct estimation approach based on a full information maximum likelihood algorithm to deal with missing data.
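One way to see how the four candidate models relate is to write them as restrictions on a single measurement equation. The notation below is an illustrative assumption and does not appear in the original report: $g$ is the general factor, $f_{s(i)}$ is the specific factor assigned to test $i$, and each test is assumed to load on only one specific factor. Whether $g$ is constrained to be uncorrelated with the specific factors is not stated in the text and is left open here.

```latex
\begin{aligned}
x_i &= \tau_i + \lambda^{g}_{i}\, g + \lambda^{s}_{i}\, f_{s(i)} + \varepsilon_i
  && \text{(score on test } i\text{)} \\
\Sigma &= \Lambda \Phi \Lambda^{\top} + \Theta
  && \text{(model-implied covariance of the tests)} \\[4pt]
\text{Model A:}\quad & \lambda^{s}_{i} = 0 \ \text{for all } i
  && \text{(general factor only)} \\
\text{Model B:}\quad & \lambda^{g}_{i} = 0,\ \operatorname{Cov}(f_s, f_t) = 0 \ (s \neq t)
  && \text{(three independent factors)} \\
\text{Model C:}\quad & \lambda^{g}_{i} = 0,\ \operatorname{Cov}(f_s, f_t) \ \text{free}
  && \text{(three correlated factors)} \\
\text{Model D:}\quad & \lambda^{g}_{i},\ \lambda^{s}_{i} \ \text{both free}
  && \text{(hybrid of general and specific factors)}
\end{aligned}
```

Here $\Lambda$ collects the loadings, $\Phi$ is the factor covariance matrix, and $\Theta$ is the covariance matrix of the errors $\varepsilon_i$, which is diagonal unless an error covariance is freed.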
The empirical validity of each theoretical model (i.e., how well it fit the observed data) was assessed using goodness-of-fit indices.24 Model selection was primarily based on differences in the root mean square error of approximation (RMSEA25), which is a measure of discrepancy between predicted and observed model values; values closer to 0 indicate better fit (preferred values < 0.09). In accordance with more recent guidelines, better fitting models were accepted when the change in the RMSEA (ΔRMSEA) was greater than 0.02.26 To provide a basis for comparison of current results with CFA indices used in other reports the log likelihood ratio test statistic (-2LL; smaller values indicate better fit) and the comparative fit index (CFI, values closer to 1 indicate better fit) were also calculated. Then we evaluated critical ratios (the ratio of an estimated parameter to its standard error) to refine the selected model for each group.27 This statistic indicates the relative contribution of a particular model parameter to the overall fit of the model. We only evaluated covariances, first among subtests and then among factors. Only when a covariance parameter was both theoretically motivated and produced a significant improvement to model fit (critical ratio > 1.96 and ΔRMSEA > 0.02) was it adopted.
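The fit indices and the acceptance rule described above can be sketched as follows. These are the common textbook single-group formulas, and the function names are hypothetical; AMOS may compute the indices differently, particularly with multiple groups or missing data.

```python
import math

def rmsea(chi_square, df, n):
    """Single-group RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

def cfi(chi_square, df, chi_square_null, df_null):
    """Comparative fit index relative to a null (independence) model."""
    numerator = max(chi_square - df, 0.0)
    denominator = max(chi_square - df, chi_square_null - df_null, 0.0)
    return 1.0 if denominator == 0 else 1.0 - numerator / denominator

def critical_ratio(estimate, standard_error):
    """Ratio of a parameter estimate to its standard error."""
    return estimate / standard_error

def accept_parameter(cr, rmsea_before, rmsea_after, theoretically_motivated,
                     cr_cutoff=1.96, delta_cutoff=0.02):
    """Adopt a parameter only if it is motivated by theory, its critical
    ratio exceeds the cutoff, and RMSEA improves by more than delta_cutoff."""
    improvement = rmsea_before - rmsea_after
    return (theoretically_motivated
            and abs(cr) > cr_cutoff
            and improvement > delta_cutoff)

# Values reported later in this article for the error covariance between the
# two episodic memory tests: CR = 5.27, RMSEA falling from 0.09 to 0.05.
accepted = accept_parameter(5.27, 0.09, 0.05, theoretically_motivated=True)
```

Requiring all three conditions together keeps a statistically large but theoretically unmotivated covariance from being added to the model.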
Next we used tests of invariance (TOI) to compare independent samples of the same patient group (i.e., sample 1 without dementia vs sample 2 without dementia; sample 1 with dementia vs sample 2 with dementia). TOI use a series of increasingly restrictive models to investigate how samples differ in factor structure and variance-covariance patterns. We began with tests of strong factorial invariance; both samples should have the same factor structure, the same factor loadings, and the same indicator intercepts (i.e., observed measures should have the same Y intercept in the two samples when the latent variable is 0). The next constraint added was the requirement of equal factor means. The final and most restrictive model added the requirements of equal factor variances and equal covariances among the factors. Lower models must be accepted before subsequent higher order solutions can be interpreted. The level at which the model fails to fit defines how the groups differ. Finally, we again used TOI to compare people with and without dementia. In this step we did not expect concordance across all levels of the model. We anticipated good fit at the measurement level, but poor fit was expected at higher levels. For example, the group with dementia should have lower factor means than the group without dementia. We did not make any predictions about the equality of the factor variances or the correlations among the specific factors across the two populations.
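The stepwise logic of the tests of invariance can be sketched as a walk up a ladder of increasingly restrictive models that stops at the first level where fit deteriorates. The function name, the RMSEA values, and the handling of boundary cases (a change of exactly 0.02 counts as a failure, and the first level is judged against the absolute cutoff of 0.09) are assumptions for illustration, not the authors' procedure.

```python
LEVELS = [
    "strong factorial invariance",
    "equal factor means",
    "equal factor variances and covariances",
]

def first_failed_level(rmsea_by_level, absolute_cutoff=0.09, delta_cutoff=0.02):
    """Return the first level at which the constrained model fails,
    or None if the samples are invariant at every level."""
    previous = None
    for level in LEVELS:
        current = rmsea_by_level[level]
        if previous is None:
            # The least restrictive level is judged on absolute fit.
            failed = current > absolute_cutoff
        else:
            # Later levels are judged on how much fit worsens; rounding
            # guards against floating-point noise at the boundary.
            failed = round(current - previous, 6) >= delta_cutoff
        if failed:
            return level
        previous = current
    return None

# Illustrative values only: two samples from the same population...
within_group = {level: 0.05 for level in LEVELS}
# ...and two populations that share a measurement model but differ in means.
between_groups = {
    "strong factorial invariance": 0.05,
    "equal factor means": 0.07,
    "equal factor variances and covariances": 0.08,
}

replication_failure = first_failed_level(within_group)
group_difference = first_failed_level(between_groups)
```

Stopping at the first failure reflects the requirement that lower levels be accepted before higher ones are interpreted, so the returned level describes how the groups differ.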
Demographic information and neuropsychological measures for each of the four groups are shown in table 1. The four groups were similar with respect to age and education except that the autopsy-confirmed AD group was older (t = 4.17, p < 0.001). Subtest comparisons within a similar clinical status were all nonsignificant with Bonferroni correction (all t < 2.38). All participants spoke English and lived in the greater St. Louis metropolitan area; 24 were African American, and the remainder were Caucasian.
As shown in table 2 and figure 2, Model D was the best-fitting candidate model for sample 1 without dementia (RMSEA = 0.04; −2LL = 51.4; CFI = 0.98). A general first-order factor and three correlated specific factors labeled Verbal Memory, Visuospatial Construction, and Working Memory are depicted in figure 2A with corresponding factor loadings of the 12 measures. The correlations among the three specific factors were 0.64, 0.60, and 0.28. Model D also provided a borderline acceptable fit for sample 1 with dementia (RMSEA = 0.09), although not better than Model C (three correlated specific factors), which also had an RMSEA of 0.09.
Because Model D fit the group without dementia best and fit the group with dementia as well as Model C did, we designated it as the new baseline model and used standard procedures of model respecification28 to assess whether any further model refinements would improve fit. Candidate respecifications were identified using critical ratios and accepted only if they were theoretically motivated and significantly improved overall model fit.26 Changes to Model D did not improve goodness-of-fit for the group without dementia. In the group with dementia, however, there was a high degree of correlation between the errors of the two tests of episodic memory (Logical Memory and Associate Learning; CR = 5.27). Allowing this additional error covariance in Model E (figure 2B) significantly improved the fit in the group with dementia (RMSEA = 0.05; −2LL = 52.5; CFI = 0.98).
Samples 1 and 2 of the participants without dementia and then samples 1 and 2 of the participants with dementia were compared to determine if the model configuration functioned similarly in independent samples from the same population. Model D was replicated in the samples without dementia at all levels of restriction (RMSEA = 0.05; ΔRMSEA = 0; −2LL = 229.7; CFI = 0.92). Model E was similarly replicated at all levels of restriction in the samples with dementia (RMSEA = 0.04; ΔRMSEA = 0; −2LL = 223.0; CFI = 0.96).
As shown in table 3 and figure 2, Models D and E were equivalent at the measurement level (RMSEA = 0.05; −2LL = 196.4; CFI = 0.93). The underlying constructs represented in Model D and Model E are the same for the groups with and without dementia. The factor loadings shown in Model D (figure 2A) did not differ from those observed for the group with dementia in Model E (figure 2B); the individual tests have the same relation to the four factors in people with and without dementia. The two models are equivalent in terms of factor means by the conventional criterion (RMSEA < 0.09)25 but not under more recent guidelines (ΔRMSEA = 0.02).26 The estimated means and standard deviations of the group without dementia served as the reference and were set to 0 and 1. For the sample with dementia, the mean on the general factor was significantly lower (−2.17) and the SD was slightly larger (1.19) than in the samples without dementia, but no significant mean differences were found for the three specific factors. The only difference in the correlations among the three specific factors was that between Verbal Memory and Working Memory (0.28 in the group without dementia vs 0.44 in the group with dementia).
A hybrid model made up of three specific factors and one general factor best characterizes the underlying cognitive structure in individuals with and without dementia. This was cross-validated in a larger independent sample. These findings permit accurate characterization of cognitive performance. The hybrid model combination of general and specific factors is consistent with findings that intellectual ability can best be described both as a general score and by specific mental abilities.23,24 The added complexity of the hybrid model accounted for more variance in the data and left less unexplained variance in the residual. This hybrid model resolves the conflicting results present in the literature2–9 and eliminates the necessity of allowing different factor structures for people with and without AD.
Models for the groups with and without dementia differed in that the errors of the two episodic memory tasks (Logical Memory and Associate Learning) were correlated in the group with dementia (Model E) but not in the group without dementia (Model D). This same pattern was found across validation samples. Tasks requiring effortful verbal encoding and immediate recall are preferentially affected by AD29,30 and variability in these scores reflects an episodic memory-specific degradation in this disease process.
There was a second, more general effect of dementia, reflected by a lower estimated mean in the group with dementia for the general factor but not for the three specific factors. The opposite was true for factor variances. Variability in the group with AD was significantly larger for the specific factors compared with only a slight increase in variance in the general domain. Thus performance on the specific factors was poor for some people, but not all. These results converge with the hypothesis that the AD process is multifactorial and may have differential effects varying by individual as well as the stage of the disease.7–9 This heterogeneity suggests that within the observed overall cognitive decline in AD there may exist discrete decrement profiles that mask one another when measured together. In addition to increased variance in the specific cognitive domains in AD, there was also a changed pattern of covariance between cognitive domains. Verbal memory ability was more strongly related to working memory. This is consistent with the observation that episodic and semantic memory deficits in AD reflect decreased working memory buffer size and efficiency.31
This study has both strengths and limitations. Clinical diagnoses were rendered by history, neurologic examination, and clinical interview, independent of neuropsychological test scores. This permitted us to avoid the circularity that arises when neuropsychological data are used both to classify cases and as an outcome measure. Although the test battery is broad in scope, it was originally designed 30 years ago at the start of the longitudinal study and therefore did not include more modern measures of attention, executive function, and working memory. It is possible that the inclusion of sufficient measures of these other cognitive domains may reveal more than the three specific domains reported here.
There is disagreement about what constitutes sufficient sample size for CFA, especially in clinical research where data collection is intensive and sample size often limited. Although larger sample sizes than those presented here are commonly used, recent modeling work32 indicates that the required sample size depends on data and model characteristics. Good-fitting models can be identified with relatively smaller sample sizes when reliable measures are used. The sample 1 sizes (115 and 191) proved adequate, consistent with Monte Carlo studies demonstrating that fit indices are sufficiently powered when sample size is greater than 100.33
The strengths of the study include the use of well-characterized longitudinal samples of participants to generate initial findings followed by replication of findings in a second independent set of well-characterized participants with similar diagnoses. The samples without dementia represent older adults who remained dementia free for at least two more assessments, lessening the possibility of misclassification. Likewise the sample with dementia is drawn from individuals with antemortem diagnoses of DAT and autopsy confirmation of AD, free from other pathologies (Lewy bodies, vascular lesions) that may influence cognitive performance.
This hybrid CFA model now allows us to explore clinical and cognitive performance across diverse clinical samples and eliminates the necessity of allowing unique factor structures for people with and without dementia. A second benefit of these models is that we are now able to compare cognitive abilities between different neurodegenerative disorders (e.g., AD vs Parkinson disease) using a common neuropsychological test battery. Since the tests used in our battery are similar to those in the Uniform Data Set34 neuropsychological battery for the National Institute on Aging AD Center Program, we may be able to apply similarly structured models (i.e., hybridized specific and general factors) to other research samples provided that the cognitive domains assessed are similar to the ones measured here.
The authors thank Dan W. McKeel, Jr., MD, and the Neuropathology Core for providing pathologic diagnoses.
Address correspondence and reprint requests to Dr. James E. Galvin, Alzheimer Disease Research Center, Washington University School of Medicine, 4488 Forest Park, Suite 130, St. Louis, MO 63108 galvinj/at/neuro.wustl.edu
Supported by National Institute on Aging grants P01 AG03991, P50 AG05681, and K08 AG20764; American Federation for Aging Research; and gifts from the Alan A. and Edith L. Wolff Charitable Trust.
Disclosure: The authors report no disclosures.
Received May 13, 2008. Accepted in final form August 20, 2008.