The data presented here are not intended to support conclusions about any one instrument. Importantly, all three measures (ADAS, CDR sum of box scores, MMSE) described here have extensive theoretical and/or practical histories; none of the three can be legitimately claimed to be the ‘truest’ measure of dementia or cognitive impairment. The correlation coefficients for the three scores within each cohort are presented in , , with the respective scatterplots they summarize. Differences in strength of association were not subjected to inference tests because our analyses sought independent replication of results in the two cohorts (irrespective of their similarities or differences).
Table presents the descriptive statistics for the demographic variables and cognitive impairment (unstandardized) test scores themselves, as well as for the pairwise differences in the standardized scores (i.e., the y axes in BA plots) and other background variables.
The data in table suggest that the NS cohort had a higher average ADAS score than the PR cohort (this difference was not tested for significance); the coefficient of variation [CV = (standard deviation/mean) × 100%] for ADAS scores in the PR cohort was nearly double that in the NS cohort, and the PR cohort CV for MMSE was 1.3 times that of the NS cohort (data not shown). Since the MMSE and CDR sum of boxes scores are similar in the two groups, the differential correlations of ADAS with the other scores may be explained by the difference in the group CV (these CV values themselves are not shown, but can be obtained given the data in table ).
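The CV comparison described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name and the score lists are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Return the CV as a percentage: (sample SD / mean) x 100."""
    return stdev(scores) / mean(scores) * 100.0


# Hypothetical scores for illustration only (not study data).
cohort_a = [10, 20, 30, 40, 50]
cohort_b = [28, 29, 30, 31, 32]

cv_a = coefficient_of_variation(cohort_a)
cv_b = coefficient_of_variation(cohort_b)

# Ratio of the two CVs, analogous to the "nearly double" and "1.3 times"
# comparisons reported in the text.
cv_ratio = cv_a / cv_b
print(f"CV A = {cv_a:.1f}%, CV B = {cv_b:.1f}%, ratio = {cv_ratio:.2f}")
```

Because both hypothetical cohorts have the same mean, the ratio of their CVs here reflects only the difference in spread.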
We fit a one-factor factor analysis solution to the raw scores to determine whether it was reasonable to conclude that the same underlying construct was assessed with the three instruments. Velicer's minimum average partial procedure [21, 22] indicated that a single factor explained the majority of variability in the correlation matrices of the three scores, suggesting that variability in all instruments can be explained by the same single underlying construct.
The one-factor solution explained 77.66% of the variance in the three scores in the PR cohort and 70.11% of the variance in the NS cohort. With three scores, no fit statistics are calculable [24]. Just 22% of the variance in the PR cohort and 30% of the variance in the NS cohort were left unexplained by the single factor. This suggests that all three instruments share the common factor, which we interpret roughly as ‘cognitive impairment’. Loadings of total scores on this factor are shown in table .
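To illustrate how a single factor can account for most of the variance in three correlated scores, the sketch below computes the share of variance carried by the largest eigenvalue of a 3 × 3 correlation matrix. Two assumptions apply: principal-component extraction is assumed (the extraction method is not stated here), and the Spearman coefficients reported in this section are used as a stand-in for the correlation matrix that was actually analyzed. The output therefore approximates, but is not expected to reproduce exactly, the 77.66% and 70.11% reported above. All function names are hypothetical.

```python
def largest_eigenvalue(matrix, iterations=200):
    """Estimate the dominant eigenvalue of a symmetric matrix by power iteration."""
    n = len(matrix)
    vec = [1.0] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    mv = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(vec[i] * mv[i] for i in range(n))


def first_component_share(r_adas_mmse, r_adas_cdr, r_mmse_cdr):
    """Proportion of total variance carried by the first principal component."""
    corr = [
        [1.0, r_adas_mmse, r_adas_cdr],
        [r_adas_mmse, 1.0, r_mmse_cdr],
        [r_adas_cdr, r_mmse_cdr, 1.0],
    ]
    # The trace of a 3 x 3 correlation matrix is 3 (total standardized variance).
    return largest_eigenvalue(corr) / 3.0


# Spearman coefficients quoted in the text, used here only as a stand-in.
pr_share = first_component_share(0.757, 0.610, 0.611)
ns_share = first_component_share(0.600, 0.472, 0.607)
print(f"PR: {pr_share:.1%} of variance, NS: {ns_share:.1%} of variance")
```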
Figure shows strong and positive correlation between baseline ADAS values (y axis in fig. ) and (reversed) baseline MMSE scores (x axis in fig. ) in the PR cohort (Spearman's r = 0.757); a similar association is reflected in the NS cohort (fig. ; Spearman's r = 0.600).
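The role of reversing the MMSE before correlating it with the ADAS can be sketched as follows. The reversal used here (30 minus the MMSE total) is an assumption made for illustration, since the transformation is not specified in this section, and the scores are hypothetical. Reversal aligns the direction of the two scales (higher = more impaired), so the rank correlation changes sign but not magnitude.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for illustration only (not study data).
adas = [12, 25, 18, 30, 22, 15]
mmse = [27, 18, 24, 20, 21, 26]
mmse_reversed = [30 - m for m in mmse]  # assumed reversal: higher = more impaired

rho_reversed = spearman(adas, mmse_reversed)
rho_raw = spearman(adas, mmse)
print(f"reversed MMSE: {rho_reversed:.3f}, raw MMSE: {rho_raw:.3f}")
```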
Figure shows the BA plots of ADAS and MMSE scores. The BA plots show the value of the difference between the two variables (or methods of measurement) on the y axis and the mean of the two variables on the x axis. Since the instruments are standardized, the vertical reference lines at zero in each plot show the average level of cognitive impairment in each cohort. The horizontal reference lines show the difference between the two measures with the zero point, indicating perfect agreement, flanked by lines 1 SD of difference away from the zero point (y axis).
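The quantities plotted in a BA plot of standardized scores can be sketched as below. This is a minimal illustration under stated assumptions: standardization is taken to be a within-cohort z-score using the sample standard deviation, the function names are hypothetical, and the input scores are invented for the example.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize values to mean 0 and sample SD 1 (assumed standardization)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def bland_altman(scores_a, scores_b):
    """Return x values (pair means), y values (pair differences), and the
    horizontal reference lines at -1 SD, 0, and +1 SD of the differences."""
    za, zb = zscores(scores_a), zscores(scores_b)
    means = [(a + b) / 2 for a, b in zip(za, zb)]
    diffs = [a - b for a, b in zip(za, zb)]
    sd = stdev(diffs)
    return means, diffs, (-sd, 0.0, sd)


# Hypothetical scores for illustration only (not study data).
adas = [12, 25, 18, 30, 22, 15]
mmse_reversed = [3, 12, 6, 10, 9, 4]

x_vals, y_vals, ref_lines = bland_altman(adas, mmse_reversed)
print("reference lines (y axis):", [round(v, 3) for v in ref_lines])
```

Because both inputs are standardized within the cohort, the x values average to zero, which is why the vertical reference line at zero marks the cohort's average level of impairment.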
Figure shows that for greater levels of cognitive impairment (as x increases) the spread of points around the y = 0 line increases, suggesting that MMSE and ADAS scores agree less in persons with greater than average levels of cognitive impairment. The means-difference (BA) plots for both cohorts suggest that disagreement in ADAS and MMSE scores is not symmetric, and this asymmetry is associated with the best estimate of the underlying level of cognitive impairment (i.e., the mean of two assessments, x axis).
Figure shows the scatter for standardized values of ADAS and CDR sum of box scores in the two cohorts at their respective baseline visits. These figures generally reflect a weaker association between ADAS and CDR [Spearman's r = 0.610 (PR) and 0.472 (NS)] than was observed between ADAS and MMSE scores within the same cohort. The shapes and dispersal patterns in these scatter plots suggest that for lower CDR box scores there is a tighter correspondence with ADAS. The dispersal is greater for higher levels of both scores.
The BA plots in figure reflect the same increasing disagreement between ADAS and CDR that was observed in the comparison of ADAS and MMSE as cognitive impairment increases. ADAS and CDR agree more for average or below-average levels of impairment and their agreement is worse for persons with above-average levels of impairment (i.e., x axis values).
Scatter plots of our transformed MMSE scores and CDR sum of box scores are shown in figure . The correlation coefficients are similar for these two cohorts, suggesting a strong and positive association between reversed MMSE and CDR sum of box scores [Spearman's r = 0.611 (PR) and 0.607 (NS)], stronger than that observed for the other two pairs of scores. The BA plots in figure show the same pattern as was observed for the other two pairs of assessments, namely, as impairment increases (according to the average of MMSE and CDR), the agreement between these instruments decreases.
For average and below-average levels of impairment, the MMSE and CDR sum of boxes have the best agreement of the three pairs of comparisons; the majority of observed difference values fall within one standard deviation of the difference values.
Results of Levene's test of equality of variance [25] for the differences (y axis values) between groups defined by the average level of impairment (i.e., x axis values <0 vs. x axis values ≥0) were consistent across both cohorts for all comparisons except one. After correcting p values for multiple comparisons (three per cohort), significantly greater variability (i.e., poorer agreement) was observed for higher levels of impairment (x axis values ≥0) relative to lower levels (x axis values <0). The one nonsignificant comparison was the comparison of variances for the combination of CDR and MMSE in the prednisone cohort, with unadjusted p = 0.065. Thus, the agreement between standardized scores on these three cognitive and clinical assessments suggests that they cannot be considered interchangeable when greater-than-average impairment is present.
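The variance comparison described above can be sketched as follows. The sketch computes the Levene W statistic (absolute deviations from the group mean) for BA differences split at x = 0; the p value, which would be obtained by referring W to an F distribution with (k − 1, N − k) degrees of freedom, is not computed here. The Bonferroni adjustment shown is an assumption, since the correction method is not named in the text, and all function names and data are hypothetical.

```python
from statistics import mean


def levene_w(groups):
    """Levene W statistic using absolute deviations from each group mean."""
    z = [[abs(v - mean(g)) for v in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand = mean([v for g in z for v in g])
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return (n_total - k) / (k - 1) * between / within


def split_by_impairment(means, diffs):
    """Split differences into below-average (x < 0) and at/above-average (x >= 0)."""
    low = [d for m, d in zip(means, diffs) if m < 0]
    high = [d for m, d in zip(means, diffs) if m >= 0]
    return low, high


def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p value (assumed correction method), capped at 1."""
    return min(1.0, p_value * n_comparisons)


# Hypothetical BA plot coordinates for illustration only (not study data).
x_means = [-1.0, -0.5, -0.2, 0.1, 0.6, 1.2]
y_diffs = [0.1, -0.1, 0.0, 0.9, -1.1, 0.2]

low, high = split_by_impairment(x_means, y_diffs)
w_stat = levene_w([low, high])

# Under the Bonferroni assumption, the reported unadjusted p = 0.065 with
# three comparisons per cohort would be adjusted as follows.
adjusted_p = bonferroni(0.065, 3)
print(f"W = {w_stat:.3f}, adjusted p = {adjusted_p:.3f}")
```

A larger W indicates a greater difference in spread between the two groups; two groups with identical spread yield W = 0.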