This paper presents a simple method that builds on models reported by Weintraub and colleagues [2] to create a calculator that can provide NACC and ADC clinical researchers with a quick, efficient, and straightforward means to obtain a range of z-scores and percentile rank estimates for performance of subjects on the neuropsychological tests of the UDS. In addition, the method we present in this paper can be easily modified so that other researchers and clinicians may conduct their own linear regressions, obtain the necessary output, and create their own norms calculator for their specific site. Furthermore, in the absence of their own available data, researchers can apply this technique to other published data to derive demographically specific norms for a given sample. A generic calculator has been provided in the supplemental materials, which can be used as a template (Additional file 2).
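As a rough illustration of the step in which a site runs its own regression and obtains the necessary output, the sketch below fits a one-predictor ordinary least squares model in plain Python and returns the intercept, slope, and RMSE that a site-specific calculator would need. The function name `fit_simple_regression` and the data are invented for illustration and are not UDS values; a real analysis would more likely use a statistics package and several predictors.

```python
import math

def fit_simple_regression(x, y):
    """Ordinary least squares for y = b0 + b1 * x.

    Returns (b0, b1, rmse), where rmse is the square root of the
    residual sum of squares divided by (n - 2), because two
    parameters (intercept and slope) are estimated.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))
    return b0, b1, rmse

# Invented illustrative data (NOT UDS data): age and a test score.
ages = [60, 65, 70, 75, 80, 85]
scores = [29, 29, 28, 28, 27, 26]
b0, b1, rmse = fit_simple_regression(ages, scores)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, RMSE = {rmse:.3f}")
```

The three returned values are the inputs a norms calculator of this kind would store for each test and model.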
We estimated a range of z-scores for individual performance on UDS neuropsychological tests by using the coefficients (βs) for demographic variables (predictors) from the multivariate (MV) and univariate (UV) linear regression models provided by Weintraub and colleagues [2], as well as the corresponding model RMSE terms, for test scores of over 3,000 clinically cognitively normal subjects. In employing the RMSE, we leveraged two assumptions that are made when testing the significance of predictors in a regression: 1) that the distribution of the residuals around the estimate is normal and 2) that the distribution of the residuals is homoscedastic. The RMSE is an approximately unbiased estimate of the standard deviation of the residuals and, therefore, may provide a reasonable estimate of the spread of scores across changes in the predictor variable. For example, if one were to perform a simple linear regression and use age as the sole predictor for the MMSE score, one would assume that the spread of the errors between the predicted and the actual MMSE scores is the same across different ages. This estimate in turn provides a measure of the average deviation for any age, and can be substituted for the conventional standard deviation. This approach can then be extended to any simple or multiple regression model to provide an estimate of the standard deviation around the various theoretical population means.
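The calculation described above can be sketched as follows: the predicted score comes from the regression coefficients, the z-score is the observed score minus the predicted score divided by the RMSE, and the percentile follows from the standard normal distribution. The function name `regression_z_score`, the coefficients, and the RMSE below are hypothetical placeholders chosen for illustration; they are not the values published in [2].

```python
from statistics import NormalDist

def regression_z_score(observed, intercept, betas, predictors, rmse):
    """Return (predicted, z, percentile) for one subject.

    The RMSE stands in for the standard deviation, which relies on
    the normality and homoscedasticity assumptions stated in the text.
    """
    predicted = intercept + sum(betas[name] * predictors[name] for name in betas)
    z = (observed - predicted) / rmse
    percentile = 100 * NormalDist().cdf(z)
    return predicted, z, percentile

# Hypothetical model parameters (assumptions, NOT taken from [2]).
intercept = 30.0
betas = {"age": -0.03, "education": 0.10, "male": -0.20}
rmse = 1.3

# Hypothetical subject: 80-year-old man, 12 years of education, score of 27.
subject = {"age": 80, "education": 12, "male": 1}
predicted, z, pct = regression_z_score(27, intercept, betas, subject, rmse)
print(f"predicted = {predicted:.2f}, z = {z:.2f}, percentile = {pct:.1f}")
```

A univariate or unconditional model is the same calculation with fewer (or no) entries in `betas`.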
A point of long debate during local ADC/ADRC UDS consensus conferences is whether an individual who performs in the below-average range on one or more neuropsychological tests or the MMSE has performed in the impaired range or the low-average range. Because the battery contains slight variations in administration procedures and modifications to some of the original measures, and because the subjects are not reflective of the national population [2], most norms available for wide clinical use do not apply. This leaves UDS researchers with few practical resources to assess performance of subjects on neuropsychological domains other than summary data and models from Weintraub et al. [2] and local or national summary statistics that function much like the unconditional (UC) model shown here [13]. However, our example highlights an important point for subjects whose performance falls near the peripheries. The hypothetical subject's performance of 27 on the MMSE is estimated at the sixth percentile relative to clinically cognitively-normal subjects in the UDS, without considering the individual's sex, age, or education (that is, the UC model); it is more than 1.5 SD below the mean and would be perceived to be in the mildly impaired range. However, if other models are used that take into consideration the individual's sex, age, and/or education, his performance is then estimated at the 8th percentile with the sex-conditional (UVSEX) model, at model-specific percentiles with the age-conditional (UVAGE) and education-conditional (UVEDUCATION) models, or as high as the 20th percentile with all covariates considered (that is, the MV model). In this specific example, considering any single demographic variable (sex, age, or education) results in a change in perception of the subject's performance from being more than 1.5 SD below the mean to falling in the range of -1.5 to -1.0 SDs. Finally, considering all demographic covariates in the MV model results in a finding that the subject has not performed in the mildly impaired range but in the low-average range of -1.0 to -0.5 SDs. The variation in clinical classification, depending on which normative considerations are made, becomes even more relevant to MCI and AD diagnosis when considering performance on memory-specific measures. If, for example, a 60-year-old male subject who is highly educated (for example, 20 years of education) recalled four story units on delayed recall after a 25-minute delay (that is, LMIIA = 4; Table ), performance is estimated at the 2nd percentile in the UC model, representing mildly impaired performance, between the < 1st and 3.4th percentiles for the UV models, and at the 0.8th percentile for the MV model, representing performance in the severely impaired range. Such differences may have important implications for cross-sectional and longitudinal classifications that are made on the basis of percentile or categorical thresholds, such as performance that is sufficiently impaired to meet MCI or AD criteria. In addition, use of the same model for determining performance on measures is critical for accurately modeling and assessing a patient's functioning across time (that is, to determine progression of cognitive functioning).
An example where performance falls in normal or impaired range depending on demographic adjustment/model used
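To make the link between the percentiles and the SD ranges in the MMSE example explicit, the following sketch converts a percentile to a z-score with the inverse standard normal distribution and assigns it to one of the SD bands mentioned in the text. The helper names `percentile_to_z` and `sd_band` are hypothetical, and the band labels are plain numeric ranges rather than an official classification scheme.

```python
from statistics import NormalDist

def percentile_to_z(percentile):
    """Convert a percentile rank (0-100, exclusive) to a standard normal z-score."""
    return NormalDist().inv_cdf(percentile / 100)

def sd_band(z):
    """Assign a z-score to the SD ranges discussed in the text."""
    if z < -1.5:
        return "below -1.5 SD"
    if z < -1.0:
        return "-1.5 to -1.0 SD"
    if z < -0.5:
        return "-1.0 to -0.5 SD"
    return "above -0.5 SD"

# Percentiles quoted in the text for the hypothetical MMSE score of 27.
for label, pct in [("UC", 6), ("UVSEX", 8), ("MV", 20)]:
    z = percentile_to_z(pct)
    print(f"{label}: percentile {pct}, z = {z:.2f}, band: {sd_band(z)}")
```

Under the normality assumption, the same observed score moves across band boundaries purely because the reference model changes.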
The intended use of this calculator and any normative data used to inform assessment decisions is to provide objective data on an individual's performance relative to a group of people of similar backgrounds, but it does not replace the clinician's judgment, and, as with all statistical procedures, individual variability occurs. Clinical judgment should include a consideration of the objective test data, as well as the specific observations of the given individual being assessed. It is possible that the different percentiles obtained for different tests within the same domain are due to variability in the sensitivities across neuropsychological tests; it is also possible that individual variability of the examinee can produce this variability. As is the case in any statistically-derived estimate of normative performance, there is inherent error in our ability to predict performance at the individual level.
This study has several strengths and benefits: measurement estimates that are representative of the NACC UDS and ADC/ADRC populations; methods and models that are straightforward, intuitive, and tested on a large sample of well-characterized subjects; and the provision of a simple and practically useful tool for UDS clinical researchers that builds on and complements available NACC-ADC/ADRC resources.
The study's results and approach also have several inherent limitations and caveats. First, as stated in Weintraub et al.'s original article [2], the majority of UDS participants are White, non-Hispanic, highly educated, and have few additional medical or psychiatric illnesses. Therefore, the application of this calculator may be best suited for individuals reflective of these characteristics. For example, if we were to compare our previous illustrative MMSE score to the MMSE normative information provided by Crum and colleagues [14], where the mean and standard deviation for a person with 12 years of education and 80 years of age is 25 ± 2.3, we would determine that the subject had a z-score of 0.87, fell in the 82nd percentile, and has performed in the "high-average" range. Therefore, it is imperative that the context in which this calculator is used be one in which the subject shares similar demographics to those within the UDS sample.
The second potential limitation is the use of the RMSE in deriving z-scores. Although flexible in its application, the RMSE is calculated with the assumption that error variance is homoscedastic across changes in the predictor variable. While these regressions were performed in Weintraub et al. [2], this assumption may not hold in all instances. For example, as a cohort's age increases, the range of the cohort's scores on certain tests (for example, TMT B) also increases; this can weaken the assumption of homoscedasticity [15]. Therefore, z-score estimates for individuals who fall at the ends of the age range (that is, 60 or younger and 90 or older) may be relatively less informative. For example, if a 58-year-old were to truly perform in the mildly impaired range on TMT B compared to same-aged peers, this relatively poor performance may be masked because the overall range of scores would be overestimated due to the inclusion of the older cohort in estimating the RMSE, leading to a less severe interpretation. Conversely, a 95-year-old's seemingly low or impaired performance on TMT B may simply be an exaggeration due to an underestimation of her performance or to a restricted estimate of the range that results from including the younger cohort's scores in calculating the RMSE. Because of this potential for under- or over-estimation, scores for individuals falling at the tail ends of the age distribution should be interpreted with caution. It is possible to develop other models that specifically model differences in variance across covariates (for example, age) to compare covariate-specific effects on estimated norms between models. However, in this paper we aimed to make use of the best available published UDS baseline model parameters (from Weintraub et al. [2]) to produce an estimated norms calculator of practical use to specific researchers (that is, UDS clinician researchers), as well as methods that are simple to implement and generalizable to other datasets; in doing so we chose practicality, utility, simplicity, and generalizability over developing de novo models with greater complexity but potentially improved accuracy. The latter can be explored in future studies by developing more complex models and leveraging additional UDS data.
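The masking and exaggeration effects described for the youngest and oldest subjects can be illustrated numerically. The sketch below assumes two equally sized cohorts with different residual standard deviations, pools them into a single RMSE-like value, and compares the z-score obtained with the pooled value against the one obtained with the cohort-specific value. All numbers, the cohort labels, and the function names are hypothetical and are not estimates from UDS data.

```python
import math

def pooled_sd(group_sds):
    """Pooled SD for equally sized groups: square root of the mean variance."""
    sds = list(group_sds)
    return math.sqrt(sum(sd ** 2 for sd in sds) / len(sds))

# Hypothetical residual SDs in seconds on a timed test (assumed values).
sd_by_cohort = {"younger": 20.0, "older": 60.0}
rmse = pooled_sd(sd_by_cohort.values())

def compare_z(residual, cohort):
    """Return (z using the cohort-specific SD, z using the pooled RMSE).

    `residual` is the number of seconds slower than the predicted time.
    """
    return residual / sd_by_cohort[cohort], residual / rmse

young_specific, young_pooled = compare_z(40.0, "younger")
old_specific, old_pooled = compare_z(60.0, "older")
print(f"younger: cohort-specific z = {young_specific:.2f}, pooled z = {young_pooled:.2f}")
print(f"older:   cohort-specific z = {old_specific:.2f}, pooled z = {old_pooled:.2f}")
```

In this toy setting the pooled value shrinks the younger subject's deviation and inflates the older subject's, which is the direction of bias the paragraph warns about.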
Finally, these models were developed based on subjects who were deemed to be clinically cognitively-normal at their first UDS visit; yet approximately 20% of the subjects had one or more neuropsychological test scores that were deemed impaired or lower than expected. This does not preclude the possibility that a substantial portion of these subjects, all of whom were initially deemed clinically cognitively-normal, may, when followed longitudinally, ultimately manifest clearer deficits on subsequent UDS visits or meet the newly proposed NIA-AA research criteria of Sperling and colleagues [3] for pre-clinical AD, MCI, or dementia. Inclusion of these subjects would be expected to produce even more conservative estimates of "abnormality". The calculation of such "robust norms" is important and is currently underway by Ferris and colleagues (S. Ferris, oral/written communication, October 2010). Future directions include developing a UDS norms calculator that uses age-specific standard deviations instead of the RMSE to obtain standardized scores that are more sensitive to age-related changes in the range of scores across age cohorts.