A total of 1,199 participants met inclusion criteria: 574 non-demented participants, 436 with very mild dementia (CDR = 0.5) and 189 with mild dementia (CDR = 1). The mean age of the sample was 76.4 years, mean educational attainment was 14 years, 58% were women, and 11% were African American. Of the non-demented participants, one half (n = 287) were randomly assigned to the development sample, and the remainder (n = 287) to the validation sample. Demographic characteristics for each of the study samples are presented in the accompanying table.
Demographic characteristics of the study samples, mean ± S.D., or n (%)
Each individual PPT item was significantly correlated (p < 0.0001) with the total PPT score. Correlation coefficients ranged from 0.40 to 0.72, with items 5 (r = 0.72), 7 (r = 0.71), 8 (r = 0.72), and 9 (r = 0.63) correlating most highly with total PPT scores. None of the other individual PPT items had a correlation coefficient above 0.51.
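The item-total correlations described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names and the data layout (one list of item scores per participant) are assumptions, and the total is taken to be the simple sum of all item scores, so each item is included in the total it is correlated with.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def item_total_correlations(rows):
    """Correlate each item with the total score.

    rows: one list per participant holding that participant's item scores
    (hypothetical layout). The total is assumed to be the sum of all items.
    """
    totals = [sum(row) for row in rows]
    n_items = len(rows[0])
    return [pearson_r([row[j] for row in rows], totals) for j in range(n_items)]
```

Ranking the returned coefficients would identify the items most strongly related to the total score, which is the role items 5, 7, 8 and 9 play in the text.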
The ten 3-item models and the ten 4-item models that best predicted total PPT scores in the regression analyses had R² values ranging from 0.77 to 0.79 and from 0.84 to 0.86, respectively. Four of the 20 models were selected for further testing based on the extent to which the items were clinically meaningful (useful to assess physical function) and practical (easy to administer in an office). Additionally, each contained at least three of the four items with the highest Pearson product-moment correlations with total PPT score: item 5 (picking up a penny from the floor); item 7 (timed 50-foot walk); item 8 (the chair rise, i.e., sitting in and rising from a chair five times); and item 9 (the progressive Romberg test of standing balance, i.e., standing with feet in tandem, semi-tandem and side-by-side positions).
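One way to picture the model search described above is an exhaustive all-subsets regression, ranking every combination of a given number of items by the R² obtained when the total score is regressed on those items. The sketch below assumes ordinary least squares with an intercept and a total equal to the sum of all items; the text does not specify the regression procedure in this detail, and all function names are hypothetical.

```python
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(rows, y, items):
    """R^2 of an OLS regression of y on the selected item columns plus an intercept."""
    x = [[1.0] + [float(row[j]) for j in items] for row in rows]
    k = len(items) + 1
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

def best_subsets(rows, size, top=10):
    """Return the `top` item subsets of the given size, ranked by R^2 (highest first)."""
    totals = [float(sum(row)) for row in rows]  # assumed: total = sum of all items
    n_items = len(rows[0])
    scored = [(r_squared(rows, totals, c), c)
              for c in combinations(range(n_items), size)]
    return sorted(scored, reverse=True)[:top]
```

Calling `best_subsets(rows, 3)` and `best_subsets(rows, 4)` would yield two ranked lists analogous to the ten 3-item and ten 4-item models; the final choice among them still rests on clinical meaningfulness and practicality, which no R² ranking captures.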
A score was computed for each of the candidate mini-PPTs by summing the scores on the individual items. The correlation of each of these scores with the total PPT score, and the area under the ROC curve (AUC) testing the ability of each candidate score to discriminate participants classified by the PPT as “functional” from those who were not, were then calculated. As shown in the accompanying table, each of the candidate scales was highly correlated with the total PPT score, although the 4-item candidate scale correlated more strongly with total PPT score than each of the 3-item models (p < 0.0001). There were no significant differences between the correlations of the 3-item models with total PPT scores (p > 0.46). Likewise, the AUC was 0.90 or greater for each of the candidate scales. Although these AUCs indicate that each scale predicted functional status very well, there was a statistically significant difference in discriminative ability across the four candidate models (p < 0.0001): the AUC for the 4-item scale was significantly higher than that for each of the 3-item scales (p < 0.05), and the AUC for the candidate scale composed of items 7, 8 and 9 was significantly higher than that for the scale composed of items 5, 7 and 9. The remaining two pairwise comparisons were not significant.
Correlation with total PPT score, and AUC-ROC predicting functional status on the total PPT, for each candidate scale score
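The two steps behind these statistics, summing selected items into a candidate score and measuring how well that score separates functional from non-functional participants, can be sketched as below. The AUC is computed in its rank-based form (the probability that a randomly chosen functional participant scores higher than a randomly chosen non-functional one, with ties counted as one half). The function names are hypothetical, and the significance tests comparing AUCs between scales are not reproduced here.

```python
def candidate_score(row, items):
    """Sum the scores on the selected items for one participant."""
    return sum(row[j] for j in items)

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: candidate scale scores.
    labels: truthy for participants classified as functional on the full
    scale, falsy otherwise.
    """
    pos = [s for s, lab in zip(scores, labels) if lab]
    neg = [s for s, lab in zip(scores, labels) if not lab]
    # Each (functional, non-functional) pair contributes 1 if ranked
    # correctly, 0.5 if tied, and 0 otherwise.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale against which values of 0.90 or greater can be read.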
When the candidate scales were cross-validated in the non-demented validation sample, their correlations with total PPT score and their ability to classify functional status were similar to those in the development sample. As in the development sample, testing in the validation sample showed that the 4-item candidate scale correlated significantly better with total PPT score than each of the 3-item candidate scales (p < 0.0001). In predicting functional status, the 4-item scale also yielded higher AUCs than the 3-item candidate scales composed of items 7, 8 and 9 (p = 0.0072); items 5, 7 and 9 (p = 0.0044); and items 5, 8 and 9 (p = 0.0562), although the last comparison did not reach the 0.05 significance level.
Because of its higher correlations with the total PPT, its higher AUC values in predicting functional status, and the negligible effect of one additional item on administration time, the 4-item scale was selected for use as the mini-PPT. We then tested the ability of the mini-PPT to predict total PPT scores and functional status in the mildly demented samples. The correlation of the mini-PPT with total PPT scores was 0.90 (p < 0.0001) and 0.91 (p < 0.0001) in those with very mild (CDR = 0.5) and mild (CDR = 1) dementia, respectively. The AUC (95% confidence interval [CI]) was 0.93 (0.91–0.95) for the very mild dementia sample and 0.95 (0.93–0.98) for the mild dementia sample.
The ROC curve for the non-demented validation sample was examined to identify likely cutoff values on the mini-PPT that could be used to identify individuals who would score as functional on the original PPT. Based on inspection of this curve, as well as the ROC curves for the very mild dementia and mild dementia samples (not shown), cutoff scores of 11, 12 and 13 were judged likely to yield high values of both sensitivity and specificity in all samples, and were subjected to further testing. The prevalence values used in these analyses reflect the proportion of participants who scored 28 or above on the total PPT (i.e., functional) in each sample (non-demented validation = 0.57; very mild dementia = 0.43; mild dementia = 0.28).
ROC curve assessing the discriminative ability of the mini-PPT in predicting functional status, as determined using the original PPT, in the non-demented validation sample. Numeric labels indicate the mini-PPT score at that point on the ROC curve.
As shown in the accompanying table, use of a cutoff value of 12 or higher to classify a participant as having functional physical status resulted in the most favorable combination of sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) in each sample. Although a cutoff value of 11 produced higher sensitivity and NPV, this increase was accompanied by lower specificity and PPV. Likewise, a cutoff value of 13 was associated with greater specificity and PPV, but lower sensitivity and NPV. Use of a cutoff of 12 or greater also resulted in the most cases being correctly classified.
Statistics used in determining the optimal cutoff value on the mini-PPT for predicting functional status on the original PPT
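The cutoff statistics named above can be illustrated with a short sketch. It assumes that a mini-PPT score at or above the cutoff predicts functional status and that the reference classification is a total PPT score of 28 or above, as stated in the text; the function names are hypothetical. The second function shows how PPV and NPV follow from sensitivity, specificity and prevalence via Bayes' rule, which is why they vary with the proportion of functional participants in each sample.

```python
def cutoff_stats(mini_scores, total_scores, cutoff, functional_threshold=28):
    """Classification statistics for a given cutoff on the short scale.

    Assumes every denominator is non-zero, i.e. the sample contains both
    functional and non-functional participants and both predictions occur.
    """
    tp = fp = tn = fn = 0
    for mini, total in zip(mini_scores, total_scores):
        predicted = mini >= cutoff                 # predicted functional
        actual = total >= functional_threshold     # functional on full scale
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV from sensitivity, specificity and prevalence (Bayes' rule)."""
    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence))
    npv = (specificity * (1 - prevalence)) / (
        specificity * (1 - prevalence) + (1 - sensitivity) * prevalence)
    return ppv, npv
```

Evaluating `cutoff_stats` at cutoffs of 11, 12 and 13 in each sample would reproduce the kind of comparison reported here: lowering the cutoff trades specificity and PPV for sensitivity and NPV, and raising it does the reverse.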