To investigate whether an abbreviated 5-item functional status survey consisting of 5 activities of daily living (ADLs) reflects changes measured over time in a full 12-item functional status survey (12 ADLs).
Longitudinal evaluation with mean follow-up of 11 months.
Two managed-care organizations in the United States.
420 community-dwelling elders at moderate to high risk of death and functional decline enrolled in the Assessing Care of Vulnerable Elders (ACOVE) observational study.
Number of ADL abilities by the short (range 0–5) and full functional status surveys (range 0–12). Change in function as defined by a 1-point change in short score and 1–2 point change in full survey scores.
Changes in short functional status survey scores were highly correlated with changes in long survey scores (r=.88). On average, a 1-point change in the short survey score was associated with a 1.4-point change in the long survey score (p<.001). The short survey agreed with the long survey's classification of decline in 93% of cases, with a chance-adjusted agreement of kappa=.82, and was responsive to decline in function (sensitivity 82–94%, specificity 94–97%, and area under the receiver operating characteristic curve 0.91–0.93 for 1–2 point decreases in full survey ADL counts).
The short functional status survey is an efficient way to detect changes in functional status among vulnerable older populations for clinical and research purposes.
Despite the importance of functional status as a prognostic indicator 1–4 and an influential factor in creating care plans for older persons,5–7 the rates of screening community-dwelling older persons for functional impairment by doctors and health care systems are poor.8–11 To facilitate screening, a short functional status survey was developed to minimize the time required to effectively identify older individuals with instrumental and basic activity of daily living (ADL) disabilities.12 Using cross-sectional data from the 1993 Medicare Current Beneficiary Survey (MCBS), this short function survey (shopping, light housework, managing finances, bathing, and walking across a room) correctly identified 93% of the MCBS subjects with any of 11 disabilities (shopping, light housework, meal preparation, managing finances, using the telephone, bathing, walking across a room, dressing, transferring, using the toilet, and feeding).
The short function survey has been incorporated into the Vulnerable Elders-13 Survey (VES-13),13 a risk-prediction tool that includes the short function survey and 8 other items from the MCBS (age, self-rated health, and 6 physical tasks, e.g., self-report ability to lift 10 pounds or grasping small objects). In a prospective study of older managed care patients, the VES-13 predicted death and development of functional status decline at one year.14, 15 Due to its efficient screening and predictive properties, the VES-13 has subsequently been used as a baseline measure of risk in various clinical and research settings. 16–22 For those screened by the VES-13, the short functional survey included in the VES-13 may serve as the baseline measure of their overall functional status.
One additional use of the short survey, therefore, would be to perform follow-up measurements to capture changes in functional status over time in either clinical or research settings. However, change in the short function survey score has not yet been validated as a measure of change in overall functional status. An instrument that reliably discriminates differences in health status between subjects in cross-sectional studies, or that predicts future change in health status, may not accurately measure change in health status over time and is often subject to ceiling or floor effects.23 Although several methods to evaluate responsiveness have been proposed (e.g., a “responsiveness index”),24, 25 we adopted a more clinically oriented approach proposed by Deyo and Centor26: that the responsiveness of functional status indices can be evaluated like that of diagnostic tests, where the proposed test should change if meaningful changes in functional status have occurred (sensitivity) and should not change if function remains stable (specificity).
Using data from the Assessing the Care of Vulnerable Elders (ACOVE) study 14, we analyzed whether a change in the short functional survey was responsive to change in function as measured by a longer 12-item instrumental and basic ADL survey that we considered as a gold standard. We also hypothesized that the short survey would have a floor effect because individuals with all 5 short survey disabilities at baseline could not decline any further on follow-up but could continue to accumulate additional disabilities on the long survey.
We used data from the Assessing the Care of Vulnerable Elders (ACOVE) study, which enrolled 420 older individuals (age ≥ 65) from 2 large managed-care organizations to participate in a 13-month measurement of quality of care.
The ACOVE target population was a group of elders who were at increased risk of functional decline and death compared to the general older population. Of 3207 community-dwelling plan members age 65 or older, 88% (n=2810) were successfully contacted by telephone and 90% of these (n=2521) agreed to be screened. Ten percent (n=243) did not meet ACOVE study inclusion criteria: not health plan members (n=54), elder or proxy unable to participate in screening due to poor health (n=18) or non-English speaking (n=122), or receiving cancer therapy (n=49). Twenty-one percent (n=475) of the remaining elders were identified as vulnerable by VES-13 criteria13, which required a total of three or more points from any of the following items: having any of the 5 short-survey impairments (four points for any impairments, zero points if no impairments), advanced age category (1 point for age 75–84; 3 points for age ≥ 85), having difficulty with physical tasks (1 point for each task up to 2 points), and fair or poor self-rated health (1 point). Forty-two percent of the sample received points for having a functional impairment. Of the 475 identified as vulnerable, 88% (n=420) consented to participate in a comprehensive assessment of the quality of their medical care, including a baseline enrollment interview and a follow-up interview (approximately one year later). The 55 who refused did not significantly differ from the 420 participants in terms of age, gender, or baseline VES-13 score.
Both baseline and follow-up interviews were administered by telephone by non-clinical personnel and included 12 instrumental and basic ADLs (bathing, walking across a room, dressing, transferring, using the toilet, feeding, shopping, light housework, meal preparation, managing finances, using the telephone, and medication management). Surrogate (proxy) respondents were allowed to answer on behalf of participants.
We defined disability in an activity as the respondent reporting “having difficulty and receiving help to perform the activity” or “not doing the activity due to their health”.13 We considered all other responses (“no difficulty”, “difficulty but do not receive help”, or “not doing the activity, but for reasons other than health”) to indicate functional ability.
In addition to the baseline and follow-up interviews, we also obtained mortality data from the National Death Index for an observation window that extended from enrollment to the day of the last follow-up interview. Statistical analysis was performed using Stata/IC 10.0 (StataCorp, Texas).
To calculate the short survey scores, we added one point for each ability on the short survey (shopping, light housework, managing finances, bathing, and walking across a room; possible range 0–5). To calculate the full survey scores, points for all 12 ADLs were considered (possible range 0–12).
The change in abilities between baseline and follow-up was calculated by subtracting the baseline abilities from the follow-up abilities so that the result was negative if there was a decline in function and positive if there was an improvement.
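The scoring and change-score rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' Stata code: the response labels, the dictionary layout, and the function names are assumptions introduced here.

```python
# Illustrative sketch; labels and names below are hypothetical, not from the study.
SHORT_ITEMS = (
    "shopping", "light housework", "managing finances",
    "bathing", "walking across a room",
)
FULL_ITEMS = SHORT_ITEMS + (
    "dressing", "transferring", "using the toilet", "feeding",
    "meal preparation", "using the telephone", "medication management",
)

# The two responses treated as disability; every other response counts as ability.
DISABLED_RESPONSES = {
    "difficulty and receives help",
    "does not do due to health",
}


def is_able(response):
    """Return True when the response counts as functional ability."""
    return response not in DISABLED_RESPONSES


def ability_count(responses, items):
    """Add one point for each listed activity the respondent is able to do."""
    return sum(1 for item in items if is_able(responses[item]))


def change_score(baseline, follow_up, items):
    """Follow-up minus baseline: negative means decline, positive means improvement."""
    return ability_count(follow_up, items) - ability_count(baseline, items)


# Example: a respondent who was fully able at baseline and lost two abilities.
baseline = {item: "no difficulty" for item in FULL_ITEMS}
follow_up = dict(baseline)
follow_up["bathing"] = "difficulty and receives help"
follow_up["dressing"] = "does not do due to health"

short_change = change_score(baseline, follow_up, SHORT_ITEMS)
full_change = change_score(baseline, follow_up, FULL_ITEMS)
```

In this example only bathing belongs to the short survey, so the short change score is -1 while the full change score is -2.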
We also categorized each patient according to a binary outcome (decline versus no decline) using the short and full survey. The primary definition of decline for each scale was a decrement in ADL count of 1 or more on that scale. An alternative, more “stringent” definition of functional decline (employed in prior work on functional decline 13, 15) was also considered for the full survey only: a decrement of 2 or more ADLs, or a decrement of any ADLs if independent in all ADLs at baseline. Next, we considered a trinomial outcome for each scale where subjects could decline (decline in ≥ 1 ADL), remain stable (no change in ADL count), or improve (improve in ≥ 1 ADL).
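As a sketch of how these outcome definitions translate into rules on the change scores, the following Python functions encode the primary, stringent, and trinomial definitions. The function names and the string labels are hypothetical and chosen here only for illustration.

```python
# Illustrative sketch; function names and labels are hypothetical.

def declined(change):
    """Primary definition: a decrement of 1 or more in the ADL count."""
    return change <= -1


def declined_stringent(baseline_full_score, full_change, n_items=12):
    """Stringent definition (full survey only): a decrement of 2 or more ADLs,
    or any decrement if the person was able in all ADLs at baseline."""
    if baseline_full_score == n_items:
        return full_change <= -1
    return full_change <= -2


def trinomial(change):
    """Three-level outcome based on the sign of the change score."""
    if change <= -1:
        return "decline"
    if change >= 1:
        return "improve"
    return "stable"
```

For example, a one-ADL loss counts as stringent decline for someone able in all 12 ADLs at baseline (`declined_stringent(12, -1)` is `True`) but not for someone who started with 10 abilities (`declined_stringent(10, -1)` is `False`).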
The relationship between the short and full survey change scores was evaluated with correlation and simple regression. We also used multivariate regression to control for age, gender, use of proxy respondent for baseline, follow-up, or both interviews.
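A minimal sketch of the correlation and simple regression step follows, written in plain Python rather than the Stata used in the study. The data are made-up toy values, not study data, and the function names are assumptions introduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ols_slope_intercept(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy change scores (hypothetical): short survey as predictor, full survey as outcome.
short_changes = [-2, -1, 0, 1, 2]
full_changes = [-3, -1, 0, 1, 3]

r = pearson_r(short_changes, full_changes)
slope, intercept = ols_slope_intercept(short_changes, full_changes)
```

The slope is read as the average change in the full survey count associated with a 1-point change in the short survey count.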
To evaluate how well the short and full surveys agreed with respect to binomial (decline versus no decline) and trinomial (decline, stable, versus improvement) categorical outcomes, we used percent agreement and kappa statistics, which adjust percent agreement for agreement by chance. We used a weighted kappa for the trinomial outcome (weight of 1 for agreement, weight of .5 for a one-degree disagreement, weight of zero for a two-degree disagreement). Responsiveness to decline was evaluated using sensitivity, specificity, and area under the receiver operating characteristic curve (AUC), where the long survey outcome was considered the “gold standard” and the short survey outcome was considered the “test”. To calculate sensitivity and specificity (and the AUC) using the trinomial outcome, we compared any change in functional status (decline or improvement) versus no change. We also tested a measure of decline that included death as part of the definition of functional decline. This commonly used approach for defining functional decline is consistent with our understanding of functional decline that occurs prior to death2, 27, 28 and allows us to consider outcomes of participants who would otherwise have been considered lost to follow-up. Thus, the resulting test characteristics would be useful for clinicians or researchers who wish to include death in the definition of functional decline.
Last, possible floor effects were considered by tabulating the change in full ADL counts only for the group of individuals with limitations in all 5 short survey ADLs at baseline, and we considered whether death occurred in these individuals during the observation window.
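The agreement and responsiveness statistics described above can be sketched as follows. This is an assumption-laden illustration rather than the study's Stata code: the table layout (rows are the short survey "test", columns are the long survey "gold standard"), the function names, and the counts in the examples are all hypothetical. The AUC line uses the fact that, for a test with a single cut point, the trapezoidal area under the empirical ROC curve equals the mean of sensitivity and specificity; whether the authors' software computed AUC this way is an assumption.

```python
# Illustrative sketch; names, table layout, and counts are hypothetical.

def weighted_kappa(table, weights):
    """Chance-adjusted agreement for a square table of counts.

    table[i][j] counts subjects placed in category i by the test and
    category j by the gold standard; weights[i][j] is the agreement
    credit (1 = full agreement, 0 = none).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = sum(
        weights[i][j] * table[i][j] for i in range(k) for j in range(k)
    ) / n
    expected = sum(
        weights[i][j] * row_tot[i] * col_tot[j]
        for i in range(k) for j in range(k)
    ) / n ** 2
    return (observed - expected) / (1 - expected)


# Identity weights give the ordinary (unweighted) kappa for a 2x2 table.
BINARY_WEIGHTS = [[1, 0], [0, 1]]

# Category order: decline, stable, improve. One-degree disagreements get
# half credit, two-degree disagreements get none.
TRINOMIAL_WEIGHTS = [[1, 0.5, 0], [0.5, 1, 0.5], [0, 0.5, 1]]


def sens_spec_auc(tp, fn, fp, tn):
    """Sensitivity, specificity, and single-cut-point AUC for a binary test."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Hypothetical 2x2 table: 80% raw agreement, 50% expected by chance.
kappa_example = weighted_kappa([[40, 10], [10, 40]], BINARY_WEIGHTS)
sens, spec, auc = sens_spec_auc(tp=8, fn=2, fp=1, tn=9)
```

With the hypothetical table above, raw agreement is 80% and chance agreement is 50%, so kappa is 0.6. Swapping in `TRINOMIAL_WEIGHTS` with a 3x3 table gives the weighted version.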
Of the 420 vulnerable elders in the baseline sample, 276 were re-interviewed after a mean follow-up time of 11 months (range: 9–14 months). Proxy respondents were interviewed for 98 (23%) of the baseline interviews and 67 (24%) of the follow-up interviews. The mean age was 81 years at baseline (n=420) as well as at follow-up (n=276). The baseline and follow-up samples were 35% and 33% male (p<.05), respectively. Thirty-three elders died during the observation window. At baseline, half of the elders were “fully functional” (i.e., could perform all twelve ADLs at baseline) and 57% could perform all five ADLs on the short survey. There were few elders at the “floor” of the functional status scales: less than one percent could not perform any of the twelve ADLs at baseline, whereas 4% could not perform any of the five short survey items.
The change in full survey scores ranged from a decline of 6 to an improvement in 8 abilities, with a median decline of zero abilities. The change in short survey scores ranged from a decline of 4 to improvement in 4 abilities, also with a median decline of zero abilities. The correlation between the two change scores was r=.88 (p<.001) (Figure 1).
Using simple regression, a change of 1 ADL on the short survey was associated on average with a change of 1.4 in the number of ADLs on the long survey (p<.001). In multivariate regression predicting full survey change scores, use of a proxy respondent on either the baseline or follow-up survey, respondent age, and gender were not significant predictors, and after their inclusion in the model the effect of short survey scores was unchanged.
Of the 276 elders evaluated for functional status by both baseline and follow-up interviews, the short survey identified 60 of the 66 decliners identified by the long survey. There was a 93% agreement, with excellent change classification reliability (Kappa=.82 [95% CI .73–.90]). The sensitivity of the “test” (short survey) for detecting decline by the “gold standard” (long survey) was 82% [70–90%], specificity 97% [94–99%], and the AUC was .89 [.85–.94] (Table 1). Adding deaths to the definition of decline improved the percent agreement to 94%, Kappa to .87 [.81–.93], sensitivity to 89% [81–94%], specificity to 97% [94–99%], and AUC to .93 [.90–.96]. When we increased the gold standard definition of decline to the “stringent” criteria (2-ADL decrement or 1-ADL decrement if no baseline impairments) and included deaths in the definition of decline, the sensitivity improved to 93% [85–97%] but specificity decreased to 94% [90–97%].
When considering improvement, no change, and decline as outcomes, the short survey identified 60 of the 66 decliners and 45 of the 63 who improved by the full survey. There was an 86% agreement, with good change score reliability (weighted Kappa=.79 [.73–.85]) (Table 1). The short survey had a sensitivity of 82% [70–90%] for change in either direction on the long survey, with 97% [94–99%] specificity, and the AUC was .89 [.85–.94]. Using deaths in the definition of decline improved the reliability (weighted Kappa=.83 [.78–.88]) and responsiveness (sensitivity 89% [81–94%], specificity 97% [94–99%], and AUC .93 [.90–.96]).
Of 420 elders who participated in the baseline survey, there were 18 elders who were unable to do any of the five short survey items at baseline and therefore could not decline any further on the follow-up short survey. Of these 18 elders, 7 improved or remained stable (agreement), 1 underwent further decline on the long follow-up survey (misclassified), 1 underwent further decline on the long survey but died shortly afterwards during the observation window (correctly classified if death was considered as decline), 6 died but did not participate in the follow-up interview (correctly classified if death is considered as decline), and 3 survived but could not be reached for the follow-up interview (could not be classified).
In this analysis of a short functional status survey, we found that the brief survey is responsive to changes in functional status during a 9–14 month longitudinal study of vulnerable elders. The short survey is moderately sensitive (82%) and highly specific (97%) when detecting functional changes as defined by this study (i.e., a one-ADL change). Sensitivity improved when death was included in the definition of decline, an approach that is based on known trajectories of functional decline prior to death.2, 27, 28 Sensitivity also increased when the definition of functional decline was made more stringent (a 2-ADL change in function), but at the expense of decreased specificity. Lastly, we did not find an appreciable floor effect, regardless of whether we considered death as a possible outcome.
This study adds to our understanding of this short functional survey, which was designed to increase the feasibility of including functional status items in initial screening assessments by decreasing the number of functional activity items. The short functional survey performs well when used to detect difficulties with instrumental and basic activities of daily living12 and when used within the VES-13 survey to detect future functional status decline and death. 15 The current analysis shows that the short survey can also be used to track change in functional status over time, reliably reflecting changes in ADL counts as measured by a longer 12-item survey. The reported test characteristics of the short survey can be used by those who wish to estimate sensitivity and specificity for capturing change in function over time. We estimate the time savings associated with a 5-item versus 12-item survey (2 minutes versus less than 5 minutes, based on experience with the VES-13 and other short interviews)13, 15, 29 is approximately 2 minutes, a substantial difference in light of a busy office-based practice.
The strength of this study is that various definitions of functional change were considered, which will allow adaptation to data collection efforts to track functional decline as well as improvement. We also considered the possibility that proxy respondents may report disabilities at higher rates than subject respondents, but we found no proxy effect in prediction of change scores. One limitation of our results is that our sample was selected based upon elders with VES-13 scores of 3 or higher, and therefore may only be generalizable to vulnerable elders (i.e., those who are older, have poorer self-rated health, or those with functional or physical impairments at baseline). For most clinical and research purposes, for example tracking functional changes within a clinic with older at-risk elders or in a longitudinal study of elders with baseline co-morbidities or physical impairments, we believe that this will not pose a serious limitation. We also measured change over a relatively short observation window. Lastly, because there is no accepted “gold standard” for functional status, which may also include other physical abilities or performance-based testing, our results should be interpreted in light of the limited definition of overall functional status (i.e., defined as the 12-item ADL inventory) employed in this study.
In conclusion, this short survey of functional status, which can be administered in a few minutes, can be used in future clinical and research data collection efforts to detect functional changes with good overall test characteristics.
We recognize the technical assistance of Patricia Smith. We thank Robin Hertz, PhD, senior director of outcomes research/population studies at Pfizer Inc, for her support.
Dr. Min is supported by a UCLA Mentored Clinical Scientist Development Program in Geriatrics (NIA-K12) award. This paper was accepted as a paper presentation for the 2008 American Geriatrics Society Annual Scientific Meeting, Washington, D.C.
Conflict of Interest Disclosures
The Assessing Care of Vulnerable Elders Study was supported by a contract from Pfizer Inc to RAND.
The editor in chief has reviewed the conflict of interest checklist provided by the authors and has determined that the authors have no financial or any other kind of personal conflicts with this paper.
Lillian Min: concept and design, analysis and interpretation of data, drafting and revision of manuscript, final approval of published manuscript.
David Reuben: interpretation of data, revision and final approval of manuscript
Neil Wenger: acquisition of subjects and data, interpretation of data, revision and final approval of manuscript
Debra Saliba: concept and design, acquisition of subjects and data, analysis and interpretation of data, revision and final approval of manuscript.
Sponsor’s Role: The funding source had no role in the design, analysis or interpretation of the study or in the decision to submit the results for publication.