We used data from the Assessing the Care of Vulnerable Elders (ACOVE) study, which enrolled 420 older individuals (age ≥ 65) from 2 large managed-care organizations to participate in a 13-month measurement of quality of care.
The ACOVE target population was a group of elders who were at increased risk of functional decline and death compared to the general older population. Of 3207 community-dwelling plan members aged 65 or older, 88% (n=2810) were successfully contacted by telephone and 90% of these (n=2521) agreed to be screened. Ten percent (n=243) did not meet ACOVE study inclusion criteria: not health plan members (n=54), elder or proxy unable to participate in screening due to poor health (n=18) or non-English speaking (n=122), or receiving cancer therapy (n=49). Of the remaining elders, 21% (n=475) were identified as vulnerable by VES-13 criteria [13], which required a total of 3 or more points from any of the following items: having any of the 5 short-survey impairments (4 points for any impairments, 0 points if no impairments), advanced age category (1 point for age 65–74; 2 points for age ≥ 75), having difficulty with physical tasks (1 point for each task up to 2 points), and fair or poor self-rated health (1 point). Forty-two percent of the sample received points for having a functional impairment. Of the 475 identified as vulnerable, 88% (n=420) consented to participate in a comprehensive assessment of the quality of their medical care, including a baseline enrollment interview and a follow-up interview approximately one year later. The 55 who refused did not differ significantly from the 420 participants in terms of age, gender, or baseline VES-13 score.
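The screening rule described above can be sketched in a few lines of Python. This is a minimal illustration that uses only the point values as stated in this section; the function and argument names are hypothetical and are not taken from the study's own software.

```python
def ves13_score(age, n_physical_task_difficulties, fair_or_poor_health,
                any_short_survey_impairment):
    """Screening score using the point values stated in the text above.

    All names here are illustrative assumptions, not the study's code.
    """
    if age < 65:
        raise ValueError("screening applied only to members aged 65 or older")
    score = 4 if any_short_survey_impairment else 0   # 4 points for any impairment
    score += 2 if age >= 75 else 1                    # 1 point for 65-74, 2 for >= 75
    score += min(n_physical_task_difficulties, 2)     # 1 point per task, capped at 2
    score += 1 if fair_or_poor_health else 0          # 1 point for fair/poor health
    return score


def is_vulnerable(score, threshold=3):
    """A total of 3 or more points identifies a vulnerable elder."""
    return score >= threshold
```

Under these point values, any respondent with at least one short-survey impairment exceeds the threshold regardless of the other items.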
Both baseline and follow-up interviews were administered by telephone by non-clinical personnel and included 12 instrumental and basic ADLs (bathing, walking across a room, dressing, transferring, using the toilet, feeding, shopping, light housework, meal preparation, managing finances, using the telephone, and medication management). Surrogate respondents were allowed to answer on behalf of participants.
We defined disability in an activity as a report of “having difficulty and receiving help to perform the activity” or “not doing the activity due to their health” [13].
We classified all other responses (“no difficulty”, “difficulty but do not receive help”, or “not doing the activity, but for reasons other than health”) as functional ability.
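The response classification above amounts to a two-way mapping of five response options. A minimal sketch follows; the response codes are hypothetical labels chosen for illustration, since the actual interview coding is not given here.

```python
# Hypothetical response codes standing in for the five interview options.
DISABILITY_RESPONSES = {
    "difficulty_with_help",      # difficulty and receiving help
    "not_done_due_to_health",    # not doing the activity due to health
}
ABILITY_RESPONSES = {
    "no_difficulty",
    "difficulty_no_help",        # difficulty but does not receive help
    "not_done_other_reason",     # not doing it, for reasons other than health
}


def has_ability(response):
    """Return True for functional ability, False for disability."""
    if response in DISABILITY_RESPONSES:
        return False
    if response in ABILITY_RESPONSES:
        return True
    raise ValueError(f"unrecognized response code: {response!r}")
```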
In addition to the baseline and follow-up interviews, we obtained mortality data from the National Death Index for an observation window that ran from enrollment until the day of the last follow-up interview. Statistical analysis was performed using Stata/IC 10.0 (StataCorp, Texas).
To calculate the short survey scores, we added one point for each ability on the short survey (shopping, light housework, managing finances, bathing, and walking across a room; possible range 0–5). To calculate the full survey scores, points for all 12 ADLs were considered (possible range 0–12).
The change in abilities between baseline and follow-up was calculated by subtracting the baseline abilities from the follow-up abilities so that the result was negative if there was a decline in function and positive if there was an improvement.
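The scoring and change-score calculations can be illustrated as follows. This is a sketch under the definitions given above; the item identifiers and function names are assumptions made for the example.

```python
# Item identifiers are illustrative; the five short-survey items come first.
SHORT_ITEMS = ("shopping", "light_housework", "managing_finances",
               "bathing", "walking_across_room")
FULL_ITEMS = SHORT_ITEMS + ("dressing", "transferring", "toileting", "feeding",
                            "meal_preparation", "using_telephone",
                            "medication_management")


def ability_count(able, items):
    """One point for each listed item the respondent is able to perform."""
    return sum(1 for item in items if able[item])


def change_score(baseline, followup, items):
    """Follow-up minus baseline: negative means decline, positive improvement."""
    return ability_count(followup, items) - ability_count(baseline, items)
```

For example, a participant who is able in all 12 activities at baseline and loses bathing and dressing at follow-up has a short survey change of -1 and a full survey change of -2.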
We also categorized each patient according to a binary outcome (decline versus no decline) using the short and full survey. The primary definition of decline for each scale was a decrement in ADL count by 1 or more for that scale. An alternative, more “stringent” definition of functional decline (employed in prior work on functional decline [13, 15]) was also considered for the full survey only: decrement in 2 or more ADLs, or decrement in any ADLs if independent of all ADLs at baseline. Next, we considered a trinomial outcome for each scale where subjects could decline (decline in ≥ 1 ADL), remain stable (no change in ADL count), or improve (improve in ≥ 1 ADL).
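The three outcome definitions can be expressed compactly. The sketch below reflects one reading of the definitions in this section, in particular of the “stringent” rule; the function names are hypothetical.

```python
def declined(change):
    """Primary definition: ability count fell by 1 or more."""
    return change <= -1


def declined_stringent(baseline_full, followup_full, n_items=12):
    """Stringent definition for the full survey (our reading of the text):
    a drop of 2 or more ADLs, or any drop if able in all ADLs at baseline."""
    drop = baseline_full - followup_full
    if baseline_full == n_items:      # independent in all ADLs at baseline
        return drop >= 1
    return drop >= 2


def trinomial(change):
    """Three-level outcome based on the sign of the change score."""
    if change <= -1:
        return "decline"
    if change >= 1:
        return "improve"
    return "stable"
```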
The relationship between the short and full survey change scores was evaluated with correlation and simple regression. We also used multivariate regression to control for age, gender, and use of a proxy respondent for the baseline interview, the follow-up interview, or both.
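As an illustration of the unadjusted analysis, the Pearson correlation and the simple least-squares line relating two change scores can be computed directly. This is a generic sketch, not the Stata procedure used in the study; treating the short survey change as the predictor is an assumption made for the example.

```python
def correlation_and_regression(x, y):
    """Pearson r plus slope and intercept of the least-squares line y = a + b*x.

    x: short survey change scores (assumed predictor)
    y: full survey change scores (assumed outcome)
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / (sxx * syy) ** 0.5      # undefined if either score is constant
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept
```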
To evaluate how well the short and full surveys agreed with respect to the binomial (decline versus no decline) and trinomial (decline, stable, versus improvement) categorical outcomes, we used percent agreement and the kappa statistic, which adjusts percent agreement for agreement expected by chance. We used a weighted kappa for the trinomial outcome (weight of 1 for agreement, 0.5 for a one-degree disagreement, and 0 for a two-degree disagreement). Responsiveness to decline was evaluated using sensitivity, specificity, and area under the receiver operating characteristic curve (AUC), where the full survey outcome was considered the “gold standard” and the short survey outcome was considered the “test”. To calculate sensitivity, specificity, and the AUC using the trinomial outcome, we compared any change in functional status (decline or improvement) versus no change. We also tested a measure of decline that included death as part of the definition of functional decline. This commonly used approach is consistent with our understanding of the functional decline that occurs prior to death [2, 27, 28] and allows us to consider outcomes of participants who would otherwise have been considered lost to follow-up. Thus, the resulting test characteristics would be useful for clinicians or researchers who wish to include death in the definition of functional decline.
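The agreement and test-characteristic calculations described above can be sketched in plain Python. This is a generic illustration of weighted kappa, sensitivity, and specificity, not the Stata routines used in the analysis; all function names and category labels are assumptions.

```python
from collections import Counter

TRINOMIAL = ("decline", "stable", "improve")


def exact_match(i, j):
    """Unweighted kappa: credit only for exact agreement."""
    return 1.0 if i == j else 0.0


def trinomial_weight(i, j):
    """1 for agreement, 0.5 for one-degree, 0 for two-degree disagreement."""
    return 1.0 - abs(TRINOMIAL.index(i) - TRINOMIAL.index(j)) / 2


def weighted_kappa(gold, test, categories, weight):
    """Chance-corrected agreement; undefined if expected agreement equals 1."""
    n = len(gold)
    joint = Counter(zip(gold, test))
    gold_marg, test_marg = Counter(gold), Counter(test)
    observed = sum(weight(i, j) * joint[(i, j)] / n
                   for i in categories for j in categories)
    expected = sum(weight(i, j) * (gold_marg[i] / n) * (test_marg[j] / n)
                   for i in categories for j in categories)
    return (observed - expected) / (1 - expected)


def sensitivity_specificity(gold, test):
    """gold: full survey outcome; test: short survey outcome (booleans).
    Requires at least one positive and one negative in gold."""
    tp = sum(1 for g, t in zip(gold, test) if g and t)
    fn = sum(1 for g, t in zip(gold, test) if g and not t)
    tn = sum(1 for g, t in zip(gold, test) if not g and not t)
    fp = sum(1 for g, t in zip(gold, test) if not g and t)
    return tp / (tp + fn), tn / (tn + fp)
```

For a dichotomous test, the ROC curve has a single operating point, so the trapezoidal AUC reduces to (sensitivity + specificity) / 2.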
Last, we examined possible floor effects by tabulating the change in full survey ADL counts among only those individuals with limitations in all 5 short survey ADLs at baseline, and by determining whether death occurred in these individuals during the observation window.