Over the past two decades, there has been considerable progress in the assessment of function and disability among older persons. Tests of physical performance are now routinely included in longitudinal studies to measure functional limitations, which are considered the building blocks of functioning. In addition, new strategies have been developed to assess the presence and onset of disability and to expand the scope of disability assessments beyond traditional indicators of difficulty and dependence. Contemporary measurement technologies, such as item response theory and computer adaptive testing, show great promise in the assessment of functional status and disability, but prospective studies are needed to demonstrate their true value, particularly to identify the circumstances in which their use will improve the assessment of functional outcomes in older persons. Another high priority for future research is to validate and further refine strategies to more completely and accurately ascertain the occurrence of disability among older persons.
The defining feature of geriatric medicine is the intense focus on the preservation and restoration of function. The preeminence of function in health care has been endorsed by several diverse sources, including the Institute of Medicine, which indicated that the goal of the US health system is “to improve the health and functioning of the people of the United States” (1); the Healthy People 2010 project, which identified increasing the quality and years of life as one of its two overarching goals (2); the Agency for Healthcare Research and Quality, which underscored the need for evidence on cost-effective interventions to enhance functional status (3); and editorial writers who have suggested that functional status should be assessed routinely in clinical practice as a “sixth vital sign” (4). Hence, the assessment of function and disability is often a key element in longitudinal studies on aging.
Although several conceptual models of disability have been proposed, one of the most instructive was developed by Nagi (5) and subsequently modified by Verbrugge and Jette (6). In this model, shown in Figure 1, the pathway to disability starts with pathology, with impairments and functional limitations as intervening steps. Because of their importance to longitudinal studies on aging, the current monograph focuses on the assessment of functional limitations, which are often considered the building blocks of functioning, and disability, which represents actual behaviors, i.e. how an older person gets along in his or her real-world environment (7).
Functional limitations are useful to assess because they are often strong predictors of clinically meaningful, distal outcomes, such as disability, nursing home admission, and death. Increasingly, however, assessments of functional limitations have been included in longitudinal studies and clinical trials as more proximal outcomes. In contrast to measures of disability, measures of functional limitations are usually free of environmental influences, often focus on a specific task, such as gait speed, thereby leading to greater specificity, and offer the potential for greater responsiveness to clinically meaningful changes.
Functional limitations are assessed most commonly with tests of physical performance, in which an individual is asked to perform a specific task (or series of tasks) and is evaluated in an objective, standardized manner using predetermined criteria, which may include counting of repetitions or timing of the activity as appropriate.
One of the most widely used and validated assessment tools is the Short Physical Performance Battery (SPPB), which was originally developed at the National Institute on Aging for use in the Established Populations for Epidemiologic Studies of the Elderly (EPESE) (8). The SPPB, which assesses lower extremity functional limitations, includes timed tests of standing balance, walking speed, and repeated chair stands. For the balance test, persons are asked to maintain their feet in side-by-side, semi-tandem (heel of one foot beside the big toe of the other foot), and tandem (heel of one foot in front of and touching the toes of the other foot) positions for 10 seconds each. Walking speed is assessed by asking persons to walk at their usual pace over a 4-meter course. Two walk times are recorded and the faster of the two is used to compute the walking test score. For the chair stand test, persons are asked to stand up from a sitting position with their arms folded across their chest. If able to perform this task, they are then asked to stand up and sit down five times as quickly as possible and the time to perform the test is recorded. Each of the three tests is assigned a score ranging from 0 to 4, with 0 indicating the inability to complete the test and 4 the highest level of performance. A summary score, ranging from 0 to 12, is calculated by adding the three scores. The SPPB usually takes less than 10 minutes to complete and is portable, allowing it to be completed in the home or office. The test-retest reliability of the SPPB and each of its components is high (9). The SPPB is available for use without permission or royalty fees. The contents of a training CD, including comprehensive instructions on the administration of the battery, safety tips, a scoring sheet, and background information on relevant publications, can be downloaded from www.grc.nia.nih.gov/branches/ledb/sppb/index.htm.
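As a minimal sketch of the summary scoring described above, the following Python fragment adds three component scores, each already graded from 0 to 4, to give the 0 to 12 summary score. The class and function names are illustrative assumptions. The cutpoints used to grade each timed test are not given in this report, so they are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SPPBComponents:
    """Component scores, each already graded from 0 (unable) to 4 (best)."""
    balance: int
    walking: int
    chair_stand: int


def sppb_summary_score(components: SPPBComponents) -> int:
    """Return the SPPB summary score (0 to 12) as the sum of the three components."""
    scores = {
        "balance": components.balance,
        "walking": components.walking,
        "chair_stand": components.chair_stand,
    }
    for name, value in scores.items():
        # Reject booleans, non-integers, and values outside the 0-4 range.
        if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 4:
            raise ValueError(f"{name} score must be an integer from 0 to 4, got {value!r}")
    return sum(scores.values())
```

For example, component scores of 2 (balance), 3 (walking), and 1 (chair stands) give a summary score of 6.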
The SPPB is a strong predictor of mortality, nursing home admission, and the onset of disability in both activities of daily living and mobility (8,10). Although these associations are largely linear, with outcome rates increasing as scores on the SPPB decrease, an SPPB score less than 10 has commonly been used to identify an “at risk” group. Older persons with scores from 10 to 12 are at relatively low risk of adverse outcomes over the course of four years. In addition, the SPPB is responsive to clinically meaningful changes (9,11), with 0.5 points denoting a small change (i.e. clinically detectable, potentially important) and 1 point denoting a substantial change (i.e. clinically detectable, definitely important) (Table 1) (12).
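The cutoff and change criteria in this paragraph can be expressed in a few lines of Python. This is a sketch under stated assumptions: the function names and the label "below threshold" are illustrative, and a change is treated as meeting a criterion when its absolute size is at least the stated value. Because an individual's summary score is a whole number, the 0.5-point criterion is most naturally applied to group means.

```python
AT_RISK_CUTOFF = 10        # scores below 10 identify the "at risk" group
SMALL_CHANGE = 0.5         # small meaningful change, in SPPB points
SUBSTANTIAL_CHANGE = 1.0   # substantial meaningful change, in SPPB points


def is_at_risk(summary_score: float) -> bool:
    """True when the SPPB summary score falls below the commonly used cutoff."""
    return summary_score < AT_RISK_CUTOFF


def classify_sppb_change(baseline: float, follow_up: float) -> str:
    """Label the size of a change in SPPB score, ignoring its direction."""
    magnitude = abs(follow_up - baseline)
    if magnitude >= SUBSTANTIAL_CHANGE:
        return "substantial"
    if magnitude >= SMALL_CHANGE:
        return "small"
    return "below threshold"
```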
Although each of its three components is valid (8), the predictive accuracy of the SPPB is due largely to gait speed (13). As the single best indicator of functional limitations, slow gait speed has been used to identify older persons who are physically frail in longitudinal studies and clinical trials (14,15). For a 4-meter walk test, a usual gait speed less than 0.6 meters per second (m/sec) confers high risk for adverse outcomes, while a value greater than 1 m/sec confers low risk (16). As shown in Table 1, the criterion for detecting a small meaningful change in gait speed using a 4-meter walk test is 0.05 m/sec, while the value for a substantial meaningful change is 0.10 m/sec. A related test, which has been used most commonly in studies of cardiovascular, pulmonary, and peripheral vascular disease, is the 6-minute walk, in which the distance walked over six minutes at one’s usual pace is measured (17). The criterion for detecting a small meaningful change using the 6-minute walk test is 20 meters, while the value for a substantial meaningful change is 50 meters.
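To make the gait speed thresholds concrete, the sketch below converts a timed walk into meters per second and applies the 0.6 and 1 m/sec values cited above. The function names and the "intermediate" label for speeds from 0.6 to 1 m/sec are assumptions introduced for illustration; the text names only the high-risk and low-risk ends.

```python
def gait_speed(distance_m: float, time_s: float) -> float:
    """Usual gait speed in m/sec from the course length and the time taken."""
    if distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return distance_m / time_s


def gait_risk(speed_m_per_s: float) -> str:
    """Risk label based on usual gait speed over a 4-meter course."""
    if speed_m_per_s < 0.6:
        return "high"
    if speed_m_per_s > 1.0:
        return "low"
    return "intermediate"  # assumed label; not named in the text
```

A person who takes 8 seconds to walk 4 meters has a gait speed of 0.5 m/sec and would be labeled high risk under these criteria.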
Other tests that have been used to assess lower extremity functional limitations include turning 360 degrees and climbing a flight of stairs, each of which is included in the Physical Performance Test (18), one of the earliest composite measures of physical performance. To assess upper extremity functional limitations, several performance-based tests have been used (18–21), including (among others) writing a sentence, picking up small objects, buttoning a shirt, pegboard, and functional reach. Although not a pure measure of functional limitations, grip strength, as assessed by a handheld dynamometer, is a robust predictor of disability and other clinically relevant outcomes (22).
To reduce potential ceiling effects, especially among high functioning older persons, more challenging tests of physical performance have been developed for inclusion in longitudinal studies. For example, in the Health ABC Study, which included nondisabled persons aged 70–79 years, a long-distance corridor walk was developed, which assesses walking speed over 20 meters, distance covered in 2 minutes, and the time to walk 400 meters; and the SPPB was modified by extending the times for the three standard balance tests from 10 to 30 seconds and adding a single leg stand test (23). Performance on the long-distance corridor walk was subsequently shown to be strongly associated with total mortality, cardiovascular disease, and mobility disability (24).
Although functional limitations may also be assessed through self-report or proxy report, relatively few instruments focus exclusively on functional limitations. The best contemporary measure may be the function component of the Late-life Function and Disability Instrument (25), which includes 32 items across three domains: basic lower extremity function (e.g. reach overhead while standing), advanced lower extremity function (e.g. walk several blocks), and upper extremity function (e.g. unscrew lid without assistive device). While this comprehensive measure has high reliability and validity, its responsiveness to clinically meaningful changes is uncertain.
In the absence of a gold standard, the assessment of disability can vary considerably from one longitudinal study to another. There are several reasons for this variability. First, disability assessments may include a different set of self-care, instrumental, and mobility-related tasks (Table 2). Second, disability can be defined on the basis of having difficulty with a task or needing (or receiving) help with a task. Third, help can be provided by another person or by special equipment, such as a cane for ambulation or a tub bench for bathing. Fourth, personal assistance can be operationalized as needing help or receiving help. Fifth, personal assistance may or may not include supervision for a task. Sixth, disability assessments may or may not include a preamble, such as, “Because of a health or physical problem, do you need …”. Finally, there is no uniform frame of reference (e.g. “at the present time”, “during the last month”, “since we last spoke”) for assessing disability. Prior work has demonstrated that the prevalence of disability can differ considerably depending on these operational characteristics (26). Hence, it is essential that published reports include the specific question(s) that were used to assess disability. An important consideration is whether the specific question(s) can be administered and answered reliably (27), especially when they are posed to proxy respondents, who must often be enlisted at some point during a longitudinal study on aging.
To illustrate some of the complexities and opportunities involved in assessing disability, four topics are discussed below.
While most longitudinal studies have operationalized disability either as having difficulty with a task or as dependence, i.e. “requiring help from another person”, strong evidence exists that questions about difficulty and dependence provide complementary information that together can more fully depict the continuum of disability among older persons. For example, in one study that focused on self-care activities of daily living (28), participants were categorized into three distinct groups: independent without difficulty, independent with difficulty, and dependent. Additional baseline information was collected on several measures of higher level function and physical performance. Follow-up information was collected on regular home care visits and activities of daily living at one and three years and on hospitalizations, admissions to skilled nursing facilities, and deaths over four years. Participants who were independent but had difficulty had functional profiles, physical performance scores, and rates of health care utilization and death that were intermediate to those of persons who were independent without difficulty and persons who were dependent. For example, among participants who were independent without difficulty, independent with difficulty, and dependent, the rates of hospitalization were 46%, 57%, and 72%, respectively, and the rates of regular home care visits were 17%, 30%, and 49%. Furthermore, among participants who were independent at baseline, those who had difficulty were significantly more likely to develop dependence over three years than those who did not have difficulty, with a relative risk of 1.7.
Based on these findings, which have been replicated in subsequent studies (29), clinicians and investigators can depict the continuum of disability more fully by including questions about both difficulty and dependence in their clinical practice and longitudinal studies, respectively.
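The three-level classification can be sketched in Python as follows. The names are illustrative, and two choices are assumptions rather than the method of the cited study: needing help from another person takes precedence over reported difficulty, and a participant's overall level is taken as the worst level across the tasks assessed.

```python
from enum import IntEnum


class DisabilityLevel(IntEnum):
    """Ordered so that a larger value means more disability."""
    INDEPENDENT_WITHOUT_DIFFICULTY = 0
    INDEPENDENT_WITH_DIFFICULTY = 1
    DEPENDENT = 2


def classify_task(has_difficulty: bool, needs_help: bool) -> DisabilityLevel:
    """Combine the difficulty and dependence answers for a single task."""
    if needs_help:
        return DisabilityLevel.DEPENDENT
    if has_difficulty:
        return DisabilityLevel.INDEPENDENT_WITH_DIFFICULTY
    return DisabilityLevel.INDEPENDENT_WITHOUT_DIFFICULTY


def classify_participant(task_reports: dict) -> DisabilityLevel:
    """task_reports maps a task name to (has_difficulty, needs_help).

    Assumption: the participant's level is the worst level over all tasks.
    """
    if not task_reports:
        raise ValueError("at least one task report is required")
    return max(classify_task(d, h) for d, h in task_reports.values())
```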
In most longitudinal studies, the frame of reference for assessing disability is “at the present time”, and an incident episode of disability is noted when a nondisabled person reports disability at a subsequent follow-up interview (26). Prior research has demonstrated, however, that incident episodes of disability are often not ascertained by longitudinal studies with assessment intervals longer than three to six months (30). An alternative strategy for ascertaining incident episodes of disability is to ask participants to recall whether they have had disability “at any time” since the prior assessment.
Using data from a unique longitudinal study that includes monthly assessments of disability, Gill and colleagues reported three findings (31). First, up to half of the incident disability episodes that would otherwise have been missed can be ascertained if participants are asked to recall whether they have had disability “at any time” since the prior assessment. Second, the disability episodes ascertained by participant recall confer high risk for the subsequent development of chronic disability, even after accounting for potential confounders. Third, participant recall for the absence of disability becomes increasingly inaccurate as the duration of the assessment interval increases, with 2.2%, 6.0%, 6.9%, and 9.1% of participants having inaccurate recall at 1, 3, 6, and 12 months, respectively.
Based on these results and those of other studies (32,33), the following strategy might be considered to enhance the ascertainment of disability in future longitudinal studies. If a participant is not disabled at the present time, ask whether s/he has had disability at any time since the prior assessment. If no, probe further using a standard protocol, focusing on major illnesses or injuries that have occurred since the prior assessment. Most episodes of disability, especially in persons who are not physically frail, are precipitated by an intervening event that leads to either hospitalization (most commonly) or restricted activity (32,33). Special attention may be warranted for participants with low levels of education and perhaps those who are not cognitively intact since these factors have been linked to inaccurate recall of disability (34). Another way to possibly enhance the ascertainment of disability would be to adapt a calendar approach, which has been used successfully to ascertain falls (35). Participants might be instructed, for example, to complete a disability calendar monthly (or weekly), by marking a “D” (for disabled) under a specific activity of daily living if they needed help from another person or were unable to perform the task during the prior month (or week) and an “I” (for independent) if they performed the task without personal assistance.
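The stepwise questioning proposed above might be sketched as follows. This is a hypothetical illustration only: the question keys, the two-part probe, and the outcome labels are assumptions, since the text does not specify the content of the standard probing protocol. The `ask` argument stands for whatever interview procedure returns a yes/no answer, and later questions are asked only when earlier answers are negative.

```python
from typing import Callable


def ascertain_disability(ask: Callable[[str], bool]) -> str:
    """Walk through the questions in order, stopping at the first positive answer."""
    # Step 1: disability at the present time.
    if ask("disabled_at_present"):
        return "current disability"
    # Step 2: recall of disability at any time since the prior assessment.
    if ask("disabled_any_time_since_prior_assessment"):
        return "episode ascertained by recall"
    # Step 3 (hypothetical probe): focus on major illnesses or injuries,
    # then ask whether help was needed during that event.
    if ask("major_illness_or_injury_since_prior_assessment"):
        if ask("needed_help_during_illness_or_injury"):
            return "episode ascertained by probe"
    return "no disability ascertained"
```

Passing the interview as a callable keeps the skip pattern explicit: a participant who reports current disability is never asked the recall or probe questions.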
To expand the ascertainment of disability beyond task difficulty (and dependence), two strategies have been proposed. In the first, which has been successfully implemented in the Women’s Health and Aging Study II, participants who report no difficulty with a task are asked whether they have modified the method or changed the frequency with which they perform the task. Among older persons without difficulty in mobility, those who report task modification, which includes a decrease in frequency, have slower walking speed and poorer strength, balance, and exercise tolerance (36). Furthermore, task modification, which is often considered an indicator of preclinical disability (37), is a strong, independent predictor of incident disability, defined on the basis of task difficulty (38).
In the second strategy, which has been successfully implemented in the Health ABC Study, participants who report no difficulty with a task are asked, “How easy is it for you to [complete the task]?” Response categories include “very easy”, “somewhat easy”, and “not that easy”. In a high functioning cohort of older persons, asking about ease of performing common functional tasks modestly improved discrimination of functional capacity, as assessed by tests of physical performance (23).
To address concerns that have been raised about traditional disability assessments, including problems with floor and ceiling effects, inherent tradeoffs between the breadth versus depth (i.e. precision) of measurement, and the inability to compare results across different instruments, some investigators have advocated the use of item response theory (IRT) coupled with computer adaptive testing (CAT) (39,40). IRT is used to create hierarchically ordered item banks, which include a set of items (e.g. functional tasks) that measure the same concept (e.g. functional limitations), while CAT employs an algorithm that selects items directly tailored to the participant, and shortens or lengthens the test to achieve the desired precision. This type of “adaptive” testing reduces the amount of time required to complete an assessment, while maintaining sufficient content breadth and measurement precision across multiple settings (e.g. community, nursing home, etc.). Importantly, all scores are on the same scale, allowing comparisons over time and across subsets of participants with different functional levels. These latter attributes are particularly relevant for longitudinal studies on aging given the high likelihood of changes to residential status and function over time.
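The adaptive logic described above can be illustrated with a toy example. This sketch is not the algorithm used by any published instrument. It assumes a one-parameter (Rasch) model, a standard normal prior on the latent function score, an estimate computed as the posterior mean on a fixed grid, and a stopping rule based on the posterior standard deviation. The item names and difficulty values in the usage note are hypothetical.

```python
import math

# Grid of candidate function scores (theta) from -4 to 4 in steps of 0.1.
GRID = [-4.0 + 0.1 * i for i in range(81)]


def rasch_prob(theta: float, difficulty: float) -> float:
    """Probability of reporting ability on an item under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def estimate(responses):
    """Posterior mean and SD of theta given (difficulty, able) pairs."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # standard normal prior (assumed)
        for difficulty, able in responses:
            p = rasch_prob(theta, difficulty)
            w *= p if able else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(item_bank, answer, target_sd=0.5, max_items=10):
    """Ask items until the estimate is precise enough or items run out.

    item_bank maps item name -> difficulty; answer maps item name -> bool.
    """
    remaining = dict(item_bank)
    responses, asked = [], []
    theta, sd = estimate(responses)
    while remaining and len(asked) < max_items and sd > target_sd:
        # Under the Rasch model, the most informative item is the one
        # whose difficulty is closest to the current estimate.
        name = min(remaining, key=lambda n: abs(remaining[n] - theta))
        difficulty = remaining.pop(name)
        responses.append((difficulty, answer(name)))
        asked.append(name)
        theta, sd = estimate(responses)
    return theta, sd, asked
```

With a hypothetical bank such as `{"walk_across_room": -2.0, "walk_one_block": -1.0, "climb_one_flight": 0.0, "walk_several_blocks": 1.0, "run_short_distance": 2.0}`, the first item administered is the one nearest the prior mean, and each later item is chosen according to the answers so far. Because every estimate is expressed on the same theta scale, scores from different item subsets remain comparable, which is the property emphasized in the text.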
As part of the PROMIS (Patient-Reported Outcomes Measurement Information System) initiative, sponsored by the National Institutes of Health, IRT and CAT have been used to construct and evaluate a preliminary item bank for physical function (41). As compared with existing fixed length instruments, a 10-item CAT extended the range of measurement substantially, thereby reducing potential floor and ceiling effects, and improved measurement precision over a wide range of function. In a more recent study (42), Jette and colleagues demonstrated that CAT scores of the Late-Life Function and Disability Instrument were highly comparable to those obtained from the full-length instrument with only a small loss in accuracy, precision, and sensitivity to change, but a substantial reduction in the time of administration.
Over the past two decades, there has been considerable progress in the assessment of function and disability among older persons. Tests of physical performance are now routinely included in longitudinal studies, and new strategies have been developed to assess the presence and onset of disability and to expand the scope of disability assessments beyond traditional indicators of difficulty and dependence. While item response theory and computer adaptive testing show great promise in the assessment of functional status and disability, prospective studies are needed to demonstrate the true value of these contemporary measurement technologies, particularly to identify the circumstances in which their use will improve the assessment of functional outcomes in older persons. Another high priority for future research should be to validate and further refine strategies to more completely and accurately ascertain the occurrence of disability among older persons. Informed by this new knowledge, clinicians and investigators will be better positioned to develop and rigorously evaluate interventions designed to prevent the onset and progression of disability and to restore independent function among older persons who become newly disabled.
The work for this report was funded by grants from the National Institute on Aging (R37AG17560, R01AG022993, K24AG021507). The work was conducted at the Yale Claude D. Pepper Older Americans Independence Center (P30AG21342).
Sponsor's Role: The NIH had no role in the preparation of this paper.
Conflict of Interest: Dr. Gill acknowledges research support from NIH Grants R37AG17560, R01AG022993 and K24AG021507 from the National Institute on Aging.
Author Contributions: The author is the sole contributor to this paper.