We compared the metric properties of the University of California, Los Angeles (UCLA) activity scale, the Tegner score, and the Activity Rating Scale for assessment of activity levels in 105 patients undergoing THA (48 women; mean age, 63.4 years) and 100 patients undergoing TKA (61 women; mean age, 66.5 years). We assessed construct validity by correlating these scales with the International Physical Activity Questionnaire and different traditional patient self-reporting outcome measures. Test-retest reliability, feasibility, and floor and ceiling effects also were determined. The UCLA scale showed the strongest correlations with the other measures (r = −0.35 to 0.56 for THA; r = −0.55 to 0.23 for TKA) and was the only scale that discriminated between insufficiently and sufficiently active patients undergoing THA and TKA. The UCLA scale had the best reliability, provided the highest completion rate, and showed no floor effects. It seems to be the most appropriate scale for assessment of physical activity levels in patients undergoing total joint arthroplasty.
Level of Evidence: Level III, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.
Total joint arthroplasty (TJA) is one of the most successful interventions to reduce pain, improve joint function, and improve health-related quality of life in end-stage arthritic joints . Pain relief and restoration of joint function still are the primary goals of surgery. However, patient expectations are increasing regarding physical activity and sports participation after surgery. In 1994, Wright et al.  reported participation in recreational activities was the third most important expectation of patients undergoing THA. In two other studies, Mont et al. [27, 28] reported some of their patients underwent TJA specifically to be able to continue playing tennis. Furthermore, participation in regular physical activity is important to improve cardiorespiratory fitness and reduce the morbidity and mortality associated with many chronic diseases .
However, despite widespread knowledge regarding the beneficial effects of physical activity on quality of life and general health enhancement , physical activity levels typically are not assessed in outcomes after TJA [4, 14]. Assessing physical activity levels is important given the negative consequences of activity in patients undergoing TJA: Kilgus et al. , and more recently Schmalzried et al. , described associations between physical activity levels and the risk of earlier implant failure. Flugsrud et al.  reported men with intermediate to intense physical activity during leisure time had a fourfold increased risk of aseptic cup loosening compared with their counterparts with sedentary lifestyles. Unfortunately, as of now, there are no clear guidelines regarding what level of physical activity is sufficient to attain adequate cardiorespiratory and muscular fitness without compromising implant durability. Thus, measurement of patient activity is crucial in the evaluation of outcomes after TJA.
Determining patient activity is challenging and the assessment ideally should include the types of activities performed and the frequency, duration, and intensity of participation. There currently is no single approach addressing all these issues. Numerous physical activity questionnaires have been developed [7, 21, 26], but some lack validity  and others impose a heavy burden on patients and clinicians owing to their length and complexity [23, 37]. For assessment of physical activity, single rating scales asking the patient to rate her or his activity level on a numeric scale are shorter and more practical than the long questionnaires commonly used to assess habitual physical activity [23, 37]. For this reason, they easily can be used together with other traditional self-reported outcome measures in patients undergoing TJA. Activity rating scales, however, are not the ideal instruments to assess all issues relevant to physical activity. Nonetheless, as long as no gold standard exists, they probably should be used in the routine outcome assessment of TJA to address the above-mentioned issues of physical activity in this group of patients.
The purposes of our study were (1) to evaluate the validity of the UCLA activity scale , the Tegner score , and the Activity Rating Scale (ARS)  in patients undergoing TJA by correlating these scales with the International Physical Activity Questionnaire (IPAQ)  and different traditional outcome measures; (2) to compare the metric properties of these scales regarding reliability, feasibility, and floor and ceiling effects; and (3) to assess gender- and age-related effects on score values and metric properties.
The study sample comprised 105 consecutive patients (48 women, 57 men) undergoing THA and 100 consecutive patients (61 women, 39 men) undergoing TKA between October and November 2007. Mean age of the THA group was 63.4 ± 11.0 years (range, 33–88 years), with no differences (p = 0.16) between women (65.1 ± 10.5 years) and men (62.0 ± 11.2 years). Mean body mass index (BMI) of the THA group was 26.3 ± 3.7 kg/m² (range, 19.1–38.4 kg/m²). Mean age of the TKA group was 66.5 ± 9.1 years (range, 46–88 years), with no differences (p = 0.22) between women (67.6 ± 8.3 years) and men (65.2 ± 10.3 years). Mean BMI of the TKA group was 28.1 ± 3.9 kg/m² (range, 19.5–41.2 kg/m²). All patients were sent a questionnaire set 1 week before surgery accompanied by a letter of explanation and instructions to complete the forms at home and bring them on the day of clinic admission. We changed the order of the activity scales in the questionnaire set to avoid repetitiveness bias. After receipt of the first questionnaire set, 43 patients (21 women, 22 men; mean age, 63.4 ± 10.3 years; mean BMI, 26.0 ± 3.7 kg/m²) from the THA group and 36 patients (18 women, 18 men; mean age, 67.5 ± 8.9 years; mean BMI, 27.8 ± 4.0 kg/m²) from the TKA group volunteered to complete a second questionnaire set to assess reliability. Evaluation of the activity scales was part of a large validation study approved by the local ethical committee. All participating patients provided written informed consent.
We used the UCLA scale , Tegner score , and ARS . The UCLA scale is a simple scale ranging from 1 to 10. The patient indicates her or his most appropriate activity level, with 1 defined as “no physical activity, dependent on others” and 10 defined as “regular participation in impact sports.” The Tegner score is similar to the UCLA scale, a simple scale ranging from 0 to 10, and the patient has to indicate the most appropriate activity level, with 0 defined as “no physical activity, disabled” and 10 defined as “participation in competitive soccer—national and international elite.” The ARS consists of four questions asking about the frequency the patient performs activities such as “running, cutting, decelerating, and pivoting.” Each question can be scored from 0 (less than one time per month) to 4 (four times per week or more often), so the total ARS can range from 0 to 16 points.
No gold standard questionnaire exists to assess physical activity. We chose the short “last 7 days” version of the International Physical Activity Questionnaire (IPAQ; www.ipaq.ki.se) because it is considered valid  and its use in scientific studies is widespread [1, 31, 32, 35, 40]. This IPAQ version consists of seven questions assessing the frequency and duration of participation in vigorous, moderate, and walking activity, and the time spent sitting during the last week. The total score can be expressed in metabolic equivalents (METs). One MET is approximately the metabolic rate (oxygen consumption) of an individual sitting quietly for 1 minute (3.5 mL/kg/minute). To estimate total METs, we multiplied total minutes spent in each category per week by a factor of 8 for vigorous, 4 for moderate, and 3.3 for walking activity, as proposed by the IPAQ guidelines. Data truncation and removal of outliers also were performed, as proposed by the IPAQ guidelines; 19% of the questionnaires were truncated, and one outlier (undergoing THA) had to be removed.
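The weighting step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated in the text (factors of 8, 4, and 3.3); the function name, the dictionary layout, and the example minutes are assumptions made for this sketch, and the truncation and outlier rules of the IPAQ guidelines are deliberately not reproduced.

```python
# MET factors as stated in the text for the short IPAQ version.
MET_WEIGHTS = {"vigorous": 8.0, "moderate": 4.0, "walking": 3.3}


def total_met_minutes(minutes_per_week):
    """Weight the weekly minutes in each activity category by its MET factor.

    `minutes_per_week` is a hypothetical dict such as
    {"vigorous": 60, "moderate": 120, "walking": 210}; missing
    categories count as 0 minutes.
    """
    return sum(
        weight * minutes_per_week.get(category, 0.0)
        for category, weight in MET_WEIGHTS.items()
    )


# Invented example: 60 min vigorous, 120 min moderate, 210 min walking.
# 8*60 + 4*120 + 3.3*210 is about 1653 MET-minutes per week.
example = {"vigorous": 60, "moderate": 120, "walking": 210}
print(round(total_met_minutes(example), 1))
```

The result is a weekly total in MET-minutes, which is the quantity the activity scales are correlated with in the validity analysis.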
To assess construct validity, we used two approaches: convergent evidence  and known-group technique . Convergent evidence was determined by correlating the activity rating scales with total METs calculated from the IPAQ. Because our goal was to evaluate physical activity rating scales in patients undergoing THA or TKA, we presumed convergent evidence between the scores of the activity rating scales and the scores derived from generic and disease-specific questionnaires traditionally used to assess surgical outcomes in patients undergoing TJA: WOMAC , Oxford Hip Score  (OHS), SF-12 , and Harris hip score  (HHS) in patients undergoing THA; and WOMAC, Oxford Knee Score  (OKS), SF-12, and Knee Society score  (KSS) with knee and function subscores in patients undergoing TKA. These relations were analyzed using the Spearman rank correlation coefficient. We anticipated there would be weak to moderate correlations (r = 0.3–0.5) between the activity scales and the generic and disease-specific measures.
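For readers unfamiliar with the Spearman rank correlation coefficient, the following is a minimal sketch of how it can be computed: both variables are converted to ranks (tied values share the mean of their ranks) and the Pearson correlation of the ranks is taken. The function names and the example scores are invented for illustration; the study itself used SPSS.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: activity scale scores and weekly MET-minutes of six patients.
scale_scores = [3, 4, 4, 6, 7, 5]
met_minutes = [400, 900, 700, 1500, 2100, 1600]
print(round(spearman_rho(scale_scores, met_minutes), 2))
```

Because only ranks enter the calculation, the coefficient is suited to ordinal scales such as those compared here.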
The second procedure we used for gathering construct-related evidence was the known-group difference technique in which two groups assumed to differ on the construct being measured are compared. Therefore, according to the guidelines set out by the IPAQ executive committee, patients were classified as insufficiently active, moderately active, or vigorously active. We calculated the mean score values of the activity scales for patients classified as insufficiently active and sufficiently active (moderate and vigorous activity levels according to the IPAQ classification), and we then assessed if these values differed between the groups using the Mann-Whitney U test.
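The statistic behind the Mann-Whitney U test can be illustrated with a short sketch. It counts, over all pairs of one patient from each group, how often the first group's score is higher, with ties counting one half. The function name and the scores are hypothetical, and the conversion of U to a p value (done by the statistics package in the study) is left out.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a is the number of (a, b) pairs with a > b, where a tie counts 0.5;
    U_a + U_b always equals len(group_a) * len(group_b).
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Invented scale scores for sufficiently and insufficiently active patients.
sufficiently_active = [6, 7, 5, 6, 8]
insufficiently_active = [4, 5, 3, 4]
print(mann_whitney_u(sufficiently_active, insufficiently_active))
```

A U value close to the product of the two group sizes means that nearly every patient in the first group scored higher than nearly every patient in the second, which is the pattern a discriminating scale should show.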
The reliability of the three activity rating scales (UCLA, Tegner, ARS) was examined using the weighted Cohen’s kappa (kw) . The kappa coefficient indicates the extent to which observed agreement exceeds chance agreement. Before calculation, each cell of the contingency table was assigned a weight. Cells with exact agreement were assigned the maximum weight of 1, and the quadratic weights decreased from this maximum to 0 for cells with the maximal possible disagreement . The strength of agreement for the kw coefficient was interpreted according to Fleiss et al. : kw of 0.75 or more indicates excellent agreement for most purposes and kw of 0.40 or less indicates poor agreement. The test-retest reliability of IPAQ was examined using the intraclass correlation coefficient [ICC(2,1): two-way random-effects model with single measure (absolute agreement)] .
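A minimal sketch of a quadratically weighted kappa for test-retest data follows. It assumes the common weighting in which a cell for scores i and j receives the weight 1 − (i − j)²/(k − 1)², where k is the number of scale levels; this gives 1 for exact agreement and 0 for maximal disagreement, as described above. The function name and the example ratings are invented, and the confidence intervals reported in the study are not computed here.

```python
def quadratic_weighted_kappa(test, retest, min_score, max_score):
    """Weighted kappa with quadratic agreement weights for two ratings per patient."""
    k = max_score - min_score + 1  # number of scale levels
    n = len(test)

    # Contingency table: rows = first rating, columns = second rating.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(test, retest):
        observed[a - min_score][b - min_score] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1.0 - ((i - j) ** 2) / ((k - 1) ** 2)
            weighted_observed += weight * observed[i][j] / n
            weighted_expected += weight * row_totals[i] * col_totals[j] / (n * n)

    # Undefined (division by zero) if every rating falls in one single level.
    return (weighted_observed - weighted_expected) / (1.0 - weighted_expected)


# Invented test-retest ratings on a 1-to-10 scale for six patients.
first = [4, 5, 6, 6, 3, 7]
second = [4, 6, 6, 5, 3, 7]
print(round(quadratic_weighted_kappa(first, second, 1, 10), 2))
```

With quadratic weights, a retest rating one level away from the first rating is penalized far less than one several levels away, which suits ordinal scales with many levels.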
We calculated the rate of fully completed activity scales. Furthermore, each scale had to be rated by patients for its difficulty on a visual analog scale from 1 (very easy) to 10 (extremely difficult), and patients stated the time they required for completion. We compared all three rating scales regarding difficulty and time consumption using the Kruskal-Wallis test with post hoc Mann-Whitney U tests. Both parameters also were correlated with patients’ age using Spearman rank correlation coefficients.
For all three scales, the distribution of floor and ceiling effects was calculated. Floor effects occur when patients rate their activity level as the lowest possible on the specific scale (1 for the UCLA scale, 0 for the ARS and Tegner scale). Consequently, worsening in activity levels cannot be assessed. Vice versa, ceiling effects occur when patients rate their activity level as the highest possible on the specific scale (10 for the UCLA and Tegner, 16 for the ARS). Ceiling effects therefore prohibit observing any improvement in activity levels.
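The definition above translates directly into a proportion of patients at each end of the scale. The sketch below uses an invented function name and invented scores; only the scale limits (for example, 0 and 16 for the ARS) come from the text.

```python
def floor_ceiling_rates(scores, min_score, max_score):
    """Return the proportions of patients scoring the scale minimum and maximum."""
    n = len(scores)
    floor_rate = sum(1 for s in scores if s == min_score) / n
    ceiling_rate = sum(1 for s in scores if s == max_score) / n
    return floor_rate, ceiling_rate


# Invented ARS totals (possible range, 0 to 16) for four patients.
print(floor_ceiling_rates([0, 0, 4, 16], 0, 16))  # (0.5, 0.25)
```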
Before all statistical tests, we tested the data for normal distribution using the Shapiro-Wilk W test. Unless otherwise stated, descriptive results are shown as mean ± standard deviation. For the reliability values, we indicated 95% confidence intervals (CIs). All statistical tests were performed using the software package SPSS (Version 13; SPSS Inc, Chicago, IL).
The three scales showed weak to moderate correlations, in the expected directions, with the other scores used; the UCLA scale showed the strongest correlations with IPAQ, WOMAC, OHS, OKS, HHS, KSS, and SF-12 values in patients undergoing THA and in those undergoing TKA (Table 1). We observed the weakest correlations for the ARS (Table 1). The UCLA scale discriminated between insufficiently and sufficiently active patients as classified by the IPAQ in patients undergoing THA and those undergoing TKA (Table 2). The ARS discriminated between insufficiently and sufficiently active patients only in the THA group, not in the TKA group (Table 2). The same was true of the Tegner score (Table 2).
The UCLA scale had excellent reliability with higher kw values than the ARS and Tegner score in patients undergoing THA (Table 3). The reliability of all three scales was excellent in patients undergoing TKA (Table 3). The ICCs for the IPAQ were 0.76 [95% CI, 0.57–0.87] in patients undergoing THA and 0.87 [95% CI, 0.74–0.94] in patients undergoing TKA.
The UCLA scale provided the highest completion rate in patients undergoing THA and those undergoing TKA (Table 4). We observed no major differences among the scales concerning difficulty and time required for completion (Table 4). No floor effects were observed for the UCLA scale in either the THA or the TKA group; ceiling effects occurred in 4% of the patients undergoing THA and in 1% of the patients undergoing TKA. We observed large floor effects for the ARS: 56% for patients undergoing THA and 55% for patients undergoing TKA; ceiling effects occurred in 1% and 2%, respectively. For the Tegner scale, floor effects occurred in 6% of the patients undergoing THA and 3% of the patients undergoing TKA, and no ceiling effects were observed.
According to all three scales, men were more active (p = 0.000004–0.006) than women (Table 5). In patients undergoing THA, we observed no correlation (p = 0.08–0.95) between age and the activity scales or IPAQ values. Similarly, in patients undergoing TKA, we observed no correlation (p = 0.25–0.97) between age and the activity scales or IPAQ values. In patients undergoing THA, the time required to complete the UCLA scale and ARS increased (r = 0.21, p = 0.046, and r = 0.29, p = 0.01, respectively) with age. We observed no association (p = 0.10) between time and age for completion of the Tegner scale. Difficulty in completing the scales was not correlated (p = 0.11–0.57) with age in patients undergoing THA. For patients undergoing TKA, the time required to complete the ARS was correlated (r = 0.28, p = 0.02) with age. We found no association between time and age for the UCLA scale (p = 0.09) and Tegner scale (p = 0.10). The difficulty for patients undergoing TKA to complete the ARS and Tegner scale increased (r = 0.24, p = 0.05, and r = 0.25, p = 0.03, respectively) with age. We observed no association (p = 0.68) between difficulty and age for completion of the UCLA scale.
Assessing physical activity levels in patients undergoing TJA is important considering the associations between physical activity levels and the risk of earlier implant failure [17, 22, 34]. However, physical activity levels are not routinely determined in the outcome assessment after TJA [4, 14]. Using activity rating scales is one possibility to assess activity levels in patients undergoing TJA, and as long as no gold standard exists, they probably should be used in the routine outcome assessment after THA or TKA. Our purposes were to evaluate the validity of the UCLA activity scale , the Tegner score , and the ARS  in patients undergoing TJA; to compare the metric properties of these scales regarding reliability, feasibility, and floor and ceiling effects; and to assess gender- and age-related effects on score values and metric properties.
When interpreting the results, one must consider we investigated only three different activity rating scales, and we used the IPAQ as a reference for physical activity and no objective measure, such as a pedometer or accelerometer.
Our data showed, in patients undergoing TJA, the UCLA activity scale correlated better with the other measures, and provided better reliability and completion rate than the Tegner scale and the ARS. The UCLA scale showed consistent construct validity and it was the only scale discriminating between insufficiently and sufficiently active patients, although the 95% CIs overlapped in the TKA group. Therefore, of these three scales, the UCLA seems most appropriate for assessing activity levels in patients undergoing TJA. Our data are consistent with those of two previous studies reporting correlation coefficients between the UCLA scale, HHS, and the SF-12 [4, 43]. The correlation coefficients we observed between the UCLA scale and the IPAQ compared well with those reported between the IPAQ and accelerometer data , indicating construct validity of the UCLA scale. The UCLA scale and Tegner score seemed to perform better in patients undergoing THA than in patients undergoing TKA, considering higher correlation coefficients with the IPAQ. However, we attribute this observation not to the UCLA scale or Tegner score but to the IPAQ: according to all three activity scales used in this study, patients undergoing THA had higher activity levels than patients undergoing TKA, but in contrast, total METs derived from the IPAQ were higher in patients undergoing TKA, and more patients undergoing TKA than THA were classified as sufficiently active. This contradictory pattern can explain the lower correlation coefficients found in the TKA group, and might indicate overreporting of habitual physical activity in the TKA group using the IPAQ. The possible problem of overreporting with the IPAQ was highlighted by others [1, 33, 40]. 
Whether overreporting physical activity is responsible for the observed differences and why this seemed to occur only in patients undergoing TKA should be investigated in future studies using the IPAQ or other habitual physical activity questionnaires together with objective measures, such as pedometers or accelerometers, in patients undergoing THA and TKA.
We observed a high rate of floor effects using the ARS: more than 50% of patients in both groups scored the lowest possible value. Moreover, the ARS had the lowest completion rate and the weakest correlations with the other measures used in this study. The ARS originally was developed to assess activity levels in patients with knee injuries . Such patients are younger and might have different demands on their knee function. This may explain the floor effects and the low completion rate; in any case, our results indicate this scale should not be used in patients undergoing TJA. Similar to the ARS, the Tegner scale was developed to rate activity levels in patients with knee injuries . The better metric properties achieved by the Tegner scale in comparison to the ARS might be explained by its design as a simple scale from 0 to 10, which is similar to the UCLA scale design. Despite the similar design, the Tegner scale had lower correlation coefficients, lower reliability values, and lower completion rates than the UCLA scale. This might be because its content is too sports-oriented for patients undergoing TJA; Levels 8 to 10 are related to competitive sports participation. With few exceptions, patients do not undergo joint arthroplasty to achieve competitive sports levels [24, 42]. The rate of floor effects we found for the Tegner scale was less than 10%, which might be considered an acceptable proportion, bearing in mind these patients were undergoing TJA.
All three activity scales showed men had higher activity levels than women. These gender-related differences were in line with previous reports. Ainsworth et al.  and Tehard et al.  reported men were more active than women according to the IPAQ. Using a pedometer to assess physical activity, Sequeira et al.  reported Swiss women were less active than men. We found no age-dependent decrease in physical activity levels, contrary to the findings of Zahiri et al.  using the UCLA scale. This observation seems to indicate older adults undergoing TJA are still healthy and active, which underlines the importance of assessing physical activity levels in the outcome assessment of joint arthroplasty. Alternatively, because Zahiri et al.  also investigated patients undergoing THA, the discrepancy might be related to geographic differences between US and Swiss patients.
The UCLA scale has been criterion-validated by examining the correlations between UCLA scores and objective measures of physical activity obtained with pedometers . Criterion validity has, to our knowledge, not yet been proven for the ARS or Tegner scale. Nevertheless, one major weakness of the UCLA scale is that, like the Tegner scale and ARS, it does not assess frequency, duration, and intensity of activities. Considering the benefits of health-enhancing physical activity levels and the risks of repetitive load cycles applied on a prosthesis during physical activities, assessment of frequency, duration, and intensity of activities is of utmost importance. There is increasing evidence that prosthetic wear is a function not of time but of use . Moreover, patient expectations concerning physical activity levels after TJA are increasing. In 1994, Wright et al.  reported the wish to return to recreational activities such as golfing or gardening was the third most important concern for patients undergoing TJA. This concern was cited by 85% of the patients. Similarly, Mancuso et al.  reported return to sports was a major expectation of surgery in their patients undergoing knee surgery. Mont et al. [27, 28] investigated the ability to play tennis after THA and TKA and reported some of their patients underwent surgery specifically to be able to continue playing tennis. The majority of patients will participate in recreation and sports after surgery [8, 9, 19]. The proportion of physically and sports-active patients after TJA varies among reports but may exceed 90% [29, 30]. Therefore, assessment of activity levels in patients undergoing hip and knee arthroplasty is important to respect patients’ concerns and expectations and to address the possible risks of implant failure and wear associated with increased physical activity and sports.
The UCLA scale is a reliable, feasible, and valid instrument for assessment of activity levels in patients undergoing TJA. The Tegner scale has inferior metric properties compared with the UCLA scale, and the ARS should not be used in this group of patients. Nevertheless, simple measures are needed to assess the frequency, duration, and intensity of performed activities. Future studies should investigate whether simple activity measures such as the UCLA scale allow for assessment of long-term relationships between activities and implant failure.
We thank Markus Loibl and Marc Sieverding for assistance with this study.
Each author certifies that he or she has no commercial associations (eg, consultancies, stock ownership, equity interest, patent/licensing arrangements, etc) that might pose a conflict of interest in connection with the submitted article.
Each author certifies that his or her institution has approved the human protocol for this investigation, that all investigations were conducted in conformity with ethical principles of research, and that informed consent for participation in the study was obtained.