Paragraph 20 The WHI Physical Activity Questionnaire demonstrated moderate to substantial test-retest reliability in a racially diverse sample of post-menopausal women. The reliability estimates observed in this sample are similar to reliability measures from other self-reported questionnaires designed for women (6) and for older adults (36). Additionally, the physical activity in this population generally paralleled activity patterns observed in the US population of adults (7).
Paragraph 21 The most consistent difference in the test-retest reliability estimates was lower reliability in the mild exercise or activity measures. Although the lower reliability observed in the mild-intensity questions may be an artifact of reduced precision, it is consistent with other research (27). Activities of mild intensity are less memorable and less likely to be recalled, and are consequently less well captured by self-report questionnaires. Another potential explanation for the weaker performance of the mild activity measures is the questionnaire design. Mild walking, a popular recreational activity in this population, was assessed separately from other mild-intensity activities and showed higher reliability than mild activity. Therefore, if walking had been included in the mild activity measure instead of being assessed separately, mild activity might have shown higher reliability.
Paragraph 22 Test-retest reliability did not differ when the sample was restricted to women who reported at least one episode of any exercise or recreational activity. Interestingly, there were also no meaningful differences in reliability across race/ethnic groups. Previous studies have reported mixed findings on differences in reliability by race/ethnicity (5). However, the confidence intervals around the race/ethnicity estimates were wide, because stratifying the data resulted in a loss of precision.
Paragraph 23 Although we did not observe differences in reliability between race/ethnic groups or by level of activity, some patterns were observed by age and by length of time between test and retest. Women who were 65 years or younger demonstrated better test-retest reliability than older women. Variability of physical activity in older women may be influenced by a number of factors, such as changing health status (e.g., fatigue, injury, disease progression), retirement, or loss of a spouse (4). Any of these changes within the study period could lower questionnaire reliability because they alter women's activity patterns. Additionally, aging is associated with cognitive decline that can impair memory and could in turn affect reliability (26).
Paragraph 24 Not surprisingly, reliability was slightly higher on some measures among women who repeated the questionnaire within three months than among women with more than three months between tests. One explanation is that tests repeated within a shorter time frame are more likely to be given in the same season or at a comparable time of year with regard to weather. Furthermore, a longer interval leaves more time for a true change in activity (either an increase or a decrease) to occur after the first questionnaire, which would lower the reliability estimates.
Paragraph 25 While reliability could be explored with these data, the validity of the WHI physical activity questionnaire could not be assessed. However, the questionnaire's validity was recently explored among 74 women enrolled in the Women's Healthy Eating and Living Study (17). In this convenience sample of women, the WHI physical activity questionnaire was correlated with both the accelerometer (Actigraph 7164) and the 7-day physical activity recall (r = 0.73 and 0.88, respectively). Although the WHI questionnaire had 100% sensitivity for identifying women who met the physical activity guidelines, the specificity was only 60%. The questionnaire tended to underestimate moderate activities and overestimate vigorous activities.
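To make the sensitivity and specificity figures concrete, the following is a minimal Python sketch of how the two quantities are computed from a 2x2 classification table. The function name and the cell counts are hypothetical: the counts were chosen only so that they sum to 74 and reproduce 100% sensitivity and 60% specificity, and they are not the validation study's actual data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 classification counts.

    tp: met guidelines by the criterion measure and by the questionnaire
    fn: met guidelines by the criterion measure but not by the questionnaire
    tn: did not meet guidelines by either measure
    fp: did not meet guidelines by the criterion but did by the questionnaire
    """
    sensitivity = tp / (tp + fn)  # share of true "meets guidelines" detected
    specificity = tn / (tn + fp)  # share of true "does not meet" detected
    return sensitivity, specificity


# Hypothetical counts (NOT from the study), summing to 74 women.
tp, fn = 24, 0
tn, fp = 30, 20

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Under these assumed counts, no woman who met the guidelines is missed, but 20 of the 50 women who did not meet them are classified as meeting them, which is the pattern that a high-sensitivity, low-specificity questionnaire produces.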
Paragraph 26 Despite the diverse and large sample, this study had several limitations. The WHI sample was not population-based and may not be representative of a specific source population. White women made up a larger share of the sample than other racial/ethnic groups. Because of the small samples of Hispanic, African American, and Asian/Pacific Islander women, the lower bounds of the confidence intervals were estimated below zero in several of the stratified analyses. Additionally, the level of education in our sample was very high, and we were unable to examine variation in test-retest reliability by education. Another limitation of this study was that participants were not randomized to the two forms, and some differences were observed between the two groups.
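The effect of subgroup size on interval width can be illustrated with a short Python sketch. It uses the Fisher z-transform to approximate a 95% confidence interval for a correlation coefficient; this is an assumption made for illustration only, since the reliability statistics and interval method used in the study may differ. The function name, the correlation of 0.40, and the sample sizes are all hypothetical.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation r estimated from n pairs.

    Uses the Fisher z-transform: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3). Illustrative only.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


r = 0.40  # hypothetical reliability estimate, same in every subgroup
intervals = {n: fisher_ci(r, n) for n in (500, 50, 20)}  # hypothetical sizes

for n, (lower, upper) in intervals.items():
    print(f"n = {n:3d}: lower = {lower:.2f}, upper = {upper:.2f}")
```

With these assumed inputs the point estimate is identical in every subgroup, yet the interval widens as n shrinks, and at n = 20 the lower bound falls below zero. This is why similar point estimates across strata do not by themselves rule out real differences.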
Paragraph 27 Several other considerations should be kept in mind when using the questionnaire. While the WHI physical activity assessment included a measure of yard and household activity, it was not a comprehensive measure of women's potential activities. Several domains of activity, such as non-motorized transportation (active travel), child or elder care activity, and work or occupational physical activity, were not included in the WHI physical activity questionnaire.
Paragraph 28 Reliable and valid questionnaires are a cost-effective and useful method for collecting physical activity information in large cohort studies, such as the WHI Observational Study. However, measurement of physical activity is challenging because many questionnaires do not collect detailed information on types of activities and use terminology that many women do not identify with (2). The WHI Physical Activity Questionnaire is one of the first questionnaires to examine different types of physical activity in a large, multiethnic sample of women. This analysis shows that the different domains of physical activity behavior, such as recreational, yard, and household activity, can be reliably estimated in an ethnically diverse sample of post-menopausal women.