Sedentary time (too much sitting) is increasingly being recognized as a distinct health risk behavior. This paper reviews the reliability and validity of self-reported and device-based sedentary time measures and provides recommendations for their use in population-based studies. The focus is on instruments that have been used in free-living, population-based research in adults. Data from the 2003–2006 National Health and Nutrition Examination Survey are utilized to compare the descriptive epidemiology of sedentary time that arises from the use of different sedentary time measures. A key recommendation from this review is that, wherever possible, population-based monitoring of sedentary time should incorporate both self-reported measures (to capture important domain- and behavior-specific sedentary time information) and device-based measures (to measure both total sedentary time and patterns of sedentary time accumulation).
Sedentary behaviors are those pursuits undertaken while awake that involve sitting or reclining and that result in little or no physical activity energy expenditure – typically 1 to 1.5 times the resting metabolic rate.1,2 Common sedentary behaviors include sitting or lying down while watching TV, using a computer, or driving. Sedentary time can be measured in three ways: (1) in terms of these specific behaviors (e.g., TV viewing time); (2) the amount of sedentary time occurring in a specific domain (e.g., work, leisure, domestic, transport); and, (3) the overall sedentary time across the day. As the term “sedentary” encompasses both sitting and reclining, the broader term sedentary is used in this article, except when sitting is specifically measured.
This paper provides an overview of current methods used to measure sedentary time in free-living, population-based research in adults. The first section provides information on the reliability and validity of self-reported measures, and extends from previous reviews3 to encompass multiple domains of sedentary time. The second section describes device-based measures, with a particular focus on the interpretation and validity of data from the Actigraph activity monitor. The final section uses data from the U.S. National Health and Nutrition Examination Survey (NHANES) to provide an example of how the descriptive epidemiology of sedentary time may differ depending on how it is measured.
Overall sedentary time can be assessed with either a single item (sometimes asked separately for weekend and weekdays), or by summing responses for the various behaviors or domains (composite measure). Key self-reported methods used are questionnaires (self-administered or interviewer-administered), behavioral logs, and short-term recalls. Questionnaires are a popular method3 because they can be implemented on a large scale, are relatively inexpensive, and do not alter the behavior under investigation.4 However, as with physical activity assessment,4, 5 questionnaires that seek to assess habitual levels of sedentary behavior are susceptible to random and systematic reporting errors.
Short-term recalls (e.g., 24-hour recall) and behavioral logs4 can reduce some of these reporting errors, such as long-term averaging. Traditionally, the disadvantages of behavioral logs (participant burden, systematic reporting errors and administration costs) have limited their use in population-based research. However, new approaches and technologies can reduce costs. For example, the National Cancer Institute has developed, and is currently testing, an Internet-based instrument for population surveillance of both active and sedentary behaviors.6
The usefulness of a self-reported measure is dictated to a large extent by the properties of test–retest reliability and criterion validity.7 A summary of test–retest reliability8-33 and criterion validity8, 9, 11, 15, 17-22, 28, 29, 34-40 findings for self-reported measures of overall and domain-specific sedentary time is provided in Tables 1 and 2. Depending on the available information, the intra-class correlation (ICC), Spearman’s rho (ρ) or Pearson’s correlation coefficient (r) are reported. Systematic differences between self-reported and criterion measures, when reported,8, 19-22, 34-37 are summarized in the text.
Reliability studies have varied in terms of recall period (from 3 days9, 10, 23, 25 to 3 months24), administration method (telephone or interview), and target population, making it difficult to compare their findings. Accordingly, the strength of association between test and retest measures varied widely across studies (Table 1). The majority of self-reported sedentary time measures showed moderate-to-high correlations, with magnitudes comparable to results reported for physical activity measures,11 indicating acceptable to good test–retest reliability. Stronger reliability was generally observed for sedentary behaviors that tend to be done on a regular basis and for prolonged periods of time, such as sitting at work and TV viewing time, than for less regularly performed behaviors, such as travel or other sitting.
Most questions about leisure-time3 and workplace sitting12, 13 asked about typical patterns of behavior. In comparison, the overall sitting measures asked either about typical behavior11, 14-17 or about sitting in the last 7 days.9, 11, 18, 23, 25-27 No difference between these two methods was found in a review of measures of non-occupational sitting time3 and in a comparison of two versions of the International Physical Activity Questionnaire (IPAQ; ‘typical’ or ‘last 7 days’).11
As detailed in Table 2, the validity of most questionnaire measures of sedentary time has been assessed against behavioral logs or accelerometers. However, these are not ‘gold standard’ measures of sedentary time, having their own errors and biases. To date, the most robust criterion employed has been combined hip-mounted accelerometer and behavioral log data.8, 37
The validity of the IPAQ single-item question used to assess overall sitting time has been extensively examined in a number of countries with participants of varying ages (18–65 years).9, 11 Most studies have shown low-to-moderate correlations with a criterion of accelerometer-derived sedentary time,9, 11, 17, 18 comparable in magnitude to those reported for interviewer-administered physical activity measures (Figure 1).41 While composite measures of sedentary time have also shown only low-to-moderate correlations with accelerometer-derived sedentary time (Figure 1),15, 21, 22, 28 total sitting time tends to be lower when assessed by a single item (4.35–7.92 hours/day)9, 42, 43 than when assessed by composite measures (7.25–9.80 hours/day).12, 19, 21 While direct comparison is hampered by the use of varying criterion measures, modes of administration, and target populations, correlations tended to be higher for domain-specific measures than for overall sedentary measures (Table 2) – particularly for screen time,8 computer use,19, 29 work,19 and TV viewing time.19, 36 Collectively, results suggest it may be more difficult to recall the time spent sitting during the entire day than the time spent sedentary for specific behaviors or in different domains.
Findings from the relatively few studies that have reported on absolute agreement are mixed, with reports of both overestimation20, 34, 35, 37 and underestimation19, 21, 22 of sitting time compared with criterion measures. Sitting time reported for TV viewing, screen time and eating was typically underestimated compared with device-based measures of these same behaviors.8, 36 For example, on average, people reported half an hour less TV viewing time than was recorded by the criterion measure,36 and the wide limits of agreement showed large discrepancies between self-report and the criterion at the individual level.
The reliability and validity of available self-reported measures of sedentary time are highly variable but comparable with those reported for physical activity measures. The available evidence suggests many sedentary time measures have acceptable measurement properties (i.e., adequate test–retest and relative agreement with criterion measures) for establishing cross-sectional associations with health outcomes, but not necessarily for assessing changes over time in cohort and intervention studies. The evidence on absolute agreement is sparse, and shows only limited agreement against criterion measures that are less than ideal.8, 19-22, 34-37 In the only study to examine responsiveness to change, questionnaire-assessed sitting performed as well as accelerometer-assessed sedentary time.22
More work is also required to assess: nuances associated with mode of questionnaire administration (e.g., interviewer vs self-administration); different response formats (e.g., continuous or categoric); the time-frame of assessment (e.g., short-term, such as past day or last 7 days, versus habitual patterns such as typical day, usual week, or past year); and how these factors affect sedentary time estimates. Importantly, several achievable improvements to study design could improve understanding of the measurement properties. Much research to date has been conducted (either wholly or in part) with university samples26, 29, 30, 44 or with particular population subgroups, including overweight adults,36 middle-aged women,19 and young men.18
More research also is needed to focus on general population and subpopulation samples for which reliability and validity might be affected by issues of literacy, cognition, language and less ‘regular’ patterns of some sedentary behaviors (e.g., parents with young children or shift workers). Furthermore, improved criterion measures (see Section 2) are now available that could be used, with concomitant collection of behavioral log data where behavior- or domain-specific measures are required. Device-based measures specific to particular behaviors, such as the electronic TV monitor (which monitors user-specific TV viewing time),36 may also be useful.
Given the errors associated with self-report, the ideal measure of sedentary time would:
No such instrument currently exists. To date, the main instrument used to derive sedentary time in population-based studies is the hip-mounted uniaxial Actigraph accelerometer (model 7164), using 1-minute data-collection epochs.45, 46 In this paper, unless otherwise specified, the term “Actigraph activity monitor” refers to this particular model (7164), placement (hip), and epoch length (1 minute). This device has been shown to provide reliable, valid, and stable measurements of physical activity when compared with other measures of functional capacity.47 It can also provide information about total sedentary time and the manner in which sedentary time is accumulated, both of which have shown associations with health outcomes.48, 49
The primary aim of this section is to describe the collection, analysis and interpretation of data from the Actigraph activity monitor. Its validity is also reported in comparison with two other device-based measures of sedentary time: the Intelligent Device for Energy Expenditure and Activity (IDEEA) monitor,50 and the activPAL activity monitor.51 Both instruments have been reported to have high accuracy for determining body position as compared to direct observation,50, 51 although neither has yet been used in population monitoring of sedentary time.
Accelerometers measure time-varying changes in force.52 Activity levels are typically recorded as counts, which are then summed over a user-specified time frame, or epoch. Several considerations for using accelerometers in field-based research have been reported in detail, 53-55 including accelerometer type, days of wear, and epoch length. Population-based studies utilizing accelerometers have typically used Actigraph activity monitors, had a 7-day wear protocol, and used a 1-minute epoch.45, 46, 56, 57
Once data are collected there are several analytic decisions, including cut-points, wear time, and data cleaning, to ensure that data can be meaningfully interpreted. Although the most accurate cut-point is yet to be established, counts per minute (cpm) of <100 are typically classified as sedentary time.11, 57-59
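As a minimal sketch of how the <100 cpm cut-point might be applied to 1-minute epoch data, the Python fragment below counts the epochs classified as sedentary. The function name and the example counts are hypothetical and are not taken from any of the cited analyses; only the threshold of 100 cpm comes from the text.

```python
# Illustrative only: function name and example data are assumptions.
SEDENTARY_CPM_THRESHOLD = 100  # counts per minute; <100 cpm is treated as sedentary


def sedentary_minutes(counts_per_minute, threshold=SEDENTARY_CPM_THRESHOLD):
    """Return the number of 1-minute epochs with counts below the threshold."""
    return sum(1 for c in counts_per_minute if c < threshold)


# Eight hypothetical 1-minute epochs.
example_counts = [0, 12, 99, 100, 450, 2300, 5, 0]
print(sedentary_minutes(example_counts))  # 5 (an epoch of exactly 100 cpm is not sedentary)
```

Because the comparison is strict, an epoch with exactly 100 counts is not classified as sedentary; a different cut-point can be examined by passing another `threshold` value.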
Wear time is a particularly important consideration. Participants are typically instructed to wear the monitor during “waking hours”, and to remove it for any water-based activity. As suggested by physical activity research, a minimum time of wear is generally required (for example, 10 hours per day59 and 4 days of wear including a weekend day60). Even so, individual wear time is highly variable and ‘missing data’ are usually indistinguishable from sleeping time, which should be excluded from sedentary time calculations. This introduces measurement error. In population-based studies, wear time for Actigraph activity monitors is usually estimated by automated programs, designed to detect long periods of low (mostly zero) counts.59 However, this can misclassify sedentary time as nonwear, and vice-versa.61 Methods of correcting for wear time include reporting sedentary time as a percentage of wear time, statistical adjustment in regression models, and using the residuals method.62
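The sketch below illustrates, under stated assumptions, how an automated program might flag nonwear and how sedentary time might then be expressed as a percentage of wear time. The 60-minute run of strictly zero counts is an assumed rule chosen for illustration; the text specifies only "long periods of low (mostly zero) counts", and published algorithms differ in run length and in their tolerance for brief non-zero interruptions.

```python
# Illustrative only: the 60-minute all-zero rule is an assumption, not the
# algorithm used in the studies cited in the text.
def wear_mask(counts, min_nonwear_run=60):
    """Flag each 1-minute epoch as worn (True) or not worn (False).

    A run of consecutive zero counts lasting at least `min_nonwear_run`
    minutes is treated as nonwear.
    """
    worn = [True] * len(counts)
    run_start = None
    # A non-zero sentinel is appended so that a trailing run of zeros is closed.
    for i, c in enumerate(list(counts) + [1]):
        if c == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_nonwear_run:
                for j in range(run_start, i):
                    worn[j] = False
            run_start = None
    return worn


def percent_sedentary(counts, threshold=100, min_nonwear_run=60):
    """Sedentary minutes (<threshold cpm) as a percentage of estimated wear time."""
    worn = wear_mask(counts, min_nonwear_run)
    wear_minutes = sum(worn)
    if wear_minutes == 0:
        return None  # no wear time detected, so the percentage is undefined
    sedentary = sum(1 for c, w in zip(counts, worn) if w and c < threshold)
    return 100.0 * sedentary / wear_minutes


# Hypothetical record: 90 min of zeros (flagged as nonwear), 30 min at 50 cpm,
# then 30 min at 500 cpm.
record = [0] * 90 + [50] * 30 + [500] * 30
print(sum(wear_mask(record)))     # 60 minutes of estimated wear
print(percent_sedentary(record))  # 50.0
```

The example also shows the misclassification risk noted above: a person who sat completely still for 90 minutes would be recorded as not wearing the monitor.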
Sedentary time data derived from the Actigraph activity monitor are typically reported either as average hours per day or as a percentage of total wear time. The manner of sedentary time accumulation provides important additional information, such as the length and intensity of each sedentary bout or the number of interruptions (breaks) in sedentary time.48, 49 Furthermore, as data are date- and time-stamped, there is potential for more detailed examination of both sedentary time and patterns during specific time periods, such as during work hours.
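To make the notion of accumulation patterns concrete, the following sketch derives sedentary bout lengths and the number of breaks from a sequence of 1-minute counts. It assumes that a break is any transition from a sedentary epoch (<100 cpm) to a non-sedentary epoch (≥100 cpm); the function names and data are hypothetical, and the cited studies may apply additional rules.

```python
# Illustrative only: the definition of a break used here is an assumption.
def sedentary_bouts(counts, threshold=100):
    """Return the length in minutes of each run of consecutive sedentary epochs."""
    bouts = []
    current = 0
    for c in counts:
        if c < threshold:
            current += 1
        elif current:
            bouts.append(current)
            current = 0
    if current:  # close a bout that runs to the end of the record
        bouts.append(current)
    return bouts


def count_breaks(counts, threshold=100):
    """Count transitions from a sedentary epoch to a non-sedentary epoch."""
    counts = list(counts)
    return sum(
        1 for prev, nxt in zip(counts, counts[1:]) if prev < threshold <= nxt
    )


minutes = [0, 10, 250, 30, 40, 50, 800, 900, 20]
print(sedentary_bouts(minutes))  # [2, 3, 1]
print(count_breaks(minutes))     # 2
```

A sedentary bout that is still in progress at the end of the record contributes to the bout list but not to the break count, because no interruption has been observed.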
Following is a description of two studies led by coauthor Charles Matthews that examined the validity of sedentary time derived from the Actigraph activity monitor (<100 cpm) against the criterion of the IDEEA monitor and the activPAL activity monitor.
Participants (n=19, mean age 40.1 years) concurrently wore the Actigraph activity monitor and the IDEEA monitor for 2 days,59 with the same average wear time for both devices (13.2 hours/day, SD 2.15). Sedentary time was similar for the accelerometer (8.63 hours/day, SD 1.90) and the IDEEA (8.53 hours/day, SD 1.86), and the two measures were highly correlated (ρ=0.59).59 This initial field study supported the use of the <100 cpm threshold for estimating sedentary time.11, 63
In a second study, 86 participants (87% women; mean age 52.7 years, SD 8.6 years) simultaneously wore an Actigraph activity monitor and activPAL for 7 consecutive days. For this analysis, only valid days that had similar estimated wear times for both devices (± 30 minutes) were considered. Sedentary time derived from the Actigraph activity monitor (<100 cpm) was compared with that from the activPAL (sitting and lying down) over an average of 4.5 observed days per person, and an average wear time of 14.3 hours per day (SD=1.5) for each device.
On average, recorded sedentary time was lower for the Actigraph activity monitor (8.7 [SD=1.6] hours/day, or 60.9%) than for the activPAL (9.0 [SD=1.8] hours/day, or 63.4%; both p=0.01), but the correlation between the measures was relatively high (ρ=0.76, p < 0.01). Interestingly, Bland–Altman analysis64 (Figure 2) showed a small mean difference (−0.34 hours) and wide 95% limits of agreement (2.11 to −2.79 hours). This indicates that the Actigraph activity monitor has minimal bias overall, but can both substantially over- and under-estimate sedentary time compared with the activPAL.
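For readers unfamiliar with the Bland–Altman approach, the sketch below computes the mean difference (bias) and 95% limits of agreement using the conventional formula of bias ± 1.96 × SD of the paired differences. The paired values are invented for illustration and are not the study data. Under that formula, the limits reported above are symmetric about the mean difference (−0.34 ± 2.45 hours), which would correspond to an SD of the differences of about 1.25 hours.

```python
# Illustrative only: the paired values below are made up, not study data.
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b.

    Bias is the mean of the paired differences (a - b); the 95% limits of
    agreement are bias +/- 1.96 times the sample SD of those differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical sedentary hours/day from two devices for five people.
device_a = [8.1, 9.4, 7.2, 10.0, 8.8]
device_b = [8.9, 9.0, 8.5, 9.6, 9.5]
bias, lower, upper = bland_altman(device_a, device_b)
print(f"bias={bias:.2f} h, limits of agreement={lower:.2f} to {upper:.2f} h")
```

A small bias combined with wide limits, as in the activPAL comparison, indicates good agreement on average but potentially large disagreement for any one individual.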
These two validity studies imply that Actigraph activity monitors provide useful estimates of sedentary time in the population and that they are sufficiently accurate to rank individuals by their level of sedentary time. The width of the limits of agreement observed warrants further study and suggests some caution is required when using indirect measures of sedentary time derived from only body motion. Instruments that measure body position more directly may be preferable in studies that require precise and accurate measures of sedentary time.
The incorporation of Actigraph activity monitor measures into the 2003/04 and 2005/06 NHANES was an important development in the field of physical activity and sedentary time research. With data from over 14,000 participants, it demonstrated the feasibility and utility of using these devices on a large scale. The inclusion of device-based measures in current65, 66 and future national health surveys will enable cross-country comparisons of levels of physical activity and sedentary time, as well as the ability to monitor population trends in these behaviors.
More-sophisticated systems for measuring time spent in various postures (e.g., sitting vs standing/upright) using more direct measures of body position have recently been developed.50, 51, 67-69 In addition, new approaches for translating more densely sampled data from hip-mounted accelerometers (e.g., 1- or 10-second epochs; raw data) to classify different types of behavior are also on the horizon.70-72 These new instruments and analytic approaches appear to provide more accurate and precise estimates of time spent in sedentary behaviors than were reported with the Actigraph 7164 activity monitor. There is also now the potential for the integration of multiple information sources, such as accelerometry, inclinometers, physiologic monitors, GPS technology, and behavioral logs.
In summary, key directions for future research in device-based measures of time spent sedentary are:
In 2003/04 and 2005/06, the large, population-representative NHANES included both self-reported (global sitting time, TV viewing time, computer time) and device-based (accelerometer) measures of sedentary time. These data provide the unique opportunity to examine, within one sample, the descriptive epidemiology of sedentary time in the U.S. using a variety of measures. Rather than reporting the relationships of the various sedentary measures (which have been described previously73), the aim of this section is to examine similarities and differences among the measures in the patterning of sedentary time by gender, race/ethnicity and age.
The relevant NHANES methods are described at http://www.cdc.gov/nchs/nhanes.htm.45 The National Center for Health Statistics Ethics Review Board approved the protocols and written informed consent was obtained. For this study, 2003–2006 data from adult participants (≥20 years) were used. The study did not vary in protocol and had high response rates across this period.45
In the household interviews, participants were asked to report the time they spent watching TV or videos (TV time) and using a computer or playing computer games (computer use) on an average day over the last 30 days. The categoric responses were collapsed into three dichotomous sedentary markers: TV time, computer use, and screen time (combined TV time and computer use). Cut-points were ≥2 hours per day for TV, ≥1 hour per day for computer use, and ≥3 hours for screen time. These were based on the availability of sufficient responses in all subpopulations, low rates of computer use in older age groups, and values used in previous research.74 Participants were also asked to best describe their usual daily activities (i.e., work, domestic activities, or general activities throughout the day). The response options were collapsed into a dichotomous variable sitting, which was yes if the respondent answered yes to the first option (“sitting during the day and not walking about very much”) or no if the respondent answered yes to any of the remaining options.
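The dichotomous markers described above can be sketched as follows. For simplicity, the fragment assumes that each response has already been converted to a numeric number of hours per day; the original responses were categoric, and the way in which they were combined to form screen time is not detailed here. The function and key names are hypothetical, and only the three cut-points come from the text.

```python
# Illustrative only: numeric hours are an assumed input format; the survey
# responses themselves were categoric.
def sedentary_markers(tv_hours, computer_hours):
    """Return the three dichotomous screen-based markers for one respondent."""
    return {
        "high_tv": tv_hours >= 2,                       # >=2 hours/day TV time
        "high_computer": computer_hours >= 1,           # >=1 hour/day computer use
        "high_screen": tv_hours + computer_hours >= 3,  # >=3 hours/day screen time
    }


print(sedentary_markers(tv_hours=2, computer_hours=0.5))
# {'high_tv': True, 'high_computer': False, 'high_screen': False}
```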
An accelerometer (Actigraph model 7164; Actigraph, LLC, Fort Walton Beach, Florida) was worn on the right hip during waking hours (except for water-based activities) for 7 days. Data cleaning and automated wear time estimation were undertaken as previously described.60 Daily sedentary time (<100 cpm) was calculated and standardized for wear time using the residuals method.62 Data are reported as averages for valid days (≥10 hours wear, counts <20,000, monitor returned in calibration), limited to participants who provided at least 4 valid days of observation.75
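The following sketch shows one way the valid-day rule and the residuals method might be implemented. It applies only the wear-time criterion (≥10 hours) and the requirement of at least 4 valid days; the count ceiling and calibration checks are omitted. The residuals method is written in one common formulation, in which sedentary time is regressed on wear time across participants and each residual is added to the sample mean. This formulation, the function names, and the data are assumptions and may differ from the procedure in the cited reference.

```python
# Illustrative only: names, data and the exact residuals formulation are assumptions.
from statistics import mean

MIN_WEAR_HOURS = 10  # a valid day has at least 10 hours of wear
MIN_VALID_DAYS = 4   # a participant needs at least 4 valid days


def participant_means(days):
    """days: list of (wear_hours, sedentary_hours) tuples for one participant.

    Returns (mean wear, mean sedentary) over valid days, or None if the
    participant has fewer than MIN_VALID_DAYS valid days.
    """
    valid = [(w, s) for w, s in days if w >= MIN_WEAR_HOURS]
    if len(valid) < MIN_VALID_DAYS:
        return None
    return mean(w for w, _ in valid), mean(s for _, s in valid)


def residual_standardize(wear, sedentary):
    """Standardize sedentary time for wear time using regression residuals.

    Fits sedentary = a + b * wear by ordinary least squares across
    participants and returns mean(sedentary) + residual for each participant.
    """
    w_bar, s_bar = mean(wear), mean(sedentary)
    sxx = sum((w - w_bar) ** 2 for w in wear)
    if sxx == 0:  # identical wear time for everyone: nothing to adjust
        return list(sedentary)
    sxy = sum((w - w_bar) * (s - s_bar) for w, s in zip(wear, sedentary))
    slope = sxy / sxx
    # residual = s - (s_bar + slope * (w - w_bar)); adding s_bar back gives:
    return [s - slope * (w - w_bar) for w, s in zip(wear, sedentary)]


week = [(11, 6), (9, 5), (12, 7), (13, 8), (10, 6)]  # the 9-hour day is invalid
print(participant_means(week))  # (11.5, 6.75)
print(residual_standardize([12, 14, 16], [6, 7, 8]))  # [7.0, 7.0, 7.0]
```

In the second example sedentary time rises exactly in proportion to wear time, so all three hypothetical participants receive the same standardized value.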
Data were analyzed in STATA version 11.0 (College Station, TX, Stata Corporation), with significance set at p<0.05. Data were pooled from 2003–2006 to obtain sufficient numbers for stratified analyses. No significant changes (2003/04 to 2005/06) were observed. Self-reported TV time and computer use data were available for 10,012 adults, self-reported sitting data were available for 10,009, and ≥4 days of valid accelerometer data were available for 6,235.
Mean accelerometer-derived sedentary time (hours per day) and the prevalences of sitting, ≥2 hours per day TV time, ≥1 hour per day computer use, and ≥3 hours per day screen time were compared across gender, race/ethnicity categories (self-reported non-Hispanic white, Mexican-American, and non-Hispanic black), and 10-year age bands using marginal means from linear (accelerometer) or population marginal probabilities from logistic (self-reported) regression models. In view of the complex survey design, and to ensure population representativeness, all models used linearized variance estimation and, except when testing interactions, were weighted for selection probabilities and nonresponse. The weights provided by NHANES were further reweighted to correct for the large amount of missing/invalid accelerometer data.75 The data are population-representative.
After adjusting for age and race/ethnicity, there were significant gender differences in all measures of sedentary time, with the direction and magnitude of the difference depending on the measure. For the domain-specific measures, prevalence was lower in women than men for high TV time (64.9% [95% CI=63.0%, 66.8%] vs 69.2% [67.6%, 70.7%], p<0.001), computer use (27.1% [25.1%, 29.1%] vs 31.3% [27.9%, 32.8%], p=0.034), and screen time (48.3% [46.2%, 50.3%] vs 52.0% [49.7%, 54.4%], p=0.012). However, more women than men reported sitting for most of the day (26.2% [24.4%, 28.0%] vs 21.5% [20.1%, 22.9%], p<0.001). This was consistent with the accelerometer findings (mean 8.50 [8.41, 8.59] hours/day in women vs 8.35 [8.25, 8.45] hours/day in men, p=0.006), although the magnitude of this difference was relatively small.
After adjusting for age and gender, Mexican Americans were significantly less sedentary (p<0.05) than non-Hispanic whites and non-Hispanic blacks according to all sedentary time measures, with the exception of high levels of TV time. Here, the prevalence was similar for Mexican Americans (69.0% [66.3%, 71.5%]) and non-Hispanic whites (67.6% [65.8%, 69.3%], p=0.383), but significantly higher for non-Hispanic blacks (79.1% [75.7%, 82.5%], p<0.01). Non-Hispanic blacks also had a higher prevalence of high screen time than non-Hispanic whites (65.8% [61.8%, 69.7%] vs 51.1% [48.6%, 53.7%], p<0.001), but these two racial/ethnic groups did not differ significantly for any other measure.
Figure 3 shows the mean (a) and the prevalence estimates (b) of the sedentary time measures by age group (adjusted for gender and race/ethnicity). With the exception of computer use (where prevalence decreased with age), mean sedentary time and prevalence estimates tended to increase with age, but with a decrease between the 20–29 year and the 30–39 year age groups for all measures except sitting (which increased steadily with age).
Figure 4 expands on Figure 3 by showing the mean (A and B) and the prevalence (C–H) estimates of the sedentary time measures by racial/ethnic group across age categories separately for men and for women. Among men, age trends in sedentary time differed significantly across racial/ethnic groups according to accelerometer-derived sedentary time (F(df: 10, 21)=3.24, p=0.01), but not according to the self-reported measures (p≥0.1). Among women, the age trends differed significantly by race/ethnicity according to the self-reported measures (sitting, screen time, TV time, and computer use; all p<0.05), but not the accelerometer-derived measure (p>0.1). Screen time results (omitted) were very similar to TV time. For a complete summary of results, see Appendixes A and B (www.ajpmonline.org).
In summary, the sedentary measures were consistent to some extent in identifying populations comparatively more or less sedentary, with older (60+) adults generally the most sedentary, and Mexican Americans generally the least sedentary. However, these subgroup differences are not apparent if only a single sedentary time measure is assessed. For example, if NHANES had measured only TV time, then the strong and largely consistent differences between Mexican Americans and non-Hispanic whites would not have been observed. If accelerometer-derived sedentary time had been the only measure, then important differences in specific sedentary behaviors between men and women and across the lifespan would not have been seen. Thus, wherever possible, both domain-specific and overall measures of sedentary time (preferably device-based) should be assessed. Furthermore, the inclusion of time spent sedentary in other domains, such as work and travel, should also be considered.
This paper provides an overview of the reliability and validity of current self-reported and device-based (primarily the Actigraph activity monitor) population-based measures of time spent sedentary. The 2003–2006 NHANES was utilized as an example of how various measures of sedentary time identify different populations as ‘at-risk’.
Given that both self-reported and device-based instruments capture important aspects of sedentary behavior, it is recommended that wherever possible, both measures should be used for population-monitoring of sedentary time. For self-reported measures, monitoring should extend beyond measures of overall sitting to include the various domains. The battery of questions should be succinct, consistent in their terminology and administration (to allow comparison across time, and across different populations), and based on reliable and valid measures. Device-based measures should be affordable, distinguish among various postures, have relatively low participant burden, and where possible, integrate multiple sources of information that provide greater context for the behaviors observed. This paper identified key research directions for the development and refinement of such measures.
Healy is supported by an NHMRC (#569861) / National Heart Foundation of Australia (PH 08B 3905) Postdoctoral Fellowship. Clark is supported by an Australian Postgraduate Award and a Queensland Health Core Research Infrastructure grant. Winkler and Gardiner are supported by a Queensland Health Core Research Infrastructure grant and by NHMRC Program Grant funding (#569940). NHANES data used in this study were collected by the National Center for Health Statistics, CDC.
No financial disclosures were reported by the authors of this paper.