Am J Epidemiol. 2013 March 15; 177(6): 576–585.
Published online 2013 February 22. doi:  10.1093/aje/kws269
PMCID: PMC3626043

Physical Activity Assessment: Biomarkers and Self-Report of Activity-Related Energy Expenditure in the WHI


We used a biomarker of activity-related energy expenditure (AREE) to assess measurement properties of self-reported physical activity and to determine the usefulness of AREE regression calibration equations in the Women's Health Initiative. Biomarker AREE, calculated as the total energy expenditure from doubly labeled water minus the resting energy expenditure from indirect calorimetry, was assessed in 450 Women's Health Initiative participants (2007–2009). Self-reported AREE was obtained from the Arizona Activity Frequency Questionnaire (AAFQ), the 7-Day Physical Activity Recall (PAR), and the Women's Health Initiative Personal Habits Questionnaire (PHQ). Eighty-eight participants repeated the protocol 6 months later. Reporting error, measured as log(self-report AREE) minus log(biomarker AREE), was regressed on participant characteristics for each instrument. Body mass index was associated with underreporting on the AAFQ and PHQ but overreporting on PAR. Blacks and Hispanics underreported physical activity levels on the AAFQ and PAR, respectively. Underreporting decreased with age for the PAR and PHQ. Regressing log(biomarker AREE) on log(self-reported AREE) revealed that self-report alone explained minimal biomarker variance (R2 = 7.6%, 4.8%, and 3.4% for AAFQ, PAR, and PHQ, respectively). R2 increased to 25.2%, 21.5%, and 21.8%, respectively, when participant characteristics were included. Six-month repeatability data were used to adjust for temporal biomarker variation, improving R2 to 79.4%, 67.8%, and 68.7% for AAFQ, PAR, and PHQ, respectively. Calibration equations “recover” substantial variation in average AREE and valuably enhance AREE self-assessment.

Keywords: biomarkers, measurement error, physical activity, postmenopausal women

Measurement error in self-reported lifestyle behaviors is an impediment to advancing credible programs for the prevention of obesity and lifestyle-related chronic diseases, such as diabetes, cardiovascular disease, and cancer (1). A lack of consistency in results exists across observational studies of diet, physical activity level, and other modifiable risk factors in relation to several major chronic diseases. Some inconsistencies might be attributable to different study populations with variations in underlying susceptibility or a general lack of knowledge about certain confounding factors, but a large part of the problem may be attributed to measurement error in self-reporting (2–5).

Measurement error in self-reported physical activity levels has been less frequently studied than has dietary misreporting, but evidence suggests that the same phenomenon occurs. In 2008, Troiano et al. (2) reported that 51% of US adults in the National Health and Nutrition Examination Survey (NHANES) were adherent to national physical activity recommendations when data were gathered via self-report, but only 5% met the recommended guidelines based on accelerometer data, which suggests gross misreporting in self-assessment. In a 2008 review of 20 studies that included both self-reported and objective measures of physical activity, Neilson et al. (3) reported that the face validity of self-reported physical activity was very poor compared with that of objective biomarkers. Whether these errors in self-report of physical activity level are random or systematic has been understudied. In one of the few studies in which systematic bias in self-reported physical activity was reported, investigators used standard questionnaires and diaries, as well as objective measures from accelerometry, in 154 adults. The authors reported systematic bias in self-reported data by age, sex, and body mass index (BMI; measured as weight (kg)/height (m)²) and commented that the results have implications for epidemiologic studies in which researchers test the associations between self-reported physical activity and disease outcomes (5).

We previously reported that measurement error in dietary self-report distorts relative risk estimates for the associations of diet with cancer, cardiovascular disease, frailty, renal function, and diabetes and that the magnitude of measurement error varies by the type of dietary assessment instrument used (6–12). We have developed and applied regression calibration equations utilizing objectively measured nutritional biomarker data to correct the measurement error in dietary self-report and have applied these data to subsequent disease association studies (6, 13). Here we build on this approach by using an objective measure of physical activity level to understand the measurement properties of self-reported physical activity assessment instruments.


Overview of the Women's Health Initiative Observational Study

The Women's Health Initiative (WHI) Observational Study is a prospective cohort study that enrolled 93,676 postmenopausal women from 1994 to 1998 throughout the United States. WHI Observational Study design details have been published previously (14). Briefly, women were eligible for the WHI Observational Study if they were postmenopausal, 50–79 years of age at enrollment, and likely to be residing in the same area for at least 3 years. Women were excluded if they had competing risk factors (medical conditions with a predicted survival of less than 3 years), had adherence/retention issues (alcohol/drug dependency, mental illness, or dementia), or were actively participating in any clinical trial. Participants attended baseline and year 3 in-person clinical visits; all other annual follow-up was completed by mail. Study procedures were approved by the institutional review boards of each clinical center and the Clinical Coordinating Center at the Fred Hutchinson Cancer Research Center, Seattle, Washington. All women signed written informed consent.

The Nutrition and Physical Activity Assessment Study (NPAAS) was an ancillary study to the WHI Observational Study and enrolled 450 participants from 9 of the 40 WHI clinical centers between 2007 and 2009. Recruitment was structured to oversample women who were: 1) black or Hispanic; 2) at the lower and higher ranges of the BMI distribution (<18.5 and ≥30.0 at the WHI Observational Study baseline clinical visit); and 3) 50–59 years of age at WHI enrollment. Letters of invitation were sent to potential participants, and women who responded were screened via a telephone interview. Women were excluded if they had weight instability, travel plans during the study period, or any medical condition that would preclude participation. Of the 2,184 women screened, 20.6% (n = 450) enrolled and completed the study. A subsample of 88 women (19.6%) repeated the entire protocol approximately 6 months later to provide data on the repeatability of study measures. This subsample was a convenience sample of women who were willing and eligible to repeat the protocol. Women who did not successfully complete the first 2 study visits (i.e., women who experienced difficulties with timed urine collections or indirect calorimetry-related claustrophobia) were not eligible. All study procedures were approved by the institutional review boards of the 9 participating institutions plus that of the Clinical Coordinating Center at the Fred Hutchinson Cancer Research Center, Seattle, Washington. Participants provided written informed consent and received $100.00 upon study completion.

Study procedures

Activities included 2 visits to local WHI clinics, with additional study activities completed at home (Figure 1). At visit 1, participants supplied informed consent and anthropometry measures and completed the first part of a doubly labeled water (DLW) protocol, a food frequency questionnaire, the WHI Personal Habits Questionnaire (PHQ) (15), and questions on body image and other lifestyle habits. In addition, participants completed either the Arizona Activity Frequency Questionnaire (AAFQ) (16) or the 7-Day Physical Activity Recall (PAR) (17). At visit 2, participants completed the final part of the DLW protocol, completed either the AAFQ or PAR (whichever was not completed at visit 1), and underwent indirect calorimetry to measure resting energy expenditure (REE). Demographic characteristics had been collected previously in WHI.

Figure 1.
Women's Health Initiative Nutrition and Physical Activity Assessment Study design, 2007–2009. DLW, doubly labeled water; FFQ, food frequency questionnaire; NPAAS, Nutrition and Physical Activity Assessment Study.

Objective measure of physical activity

We obtained an objective measure (biomarker) of physical activity by measuring total energy expenditure (TEE) by DLW and REE by indirect calorimetry. We then computed activity-related energy expenditure (AREE) as AREE = TEE − REE. Analyses were repeated with the biomarker defined as AREE = 0.9(TEE) − REE, with the 0.9 factor intended to provide an adjustment for the thermic effect of food (18).
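As a minimal sketch of the arithmetic just described, the two biomarker definitions can be written as a single function. The function name, argument names, and example values are hypothetical and are not study data.

```python
def aree_biomarker(tee_kcal: float, ree_kcal: float, tef_adjust: bool = False) -> float:
    """Activity-related energy expenditure (kcal/day) from TEE and REE.

    With tef_adjust=True, TEE is scaled by 0.9 as an adjustment for the
    thermic effect of food, matching the alternative definition in the text.
    """
    factor = 0.9 if tef_adjust else 1.0
    return factor * tee_kcal - ree_kcal


# Hypothetical values for illustration only (not from the study).
tee, ree = 2100.0, 1400.0
print("AREE = TEE - REE:       ", aree_biomarker(tee, ree))
print("AREE = 0.9(TEE) - REE:  ", aree_biomarker(tee, ree, tef_adjust=True))
```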

TEE was estimated from DLW protocol results using our standard procedure (13, 19). DLW is considered the gold standard for assessing short-term energy turnover because in weight-stable individuals, it measures energy intake and volitional and nonvolitional energy expenditure (i.e., total energy expenditure) over a 2-week period (19–21). A 6.5% quality-control failure rate occurred for the DLW procedure. Of these failures, half were due to low tracer enrichments or lack of equilibration, whereas the others were due to dilution space or other external reproducibility issues.

REE was estimated by indirect calorimetry using a standard protocol (22) with either a DeltaTrac II Respiratory Gas Analyzer (Datex-Ohmeda, Inc., Madison, Wisconsin) or a Sensormedics VMAX metabolic cart (SensorMedics, Inc., Yorba Linda, California). All metabolic carts were calibrated each day according to the manufacturer instructions, and gasses were carefully monitored during each test. Participants arrived on day 15 after a 12-hour fast and rested in a semireclined position in a thermally neutral room for 30 minutes before a 30-minute test under a canopy (8 clinics) or using a nonrebreathing fitted mouthpiece (1 clinic). Previous research has shown no apparatus-related differences in REE measurements (23, 24). Data points were obtained every minute, but the first 10 minutes of collected data were excluded from data analysis because 10 minutes are needed to achieve steady-state metabolism. Steady state was defined as 10 minutes during which the oxygen consumption, minute ventilation, and respiratory exchange ratio did not vary by more than 10% (22). Participants who did not reach a steady state or did not have at least 10 minutes of useable data (n = 16) were not included in AREE biomarker analyses.

Self-reported assessment of physical activity

Participants completed 3 self-report physical activity instruments: the interviewer-administered PAR (17, 25), the self-administered AAFQ (16), and the self-administered WHI PHQ (15). The PAR included questions about time spent in sleep and performing moderate, hard, and very hard intensity activities for each segment of each day over the 7 days before the interview. Total daily energy expenditure was estimated by assigning to each activity category standard values of intensity expressed as multiples of resting metabolic rate or metabolic equivalents (METs). For example, 1 MET is approximately equal to 1 kilocalorie per kilogram per hour in sleep, and 1.5, 4.0, 6.0, and 10.0 METs were assigned to light, moderate, hard, and very hard activities, respectively. We next multiplied the METs by reported time spent in each activity, summed over all intensities, and multiplied by body weight to yield a summary score for energy expenditure in kilocalories per day (17, 26). Hours spent in light activities were derived by summing the time spent in the other reported activity categories and subtracting the total number of hours from 24. Final values represent 24-hour energy expenditure, including sleep. However, for the analyses presented in the present article, we omitted sleep (computed as reported weekly sleep hours × 1 MET × kg of body weight) by subtracting the sleep kilocalories from total activity.
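The PAR scoring steps above can be sketched as follows. This is an illustrative reading of the description, not the scoring program used in the study: the function and variable names are hypothetical, and the sketch assumes the input is average hours per day in each reported category (the text describes sleep as weekly hours, which is equivalent after dividing by 7).

```python
from typing import Dict

# MET values per intensity category, as stated in the text.
PAR_METS = {"sleep": 1.0, "light": 1.5, "moderate": 4.0, "hard": 6.0, "very_hard": 10.0}


def par_activity_kcal_per_day(daily_hours: Dict[str, float], weight_kg: float) -> float:
    """Energy expenditure excluding sleep (kcal/day) from PAR-style reports.

    daily_hours must contain 'sleep', 'moderate', 'hard', and 'very_hard'
    (average hours per day). Light activity is derived as the remainder of
    the 24-hour day, as described in the text.
    """
    reported = sum(daily_hours[k] for k in ("sleep", "moderate", "hard", "very_hard"))
    light = 24.0 - reported
    if light < 0:
        raise ValueError("reported hours exceed 24 per day")

    hours = dict(daily_hours, light=light)
    # 24-hour total: METs x hours, summed over categories, times body weight.
    total_kcal = sum(PAR_METS[k] * hours[k] for k in PAR_METS) * weight_kg
    # Sleep kilocalories: sleep hours x 1 MET x body weight.
    sleep_kcal = hours["sleep"] * PAR_METS["sleep"] * weight_kg
    return total_kcal - sleep_kcal


# Hypothetical participant: 8 h sleep, 1 h moderate, 0.5 h hard, 70 kg.
example = {"sleep": 8.0, "moderate": 1.0, "hard": 0.5, "very_hard": 0.0}
print(par_activity_kcal_per_day(example, weight_kg=70.0))
```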

The AAFQ is a self-administered mark-sense booklet adapted from the Minnesota Leisure Time Physical Activity Questionnaire (27) and was previously validated using DLW in a small sample of adults (16). The AAFQ includes specific activities grouped by domain (e.g., occupation, leisure, recreation, home maintenance), with categorical responses for frequency and duration of leisure, recreational, personal care, and household chore activities over the previous 4 weeks. Response options for each activity include the number of days and the length of time in hours. The average daily energy expenditure for each recorded activity was calculated as the average number of hours per day reported for each activity multiplied by an estimated REE (computed using the Mifflin equation) (18) and then multiplied by the METs intensity value assigned to the activity (28). Similar to what was seen with the PAR, AAFQ data output also yielded a summary score in kilocalories per day (including sleep). As done for the PAR, we omitted sleep by subtracting sleep kilocalories, which were computed as [sleep hours × (REE/24)]/4.184.
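A rough sketch of the AAFQ arithmetic follows, under several assumptions that the text does not spell out: REE is converted to an hourly rate (REE/24) before it is multiplied by hours and METs, the divisor 4.184 in the sleep formula is read as a kilojoule-to-kilocalorie conversion (so REE in that formula is taken to be in kJ/day), and the Mifflin equation is given in its commonly quoted rounded form for women. All function names and example values are hypothetical.

```python
KJ_PER_KCAL = 4.184


def mifflin_ree_kcal_per_day(weight_kg: float, height_cm: float, age_years: float) -> float:
    """Mifflin-St Jeor resting energy expenditure for women (kcal/day),
    in the commonly quoted rounded form."""
    return 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years - 161.0


def activity_kcal_per_day(hours_per_day: float, met: float, ree_kcal_per_day: float) -> float:
    """Daily energy for one activity: hours/day x hourly REE x MET value.
    Using REE/24 as the hourly rate is an assumption of this sketch."""
    return hours_per_day * (ree_kcal_per_day / 24.0) * met


def sleep_kcal(sleep_hours: float, ree_kj_per_day: float) -> float:
    """Sleep energy as [sleep hours x (REE/24)] / 4.184, with REE in kJ/day
    (assumed) so that the result is in kilocalories."""
    return sleep_hours * (ree_kj_per_day / 24.0) / KJ_PER_KCAL


# Hypothetical participant: 70 kg, 165 cm, 65 years old.
ree = mifflin_ree_kcal_per_day(70.0, 165.0, 65.0)
print("REE (kcal/day):", ree)
print("1 h/day at 3.5 METs (kcal/day):", activity_kcal_per_day(1.0, 3.5, ree))
print("8 h sleep (kcal):", sleep_kcal(8.0, ree * KJ_PER_KCAL))
```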

The WHI PHQ is a short, self-administered questionnaire that inquires about the usual frequency and duration of walking activity outside the home, other mild recreational activity (e.g., slow dancing, bowling), moderate recreational activity (e.g., outdoor biking, easy swimming), and strenuous recreational activity (e.g., aerobics, jogging) (15). To characterize duration of physical activity, women chose 1 out of 5 frequency categories that ranged from never to 5 or more days per week. In addition, participants were asked how often they walked outside the home for more than 10 minutes (ranging from never to ≥7 times/week), the usual duration of their walking episodes, and their usual walking speed (casual strolling, normal, fairly fast, or very fast). Similar to how we scored the PAR and AAFQ, standard intensity values (28), expressed as METs, were assigned to each activity item, multiplied by reported duration, and then summed to compute AREE in MET-hours per week. Because the PHQ only obtains data on recreational activity, we used data previously collected in WHI (years 3 and baseline for the primary and reliability studies, respectively) to obtain estimates for housework, yard work, sitting, sleeping, and all other activities to be able to create total daily AREE. We assigned METs to each of the activity categories using standard algorithms and 1.5 METs to all other nonspecified activities (28).

Statistical analysis

Our analytic goals included 1) understanding measurement properties of self-reported physical activity; 2) studying whether a biomarker of AREE could be used to calibrate (correct) the corresponding self-report assessments; and 3) characterizing the reliability of the self-reported data and biomarkers in a subsample of participants. We used descriptive statistics to characterize the study population and to assess the distributions of the self-reported measures of physical activity from the 3 instruments (PAR, AAFQ, and PHQ) and the objective biomarkers. To compare energy expenditure from the self-reported questionnaires with that from the biomarker using consistent units, we converted MET values to kilocalories for the PAR and PHQ (AAFQ output includes kilocalories). We computed daily PHQ kilocalories as total MET-hours/week × body weight in kg/7. For the PAR, we computed daily activity kilocalories as PAR TEE − PAR REE, where PAR REE = 1 kcal/kg/hour × body weight in kg × 24 hours. The biomarkers and self-reported values were transformed by the natural logarithm to improve the normal distributional approximations, and geometric means and 95% confidence intervals are presented. We examined the reliability of the biomarker and self-reported data by using unadjusted correlations to assess the agreement between the primary and reliability samples. Our principal analyses do not include an adjustment for the thermic effect of food because participants were in the fasting state.
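The unit conversions and the geometric mean described above can be illustrated with a short sketch. It assumes 1 MET is approximately 1 kcal per kilogram per hour, as stated for the PAR; the function names and example values are hypothetical.

```python
import math
from typing import Sequence


def phq_kcal_per_day(met_hours_per_week: float, weight_kg: float) -> float:
    """Daily PHQ kilocalories: MET-hours/week x body weight (kg) / 7."""
    return met_hours_per_week * weight_kg / 7.0


def par_activity_kcal(par_tee_kcal: float, weight_kg: float) -> float:
    """Daily PAR activity kilocalories: PAR TEE minus PAR REE,
    where PAR REE = 1 kcal/kg/hour x body weight (kg) x 24 hours."""
    par_ree = 1.0 * weight_kg * 24.0
    return par_tee_kcal - par_ree


def geometric_mean(values: Sequence[float]) -> float:
    """Exponentiated mean of natural logs; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical values for illustration only.
print(phq_kcal_per_day(70.0, 70.0))
print(par_activity_kcal(2400.0, 70.0))
print(geometric_mean([100.0, 400.0]))
```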

Our measurement error analysis relied on the assumption that the log-transformed AREE biomarkers adhered to a classical measurement error model with “errors” that are independent of the corresponding log-transformed self-report assessment. We defined reporting error as log(self-report AREE) − log(biomarker AREE). To understand the extent to which participant characteristics influence self-reported physical activity reporting error, we regressed the reporting error on pertinent characteristics (age, race/ethnicity, BMI) for each of the physical activity instruments.
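To make the definition concrete, the sketch below computes the reporting error and fits a least-squares line of that error on a single characteristic. The study models included age, race/ethnicity, and BMI together; the single-predictor fit, the function names, and the data here are simplifications chosen for illustration.

```python
import math
from typing import Sequence, Tuple


def reporting_error(self_report_aree: float, biomarker_aree: float) -> float:
    """log(self-report AREE) - log(biomarker AREE).
    Negative values indicate underreporting relative to the biomarker."""
    return math.log(self_report_aree) - math.log(biomarker_aree)


def ols_simple(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data: (BMI, self-report kcal/day, biomarker kcal/day).
rows = [(22.0, 650.0, 700.0), (27.0, 600.0, 720.0), (31.0, 560.0, 760.0), (35.0, 540.0, 800.0)]
bmi = [r[0] for r in rows]
errors = [reporting_error(r[1], r[2]) for r in rows]
intercept, slope = ols_simple(bmi, errors)
# A negative slope would mean underreporting grows with BMI in these made-up data.
print(intercept, slope)
```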

We next conducted a series of linear regression models to understand the association of the biomarker with the self-reported data and to estimate the fraction of the total variance in the log-transformed biomarker (R2) that could be explained by the self-report assessment and other pertinent variables. In other words, the R2 provides information about the strength of the AREE “signal” from the self-report and from other study subject characteristics. These regression analyses included age, race, and BMI, and results included an R2 for each predictor variable, which estimates the contribution of each predictor to the total variability. Statistical analyses were conducted with R software, version 2.13.0 (R Development Core Team, Vienna, Austria) and SAS, version 9.2 (SAS Institute, Inc., Cary, North Carolina).
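The idea of comparing R2 across nested models can be sketched with a small least-squares routine. This is not the R or SAS code used in the study: it is a plain-Python illustration that solves the normal equations directly, assumes the predictors are not collinear, and uses made-up numbers.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes a is square and nonsingular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(y: Sequence[float], predictors: Sequence[Sequence[float]]) -> float:
    """R2 from regressing y on an intercept plus the given predictor columns."""
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    k = len(cols)
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    sse = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    sst = sum((yt - mean_y) ** 2 for yt in y)
    return 1.0 - sse / sst


# Made-up log(biomarker), log(self-report), and BMI values for illustration.
log_bio = [6.4, 6.6, 6.5, 6.9, 6.7, 7.0]
log_self = [6.0, 6.5, 6.1, 6.6, 6.2, 6.8]
bmi = [31.0, 27.0, 29.0, 22.0, 26.0, 21.0]
r2_self = r_squared(log_bio, [log_self])
r2_full = r_squared(log_bio, [log_self, bmi])
print(r2_self, r2_full, r2_full - r2_self)  # last value: increment from adding BMI
```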


The distributions of the demographic and lifestyle characteristics of the study participants are presented in Table 1. We met the recruitment goal to oversample women who were younger (<60 years of age at WHI enrollment), women who were at the extremes of the BMI distribution, and women who were black or Hispanic.

Table 1.
Demographic and Lifestyle Characteristics of Participants (n = 450), Women's Health Initiative Nutrition and Physical Activity Assessment Study, 2007–2009

Self-reported and objective biomarker estimates of AREE for the primary and reliability study are provided in Table 2. REE and AREE comprised, on average, 66.7% and 33.3% of TEE, respectively. Compared with the biomarker, the AAFQ and PAR both slightly underestimated AREE, whereas the PHQ substantially underestimated AREE by approximately 40%. Unadjusted Pearson correlation coefficients estimating the reliability between primary and reliability measures were 0.59, 0.75, 0.42, and 0.42 for TEE (from DLW), REE (from indirect calorimetry), AREE (TEE − REE), and calculated AREE (substituting the Mifflin equation (18) for measured REE), respectively (Figure 2). Intraclass correlation coefficients comparing primary measures with reliability measures of TEE, REE, AREE, and calculated AREE were 0.58, 0.77, 0.42, and 0.42, respectively.

Table 2.
Biomarker and Self-Reported Estimates of Energy in the Primary and Reliability Studies (n = 450), Women's Health Initiative Nutrition and Physical Activity Assessment Study, 2007–2009
Figure 2.
Comparisons between the primary and reliability study measures (n = 82) for total energy expenditure (TEE) from doubly labeled water, measured resting energy expenditure (REE) from indirect calorimetry, biomarker-assessed activity-related energy expenditure ...

Table 3 provides results from linear regression of reporting error on participant characteristics in which separate models are given for the 3 self-report instruments. Each physical activity assessment instrument demonstrates some systematic bias by personal characteristics, but the extent of the bias for related participant characteristics varies by instrument. For the AAFQ, AREE underreporting was greater for women with a higher BMI than for women with a normal BMI and for women who were black than for white participants. We found that in the PAR, activity underreporting decreased with age. Underreporting on the PAR was also predominant among those who were Hispanic and suggestive for black women compared with white women. The direction of the AREE misreporting in relation to BMI differed for the PAR, wherein women with a higher BMI tended to overreport physical activity on the PAR compared with those with lower BMI. Only the PHQ had systematic bias in reporting by age in which underreporting decreased with age. In each of these models, the R2 was small (ranging from 4.6 to 6.8); only a minor fraction of the reporting error variance for the 3 self-report instruments was explained by these selected participant characteristics.

Table 3.
Linear Regression of the Reporting Error on the Participant Characteristics (n = 450), Women's Health Initiative Nutrition and Physical Activity Assessment Study, 2007–2009

Table 4 gives regression coefficients from linear regression of log(AREE biomarker) on log(self-report AREE) and other participant characteristics for each self-report instrument. Even though age, race/ethnicity, and BMI were not consistently associated with systematic bias across assessment instruments (Table 3), we included all covariates in these models to help explain the variation in the biomarker. In general, for these models, the self-report explains a small amount of biomarker variation. The AAFQ, PAR, and PHQ assessments explained 7.6%, 4.8%, and 3.4% of the biomarker variation in these equations, respectively, but adding the covariates increased these values substantially to 25.2%, 21.5%, and 21.8%. We next used biomarker AREE data from the reliability study to create an adjusted R2 that targeted the variation in average AREE over a 6-month period, assuming the measurement error correlation between primary and reliability biomarker assessments to be zero. These adjusted R2 values were calculated by dividing the unadjusted R2 by the correlation between log(AREE) for the primary and reliability measures (0.317). Here we see that correcting for temporal variation in the biomarker added valuably to explaining biomarker variation, improving the R2 estimates to 79.4, 67.8, and 68.7 for the AAFQ, PAR, and PHQ, respectively. To confirm the robustness of this approach, we conducted a sensitivity analysis in which the primary and reliability measurement error correlations took respective values of ρ = 0, −0.1, and −0.2, with the negative values allowing for the fact that the primary and reliability samples were drawn at the extremes of the targeted 6-month period and the zero value corresponding to our primary adjusted R2 analyses. The negative values for ρ yielded slightly larger adjustment factors (0.361 and 0.433, respectively) and adjusted R2 values that were 16.7% and 26.7% smaller than when ρ = 0 (data not shown).
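The adjustment described above, for the case in which the measurement error correlation is zero, amounts to dividing each unadjusted R2 by the primary-reliability correlation of 0.317. The short sketch below repeats that division with the rounded values quoted in the text; the function name is hypothetical, and because the inputs are rounded, the results agree with the reported adjusted values only to within about 0.1 percentage point.

```python
RELIABILITY_CORRELATION = 0.317  # correlation of log(AREE), primary vs. reliability


def adjusted_r2(r2_percent: float, reliability_corr: float = RELIABILITY_CORRELATION) -> float:
    """Unadjusted R2 (in percent) divided by the reliability correlation."""
    return r2_percent / reliability_corr


# Unadjusted R2 with covariates, and the adjusted values reported in the text.
unadjusted = {"AAFQ": 25.2, "PAR": 21.5, "PHQ": 21.8}
reported = {"AAFQ": 79.4, "PAR": 67.8, "PHQ": 68.7}

for name, r2 in unadjusted.items():
    value = adjusted_r2(r2)
    print(f"{name}: computed {value:.2f}, reported {reported[name]}")
```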

Table 4.
Percent of Biomarker Variation Explained From Regression Calibration of Biomarker on Self-Reported Physical Activity and Participant Characteristics (n = 450), Women's Health Initiative Nutrition and Physical Activity Assessment Study, 2007–2009 ...


In the present study of 450 postmenopausal women participating in the WHI Observational Study, we used a biomarker of AREE to characterize the measurement properties of 3 commonly used self-reported measures of physical activity. The PAR and AAFQ slightly underestimated activity, whereas the PHQ substantially underestimated activity compared with the biomarker. All 3 self-reported physical activity assessment tools captured only a small fraction (4.2%–8.5%) of AREE biomarker variation. However, when participant characteristics, such as age, race/ethnicity, and BMI were considered in linear regression models, larger but still modest fractions of the biomarker variation could be explained: 25.2% for the AAFQ, 21.5% for the PAR, and 21.8% for the PHQ. Large and meaningful improvements in variance were observed when using the reliability study data to correct for the measurement error in the biomarker, even when using a more conservative approach to computing the adjusted R2. It should be noted that the adjusted R2 values assume uncorrelated measurement errors between primary and reliability samples and can be expected to involve some overadjustment if such errors are positively correlated. Overall, there appears to be only a modest signal from the self-reported physical activity, suggesting that participant characteristics should be included when using calibrated AREE estimates in disease association studies.

Few other studies have examined the measurement properties of self-reported physical activity and specifically whether systematic reporting bias exists. Several studies have compared self-reported physical activity from standardized questionnaires and diaries to various objective measures assessed by pedometer, accelerometer, heart rate monitoring with or without concurrent accelerometry, or DLW. Many of these studies concluded that self-reported assessment is fair to poor compared with objective measures when studied under free-living conditions (2, 4, 5, 27, 29–33). Of these, only one study examined the role of systematic bias in reporting (5). Ferrari et al. used a measurement error model to examine 2 self-reported assessments (questionnaire and diaries) compared with an objective criterion measure obtained by accelerometer. They reported that measurement error varied by study participant characteristics, wherein greater misreporting occurred among persons who were older than 50 years of age and those above the median for sex-specific BMI (5). These findings are in general agreement with those we reported here. However, our results suggested some differential bias by instrument; for example, in the subset of WHI Observational Study women who completed the present study, BMI was associated with overreporting physical activity on the PAR but underreporting physical activity on the AAFQ and the PHQ.

Our regression calibration models explained only modest fractions of the variation in the AREE biomarker. Previous studies have shown that body mass is a strong determinant of energy expenditure that may account for up to half of the variance in total daily energy expenditure, and most of this is driven by the fat-free mass component (31, 34). Measures of body composition that are able to partition fat and lean muscle may be more informative in regression calibration models than BMI alone. Alternative explanations may be that our biomarker assessed TEE and AREE over a relatively short period of time (2 weeks), whereas the recall time period for the questionnaires in this study were the past 7 days and past 28 days for the PAR and AAFQ, respectively, and no reference time period was provided for the PHQ.

Many plausible reasons exist to explain the differences in physical activity patterns and related energy expenditure between self-reported and objectively measured physical activity. First, many commonly used physical activity assessment instruments measure only certain domains, such as recreation or sport, and do not assess occupational, transportation, or nonvolitional related activity. The AAFQ measures all of these domains (excluding nonvolitional), which may partly explain the slightly higher R2 in our regression models. Most questionnaires are limited by their ability to assess light activity, which is common among the postmenopausal women enrolled in the WHI. Second, most questionnaires ask respondents to self-assess a measure of activity intensity by reporting on or cueing responses to breathing intensity or sweating or perceived effort in general. However, these cues are more a measure of relative intensity and not necessarily specific to activity-related energy expenditure on an absolute scale. Third, MET values applied to activities and underlying resting energy expenditure are often used as constants on an absolute scale and may not account for variation in true activity energy expenditure across age and fitness distributions (28). Energy expenditure MET values are needed for older adults and for overweight and obese persons. Despite these limitations of self-reporting, numerous studies, including reports from the WHI, have demonstrated inverse associations between physical activity level and chronic disease risk (35). Healthy behaviors are highly correlated; persons engaging in one healthy behavior often engage in several healthy behaviors. It is possible that observational associations of self-reported physical activity and chronic disease risk are assessing one component of a healthy lifestyle that also includes optimal dietary patterns, refraining from smoking, and consuming alcohol in moderation.

Notable strengths of this study include the large, diverse study population. Our ability to recruit participants of varied ages, races/ethnicities, and BMIs allowed us to examine systematic bias in self-reported physical activity, which would not be possible with a more homogeneous population. Another strength is that we tested 3 commonly used assessment tools and we were able to examine whether systematic bias differed between these instruments. The reliability study provided important data confirming the reproducibility of our biomarker measures and the use of these data to improve the variability explained in the AREE. The correlations of the primary and reliability measures of REE and TEE were very good, whereas those for AREE were a bit more modest. Limitations include the short-term nature of the biomarker assessments, which captured the previous 2 weeks of energy expenditure. Although we had a standardized protocol for the indirect calorimetry to assess REE, there still may be some noise in this biomarker for which we were not able to account. There are neither true gold standards to assess AREE nor consensus on approaches to manage and interpret data collected from devices such as pedometers, accelerometers, and heart rate monitors. Further, because TEE is largely dependent upon body size, there may be aspects of body composition that may be more influential than BMI alone. Future studies may be able to incorporate measures of body composition, such as dual x-ray absorptiometry, into the regression calibration equations to determine whether a larger fraction of the variation in AREE can be explained. A limitation of nearly all measures of self-reported physical activity is the lack of sedentary activity assessment. However, despite the minimal measurement of sedentary activities, light/sedentary activities are typically assigned a MET value of 1 (28) (identical to MET values for sleep). Therefore, it is unlikely that sedentary activities would contribute meaningfully to the explained variance in the AREE biomarkers. The generalizability of our findings could be limited because the WHI is composed exclusively of postmenopausal women. Results may differ among younger and mixed gender populations. Finally, there might be other unmeasured covariates with systematic measurement error that were not assessed in the WHI that could contribute to the reporting error.

Physical activity is an important component of health promotion and disease prevention. In 2008, the US Department of Health and Human Services issued the first ever Physical Activity Guidelines for Americans (36), wherein all adults were advised to engage in at least 150 minutes/week of moderate intensity aerobic activity or 75 minutes/week of vigorous intensity aerobic activity. To fulfill this goal, credible advice must be provided by health care providers, scientists, and others with vested interests in the physical activity of the nation. Accurate assessment of physical activity to support such advice is important. The use of biomarker-calibrated estimates of self-report should be considered for use in population-based studies to reduce measurement error and to provide unbiased estimates of physical activity with disease outcomes. Future work will examine associations of calibrated AREE using the PHQ with various clinical outcomes in the WHI.


Author affiliations: Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington (Marian L. Neuhouser, Chongzhi Di, Lesley F. Tinker, Rebecca Seguin, Ross L. Prentice); Canyon Ranch Center for Prevention & Health Promotion, University of Arizona, Tucson, Arizona (Cynthia Thomson); Division of Research, Kaiser Permanente, Oakland, California (Barbara Sternfeld); Department of Epidemiology and Population Health, Albert Einstein College of Medicine, New York, New York (Yasmin Mossavar-Rahmani); Department of Preventive Medicine, Stanford University, Stanford, California (Marcia L. Stefanick, Stacy Sims); School of Medicine, University of Hawaii, Honolulu, Hawaii (J. David Curb); Department of Preventive Medicine, University at Buffalo, State University of New York, Buffalo, New York (Michael Lamonte); and Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, Tennessee (Karen C. Johnson).

Funding was provided by the National Heart, Lung, and Blood Institute (grant N01WH22110) and the National Cancer Institute (CA119171).

Program Office: National Heart, Lung, and Blood Institute, Bethesda, Maryland: Jacques Rossouw, Shari Ludlam, Joan McGowan, Leslie Ford, and Nancy Geller. Clinical Coordinating Center: Fred Hutchinson Cancer Research Center, Seattle, Washington: Garnet Anderson, Ross Prentice, Andrea LaCroix, and Charles Kooperberg. Investigators and Academic Centers: Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts: JoAnn E. Manson; MedStar Health Research Institute/Howard University, Washington, DC: Barbara V. Howard; Stanford Prevention Research Center, Stanford, California: Marcia L. Stefanick; The Ohio State University, Columbus, Ohio: Rebecca Jackson; University of Arizona, Tucson/Phoenix, Arizona: Cynthia A. Thomson; University at Buffalo, Buffalo, New York: Jean Wactawski-Wende; University of Florida, Gainesville/Jacksonville, Florida: Marian Limacher; University of Iowa, Iowa City/Davenport, Iowa: Robert Wallace; University of Pittsburgh, Pittsburgh, Pennsylvania: Lewis Kuller; and Wake Forest University School of Medicine, Winston-Salem, North Carolina: Sally Shumaker.

Conflict of interest: none declared.


1. Prentice RL, Sugar E, Wang CY, et al. Research strategies and the use of nutrient biomarkers in studies of diet and chronic disease. Public Health Nutr. 2002;5(6A):977–984.
2. Troiano RP, Berrigan D, Dodd KW, et al. Physical activity in the United States measured by accelerometer. Med Sci Sports Exerc. 2008;40(1):181–188.
3. Neilson HK, Robson PJ, Friedenreich CM, et al. Estimating activity energy expenditure: how valid are physical activity questionnaires? Am J Clin Nutr. 2008;87(2):279–291.
4. Atienza AA, Moser RP, Perna F, et al. Self-reported and objectively measured activity related to biomarkers using NHANES. Med Sci Sports Exerc. 2011;43(5):815–821.
5. Ferrari P, Friedenreich C, Matthews CE. The role of measurement error in estimating levels of physical activity. Am J Epidemiol. 2007;166(7):832–840.
6. Prentice RL, Shaw PA, Bingham S, et al. Biomarker-calibrated energy and protein consumption and increased cancer risk among postmenopausal women. Am J Epidemiol. 2009;169(8):977–989.
7. Prentice RL, Huang YS, Kuller LH, et al. Biomarker-calibrated energy and protein consumption and cardiovascular disease risk among postmenopausal women. Epidemiology. 2011;22(2):170–179.
8. Beasley JM, LaCroix AZ, Neuhouser ML, et al. Protein intake and incident frailty in the Women's Health Initiative Observational Study. J Am Geriatr Soc. 2010;58(6):1063–1071.
9. Beasley JM, Aragaki AK, LaCroix AZ, et al. Higher biomarker-calibrated protein intake is not associated with impaired renal function in postmenopausal women. J Nutr. 2011;141(8):1502–1507.
10. Tinker LF, Sarto GE, Howard BV, et al. Biomarker-calibrated dietary energy and protein intake associations with diabetes risk among postmenopausal women from the Women's Health Initiative. Am J Clin Nutr. 2011;94(6):1600–1606.
11. Prentice RL, Mossavar-Rahmani Y, Huang YS, et al. Evaluation and comparison of food records, recalls, and frequencies for energy and protein assessment by using recovery biomarkers. Am J Epidemiol. 2011;174(5):591–603.
12. Prentice RL, Huang Y. Measurement error modeling and nutritional epidemiology association analysis. Can J Stat. 2011;39(3):498–509.
13. Neuhouser ML, Tinker L, Shaw PA, et al. Use of recovery biomarkers to calibrate nutrient consumption self-reports in the Women's Health Initiative. Am J Epidemiol. 2008;167(10):1247–1259.
14. Design of the Women's Health Initiative clinical trial and observational study. The Women's Health Initiative Study Group. Control Clin Trials. 1998;19(1):61–109.
15. Johnson-Kozlow M, Rock CL, Gilpin EA, et al. Validation of the WHI brief physical activity questionnaire among women diagnosed with breast cancer. Am J Health Behav. 2007;31(2):193–202.
16. Staten LK, Taren DL, Howell WH, et al. Validation of the Arizona Activity Frequency Questionnaire using doubly labeled water. Med Sci Sports Exerc. 2001;33(11):1959–1967.
17. Sallis JF, Haskell WL, Wood PD, et al. Physical activity assessment methodology in the Five-City Project. Am J Epidemiol. 1985;121(1):91–106.
18. Mifflin MD, St. Jeor ST, Hill LA, et al. A new predictive equation for resting energy expenditure in healthy individuals. Am J Clin Nutr. 1990;51(2):241–247.
19. Cole TJ, Coward WA. Precision and accuracy of doubly labeled water energy expenditure by multipoint and two-point methods. Am J Physiol. 1992;263(5):E965–E973.
20. Schoeller DA. Measurement of energy expenditure in free-living humans by using doubly labeled water. J Nutr. 1988;118(11):1278–1289.
21. Schoeller DA, Hnilicka JM. Reliability of the doubly labeled water method for the measurement of total daily energy expenditure in free-living subjects. J Nutr. 1996;126(1):348S–354S.
22. Horner NK, Lampe JW, Patterson RE, et al. Indirect calorimetry protocol development for measuring resting metabolic rate as a component of total energy expenditure in free-living postmenopausal women. J Nutr. 2001;131(8):2215–2218.
23. Segal KR. Comparison of indirect calorimetric measurements of resting energy expenditure with a ventilated hood, face mask, and mouthpiece. Am J Clin Nutr. 1987;45(6):1420–1423.
24. Isbell TR, Klesges RC, Meyers AW, et al. Measurement reliability and reactivity using repeated measurements of resting energy expenditure with a face mask, mouthpiece, and ventilated canopy. JPEN J Parenter Enteral Nutr. 1991;15(2):165–168.
25. Hayden-Wade HA, Coleman KJ, Sallis JF, et al. Validation of the telephone and in-person interview versions of the 7-day PAR. Med Sci Sports Exerc. 2003;35(5):801–809.
26. Pereira MA, Fitzgerald SJ, Gregg EW, et al. A collection of Physical Activity Questionnaires for health-related research. Med Sci Sports Exerc. 1997;29(6 suppl):S1–S205.
27. Jacobs DR, Ainsworth BE, Hartman TJ, et al. A simultaneous evaluation of 10 commonly used physical activity questionnaires. Med Sci Sports Exerc. 1993;25(1):81–91.
28. Ainsworth B, Haskell W, Whitt M. Compendium of physical activities: an update of activity codes and MET intensities. Med Sci Sports Exerc. 2000;32(9 suppl):S498–S504.
29. Hertogh EM, Monninkhof EM, Schouten EG, et al. Validity of the Modified Baecke Questionnaire: comparison with energy expenditure according to the doubly labeled water method. Int J Behav Nutr Phys Act. 2008;5:30.
30. Colbert LH, Matthews CE, Havighurst TC, et al. Comparative validity of physical activity measures in older adults. Med Sci Sports Exerc. 2011;43(5):867–876.
31. Mâsse LC, Fulton JE, Watson KL, et al. Influence of body composition on physical activity validation studies using doubly labeled water. J Appl Physiol. 2004;96(4):1357–1364.
32. Mahabir S, Baer DJ, Giffen C, et al. Comparison of energy expenditure estimates from 4 physical activity questionnaires with doubly labeled water estimates in postmenopausal women. Am J Clin Nutr. 2006;84(1):230–236.
33. Corder K, van Sluijs EMF, Wright A, et al. Is it possible to assess free-living physical activity and energy expenditure in young people by self-report? Am J Clin Nutr. 2009;89(3):862–870.
34. Carpenter WH, Poehlman ET, O'Connell M, et al. Influence of body composition and resting metabolic rate on variation in total energy expenditure: a meta-analysis. Am J Clin Nutr. 1995;61(1):4–10.
35. McTiernan A, Kooperberg C, White E, et al. Recreational physical activity and the risk of breast cancer in postmenopausal women: the Women's Health Initiative Cohort Study. JAMA. 2003;290(10):1331–1336.
36. US Department of Health and Human Services. Physical Activity Guidelines Advisory Committee Report. Washington, DC: US Department of Health and Human Services; 2008. (Accessed February 6, 2013)
