To assess accuracy and reliability of self-reported weight and height and identify factors associated with reporting accuracy.
Analysis of self-reported and measured weight and height from participants in the Sister Study (2003–2009), a nationwide cohort of 50,884 women aged 35–74 in the United States with a sister with breast cancer.
Weight and height were reported via computer-assisted telephone interview (CATI) and self-administered questionnaires, and measured by examiners.
Early enrollees in the Sister Study. There were 18,639 women available for the accuracy analyses and 13,316 for the reliability analyses.
Using weighted kappa statistics, comparisons were made between CATI responses and examiner measures to assess accuracy and between CATI and questionnaire responses to assess reliability. Polytomous logistic regression evaluated factors associated with over- or under-reporting. Compared to measured values, agreement was 96% for reported height (±1 inch; weighted kappa 0.84) and 67% for weight (±3 pounds; weighted kappa 0.92). Obese women [body mass index (BMI) ≥30 kg/m2] were more likely than normal weight women to under-report weight by ≥5%, and underweight women (BMI <18.5 kg/m2) were more likely to over-report. Among normal and overweight women (18.5 kg/m2 ≤ BMI <30 kg/m2), weight cycling and lifetime weight difference ≥50 pounds were associated with over-reporting.
U.S. women in the Sister Study were reasonably reliable and accurate in reporting weight and height. Women with normal-range BMI reported most accurately. Overweight and obese women and those with weight fluctuations were less accurate, but even among obese women, few under-reported their weight by >10%.
Many studies have found an association between high or low body mass indices (BMI) and risk of adverse health outcomes using self-reported data on weight and height. With an increasing prevalence of overweight and obesity in the U.S.(1), the effect of anthropometric characteristics on reporting accuracy is a concern. Studies have examined the accuracy of self-reported versus directly measured height and weight but findings varied and many studies were small or otherwise limited(2, 3). In a meta-analysis of weight reporting in 34 studies, only 18 were from the U.S., sample sizes varied from 18 to 9,000, ages varied from 12 to 84, and measurement protocols differed or were not described(2). While many studies suggest that women tend to under-report their weight, less is known about factors associated with reporting accuracy.
Current weight has been shown to influence weight reporting accuracy. The overweight and obese tend to under-report their weight and the underweight tend to over-report(4, 5). Studies of select populations, including adult women in the U.S., have also suggested that age and race contribute to reporting bias(6–8).
The impact of weight fluctuation and weight cycling on weight reporting accuracy has not been thoroughly examined in the existing literature. Weight cycling is not uncommon. Among Finnish women, the prevalence of weight cycling (defined as losing and then regaining ≥5 kg.) was reported to be 29%(9). Strohacker et al. estimated that 38% of U.S. women weight cycle at least once in their lifetime(10), and 20% of women in the Nurses’ Health Study reported at least 3 weight cycling episodes (defined as losing and then regaining ≥10 lbs.)(11). Among obese bariatric surgery candidates, frequent weight cycling was associated with greater reporting accuracy, suggesting that frequent weight cycling might increase attentiveness to weight, leading to heightened accuracy in reporting(12). Weight cycling and fluctuation and weight reporting accuracy have not yet been examined in a large sample of the general population.
A tendency to over-report height has been observed, particularly among people who are older, shorter, and/or overweight(8), but under-reporting has been observed in higher income categories for certain age groups(13). Fewer studies have assessed reliability of self-reported measures and results were inconsistent(14, 15).
This study assessed the accuracy and reliability of self-reported weight and height in a large cohort of U.S. women and identified characteristics associated with reporting accuracy. We compared self-reported height and weight to examiner-measured values, and separately compared two self-reports obtained using different approaches, allowing us to consider design features affecting data quality.
We used data from the Sister Study, a nationwide volunteer cohort of 50,884 women in the U.S. (including Puerto Rico) aged 35–74 years with a sister with breast cancer; enrollment occurred from September 2003 to March 2009. This analysis examines early enrollees who completed baseline activities by September 21, 2007 (n=31,409). To avoid errors influenced by eating disorders(16–18), participants who reported ever having anorexia or bulimia were excluded (n=1,066). Pregnant women delayed baseline activities until at least three months after the end of pregnancy.
Study participants reported weight (pounds) and height (feet-inches) in a computer-assisted telephone interview (CATI) and separately on a self-administered scannable diet questionnaire. During a home visit, trained examiners used digital self-calibrating scales to measure weight and metal tape measures to measure height. The order of completing the CATI, questionnaire, and home visit varied; self reports could be completed before or after the home visit. All measurements were taken three times without shoes. Measurements were rounded to the nearest whole pound for weight and quarter inch for height. Other variables examined from the baseline CATI were weight cycling (frequency of losing and then gaining ≥ 20 pounds), lowest weight since age 20, heaviest non-pregnant/breastfeeding weight, age, race, education level, perceived health status, marital status, household income, smoking, alcohol, physical activity, gravidity, regular multi-vitamin intake, recency of last medical exam, history of depression, and use of anti-depressant medications.
BMI was categorized using Centers for Disease Control and Prevention definitions(19). Lifetime weight difference was calculated by subtracting lowest weight since age 20 from heaviest non-pregnant/breastfeeding weight. All statistical analyses were performed using STATA/IC 10.1(20).
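The derived variables described here can be sketched in a few lines. This is an illustrative sketch, not the study's code (the analyses were run in STATA): the function names are hypothetical, the 703 factor is the usual conversion for pounds and inches, and the cut-points are the ones quoted in the text and appendix table (18.5, 25.0, and 30.0 kg/m2).

```python
def bmi_from_us_units(weight_lb: float, height_in: float) -> float:
    """BMI in kg/m^2 from weight in pounds and height in inches."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    # 703 converts lb/in^2 to kg/m^2
    return 703.0 * weight_lb / height_in ** 2


def bmi_category(bmi: float) -> str:
    """Assign a BMI category using the cut-points quoted in the text."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def lifetime_weight_difference(heaviest_lb: float, lowest_lb: float) -> float:
    """Heaviest non-pregnant weight minus lowest weight since age 20."""
    return heaviest_lb - lowest_lb


example_bmi = bmi_from_us_units(150, 65)
print(round(example_bmi, 1), bmi_category(example_bmi))
```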
To assess the accuracy of self-reported weight and height, we first compared CATI-reported values with examiner measures among women who completed the CATI within 30 days of the home visit (n=18,639). The primary source of Sister Study data is the telephone interview, which had less missing data and fewer structural errors (see below) for height and weight. For this analysis, examiner measures were treated as the true value. Percent agreement and weighted kappa statistics were calculated for each variable of interest. Kappa statistics were weighted according to a standard weight in STATA to account for the degree of disagreement. Polytomous logistic regression was used to calculate odds ratios (ORs) and 95% confidence intervals (95% CIs) for reporting accuracy by age, race, education level, perceived health status, marital status, and measured BMI.
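For readers unfamiliar with the statistic, a weighted kappa can be computed as below. This is a minimal sketch that assumes linear agreement weights, 1 - |i - j| / (k - 1); the text does not say which of STATA's standard weights was used, so the weighting scheme, the function name, and the integer coding of categories are all assumptions.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's weighted kappa with linear agreement weights.

    rater_a and rater_b hold integer category codes 0..n_categories-1,
    e.g. ordered categories of self-reported and measured values.
    """
    k = n_categories
    n = len(rater_a)
    if k < 2 or n == 0 or n != len(rater_b):
        raise ValueError("need >=2 categories and equal-length, non-empty inputs")

    # Observed joint proportions
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[a][b] += 1.0 / n

    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Full credit on the diagonal, partial credit for near misses
        return 1.0 - abs(i - j) / (k - 1)

    p_obs = sum(weight(i, j) * obs[i][j] for i in range(k) for j in range(k))
    p_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return (p_obs - p_exp) / (1.0 - p_exp)
```

Weighting gives partial credit when two reports fall in adjacent categories, which is why it suits ordered measures such as weight and height better than an unweighted kappa.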
To be consistent with the existing literature, we first examined the absolute difference between self-reported and measured weight. Differences between measured and self-reported weight were categorized as under-reporting ≥ 7 pounds, under-reporting 4 to 6 pounds, reporting within 3 pounds, and over-reporting ≥ 4 pounds. Because the relative impact of a specific weight difference will be greater in smaller than larger women, we also calculated the percentage of weight mis-reported; self-reports that differed by less than 5% from measured weights were the referent category. Polytomous logistic regression models explored the effects of measured BMI, weight cycling, lifetime weight difference, and current anti-depressant use on under- and over-reporting, adjusting for age, race, education, perceived health status, and marital status as potential confounders. Models examining weight cycling, lifetime weight difference, or current anti-depressant use also adjusted for measured BMI. Differences between measured and self-reported height were categorized as under-reporting >1 inch, reporting within 1 inch, and over-reporting >1 inch.
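The two weight categorizations can be sketched as follows. The category labels and function names are hypothetical, and the absolute version assumes both values are in whole pounds, as the rounding described above implies.

```python
def classify_absolute(reported_lb: int, measured_lb: int) -> str:
    """Absolute reporting category; negative differences are under-reports."""
    diff = reported_lb - measured_lb
    if diff <= -7:
        return "under-report >=7 lb"
    if diff <= -4:
        return "under-report 4-6 lb"
    if diff <= 3:
        return "within 3 lb"
    return "over-report >=4 lb"


def classify_relative(reported_lb: float, measured_lb: float,
                      threshold: float = 0.05) -> str:
    """Relative reporting category; differences under 5% are the referent."""
    rel = (reported_lb - measured_lb) / measured_lb
    if rel <= -threshold:
        return "under-report >=5%"
    if rel >= threshold:
        return "over-report >=5%"
    return "within 5% (referent)"


# The same 8-pound shortfall is a larger relative error for a lighter woman
print(classify_relative(112, 120), "|", classify_relative(212, 220))
```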
To determine the effect of misreporting on BMI categories, we compared categories calculated from CATI-reported data with categories based on examiner-measured data using percent agreement and weighted kappa statistics for all women and stratified by categories of age, race, education level, perceived health status, and marital status. We also determined the sensitivity and specificity of self-reported overweight/obese classification relative to examiner-measured data. To further explore the potential for bias in BMI we stratified on measured BMI and examined the percentage of CATI-determined BMI values that over- or underestimated BMI calculated from examiner measured values.
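Sensitivity and specificity here treat the examiner-based classification as the reference standard. A minimal sketch, with hypothetical names and toy data rather than study data:

```python
def sensitivity_specificity(reported_positive, measured_positive):
    """Sensitivity and specificity of a self-report-based classification
    (e.g. overweight/obese yes/no) against the measured classification."""
    tp = fn = tn = fp = 0
    for rep, meas in zip(reported_positive, measured_positive):
        if meas and rep:
            tp += 1
        elif meas and not rep:
            fn += 1
        elif not meas and not rep:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one measured positive and one negative")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: five women, three overweight/obese by examiner measurement
reported = [True, True, False, False, True]
measured = [True, True, True, False, False]
sens, spec = sensitivity_specificity(reported, measured)
```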
We carried out additional analyses stratifying by or adjusting for which measure came first, the home visit or CATI.
Using data from the subset of women with CATI and questionnaire completed within 30 days of the home visit (n=13,985), we carried out similar analyses to assess the accuracy of weight and height reported in the self-completed questionnaire compared with examiner measured data. We then compared the accuracy of the two self-report measures by calculating ratios of OR from models assessing reporting by CATI or questionnaire versus measured data. An analysis including all women (n=21,935) completing the diet questionnaire within 30 days of the home visit had similar results and is not shown.
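The ratio of odds ratios is a simple quotient. The sketch below assumes the ratio is formed as CATI over questionnaire, which is consistent with the appendix table; the function name is hypothetical, and the input values are the adjusted ORs for under-reporting among obese women from that table.

```python
def or_ratio(or_cati: float, or_questionnaire: float) -> float:
    """Ratio of odds ratios: CATI-based OR divided by questionnaire-based OR."""
    return or_cati / or_questionnaire


# Obese women, under-reporting by >=5%: aOR 3.97 (CATI) vs 2.14 (questionnaire)
print(round(or_ratio(3.97, 2.14), 2))
```

A ratio above 1 indicates that the association was stronger in the telephone reports than in the questionnaire.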
Reliability of self-reported weight and height was assessed using percent agreement and weighted kappa statistics to compare self-reported data from the CATI and diet questionnaires. Analyses were limited to women who completed the CATI within 30 days of submitting their questionnaire (n=13,316) and had non-missing questionnaire data for weight (n=11,585) and height (n=11,885). Similar to the accuracy analysis, we stratified and adjusted analyses by reporting order with respect to each other and with respect to the examiner measurement.
Prior to analyses, we identified and corrected several problems inherent to the reporting method. Both random and systematic errors occurred with the self-administered diet questionnaire. About 1% of respondents appeared to make frameshift bubbling errors for weight and/or height by mistaking the bubbles in one or more columns as starting at 1 instead of 0. Figure 1 shows a frameshift error in which the respondent filled in the wrong value for weight in the tens place and the wrong values for height in the feet and inches columns. Frameshift errors occurred frequently in the hundreds place of weight and were detected when an unreasonable weight (<100 pounds) was marked (e.g., 34 pounds instead of 134). We corrected obvious frameshift errors (0.7% of weight values and 0.1% of height values) when questionnaire values differed from both the CATI and examiner reports by >60 pounds or 11 inches.
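The flagging rule for obvious frameshift errors can be sketched as a comparison against both other sources. This only identifies candidates and does not reproduce the correction itself; the function names are hypothetical, and applying a strict "greater than" to the 11-inch height threshold is an assumption.

```python
def differs_from_both(value: float, cati: float, examiner: float,
                      threshold: float) -> bool:
    """True when a questionnaire value is far from BOTH other sources."""
    return abs(value - cati) > threshold and abs(value - examiner) > threshold


def flag_weight_frameshift(q_weight_lb, cati_lb, examiner_lb):
    # Threshold from the text: more than 60 pounds from both reports
    return differs_from_both(q_weight_lb, cati_lb, examiner_lb, 60)


def flag_height_frameshift(q_height_in, cati_in, examiner_in):
    # Threshold from the text: 11 inches (strict inequality assumed)
    return differs_from_both(q_height_in, cati_in, examiner_in, 11)


# Example from the text: 34 pounds marked instead of 134
print(flag_weight_frameshift(34, 134, 135))
```

Requiring disagreement with both the CATI and the examiner value keeps a single outlying comparison value from triggering a correction.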
Some errors were related to the choice of unit. In the diet questionnaires, a small percentage of respondents appeared to report height in total inches rather than feet and inches as instructed. For example, instead of 5 feet–4 inches, a respondent marked the total inch equivalent (64 inches) which was then mistakenly interpreted as 6 feet–4 inches. We corrected these unit errors in about 0.8% of all responses by checking suspiciously high reports and confirming corrections with CATI and examiner reports. Although these errors occurred for units (inches, pounds) used in the U.S., similar errors could occur for those used in other countries (e.g., meters, kilograms).
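One way to screen for this unit error is to reinterpret the marked feet and inches digits as a two-digit total in inches and check which reading agrees with the other sources. This is a hypothetical sketch: the function name and the 2-inch tolerance are assumptions, not the rule the study applied.

```python
def likely_total_inch_entry(feet: int, inches: int,
                            cati_in: float, examiner_in: float,
                            tol_in: float = 2.0) -> bool:
    """Flag a feet/inches response that was probably marked as total inches.

    Example: '6' and '4' marked for 64 inches reads as 6 ft 4 in (76 inches).
    tol_in is a hypothetical agreement tolerance, not a study parameter.
    """
    if inches > 9:
        return False  # a two-digit total needs a single units digit
    as_marked = feet * 12 + inches   # literal feet-inches reading
    candidate = feet * 10 + inches   # reading as total inches
    marked_disagrees = (abs(as_marked - cati_in) > tol_in
                        and abs(as_marked - examiner_in) > tol_in)
    candidate_agrees = (abs(candidate - cati_in) <= tol_in
                        and abs(candidate - examiner_in) <= tol_in)
    return marked_disagrees and candidate_agrees


print(likely_total_inch_entry(6, 4, cati_in=64, examiner_in=64.25))
```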
There were considerable missing values for weight (13%), height (11%) or both (8%) in the self-administered diet questionnaires. Non-response did not substantially vary by age or BMI category. Missing weight and height were uncommon in the CATI (<1%).
There seemed to be a tendency to round to 0 or 5 when reporting weight in the CATI (59%) and questionnaires (52%), whereas an end digit of 0 or 5 occurred in 27% of examiner measures. We did not correct for this apparent rounding.
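End-digit preference of this kind can be summarized as the share of values ending in 0 or 5. A minimal sketch with a hypothetical function name and toy values; if end digits were uniformly distributed, about 20% of values would end in 0 or 5.

```python
def share_ending_in_0_or_5(weights_lb):
    """Fraction of weights whose whole-pound value ends in 0 or 5."""
    if not weights_lb:
        raise ValueError("no weights supplied")
    hits = sum(1 for w in weights_lb if round(w) % 5 == 0)
    return hits / len(weights_lb)


print(share_ending_in_0_or_5([150, 152, 165, 171]))
```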
We detected infrequent random reporting errors for all modes of reporting. In self-administered questionnaires, the questionnaire scanner was sensitive to random bubbling errors such as pencil smudges. For the CATI, there were occasional data entry errors by interviewers, and for examiners, some inconsistencies in following measurement protocols. We corrected CATI values if they differed greatly from both examiner and questionnaire values (≥100 pounds for weight; ≥11 inches for height).
Participants were predominantly white (93%), aged 45-64 years (70%), college-educated (>50%), and married or living as married (77%) (Table 1). Over half (58%) were overweight or obese; 78% perceived themselves as being in very good or excellent health.
Measured and self-reported (CATI) weight were highly correlated (correlation coefficient [r] = 0.99). Overall, women under-reported weight by an average of 1.6 pounds. The mean absolute difference between measured and CATI weight was 3.3 pounds (standard deviation [SD] 4.1; range 0–50). Mean self-reported weight was 160.2 pounds (SD 35.5; range 82–402); mean examiner-measured weight was 161.8 pounds (SD 36.4; range 80–425). The average absolute time between the CATI and examiner home visit was 12.6 days (SD 8.7).
Overall, 66.5% of women reported their weight within 3 pounds of measured values (Table 1) with overall weighted kappa=0.92. Agreement within 3 pounds increased with age and perceived health status and was greater for women who were married, had a college degree, and had normal measured BMI. Agreement was lower for black women, obese women, women who weight cycled ≥3 times, and women who completed the CATI before the physical exam.
The crude odds ratio for under-reporting by ≥7 pounds decreased with increasing age; OR = 0.84 (95% CI: 0.75, 0.94) for women aged 55–64 and OR = 0.62 (95% CI: 0.54, 0.73) for women over 65, compared with those 45–54 years (Table 2). Compared with non-Hispanic whites, blacks had a higher odds of under-reporting weight (OR = 1.26; 95% CI: 1.00, 1.59 for 4–6 pounds and OR = 1.72; 95% CI: 1.36, 2.17 for ≥7 pounds). Never married (OR = 1.41; 95% CI: 1.16, 1.72) and widowed/divorced/separated women (OR = 1.25; 95% CI: 1.11, 1.40) had an increased odds of under-reporting weight by ≥7 pounds than married women. The odds ratio for under-reporting by ≥7 pounds increased from 3.82 (95% CI: 3.29, 4.43) for overweight women to 8.92 (95% CI: 7.74, 10.29) for obese relative to normal weight women. Associations remained after adjusting for age, race, and education (Table 2); further adjustment for perceived health and marital statuses did not substantially change estimates. Results from analyses stratified by reporting order (CATI before or after exam) were similar.
The effect of weight cycling differed by BMI status, mainly affecting reporting accuracy among underweight and normal weight women (Table 3).
About 8% of all women (n=1,439) under-reported weight by ≥5%. Compared with normal weight women, in adjusted analyses, the odds of under-reporting weight by ≥5% was higher among overweight (OR = 2.38; 95% CI: 2.05, 2.77) and obese women (OR = 4.10; 95% CI: 3.54, 4.76) (Figure 2). A lifetime weight difference of 25–49 pounds was also associated with under-reporting (OR = 1.35; 95% CI: 1.11, 1.65) (Figure 3). Stratifying by BMI, overweight and obese women with a lifetime weight difference >50 pounds had a decreased odds of under-reporting weight by ≥5% compared with those with a smaller weight difference (overweight OR = 0.65; 95% CI: 0.54, 0.78; obese OR = 0.52; 95% CI: 0.39, 0.70). Conversely, underweight and normal weight women who weight cycled at least once had an increased odds of under-reporting weight compared with those who never weight cycled (OR = 1.35; 95% CI: 1.02, 1.78).
Only 2% (n=465) of all women over-reported weight by ≥5%. In adjusted analyses, the most important factor associated with over-reporting weight by ≥5% was being underweight; OR = 5.30 (95% CI: 3.67, 7.66) (Figure 2). Weight cycling and increasing lifetime weight difference were also associated with over-reporting weight (Figure 3).
After excluding currently underweight and obese women, the increased odds of over-reporting by ≥5% remained for those having a lifetime difference of ≥75 pounds (OR = 2.89; 95% CI: 1.76, 4.75) (data not shown). However, the increased odds of over-reporting among women with ≥3 episodes of weight cycling was no longer significant (OR = 1.30; 95% CI: 0.89, 1.90) (data not shown). After stratifying by BMI, lifetime weight difference >50 pounds was associated with over-reporting among currently normal-weight (OR = 1.73; 95% CI: 1.22, 2.46) and overweight women (OR = 1.58; 95% CI: 1.05, 2.38).
While current anti-depressant use seemed to have some effect on weight reporting accuracy (Table 2), the associations were attenuated after adjusting for BMI. Household income, perceived stress, physical activity (total MET-hours per week), regular multi-vitamin use, gravidity, recency of last medical exam, smoking, and alcohol were not associated with over- or under-reporting weight (data not shown).
Measured and self-reported height were highly correlated (r=0.96); the average absolute difference between self-reported (CATI) and examiner-measured height was only 0.5 inches (SD 0.6; range 0-5.9). Slight variations between the CATI and examiner were likely due to different rounding conventions. Mean self-reported height was 64.6 inches (SD 2.6; range 50–75) and mean examiner-measured height was 64.7 inches (SD 2.5; range 50.7–75.1).
Over-reporting of height increased slightly with age and BMI. The odds of under-reporting height was higher among black women compared with whites. Also, women with less than a bachelor’s degree had an increased odds of mis-reporting their height compared with women with a bachelor’s degree. No other factor was associated with differences in self-reported and measured height.
The classification of overweight or obese BMI using self-reported measures was highly sensitive (0.95) and specific (0.96). For obese classification alone, sensitivity was 0.90 and specificity was 0.98.
BMI values based on CATI-reported and examiner-measured data were very close. The mean absolute difference between CATI-reported and examiner-measured BMI was only 0.7 kg/m2 (SD 0.8); the correlation was very high (r=0.98) (Figure 4).
Among women with normal range examiner-based BMI, BMI values calculated from CATI reports were within 4% of measured BMI 83.4% of the time (Table 4). However, despite an overall high correlation between BMI values from self-reported and examiner-measured data, there were noticeable discrepancies among women with lower and higher BMI. As shown, self-reported BMI was at least 5% greater than measured BMI for about a quarter of underweight women. Also, BMI based on CATI-reported values was under-reported by at least 5% for about 12% of overweight women and 17% of obese women.
Restricting to participants who completed both the questionnaire and CATI within 30 days of examiner assessment (n=13,985), the average absolute differences between CATI and measured height and weight were 0.4 inches (SD 0.6) and 3.2 pounds (SD 4.0), respectively. The average absolute differences between questionnaire and measured height and weight were 0.5 inches (SD 0.6) and 3.4 pounds (SD 3.6), respectively.
The tendency to under-report weight increased with BMI for both questionnaire and CATI, although the differences were greater for telephone reports. For example, the odds ratio for under-reporting among obese women was almost twice as large for telephone reports as for the self-completed questionnaire (OR ratio 1.86). Other differences were similarly magnified with telephone-reported data. Interestingly, while most trends suggest that overweight women under-report their weight and underweight women over-report, women with large differences between heaviest and lowest weight also tended to over-report their weight when compared to examiner measurements, especially when reporting by telephone. (See Appendix)
There were high correlations between the self-reported values for weight (r=0.99) and height (r=0.98). The average absolute difference between weight reported in the CATI and questionnaire was 2.0 pounds (SD 3.3; range 0–55). The absolute difference in height was 0.2 inches (SD 0.5; range 0-5). The absolute difference in time between self reports was 15 days (SD 9). For weight, 80% were within 3 pounds. For height, 99% were within 1 inch. The overall weighted kappa was 0.95 for weight and 0.92 for height.
Factors associated with agreement in self-reported weight and height were largely similar to those for accuracy. Whereas height agreement decreased with age, weight agreement within 3 pounds increased with age. Percent agreement for weight and height increased with better perceived health status. Reporting agreement was inversely associated with BMI, weight cycling, and lifetime weight difference. Findings were similar in analyses stratified by reporting order.
Overall, women in the Sister Study reported weight and height accurately. Although participants were slightly leaner (on average 2 kg/m2 lower in BMI) than middle-aged non-Hispanic white women in a smaller, nationally-representative sample from the National Health and Nutrition Examination Survey (NHANES) 2003–2006(21), we confirmed previous findings that errors in reporting weight were associated with specific weight characteristics. Besides current weight status, we found that reporting accuracy was affected by excessive weight cycling (≥3 times) and extreme lifetime weight differences in adulthood (≥75 lbs.).
This is among the first studies examining weight cycling and lifetime weight difference and reporting accuracy in a general population of women. Since weight cycling and lifetime weight difference both involve weight fluctuation, the extent to which the two variables were related was a concern. Weight cycling was associated with a lifetime weight difference of ≥30 pounds (χ2 P <0.001). However, 44% of those who had a lifetime weight difference of ≥30 pounds had never weight cycled, thus large changes in weight were not entirely explained by weight cycling.
Similar to previous studies, BMI values calculated from self-reported data were similar to those using measured data and there was high sensitivity for classifying a participant as overweight/obese or obese. Among adult women in the National Health and Nutrition Examination Survey (1999–2004), there was substantial agreement between self-reported and measured BMI categories(7). In an overweight Dutch sample, self-reported BMI was found to be reasonably accurate for the assessment of overweight/obesity prevalence(22). Even with high correlation, there is still a potential for bias when examining associations between BMI based on self-reported measures and risk of disease and mortality(23). Similar to our results, self-reported BMI in 2001–2006 NHANES and the National Health Interview Survey overestimated measured BMI values at the low end of the BMI scale (<22 kg/m2) and underestimated values at the high end (>28 kg/m2), and respondent socio-demographic characteristics were associated with some misclassification of obese people as overweight(13, 24). In our study, although BMI was under-estimated by ≥5% for over 10% of overweight and obese women, only 3% of obese women under-reported their weight by ≥10% and fewer than 1% of women in any BMI category under- or over-reported by ≥15%. Furthermore, the average examiner weight among obese women was 207 pounds (SD 32) and the average amount under-reported by these women was only 3.3 pounds (SD 6.8). Only 126 obese women under-reported by >20 pounds. For obese women, in particular, a five percent difference in weight may have a negligible impact on associations with health outcomes.
Depression was of interest because it is associated with low self-esteem (25, 26) and therefore could affect accuracy of weight reporting. However, diagnosis of depression or current use of anti-depressant medication was not significantly associated with under- or over-reporting weight.
Several studies have suggested that respondents give more socially desirable answers in interviews than on self-administered questionnaires(27). Despite finding a high correlation between CATI and questionnaire responses and seeing similar trends in accuracy for CATI and questionnaire, overweight and obese women reported weight more accurately on the questionnaire. While this finding might suggest that the anonymity of the self-completed questionnaire promotes more honest reporting, it is also possible that women weighed themselves while completing the form at home. Access to a scale while completing the form may facilitate accurate reporting. Our participants may have been more motivated than others to do this because of the pending home visit during which they knew they would be weighed. Since women were asked to have their questionnaire ready for the examiner to collect, it is also possible that these forms were completed just before the home visit, increasing the likelihood of similar results. Thus our data may provide a “best case” assessment of the validity of weight data reported on self-completed questionnaires.
Response rates and data quality can be higher in telephone interviews than mailed questionnaires(28, 29). CATI item non-response may have been minimized because interviewers asked each question, although women could refuse to answer. Having examiners physically collect the self-administered questionnaires may have helped reduce overall non-response for that form.
This analysis has some unique caveats. Participants were told they would be weighed and measured during a home visit and therefore may have reported more accurately than they would have otherwise. Some variation between self-reported and measured weight may have occurred because examiners weighed women with clothing whereas women may have reported their weight without clothes. There was the potential for a learning effect caused by the order of the home visit and CATI self-report. Women who had the home visit first may have remembered their measured weight and height and later reported the same values in the CATI (59% had home visit first; 37% completed CATI first; 4% completed both on same day). However, when we stratified the analyses by which measure came first, we found no evidence that the order of reporting influenced accuracy. Similarly, timing of the CATI in relation to filling out the questionnaire had little impact on reliability. Data were collected by many different examiners using different scales. Although all examiners were trained, we could not verify that measurement protocols were consistently followed.
In conclusion, U.S. women in the Sister Study were reasonably reliable and accurate in reporting weight and height. Women with normal-range BMI reported most accurately. Overweight and obese women and those with fluctuations in their weight were less accurate, but even among obese women, few under-reported their weight by >10%. Nonetheless, even though self-reported and measured weight and height are highly correlated, bias can still exist in studies relying on self-reported data due to the tendency of overweight women to under-report and underweight women to over-report their weight. This is among the first studies to show that repeated weight cycling and large weight changes in adulthood are also associated with less accurate weight reporting in a general population of women.
|Under-report ≥5% of examiner-measured weight||Over-report ≥5% of examiner-measured weight|
|%||aOR CATI (95% CI)||aOR Questionnaire (95% CI)||OR Ratio||aOR CATI (95% CI)||aOR Questionnaire (95% CI)||OR Ratio|
|Body mass index|
|Underweight (<18.5 kg/m2)||1.2||0.49 (0.16, 1.56)||0.55 (0.17, 1.75)||0.89||4.89 (3.11, 7.69)||3.06 (2.22, 4.24)||1.59|
|Normal (18.5–24.9 kg/m2)||40.3||REF||REF||REF||REF|
|Overweight (25.0–29.9 kg/m2)||31.7||2.44 (2.05, 2.9)||1.68 (1.40, 2.01)||1.45||0.48 (0.37, 0.63)||1.03 (0.92, 1.15)||0.47|
|Obese (30.0+ kg/m2)||26.8||3.97 (3.34, 4.72)||2.14 (1.77, 2.58)||1.86||0.52 (0.40, 0.70)||1.25 (1.12, 1.40)||0.42|
|Weight cycling|
|1–2||25.1||0.97 (0.82, 1.14)||0.82 (0.68, 1.00)||1.17||1.11 (0.85, 1.45)||1.08 (0.96, 1.20)||1.03|
|3+||19.6||1.10 (0.91, 1.33)||0.96 (0.77, 1.2)||1.15||1.32 (0.95, 1.84)||1.09 (0.95, 1.25)||1.21|
|Heaviest-lowest weight difference|
|25–49 lbs.||38.5||1.39 (1.10, 1.75)||1.29 (1.01, 1.64)||1.08||0.94 (0.70, 1.26)||1.06 (0.92, 1.21)||0.89|
|50–74 lbs.||24.5||1.25 (0.96, 1.62)||1.20 (0.91, 1.59)||1.04||1.21 (0.83, 1.74)||1.10 (0.93, 1.29)||1.10|
|75+ lbs.||20.0||0.79 (0.57, 1.11)||0.92 (0.64, 1.32)||0.86||2.51 (1.57, 4.02)||1.30 (1.06, 1.61)||1.93|
Abbreviations: CATI, computer-assisted telephone interview; aOR, adjusted odds ratio (adjusted for age, race, education, perceived health status, marital status, BMI)