Misreporting characterized by the reporting of implausible energy intakes may undermine the valid estimation of diet-disease relations, but the methods to best identify and account for misreporting are unknown. The present study compared how alternative approaches affected associations between selected dietary factors and body mass index (BMI) by using data from the European Prospective Investigation Into Cancer and Nutrition-Spain. A total of 24,332 women and 15,061 men 29–65 years of age recruited from 1992 to 1996 for whom measured height and weight and validated diet history data were available were included. Misreporters were identified on the basis of disparities between reported energy intakes and estimated requirements calculated using the original Goldberg method and 2 alternatives: one that substituted basal metabolic rate equations that are more valid at higher BMIs and another that used doubly labeled water-predicted total energy expenditure equations. Compared with results obtained using the original method, underreporting was considerably lower and overreporting higher with alternative methods, which were highly concordant. Accounting for misreporters with all methods yielded diet-BMI relations that were more consistent with expectations; alternative methods often strengthened associations. For example, among women, multivariable-adjusted differences in BMI for the highest versus lowest vegetable intake tertile (β = 0.37 (standard error, 0.07)) were neutral after adjusting with the original method (β = 0.01 (standard error, 0.07)) and negative using the predicted total energy expenditure method with stringent cutoffs (β = −0.15 (standard error, 0.07)). Alternative methods may yield more valid associations between diet and obesity-related outcomes.
Underreporting of dietary intakes or low energy reporting is a major challenge to research on relations between diet and health. Although the reporting of implausibly low energy intakes is also related to factors such as age, sex, and psychosocial characteristics, numerous studies have found it to be particularly prevalent among obese subjects and to be characterized by a tendency to report relatively low intakes of foods high in fat and simple carbohydrates that may be perceived as socially undesirable (1–6). Thus, underreporting is especially problematic for studies in which investigators explore associations between diet and obesity or obesity-related disorders. Furthermore, underreporting has been observed to be widespread and to persist across methods of dietary assessment (1, 7–9). Although generally estimated to be less prevalent, overreporting is a problem as well, and it is also related to individual characteristics (10, 11).
Methods for identifying underreporters and overreporters have been proposed, although recent reviews on various dietary factors and obesity have suggested that relatively few studies account for the presence of subjects with implausible intakes beyond excluding subjects with extreme energy intakes (1, 12–14). Ideally, implausible reporters would be identified by comparing reported energy intakes (rEIs) with objective estimates of energy intake. However, such methods are often not feasible in large-scale studies, as they are relatively costly (1, 15). Other methods suggested for identifying implausible reporters (16–18) are more indirect and based on the extent of the disparity between rEIs and predicted energy requirements. Several studies have found that excluding underreporters through the use of such methods affects the magnitude and/or direction of diet-obesity relations, strengthening associations with factors such as fat, sugar, and fiber consumption (1, 2, 19, 20). Alternative indirect methods have been proposed, such as predicting energy needs on the basis of either estimated basal metabolic rates (BMRs) (18, 21) or doubly labeled water prediction equations, at times recommending more stringent cutoffs for classifying reported intakes as implausible (17, 22). No studies to date have directly compared these different methods.
Using diet history data from the Spanish cohort of the European Prospective Investigation Into Cancer and Nutrition (EPIC-Spain), we compared how these different methods affected the estimated prevalence of and characteristics associated with implausible reporting. We also examined how alternative methods of accounting for implausible reporting affected relations between body mass index (BMI, measured as weight in kilograms divided by height in meters squared) and intakes of energy, fat, and selected food groups hypothesized a priori to be susceptible to biased reporting.
The EPIC-Spain cohort, part of the multicountry EPIC study, included men and women aged 29–69 years at enrollment in 1992–1996, recruited from the general population and among blood donors in 5 regions: Asturias, Granada, Murcia, Navarra, and Gipuzkoa. Details on the study design have been published previously (23, 24). Informed consent was obtained from all participants, and the ethics committee of the Spanish Carlos III Health Institute approved the study. The analysis sample excluded subjects ≥65 years of age (n = 453), those for whom height or weight data were missing (n = 113) or implausible (n = 7 subjects whose self-reported weight at a 3-year follow-up represented losses >90 kg), underweight subjects (BMI <18) (n = 18), and those who reported following weight-loss diets and who were perhaps not in energy balance (n = 1,456). Exclusions reduced the original sample of 25,808 women and 15,632 men to 24,332 women and 15,061 men.
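The sample flow described above can be checked with simple arithmetic. The following Python sketch is illustrative only: the variable names are hypothetical, and it assumes that the listed exclusion categories are mutually exclusive, which the text does not state explicitly.

```python
# Exclusion counts as reported in the text (assumed mutually exclusive).
exclusions = {
    "aged >= 65 years": 453,
    "missing height or weight": 113,
    "implausible weight data": 7,
    "underweight (BMI < 18)": 18,
    "weight-loss diet": 1456,
}

# Sample sizes before and after exclusions, as reported in the text.
original = {"women": 25808, "men": 15632}
analyzed = {"women": 24332, "men": 15061}

total_excluded = sum(exclusions.values())
total_removed = sum(original.values()) - sum(analyzed.values())

# Both totals should agree if the categories do not overlap.
print(total_excluded, total_removed)  # prints: 2047 2047
```

Under that assumption, the sum of the reported exclusions matches the difference between the original and analyzed samples.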
Trained interviewers collected detailed data on habitual diet in the past year by using a validated, computerized diet history instrument with >600 items (23, 24). Briefly, a structured interview was used to ask subjects to report, for each meal or food intake occasion, frequency of consumption, usual portion size, and preparation methods of all foods consumed at least twice per month (lower intake frequencies were allowed for liver). For this analysis, in addition to energy and dietary fat, we selected 3 food groups hypothesized a priori to be susceptible to misreporting: vegetables, fruits, and pastries/cakes (hereafter referred to as pastries). Food groups were analyzed in grams per megajoule of energy; sex-specific tertiles were used because some relations were nonlinear. Energy and percentage of calories from fat were analyzed continuously after confirming the linearity of relations using quartiles.
Interviewers used standardized methods to measure height and weight; BMI was used as a measure of fatness, with standard cutoffs for overweight (25 to <30) and obesity (≥30) (25). Data on sociodemographic characteristics, health history, and health behaviors were collected with interviewer-administered questionnaires. A validated index of physical activity was developed from questions about exercise, cycling, and occupational activity (26). The index was modified to classify subjects as “very active” if they reported strenuous manual labor or >7 hours of sports/exercise per week, with at least 3 hours reported to be vigorous. Food and Agriculture Organization physical activity level values were assigned to this index as follows: inactive = 1.35, moderately inactive = 1.55, moderately active = 1.75, active = 1.85, and very active = 2.2 (16, 27).
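As an illustration of the BMI definition and weight categories used here, the following Python sketch computes BMI and assigns a category. The function names are hypothetical, and the boundaries (25 to <30 for overweight, 30 or above for obesity) are an assumption based on the conventional cutoffs; subjects with BMI <18 were excluded from the analysis sample, so no underweight category is defined.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Assign a weight category, assuming conventional cutoffs of 25 and 30."""
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    return "obese"


# Example: a subject weighing 70 kg with a height of 1.75 m.
example = bmi(70.0, 1.75)
print(round(example, 1), bmi_category(example))
```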
BMRs were estimated using the recommended Schofield equations (27), and the ratios between BMRs and reported energy intakes (rEI:BMR) were calculated, providing an estimate of energy available for activity after meeting the needs for basic metabolism (i.e., weight maintenance). The plausibility of rEIs was determined by comparing this ratio with physical activity levels: Implausible reporters had rEI:BMR values that differed from physical activity levels by more than ±2 standard deviations when standard deviations were calculated as prescribed by Black (16), using estimates of variance in rEIs, BMR, and activity. More stringent cutoffs of ±1.5 standard deviations were also applied, because previous researchers using the predicted total energy expenditure (pTEE) method described below suggested that this cutoff yielded a sample in which associations between rEIs and estimated requirements were consistent with theoretical relations (22).
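The classification rule described above can be sketched in Python as follows. This is a minimal illustration rather than the study's actual code. The cutoff formula follows the general form published by Black (16) for a single individual, PAL × exp(±SD × S/100), and the default coefficients of variation (23% for reported intake, 8.5% for BMR, 15% for physical activity) are commonly cited values that are assumed here; the text does not report the values actually used. The `days` parameter and the function names are likewise hypothetical, and BMR is passed in directly rather than computed from the Schofield equations.

```python
import math

# Physical activity level (PAL) values assigned to the activity index in the text.
PAL = {
    "inactive": 1.35,
    "moderately inactive": 1.55,
    "moderately active": 1.75,
    "active": 1.85,
    "very active": 2.2,
}


def goldberg_cutoffs(pal, n_sd=2.0, cv_rei=23.0, cv_bmr=8.5, cv_pal=15.0, days=None):
    """Return (lower, upper) rEI:BMR cutoffs for one individual.

    The CV defaults are assumed values, not taken from the study. If `days`
    is None, within-subject variation in intake is ignored (an assumption
    for a habitual-intake instrument such as a diet history).
    """
    within = 0.0 if days is None else cv_rei ** 2 / days
    s = math.sqrt(within + cv_bmr ** 2 + cv_pal ** 2)
    factor = math.exp(n_sd * s / 100.0)
    return pal / factor, pal * factor


def classify_goldberg(rei, bmr, activity, n_sd=2.0):
    """Classify a subject from reported intake and BMR (same energy units)."""
    lower, upper = goldberg_cutoffs(PAL[activity], n_sd=n_sd)
    ratio = rei / bmr
    if ratio < lower:
        return "underreporter"
    if ratio > upper:
        return "overreporter"
    return "plausible"


# Example: reported intake of 5.0 MJ/day against an estimated BMR of 6.0 MJ/day.
print(classify_goldberg(5.0, 6.0, "moderately inactive"))
```

Passing `n_sd=1.5` narrows the plausible range, which corresponds to the more stringent cutoffs described in the text.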
Because the Schofield equations have been found to lead to overestimation of BMRs among obese and sedentary subjects (28, 29), we also classified implausible reporters by using alternative BMR equations, which have been shown to correspond well with values measured by indirect calorimetry in both obese and nonobese subjects (30). The calculations were otherwise identical to those of the original Goldberg method described above; this variant is referred to hereafter as the revised Goldberg method.
pTEE was estimated using the Dietary Reference Intakes prediction equations, derived by using large pooled data sets compiled from doubly labeled water studies, which were found to correlate well with measured TEE in independent samples (7, 31). Implausible reporters were identified on the basis of the ratio of reported intakes to estimated requirements (rEI:pTEE). As with the Goldberg method, standard deviations were calculated using published estimates of variation in energy balance components, as prescribed previously (7, 22). As the mean value for 1 standard deviation was 15.1%, rEIs beyond approximately ±30.0% and ±23.0% of pTEEs were used to identify implausible reporters, corresponding to 2.0- and 1.5-standard-deviation cutoffs, respectively.
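The pTEE cutoffs follow directly from the reported standard deviation: 2.0 × 15.1% = 30.2% and 1.5 × 15.1% ≈ 22.7%, which the text rounds to approximately 30% and 23%. The Python sketch below illustrates this rule. The function names are hypothetical, the unrounded band is used, and pTEE is supplied as an input rather than computed from the Dietary Reference Intakes equations.

```python
SD_PERCENT = 15.1  # mean value of 1 standard deviation, as reported in the text


def ptee_band(n_sd=2.0, sd_percent=SD_PERCENT):
    """Return the plausible range of rEI:pTEE as (lower, upper) ratios."""
    half_width = n_sd * sd_percent / 100.0
    return 1.0 - half_width, 1.0 + half_width


def classify_ptee(rei, ptee, n_sd=2.0):
    """Classify a subject from reported intake and pTEE (same energy units)."""
    lower, upper = ptee_band(n_sd)
    ratio = rei / ptee
    if ratio < lower:
        return "underreporter"
    if ratio > upper:
        return "overreporter"
    return "plausible"


# Example: reported intake of 2,800 kcal against a predicted expenditure of 2,200 kcal.
print(classify_ptee(2800, 2200, n_sd=2.0))  # ratio about 1.27, inside the 2-SD band
print(classify_ptee(2800, 2200, n_sd=1.5))  # outside the narrower 1.5-SD band
```

The example shows how the same subject can be classified as plausible under the 2.0-standard-deviation cutoff and as an overreporter under the more stringent 1.5-standard-deviation cutoff.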
Web Appendix 2 shows the mean rEI:BMR and rEI:pTEE ratios for subjects classified as under-, over-, and plausible reporters who were identified using 2-standard-deviation cutoffs for each method. Means resembled expected values among plausible reporters: Values on the order of 1.55 were expected for rEI:BMR in a moderately inactive population (16), and on average rEIs should correspond to pTEEs among plausible reporters. Values for both ratios were substantially lower in underreporters and higher in overreporters.
Analyses were conducted separately for men and women. The concordance of under- and overreporting estimated using the different methods was assessed with kappa statistics. For each method, differences in subject characteristics and dietary intakes across reporting groups were evaluated using analysis of variance or chi-square tests. We used linear regression to estimate the effect of accounting for implausible reporting on multivariate associations between BMI and energy and fat (model 1) and between BMI and intakes of vegetables, fruits, and pastries (model 2). Results were adjusted for age, physical activity, education, center, height, smoking status, season, alcohol intakes, parity, diabetes, the other dietary variables shown in the model, and use of special diets related to hypertension, cholesterol, or diabetes. Excluding rather than adjusting for subjects on these diets did not meaningfully change the findings (data not shown). There were significant interactions (P < 0.05) between several food groups and smoking: Interaction terms were included in analyses of women, and models in men were stratified by smoking status, as there were multiple interactions with several food groups. To evaluate different strategies of accounting for implausible reporting, a baseline multivariate model was compared with results obtained after 1) restricting the sample to plausible reporters identified using each method, and 2) using dummy variables to adjust for under- and overreporting. Correlations between BMI and dummy variable indicators of underreporting (r = 0.13–0.25) and overreporting (r = −0.06–0.21) were low-to-moderate. Variance inflation factors indicated the absence of collinearity problems. 
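For readers unfamiliar with the kappa statistic used to assess concordance, the following Python sketch computes Cohen's kappa for two classifications of the same subjects. The function name and the example data are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two classifications."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed proportion of subjects classified identically by both methods.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each method's marginal frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical underreporter flags (1 = underreporter) from two methods.
method_1 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
method_2 = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
print(round(cohen_kappa(method_1, method_2), 2))  # prints: 0.52
```

In this example the two methods agree on 8 of 10 subjects, but because agreement of 0.58 would be expected by chance, kappa is only about 0.52.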
In supplementary models, the effects of excluding only underreporters were examined, as were the effects of excluding subjects with extreme energy intakes that fell outside the recommended cutoffs (<500 or >3,500 kcal in women and <800 or >4,000 kcal in men) (32) without taking energy requirements into account.
The estimated prevalences of underreporting in women and men determined using the Goldberg method (21.7% and 14.7%, respectively) were higher than estimates determined using the revised Goldberg method (14.4% and 9.1%, respectively) and the pTEE method (12.0% and 7.2%, respectively; chi-square P < 0.05 for all prevalence differences) (Table 1). Conversely, overreporting estimates determined by using the Goldberg method (4.8% and 5.9% in women and men, respectively) were lower than those seen when using the revised Goldberg method (9.2% and 10.5%, respectively) and the pTEE method (17.9% and 21.2%, respectively). Concordance of underreporting was highest for the revised Goldberg and pTEE methods (kappa = 0.83), intermediate for the Goldberg and revised Goldberg methods (kappa = 0.75), and lowest for the Goldberg and pTEE methods (kappa = 0.64). Similarly, for overreporting, the concordance of the revised Goldberg and pTEE (kappa = 0.62) and the Goldberg and revised Goldberg (kappa = 0.67) methods was higher than that for the Goldberg and pTEE (kappa = 0.38) methods.
Regardless of the method used, underreporters had higher mean BMIs and overreporters had lower mean BMIs than did plausible reporters, with especially marked differences among women (analysis of variance P < 0.05) (Table 1). Thus, the estimated prevalence of underreporting found with each method was higher among obese women than among overweight and normal-weight women (underreporting prevalences of 32.6%, 20.4%, and 12.2% with the Goldberg method; 23.3%, 12.9%, and 7.2% with the revised Goldberg method; and 20.4%, 10.9%, and 4.8% with the pTEE method, respectively). Patterns were similar among men (not shown).
Across all methods, underreporters reported lower overall intakes of energy and of energy from fat than did plausible reporters (analysis of variance P < 0.05) (Table 1). Underreporters also reported lower intakes of pastries and higher intakes of fruits and vegetables as a proportion of energy intakes (g/MJ), whereas the opposite was true for overreporters (analysis of variance P < 0.05). Consequently, there were strong effects of accounting for implausible reporters. Compared with the baseline model, in women (Table 2), either excluding or adjusting for implausible reporters identified using any method resulted in positive rather than negative multivariate-adjusted associations between BMI and energy and pastry intakes, null or negative rather than positive associations between BMI and vegetable intakes, and negative rather than null associations with fruit intakes (among nonsmokers). As shown, however, accounting for implausible reporters had little added effect on associations with percentage of energy from fat.
Although all 3 methods had consistent effects on estimates, the magnitude of these diet-BMI associations was generally stronger when we used the pTEE and revised Goldberg methods than when we used the standard Goldberg method. In men, models were stratified by smoking status, as there were significant interactions between current smoking and several food groups (Table 3). Accounting for implausible reporting had effects similar to those seen in analyses of women: Positive associations with energy and pastry intakes and negative associations with fruit intake (among nonsmokers) were seen after accounting for implausible reporters, whereas positive associations with vegetable intakes were strongly attenuated.
With the more stringent 1.5-standard-deviation cutoffs, underreporting estimates based on the Goldberg, revised Goldberg, and pTEE methods increased to 31.2%, 21.6%, and 19.4% in women and 22.7%, 14.9%, and 13.0% in men, respectively. Estimates for overreporting also increased substantially, to 8.8%, 15.4%, and 24.1% in women and 11.1%, 17.9%, and 28.4% in men, respectively. Applying these cutoffs to the revised Goldberg and pTEE methods led to underreporter classification that was highly concordant with the Goldberg method using 2.0-standard-deviation cutoffs (kappa = 0.90 and kappa = 0.83 for agreement with the revised Goldberg and pTEE methods, respectively).
As shown in Figure 1, among women, adjusting for implausible reporters by using more stringent cutoffs generally increased the magnitude of BMI-diet associations. Again, the revised Goldberg and pTEE methods generally yielded somewhat stronger associations than did the Goldberg method. For example, coefficients for the highest vegetable intake tertile among women were negative and significant when we used the revised Goldberg and pTEE methods but remained neutral when we used the Goldberg method. Similar patterns were observed among men (Figure 2).
Accounting only for underreporters yielded results that, for some dietary factors, differed substantially from those obtained when both types of misreporters were considered. For example, when only underreporters were excluded, energy-BMI associations among women were strongly attenuated compared with values shown in Table 2, with β coefficients of 0.12 (standard error (SE), 0.01), 0.10 (SE, 0.01), and 0.07 (SE, 0.01) using 2.0-standard-deviation cutoffs, and 0.18 (SE, 0.01), 0.16 (SE, 0.01), and 0.14 (SE, 0.01) using 1.5-standard-deviation cutoffs for the Goldberg, revised Goldberg, and pTEE methods, respectively (P < 0.01 for all methods). When we used the Goldberg method, the association between BMI and the highest vegetable intake tertile among women remained weakly positive rather than null after excluding only underreporters using 2.0-standard-deviation cutoffs (β coefficient = 0.13 (SE, 0.08), P < 0.10). Using the pTEE method with the more stringent 1.5-standard-deviation cutoffs, excluding only underreporters rather than both types of implausible reporters yielded null versus negative associations (β coefficient = 0.05 (SE, 0.07), P > 0.10, vs. −0.15 (SE, 0.07), P < 0.01). Although attenuated, associations with other food groups, for which associations were more consistent across the different methods, were not meaningfully different when overreporters were not taken into account (data not shown). Results were similar when adjusting for rather than excluding underreporters, and patterns were similar among men (data not shown).
When subjects with extreme energy intakes (1.1% of women and 2.2% of men) were excluded, rather than using methods to identify implausible reporters based on estimated energy requirements, coefficients in all models were similar to those in the baseline multivariate models (e.g., in women the energy-BMI coefficient was −0.12 (SE, 0.01), P < 0.01).
Dietary misreporting characterized by implausible energy intakes is often overlooked. In the absence of objective measures of energy intake, however, indirect methods are typically used to identify implausible reporters and to evaluate how misreporting may influence associations between dietary intakes and health outcomes (16). Recently, some researchers have proposed that pTEE equations may be better suited than previous methods to estimate energy requirements and identify implausible reporters (17, 22); others have suggested that the equations most frequently used to estimate BMR may be insufficiently valid among overweight and obese subjects (28–30). In the present study, we assessed how these alternative methods of estimating energy needs affected estimated prevalences of implausible reporting and influenced associations between dietary factors and obesity.
Levels of under- and overreporting obtained using the traditional Goldberg method—19% and 5%, respectively, in the sample as a whole—were comparable to those reported in the literature that used diet histories or food frequency questionnaires (1). In comparison, when we used the revised Goldberg and pTEE methods, which were highly concordant with each other, levels of underreporting were 7%–10% lower and levels of overreporting were 13%–15% higher. Nonetheless, regardless of the method used, underreporters had higher mean BMIs and overreporters had lower mean BMIs than did plausible reporters, as observed elsewhere (1, 2, 10). As in earlier studies (2, 19, 20, 22), likely underreporters identified with each method reported higher intakes of healthy foods, such as fruits and vegetables, and lower intakes of energy and less-healthy foods, such as pastries, than did plausible reporters. The opposite pattern was true for overreporters. After excluding implausible reporters using each approach, coefficients for several diet-BMI associations changed in magnitude or direction, becoming more consistent with hypotheses relating energy-dense foods to obesity (14, 33), again consistent with several earlier reports (2, 20, 22). For example, among women, initially negative associations between BMI and intakes of energy and pastries were reversed, whereas a neutral association with fruit became negative. In contrast, excluding subjects with extreme energy intakes by using recommended cutoffs (32) had no meaningful effect. Coefficients for percentage of energy from fat were not meaningfully affected by adjustment for misreporting. Although reasons for this finding are uncertain, Huang et al. (22) also found that associations between BMI and percentage of energy from fat were not influenced by excluding implausible reporters. 
Similarly, coefficients for the percentage of energy from saturated, polyunsaturated, and monounsaturated fat in the baseline multivariate model (β = −0.02 (SE, 0.01), 0.17 (SE, 0.01), and 0.04 (SE, 0.01), respectively) were consistent with those obtained excluding (β = −0.01 (SE, 0.01), 0.14 (SE, 0.01), and 0.03 (SE, 0.01), pTEE method 1.5-standard-deviation cutoffs) or adjusting for implausible reporters (not shown). In separate models, we briefly examined associations between BMI and the percentage of energy from carbohydrates. As for fat, misreporting adjustments had little effect (not shown).
Although the effects of accounting for misreporting were generally consistent across methods, the magnitude of associations observed after these adjustments was frequently stronger when using the revised Goldberg and pTEE methods to identify misreporters than when using the original Goldberg method. This was observed despite the lower prevalence of underreporting found when using these alternative methods. As observed previously, using more restrictive cutoffs to identify implausible reporters tended to strengthen associations (17, 22). Results also suggested that in some cases, overreporting might have been influential, as excluding or adjusting only for underreporters at times yielded associations that differed when also accounting for overreporters. Additionally, as in a previous study on a different population (2), we found that adjusting for rather than excluding implausible reporters yielded consistent results: Relations between dietary intakes and BMI that emerged after stratifying by reporting group were similar to those observed among plausible reporters (Web Figure 1). This suggested that adjustment—effectively summarizing across reporting groups—was a viable alternative to omitting a substantial proportion of subjects, which some researchers have suggested may lead to bias (34). Similarly, in another recent study, de Castro et al. (35) found that positive relations between variables such as energy density and energy intakes were preserved within reporting subgroups defined on the basis of the rEI:BMR ratio, despite the disparate levels of intake reported across these groups.
The stronger associations observed using the revised Goldberg and pTEE methods versus the original Goldberg method might be due in part to improved classification of implausible reporters, as these methods could better estimate energy requirements. pTEE equations have high R² values (7), and the revised BMR equations have been reported to yield better estimates across the range of BMIs (28, 29). It is noteworthy that these revised approaches, although based on independent equations for estimating energy needs, yielded highly concordant estimates of both under- and overreporting. It is important to keep in mind, however, that although the Goldberg method has been evaluated against doubly labeled water, the true validity of these alternative methods is uncertain. In previous studies, researchers have shown the Goldberg method with cutoffs of 2.0 standard deviations to be specific (97%–98%) and reasonably sensitive (72%–74%) for identifying underreporters (21). Reassuringly, when 1.5-standard-deviation cutoffs were applied, the numbers of underreporters identified with these updated methods were highly concordant (94%–96% agreement) with the Goldberg method. Thus, the major discrepancy was the substantially higher level of overreporting identified using these methods. Indeed, the validation of the original Goldberg method suggested this method had limited ability to identify overreporters (21).
The substantial differences in the prevalence of implausible reporting across alternative methods highlight that in the absence of valid objective measures of habitual energy intakes, it is not possible to determine to what extent implausible rEIs reflect misreporting rather than true habitual intakes in subjects whose energy requirements may be poorly estimated (31). However, the findings that emerge after accounting for implausible reporters are consistent with the disparity in associations observed in several studies comparing how questionnaire data versus biomarker-based markers of intake relate to obesity or related health outcomes. For example, in one recent study, urinary sugars and plasma vitamin C, but not food frequency questionnaire-based estimates of intake, were found to be associated with obesity (36). In another population, estimates of vitamin C intake derived from plasma or food records, but not from food frequency questionnaires, were associated with ischemic heart disease (37). In yet another study, positive associations between energy intakes and obesity-related cancers, such as breast and colon cancer, emerged only after using biomarker-calibrated measures of intake, whereas associations with non-obesity-related cancers such as lung cancer and lymphoma remained neutral (38). The absence of objective measures of energy intake is an important limitation of this analysis. However, there are important strengths, including the large sample size with measured anthropometry, and the availability of a validated physical activity level measure to aid estimation of energy needs (21). Nonetheless, as household activities were not included, activity levels might have been assessed with some degree of error (39, 40).
Recent literature has suggested that imprecise or biased intake reporting, often more prevalent among obese subjects, may undermine the validity of research on diet and numerous health outcomes (11, 36, 38, 41, 42). In the absence of objective biomarkers, the updated methods used in this study, which attempted to address limitations identified with the original approach, appear to be a reasonable alternative, enabling researchers to examine the effects of accounting for likely overreporters as well as underreporters. Although its relevance may vary across populations and dietary assessment methods, additionally accounting for overreporting appeared to influence associations with some dietary factors, and this type of misreporting should be considered. Future studies to assess the sensitivity and specificity of these alternative methods against objective measures of energy intake are needed to better evaluate their ability to identify under- and overreporters compared with the Goldberg method.
Author affiliations: Center for Research in Environmental Epidemiology/Municipal Institute for Medical Research-Hospital del Mar, Barcelona, Spain (Michelle A. Mendez); Unit of Nutrition, Environment and Cancer/Cancer Epidemiology Research Programme, Catalan Institute of Oncology, Barcelona, Spain (Michelle A. Mendez, Genevieve Buckland, Carlos A. González); Department of Nutrition and Carolina Population Center, University of North Carolina, Chapel Hill, North Carolina (Barry M. Popkin); Cardiovascular Risk and Nutrition Research Group/Municipal Institute for Medical Research-Hospital del Mar, Barcelona, Spain (Helmut Schroder); Public Health Division of Gipuzkoa and IIS Instituto Investigación Sanitaria BioDonostia, Basque Government, San Sebastian, Spain (Pilar Amiano); Navarre Public Health Institute, Pamplona, Spain (Aurelio Barricarte); Epidemiology Department, Health Council of Murcia, Murcia, Spain (José-María Huerta); Public Health and Health Planning Directorate of Asturias, Oviedo, Spain (José R. Quirós); Andalusian School of Public Health, Granada, Spain (María-José Sánchez); Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Barcelona, Spain (Michelle A. Mendez, Pilar Amiano, Aurelio Barricarte, José-María Huerta, José R. Quirós, María-José Sánchez); and Consortium for Biomedical Research in Obesity and Nutrition Physiopathology, Barcelona, Spain (Helmut Schroder).
Data are from the Spanish cohort of the European Prospective Investigation Into Cancer and Nutrition (EPIC), coordinated by the International Agency for Research on Cancer (agreement NTR/2000/01). The project was financed by the European Commission (agreement SPC.2002332) and participating regional governments, including the Health Research Fund of the Spanish Ministry of Health (exp. 96 0032). Centers from Barcelona, Granada, and Murcia received funding from the Epidemiology and Public Health Centers Network sponsored by the Carlos III Health Institute.
Conflict of interest: none declared.