To assess the accuracy of portion-size estimates and participant preferences using various presentations of digital images.
Two observational feeding studies were conducted. In both, each participant selected and consumed foods for breakfast and lunch, buffet style, serving themselves portions of nine foods representing five forms (eg, amorphous, pieces). Serving containers were weighed unobtrusively before and after selection as was plate waste. The next day, participants used a computer software program to select photographs representing portion sizes of foods consumed the previous day. Preference information was also collected. In Study 1 (n=29), participants were presented with four different types of images (aerial photographs, angled photographs, images of mounds, and household measures) and two types of screen presentations (simultaneous images vs an empty plate that filled with images of food portions when clicked). In Study 2 (n=20), images were presented in two ways that varied by size (large vs small) and number (4 vs 8).
Convenience sample of volunteers of varying background in an office setting.
Repeated-measures analysis of variance of absolute differences between actual and reported portion sizes by presentation methods.
Accuracy results were largely not statistically significant, indicating that no one image type was most accurate. Accuracy results indicated the use of eight vs four images was more accurate. Strong participant preferences supported presenting simultaneous vs sequential images.
These findings support the use of aerial photographs in the automated self-administered 24-hour recall. For some food forms, images of mounds or household measures are as accurate as images of food and, therefore, are a cost-effective alternative to photographs of foods.
Portion size estimation is an important element in self-reports of dietary intake. A body of research suggests that a variety of aids help participants more accurately estimate the amounts of foods consumed (1-13). Three elements affect portion-size reports: perception, conceptualization, and memory (14). Perception is the ability to relate a food amount present in reality to the amount presented by a portion-size aid. Conceptualization is the ability to develop a mental picture of a food portion not actually present and relate it to a portion-size aid. Memory is the ability to accurately recall an amount of food eaten and can affect conceptualization. Studies of perception alone, where subjects identify which picture or model from a set accurately represents the amount of food present, show that subjects vary in their ability to accurately perceive and estimate portion sizes, depending on their age, type of aid, study conditions, and type of food (13-18). Studies involving conceptualization and memory (9,17,19-25) similarly show that accuracy of estimates varies. The literature also suggests a flat-slope phenomenon, in which large portions tend to be underestimated and small ones overestimated (9,14,15,26,27) and that amorphous foods (eg, mashed potatoes) and those eaten in small portions (eg, spreads) are reported less accurately than are other types of foods (13-16,22,25).
The National Cancer Institute with Westat, a research firm; Archimage Inc, a digital arts studio; Baylor College of Medicine; and the US Department of Agriculture Center for Nutrition Policy and Promotion developed an Internet-based, automated, self-administered 24-hour dietary recall (ASA24) for adults intended as a public-use tool for the research community. Participants report foods consumed in one of two ways: by browsing through lists of foods or by typing a food into a text box and searching for it. They are also asked to report a portion size for each food reported. Given the self-administered computer-based environment of ASA24, digital food photographs are used to aid in reporting portion sizes. Most research indicates that the use of photographs is not systematically more accurate than three-dimensional models (16,18,19,28); although one study of children indicates the superiority of photographs over three-dimensional models (24).
The foundation for the images used in ASA24 is a set of 9,000 aerial photographs used in the Food Intake Recording Software System, which is an automated recall developed for children (29). Research, however, has not determined how accurately children report what they consume using the photographs, and the Food Intake Recording Software System photographs did not include all foods and portion sizes commonly consumed by adults. ASA24 is also modeled after the Automated Multiple Pass Method, which was developed by the US Department of Agriculture Agricultural Research Service (30). The Automated Multiple Pass Method, which is used in the National Health and Nutrition Examination Surveys, uses three-dimensional measuring cups and spoons during the in-person interviews and two-dimensional photographs of household cups, shapes, and images of mounds, found in a Food Model Booklet, during the telephone interviews (31). The purpose of our research was to determine how best to provide digital images as portion-size estimation aids in the collection of 24-hour dietary recalls online. An observational feeding study was conducted that incorporated both conceptualization and memory. The images were varied by the angle at which the photographs were taken, the type of image, and the number and size of digital images presented in a computer application. The goal was to determine how to best present digital images in the ASA24 to facilitate accuracy.
Two studies were conducted. Twenty-nine participants took part in the first study (three groups of approximately 10 each), and 20 in the second (two groups of 10). In each, participants came to the study site in separate groups of two to three on two consecutive days. Participants were told that they would be fed two meals the first day and then return the second day to respond to some questions about foods people eat. They were not told that they would be asked about portion sizes consumed. To approximate real-life eating, participants served themselves rather than being provided standard portions for each food. On the first day participants signed an informed consent form and then, for both breakfast and lunch, served themselves nine foods from five food categories in a buffet line: amorphous/soft foods, single unit foods, small pieces, spreads, and shaped foods. (Figure 1 lists the foods served within each category for each study.) Subjects were instructed to take and consume at least some of each food at each meal. A room monitor ensured that no foods were spilled, exchanged, or thrown out. Trained research assistants weighed the serving containers unobtrusively using the UltraShip digital scale, model UL-35 (My Weigh/GKI Technologies, Vancouver, BC, Canada; accurate to 2 g), before and after each participant selected food and left the serving area. The assistants recorded weights in grams into a spreadsheet that calculated a gram weight for the amount self-served by each participant for each food. Any food remaining on the plate was also weighed so that a final determination of how much was consumed could be made. Two staff members independently recorded the weights for each item. When observations differed, the item was reweighed and discrepancies reconciled.
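The weighing arithmetic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the spreadsheet used in the study: the function names and the example weights are hypothetical.

```python
def served_grams(container_before_g, container_after_g):
    """Amount self-served: container weight before minus weight after selection."""
    return container_before_g - container_after_g


def consumed_grams(container_before_g, container_after_g, plate_waste_g):
    """Amount consumed: amount self-served minus food left on the plate."""
    return served_grams(container_before_g, container_after_g) - plate_waste_g


def needs_reweigh(reading_a_g, reading_b_g):
    """Flag an item for reweighing when the two independent readings differ."""
    return reading_a_g != reading_b_g


# Hypothetical weights for one participant and one food (not study data).
print(consumed_grams(1250.0, 1130.0, 14.0))  # 106.0
```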
On the second day, participants used a computer-based application, developed specifically for this study, to view a series of screens that varied by presentation factors described below and selected the portion sizes they estimated to be closest to what they had eaten the day before. For each screen presentation, participants were offered the option to select “less than” or “more than” the amount that was presented by the images to best represent what they had consumed the previous day. After estimating their portion sizes, participants completed a paper questionnaire to rank their preferences among the various screen presentations.
Westat staff used an internal database of research volunteers, including Westat employees, to identify potential participants for each study. Potential volunteers agreed to attend sessions on two contiguous days and take and eat at least some of each food offered at each meal. Participants were recruited to represent a range of demographic characteristics, including sex, race/ethnicity, age, and educational status via a telephone-administered questionnaire. The goal was to recruit about one third with only a high school education for purposes of evaluating the usefulness of the application with lower literacy participants. Participants were required to be familiar with a computer. Westat employees (N = 12) received $90 as an incentive for participation; others, because they had to travel to the location of the study for 2 days, received $120. The Institutional Review Boards of both Westat and National Cancer Institute approved this study.
A computer software program was developed to display portion-size images used for estimating amounts consumed. The application presented participants with different screens displaying pictures of various portion sizes of each food that had been served the previous day. Images showed the food on or in a plate, bowl, or other container with a knife and fork on either side that served as a point of reference. No other information (eg, cups, ounces) regarding portion size was provided on the images. Participants selected a picture that represented the portion size they judged to be closest to the amount they had eaten the day before. For each test, the computer program randomly assigned one of three orders in which differing screen layouts were presented; and, within a food, images appeared in a sequence of smallest to largest portions. Participants completed all judgments for all foods they consumed using one screen layout before moving to the next layout. All subjects ate and reported the portion size for all nine foods within the five food categories leading to 30 to 51 judgments per subject (depending on the study and group described below) for each relevant combination of the factors of food, type of image, and method of presentation (simultaneous vs sequential).
Study 1 had two objectives. The first was to determine the accuracy of portion size estimates using four different types of images: aerial photographs, photographs shot at a 45° angle, images of household measures such as cups and spoons, and images of food mounds used in National Health and Nutrition Examination Surveys (30). The portion size ranges displayed were based on the 5th to the 95th percentiles of reported amounts from National Health and Nutrition Examination Survey 2003-2004. For example, for scrambled eggs, the 5th percentile is slightly less than ¼ c and the 95th percentile is approximately 1 c. Therefore, the range of photos shown was ¼ to 1 c in ¼-c increments for four images. Figure 2 provides example screen shots used in Study 1.
The second objective was to evaluate two different screen presentation methods for each type of photograph: simultaneous—participants were presented a screen showing all portion sizes at once—and sequential—participants were presented with a screen with one photograph of an empty plate and selected buttons to display pictures depicting a sequential increase/decrease in portion sizes.
Types of images tested for each food category are shown in Figure 3. Some photographs were not relevant for certain foods (such as images of mounds for bread) and were, therefore, not tested.
For Study 1, two groups of 10 subjects and one group of nine participants participated in each 2-day cycle. In each group, participants were offered different foods within each food category (shown in Figure 1) to allow for testing a variety of foods within a food category. Evaluation of image type and screen presentation method in Study 1 was conducted by displaying images combining simultaneous or sequential presentation with each of the following image types: aerial photographs, angled photographs, images of mounds, and images of household measures.
The purpose of Study 2 was to assess the effects of both the size and number of portion-size images on the accuracy of estimates. For size, participants were shown either large (1 7/8-in×2½ in) or small (1 5/16-in×1 7/8-in) images. For number, participants were shown either four or eight images. Similar to Study 1, the 5th to 95th percentiles of portion sizes from National Health and Nutrition Examination Survey 2003-2004 were used. The size of the increments varied among the various presentations. For example, for scrambled eggs, the increment between photos was ¼ c when four pictures were shown and ⅛ c when eight pictures were shown. Figure 4 provides example screen shots used in Study 2.
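The scrambled-egg example can be reproduced with a short sketch. It assumes that each series runs in equal increments up to the largest portion shown (1 c), which matches the stated ¼-c and ⅛-c increments; the starting point of the eight-image series is not given in the text, so treating it as ⅛ c is an assumption, and the function name is hypothetical.

```python
from fractions import Fraction


def portion_steps(max_cups, n_images):
    """Equally spaced portion sizes, from one increment up to max_cups."""
    step = max_cups / n_images
    return [step * k for k in range(1, n_images + 1)]


four = portion_steps(Fraction(1), 4)   # increments of 1/4 c
eight = portion_steps(Fraction(1), 8)  # increments of 1/8 c (assumed start)

print([str(p) for p in four])  # ['1/4', '1/2', '3/4', '1']
print(len(eight), str(eight[0]), str(eight[-1]))  # 8 1/8 1
```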
Twenty participants in two groups of 10 participated in Study 2. As in Study 1, subjects were randomly assigned to one of three orders of viewing the screens. Each of these three orders started with a different image size and number combination (eg, large size, four photographs). Based on findings of Study 1 (presented below), all portion sizes were shown simultaneously.
Most portion sizes of food displayed in the images were weighed to determine their gram weight. For some foods, such as honey, butter, and jam, for which there is little variability in volume weights, gram weights for images of the various portion sizes were assigned using standard weights from the Food and Nutrient Database for Dietary Studies (version 1.0, 2004, US Department of Agriculture Agricultural Research Service, Food Surveys Research Group, Beltsville, MD). In this way, each image in the database had an associated gram weight.
Accuracy of participants' estimates was computed as the absolute difference between the gram weight consumed and the gram weight assigned to the selected image. Therefore, lower values indicate greater accuracy.
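As a minimal sketch of this accuracy measure (the function name and example values are hypothetical, not study data):

```python
def absolute_error_grams(consumed_g, selected_image_g):
    """Absolute difference between grams consumed and grams assigned to the
    selected image. Lower values indicate greater accuracy."""
    return abs(consumed_g - selected_image_g)


# The measure ignores direction: over- and underestimates of the same
# size receive the same score.
print(absolute_error_grams(85.0, 60.0))  # 25.0
print(absolute_error_grams(60.0, 85.0))  # 25.0
```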
In Study 1, estimates were missing for three foods because one participant did not take one food and two others took amounts too small to register (<2 g) on the scale. In Study 2, judgments were missing for one food. The univariate distributions of gram-weight differences were inspected to identify outliers, and no consistent patterns were found. Nine estimated amounts and four estimated amounts in Studies 1 and 2, respectively, were found to be more than 10 times the amount consumed. These estimates represented <1% of the possible total 2,005 estimates across both studies. Since these values distorted the overall analyses, they were excluded. Thus, the total numbers of estimates used for analyses were 1,413 for Study 1 and 592 for Study 2.
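The exclusion rule (estimates more than 10 times the amount consumed) can be sketched as a simple filter. The function name, the `max_ratio` parameter, and the example records are hypothetical; only the factor of 10 and the final counts come from the text.

```python
def keep_estimate(consumed_g, estimated_g, max_ratio=10.0):
    """Keep an estimate unless it exceeds max_ratio times the amount consumed."""
    return estimated_g <= max_ratio * consumed_g


# Hypothetical (consumed, estimated) pairs in grams, not study data.
records = [(50.0, 60.0), (5.0, 80.0), (120.0, 100.0)]
kept = [pair for pair in records if keep_estimate(*pair)]
print(kept)  # [(50.0, 60.0), (120.0, 100.0)]

# Reported analytic sample sizes for Study 1 and Study 2.
print(1413 + 592)  # 2005
```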
Repeated measures analysis of variance was used to determine whether presentation factor was significantly associated with accuracy. An interaction term formed by each of the independent variables also was included in the model. These analyses were conducted for each food and each food category. An F test was used to test for significant effects, and multiple comparisons were conducted using the Tukey-Kramer (32) (Tukey JW. The Problem of Multiple Comparisons, 1953, unpublished manuscript) method. Because this was an exploratory study with a small sample size, the ability to detect meaningful differences was poor. The power to detect a 20% difference in mean accuracy for a single food was 9% to 86% with coefficients of variation ranging from 1.0 to 0.2, respectively.
Means of the rankings were computed from preference questionnaires. The means were used to determine the preferred image type for each food and for each food category.
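A minimal sketch of this ranking summary follows. It assumes that a lower rank means a stronger preference (the text does not state the direction of the scale), and the function names and rankings are hypothetical.

```python
from statistics import mean


def mean_ranks(rankings):
    """Mean rank per image type across participants.

    rankings: list of dicts mapping image type -> rank (assumed 1 = most preferred).
    """
    image_types = rankings[0].keys()
    return {t: mean(r[t] for r in rankings) for t in image_types}


def preferred_type(rankings):
    """Image type with the lowest mean rank under the assumption above."""
    means = mean_ranks(rankings)
    return min(means, key=means.get)


# Hypothetical rankings from three participants for one food.
example = [
    {"aerial": 1, "angled": 2, "mounds": 3},
    {"aerial": 2, "angled": 1, "mounds": 3},
    {"aerial": 1, "angled": 3, "mounds": 2},
]
print(preferred_type(example))  # aerial
```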
Approximately half of participants were men. Participants ranged in age from 18 to 69 years, and half were nonwhite. For one third of participants, the highest level of education completed was high school.
Table 1 presents the results for image type in Study 1. The values indicate the mean absolute gram weight differences between measured and reported intake. The image type that yielded the most accurate or smallest absolute difference by food is indicated as well. The only significant analysis of variance that emerged by image type was for corn chips, for which angled images were most accurate. For jam, both household measures and simultaneous presentation were more accurate, with the differences approaching significance (P=0.09 and 0.07, respectively).
Table 1 also shows results for method of presentation for Study 1. Within each food category, there was no significant difference in accuracy by method of presentation. However, overall there was better accuracy across groups for simultaneous presentation vs sequential, particularly for amorphous/soft foods and small pieces. When analyses were conducted by food categories, simultaneous presentation was significantly more accurate for the small pieces food category only (data not shown).
Table 2 shows the percentage of estimates for which the gram weight of the portion size selected by the participants fell within ±10% of the measured weight of food consumed by image type, presentation method, and food category. Approximately 15% of the overall number of estimates were within 10% of the amount actually consumed. The mean percentage of accuracy ranged from approximately 9% for foods represented by images of household measures (cup or spoons) to 23% for foods in the single-unit category (such as bagels).
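The ±10% criterion can be expressed as a short calculation. This is an illustrative sketch only: the function names and example pairs are hypothetical, and the tolerance is taken relative to the measured weight as described above.

```python
def within_tolerance(consumed_g, estimated_g, tolerance=0.10):
    """True when the estimate lies within +/- tolerance of the measured weight."""
    return abs(estimated_g - consumed_g) <= tolerance * consumed_g


def percent_within(pairs, tolerance=0.10):
    """Percentage of (consumed, estimated) pairs that meet the tolerance."""
    if not pairs:
        return 0.0
    hits = sum(within_tolerance(c, e, tolerance) for c, e in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical (consumed, estimated) pairs in grams, not study data.
pairs = [(100.0, 105.0), (100.0, 120.0), (50.0, 47.0), (200.0, 150.0)]
print(percent_within(pairs))  # 50.0
```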
Participants generally preferred the aerial photos for food categories other than spreads for which household measures were preferred (data not shown). Further, participants in Study 1 indicated a striking preference (28 of 29) for simultaneous vs sequential presentation.
Accuracy results for estimates for different photograph sizes and numbers are shown in Table 3. For three foods only four portion-size photos were available, and thus statistical testing was not possible. For number of photos, only one (carrots) was statistically significant, showing greater accuracy for eight rather than four photographs. For the remaining 11 foods, eight vs four photographs were more accurate, though not significantly so. For photograph size, there were no significant differences in accuracy. Though not significant, there was a tendency for portions to be estimated more accurately with large photographs for three of four amorphous/soft foods. When analyses were conducted by food category, eight vs four pictures was significantly more accurate for the amorphous and small pieces food categories only.
The percentages of foods that had estimated weights within ±10% of the measured weights by image size, number of images, and food category are shown in Table 4. The overall percentage of food intake estimates that were within 10% of what was actually consumed was 14%. There were no differences by image size. Accuracy was higher (16%) with four vs eight images (12%). The accuracy of estimates ranged from a low of 11% for foods in the spreads category to 16% for single-unit foods, such as bagels, bread, and chicken breast.
Data (not shown) from the preference questionnaire indicated that participants strongly favored larger over smaller photographs and four vs eight photographs.
Estimation of portion size is a difficult task. Even when a food or beverage is present and different portion-size estimation aids are tested, some individuals still make errors of 40% or greater (14-17). Measurement error in reporting portion size will always exist given that portions must first be accurately perceived, conceptualized, remembered, and reported. The only other study (17) that has assessed portion-size estimation accuracy using digital images displayed on a computer found that accuracy was no different between two types of aids: those displayed on a computer vs those displayed as pictures on a poster. We know of no other study that addressed the angle at which a photograph was taken, eight vs fewer pictures, method of presentation, or preferences regarding different aids or how they were presented.
No single image type was significantly associated with more accurate estimates for individual foods or food categories. This was also true when comparing presentation of photographs/images simultaneously vs sequentially. Similarly, the size and number of images had little measurable effect on accuracy. However, though not statistically significant, eight rather than four images tended to be more accurate. These findings are similar to another study that concluded that varying the size of pictures did not affect accuracy and that more pictures were more accurate than fewer (14).
The accuracy results were largely not statistically significant, not surprising in this small sample. However, the findings were still instructive in how to use limited financial resources. A main objective of Study 1 was to decide whether to use a single type of image (photographs) or multiple types of images (photographs, mounds, and household measures) for the different food categories. An important question was whether to continue to add to the existing set of aerial photographs or use different types of images that emerged as more accurate for specific food categories. Similar to other studies, accuracy varied by type of food, with amorphous foods being difficult and single-unit foods (such as bagels) being easiest to estimate (13-16,22,25). Results for different types of photographs indicated that for foods like spreads, images of household measures (such as pictures of teaspoons or tablespoons) yielded more accurate results than aerial photographs. This is fortunate because taking photographs of many foods is more expensive than taking photographs of a few household measures. For the foods for which we have no photographs, our findings allow us to feel comfortable using images of either mounds or household measures as a cost-effective option for providing images. Images of household measures or mounds can be used for the food categories for which subjects preferred these types of images and/or estimated portions more accurately (eg, pasta salad). Given the number of foods that can potentially be reported, many of which are consumed rarely, and the changing food supply, using images of household measures or mounds provides a reasonable option for providing images for foods for which we have no food-specific portion-size photographs.
The finding that aerial photos generally were about as accurate as the other types of images also was fortuitous, given that we already had a significant number of such photographs available. Findings from Study 1 allow us to feel comfortable using the existing aerial food photography and to continue taking photographs in this manner in the future.
The overriding goal was to choose images that support accuracy for reporting portion size; preference information aided decision making in instances when there were no clear findings regarding accuracy. The preference results clearly indicate that images of varying size should be presented simultaneously rather than sequentially. Fortunately, doing so is supported by the direction of the accuracy results. The sequential method was generally viewed as more burdensome; subjects disliked the extra steps (clicks) needed to view alternatives. Photograph size did not affect accuracy, although most participants preferred larger over smaller photographs. In contrast, eight vs four tended to be slightly more accurate, despite the preference for fewer pictures. Therefore, the ASA24 will present more pictures, despite participant preferences. This decision will necessarily lead to smaller pictures. It is possible that six pictures are enough, but we have no evidence to support that conclusion. Furthermore, it is cost-effective to take eight pictures, given that after a food is purchased and prepared for photography, the marginal cost of taking more vs fewer pictures is small. Future research could further elucidate how many pictures are optimal.
As most previous literature indicates, accuracy of portion-size estimation in the context of a 24-hour dietary recall is challenging. Obtaining estimates to within 10% of actual gram weight consumed proved to be difficult. Many individuals do not attend to portion size while eating; and even if they do, visualizing and remembering what was eaten and making estimates using digital photographs or other types of images are difficult. Another source of error is that we asked participants to select the best portion size match among four or eight pictures (representing specific gram weights), yet in reality amounts of foods eaten vary on a continuous scale. Realtime data collection has potential for improving the accuracy of reports, but the bias and burden inherent in recording food intake can greatly affect what and how much is consumed, leading to intakes that are not typical and usually in the direction of undereating (33-35).
By design, photographs were not labeled with portion-size information. Our intention was to evaluate the use of only photographs in assessing accuracy so that we could make decisions regarding the optimal presentation of images in the ASA24. However, in ASA24 portion-size images are labeled with common units (eg, ounces, cups) as an additional reference point. It is unknown if this will improve accuracy or if it will introduce bias that could lead to misreporting.
One limitation of our studies is the small sample size necessitated by the expense of such studies. Larger numbers would have provided more statistical power and perhaps more clarity with respect to some of the issues tested. Our purpose was to conduct small targeted studies to provide direction for how best to proceed with software development for the ASA24.
The overall goal of the ASA24 is to provide a publicly available 24-hour recall that could be unscheduled, automated, and self-administered. Such a tool would make feasible the collection of multiple recalls in large-scale epidemiologic studies, behavior trials, and clinical research, thus enhancing investigators' ability to assess dietary intakes. This instrument could either be sent to participants over the Internet or administered in a clinic/office setting at low cost. Our goal is for this new technology to provide information that is comparable to information provided by an interviewer-administered Automated Multiple Pass Method recall. Multiple small-scale cognitive and usability tests, such as the studies presented here, have been included in the development of the ASA24. Incorporating photographs based on these studies will enhance the application. The ASA24 is available from the National Cancer Institute at http://riskfactor.cancer.gov/tools/instruments/asa24.html.
Funding/Support: This research was funded by National Cancer Institute contracts awarded to Archimage, Inc, and Westat.
Statement of Potential Conflict of Interest: No potential conflict of interest was reported by the authors.