Objective To determine the accuracy and consistency of fourth-graders' school breakfast and school lunch recalls, obtained during 24-hour recalls, by comparing them with observed intake.
Design Children were interviewed using a multiple-pass protocol at school the morning after being observed eating school breakfast and school lunch.
Subjects 104 children stratified by ethnicity (African-American, white) and gender were randomly selected and interviewed up to 3 times each with 4 to 14 weeks between each interview.
Statistical analysis Match, omission, and intrusion rates to determine accuracy of reporting items; arithmetic and/or absolute differences to determine accuracy for reporting amounts; total inaccuracy to determine inaccuracy for reporting items and amounts combined; intraclass correlation coefficients (ICC) to determine consistency.
Results Means were 51% for omission rate, 39% for intrusion rate, and 7.1 servings for total inaccuracy. Total inaccuracy decreased significantly from the first to the third recall (P=0.006). The ICC was 0.29 for total inaccuracy and 0.15 for omission rate. For all meal components except bread/grain and beverage, there were more omissions than intrusions. Mean arithmetic and absolute differences per serving in amount reported for matches were -0.08 and 0.24, respectively. Mean amounts per serving of omissions and intrusions were 0.86 and 0.80, respectively.
Applications/conclusions The low accuracy and low consistency of children's recalls from this study raise concerns regarding the current uses of dietary recalls obtained from children. To improve the accuracy and consistency of children's dietary recalls, validation studies are needed to determine the best way(s) to interview children.
The 24-hour dietary recall is the most commonly used method for dietary surveys in the United States (1) and is often used to collect information from children (2). For example, children's 24-hour recalls were used in the Bogalusa Heart Study (3), Child and Adolescent Trial for Cardiovascular Health (4), 5-a-Day Power Plus (5), Continuing Survey of Food Intakes by Individuals (6), and School Nutrition Dietary Assessment Study (7).
Validity is the extent to which a method provides accurate information. Reliability is the extent to which the information varies when the same method is administered on different occasions. Research concerning the validity and reliability of children's dietary recalls must include knowledge about foods actually eaten (8,9) because comparing information obtained from 2 self-report methods depends on the child's memory without knowledge of foods actually eaten. Although parents may provide information regarding what their children have eaten, several studies emphasize that once children begin attending school, parents' reports cannot be taken as truth (2,10,11). A better comparison is between information from a self-report method and information from a method independent of the child's memory, such as observation (8,9).
A recent Medline search (12) yielded 8 validation studies contained in 7 publications (10, 13-18) but no reliability studies regarding children's dietary recalls provided without parental assistance. To our knowledge, 9 other studies concern the validity of children's dietary recalls provided without parental assistance (11,19-26). Some studies (19,27) obtained children's dietary recalls on multiple days, and others (28-31) included “practice” recalls that were not analyzed. For one study (22) titled Reliability and Validity of the 24-hour Recall, children's recalls were compared with observations of 3 camp meals; however, only 1 recall per child was obtained. Thus, reliability pertained to the variance accounted for between observed and recalled values, which is different than consistency or variability from one recall to another. To our knowledge, no validation studies have evaluated the consistency of children's dietary recalls provided on multiple days without parental assistance. (The term consistency is used rather than reliability because each recall regards a different dietary event whereas the statistical term reliability regards measuring the same event multiple times.)
Observations of children eating school meals provide an excellent opportunity to validate portions of children's dietary recalls (32,33). Observations in homes may be too intrusive (34), but children are accustomed to being watched while eating at school. Foods eaten at school are important because a significant percentage of children's total daily intake is consumed at school (34). Regulations stipulate that school breakfast and school lunch provide one-fourth and one-third, respectively, of the daily recommended levels for energy, protein, calcium, iron, and vitamins A and C (35,36). More than 95% of children are enrolled in school (37); on a typical school day nationwide, almost 7.4 million and 27 million children participate in the School Breakfast and National School Lunch Programs, respectively (35,36).
Children report what they eat as foods, but the accuracy of dietary self-reports compared with actual intake is typically assessed indirectly at the nutrient level (38,39). Accuracy assessed indirectly may appear high for some nutrients but not others because substitutions of certain items may be similar to items actually eaten in some nutrients but not others (38,39). Insight gained from direct comparisons of foods reported eaten to foods actually eaten may guide research to improve methods for assessing diet to produce more accurate self-reports and provide practical guidance for eating (40-42).
The purpose of this study was to determine the accuracy and consistency of fourth-graders' school breakfast and school lunch recalls obtained during 24-hour recalls by comparing recalls with observations. Accuracy was the extent to which a child's recalls provided correct information compared with observations. Consistency was the extent to which a child's accuracy varied from one day to another.
The institutional Human Assurance Committee approved the study. References regarding reliability studies (43,44) were consulted to determine an appropriate design and sample size. The intraclass correlation coefficient (ICC) estimator was based on moment estimators of the between- and within- variance components (45). The expected length of a 95% confidence interval (45) for assumed ICCs between 0.4 and 0.6 for designs of 2 to 5 recalls per child was calculated. In all cases, the design with the shortest confidence interval per recall had 3 recalls per child. A sample size of 80 was selected because a sample of more children was of marginal benefit to further reducing the expected confidence interval width. Although the sample would be stratified by ethnicity and gender, the study was not powered to detect ethnic/gender differences.
Children were recruited from all 22 fourth-grade classes at 6 schools in one district during the 1999-2000 school year. The schools were selected to obtain a final sample with equal numbers of African-American (AA), white (W), male (M), and female (F) children with high participation in school breakfast and school lunch. A total of 36%, 66%, 68%, 71%, 83%, and 91% of the children across all grades at the respective 6 schools were eligible to receive free or reduced-price school meals during the data collection period. Of the 523 fourth-grade children (21% AAM, 22% AAF, 27% WM, 30% WF) invited to participate, 73% overall (n=382; 22% AAM, 23% AAF, 25% WM, 30% WF) provided child assent and parental consent (46). Of those who agreed to participate, a sample of 104 children, stratified by ethnicity and gender, was randomly selected.
Only randomly selected children who participated in school breakfast and school lunch were observed because contents of meals brought from home and eaten at school can be difficult to identify while conducting unobtrusive observations (47). One of 3 trained dietitians used a recording form while observing 1 to 3 children simultaneously. Observations followed previously used procedures and covered the entire breakfast and lunch periods to account for trading of foods (48,49). An observer stood by tables where children sat; thus, children knew when they were being observed, but did not know who would be interviewed. Practice observations were conducted with each class before data collection to acquaint children with the presence of observers and therefore lessen reactivity during data collection (33).
Inter-observer reliability (IOR) was conducted for training before data collection, and twice monthly throughout data collection. Results from IOR on 18 children indicated 89% agreement across observers for food items in which the amounts observed eaten were within one-fourth serving. This percent agreement is considered satisfactorily high (47,50).
Children were interviewed individually by 1 of 3 trained dietitians the morning after school breakfast and school lunch were observed. A different dietitian conducted the interview than the one who observed the child the previous day. Interviews were conducted in a private location at school, audio-recorded, and transcribed. For each child, a different dietitian conducted each interview on a different weekday when possible. Beginning and ending interview times were recorded to determine length. Before data collection, the minimum number of weeks between any 2 recalls provided by each child was set at 4 to reduce possible learning effects; however, a maximum number was not set.
Interviews followed a written multiple-pass protocol. The original multiple-pass protocol was developed by the United States Department of Agriculture (51) and included 3 passes (52); they recently revised it to include 5 passes (53). The multiple-pass protocol used in this study had 4 passes (Figure 1) and was patterned after the one used by the Nutrition Data System for Research (NDS-R, version 4.03, Nutrition Coordinating Center, University of Minnesota, Minneapolis, 2000). Instead of using the computerized NDS-R version during the interviews, information was written on an interview form.
Inter-interviewer reliability (IIR) was conducted for training before data collection, and monthly throughout data collection. One dietitian interviewed a child while the remaining dietitian(s) sat behind the child and completed an interview form according to what was heard. At the end of the interview, the listening dietitian(s) could ask the child questions. Results from IIR on 8 children indicated that only one additional question would have been asked if a different dietitian had conducted the interview. The study's principal investigator also randomly selected and reviewed 5% of each interviewer's transcripts and audiotapes to ensure adherence to the interview protocol across dietitians.
To determine which meals in the children's 24-hour recalls referred specifically to school breakfast and school lunch, children had to indicate school as the location where the meal was eaten, refer to breakfast as school breakfast or breakfast, and refer to lunch as school lunch or lunch. These requirements were established after determining they occurred in most recalls. Furthermore, the reported mealtime had to be within 1 hour of the observed mealtime. Although some children had difficulty reporting observed meal times exactly, especially at breakfast, most children could report times within 1 hour.
Items observed and/or reported eaten at school breakfast and school lunch were grouped by meal component and statistical weights were assigned (Figure 2). Combination entrees were considered a single meal component and counted only once during analyses. Each item reported and/or observed eaten was classified as a match, omission, or intrusion (Figure 2). Because foods can be reported many ways, items were scored as matches unless it was clear that the child's recall did not describe an observed food. This broad interpretation maximized the scored correctness of the child's recall; thus, true correctness may be overestimated. Examples of items observed and reported that matched were all types of white milk (eg, skim, 1%, whole) and all types of pizza (eg, cheese, sausage, pepperoni). Vegetables (eg, green beans, green peas), milk flavors (eg, chocolate, strawberry, white), and juices (eg, orange, apple) that differed were not matched.
To analyze accuracy for reporting food items (irrespective of amounts), matches, omissions, and intrusions were tallied for each recall for each child, and corresponding rates were calculated (16) (Figure 2). Omission and intrusion rates were calculated separately because a previous study indicated they are empirically uncorrelated (16). The omission rate addresses reality because it concerns the proportion of items eaten but not reported; the intrusion rate addresses the recall because it concerns the proportion of the recall that consists of items that were not observed (38,39). A child's recall could have low accuracy due to high omission and low intrusion rates, low omission and high intrusion rates, or high omission and high intrusion rates.
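The item-level scoring described above can be sketched in a few lines of Python. This is a minimal, unweighted sketch rather than the study's scoring procedure: the function name `score_items` and the sample items are hypothetical, the statistical weights and exact formulas of Figure 2 are not reproduced, and matching is reduced to exact name equality instead of the broader matching rules described earlier.

```python
def score_items(observed, reported):
    """Classify items and return unweighted omission and intrusion rates (%).

    observed: set of item names observed eaten
    reported: set of item names reported eaten
    Assumption: items match only when names are identical, and the
    meal-component weights from Figure 2 are ignored.
    """
    matches = observed & reported
    omissions = observed - reported    # observed eaten but not reported
    intrusions = reported - observed   # reported but not observed eaten

    # Omission rate: share of observed items that went unreported.
    omission_rate = 100 * len(omissions) / len(observed) if observed else 0.0
    # Intrusion rate: share of reported items that were not observed.
    intrusion_rate = 100 * len(intrusions) / len(reported) if reported else 0.0
    return {
        "matches": sorted(matches),
        "omissions": sorted(omissions),
        "intrusions": sorted(intrusions),
        "omission_rate": omission_rate,
        "intrusion_rate": intrusion_rate,
    }


# Hypothetical lunch: four items observed, three reported.
observed = {"pizza", "white milk", "green beans", "apple"}
reported = {"pizza", "white milk", "green peas"}
result = score_items(observed, reported)
```

In this hypothetical recall, green peas count as an intrusion and green beans as an omission, in line with the rule that vegetables that differed were not matched. The two rates use different denominators (items observed versus items reported), which is why they can move independently.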
Amounts eaten were observed, recorded, and scored in servings (Figure 2). To analyze accuracy for reporting amounts for matches, absolute and arithmetic differences (16) were calculated between amounts observed and reported eaten to determine the extent to which children underreported or overreported amounts (Figure 2). Amounts per item were calculated for omissions and intrusions separately (Figure 2) to assess whether these errors in reporting involved small or large amounts of servings.
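The two difference measures for matches can be illustrated with a short Python sketch. It assumes the sign convention of reported minus observed, so negative values indicate underreporting, and it averages plain differences in servings over matched items. The function name and sample amounts are hypothetical, and the exact per-serving formulas of Figure 2 are not reproduced.

```python
def amount_differences(matched_amounts):
    """Return (mean arithmetic difference, mean absolute difference).

    matched_amounts: list of (observed_servings, reported_servings) pairs,
    one pair per matched item.
    Assumption: difference = reported - observed, in servings.
    """
    diffs = [reported - observed for observed, reported in matched_amounts]
    n = len(diffs)
    mean_arithmetic = sum(diffs) / n               # signed errors can cancel
    mean_absolute = sum(abs(d) for d in diffs) / n  # error size, sign ignored
    return mean_arithmetic, mean_absolute


# Hypothetical matched items: (observed, reported) servings.
pairs = [(1.0, 0.75), (0.5, 0.75), (1.0, 1.0), (0.75, 0.5)]
mean_arith, mean_abs = amount_differences(pairs)
```

The arithmetic mean shows the direction of bias, while the absolute mean shows the typical size of the error. Reporting both prevents overreports and underreports from cancelling out unnoticed.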
A single measure of total inaccuracy (Figure 2), which was developed and evaluated previously (26), was used to capture children's total inaccuracy for reporting items and amounts combined. This measure has the advantage of being based on all items and amounts in the observation and/or recall; however, it fails to indicate whether errors are due to omissions, intrusions, or incorrectly reported amounts.
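One plausible way to combine item and amount errors into a single number is sketched below. This is an assumption for illustration, not the definition given in Figure 2 or in the earlier evaluation (26). It sums, in servings, the absolute amount error for matches, the observed amount for omissions, and the reported amount for intrusions, without any meal-component weights. The function name and sample data are hypothetical.

```python
def total_inaccuracy(observed, reported):
    """Sum of amount errors, in servings, over all items in either source.

    observed, reported: dicts mapping item name -> servings.
    Assumption (unweighted sketch): an omission contributes its observed
    amount, an intrusion its reported amount, and a match the absolute
    difference between the two amounts.
    """
    total = 0.0
    for item in set(observed) | set(reported):
        obs = observed.get(item, 0.0)   # 0 if the item was an intrusion
        rep = reported.get(item, 0.0)   # 0 if the item was an omission
        total += abs(rep - obs)
    return total


# Hypothetical recall: one omission (apple), one intrusion (green peas).
observed = {"pizza": 1.0, "white milk": 0.5, "apple": 1.0}
reported = {"pizza": 0.75, "white milk": 0.5, "green peas": 0.5}
inaccuracy = total_inaccuracy(observed, reported)
```

A score built this way equals 0 only for a perfect recall. As the text notes, it does not reveal which kind of error produced a given total.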
To determine the consistency (expressed as ICCs) of the accuracy of children's recalls, omission rate, intrusion rate, and total inaccuracy were each fit to a multivariate normal distribution with an assumed compound symmetric variance-covariance matrix (ie, a mixed-model analysis of variance) using SAS PROC MIXED. Total inaccuracy was square root transformed prior to analysis to stabilize the within-child variances across the range of mean inaccuracy per child. Data management and statistical calculations were conducted using Microsoft Access 2000 (SRI, Microsoft Corp, Redmond, Wash, 1999), SPSS for Windows (version 7.5, SPSS, Chicago, Ill, 1996) and SAS (Release 8.00, TS Level 00MO, SAS Institute, Inc, Cary, NC, 1999).
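For readers unfamiliar with the ICC, the sketch below computes the one-way random-effects ICC from analysis-of-variance mean squares. It is a simplified stand-in, not the PROC MIXED fit used in the study: it assumes balanced data (the same number of recalls for every child), and the function name and sample scores are hypothetical.

```python
def icc_oneway(scores_by_child):
    """One-way random-effects ICC from ANOVA mean squares (balanced data).

    scores_by_child: list of equal-length lists, one list of recall
    scores (e.g., omission rates) per child.
    """
    n = len(scores_by_child)         # number of children
    k = len(scores_by_child[0])      # recalls per child
    if any(len(scores) != k for scores in scores_by_child):
        raise ValueError("balanced data required: same recalls per child")

    grand_mean = sum(sum(scores) for scores in scores_by_child) / (n * k)
    child_means = [sum(scores) / k for scores in scores_by_child]

    # Between-child mean square: spread of child means around the grand mean.
    ms_between = k * sum((m - grand_mean) ** 2 for m in child_means) / (n - 1)
    # Within-child mean square: spread of recalls around each child's mean.
    ms_within = sum(
        (x - m) ** 2
        for scores, m in zip(scores_by_child, child_means)
        for x in scores
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical extremes: perfectly consistent children vs. no child effect.
icc_consistent = icc_oneway([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
icc_no_child_effect = icc_oneway([[1, 2, 3], [1, 2, 3]])
```

Values near 1 mean most variability lies between children. Values near 0 (or below, with this estimator) mean a child's recalls differ from each other about as much as recalls from different children do.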
One hundred four children (24 AAM, 27 AAF, 25 WM, 28 WF) were observed and interviewed once each, 92 (21 AAM, 23 AAF, 23 WM, 25 WF) of the 104 children were observed and interviewed twice each, and 79 (19 AAM, 20 AAF, 20 WM, 20 WF) of the 92 children were observed and interviewed 3 times each for a total of 275 recalls to compare with school breakfast and school lunch observations. The time interval between each recall for each child ranged from 25 to 99 days (mean = 44; median = 41). Interviews to obtain the 24-hour recalls ranged in length from 5 to 29 minutes (mean = 15; median = 14).
Mixed-model analyses of variance, when applied to the square root of total inaccuracy and omission rate, both failed to indicate significant differences by interviewer or weekday. There was a significant sequence (first, second, or third recall) effect using square root of total inaccuracy (P=0.006). Least squares means (LSM) for the square root transformed data were 2.69, 2.58, and 2.45 for the first, second, and third recalls, respectively. Corresponding LSM for untransformed data were 7.54, 7.17, and 6.43 servings, respectively; these were slightly higher than if the transformed data were squared. The ICC for the square root of total inaccuracy was 0.29 (P<0.0003). Table 1 provides a descriptive summary of raw (unadjusted) means, medians, and modes for total inaccuracy, square root of total inaccuracy, omission rate, and intrusion rate for all 275 recalls.
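The relationship between the transformed and untransformed means can be checked with simple arithmetic. The sketch below squares the reported square-root-scale LSM and compares the results with the reported untransformed LSM. The variable names are illustrative; the values come from the paragraph above.

```python
# Reported least squares means by recall sequence.
lsm_sqrt = {"first": 2.69, "second": 2.58, "third": 2.45}   # square-root scale
lsm_raw = {"first": 7.54, "second": 7.17, "third": 6.43}    # servings

# Back-transform by squaring, then compare with the untransformed means.
squared = {seq: round(value ** 2, 2) for seq, value in lsm_sqrt.items()}
gaps = {seq: round(lsm_raw[seq] - squared[seq], 2) for seq in lsm_sqrt}

for seq in lsm_sqrt:
    print(f"{seq}: squared={squared[seq]:.2f}, raw={lsm_raw[seq]:.2f}, gap={gaps[seq]:.2f}")
```

The squared values (about 7.24, 6.66, and 6.00 servings) fall below the untransformed means in every case. This is the expected direction, because the mean of squared values is never smaller than the square of their mean.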
Figure 3, Sections A and B, shows the distributions of omission and intrusion rates, respectively, across all 275 recalls. Results from a mixed-model analysis of variance on omission rates indicated an ICC of 0.15 (P<0.04), meaning that variability within child (ie, from one recall to another for the same child) was much greater than variability between children (ie, from one child to another). A mixed-model analysis of variance was not attempted on intrusion rates because the underlying structure of the data with 0% for 33 of the 275 recalls (12%) did not fit the required normality assumption. To further examine errors for reporting items, omission and intrusion rates were tallied by deciles for the 79 children with 3 recalls each (Table 2).
Across all 275 observations/recalls, there were 2,292 items observed eaten and 1,779 items reported eaten with 35% matches, 41% omissions, and 24% intrusions. Table 3 provides the distribution of matches, omissions, and intrusions by meal component. For all meal components except for bread/grain and beverages, there were more omissions than intrusions.
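As a rough consistency check, the pooled percentages above can be converted into pooled omission and intrusion rates. The sketch assumes unweighted rates, with omissions expressed as a share of items observed (matches plus omissions) and intrusions as a share of items reported (matches plus intrusions). The variable names are illustrative.

```python
# Pooled shares of all classified items across the 275 recalls (percent).
matches_pct, omissions_pct, intrusions_pct = 35, 41, 24

# Items observed = matches + omissions; items reported = matches + intrusions.
pooled_omission_rate = 100 * omissions_pct / (omissions_pct + matches_pct)
pooled_intrusion_rate = 100 * intrusions_pct / (intrusions_pct + matches_pct)

print(f"pooled omission rate:  {pooled_omission_rate:.1f}%")
print(f"pooled intrusion rate: {pooled_intrusion_rate:.1f}%")
```

The pooled values (about 54% and 41%) are close to, but not identical to, the per-recall means of 51% and 39%. Averaging per-recall rates gives each recall equal weight, whereas pooling gives more weight to recalls with more items. Any statistical weights applied in the study would also contribute to the difference.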
For matches, mean arithmetic difference per serving in amounts reported was −0.08, indicating an overall slight tendency to underreport amounts of items actually eaten; mean absolute difference per serving was 0.24. Mean amounts of omitted and intruded items per serving were 0.86 and 0.80, respectively.
The importance of validation studies has been thoroughly discussed (54) and the need for validated methods for assessing children's dietary intake has been emphasized repeatedly (12,25,34,55-59). However, this is the first validation study to evaluate the accuracy and consistency of children's dietary recalls provided on multiple days without parental assistance. Results indicate that the accuracy of children's school breakfast and school lunch recalls obtained during 24-hour recalls was poor compared with observation; furthermore, accuracy was inconsistent from one recall to another for the same child.
Considering that omission and intrusion rates may both range from 0% to 100%, with 0% as perfect, what constitutes acceptable accuracy? Applying an arbitrary pass-or-fail criterion for accuracy that establishes omission and intrusion rates of ≤30% as passing and >30% as failing, only 3 of the 79 children (4%) achieved passing accuracy on each of their 3 recalls (Table 2). With this criterion, a child's recall would have acceptable accuracy even if the child omitted (ie, failed to report) up to approximately one third of items actually eaten, and if up to approximately one third of items reported by the child were intruded (ie, falsely reported). Taken together, a recall with this level of error would have little utility.
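The pass-or-fail rule described above is easy to state in code. The sketch below is illustrative only; the function names and sample rates are hypothetical.

```python
def recall_passes(omission_rate, intrusion_rate, cutoff=30.0):
    """A recall passes only if both rates (in %) are at or below the cutoff."""
    return omission_rate <= cutoff and intrusion_rate <= cutoff


def child_passes_all(recalls, cutoff=30.0):
    """recalls: list of (omission_rate, intrusion_rate) pairs for one child."""
    return all(recall_passes(o, i, cutoff) for o, i in recalls)


# Hypothetical children with 3 recalls each.
child_a = [(25, 5), (10, 30), (30, 15)]   # every recall within the cutoff
child_b = [(25, 5), (10, 35), (35, 15)]   # two recalls exceed the cutoff
passes_a = child_passes_all(child_a)
passes_b = child_passes_all(child_b)
```

Because the rule requires both rates to pass on every recall, a single recall with either rate above 30% is enough to fail a child.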
To what extent can recall accuracy vary from one recall to another for a child and still be deemed acceptably consistent? Applying an arbitrary consistent-or-inconsistent criterion that establishes omission and intrusion rates within 30% across an individual child's 3 recalls as consistent and >30% as inconsistent, only 18 of the 79 children (23%) provided consistent recalls (Table 2). With this criterion, recall accuracy would be consistent for a child who omitted 25%, 10%, and 35% of items and intruded 5%, 35%, and 15% of items during each of 3 respective recalls. However, recall accuracy would also be consistent for a child who omitted 80%, 70%, and 50% of items and intruded 50%, 75%, and 60% of items during each of 3 respective recalls. Thus, although acceptable consistency is important, acceptable consistency without acceptable accuracy is useless because children who are consistently inaccurate provide very little information regarding what they actually ate.
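The consistency rule can be sketched in the same way, using the two example children from the paragraph above. The sketch assumes that "within 30%" means the highest and lowest of a child's rates differ by no more than 30 percentage points, an interpretation that agrees with both examples. The function names are hypothetical.

```python
def within_tolerance(rates, tolerance=30.0):
    """True if the spread (max - min) of the rates is at most the tolerance."""
    return max(rates) - min(rates) <= tolerance


def child_is_consistent(omission_rates, intrusion_rates, tolerance=30.0):
    """Both the omission rates and the intrusion rates must be within tolerance."""
    return (within_tolerance(omission_rates, tolerance)
            and within_tolerance(intrusion_rates, tolerance))


# The two example children from the text (rates in %, one value per recall).
first_child = ([25, 10, 35], [5, 35, 15])
second_child = ([80, 70, 50], [50, 75, 60])

first_consistent = child_is_consistent(*first_child)
second_consistent = child_is_consistent(*second_child)

# The second child is consistent but never accurate (every rate exceeds 30%).
second_ever_accurate = any(
    o <= 30 and i <= 30 for o, i in zip(*second_child)
)
```

Both children satisfy the consistency rule, yet the second child never produces a recall with both rates at or below 30%. This is the sense in which consistency without accuracy carries little information.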
Analyzing reporting of items separately from reporting of amounts provides insight about what contributes to inaccurate recall, which in turn provides insight about what improvements can be made. When items are omitted or intruded, amounts cannot be correct. In this study, children omitted more than half of the items they were observed eating at school breakfast and school lunch; furthermore, of what they did report eating at these two meals, almost 40% was not observed eaten. Thus, research to improve the accuracy of children's dietary recalls must first focus on how to improve children's ability to correctly report items eaten. Furthermore, because children report what they eat as foods, not nutrients, unless reporting of items improves, derived measures of intake (eg, energy, nutrients) from children's recalls are not based on accurate reports of actual foods consumed.
When items were correctly reported (ie, matched), children's reported amounts were fairly accurate (ie, one-tenth and one-fourth serving for arithmetic and absolute differences, respectively). This is similar to results from a previous study with fourth-graders (16). The average omission was more than four-fifths serving; thus, omissions were generally not items for which children ate only small amounts. The average intrusion was four-fifths serving; thus, intrusions were items for which children falsely claimed to have eaten most of the serving.
The total inaccuracy measure for each recall, based on all items observed and/or recalled across breakfast and lunch, captured the dimension of the total error, in servings, of the dietary recall. Across multiple days, this measure had an ICC of 0.29, indicating that there was less variability of total error from child to child than within children. We expected mean total inaccuracy to decrease (and thus accuracy of recall to increase) from the first to the third recall because some children quickly learn tasks and therefore perform better at them. Examining the within-child data, this learning effect was confined to a subset of children, suggesting that similar learning across all children in the population was unlikely; however, our sample size was insufficient to explore this supposition.
There were more omissions than intrusions for 8 of 10 meal components; this suggests that children were not simply substituting intrusions for omissions most of the time. The lowest omission rate by meal component was for beverage, perhaps because milk was regularly included for school breakfast and school lunch. Omissions and intrusions were high among all other meal components, including entree and combination entree. Although each child was asked during the third interview pass whether anything was added to each item (Figure 1), the omission rate was 67% for condiments, perhaps because children also omitted many items to which condiments were added.
The omission and intrusion rates from this study were higher than those from a previous study (16). Although both studies included fourth-graders, recalls in the previous study only included school lunch (16); in this study, however, recalls covered 24 hours. Thus, the cognitive burden of recalling school breakfast and school lunch items in the context of a 24-hour recall seems to be greater than that of recalling lunch items as a single meal, and negatively impacts fourth-graders' ability to accurately recall items eaten. This is similar to a comment by Reynolds et al (17) that underreporting is less of a problem if recalls are restricted to one meal.
To what extent did the interview protocol used in this study contribute to the low accuracy and low consistency of the children's recalls? This question cannot be answered because this study used the same interview protocol to obtain each recall from each child on each occasion. However, results from 3 validation studies (11,24,26), each with a small number of children, indicated that the structure of the interview protocol seemed to influence children's accuracy regarding daycare snack or school lunch intake compared with observation.
Several studies have compared children's dietary recalls with school lunch observations (eg, 10,13,15,16,24-26); however, to our knowledge, only one other study (14) also included school breakfast observations. Although certain aspects of school meals (eg, written cycle menu) may enhance memory for the items eaten (8,12), other aspects of school meals (eg, menu changes) may impede memory for items eaten (8).
An important strength of this study is the use of observation (which did not rely on memory) to validate children's recalls. This strength is especially important in understanding the low accuracy and low consistency of recalls from most children studied. If the recalls had been validated against another self-report method (eg, food diaries), many may have been deemed statistical artifacts rather than actual recalls with low accuracy and low consistency. Dietary recalls are commonly used with children, but research regarding the accuracy and consistency of children's recalls compared with non–self-report methods is scarce. Although there is concern that observations cause reactivity (54), if reactivity were an issue, the expected levels of accuracy would be higher than our results. Considering the importance of obtaining children's dietary recalls, research is needed to develop improved methods to increase the accuracy and consistency of their recalls. Such research needs to rely on non–self-report methods for validation. Conducting observations of children's consumption during part of their day in an environment natural to them, such as school, provides a gold standard for comparison to their recalls.
Another strength of this study is that accuracy was analyzed in several complementary ways. These included accuracy for reporting food items, arithmetic and absolute differences per serving in reported amounts for matches, amounts per serving of omissions and intrusions, and total inaccuracy for reporting items and amounts combined.
There are several limitations of this study. First, children were recruited from only 6 schools, which were not selected randomly; however, the percentages of AAM, AAF, WM, and WF children who agreed to participate were similar to those of the total population of fourth-graders at the 6 schools, which suggests that our sample is representative of the population (46). Second, the study was not powered to detect differences between the ethnic/gender groups; instead, the sample was stratified by ethnicity and gender. Third, analyses were limited to the school breakfast and school lunch portion of the 24-hour recall because these were the only meals observed.
■ Dietitians and nutrition practitioners need to be cautious about how information regarding 24-hour dietary recalls obtained from children is used. Children in this study reported less than half of the items they were observed eating, and almost 40% of what they reported eating was not eaten. Furthermore, a child's accuracy could be high for one recall but low for another. Other objective evidence (eg, anthropometric measures, laboratory values) may provide some, but not all, of the missing information.
■ Nutrition researchers need to conduct validation studies to determine the best way(s) to interview children to enhance recall accuracy. Specifically, direction is needed regarding how to: a) initially instruct children to report items eaten, b) prompt children to report forgotten foods, and c) prevent children from falsely reporting items.
■ Dietitians and nutrition practitioners need to rely on concrete evidence to determine whether changes in methods for obtaining dietary recalls from children actually improve their accuracy and consistency.
This research was supported by grant HL 63189 from the National Heart, Lung, and Blood Institute of the National Institutes of Health. S.D. Baxter was the Principal Investigator.
The authors express appreciation to the children, faculty, and staff of Goshen, Hephzibah, McBean, Monte Sano, Rollins, and Southside Elementary Schools, and to the Richmond County Board of Education in Georgia for allowing data to be collected. Appreciation is also expressed to Candace Kopec, PDt, for her help with conducting observations and interviews, and to Roy Frye for developing the Microsoft Access forms and reports for data entry and data management.