Fruit and vegetable (FV) assessment tools that are valid, reliable, brief, and easy to administer and code are vital to the field of public health nutrition.
To evaluate three short FV screeners (a 2-item SERVING, a 2-item CUP, and a 16-item FV screener) among adults using multiple 24-hour dietary recalls (24HRs) as the reference instrument and evaluate test-retest reliability of the screeners over a two–three week time period.
Validity and reliability study.
244 adults for validity study and 335 adults for test-retest reliability.
Median values for FVs were calculated for the screeners and 24HRs. The Wilcoxon signed rank test was used to compare the screeners to the 24HRs. De-attenuated Pearson correlations were reported for validity and intraclass correlation coefficient (ICC) used for reliability.
Compared with the 24HR values, the estimated median daily servings/cups of FVs were lower for the 2-item SERVING screener, equivalent for men but higher for women for the 2-item CUP screener, and about the same for the 16-item screener. The deattenuated correlations comparing the 24HR with the screeners were positive but weak for the 2-item SERVING screener and positive and moderate in strength for the 2-item CUP and 16-item screeners. The test-retest ICCs were positive and fairly strong for all of the screeners.
While dietary screeners offer a more cost-effective, less burdensome way to obtain gross estimates to rank individuals with regard to FV intake, these methods are not recommended for assessing precise intake levels.
High dietary intake of fruits and vegetables (FVs) is associated with a lower risk for chronic diseases1. It is recognized by some dietary assessment researchers that the 24-hour dietary recall (24HR), and in some situations, the food record can best assess dietary intake2. For FV measurement, biomarkers such as serum carotenoids also prove useful as adjunct measures3,4. However, these intensive measures are typically not included given the invasiveness and expense of these measures. Fruit and vegetable (FV) assessment tools that are valid, reliable, brief and easy to administer and code are vital to the field of public health nutrition.
In 2003, Kim and Holowaty reviewed 10 brief validated FV instruments that measured self-reported FV intakes in adults5. Validation studies of the instruments were based upon comparisons with multiple methods including extended food frequency questionnaires (FFQs), weighed dietary records and single or multiple 24-hour dietary recalls (24HRs). Correlations observed for combined FV intakes for most of the instruments ranged from r = 0.29 – 0.84, and were usually around 0.4 – 0.5. Additional reports examining the validity of brief survey tools have since been published. A study among predominantly African-American adults compared 7- and 31-item FFQ measures of FV intake against 24HRs6. When average values for FV intake at the first administration were compared to 24HRs, both the 7- and 31-item screeners overestimated FV intakes6. Low to moderate correlations were reported for both screeners, although the strength of correlations differed markedly depending on whether the screener was a first- or second-time administration. A second study evaluating the National Cancer Institute’s (NCI) 19-item FV screener among a representative sample of U.S. adults compared FV estimates from the screener to four 24HRs over a year, modeled to take into account within-person variability7. Screener estimates of median intake were within 0.8 servings of that of 24HRs; correlations were 0.66 (men) and 0.51 (women). Lastly, a study among adults participating in community interventions assessed FV intakes with the NCI FV screener and a single question and compared those measures with multiple 24HRs8. The NCI screener generally overestimated, while the single-item screener underestimated, FV intakes measured by 24HRs modeled to take into account within-person variability. Low to moderate correlations were observed for the NCI screener and the single question that assessed overall FV intake; correlations were higher among women than men for both instruments8.
Our primary study purpose was to evaluate three short fruit and vegetable screeners (FVS), namely a 2-item SERVING, a 2-item CUP and a 16-item FV screener, in adults using multiple 24HRs as the reference instrument. One of the screeners was a new 2-item CUP FVS, developed to assess FVs in terms of cups rather than servings, to be consistent with the 2005 Dietary Guidelines for Americans9. As a secondary aim, we sought to evaluate the test-retest reliability of the screeners over a 2–3 week time period. Specifically, we examined how the FV estimates obtained from each of the screeners compared with FV estimates from two or three non-consecutive 24HRs, which were modeled to take into account within-person variability. This was done with the intent of selecting the best of the screeners for use in a larger survey, the Food Attitudes and Behaviors (FAB) Survey, which was developed by the NCI to assess the best correlates of FV intake among a sample of US adults.
The sampling frame included adults 18 years and older in Synovate’s Consumer Opinion Panel (N=450,000) in Fall 2006. A stratified random sampling of households in the panel, with an oversampling of households with African-Americans, was conducted. This study was approved by the NCI Office on Management and Budget as well as by the NCI Institutional Review Boards.
The 2-item SERVING FVS consisted of two items:
Both questions were open ended and participants were asked to write in a number for servings each day.
The 2-item CUP FVS consisted of two items:
In addition, a visual call-out box providing examples of one cup equivalents was placed on the same page of the mailed survey to aid participants with recall of portion sizes. The portions for one-cup equivalents (see Figure 1) were developed using information directly from the USDA MyPyramid website (http://www.choosemyplate.gov/food-groups/vegetables_counts_table.html and http://www.choosemyplate.gov/food-groups/fruits_counts_table.html)10. These questions and the visual call-out box were developed and tested through several rounds of cognitive interviewing among diverse adult participants in Bethesda and Baltimore, Maryland, and New York City. Portion size options were close ended and included the following options: none, ½ cup or less, ½ to 1 cup, 1–2 cups, 2–3 cups, 3–4 cups and 4 cups or more.
The 16-item FVS was a modified version of the NCI FVS evaluated in the Eating at America’s Table Study (EATS) 7,11. The original NCI screener consisted of 19 questions (http://riskfactor.cancer.gov/diet/screeners/fruitveg/allday.pdf)12. After cognitive interviewing to evaluate comprehension and content of items and an expert review by scientists at NCI who originally developed the 19-item screener, as well as other content experts at NCI, it was decided to delete three questions that were deemed as not applicable or relevant for the current 16-item FVS. The final 16-item FVS in FAB consisted of frequency and portion size questions that asked about consumption over the past month (i.e., fruit juice, fruit, lettuce/salad, fried potatoes, other potatoes, dried beans, other vegetables and tomato sauce). Under other vegetables, a shortened version of the 1-cup equivalent visual was provided to aid with portion sizes. Ten frequency category choices ranged from never to 5+ times per day. There were four portion size categories for each food ranging from less than ¼ cup to more than 2 cups, as well as small, medium and large portions where applicable (e.g., for fried potatoes, small, medium and large order were placed in parentheses next to cup size).
Multiple 24HRs were conducted by phone with participants by interviewers experienced in collecting data for the National Health and Nutrition Examination Survey (NHANES). The data were collected using the Automated Multiple Pass Method (AMPM)13 and processed and coded using the USDA Post-Interview Processing System and SurveyNet14. Participants were mailed the USDA Food Model Booklet and a set of household measuring guides ahead of time to facilitate portion size estimation. Attempts were made to obtain three recalls (two weekday and one weekend recalls).
For the validity study, 1,263 adult members of Synovate’s Consumer Opinion Panel were sent a postcard inviting them to participate in a study on eating behaviors. Synovate attempted to contact 833 of those adult members and was able to reach 56 percent (n = 516). Of the 516 they were able to reach, 185 refused or were ineligible, resulting in a sample of 331. Of those, 77 percent (n = 254) completed two or three 24HRs (69% completed 3 recalls and 8% completed only 2 recalls) on the phone over a two to three week time period. After completing the recalls, the 254 participants were mailed a Food Attitudes and Behaviors (FAB) Survey, which included $30 as remuneration. The FAB Survey consisted of a total of 65 questions in 8 sections. The three short FV screeners were embedded in different places throughout the larger survey instrument (the 2-item SERVING FVS in the beginning, the 2-item CUP FVS in the middle and the 16-item FVS at the end of the survey). Earlier cognitive testing on the survey determined that participants did not realize they were being asked their fruit and vegetable consumption three different ways, so this was a successful strategy to minimize potential participant bias. The full survey took approximately 20–30 minutes for participants to complete. Of the 254 who completed at least two recalls, 244 returned the survey.
A separate group of 663 Synovate panel members were invited to take part in the test-retest reliability study of the FV screeners. Of those, 60 percent (n = 401) respondents returned the first survey and 83 percent of those (n = 335) returned the second survey. Participants received $5 for returning each completed survey. Test-retest was conducted over a 2–3 week timeframe.
SAS (v. 9.1, SAS Institute, Cary, NC) was used for all statistical analyses. For the 2-item SERVING FVS, if a range was reported, the midpoint was calculated to one decimal place (e.g., 2–3 servings = 2.5). Portions or fractions were rounded to the closest half serving. For the 2-item CUP FVS, the mean of each response category was calculated in cups (e.g., ½ to 1 cup = 0.75 cups). For the 16-item FVS, the values for fruits, vegetables, FVs, vegetables without fried potatoes and FVs without fried potatoes were computed using the scoring system outlined by NCI (http://riskfactor.cancer.gov/diet/screeners/fruitveg/scoring/allday.html)15. The MyPyramid Equivalents Database, released in Oct 2006, allowed derivation of fruit and vegetable cup equivalent values in the 24HR data16.
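The scoring rules for the two 2-item screeners can be illustrated with a minimal Python sketch. This is not the authors' SAS code. The function names are hypothetical, and only the "½ to 1 cup = 0.75 cups" mapping is stated in the text; every other category value below is an assumed midpoint, and the value for the open-ended top category is an assumption as well.

```python
import re

# Cup values for the closed-ended 2-item CUP categories.
# Only "1/2 to 1 cup" -> 0.75 comes from the text; the rest are assumptions.
CUP_CATEGORY_VALUES = {
    "none": 0.0,              # assumption
    "1/2 cup or less": 0.25,  # assumption: midpoint of 0 and 1/2
    "1/2 to 1 cup": 0.75,     # stated in the text
    "1-2 cups": 1.5,          # assumption: midpoint
    "2-3 cups": 2.5,          # assumption: midpoint
    "3-4 cups": 3.5,          # assumption: midpoint
    "4 cups or more": 4.0,    # assumption: lower bound of open-ended category
}

_RANGE = re.compile(r"(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)")


def score_serving_response(raw: str) -> float:
    """Convert a write-in servings answer to a number of servings.

    A range such as "2-3" becomes its midpoint to one decimal place;
    a single value is rounded to the closest half serving.
    """
    text = raw.strip().replace("\u2013", "-")  # accept an en dash in ranges
    match = _RANGE.fullmatch(text)
    if match:
        low, high = float(match.group(1)), float(match.group(2))
        return round((low + high) / 2, 1)
    return round(float(text) * 2) / 2


def score_cup_response(category: str) -> float:
    """Look up the cup value assigned to a closed-ended category."""
    return CUP_CATEGORY_VALUES[category]


def total_fv(fruit: float, vegetables: float) -> float:
    """Total FV intake is the sum of the fruit item and the vegetable item."""
    return fruit + vegetables
```

For example, `score_serving_response("2-3")` gives the midpoint of the reported range, and `total_fv(score_cup_response("1/2 to 1 cup"), score_cup_response("1-2 cups"))` adds the fruit and vegetable items under the assumed category values.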
For validity, sums for fruits, vegetables and FVs were created for the three screeners and 24HR using cup equivalent values. Legumes were included in the calculation of vegetables in the 24HR data for comparison to the cup equivalent screeners (2-item CUP FVS and 16-item CUP FVS), but not for the serving screener (2-item SERVING FVS), as legumes were specifically queried in the cup equivalent screeners but not the serving screener. A measurement error approach was used to establish the validity of the screeners with truth as estimated by the 24-hour recall11. Outliers were identified, and any observation more than 3 times the interquartile range above the 75th percentile or below the 25th percentile was excluded (except for the 2-item CUP FVS, which was bounded)17. Median intake was calculated after exclusion of participants with outliers and missing data18. The Wilcoxon signed rank test was used to compare the screeners to the 24HRs. The best transformation for the 24HR variables was identified by selecting the Box-Cox transformation that maximized the Shapiro-Wilk statistic, and it was applied to all variables prior to calculating correlation coefficients. All variables were transformed, with transformation parameters ranging from 0.44 – 0.52. A measurement error model was fit, using the 24HRs as the reference instrument. This model allows the screener to have systematic bias as well as random error, but assumes that the 24HRs include only random error that is uncorrelated with the screener11. From this model, the deattenuated correlation between the screener and truth as estimated by the recalls and the attenuation coefficients were estimated. Because the 24HRs are assumed to comprise only true intake and random error, the deattenuated correlations represent the correlation between true intake and the screener, adjusted for the random noise that occurs in the 24HR. This type of measurement error leads to attenuation of the crude correlation coefficients, and, therefore, the deattenuated correlations tend to be higher than the corresponding crude correlation between the 24HR and the screener11.
To assess test-retest reliability, intraclass correlation coefficients were computed for the two administrations of the screeners over a 2–3 week timeframe. This model included a different intercept for the second administration of the survey in case the mean intake on the second survey decreased due to participant fatigue, as often occurs with dietary data.
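One way to picture a test-retest ICC that allows a different mean at the second administration is the two-way ANOVA "consistency" estimator, often labeled ICC(3,1). The sketch below is an illustration under that assumption. The text does not state which estimator was fit in SAS, and the function name `icc_consistency` is hypothetical.

```python
from statistics import mean


def icc_consistency(scores):
    """ICC(3,1): single-measure consistency ICC from a two-way ANOVA.

    scores is a list of rows, one per participant, each holding that
    participant's value at every administration (k = 2 for test-retest).
    Treating administration as a fixed effect removes any shift in the
    mean between administrations from the error term.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 participants with equal-length rows of 2+ values")

    values = [x for row in scores for x in row]
    grand = mean(values)
    row_means = [mean(row) for row in scores]
    col_means = [mean([row[j] for row in scores]) for j in range(k)]

    # Partition the total sum of squares into participant, administration
    # and residual components.
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_admin = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_admin

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_subjects + (k - 1) * ms_error
    if denominator == 0:
        raise ValueError("no variation in scores; ICC is undefined")
    return (ms_subjects - ms_error) / denominator
```

Under this estimator, a uniform drop in reported intake at the second administration (for example, every participant reporting half a cup less) does not lower the ICC, which matches the intent of the separate intercept described above.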
Most of the participants who comprised the validity sample (n = 244) were female (57%), white (71%), aged 35 years or older (53%), had at least a high school education (87%) and earned $32,500 or more annually (68%). Similar findings were observed for participants who comprised the test-retest reliability sample (n = 335), as 51% were female, white (72%), aged 35 years or older (56%), had at least a high school education (84%) and earned $32,500 or more annually (57%) (data not shown). African-Americans were oversampled and comprised 25% of the validity sample and 28% of the reliability sample.
Tables 1A and 1B show median values for fruits, vegetables and FVs for the 24HRs and for each of the screeners for men and women, combined and separately (N=244). The estimated median daily servings of FVs as measured by the 2-item SERVING FVS were lower than the median servings for the 24HRs for both men (3.0 vs. 4.8) and women (4.0 vs. 4.2). The estimated daily cups of FVs as measured by the 2-item CUP FVS were not significantly different from the 24HRs for men (2.5 on screener vs. 2.6 on 24HR); for women, the 2-item CUP FVS values were higher than the 24HR values (3.0 vs. 2.3). Daily cups of FVs as measured by the 16-item FVS were lower than the 24HR values (2.1 vs. 2.6) for men and were not significantly different for women (2.2 on screener vs. 2.3 on 24HR).
Table 2 shows overall deattenuated Pearson correlation coefficients for the three screeners compared to truth as estimated by the 24HRs for participants who comprised the validity sample (n=244). For the 2-item SERVING FVS, the correlation coefficients for fruits and vegetables for men and women combined and for women only were positive, but weak. The correlations obtained for men were not statistically significant. For vegetables, none of the values were significantly different from zero. For the 2-item CUP FVS, the correlation coefficients for FVs for men and women combined, for men, and for women were positive and moderate in strength. For the 16-item CUP FVS, the correlation coefficients for FVs for men and women combined, for men, and for women were also positive and moderate in strength. Correlations of similar direction and strength were also observed for fruits, vegetables, vegetables without fried potatoes, and FVs without fried potatoes, with notable values of 0.77 for vegetables without fried potatoes (men) and 0.65 for fruits (women).
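To see why deattenuated correlations tend to exceed crude ones, the classical variance-ratio correction is a useful simplification. It is not the measurement error model the authors fit, and the function names below are hypothetical. It multiplies the crude correlation between a screener and the mean of k recalls by sqrt(1 + (within-person variance / between-person variance) / k). The sketch assumes every participant has the same number of recalls, which was not the case for all participants in this study.

```python
from math import sqrt
from statistics import mean


def variance_components(recalls):
    """Within- and between-person variance from replicate recalls.

    recalls is a list of per-person lists, each with the same number k of
    recall values (balanced design assumed). Uses one-way ANOVA mean squares.
    """
    n = len(recalls)
    k = len(recalls[0]) if n else 0
    if n < 2 or k < 2 or any(len(person) != k for person in recalls):
        raise ValueError("need 2+ people, each with the same number (2+) of recalls")

    person_means = [mean(person) for person in recalls]
    grand = mean(person_means)

    ms_within = sum(
        (x - m) ** 2 for person, m in zip(recalls, person_means) for x in person
    ) / (n * (k - 1))
    ms_between = k * sum((m - grand) ** 2 for m in person_means) / (n - 1)

    var_between = (ms_between - ms_within) / k
    return ms_within, var_between


def deattenuate_correlation(r_crude, var_within, var_between, n_recalls):
    """Correct a crude correlation for random within-person error in the recalls.

    r_crude is the correlation between the screener and the mean of
    n_recalls recalls per person.
    """
    if var_between <= 0:
        raise ValueError("between-person variance must be positive")
    return r_crude * sqrt(1 + (var_within / var_between) / n_recalls)
```

With few recalls per person and large day-to-day variation, the correction factor grows, so the same crude correlation implies a stronger relationship with usual intake. In small samples the corrected value can exceed 1, which is one reason model-based approaches such as the one used in this study are preferred.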
When the three screeners were compared with the 24HRs stratified by socio-demographic factors, correlations did not improve markedly (data not shown). For all three screeners, the strongest correlations were seen in younger versus older participants (r = 0.39 – 0.52 for 18–34 year olds vs. 0.09 – 0.23 for 55 years and older) and in those reporting a college degree or higher rather than some college or less (r = 0.43 – 0.48 vs. 0.21 – 0.25). Generally those reporting higher income had stronger correlations than those reporting lower income (0.47 for 2-item CUP and 0.55 for 16-item FVS for annual income of $60,000 and above vs. 0.31 for 2-item CUP and 0.24 for 16-item FVS for <$32,500), with the exception of the 2-item SERVING FVS, where the results were opposite (r=0.34 for an annual salary of <$32,500 vs. 0.14 for $60,000 and above). There were stronger correlations for African-Americans versus whites for the 2-item SERVING FVS (0.24 vs. 0.36); results were comparable for African-Americans and whites for the 2-item CUP FVS compared with the 24HRs (0.40 for both); and for the 16-item FVS compared with the 24HRs, the correlations were weaker among African-Americans versus whites (0.27 vs. 0.42).
Table 3 shows the intraclass correlation coefficients (ICCs) for test-retest reliability of all three screeners over a 2–3 week timeframe for participants who comprised the reliability sample (n=335). The 2-item SERVING FVS had ICCs of 0.70 for FVs for men and women combined and for men (the value for women was 0.71). Values of similar direction and strength were observed for fruits only and vegetables only. The 2-item CUP FVS and the 16-item FVS had ICCs that ranged from 0.62 to 0.67 for FVs for men and women combined and men and women only categories. The fruits only ICCs ranged from 0.58 to 0.66 and the vegetables only ICCs ranged from 0.52 to 0.63.
Dietary screeners are increasingly being implemented as an alternative to more comprehensive dietary assessment methods for population-level community (e.g., practice-based) intervention studies and surveillance due to cost and time constraints. For example, the Behavioral Risk Factor Surveillance System (BRFSS)19, the National Health Interview Survey (NHIS)20, and the Health Information National Trends Survey (HINTS)21 all use short screener measures to obtain estimates of FV intake.
The findings for the current study show that short dietary screeners are useful to assess gross level estimates and to rank individuals with regard to fruit and vegetable intake; however, when compared to more comprehensive gold standard methods such as multiple 24HRs and food records, screeners fall short on being able to validly and reliably estimate FV intake. Whenever possible, gold standard measures should be used to obtain accurate and precise intake. To facilitate this process, some advances have been made with regard to 24HRs making them more feasible for researchers to implement in their studies. The Automated Self-Administered 24-Hour Recall (ASA24) is a web-based dietary recall, which can be collected in English and Spanish and is available free of charge22. However, there will be cases when these more comprehensive assessments are still not practical to complete, since they can take 20–30 minutes each, increasing participant burden.
In the current study, overall, the 16-item FVS demonstrated the least bias for capturing total FV intake with and without fried potatoes. The median FV intake on the 16-item FVS was not significantly different from median values in daily cups of FV from the 24HR overall or for women; men underestimated their FV intake when fried potatoes were included as a vegetable. Both the 2-item SERVING FVS and 2-item CUP FVS overestimated fruit intake, which is consistent with previous findings5. However, the 2-item SERVING FVS greatly underestimated vegetable intake when compared with the recalls, whereas the 2-item CUP FVS had the same value as the recalls for overall median vegetable intake. This resulted in the 2-item SERVING FVS underestimating total FV intake and the 2-item CUP FVS overestimating total FV intake when compared with the recalls. These results are consistent with what was found in the Kim and Holowaty review: longer screeners overall tend to have better validity than shorter ones5.
One main point that must be highlighted in interpreting the findings across males and females is that the validity of the estimates seen in Tables 1A and 1B is more suspect and problematic in men than in women. This is further exemplified in Table 2A, where both the 2-item SERVING and 2-item CUP FVS had non-significant correlation coefficients for men when compared with the 24HRs.
The correlations between the 16-item screener and 24HR in this study differed from those in other studies. For example, correlations between the 19-item FVS and the 24HRs in Thompson et al.7 were higher for both males and females than in the current study. Correlations between the 19-item FVS and 24HR in Greene et al.8 were generally equivalent to those obtained in the current study for both the 2-item CUP and 16-item FVS compared with the recalls among males and females (with the current study having generally higher correlations when comparing the screeners with 24HRs for men). Correlations between the 1-item global FVS and 24HRs in the Greene et al. study were comparable to those found in the current study for females, but the values were higher in the current study for males (all of the values for the 1-item global FVS were low and non-significant for males in the Greene et al. study). The correlations in this study were comparable to those found by Serdula and colleagues23 for a 6-item FVS compared with diet records, and slightly lower than those found by Kristal et al.24 for a FVS compared with multiple recalls for FV intake. It is worth noting that differences in validity across studies may have been due to a variety of factors, including varying sample sizes, mode of administration and age of subjects.
One unique characteristic of the current study is that correlation coefficients between the three screeners and the 24HRs were evaluated across various sociodemographic factors, and validity was found to vary by age, education, income and race. Many previous studies did not specifically evaluate sociodemographic differences and did not stratify or adjust for these factors5. In the current study, the correlations for the screeners and the recalls tended to be stronger among younger, more educated and higher income participants, all in the expected direction. However, the results for race were mixed.
We also thought it was important to have a consistent measure, so we evaluated test-retest reliability in a sample separate from the validity sample. The test-retest ICCs were best for the 2-item SERVING FVS but were still relatively strong for the 2-item CUP and 16-item FVS. However, it is important to note that an instrument can be very reliable but not at all valid, so these findings need to be viewed in this context. Moreover, reliability is typically not measured and/or reported in studies of this type5.
A few limitations are to be noted. First, because this was a cross-sectional study, it was not possible to test the screeners’ sensitivities to change in FV intake over time—an important aspect to the screeners’ use in behavioral intervention studies. Secondly, the sample was drawn using a Consumer Opinion Panel, which is commonly equated to a large convenience sample; however, validity and reliability studies typically use convenience samples and the sample size in the current study was larger than many other studies.
Strengths of this study include a relatively large sample size and the ability to have separate samples to test the validity and test-retest reliability of the screeners. In addition, all three screeners were embedded within the larger FAB Survey, allowing the opportunity to test validity of all three screeners concurrently in the same subjects. Although the 24HR is subject to measurement error25 and to bias, including social desirability26, it is considered one of the best measures of self-reported intake for free-living populations. The 24HR approach used consisted of unannounced multiple non-consecutive days of report, was conducted using the multiple pass methodology used in the National Health and Nutrition Examination Survey (NHANES), and was modeled to adjust for within-person variability. Another strength is that we included a call-out box for one cup equivalents for the 2-item CUP FVS, which was devised to be in accord with the recommendations for FVs given in cups from the Dietary Guidelines for Americans, 20059. Example items were selected based on cognitive interviewing with participants about which FVs they consumed most and which they needed guidance on for portion sizes. However, mixed FV dishes were not included, which limited the ability to capture variety with this 2-item FVS. In addition, the relatively low validity coefficients for the 2-item CUP FVS suggest that participants had difficulty estimating the amount of FVs in cups. However, the validity coefficients for the 2-item SERVING FVS compared to 24HRs were worse, so general “serving size” may be more difficult to estimate than cups.
Results from this cross-sectional study add to the body of validity and reliability studies that have been conducted to evaluate short dietary screeners. Our study highlights the continued complexity of dietary assessment. While dietary screeners present a more cost-effective, less burdensome way to collect information on FV intake, these methods are not recommended for assessing precise intake levels but rather for obtaining gross estimates and ranking individuals with regard to intake of a particular food group (e.g., FVs). In the current study, the key concern is whether true median FV intakes and relative intakes across demographic groups are reflected by intakes reported on screeners. The results of this study indicate that the 16-item FVS reflects the median FV intake levels and correlates moderately with the 24HRs and that the 2-item CUP outperformed the 2-item SERVING FVS on these criteria. However, although these short FVS showed adequate reliability, they all had low validity correlation coefficients when compared with multiple 24HRs, especially both of the 2-item FVS among males.
The current study adds to previous evaluation work on the use of the 16-item FVS in surveillance studies. In addition, this work presents the first evaluative information on a new 2-item CUP FVS. While detailed quantitative instruments such as multiple 24HRs or food records are still the preferred methods for collecting information on FV intake, in situations with resource or time constraints, the 16-item FVS may be a viable alternative to get gross or approximate estimates of median FV intake. The 2-item SERVING FVS is not recommended for use, given the low validity estimates; however in constrained situations, the less precise 2-item CUP FVS may provide useful information to rank individuals with regard to FV intake.
This study also highlights a critical gap in the area of dietary assessment: there continues to be a lack of dietary measures that are short and easy to administer and that have robust validity and reliability. The best options, if time allows, are multiple 24HRs and the ASA24. However, because time and burden continue to be important factors for many researchers and practitioners, there is a continued need to modify and/or develop new dietary screeners and assess them for validity and reliability.
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the National Cancer Institute or the Centers for Disease Control and Prevention.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Amy L. Yaroch, Gretchen Swanson Center for Nutrition, 505 Durham Research Plaza, Omaha, NE 68105, Office: 402-559-5500, Fax: 402-559-7302, Email: ayaroch/at/centerfornutrition.org.
Janet Tooze, Wake Forest University, Winston Salem, NC, Email: jtooze/at/wfubmc.edu.
Frances E. Thompson, National Cancer Institute, Rockville, MD, Email: thompsof/at/mail.nih.gov.
Heidi M. Blanck, The Centers for Disease Control and Prevention, Atlanta, GA, Email: hcb3/at/cdc.gov.
Olivia M. Thompson, University of Nebraska Medical Center, Omaha, NE, Email: othompson/at/unmc.edu.
Uriyoan Colón-Ramos, George Washington University, Washington, DC, Email: uriyoan/at/gmail.com.
Abdul Shaikh, National Cancer Institute, Rockville, MD, Email: shaikab/at/mail.nih.gov.
Susanne McNutt, Westat, Inc, Rockville, MD, Email: SusieMcNutt/at/westat.com.
Linda C. Nebeling, National Cancer Institute, Rockville, MD, Email: nebelinl/at/mail.nih.gov.