Here we give more detail about the nutrition context that motivates this work.
In surveys conducted in the United States, the preferred method of obtaining intake data is the 24-hour dietary recall because it limits respondent burden and facilitates accurate reporting; yet the measure of greatest interest is “usual” or long-term average daily intake. Thus dietary intake is assessed with considerable measurement error. Also, diets comprise numerous foods, nutrients, and other components, each of which may have distinctive attributes and effects on nutritional health. Sometimes it is useful to examine intake of these components separately, but increasingly nutritionists are interested in exploring them collectively to capture patterns of dietary intake. Consumption patterns of these components vary widely; some are consumed daily by almost everyone, while others are episodically consumed, so that 24-hour recall data are zero-inflated. In addition, these various components are often correlated with one another. Finally, it is often preferable to analyze the amount of a dietary component relative to the amount of energy (calories) in a diet because dietary recommendations often vary with energy level, and this approach provides a way of standardizing dietary assessments.
One of the US Department of Agriculture's (USDA's) strategic objectives is “to promote healthy diets,” and it has developed an associated performance measure, the Healthy Eating Index-2005 (HEI-2005, http://www.cnpp.usda.gov/HealthyEatingIndex.htm). The HEI-2005 is based on the key recommendations of the 2005 Dietary Guidelines for Americans (http://www.health.gov/dietaryguidelines/dga2005/document/default.htm). The index includes ratios of interrelated dietary components to energy. The HEI-2005 comprises 12 distinct component scores and a total summary score. See Table 1 for a list of these components and the standards for scoring, and see Guenther et al. (2008) for details. Intakes of each food or nutrient, represented by one of the 12 components, are expressed as a ratio to energy intake, assessed, and ascribed a score.
Table 1. Description of the HEI-2005 scoring system. Except for saturated fat and SoFAAS, density is obtained by multiplying usual intake by 1000 and dividing by usual intake of kilocalories. For saturated fat, density is 9 × 100 usual saturated fat (grams) ...
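The general density rule in the Table 1 caption (usual intake times 1000, divided by usual kilocalories) can be illustrated with a short sketch. The function name `density_per_1000_kcal` and the example values are hypothetical and are not taken from the paper or from NHANES; the special rules for saturated fat and SoFAAS are not covered here.

```python
def density_per_1000_kcal(usual_intake: float, usual_kcal: float) -> float:
    """Return usual intake per 1000 kcal: usual_intake * 1000 / usual_kcal."""
    if usual_kcal <= 0:
        raise ValueError("usual energy intake must be positive")
    return usual_intake * 1000.0 / usual_kcal


# Hypothetical child: 1.2 units of total fruit and 1600 kcal per day.
fruit_density = density_per_1000_kcal(1.2, 1600.0)
print(fruit_density)
```

For these illustrative inputs the density is 1.2 × 1000 / 1600 = 0.75 units per 1000 kcal.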
The HEI-2005 is used to evaluate the diets of Americans to assess compliance with the 2005 Dietary Guidelines, yet use of the HEI-2005 is limited by the challenges described above. Until recently, there have been no solutions to these challenges, so published evaluations have been limited to analyses of mean scores for the population and various subgroups. Freedman et al. (2010) have described a method of estimating the population distribution of a single component of HEI-2005, and the prevalence of high or low scores on that component; but there has been to date no satisfactory way to determine the prevalence of high or low total HEI-2005 scores, considering all of its interrelated components simultaneously. In addition, answers to the complex questions posed in the Introduction remain unavailable. This paper aims to provide a means to do these crucial evaluations.
The 12 HEI-2005 components represent 6 episodically consumed food groups (total fruit, whole fruit, total vegetables, dark green and orange vegetables and legumes or DOL, whole grains and milk), 3 daily-consumed food groups (total grains, meat and beans and oils), and 3 other daily-consumed dietary components (saturated fat; sodium; and calories from solid fats, alcoholic beverages and added sugars, or SoFAAS). The classification of food groups as “episodically” and “daily” consumed is based on the number of individuals who report them on 24hr recalls. If there are only a few zeros for a component, we treat that as a daily-consumed food, and replace all zeros with 1/2 the minimum value of the non-zeros for that food. However, the crucial statistical aspect of the data is that six of the food groups are zero-inflated. The percentages of reported non-consumption of total fruit, whole fruit, whole grains, total vegetables, DOL, and milk on any single day are 17%, 40%, 42%, 3%, 50% and 12%, respectively.
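The zero-replacement rule for daily-consumed components can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `replace_zeros_with_half_min` and the sample amounts are hypothetical, and the cutoff for "only a few zeros" is not specified in the text, so it is not modeled here.

```python
def replace_zeros_with_half_min(amounts):
    """Replace each zero with half the smallest non-zero reported amount."""
    nonzero = [a for a in amounts if a > 0]
    if not nonzero:
        raise ValueError("no non-zero amounts to derive a replacement from")
    half_min = 0.5 * min(nonzero)
    # Non-zero amounts are left untouched; only exact zeros are replaced.
    return [half_min if a == 0 else a for a in amounts]


# Hypothetical reported amounts for one daily-consumed component.
reported = [2.0, 0.0, 3.5, 1.0, 0.0]
adjusted = replace_zeros_with_half_min(reported)
print(adjusted)  # [2.0, 0.5, 3.5, 1.0, 0.5]
```

The smallest non-zero amount here is 1.0, so both zeros become 0.5 and every replaced value stays strictly positive.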
We are interested in the usual intake of foods for children aged 2-8. The data available to us, described in more detail in Section 5, came from the National Health and Nutrition Examination Survey, 2001-2004 (NHANES). The data used here consisted of $n = 2{,}638$ children, each of whom had a survey weight $w_i$ for $i = 1, \ldots, n$. In addition, one or two 24hr dietary recalls were available for each individual. Along with the dietary variables, there are covariates such as age, gender, ethnicity, family income and dummy variables that indicate a weekday or a weekend day, and whether the recall was the first or second reported for that individual.
Using the 24hr recall data reported, for each of the episodically consumed food groups, two variables are defined: (a) whether a food from that group was consumed; and (b) the amount of the food that was reported on the 24hr recall. For the 6 daily-consumed food groups and nutrients, only one variable indicating the consumption amount is defined. In addition, the amount of energy that is calculated from the 24hr recall is of interest. The number of dietary variables for each 24hr recall is thus $12 + 6 + 1 = 19$. The observed data are $Y_{ijk}$ for the $i$th person, the $j$th variable and the $k$th recall, with $j = 1, \ldots, 19$ and $k = 1, \ldots, m_i$. In the data set, at most two 24hr recalls were observed, so that $m_i \le 2$. Set $\tilde{Y}_{ik} = (Y_{i,1,k}, \ldots, Y_{i,19,k})$, where
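- $Y_{i,2\ell-1,k}$ = Indicator of whether dietary component $\ell$ is consumed, with $\ell = 1, \ldots, 6$.
- $Y_{i,2\ell,k}$ = Amount of food $\ell$ consumed. This equals zero, of course, if none of food $\ell$ is consumed, with $\ell = 1, \ldots, 6$.
- $Y_{i,\ell+6,k}$ = Amount of non-episodically consumed food or nutrient $\ell$, with $\ell = 7, \ldots, 12$.
- $Y_{i,19,k}$ = Amount of energy consumed as reported by the 24hr recall.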
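The assembly of the 19 variables for one recall can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `recall_vector` and the input values are hypothetical, and the ordering of the first 12 entries (each consumption indicator immediately followed by its amount) is an assumption. Only the counts (12 + 6 + 1) and the position of energy as the 19th entry are taken from the text.

```python
def recall_vector(episodic_amounts, daily_amounts, energy):
    """Build the 19 dietary variables for a single 24hr recall.

    Assumed layout: six (indicator, amount) pairs for the episodically
    consumed food groups, then six daily-consumed amounts, then energy.
    """
    if len(episodic_amounts) != 6 or len(daily_amounts) != 6:
        raise ValueError("expected 6 episodic and 6 daily-consumed amounts")
    y = []
    for amount in episodic_amounts:
        y.append(1.0 if amount > 0 else 0.0)  # indicator of consumption
        y.append(float(amount))               # amount, zero if not consumed
    y.extend(float(a) for a in daily_amounts)  # one variable per daily component
    y.append(float(energy))                    # energy is the 19th variable
    return y


# Hypothetical recall: the second and fifth episodic groups were not consumed.
episodic = [1.0, 0.0, 1.5, 0.3, 0.0, 2.0]
daily = [5.0, 4.0, 20.0, 25.0, 2500.0, 600.0]
y = recall_vector(episodic, daily, 1600.0)
print(len(y))  # 19
```

Keeping the indicator and the amount as separate entries makes the zero-inflation explicit: a non-consumed food contributes a zero indicator together with a zero amount.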