The food products available, purchased, and consumed in the United States are changing rapidly (1), yet there is little understanding among nutrition professionals of the nature of these changes and what they mean for nutritional health. In 2010, we identified over 85,000 uniquely formulated products in the US food system (2), while there were around 7,600 foods with unique nutrient compositions in the US Department of Agriculture (USDA) food composition tables (FCTs) (3). In addition, the USDA updates the nutrient composition of foods in the FCTs periodically, while the food industry claims to be making key product reformulations (on calories, sodium, sugar, trans fat, and saturated fat) as part of major commitments (4, 5), along with shifts in industry norms and government regulations.
The evolution of the US food system reflects innovations in food production, delivery, preservation and preparation, changing economic conditions, social norms and expectations regarding food availability, safety, variety, and knowledge. Government policy and food industry actions affect the food supply, consumer behavior (demand), and ultimately the diet and health of the US population. The Institute of Medicine, National Cancer Institute (NCI), Centers for Disease Control and Prevention (CDC), Economic Research Service of the USDA, recent conferences, workshops, and publications have highlighted the urgency for finding ways to improve diets by addressing these structural changes (6–8).
Historically, national aggregate data sources have provided macro-level and per-capita measures of food purchase, availability, and nutritional quality: the USDA food balance data (since 1909) and food disappearance data (since 1970) (9), and the Bureau of Labor Statistics’ Consumer Expenditure Survey (since 1980) (10). However, none of these surveys provides detailed data regarding specific food products, their nutritional composition, and who consumed them.
In the 1990s, the National Nutrition Monitoring and Related Research Program (NNMRRP) laid the groundwork for improvements in nutrition monitoring efforts across various agencies and included over 50 surveillance activities that monitor and evaluate the US population’s health and nutritional status (11). Unfortunately, the NNMRRP could not provide continuous support for these agencies to keep up with the rapidly changing landscape of the food supply. By 2005, an expert panel reported that there was not a single data source that provided all of the needed information to address research on food and nutrition policies (6). Rather, researchers rely on numerous public and commercial data sources designed independently with varying purposes, not as parts of a unifying system for monitoring food and nutrient intakes. These data sources vary in sample size, representativeness, breadth and depth of measures, and costs. Clearly, there is a need to enhance the monitoring of our food and nutrition environment in order to create effective programs and policies across the areas of public health, agriculture, economic development, and welfare.
Data sources that monitor the foods sold, purchased, or consumed in the United States are summarized in Supplementary Table 1. These sources include:
Store-level data like Scantrack (The Nielsen Co.) (12) and Total Store Advantage (Symphony IRI) (13) provide information about when, what, and how much of a specific product at the barcode level is sold, and at what price (including promotions and sales) over a week, month, or year, using point-of-sale scanners. Store-level data do not provide insights as to how products are distributed or consumed except for the few store systems with limited socio-demographic measures linked to memberships.
These store-level data measure market shares in the packaged goods sector and are widely used by marketing firms, food manufacturers, and retailers to measure product performance. However, these data provide a limited picture of packaged food sales because they do not include warehouse club stores (e.g., Costco), some large store chains (e.g., Walmart, Whole Foods), and ignore sales of packaged food and beverages from vending machines, restaurants, and specialty markets. Because people’s food shopping preferences and habits appear to be changing toward favoring food purchases from these venues (14), the current store sales data may be increasingly non-representative.
Homescan (The Nielsen Co.) provides data on barcoded food products that households purchase over a week, month, or year (using scanners provided to participating households), along with important socio-demographic information (12). Although an improvement over store-level data, under-reporting exists and intra-household distribution of these products is unknown. Researchers have systematically examined several methodological aspects regarding these data, including sample selection and participation biases (15), potential biases in the demographic characteristics of respondents, high reporting variance within food categories with more random-weight foods such as produce and fresh meat (16), and recording errors and price imputation issues (17, 18).
Prepared foods (e.g., food from restaurants, cafeterias, delivery, concessions) contribute a significant and increasing share of calories consumed (19) and should be monitored. The Consumer Reports on Eating Share Trends (CREST) from the NPD Group (20) contains data on foods and beverages purchased by or for people ≥2 years of age.
Individual-level dietary survey data are publicly available from the What We Eat in America dietary component of the National Health and Nutrition Examination Survey (NHANES) (21). NHANES collects 24-hour dietary recalls of what, where, and how much people eat, and the data are widely used. However, such data are based on only two days, which may not be representative of usual intake unless the NCI or Iowa State University adjustment methods are applied (22, 23). In addition, NHANES does not take into account seasonal differences because it does not sample each primary sampling unit in each season of the year. Also, NHANES does not use a longitudinal sample, limiting researchers’ abilities to determine any potential causality in studies aimed at understanding the factors affecting health behaviors. Moreover, changes in dietary information collection are unaccounted for (e.g., extra probes included in the 1980s–1990s, and changes in the number of days of recall), making it difficult to know the extent to which changes in intake are due to better reporting or to true changes.
Similarly, commercial survey data asking respondents to report their food intakes may not be truly representative and may not allow for longitudinal analyses (e.g., there may be sampling biases especially with the movement toward online surveys). The National Eating Trends (NET) survey from the NPD Group (24) also does not ask about portion sizes and relies on average portion sizes from NHANES, assuming that these two sample populations are comparable even though the sampling frames and methods are vastly different. Nonetheless, NET collects two-week diaries of detailed foods consumed by 5,000 individuals per year, along with demographic information, health status, behaviors, and attitudes. Another NPD data source, Snacktrack (25), can help fill diet monitoring gaps of snacking, which appears to be increasing for both children and adults (26, 27). Snacktrack captures who, when, where, why, and how specific snack-oriented foods and brands are obtained and consumed.
To make the link between food and health, information on the nutritional content of foods is needed. Supplementary Table 2 summarizes existing data sources with such information, which includes:
Food composition tables (FCTs) translate foods reported as consumed by individuals (e.g., from NHANES and NET) to measurements of nutrient intake and diet quality. The USDA National Nutrient Database for Standard Reference (SR) is the basis for these FCTs; in 2010, the SR provided nutrient analyses for approximately 7,600 raw, processed, and prepared foods (28). The SR is updated annually and serves as the foundation for the Food and Nutrition Database for Dietary Studies (FNDDS), MyPyramid Equivalent Database (MPED) (29), and parts of the University of Minnesota’s Nutrition Coordinating Center (NCC) Food and Nutrient Database (30).
The FNDDS uses the SR to determine composition of the “average recipe” for a variety of foods and dishes reported in NHANES (28). Modification codes added since 2003 allow for differences in the types and amounts of ingredients used. However, the 2007–2008 FNDDS applied to NHANES only had 903 unique modification codes out of 5,663 unique foods, and only 3% of the food reported as consumed used any modification codes.
Also, the FNDDS includes brand names for only a limited set of foods with unique USDA food codes, primarily ready-to-eat cereals, infant formulas, candies, and selected fortified foods. In addition, USDA reports that actual updating of each food item in the FCTs used for dietary surveys varies in frequency depending on what foods are considered most important to update in any given year, and only select food categories are comprehensively reviewed for each FNDDS version (31). In some cases, certain food groups are updated only once every six years. Another critical limitation of the FNDDS is the two-year lag in its creation and access, which is of particular concern for packaged and processed foods because they are reformulated and new products are introduced at a far faster pace compared to fresh or raw foods.
Regardless, these FCTs are important for nutrition researchers because they provide estimates for 65 nutrients for the foods studied (3, 8). The MPED translates the amounts of food reported as consumed in NHANES into the number of food-group and subgroup equivalents present in foods and determines if Daily Recommended Intakes and Dietary Guidelines for Americans recommendations are met. However, there is no planned release for MPED for application to NHANES 2005–06, or 2007–08, and since MPED is based on FNDDS and SR, limitations from those data sources also apply to MPED. Moreover, the FCTs used for foods reported in NHANES came from different sources (NCC Food and Nutrition Database before 2003; USDA FCTs since 2003). Thus, there is no official link of food codes and their changes over time beyond a set of bridging studies done by the USDA in the 1980s (32). Because it is unclear how comprehensively updated the USDA FCTs are, historical mis-measurements may have been carried into more recent periods.
The NCC Food and Nutrient Database includes over 18,000 foods and 7,000 branded products, including fast food items. Compared to the FNDDS, the NCC Food and Nutrient Database updates more frequently, particularly for food categories that are more dynamic (e.g., ready-to-eat cereals, beverages, and snacks) (30), and updates fast food items every two years.
Commercial nutrition data focus on packaged and processed foods, which list the information on their nutrition facts panels (NFP). Food and Drug Administration (FDA) rules require serving size (in household measure and metric amounts), total calories, calories from fat, total fat, saturated fat, trans fat, total sugars, total carbohydrate, protein, dietary fiber, sodium, cholesterol, vitamin A, vitamin C, calcium, and iron. The Gladson Database includes more items than public FCTs, and subsets of these (around 2,000 barcodes) are updated weekly. We found 170,000+ barcodes with NFP label data in the 2010 Gladson database compared to the 7,600+ foods available in the 2010 FNDDS (version 4.1). In addition, Product Launch Analytics (Datamonitor) and Mintel Global New Product Database (GNPD) contain NFP and barcode data for new products entering the marketplace. Thus, commercial NFP label data may provide a more complete and updated picture of the nutrients in the US food supply, at least for calories, macronutrients, and a few key micronutrients.
However, commercial data also have limitations. First, although NFP labeling is required for most packaged foods, labeling for raw produce (fruits and vegetables) and fish is voluntary, while delicatessen foods, bakery products and confections sold directly to consumers from the preparation locale, and self-service bulk foods have no such requirements. Second, label values are imprecise: there is a 20% labeling measurement allowance between what is on the NFP and what is found during enforcement analyses (33), and reporting rules permit rounding (e.g., foods with <5 calories or <0.5 gram of fat per serving meet the definition of “calorie free” and “fat free”, respectively). Third, barcodes in commercial NFP label data are not updated equally, and the proprietary nature of these data limits researchers’ abilities to determine the comprehensiveness and frequency of collection and updating.
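As a rough illustration of the second limitation, the sketch below encodes the two rules quoted above. The function names are hypothetical, and treating the 20% allowance as a symmetric band around the declared value is a simplifying assumption made here for illustration, not a statement of how enforcement analyses are actually conducted.

```python
def within_allowance(label_value, lab_value, allowance=0.20):
    """Return True if a lab-measured value lies within the allowance.

    Simplifying assumption: the allowance is treated as a symmetric
    band of +/- 20% around the value declared on the label.
    """
    if label_value == 0:
        return lab_value == 0
    return abs(lab_value - label_value) <= allowance * label_value


def free_claims(calories_per_serving, fat_g_per_serving):
    """List the 'free' claims a serving qualifies for under the
    thresholds quoted in the text (<5 calories, <0.5 g fat)."""
    claims = []
    if calories_per_serving < 5:
        claims.append("calorie free")
    if fat_g_per_serving < 0.5:
        claims.append("fat free")
    return claims


# Illustrative values only: a product labeled at 100 calories per serving.
ok_case = within_allowance(100, 118)   # inside the assumed band
bad_case = within_allowance(100, 125)  # outside the assumed band
claims = free_claims(4, 0.4)
```

Under these assumptions, two products with identical labels could differ noticeably in measured content, which is one reason label-based nutrient estimates should be read as approximations.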
People eat food, but it is the composition of these foods that affects health. Given what is known about these data sources, how can the measures of the foods bought or eaten and the measures of nutritional content of these foods be brought together? Figure 1 provides an example of how the four types of data can be integrated.
Congruent years of the household (or store) food purchase (sales) data can be linked to the NFP label data by barcodes. However, these commercial data are expensive, and their proprietary nature creates challenges in research collaboration and transparency. Our preliminary work shows that in 2008, there were 600,000+ unique barcodes in the food purchase data, and 170,000+ unique barcodes in the NFP label data, of which 150,000+ matched to Homescan (household purchase) and Scantrack (store sales) data. For most of the other 450,000 barcodes, it is possible to derive basic nutrition information (calories, total fat, saturated fat, sugars) for different sized items of the same product, or apply USDA FCT data for fresh or raw produce, seafood, meats, and some dairy and eggs. In our preliminary work to date, about 90% of US dollar and volume sales have basic nutrition information.
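The barcode linkage described above can be sketched as a simple keyed lookup. The records, field names, and values below are invented for illustration and do not reflect the actual layout of the Homescan, Scantrack, or Gladson files; the point is only to show how coverage by barcode count can differ from coverage by dollar sales.

```python
# Hypothetical purchase records: (barcode, dollars spent). Values are made up.
purchases = [
    ("0001", 40.0),
    ("0002", 30.0),
    ("0003", 20.0),
    ("0004", 10.0),
]

# Hypothetical NFP label records keyed by barcode (calories per serving).
nfp_labels = {
    "0001": {"calories": 150},
    "0002": {"calories": 90},
    "0003": {"calories": 200},
}


def link_by_barcode(purchases, nfp_labels):
    """Split purchase records into those with and without an NFP match."""
    matched, unmatched = [], []
    for barcode, dollars in purchases:
        if barcode in nfp_labels:
            matched.append((barcode, dollars, nfp_labels[barcode]))
        else:
            unmatched.append((barcode, dollars))
    return matched, unmatched


def dollar_coverage(matched, unmatched):
    """Share of total dollar sales that has nutrition information."""
    matched_dollars = sum(d for _, d, _ in matched)
    total = matched_dollars + sum(d for _, d in unmatched)
    return matched_dollars / total if total else 0.0


matched, unmatched = link_by_barcode(purchases, nfp_labels)
barcode_coverage = len(matched) / len(purchases)
coverage = dollar_coverage(matched, unmatched)
```

In this toy example three of four barcodes match, yet those three account for a larger share of dollars, because the unmatched item is a low seller. Reporting both measures makes it clearer how much of the marketplace a linkage actually covers.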
Meanwhile, there are established methods for linking the individual food intake and FCT data based on existing food code and recipe files, but these FCTs do not keep up with changes in the marketplace. Consequently, our team is developing a new nutrient database that aims to better reflect the rapidly changing nutrient profiles (including calories, fats, sugars, sodium, cholesterol, and fiber) of foods people consume for each 2-year period that corresponds with NHANES. We plan to estimate a composite nutrient profile for each USDA food code based on weighted averages (weighted by calories purchased) of nutrient profiles of all matched commercial data over the same time period. A critical step is the creation of a cross-walk between USDA food codes and commercial nutrient data sources (barcode level data from Gladson, Product Launch Analytics, or Mintel and commercial food item level data from the NCC Food and Nutrient Database). We will document our steps and assumptions, including validation procedures (e.g., comparing means and variance of NFP label data for branded products with branded food items in the NCC Food and Nutrient Database; comparing NFP label data to NFP data for a subset of products from food manufacturing companies and based on field data collection efforts). We will involve the scientific community in reviewing our efforts.
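The composite profile described above is a weighted mean: for a given food code, each nutrient value is the sum of (calories purchased × nutrient value) across matched products, divided by the total calories purchased. A minimal sketch follows, assuming a crosswalk already exists; the field names, food codes, and numbers are hypothetical and are not drawn from the database under development.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: each commercial product is mapped to a
# food code and carries nutrient values plus total calories purchased.
products = [
    {"food_code": "A", "kcal_purchased": 3000.0, "sodium_mg": 400.0, "sugars_g": 10.0},
    {"food_code": "A", "kcal_purchased": 1000.0, "sodium_mg": 200.0, "sugars_g": 30.0},
    {"food_code": "B", "kcal_purchased": 500.0, "sodium_mg": 50.0, "sugars_g": 5.0},
]

NUTRIENTS = ("sodium_mg", "sugars_g")


def composite_profiles(products, nutrients=NUTRIENTS):
    """Calorie-weighted mean nutrient profile for each food code."""
    weight_sum = defaultdict(float)
    weighted = defaultdict(lambda: defaultdict(float))
    for p in products:
        w = p["kcal_purchased"]
        weight_sum[p["food_code"]] += w
        for n in nutrients:
            weighted[p["food_code"]][n] += w * p[n]
    # Food codes with zero total weight are skipped to avoid dividing by zero.
    return {
        code: {n: weighted[code][n] / total for n in nutrients}
        for code, total in weight_sum.items()
        if total > 0
    }


profiles = composite_profiles(products)
```

Weighting by calories purchased means that heavily purchased products dominate the composite, so the profile tracks what people actually buy rather than what merely exists on the shelf.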
By making the links across these four types of data, two forms of data can result:
These datasets together can help us identify methodological challenges, sampling, and measurement errors in both public and commercial data sources, and provide guidance on improvements to federally funded monitoring efforts to better capture the rapidly changing nutrition contents of foods.
Commercial data can supplement public data in other ways. For example, Gladson data includes the full ingredients lists, allowing researchers to identify ready-to-eat cereals, bars, and cookies reported in NHANES 2007/08 that contain fruit juice concentrate as an added sweetener (2, 34), something the USDA FCTs cannot do. Of course, to maintain the relevance of such data, these linkages will have to be updated annually in order to properly capture the changes in foods in the marketplace.
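An ingredient-list search of the kind described above might look like the following sketch. The records and ingredient strings are invented, and the pattern is an assumption about how such ingredients could appear on labels; a match shows only that a juice concentrate is listed, not that it functions as a sweetener in the product.

```python
import re

# Hypothetical records; the ingredient strings are invented for illustration.
products = [
    {"barcode": "0001", "category": "cereal",
     "ingredients": "Whole grain oats, sugar, apple juice concentrate, salt"},
    {"barcode": "0002", "category": "bar",
     "ingredients": "Rolled oats, honey, almonds"},
    {"barcode": "0003", "category": "cookie",
     "ingredients": "Wheat flour, PEAR JUICE CONCENTRATE, butter"},
]

# Case-insensitive match for "juice concentrate" preceded by any fruit name.
JUICE_CONCENTRATE = re.compile(r"\bjuice\s+concentrate\b", re.IGNORECASE)


def has_juice_concentrate(ingredients):
    """Return True if the ingredient list mentions a juice concentrate."""
    return JUICE_CONCENTRATE.search(ingredients) is not None


flagged = [p["barcode"] for p in products if has_juice_concentrate(p["ingredients"])]
```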
In conclusion, our measurement and understanding of the US food and nutrient supply has not kept up with the modern food landscape. Existing data sources miss important distinctions about the nutritional makeup of food products available across the country and purchased by certain socio-demographic subpopulations. Complex and overlapping sets of measurements exist from commercial vendors capturing many of the rapid shifts in the packaged food sector at the market, household, and individual levels, as well as in major federal surveys capturing raw, packaged, and prepared foods. Yet, such measures miss reformulated, new, and even many old products. While expensive and labor intensive, it is possible to create systematic and meaningful linkages across data on food purchases and sales, food intake, food composition, and nutrition facts panels. The ability of researchers and nutrition professionals to properly integrate these data sources and fully capitalize on the opportunities they offer is in its infancy.
There is no specific or sustained funding for critical methodological challenges that the USDA and CDC face in monitoring food and nutrient intake of US residents. Cross-governmental groups concerned with methods of reducing caloric intake, sodium, added sugars, and saturated fats in the US diet should create initiatives to support the use of all of these existing data collection systems to validate and improve each system. At the same time, market- and consumer-research companies should recognize their potential contributions to public health and nutrition and collaborate to develop strong relationships with researchers and federal agencies to explore options for researchers to gain access to these commercial data via lower cost contracts with academic libraries or the creation of data repositories.
Integrated data that better reflect the nutritional profile of foods purchased and consumed will allow researchers and policy makers to answer questions like:
The answers to these and many other research questions that inform policy-making and national nutrition guidance can emerge from a data system that better captures the dynamics of the actual US food supply and the nutrient intakes of individuals. We can only change what we can measure. To advance our understanding of how changes in the food supply affect how and what Americans eat, we must use comprehensive and appropriate data sources and ensure that our measures are valid and reliable. Only then can we truly assess the effectiveness of our collective efforts toward helping improve our nation’s overall health and well-being.
Sources of support: The work presented in this paper was supported by funds from the Robert Wood Johnson Foundation (Grants 67506 and 68793) and from the National Institutes of Health (R01 HL104580 and CPC 5 R24 HD050924).