Use of direct observation to characterize neighborhood retail food environments is increasing but to date most studies have relied on a single observation. If food availability, prices, and quality vary over short time periods, repeated measures may be needed to portray these food characteristics. This study evaluated short-term (2-week), within-season temporal stability in retail food availability, prices, and quality.
In-person observations of retail food stores at 2 time points, 2 weeks apart.
Southwest Chicago, IL.
157 food stores.
Availability and prices of foods selected from the following food groups: fruits, vegetables, grains, meats and beans, and dairy, as well as fresh produce quality.
Temporal stability was tested for availability using a McNemar test and for price and quality using a Wilcoxon signed rank test.
We found that measures of food availability and prices as well as fresh produce quality at stores were generally stable at the 2 time points.
This study suggests that a single observation may be sufficient to accurately characterize within-season food availability, food prices, and fresh produce quality.
Use of direct observation to characterize neighborhood retail foods (e.g., availability, prices, quality) is increasing.1–13 Studies in the United States (U.S.) have generally highlighted barriers to healthy eating for residents in Black and low-income neighborhoods,14 and have helped to identify points for intervention at the environmental level consistent with an ecological model. Yet, few have documented the reliability or validity of their measures.5 Furthermore, with notable exceptions,5, 7 most studies have relied on a single observation to characterize food availability, prices, and quality. While seasonal variations in these food characteristics are recognized,5, 7 little is known about whether retail food characteristics are stable over a short period of time. This understanding is important to inform future studies involving direct observation of retail foods. Specifically, temporally stable measures would support the sufficiency of a single observation to portray food characteristics within season. On the other hand, if variable over short periods, repeated measures may be needed to more accurately assess food characteristics. The purpose of this study was to evaluate short-term (2-week), within-season temporal stability in retail food availability, prices, and quality in diverse neighborhoods in Chicago, Illinois.
The City of Chicago is divided into 77 officially designated neighborhoods, or “Community Areas”.15 This study was conducted in 5 contiguous, racially/ethnically and socioeconomically diverse Community Areas in southwest Chicago: Chicago Lawn, West Lawn, Ashburn, Englewood, and West Englewood. Chicago Lawn and West Lawn are involved with the Illinois Prevention Research Center, with which we collaborated for this project. In 2000, the 5 Community Areas included populations between 29,235 (West Lawn) and 61,412 (Chicago Lawn). The percentage of non-Hispanic Black population ranged from 2.6 (West Lawn) to 97.8 (Englewood and West Englewood), while the percentage of non-Hispanic White population ranged from 0.4 (Englewood and West Englewood) to 42.9 (West Lawn). Further, there was wide variation in the percentage of Hispanic residents: 0.9 (Englewood) to 51.9 (West Lawn). Socioeconomic conditions also differed among the Community Areas with the percentage of residents below poverty ranging from 6.9 (Ashburn) to 43.8 (Englewood).
Based on a 2006 Chicago Department of Revenue list, we identified food outlets falling in zip codes that roughly corresponded with the Community Area boundaries. The list did not classify outlets by type or include information useful for categorization (e.g., NAICS codes, annual sales, square footage) beyond the outlet name. Therefore, guided by definitions from the Food Marketing Institute,16 we used store names and in-person observations to classify outlets into 1 of 4 categories: grocery stores, liquor stores, convenience/corner stores, and other food stores (bakeries, delis, drug stores). Briefly, grocery stores had both fresh produce and fresh meat sections; liquor stores sold liquor as the primary good or had “liquor” in the store name; “other” food stores were identified by primary good sold or store name (e.g., bakery, pharmacy such as Walgreen’s); and convenience/corner stores sold gasoline (39%) or were classified by exclusion from the other categories.
We drew upon data from a project whose objectives included understanding the relative availability and prices of more and less healthful food choices and identifying fruits and vegetables in racially/ethnically diverse neighborhoods. The measurement instrument was adapted from 3 existing instruments.1, 11, 13 It included a fairly comprehensive list of fresh, frozen, and canned fruits (n=35 varieties) and vegetables (n=73 varieties). For the subset of fruits and vegetables selected for price and quality assessments and for the other food groups (grains, meats and beans, dairy), inclusion of food products was guided by dietary recommendations (i.e., MyPyramid),17 commonly consumed foods in the U.S.,18 and food preferences of the 2 predominant, non-majority racial/ethnic populations (Black, Hispanic) in the Community Areas.19 Drawing on foods identified in MyPyramid,17 grains, meats and beans, and dairy were subdivided into “more healthful” choices (e.g., brown rice, fresh skinless chicken breast, skim milk) or “less healthful” choices (e.g., white rice, fresh split chicken breast with skin, whole milk). Thus, this paper includes food products of the following types: fruits (fresh, frozen, canned), vegetables (fresh, frozen, canned), grains (more healthful, less healthful), meats and beans (more healthful, less healthful), and dairy (more healthful, less healthful).
To help assure it was appropriate for the study area and to improve clarity of operational definitions, the instrument was pretested and revised prior to observer training and the field period. More specifically, using an iterative process, we collected practice data at different store types located in Chicago communities that were demographically similar to the study Community Areas and revised the instrument and instructions based on questions that arose. Because we wanted to identify product sizes that would be most commonly sold in our study area, modifications to the identified product sizes for the price assessment (described below) were among the most common revisions. Pretesting also resulted in many other clarifications (e.g., not to consider instant or “boil in the bag” types of rice for availability or price assessment; unseasoned fresh ground beef and turkey only; navel oranges only for price assessment) to promote observation consistency.
We measured availability as presence or absence of each of 35 varieties of fruits (fresh, canned, and frozen), 73 varieties of vegetables (fresh, canned, and frozen), and 25 other food products, representing the food types described above. Table 1 shows the specific food products assessed.
We measured prices for a subset of 13 fresh fruits and vegetables and the 25 other food products (Table 1). For fresh fruits and vegetables, we recorded prices as either price per pound or per item, depending on how the item was sold at the store. With 3 exceptions (avocado, mango, head of iceberg lettuce) that were most commonly sold per item, the assessed fruits and vegetables were typically sold per pound. For those stores at which the fruit or vegetable was sold in the less common unit (e.g., apples sold per item), we converted prices to the more common unit (e.g., apples sold per item were converted to per pound) using average gram weights per product from the United States Department of Agriculture (USDA) (the weight of a medium-sized item was used when more than 1 weight was provided in the database).9, 20 Across the 13 produce varieties and the 2 time points, 11.6% of the prices were converted using this method. For grains, meats and beans, and dairy, we measured prices of each food product using a pre-selected size (e.g., 20 ounce loaf of bread, gallon of milk) but not brand. Pretesting informed the sizes selected with the goal of attaining sizes commonly available and comparable for more and less healthful choices. If more than 1 brand was available at the chosen size, then the price of the lowest cost brand was recorded.
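The unit conversion described above can be sketched as follows. This is a minimal illustration and not the study's actual procedure: the function names are hypothetical, and the 182 g item weight and $0.50 price in the usage example are illustrative values, not figures taken from the study or quoted from the USDA database.

```python
GRAMS_PER_POUND = 453.592  # standard conversion factor


def price_per_pound(price_per_item: float, grams_per_item: float) -> float:
    """Convert a per-item price to a per-pound price using an average item weight."""
    if grams_per_item <= 0:
        raise ValueError("grams_per_item must be positive")
    return price_per_item * GRAMS_PER_POUND / grams_per_item


def price_per_item(price_per_lb: float, grams_per_item: float) -> float:
    """Convert a per-pound price to a per-item price (e.g., for avocado or mango)."""
    if grams_per_item <= 0:
        raise ValueError("grams_per_item must be positive")
    return price_per_lb * grams_per_item / GRAMS_PER_POUND


# Hypothetical example: an apple sold at $0.50 each, assumed to weigh 182 g.
apple_per_lb = price_per_pound(0.50, 182.0)
print(round(apple_per_lb, 2))  # 1.25
```

The same average weight is applied to every store that sold the item in the less common unit, so any error in that weight shifts the converted prices at both time points equally and should not by itself create an apparent change between visits.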
Using measures from a prior study,13, 21 we assessed quality for a subset of fresh fruits and vegetables (n=8) (Table 1). Briefly, for each of these produce varieties, a unique high-quality description for external appearance and condition that covers the domains of color, texture, form, and damage or defects is available. These quality descriptions were developed from standards provided by the USDA.22–24 Using these high-quality descriptions, observers rated each produce variety on a 4-point scale based on the estimated proportion of items at the store that did not meet the high-quality standard: excellent (0–4%), good (5–24%), fair (25–49%), or poor (50–100%). After reverse-coding the quality scores so that higher scores correspond to higher quality, we calculated the mean score for each store; this mean score was used in the analysis. In a prior study using this measure, based on data from 3 stores (1 assessed at each of 3 points during the 5-week data collection period: day 1, end of week 1, end of week 3), Spearman rank correlation coefficients comparing 2 observers’ ratings ranged from 0.82 to 0.85.13
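The scoring steps above (rating, reverse-coding, store-level mean) can be sketched as follows. The numeric coding of the raw ratings (1 = excellent through 4 = poor) and the handling of non-integer percentages at the category boundaries are assumptions made for illustration; the text specifies only the four categories, their percentage ranges, and that scores were reverse-coded so that higher values indicate higher quality.

```python
from statistics import mean


def rate_quality(pct_failing: float) -> int:
    """Raw rating from the percent of items NOT meeting the high-quality standard.

    Assumed coding: 1 = excellent (0-4%), 2 = good (5-24%),
    3 = fair (25-49%), 4 = poor (50-100%).
    """
    if not 0 <= pct_failing <= 100:
        raise ValueError("pct_failing must be between 0 and 100")
    if pct_failing < 5:
        return 1
    if pct_failing < 25:
        return 2
    if pct_failing < 50:
        return 3
    return 4


def reverse_code(raw_rating: int) -> int:
    """Flip the 4-point scale so that higher scores mean higher quality."""
    return 5 - raw_rating


def store_mean_quality(pct_failing_by_variety: list[float]) -> float:
    """Mean reverse-coded quality score across the produce varieties rated at a store."""
    return mean(reverse_code(rate_quality(p)) for p in pct_failing_by_variety)


# Hypothetical store with four rated varieties.
print(store_mean_quality([2, 10, 30, 60]))  # 2.5
```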
We collected data over 11 weeks from May to July 2006. Two observers completed 20 hours of training, which included didactic sessions, completion of practice stores in a group with the field coordinator and alone, and feedback and problem-solving focused on areas of low agreement on practice stores. We mailed each store an introductory letter that included the principal investigator’s telephone number to contact if the owner had questions or did not want to participate. One of the 2 observers then visited each store twice during regular business hours, between 8:30 am and 5:00 pm, with the second visit exactly 2 weeks after the first. Each observer visited approximately half of the stores. As designed for the data collection and confirmed by chi-square tests post-data collection, the store types and Community Areas visited did not differ by observer. Over 95% of the stores were visited at both time points by the same observer. We selected a 2-week interval to capture within-season variability in retail food characteristics. By visiting each store precisely 2 weeks later, our design controlled for any differences in food delivery and shelf-stocking by day of the week. Approximately halfway through the field period, the field coordinator rated 4 stores with each of the 2 observers (8 stores total); we used data from these stores to evaluate inter-rater agreement. Following the study, we mailed each store a thank you letter. Institutional Review Board approval was not required for this study.
Analyses were focused on testing whether the measures of retail food characteristics (availability, prices, quality) were temporally stable, or consistent at the 2 time points. Temporal stability was tested for availability using a McNemar test and for mean price and quality using a Wilcoxon signed rank test. Percent agreement was used to assess inter-rater agreement for food availability and prices; Spearman’s rank correlation was used to assess inter-rater consistency in ratings for the ordinal measure of fresh produce quality.
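To make the availability test concrete, the sketch below computes a two-sided exact McNemar p-value from paired presence/absence observations. It is an illustration under stated assumptions: the text does not say whether the exact (binomial) or chi-square form of the McNemar test was used, the function names are hypothetical, and the data in the usage example are invented.

```python
from math import comb


def discordant_counts(time1: list[bool], time2: list[bool]) -> tuple[int, int]:
    """Count stores whose availability changed between the 2 visits.

    Returns (b, c): b = available at time 1 only, c = available at time 2 only.
    """
    if len(time1) != len(time2):
        raise ValueError("paired observations must have equal length")
    b = sum(1 for x, y in zip(time1, time2) if x and not y)
    c = sum(1 for x, y in zip(time1, time2) if y and not x)
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value based on the discordant pairs only."""
    n = b + c
    if n == 0:
        return 1.0  # no store changed, so there is no evidence of instability
    k = min(b, c)
    # Binomial(n, 0.5) lower-tail probability, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Invented example: 1 store dropped a product and 7 stores added it.
print(mcnemar_exact(1, 7))  # 0.0703125
```

Only stores whose availability changed contribute to the statistic, which is why a product carried by nearly all or nearly no stores provides little information about stability. For the paired price and quality comparisons, a Wilcoxon signed rank test is available in SciPy as `scipy.stats.wilcoxon`.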
We identified 195 food stores in the 5 Community Areas. Of these, 23 stores refused any visit, and 3 additional stores refused at time 2. Overall, the participation rate was 86.7%. Twelve stores were not visited at time 2 because either the time 1 survey revealed that they did not carry any of the foods of interest (n=9) or the store was permanently closed at time 2 (n=3). Thus, 157 food stores comprised the sample for this analysis, including 29 grocery stores, 88 convenience/corner stores, 27 liquor stores, and 13 other food stores. The mean length of time for data collection per store was 20.8 minutes and ranged from less than 5 to 90 minutes. On average, data collection lasted twice as long at grocery stores as at convenience/corner, liquor, and other food stores (35.7 versus 15.9, 16.9, and 13.9 minutes, respectively, at time 1).
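The sample flow reported above can be checked with simple arithmetic. The sketch below assumes that the participation rate counts stores that did not refuse at either time point; the text does not define the rate explicitly, but this reading reproduces the reported 86.7% and the analytic sample of 157 stores.

```python
identified = 195        # food stores identified in the 5 Community Areas
refused_any_visit = 23  # refused at time 1
refused_time2 = 3       # refused at time 2 only
no_study_foods = 9      # carried none of the foods of interest at time 1
closed_time2 = 3        # permanently closed by time 2

# Assumed definition: participants are stores that never refused a visit.
participating = identified - refused_any_visit - refused_time2
participation_rate = 100 * participating / identified

# Stores observed at both time points.
analytic_sample = participating - no_study_foods - closed_time2

stores_by_type = {
    "grocery": 29,
    "convenience/corner": 88,
    "liquor": 27,
    "other": 13,
}

print(participating)                 # 169
print(round(participation_rate, 1))  # 86.7
print(analytic_sample)               # 157
print(sum(stores_by_type.values()))  # 157
```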
Overall, inter-rater agreement, comparing the observers to the field coordinator, was satisfactory. The mean percent agreement was at least 87.5% for nearly all availability and price measures and none were below 75%. Perfect agreement was achieved for availability and price for 77% and 56% of the food products, respectively. The only measures for which the mean percent agreement was less than 87.5% were 100% whole wheat bread availability, frozen beet availability, frozen artichoke availability, and white pasta price. On average, the Spearman’s rank correlation coefficient for quality was 0.72.
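Percent agreement for a single measure can be sketched as the share of double-rated stores at which the two raters recorded the same value. How agreement was averaged across the 2 observers is not spelled out in the text, so the function below is a generic illustration with invented data rather than the study's exact computation.

```python
def percent_agreement(rater_a: list, rater_b: list) -> float:
    """Percent of paired observations on which two raters recorded the same value."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be paired and non-empty")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)


# Invented availability ratings (1 = present, 0 = absent) for 8 stores.
observer = [1, 1, 0, 0, 1, 0, 1, 1]
coordinator = [1, 1, 0, 0, 1, 0, 1, 0]
print(percent_agreement(observer, coordinator))  # 87.5
```

With only 8 double-rated stores, a single disagreement lowers agreement by 12.5 percentage points, which illustrates why estimates from such a small number of stores are coarse.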
Availability of the food products was generally consistent at the 2 time points (Table 2). In other words, a store that sold a food product (e.g., skim milk) at time 1 tended to also sell that food product 2 weeks later at time 2. Canned fruits and vegetables were commonly carried at stores, whereas fresh and frozen options were less commonly available (Table 2). For vegetables, only the availability of 1 frozen variety (corn) and 3 canned varieties (asparagus, green beans, jalapeños) differed significantly between the 2 time points. Availability was similar at the 2 time points for all varieties of fresh vegetables. For fruits, only the availability of 3 canned varieties (peaches, pears, pineapple) differed significantly between the 2 time points. All varieties of fresh and frozen fruits had similar availability at the 2 time points.
With few exceptions, grocery stores were significantly more likely to carry all the food products than convenience/corner stores and liquor/other food stores (results not shown). When comparing convenience/corner stores and liquor/other food stores, no consistent pattern of availability was evident (results not shown). Stores were generally more likely to carry less healthful grains, meats and beans, and dairy than more healthful choices (Table 2). For example, at time 1, 71.3, 23.1, and 82.8% of stores carried white bread, regular ground beef, and whole milk when compared with 17.8, 3.2, and 15.9% of stores that carried whole wheat bread, extra lean ground beef, and skim milk, respectively. With the exception of dried kidney beans, availability of all assessed grains, meats and beans, and dairy products was consistent on the 2 observations (Table 2).
Comparisons of mean food product prices and mean fresh fruit and vegetable quality at the 2 time points revealed no statistically significant differences (Table 3).
Despite burgeoning interest in contributions of the neighborhood retail food environment to dietary behaviors and health disparities, development of methodologies to assess the retail food environment is in its infancy. Although the two cannot be fully disentangled, our interest here was to evaluate “real change” in the availability, prices, and quality of food products over a short time period, rather than change due to unreliability (low “test-retest reliability”) in the measurement procedure or change in the measurement method.25 We found that the measures of food product availability and prices as well as fresh produce quality were generally consistent at 2 time points, 2 weeks apart.
Our study extends prior work using multiple observations of retail food characteristics. In a market basket survey in Los Angeles and Sacramento designed to account for seasonal variations, Jetter and Cassady conducted observations on food availability and prices at 25 grocery stores 3 times over 12 months, but did not report the temporal stability of these food characteristics.7 In contrast, our study specifically examined within-season temporal stability of retail food characteristics. In a study of 88 stores (24 grocery, 61 convenience) in metropolitan Atlanta, Glanz and colleagues conducted 2 observations of each store between 7 and 28 days apart with a mean of 9.1 (± 4.8) days separating their observations.5 They reported high agreement between the 2 time points in availability of fresh fruits (10 varieties), fresh vegetables (10 varieties), and “healthful” options in 8 other indicator food categories: milk, ground beef, hot dogs, frozen dinners, baked goods, beverages, bread, and snack chips. Our findings on food availability are generally consistent with theirs, providing additional evidence of short-term temporal stability of availability. Like Glanz and colleagues, who found high agreement in produce quality (assessed for each of the 10 varieties as “acceptable” or “unacceptable”) between times 1 and 2, we also found that mean quality scores were similar at the 2 time points. Our study extends their work through a larger sample of stores in communities with multiethnic populations (Black, Hispanic, and White), use of a consistent time frame of 2 weeks to evaluate temporal stability, and evaluation of temporal stability for food product prices as well as the availability of a larger number of food products including canned and frozen fruits and vegetables.
A major strength of this study is that it is among the first empirical examinations of short-term, within-season temporal stability of retail food characteristics. However, this study has several limitations. First, this study measured short-term, within-season temporal stability based on 2 observations, 2 weeks apart. Data collected at more than 2 time points or more frequent or extended sampling points (e.g., 6 weeks between observations) may have yielded different results. Second, although our sample of stores is large compared with many studies using detailed observations of retail food characteristics,2, 5, 7–9 the study was underpowered to detect differences between the 2 occasions for some measures. This was particularly true for prices of more healthful food products, due to low availability at stores in the sample that was further compounded for some products by the need to have a particular size. Rather than pricing the lowest cost brand of a particular size, unit pricing (pricing the brand and size that was the lowest price per unit) may have increased our sample sizes in this study. Third, whereas our measurement instrument included a fairly comprehensive list of fruits and vegetables, only selected products were assessed for other food types. Fourth, we did not assess prices for any canned or frozen fruits or vegetables. Fifth, our inter-rater agreement estimates were based on a small number of stores. While arguably even low inter-rater agreement would not have affected this study’s conclusions regarding temporal stability in food product availability, price, or quality (assuming individual observers rated consistently over time), more rigorous testing of inter-rater agreement through, for example, double-rating a larger number of stores would have been appropriate and is warranted in future studies.
Sixth, our quality assessment was limited to fresh produce. A potential advantage of the quality measure used in this study, and of Likert-type quality items and scales more generally, is that it may capture finer differences in quality when compared with dichotomous quality indicators. Nonetheless, it is more difficult to train observers and attain high inter-rater agreement using Likert-type scales. Furthermore, the quality measure was developed based on USDA standards for external condition and appearance, which may not correspond with how residents of the study area judge quality. Given that poor quality foods including fresh produce may be a significant problem in Black and low-income neighborhoods, development of reliable, valid food quality measures that are grounded in the community’s understanding is needed.
Our findings have implications for future research on neighborhood retail food characteristics and for consumers. Our finding of temporal stability in food availability, prices, and fresh produce quality at 2 time points, 2 weeks apart suggests that a single observation at stores may be sufficient in future studies to accurately represent these aspects of the food supply within season. Based on the 2 observations, the few observed differences in availability, though not beyond what would be expected by chance alone, were found mostly for canned fruits and vegetables. Interestingly, many of the varieties whose availability varied on the 2 occasions (canned green beans, peaches, pears, pineapples) are among the most commonly carried at stores in the study area and most commonly consumed in the U.S. The observed differences in availability may arise because stores stock only small amounts of canned goods, including these frequently purchased foods.
The overall finding of short-term temporal stability in food availability, food prices, and fresh produce quality also has implications for consumers and community change efforts. The results suggest if a store carries a food product, consumers can rely on the store for those items, at least over the short-term. However, it also means that stores in our study area have, for example, consistently low availability of fruits and vegetables as well as more healthful choices of grains, meats and beans, and dairy. This highlights the need to increase healthful food options in these Community Areas.
In informal conversations, some owners of small food stores in the study area cited beliefs that Blacks do not like certain types of foods as reasons for not carrying some items. Owners of small food stores also expressed that efforts to sell fresh fruits and vegetables have been unsuccessful: fresh produce is not purchased, resulting in spoilage and ultimately loss of revenue. This may have contributed to our finding of relatively low quality produce (scores of 1.70 and 1.89 on a 4-point scale, with 4 being the highest quality) at stores in the study area. For efforts to increase the sale of fresh produce at small stores to be successful, it is important to maintain a high-quality supply. This requires a combination of educating store owners about community interest in healthful foods including fresh produce, marketing to advertise fresh produce availability, and intervening with consumers to increase demand (possibly by changing preferences), leading to increased sales and product turnover.
In conclusion, this study suggests that a single observation may be sufficient to accurately characterize within-season food availability, food prices, and fresh produce quality. More research is needed to evaluate short-term stability for longer time intervals, as well as seasonal variations in retail food characteristics.
We thank Elizabeth Baker and the Community Health Councils, Inc. for sharing their food store measurement instruments with us. The market assessment instrument by Community Health Councils, Inc. was developed in partnership with REACH 2010 African Americans Building a Legacy of Health collaborative. We thank Jung Kim, Phillip Kramer, and Jonathan Fowler for research assistance. This study was conducted in connection with the Illinois Prevention Research Center at the University of Illinois at Chicago, which is a member of the Prevention Research Centers Program and supported by the Centers for Disease Control and Prevention cooperative agreement number 1-U48-DP-000048. Post-doctoral training funding to the first author through the National Cancer Institute’s (NCI) Cancer Education and Career Development Program (5 R25T CA57699) supported the data collection.
Shannon N. Zenk, University of Illinois at Chicago, College of Nursing, Department of Health Systems Science.
Diana S. Grigsby-Toussaint, University of Illinois Urbana-Champaign, Department of Kinesiology and Community Health.
Susan J. Curry, University of Iowa, College of Public Health.
Michael Berbaum, University of Illinois at Chicago, Institute for Health Research and Policy.
Linda Schneider, University of Illinois at Chicago, Institute for Health Research and Policy.