Many cancers have long latency periods, and dietary factors in adolescence may plausibly affect cancer occurrence in adulthood. Because of a lack of prospective data, retrospective collection of data on adolescent diet is essential. The authors evaluated a 124-item high school food frequency questionnaire (HS-FFQ) assessing diet during high school (15–35 years in the past) that was completed in 1998 by 45,947 US women in the Nurses' Health Study II (NHSII) cohort. To assess reproducibility, the authors readministered the HS-FFQ approximately 4 years later to 333 of these women. The mean Pearson correlation for 38 nutrient intakes was 0.65 (range, 0.50–0.77), and the mean Spearman rank correlation for food intakes was 0.60 (range, 0.37–0.77). Current adult diet was only weakly correlated with recalled adolescent diet (for nutrient intakes, mean r = 0.20). For assessment of validity, 272 mothers of the NHSII participants were asked to report information on their daughters' adolescent diets using the HS-FFQ. In this comparison, the mean Pearson correlation was 0.40 (range, 0.13–0.59) for nutrients, and the mean Spearman rank correlation for foods was 0.30 (range, 0.10–0.61). While further studies are warranted, these findings imply that this food frequency questionnaire provides a reasonable record of adolescent diet.
Many common cancers have long latency periods that may span several decades between the onset of the carcinogenic process and clinical detection (1). Dietary factors in adolescence may plausibly affect cancer occurrence in adulthood by enhancing or deterring carcinogenic processes (2). Adolescence is characterized by hormonal changes and rapid proliferation of incompletely differentiated tissues in several organs. Thus, adolescence may be a more etiologically relevant period than adulthood for the study of potential causal and preventive determinants of some cancers (3-5). A better understanding of which dietary factors are important in the etiology of cancer and the period of life in which they act is critical.
Although prospective studies assessing food intake among children and adolescents have begun, most will require many more decades of follow-up to reach clinical endpoints (1). A more timely though potentially less ideal way of assessing the relation between adolescent diet and cancer is collection of data retrospectively from adults. If these dietary data are collected before disease occurrence, recall bias is avoided.
A crucial component in the conduct and interpretation of studies using retrospective dietary assessment is evaluation of the questionnaire instrument. Recall of adolescent diet as an adult will be prone to measurement error, because it relies primarily on memory of diet in the distant past. Several studies have reported reasonable validity and reproducibility of data on recalled diet up to 10 years in the past (6-9). However, greater uncertainty exists for recall exceeding 10 years (10, 11).
We evaluated a food frequency questionnaire that asked women in the Nurses' Health Study II (NHSII) cohort about foods they had eaten in high school, between the ages of 13 and 18 years. In this paper, we report on 1) the reproducibility of this questionnaire, using recalled information on adolescent diet provided by participants at two different time points; 2) a maternal comparison in which these recalled data were compared with information on high school diet provided by mothers of NHSII participants; and 3) the influence of current adult diet on recall of adolescent diet by NHSII participants.
The NHSII is an ongoing prospective study of a cohort of 116,671 US female registered nurses. When the study was initiated in 1989, participants were between the ages of 25 and 42 years. Every 2 years, participants have been sent a follow-up questionnaire asking about the use of hormones, lifestyle practices, and diagnoses of chronic disease. Every 4 years, participants also receive a semiquantitative food frequency questionnaire with which to report their current diet. The study has maintained a response rate of 90 percent or greater (12).
The high school food frequency questionnaire (HS-FFQ), a supplementary questionnaire administered in 1998, was completed by 45,947 NHSII women. For assessment of reproducibility, 400 women were randomly selected from these initial participants to complete a second HS-FFQ in 2002. To minimize recall bias due to existing disease, participants who had cancer, heart disease, or asthma were excluded from the second sample. This second HS-FFQ was completed by 347 (87 percent) of 400 women. Fourteen women were subsequently excluded on the basis of established dietary criteria (caloric intake <600 kcal/day or >5,000 kcal/day, more than 70 food items left blank, or more than one food section left blank, other than dairy or meat sections), leaving a total of 333 women for the reproducibility analysis.
Maternal reports of NHSII adolescent diet were obtained from participants in the Nurses' Mothers Cohort Study. This study was begun in 2001 to investigate the effects of perinatal and early-life exposures on adult disease, and it includes 35,830 mothers of NHSII participants. To select participants for the comparison of recalled high school diets between the NHSII women and their mothers, we randomly selected 400 NHSII participants who completed the initial HS-FFQ and whose mothers were respondents in the Nurses' Mothers Cohort Study. Those NHSII women who had cancer, heart disease, or asthma were excluded. We also excluded participants who were selected for the reproducibility substudy in order to reduce respondent burden. In addition, to obtain the best possible independent comparison of responses to the HS-FFQ, we included only mothers who were early respondents in the Nurses' Mothers Cohort Study and who said they had completed that questionnaire without the help of their daughters.
Among the 400 selected NHSII participants, 358 (90 percent) gave permission and provided current address information with which to contact their mothers. These mothers were then sent an HS-FFQ with instructions not to discuss their responses with their daughters before returning it. Of the 358 contactable mothers, 302 (84 percent) completed the questionnaire. Six mothers were excluded from the analysis on the basis of the established dietary exclusion criteria (described above). Another 24 mothers were excluded because they skipped two or more consecutive questionnaire pages. Thus, a total of 272 mothers were analyzed.
This study was approved by the Partners Institutional Review Board at Brigham and Women's Hospital (Boston, Massachusetts).
The HS-FFQ is a 124-item, self-administered food frequency questionnaire (available online) (13). Questions posed to NHSII participants included how often, on average, they had consumed a specified food, beverage, or vitamin (described hereafter as “foods”) when they were between the ages of 13 and 18 years, or approximately high school age. This food frequency questionnaire was modeled on other validated questionnaires administered in the Nurses' Health Study and NHSII cohorts (11, 14, 15). Foods included were those commonly consumed by American adults during the years when the participants were in high school (1960–1982), as assessed in earlier investigations (15). Foods of interest to cancer researchers, such as major contributors of fat, fiber, and antioxidant vitamins, were included. We took secular changes in food formulation into account by using an NHSII participant's year of birth to assign different nutrient profiles for specific foods. Serving sizes were listed in natural units whenever possible (e.g., one apple, one glass of milk, or one slice of bread) and otherwise were based on the most common portion size reported in the US Department of Agriculture's Nationwide Food Consumption Survey (1977–1978) (16). The response choices for food items consisted of nine possible frequencies, ranging from “almost never” to “six or more times per day.” Questions about the use of multivitamin supplements and vitamin C supplements had five possible response choices, ranging from a frequency of zero to 10 or more per week.
Nutrient intakes for each individual were calculated by multiplying the nutrient content of each food and supplement by the frequency of consumption relative to once per day and then summing the contribution from all foods and supplements (described hereafter as “nutrients” for convenience, recognizing that constituents such as caffeine are not nutritive components). The database for the nutrient analysis was constructed primarily from information provided by US Department of Agriculture handbooks and bulletins for foods consumed during the period when NHSII participants were in high school (17-19).
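The calculation described above can be illustrated with a minimal sketch. The frequency categories, servings-per-day weights, and nutrient values below are hypothetical placeholders chosen for illustration; they are not taken from the HS-FFQ or its nutrient database.

```python
# Hypothetical frequency weights (servings per day) for a few response
# categories. The HS-FFQ itself has nine categories; these are illustrative.
SERVINGS_PER_DAY = {
    "almost never": 0.0,
    "1 per week": 1 / 7,
    "1 per day": 1.0,
    "2-3 per day": 2.5,
}

# Hypothetical vitamin C content per serving (mg); not study data.
VITAMIN_C_MG_PER_SERVING = {
    "orange juice": 60.0,
    "apple": 8.0,
    "milk": 2.0,
}


def daily_intake(responses, content_per_serving, weights=SERVINGS_PER_DAY):
    """Sum (nutrient per serving) x (servings per day) over all reported foods."""
    return sum(
        content_per_serving[food] * weights[category]
        for food, category in responses.items()
    )


# One participant's (made-up) answers: food -> chosen frequency category.
responses = {
    "orange juice": "1 per day",
    "apple": "1 per week",
    "milk": "2-3 per day",
}
total_vitamin_c = daily_intake(responses, VITAMIN_C_MG_PER_SERVING)
```

Expressing every frequency relative to once per day keeps all foods on a common scale, so contributions can simply be summed.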
We adjusted nutrient data for energy intake using the residual method described by Willett and Stampfer (20), to account for variation in nutrient intakes due to total energy intake. We calculated mean values and standard deviations to characterize intakes and between-person variation in nutrient and food intakes. We transformed nutrient data by natural logarithm to improve their normality for the correlation analyses (11).
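As a rough illustration of the residual method, the sketch below regresses a nutrient on total energy, keeps the residuals, and adds back the value predicted at mean energy intake. The data are made up, the function name is hypothetical, and applying the log transformation to both variables before the regression is an assumption of this sketch; the text does not specify the order of the two steps.

```python
import math


def energy_adjust(nutrient, energy):
    """Residual method: regress nutrient on energy, keep the residuals, and
    add back the nutrient value predicted at mean energy intake."""
    n = len(nutrient)
    mean_x = sum(energy) / n
    mean_y = sum(nutrient) / n
    sxx = sum((x - mean_x) ** 2 for x in energy)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(energy, nutrient))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    at_mean_energy = intercept + slope * mean_x
    return [
        y - (intercept + slope * x) + at_mean_energy
        for x, y in zip(energy, nutrient)
    ]


# Made-up intakes for four people (kcal/day and g/day of fat); not study data.
energy_kcal = [1800.0, 2200.0, 2600.0, 3000.0]
fat_g = [60.0, 80.0, 85.0, 120.0]

# Assumption of this sketch: both variables are log-transformed first.
log_energy = [math.log(v) for v in energy_kcal]
log_fat = [math.log(v) for v in fat_g]
adjusted_log_fat = energy_adjust(log_fat, log_energy)
```

By construction the adjusted values are uncorrelated with energy intake, so remaining variation reflects diet composition rather than how much a person eats overall.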
To examine reproducibility, we calculated Pearson correlations for nutrients and Spearman rank correlations for foods from the two HS-FFQs completed by the NHSII women.
In addition, we evaluated the potential for confounding of reported high school diet by current diet by calculating Pearson correlations between NHSII participants' nutrient intakes in the first HS-FFQ and their current nutrient intakes in 1995 (the last adult diet measurement prior to the 1998 HS-FFQ).
We also assessed the influence of misreporting of dietary intake on the reproducibility correlations. To identify underreporting, we used the Goldberg cutoff for the ratio of energy intake to basal metabolic rate or physical activity level (21, 22). The calculation of this cutoff has been reviewed by Black (21) and Goldberg et al. (22). We chose a physical activity level of 1.73 based on doubly labeled water energy expenditure data for adolescent girls and then calculated the lower confidence limit (cutoff) for it using values cited by Black (21) and Goldberg et al. (22). To further assess misreporting, we used the sex- and age-specific equations developed by the World Health Organization to calculate the ratio of reported intake to the predicted energy expenditure for NHSII participants when they were adolescents (23). To do this, we calculated the basal metabolic rate for each participant on the basis of her self-reported weight at age 18 years. This value was then multiplied by a physical activity level of 1.5 on the basis of data from the World Health Organization, assuming 2.5 hours of daily moderate physical activity (23). We next calculated the ratio of reported energy intake (using reported calories from the first administration of the HS-FFQ) to this predicted energy expenditure for each individual. Using these ratios, we classified women as “underreporters” (ratio values in the lowest 20 percent of the distribution) or “high reporters” (ratio values in the top 20 percent of the distribution) and the remaining women as “acceptable reporters” for total energy intake and compared the reproducibility correlations between these three groups. Since physical activity level was the same for all participants, the percentage cutpoints for this second method depended on the value for the ratio of energy intake to the basal metabolic rate.
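The second misreporting classification can be sketched as follows. This is a minimal illustration with made-up intakes; the basal metabolic rate is taken as a given input rather than computed from the World Health Organization equations, and the function name and tie handling are assumptions of the sketch.

```python
def classify_reporters(energy_intake, bmr, pal=1.5, tail=0.20):
    """Label each person by the ratio of reported energy intake to predicted
    energy expenditure (BMR x physical activity level)."""
    ratios = [ei / (b * pal) for ei, b in zip(energy_intake, bmr)]
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    k = int(len(ratios) * tail)  # number of people in each 20 percent tail
    labels = ["acceptable reporter"] * len(ratios)
    for i in order[:k]:
        labels[i] = "underreporter"
    for i in order[len(order) - k:]:
        labels[i] = "high reporter"
    return ratios, labels


# Made-up reported intakes (kcal/day) and a constant hypothetical BMR.
reported_kcal = [1500, 1800, 2000, 2100, 2200, 2300, 2400, 2600, 3000, 3500]
bmr_kcal = [1400] * len(reported_kcal)
ratios, labels = classify_reporters(reported_kcal, bmr_kcal)
```

Because the physical activity level is a single constant, ranking by this ratio is equivalent to ranking by energy intake divided by basal metabolic rate, which is why the cutpoints depend only on that ratio.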
To evaluate the comparability of adolescent diets reported by NHSII participants and their mothers (maternal comparison), we calculated Pearson correlations for nutrients and Spearman rank correlations for foods. We also used Pearson correlations to assess associations between NHSII participants' current diets and their mothers' recall of their adolescent diets.
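For readers less familiar with the two statistics, the following sketch computes a Pearson correlation and a Spearman rank correlation (Pearson applied to average ranks) using only the standard library. The paired values are invented for illustration and are not study data.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented servings per day reported by daughters and by their mothers.
daughter_report = [0.1, 0.5, 1.0, 1.0, 2.5]
mother_report = [0.0, 0.5, 0.5, 1.0, 1.0]
r_foods = spearman(daughter_report, mother_report)
```

Rank correlation suits food items because frequency responses are ordinal and often heavily skewed, whereas log-transformed nutrient intakes are closer to normal and suit the Pearson coefficient.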
The mean age of the NHSII participants at the first administration of the HS-FFQ was 43.8 years (range, 33.6–53.3); thus, on average, participants were recalling diet from more than 25 years in the past. The mean age of the subsample at the administration of the second HS-FFQ approximately 4 years later was 48.9 years (range, 38.9–56.4). The women in both the first administration of the HS-FFQ and the second administration were similar with regard to several demographic variables (table 1) and were also similar to the entire NHSII cohort, from which they were originally sampled.
The nutrient correlations between the first and second NHSII participant recalls were moderate to good, with an average correlation of 0.65 and a range of 0.50–0.77 (table 2). Highly reproducible nutrient values included total vitamin C (r = 0.77), total vitamin B2 (r = 0.76), and caffeine (r = 0.74). The nutrients measured with the least precision were alcohol (r = 0.50) and vitamin B12, both total (r = 0.52) and without supplements (r = 0.51).
The correlations between nutrient intakes calculated from the 1995 current diet and those calculated from the first recall of high school diet were low, with an average correlation of 0.20 and a range of −0.11 to 0.43 (table 2). Moreover, the correlations remained low when we used current diet as reported in 1999, 1 year after the first high school recall was administered (mean r = 0.20; range, 0.01–0.44).
In our analysis of misreporting, we found no appreciable underreporting on either the group level or the individual level using the Goldberg cutoff (21, 22). Our calculated lower limit (the Goldberg cutoff for underreporting) for the ratio of reported energy intake to predicted basal metabolic rate was 1.70 for the overall group. Our study's group mean of 1.9 was higher than this cutoff, suggesting that the reported energy intakes of NHSII participants were reasonable in relation to underreporting. On the individual level, our calculated lower limit ratio value was 1.26, and only 9.0 percent of NHSII participants were below this lower limit. After exclusion of this 9.0 percent, our findings did not change appreciably (r = 0.64) for nutrients. When we used another method based on values cited by the World Health Organization (23), the average nutrient correlations were similar for “underreporters” (r = 0.63; 20 percent prevalence), “high reporters” (r = 0.66; 20 percent prevalence), and “acceptable reporters” (r = 0.64; 60 percent prevalence).
To examine reproducibility further, we jointly classified nutrient intakes from the two administrations of the HS-FFQ into quintiles and calculated the percentage of responses plus or minus one quintile. Eighty percent of the nutrient values from the second administration of the HS-FFQ were within one quintile of values from the first administration.
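The joint classification can be sketched as below. Quintiles are assigned by rank position, with ties broken by input order; that tie rule, the function names, and the example values are assumptions of this sketch rather than details given in the text.

```python
def quintiles(values):
    """Assign each value a quintile from 1 to 5 based on its rank position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, i in enumerate(order):
        result[i] = position * 5 // len(values) + 1
    return result


def share_within_one_quintile(first, second):
    """Fraction of people whose two quintile assignments differ by at most one."""
    q_first = quintiles(first)
    q_second = quintiles(second)
    close = sum(1 for a, b in zip(q_first, q_second) if abs(a - b) <= 1)
    return close / len(q_first)


# Invented intakes of one nutrient from two administrations, ten people.
first_admin = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
second_admin = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]
agreement = share_within_one_quintile(first_admin, second_admin)
```

This measure complements the correlation coefficient: it shows how often a participant would be placed in the same or an adjacent exposure category, which is how intake is typically used in disease analyses.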
The correlations for foods were slightly lower than those for nutrients, with an average of 0.60 and a range of 0.37–0.77. Foods with highly reproducible values included iced tea (r = 0.77), diet soda with caffeine (r = 0.76), and milk (r = 0.76). Foods with the lowest reproducibility were diet soda without caffeine (r = 0.37), onion eaten as a vegetable (r = 0.42), and raw spinach (r = 0.42). Individual correlations for all foods are available online (24). When intakes were grouped into food categories, the mean correlations between the first and second administrations were good: for dairy foods, r = 0.64; for (nondairy) beverages, r = 0.70; for main dishes, r = 0.57; for bread/cereals/grains, r = 0.48; for fruit, r = 0.67; and for vegetables, r = 0.64. Red meat consumed within main dishes had a mean correlation of 0.52.
The mean age of the mothers who responded was 73 years (range, 58–89 years). The NHSII participants represented by the mothers were similar in terms of several demographic variables to the 45,947 respondents in the first high school diet recall (table 1) and also similar to the entire NHSII cohort.
The nutrient correlations between the NHSII participants' recalls and their mothers' recalls were moderate, with a mean of 0.40 and a range of 0.13–0.59 (table 3). Nutrients with the highest correlations were animal fat and vegetable fat: Both had a correlation of 0.59. Nutrients with the lowest correlations were total calories (r = 0.13), retinol (r = 0.30), and monounsaturated fat (r = 0.30). NHSII participants' current nutrient intakes, as assessed in 1995, were only weakly correlated with their mothers' recall of their high school diets (mean nutrient correlation: r = 0.13).
Overall, the correlations comparing mothers' reports with their daughters' reports were lower for foods than for nutrients, with a mean of 0.30 and a range of 0.10–0.61 for foods. The foods with the highest correlations were iced tea (r = 0.61) and orange juice (r = 0.52). Those with the lowest correlations were brownies (r = 0.10) and soda without caffeine (r = 0.10). Individual correlations for all foods are available online (24).
In this study, we evaluated the reproducibility of a food frequency questionnaire that asked adult participants, at an interval of 4 years, about their diet in high school, 15–35 years in the past. We also compared participants' recalls with information on high school diet provided by their mothers. The mothers' reports were intended as independent estimates of their daughters' high school diets and thus a measure, though not an ideal one, of validity.
Our results indicate moderate-to-good reproducibility for foods and nutrients and appear to be consistent with the handful of studies to date that have examined remotely recalled adolescent diet. Previously, we examined the reproducibility of a shorter 24-item adolescent diet questionnaire administered twice at an interval of 2 years to participants in the Nurses' Health Study, a cohort whose members are similar to but older than the women in the NHSII (12, 14). We reported average correlations of 0.57 for 24 foods (range: from 0.38 for beef to 0.73 for orange juice) and 0.48 for nutrients (range: from 0.34 for vitamin E to 0.68 for cholesterol). Wolk et al. (25) examined the short-term reliability (9–12 months) of adolescent diet recalled over 20 years later by healthy controls in a Swedish case-control study; they reported a correlation of 0.46 for both foods and nutrients for a 45-item food frequency questionnaire.
The influence of current diet is an important possible source of bias for the assessment of remote diet. For instance, our reproducibility results could potentially be overestimated if NHSII participants simply reported their current diet at both administrations of the questionnaire. However, the low correlations between current diet and recalled diet (for nutrients, r = 0.20) suggest that our reproducibility results were not substantially inflated by current adult diet. The timing of assessment of current diet, whether before or after the administration of the HS-FFQ, also did not influence the results.
Other investigators have reported a larger correlation between current diet and remotely recalled diet (26-29). For instance, Bakkum et al. (27) reported a 0.72 food correlation for men and a 0.64 correlation for elderly men and women. Wu et al. (26) reported food correlations of 0.54 for men and 0.56 for women. One explanation for this difference is that in some studies, participants' current diets were assessed at the same time as their recalled diets, which could have influenced recall and artificially inflated their results due to correlated error (26, 27, 29). Alternatively, these reports may truly reflect stability of adult diet over time.
Our low correlations between current diet and recalled diet, together with the stronger correlations between two recalls of high school diet, suggest that participants may have eaten differently during high school (14). For instance, comparing recalled high school diet with current adult diet, the greatest decrease in nutrients was for fats; there was a 60 percent decline in total saturated fat intake, which is consistent with national trends. Reported calories from total fat also decreased from 40 percent to 29 percent, and calories from carbohydrates and protein increased. Because of presumed diet stability, some authors have suggested that current diet be used as a surrogate measure of past diet (7, 29). However, our results imply that the best measure of past adolescent diet (in the absence of original data) is recalled, not current, diet, a conclusion consistent with previous investigations (8, 14, 30).
Dietary data may be prone to systematic underreporting of food and nutrient intakes and, to a lesser extent, systematic overreporting (21, 31). We did not find evidence for underreporting using the Goldberg cutoff for the ratio of energy intake to basal metabolic rate. Furthermore, our analysis did not indicate appreciable differences in the correlations for subjects classified as “underreporters,” “high reporters,” and “acceptable reporters” with the use of 20 percent cutoffs for energy intake.
Correlated error is probably present between the two NHSII HS-FFQ reports, which will tend to produce overestimation of the reproducibility correlations. This underscores the importance of having an independent estimate of intake, which was our intention in comparing mothers' reports with their daughters' reports in this study.
The correlations for the maternal comparison were modest for foods and moderate for nutrients. Other studies reported similar or weaker results. Wolk et al. (25) compared adolescent diet recalled by study participants with the adolescent diet remotely recalled by their adult siblings as a proxy external comparison. The average correlation was 0.30 for foods and nutrients. Several other studies have examined the validity of distant diet (>10 years), comparing diet that was recalled with diet recorded at the time of interest (often called original diet); these studies have been reviewed elsewhere (10, 14). Dwyer et al. (32) examined the validity of adolescent diet using diet histories recorded in childhood and found a low median nutrient correlation of 0.12 for recalled foods eaten at age 18 years. This low correlation could be due to the rather crude original assessment of diet. Other studies addressing the validity of diet during adulthood (recalled 11–24 years in the past) have reported average correlations that were moderate for food intakes (range of average correlations, 0.29–0.40) and higher for nutrient intakes (range of average correlations, 0.23–0.59) (26-29, 33, 34). Although we did not have data on original diet, our correlations appear to be consistent with these reports.
Correlated error between NHSII reports and mothers' reports could have led to overestimation of validity if, for example, the mothers discussed their responses with their daughters before returning the questionnaire. We took precautions to minimize this possibility (as detailed in the Materials and Methods section). Although we cannot completely exclude this bias, we believe it is unlikely that a large portion of mothers ignored our instructions and discussed their responses with their daughters. In addition, the daughters had completed the questionnaires more than 4 years earlier, and it is unlikely that they remembered specific responses.
An important limitation of this study is that we did not have actual diet information obtained from participants when they were in high school. The mothers' reports provided some measure of validity, although not a perfect one (35). The fact that mothers may not have been aware of all their nurse-daughters' food habits outside the home would have resulted in error in reporting. For instance, the mothers tended to underreport caffeine and fat more than fruit- and vegetable-related nutrients in comparison to the NHSII participants' reports, which supports the idea that they did not know all that their daughters were eating. In addition, mothers' fading memories may have also contributed to error in reporting of diet, which would have attenuated correlations. In making this comparison between mothers' and daughters' reports, we recognize that there are virtually no true measures of absolute intake for adolescent diet decades in the past, only imperfect standards. This underscores the methodological challenges of evaluating retrospective recall of diet in the distant past. In the absence of actual diet information from the NHSII participants, a more rigorous validation study would be desirable—for example, administration of the same questionnaire to a group of participants for whom diet was recorded when they were in high school.
Lastly, the NHSII participants represented in the reproducibility and maternal comparison components of this study consisted largely of Caucasian women. Thus, these findings are not necessarily generalizable to men or to other women with different ethnic backgrounds, age, or education. However, our study subsample was representative of the full NHSII cohort with respect to age, body mass index (weight (kg)/height (m)²), smoking status, and reproductive variables.
This study also had several advantages over other investigations of this topic. First, it is one of the few that has examined diet during high school. Adolescence may be a particularly important time for the study of chronic diseases, and this remains a relatively unexplored area of investigation. The period of time between repeated questionnaire administrations was long enough (4 years) that it is unlikely that participants would have remembered their initial responses and been influenced by them in the second administration. The potential for recall bias in estimating food and nutrient intakes underscores the importance of prospective studies, such as the present study, in which data are collected before disease occurs. Lastly, as an implication of our results, the low correlations between recalled high school diet and current diet suggest that our information on high school diet was almost independent of data on adult diet; thus, this high school dietary information has the potential to add new insights on disease etiology. While further studies are warranted, our findings suggest that this food frequency questionnaire, completed in adulthood, provides a reasonable record of diet during adolescence for use in assessing associations with adult disease.
This research was funded by the National Institutes of Health (grant CA050385), the National Cancer Institute (contract N01-RC-17027 and purchase order 263-MQ-219520), and the Breast Cancer Research Foundation. Sonia Maruti was supported by a US Army Training Grant (DAMD17-00-1-0165), and Dr. Graham Colditz was supported by the American Cancer Society.
The authors thank Mike Atkinson, Gary Chase, and Karen Corsano for their technical and logistical assistance.