Evidence of the benefits of soy on cancer risk in Western populations is inconsistent, in part because of the low intake of soy in these groups.
We assessed the validity of soy protein estimates from food-frequency questionnaires (FFQs) in a sample of Adventist Health Study-2 participants with a wide range of soy intakes.
We obtained dietary intake data from 100 men and women (43 blacks and 57 nonblacks). Soy protein estimates from FFQs were compared against repeated 24-h recalls and urinary excretion of daidzein, genistein, total isoflavonoids (TIFLs), and equol (measured by HPLC/photodiode array/mass spectrometry) as reference criteria. We calculated Pearson and Spearman correlation coefficients (with 95% CIs) for FFQ–24-h recall, 24 h-recall–urinary excretion, and FFQ–urinary excretion pairs.
Among soy users, mean (± SD) soy protein values were 12.12 ± 10.80 g/d from 24-h recalls and 9.43 ± 7.83 g/d from FFQs. The unattenuated correlation (95% CI) between soy protein estimates from 24-h recalls and FFQs was 0.57 (0.32, 0.75). Correlation coefficients between soy protein intake from 24-h recalls and urinary isoflavonoids were 0.72 (0.43, 0.96) for daidzein, 0.67 (0.43, 0.91) for genistein, and 0.72 (0.47, 0.98) for TIFLs. Between FFQs and urinary excretion, these were 0.50 (0.32, 0.65), 0.48 (0.29, 0.61), and 0.50 (0.32, 0.64) for daidzein, genistein, and TIFLs, respectively.
Soy protein estimates from questionnaire were significantly correlated with soy protein from 24-h recalls and urinary excretion of daidzein, genistein, and TIFLs. The Adventist Health Study-2 FFQ is a valid instrument for assessing soy protein in a population with a wide range of soy intakes.
Published reports from clinical and experimental studies in both humans (1) and animals (2, 3) suggest a protective effect of soy intake against cancer, although a few studies report no (4) or an opposite effect (5). A number of cohort and case-control studies have demonstrated a reduction of cancer risk associated with soy intake among Asian populations (6-9); however, scientific evidence of its benefits in the US and other Western populations (10-12) is limited and inconsistent, in part owing to the relatively low intake of soy in these groups (13).
The Adventist Health Study-2 (AHS-2) is a prospective cohort study designed primarily to investigate the relation between diet and cancer outcomes. Unlike subjects in other Western populations, many AHS-2 participants have high intake of soy as a source of protein. Many are life-long soy consumers, but approximately 30% do not eat soy products at all (K Jaceldo-Siegl, unpublished data, 2006), which results in a wide range of dietary exposure to soy in this population. In addition, because the church also prohibits the use of tobacco, alcohol, and pork, these potential confounders are nearly eliminated.
The postulated cancer-preventive properties of soy foods may be linked to the high concentration of isoflavones associated with soy protein (14-16). The 3 most important isoflavones found in soybeans are daidzein, genistein, and glycitein. Daidzein can be metabolized further by colonic bacteria into equol, O-desmethylangolensin, and other products (17). A full assessment of dietary intake of these compounds is often limited by incomplete or uncertain quantitative data on the isoflavone content of soy foods. Fortunately, isoflavonoids (IFLs) and their metabolites also occur in relatively high concentrations in plasma and urine and, to a lesser extent, in feces and therefore can easily be measured. IFLs in urine and plasma, in particular, have been used in a number of studies as a biomarker of soy food, soy protein, or isoflavone intake in both Asian and Western populations (18-23).
Dietary assessment in large populations can be costly and complicated but is necessary in dietary epidemiology. The most practical dietary assessment instrument for use in prospective cohort studies is the food-frequency questionnaire (FFQ); therefore, it is important to evaluate the extent to which a questionnaire can measure true intake (24). Because measurement errors associated with questionnaire assessment attenuate relative risk estimates in studies of diet and disease risk, such studies should incorporate a validation of the questionnaire in which both the questionnaire and a more accurate reference method are obtained and compared (25, 26). In this report, we investigate how well our estimates of soy protein intake from FFQs correlate with soy protein intake from repeated 24-h recalls and with urinary excretion of specific isoflavonoids as reference measures, and we evaluate whether soy protein is a valid index of soy isoflavone intake in a sample population with a wide range of soy intake.
Subjects included in this report are part of an ongoing calibration study for the Adventist Health Study-2 at Loma Linda University. This cohort comprises 97 000 subjects located throughout the United States and Canada. The profile and recruitment methods of the cohort have been described elsewhere (27). The goal of the calibration study is to enroll 500 blacks (subjects of African descent) and 500 nonblacks randomly chosen from participating AHS-2 members around the country. At the time the analysis for this report was performed, ≈70% of the entire cohort and 50% of the calibration study sample were enrolled. For the purpose of this report, we describe briefly the procedures for enrollment to this calibration study. First, a list of potential subjects from the parent study is generated by using a two-stage random selection method, which involves the size of the church and then subjects within the church. Research assistants then telephone potential subjects to provide an overview of the calibration study requirements and to inquire of their availability for a scheduled clinic at their local church. Those who are unavailable to attend a clinic or have a current diagnosis of active cancer, Alzheimer’s disease, or pregnancy are excluded. Unqualified subjects and those who refuse participation (about one-third) at the initial invitation are replaced with individuals randomly chosen from the same church and matched by race, age (within 5 y of the original subject’s age), and sex. There were no significant or practical differences in the distribution of sex, age, education, or vegetarian status between the calibration study and the AHS-2 cohort. The study was approved by the institutional review board of Loma Linda University, and all subjects gave written informed consent.
The duration of the calibration study was 9–12 mo. Calibration study participants were each sent a structured, self-administered FFQ, which each participant completed at home and then mailed back to AHS-2 study personnel. On receipt of the questionnaire, study personnel reviewed it for completeness and, as necessary, followed up by telephone to clarify any ambiguous or incomplete information. Each subject provided 6 repeated 24-h dietary recalls, which were obtained in 2 blocks of 3 recalls, with the blocks separated by ≈6 mo for a particular subject. Within 4 mo of enrollment into the calibration study, each subject attended a scheduled clinic (28) for the collection of biologic samples. A cash payment of $100 was given to each subject who completed all calibration study requirements.
The AHS-2 FFQ is a comprehensive 22-page instrument consisting of 204 foods, 54 questions about food preparation, and 46 fields for open-ended questions. Respondents are asked to report on their intake over the previous 1 y. Foods are grouped under the following categories: seasonal fresh fruit, canned or cooked fruit, dried fruit, fruit and vegetable juices, salads and raw vegetables, cooked vegetables, legumes, soups, breads, cooked cereals and grains, seeds and nuts, pasta, dressings and sauces, eggs, dairy products and oils, beef, chicken, lamb, or pork, fish, beverages, alcoholic beverages, sweets and desserts, snacks, seasonings and additives, vitamins and mineral supplements, soy supplement, cold breakfast cereals, meat substitutes, and soy-based drinks. The frequency section consists of 9 categories: never or rarely, 1–3 per month, 1 per week, 2–4 per week, 5–6 per week, 1 per day, 2–3 per day, 4–5 per day, and 6+ per day. Portion sizes include 3 levels: standard, ½ or less, and 1½ or more. Standard portion sizes are based on serving sizes with use of familiar units such as cup, tablespoon, slice, patty, link, and others. Completed questionnaires were optically scanned by using an NCS 5000i Image Scanner with ScanTools Plus software (Pearson NCS, Bloomington, MN). Standardized processing of open-ended questions was done with The Food Write-In Processing software (Adventist Health Study-2, Loma Linda, CA), a network-based application created in Microsoft Access. This software provides a TIFF image of any page of the FFQ and allows the operator to quickly interpret the hand-written response and select an appropriate code for each item.
The AHS-2 FFQ includes 51 items, mostly commercial products, to assess soy intake. At the time the questionnaire was developed, foods defined in the literature as traditional soy foods were not known to be consumed regularly in this population; thus, soybeans, tofu, and soybean curd were combined into one item in the questionnaire. Thirty-four commercial foods and 6 open-ended questions comprise the meat analogues section, 15 beverages and 2 open-ended questions comprise the soy milk section, and 1 question asks about soy isoflavone supplement intake. Meat analogues include foods such as Worthington FriChik, Loma Linda Fried Chicken, Worthington Chili, Vibrant Life Vegeburgers, Worthington Vegetable Skallops, Cedar Lake Deli Franks, Loma Linda Dinner Cuts, Morningstar Farms Chik Nuggets, and Morningstar Farms Burger Style Recipe Crumbles. Soy beverages include products such as Better Than Milk powder, Eden Soy, West Soy Plus, So Good, Trader Joe’s Soy-Um, and Vita Soy.
We obtained 3 variably timed, unannounced 24-h dietary recalls (one Saturday, one Sunday, and one weekday intake) during each block of recalls. Interviews were conducted by trained research nutritionists using Nutrition Data System for Research software version 5.03 (The Nutrition Coordinating Center, Minneapolis, MN), a time-related database that updates analytic data while maintaining nutrition profiles true to the version used for data collection (29). Recalls were obtained by telephone with standard probes and a multiple-pass approach methodology to collect detailed information on all foods, beverages, and supplements consumed by each subject during the previous 24 h (30). Recipes were created for foods not found in the database. Soy protein content of meat analogues not in the database was estimated on the basis of information obtained from published data (31, 32), the US Department of Agriculture database, manufacturers, or nutrient labels.
From the first 186 individuals who successfully completed the first block of 3 recalls, 100 were selected for urinary IFL determination. Selection was stratified by sex, race, age, geographical location, and estimated soy protein intake (high, medium, or low) from recalls. Data from these 100 subjects comprised the analytic population of this report. A week before the clinic, each subject was sent a urine container with instructions to collect overnight urine over the 12 h before their morning appointment. The urine sample was brought to the clinic and processed. Clinic staff recorded the volume. After careful agitation to thoroughly mix the settled contents, two 90-mL specimen cups were each filled with up to 70-mL aliquots, and the remaining urine was discarded. One cup was further processed by adding 1 mL hydrochloric acid to prevent oxidation of labile compounds. Both specimen cups were securely sealed and labeled with identifiers and treatment, secured together, and immediately placed in a refrigerated (4 °C) biohazard container. Within 4 h from the time clinic staff collected and processed the specimens, urine samples were transported in insulated double-lined containers with frozen gel freezer packs to the nearest airport and arrived at our central laboratory at Loma Linda University for processing within 24 h. The treated urine sample, which was used for IFL determination, was further processed by preparing four 5-mL aliquots, which were then stored in liquid nitrogen at −70 °C.
Isoflavonoid analyses in urine were performed according to previously established methods through use of HPLC with photodiode array and mass spectrometry detection for daidzein, genistein, and glycitein (33) with the inclusion of equol (34). Urinary creatinine levels were determined by a clinical autoanalyzer (Roche-Cobas MiraPlus; Roche Diagnostics, Chicago, IL) with use of a test kit based on the Jaffé reaction (Randox Laboratories, Crumlin, United Kingdom). Detection limits for all analytes were 10 nmol/L or 2–50 pg/mg creatinine. IFL levels were adjusted for creatinine concentrations, and IFL excretion rates were expressed as picomoles per milligram creatinine. All laboratory analyses were performed by the Cancer Research Center of Hawaii.
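The creatinine adjustment described above can be illustrated with a short unit-conversion sketch. The input units (analyte in nmol/L, creatinine in mg/dL) and the function name are assumptions made for illustration only; the laboratory's actual raw units and processing steps are not stated in this report.

```python
def creatinine_adjusted_excretion(analyte_nmol_per_l, creatinine_mg_per_dl):
    """Return analyte excretion in pmol per mg creatinine.

    Assumed input units (not specified in the report):
    analyte concentration in nmol/L, creatinine in mg/dL.
    """
    if creatinine_mg_per_dl <= 0:
        raise ValueError("creatinine concentration must be positive")
    analyte_pmol_per_l = analyte_nmol_per_l * 1000.0   # 1 nmol = 1000 pmol
    creatinine_mg_per_l = creatinine_mg_per_dl * 10.0  # 1 L = 10 dL
    return analyte_pmol_per_l / creatinine_mg_per_l


# Hypothetical sample: 500 nmol/L daidzein, 100 mg/dL creatinine
daidzein_pmol_per_mg = creatinine_adjusted_excretion(500.0, 100.0)
```

Dividing by creatinine in this way is intended to reduce the influence of urine dilution on the measured concentration.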
Each of the 24-h dietary recall days was weighted appropriately to produce a synthetic week (Saturday intake + Sunday intake + 5 × weekday intake) and then divided by 7 to obtain mean daily soy protein estimate (in g/d) for each individual. Soy protein estimates from FFQ data were calculated by the product-sum method; thus, total soy protein per subject = sum [(weighted frequency of use of a food) × (weighted portion size consumed of that food) × (g of soy protein in a standard serving size of that food)]. Arithmetic means ± SD or geometric means (95% CI) were calculated for levels of soy protein assessed by 24-h recalls (R), FFQ (Q), and urinary excretion of daidzein (MDE), genistein (MGE), and total IFLs (MTIFL). Mean differences in soy protein intake were compared between 24-h recalls and FFQs and between blacks and nonblacks with use of Student’s t test. The chi-square test was used to determine differences in the distribution of males and females and between blacks and nonblacks.
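The two intake calculations above can be sketched as follows. The synthetic-week formula is taken directly from the text. The frequency weights (servings per day, set here to category midpoints), the portion weights, and the food names and soy protein values are hypothetical placeholders, because the report does not give the weights actually applied to the FFQ categories.

```python
# Hypothetical frequency weights in servings per day (category midpoints).
FREQ_PER_DAY = {
    "never or rarely": 0.0,
    "1-3 per month": 2 / 30,
    "1 per week": 1 / 7,
    "2-4 per week": 3 / 7,
    "5-6 per week": 5.5 / 7,
    "1 per day": 1.0,
    "2-3 per day": 2.5,
    "4-5 per day": 4.5,
    "6+ per day": 6.0,
}

# Hypothetical portion weights relative to the standard serving.
PORTION_WEIGHT = {"half or less": 0.5, "standard": 1.0, "one and a half or more": 1.5}


def synthetic_week_mean(saturday, sunday, weekday):
    """Mean daily intake (g/d): (Saturday + Sunday + 5 x weekday) / 7."""
    return (saturday + sunday + 5 * weekday) / 7


def ffq_soy_protein(responses, grams_per_serving):
    """Product-sum estimate of soy protein (g/d) from FFQ responses.

    responses: {food: (frequency category, portion category)}
    grams_per_serving: {food: g soy protein in one standard serving}
    """
    total = 0.0
    for food, (freq, portion) in responses.items():
        total += FREQ_PER_DAY[freq] * PORTION_WEIGHT[portion] * grams_per_serving[food]
    return total


# Hypothetical subject: recall intakes in g/d and two FFQ items.
recall_mean = synthetic_week_mean(saturday=14.0, sunday=7.0, weekday=7.0)
ffq_total = ffq_soy_protein(
    {"soy milk": ("1 per day", "standard"),
     "tofu": ("1 per week", "one and a half or more")},
    {"soy milk": 7.0, "tofu": 10.0},
)
```

Weighting the single weekday recall by 5 keeps weekend days from being overrepresented relative to their share of a real week.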
Total protein was energy-adjusted by using the residual method (35). Daidzein and genistein excretion rates were log transformed before calculation of Pearson correlation coefficients and 95% CIs. Spearman rank correlations were performed for equol. Validity coefficients were calculated between 1) soy protein from 24-h recalls and soy protein from FFQs [corr(Q,R)]; 2) soy protein from 24-h recalls and excretion rates of each of daidzein, genistein, total IFLs, and equol [corr(R,M)]; 3) soy protein from FFQs and excretion rates of each of daidzein, genistein, total IFLs and equol [corr(Q,M)]; 4) energy-adjusted total protein from 24-h recalls and excretion rates of each of daidzein and genistein [corr(R,M)]; and 5) energy-adjusted total protein from FFQs and excretion rate of each of daidzein and genistein [corr(Q,M)]. We determined first the crude correlation for corr(Q,R) and then corrected the correlation coefficients for attenuation from within-person variations in the recalls (36). Attenuation-adjusted correlation coefficients were calculated between soy protein (from 24-h recalls and FFQs) and each of daidzein, genistein, total IFLs, and equol and also between total protein (from 24-h recalls and FFQs) and each of daidzein and genistein for all subjects and soy users. We calculated 95% CIs for all validity coefficients using bootstrap resampling and the BCa method (37). We carried out descriptive analyses using Statistical Analysis Software, release 9.1 (SAS Institute Inc, Cary, NC) and calculated correlation coefficients and 95% CIs using S-PLUS 6 (Insightful Corporation, Seattle, WA).
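The attenuation correction can be illustrated with one commonly used approach, in which the observed correlation is multiplied by sqrt(1 + λ/n), where λ is the ratio of within-person to between-person variance in the reference method and n is the number of replicate recalls per subject. This is a sketch under that assumption; it may differ in detail from the procedure in reference 36, and it does not reproduce the bootstrap BCa intervals. Function names are hypothetical.

```python
import math
from statistics import mean


def variance_components(replicates):
    """One-way ANOVA variance components for balanced replicate data.

    replicates: list of per-subject lists, each with the same number of recalls.
    Returns (within-person variance, between-person variance).
    """
    n = len(replicates[0])          # recalls per subject
    n_subjects = len(replicates)
    subject_means = [mean(r) for r in replicates]
    grand_mean = mean(subject_means)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(replicates, subject_means) for x in r
    ) / (n_subjects * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in subject_means) / (n_subjects - 1)
    s2_between = max((ms_between - ms_within) / n, 0.0)
    return ms_within, s2_between


def deattenuate(r_observed, s2_within, s2_between, n_replicates):
    """Correct an observed correlation for within-person variation."""
    if s2_between <= 0:
        raise ValueError("between-person variance must be positive")
    variance_ratio = s2_within / s2_between
    return r_observed * math.sqrt(1 + variance_ratio / n_replicates)


# Hypothetical data: 2 subjects with 2 recalls each (g/d).
s2_w, s2_b = variance_components([[1, 3], [5, 7]])
r_corrected = deattenuate(0.5, s2_within=3.0, s2_between=1.0, n_replicates=3)
```

Because the correction factor is always at least 1, a corrected coefficient can exceed 1 when within-person variation is large, so in practice such values are usually truncated at 1.00.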
Of the 100 calibration study subjects selected from the analytic population, 43 were blacks and 57 were nonblacks (Table 1). On average, nonblacks were 12 y older than blacks, and in both groups there were more females than males, both typical of the AHS-2 cohort. Differences in mean age and sex distribution by race were not significant.
Soy protein estimates from 24-h recalls and FFQ data are shown in Table 2. FFQ and recall results were not significantly different, either in all subjects or among soy users (n = 58 from 24-h recalls). Mean differences in soy protein between blacks and nonblacks were significant for 24-h recalls but not for FFQs, both in all subjects and among soy users. Differences in the geometric means of daidzein, genistein, and total IFLs by race were significant (P < 0.05; Table 3). The proportions of subjects with equol excretion rates of 0, 0.04-500, and >500 pmol/12 h were not significantly different between blacks and nonblacks, either in all subjects or among soy users.
The crude correlation coefficient (with 95% CI) between soy protein from 24-h recalls and FFQs in all subjects was 0.67 (0.46, 0.81), which increased to 0.88 (0.57, 1.00) after adjustment for attenuation from within-person variation in the recalls. Correlations were lower among soy users (Table 4). Pearson correlation coefficients (Table 5) between urinary excretion of specific IFLs and soy protein intake from 24-h recalls (after correction for attenuation) were 0.75, 0.71, and 0.75 for daidzein, genistein, and total IFLs, respectively. The Spearman correlation coefficient for equol was 0.03. Correlations based on FFQ data and those among soy users were lower. Correlation coefficients between energy-adjusted protein intake (from 24-h recalls and FFQs) and daidzein and genistein individually ranged from −0.041 to 0.097 (not shown in tables).
In this study sample, soy protein intake in all subjects and among soy users fell within the range of intakes observed among populations in Singapore, Shanghai, North Korea, and Japan (20, 23, 38). Mean urinary excretions of daidzein and genistein were higher among the soy users in our sample than in either the Singapore or the Shanghai population (20, 23). Seven percent of soy users in our sample produced >500 pmol/12 h equol.
The small sample in this analysis was chosen from the AHS-2 calibration study specifically to represent a group of individuals with a wide range of soy intake (low, medium, or high). Following this criterion, we demonstrated that mean estimates of soy protein from FFQs and recalls are not significantly different and that soy protein from FFQs moderately correlates with estimates from multiple 24-h recalls. The correlation coefficients between specific urinary IFLs (daidzein, genistein, and total IFLs) and soy protein estimates from 24-h recalls or FFQs are relatively high, especially with the use of recall values.
In comparison, another study in which soy protein from questionnaires was validated against 24-h recalls (39) reported a Spearman correlation coefficient of 0.38 (P < 0.05), which is lower than the Pearson correlation of 0.67 that we observed. Our findings for the correlation between soy protein from questionnaires and urinary IFLs (daidzein, genistein, and total IFLs) are within the range (0.55–0.57) of those observed in other studies that evaluated the validity of soy protein from questionnaires with total IFL excretion rates in overnight urine samples. For example, previous studies reported statistically significant correlation coefficients of 0.53 with FFQs in a sample of women in Shanghai (23) and 0.32 or 0.61 with FFQs for the previous year or previous 24 h, respectively, from a multiethnic population (Caucasian, Native Hawaiian, Chinese, Japanese, and Filipino) of women in Hawaii (19).
Many studies that validated soy intake from questionnaires against biomarkers of IFL concentration as the reference method preferentially assessed intake of isoflavones from questionnaires (40-42), and few measured soy protein intake (19, 23, 39). When isoflavone intake was assessed, correlation coefficients were 0.28 in a population of women in the United States (32) and ranged from 0.21 to 0.30 in women from the European Prospective Investigation of Cancer and Nutrition-Norfolk UK study (41). Correlations were higher in a Chinese population living in Shanghai (range: 0.42–0.54) (39) and in a multiethnic population in Hawaii (0.31 or 0.62) (19). The weak correlations observed in Western countries probably result from a small range of soy intake in these populations compared with that in Asian populations. That the correlations observed between urinary IFL excretion and soy protein intake (range: 0.31–0.56) are rather stronger than those between isoflavone intake and excretion (range: 0.24–0.30) suggests that current dietary assessment methods of isoflavone intake may be less accurate than for soy protein intake.
Assessment of isoflavone intake in populations represents a challenge for researchers primarily because of the paucity of data on the isoflavone content of many foods. Moreover, isoflavone databases that are available have been developed with the assumption that the isoflavone content in foods remains constant, despite the recognition that great variability exists. Setchell and Cole (43) demonstrated that isoflavone content in the samples of soy protein isolates used in commercial foods, collected over a period of 3 y, varied markedly (by 200–300%) with time, by manufacturer, and by processing method. During the same 3-y period of sample collection, however, the authors reported only a 3% variation in the protein content of the soy protein isolates, which indicates that manufacturers have better control over the protein than the isoflavone content in soy foods. Data from the USDA-Iowa State University Database on Isoflavone Content of Foods show that total isoflavones in soy protein concentrate can vary from 102 mg/100 g of edible portion by using the aqueous wash method of extraction to 12.47 mg/100 g of product when produced by alcohol extraction (44). Finally, isoflavone content in soybean seeds varies depending on a variety of factors such as environmental, genetic, harvesting, and processing conditions (45). It is not surprising, therefore, that estimates of soy protein intake perform at least as well as, if not better than, estimates of isoflavone intake when correlated with IFL excretion.
US consumption of soy is increasing, and use of commercial ingredients such as soy protein concentrate, soy protein isolate, and soy flour is becoming more prevalent among manufacturers of cereals, energy bars, soy-based iced desserts, and, in particular, meat analogues. If soy proteins in these products remain more stable over time than isoflavones, studies that investigate relations between soy and disease risk should, therefore, evaluate both isoflavone and soy protein intake.
This study provides data on soy protein intake and measured biologic markers in a Western population with a wide range of intake of soy foods and at levels comparable to those of Eastern populations. In addition, we demonstrated that the AHS-2 FFQ is a valid instrument for assessing soy protein intake and that these estimates are a good index of soy isoflavone intake. Findings from this analysis may be useful in bringing greater clarity to the links between soy intake and cancer incidence in the AHS-2 cohort.
We acknowledge our research team who helped collect the 24-h dietary recalls and biologic specimens and express our sincerest thanks to Ken Burke for his expert guidance on estimating soy protein from meat analogues and to Laurie Custer (Cancer Research Center of Hawaii) for technical assistance with urinary analyses.
Supported by NIH grants RO1 CA94594 and CA71789.
None of the authors had a personal or financial conflict of interest.