The relative importance of biochemical pathways has not been previously examined when considering the influence of diet on breast cancer risk. To address this issue, we utilized interview data from a population-based sample of 1,463 breast cancer cases and 1,500 controls. Dietary intake was assessed shortly after diagnosis using a 101-item food frequency questionnaire. Age- and energy-adjusted odds ratios (ORs) for individual micro- and macronutrients were estimated with logistic regression. Hierarchical modeling was employed to account for biologically plausible nutrient pathways (one-carbon metabolism, oxidative stress, glycemic control and phytoestrogens). Effect estimates from hierarchical modeling were more precise and plausible compared to those from multivariable models. The strongest relationship observed was for the glycemic control pathway, but confidence intervals (CI) were wide [OR (95% CI): 0.86 (0.62, 1.21)]. Little or no effect was observed for the one-carbon metabolism, oxidative stress and phytoestrogen pathways. Associations were similar when stratified by supplement use. Our approach that emphasizes biochemical pathways, rather than individual nutrients, revealed that breast cancer risk may be more strongly associated with glycemic control factors than those from other pathways considered. Our study emphasizes the importance of accounting for multiple nutrient pathways when examining associations between dietary intake and breast cancer.
The risk of developing breast cancer on Long Island has been shown to be inversely associated with antioxidants (1), folate and other B-vitamins (2), choline (3) and flavonoids (4) and positively associated with foods associated with poor glycemic control (5). Synthesizing these findings into a coherent public health message is challenging, since it is unclear if there is a specific underlying mechanism that is driving each of these associations, or if they are an artifact of the high collinearity between food constituents, such as nutrients and phytochemicals, in an ad libitum diet (6).
We revisited these associations by focusing our analysis on groups of nutrients with a strong biologic rationale for an association with breast cancer risk. Antioxidants, including carotenoids, vitamins C and E and the mineral selenium, can inhibit the activity of free radicals, thereby preventing the oxidative damage to DNA linked to carcinogenesis (7). One-carbon metabolism is vital in DNA methylation and synthesis, with the nutrients folate, methionine and choline acting as methyl donors (8) and the B-vitamins B2, B6 and B12 functioning as important cofactors in this pathway. Given the mitogenic effect of insulin and its potential role in tumor proliferation (9), dietary factors that affect glycemic control, including macronutrient consumption, as well as nutrient intakes related to glucose metabolism, including calcium, magnesium and zinc (10-12), could affect the carcinogenic process. Flavonoids, a group of naturally occurring phytochemicals in fruits, vegetables and other plant-based sources, have antiestrogenic and antioxidant activity (13, 14), which is believed to be important for the prevention of hormone-dependent cancers, such as breast cancer.
Traditional dietary analysis that considers constituents individually fails to account for the complexity of the inter-related biological pathways involved in carcinogenesis and is plagued by the collinearity of dietary intakes as well as the issue of multiple comparisons. Recently, an approach based on hierarchical regression has been advocated, which accounts for the correlation within the biochemical pathways involved in the effect of dietary constituents on disease (15); however, this approach has not been employed in studies of breast cancer.
For the present analysis, we utilized data from the Long Island Breast Cancer Study Project (LIBCSP) to evaluate the relative importance of each pathway by employing hierarchical regression analysis. Our objective was to estimate effects for each individual nutrient, accounting for the clustering according to putative biologic pathway, as well as estimate fixed-effects for each pathway to determine if one or more of these mechanisms were more critical in the etiology of breast cancer.
Details of the parent study, the LIBCSP population-based case-control study, have been published previously (16). Institutional Review Board approval was obtained for this ancillary study.
Cases were women newly diagnosed with either primary in situ or invasive breast cancer between August 1, 1996 and July 31, 1997, and were English-speaking residents of Long Island, New York (Nassau and Suffolk counties) at the time of diagnosis. To reduce the lag time between breast cancer diagnosis and interview date, newly diagnosed cases were ascertained using a ‘super-rapid’ identification network in which study personnel contacted the pathology departments of participating hospitals either 2-3 times per week or daily (for hospitals with the largest numbers of newly diagnosed cases). Permission to contact eligible case women was obtained via physicians. Control women were sampled using Waksberg's method of random digit dialing (17) for those under 65 years of age, and the Health Care Financing Administration (HCFA) rosters for those 65 years and older. The final study sample included 1,508 cases (82.1%) and 1,556 controls (62.7%).
The main questionnaire was pilot tested among residents of Long Island prior to implementation, and was administered by the interviewer. A total of 3,064 individuals completed the main LIBCSP questionnaire (cases = 1,508 and controls = 1,556). On average, study participants were interviewed within 3 months of their diagnosis date (cases) or within 5.5 months of identification (controls). All case-control respondents were asked about their demographic characteristics, pregnancy history, menstrual history, hormone use, medical history, family history of cancer, body size changes, alcohol use, active and passive cigarette smoking, physical activity, occupational history, and other environmental exposures.
A modified 101-item Block food frequency questionnaire (FFQ), which has been previously validated (18-20), was used to assess average dietary intake one year prior to interview. The instrument was self-administered and completed by 1,481 (98.2%) of the cases and 1,518 (97.6%) of controls. The instrument was specifically modified to include additional food sources of phytoestrogens (21). For this analysis, those with total energy intake 3 standard deviations above or below the log-transformed mean were excluded (n=36), resulting in 1,463 cases and 1,500 controls with FFQ data available for this analysis.
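The energy-intake exclusion described above can be sketched as follows. This is a minimal illustration, not the study's code: the function name, the use of the sample standard deviation, and the intake values are all assumptions made for the example.

```python
import math
import statistics

def flag_energy_outliers(kcal_per_day, n_sd=3.0):
    """Return True for each intake whose log value lies more than n_sd
    standard deviations from the mean of the log-transformed intakes."""
    logs = [math.log(k) for k in kcal_per_day]
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs)  # sample SD (an assumption)
    return [abs(v - mean_log) > n_sd * sd_log for v in logs]

# Hypothetical intakes in kcal/day (not study data): 50 plausible values
# plus one implausibly high report.
intakes = [1500 + 10 * i for i in range(50)] + [60000]
flags = flag_energy_outliers(intakes)
kept = [k for k, f in zip(intakes, flags) if not f]
print(len(intakes) - len(kept), "record(s) excluded")
```

Working on the log scale keeps the rule symmetric for a right-skewed variable such as energy intake, so that implausibly low and implausibly high reports are treated alike.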
Frequency and portion size data from the LIBCSP FFQ were used to estimate daily nutrient intakes from dietary and supplement sources, using the National Cancer Institute's DietSys version 3 (1, 2), for: carbohydrates (g/day), calcium (mg/day), fiber (g/day), magnesium (mg/day), zinc (μg/day), alpha-carotene (μg/day), beta-carotene (μg/day), cryptoxanthin (μg/day), iron (mg/day), lutein (μg/day), lycopene (μg/day), oleic acid (g/day), pro-alpha carotenes (μg/day), vitamin C (mg/day), vitamin E (α-TE), riboflavin (vitamin B2, mg/day), cobalamin (vitamin B12, μg/day), pyridoxine (vitamin B6, mg/day), folate (μg/day), betaine (mg/day), free choline (mg/day), glycerophosphocholine (mg/day), methionine (g/day), free phosphocholine (mg/day), phosphatidylcholine (mg/day), and sphingomyelin (mg/day). Published protocols (3, 21, 22) were used to derive intake of: anthocyanidins (mg/day), flavan-3-ols (mg/day), flavanones (mg/day), flavones (mg/day), flavonols (mg/day), isoflavones (mg/day), lignans (mg/day), choline (mg/day) and betaine (mg/day).
To examine nutrient-specific and pathway-specific effects on breast cancer risk, 33 nutrients were considered and were characterized according to four pathways (Table 1): (1) one-carbon metabolism: zinc, riboflavin (vitamin B2), cobalamin (vitamin B12), pyridoxine (vitamin B6), folate, betaine, free choline, glycerophosphocholine, methionine, free phosphocholine, phosphatidylcholine, sphingomyelin; (2) antioxidants: zinc, alpha-carotene, beta-carotene, cryptoxanthin, iron, lutein, lycopene, oleic acid, pro-alpha carotenes, vitamin C, vitamin E, anthocyanidins, flavan-3-ols, and lignans; (3) glycemic control: total carbohydrate, calcium, fiber, magnesium, zinc; and (4) phytoestrogens: anthocyanidins, flavan-3-ols, flavanones, flavones, flavonols, isoflavones, and lignans. Nutrient values were analyzed as continuous variables and standardized to have a mean = 0 and standard deviation = 1 in order to remove the influence of varying units across intakes. Models with nutrient intakes in their common units did not produce materially different results (data not shown).
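As a brief illustration of the standardization step, the sketch below rescales one nutrient to mean 0 and standard deviation 1, so that a one-unit change corresponds to a 1-SD increase in intake. The intake values and the function name are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics

def standardize(values):
    """Rescale values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical fiber intakes in g/day (not study data).
fiber_g = [12.0, 18.5, 9.0, 25.0, 15.5]
fiber_z = standardize(fiber_g)
print([round(z, 2) for z in fiber_z])
```

After this rescaling, an OR for any nutrient is interpreted per standard deviation of intake, which makes estimates comparable across nutrients measured in g, mg, or μg per day.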
Using a Directed Acyclic Graph (DAG) (23), the following variables were considered as potential confounders: age at diagnosis, total energy intake, parity, menopausal status, body mass index, physical activity, hormone replacement therapy, oral contraceptive use, smoking, and alcohol intake. Inclusion in the model of age at diagnosis (continuous) and total energy intake (kcal/day) changed the effect estimate by more than 10%. Thus, all models (see below) were adjusted for these two variables.
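The 10% change-in-estimate criterion can be sketched as below. The text does not state how the change was computed, so measuring it on the OR scale relative to the unadjusted estimate is an assumption, as are the function names and the example ORs.

```python
def percent_change_in_or(or_unadjusted, or_adjusted):
    """Relative change in the OR after adding a covariate, in percent
    (computed relative to the unadjusted OR; an assumption)."""
    return 100.0 * abs(or_adjusted - or_unadjusted) / or_unadjusted

def meets_criterion(or_unadjusted, or_adjusted, threshold_pct=10.0):
    """True if adjustment moves the estimate by more than threshold_pct."""
    return percent_change_in_or(or_unadjusted, or_adjusted) > threshold_pct

# Hypothetical ORs (not study results).
print(meets_criterion(0.80, 0.90))  # a 12.5% change
print(meets_criterion(0.95, 0.97))  # roughly a 2% change
```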
Three models, single exposure, multivariable, and hierarchical (15), were used to calculate odds ratios (ORs) and corresponding confidence intervals (CIs) as estimates of the effect of these nutrients on breast cancer risk. The single exposure and multivariable models used logistic regression to model the risk of breast cancer as:
$$\operatorname{logit}(p) = \alpha + X\beta + W\gamma$$
where p denotes risk of breast cancer, α is the intercept, X is the matrix of nutrient intake with coefficients β, and W is the matrix of covariates (age at diagnosis and total energy intake) with coefficients γ. The nutrient intakes X for the single exposure model included only one nutrient at a time, whereas in the multivariable model X included all nutrients.
To improve our estimates of breast cancer risk, we used hierarchical regression, which incorporates prior knowledge into the regression model. The second level of the hierarchical regression for the logistic coefficients β of the nutrient intakes was:
$$\beta = Z\pi + \delta$$
where Z is the second-stage design matrix (Table 1), which encodes our prior information about the nutrients and their respective pathways, π is the vector of coefficients corresponding to the effects of the second-stage covariates on breast cancer (i.e., the effect of each individual pathway on breast cancer risk), and δ is the residual error of each nutrient, which is assumed to be normally distributed with mean = 0 and variance = τ². The columns of the Z matrix represent each of the 4 pathways of interest (i.e., one-carbon, antioxidant, glycemic control, and phytoestrogen); the nutrients were scored as 1 if they positively impacted the pathway, −1 if they negatively impacted the pathway, and 0 if they were not associated with the pathway. These weights were chosen by the authors after careful review of the literature (9, 14, 24-27). For example, vitamin C is an antioxidant that favorably affects the antioxidant pathway (24), and thus it was scored 1; in contrast, iron increases oxidative stress (26), and was therefore scored −1. Further, carbohydrates negatively impact glycemic control (9), and were thus scored −1, whereas fiber positively impacts glycemic control (27) and was scored 1.
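The scoring scheme for the second-stage design matrix can be illustrated with a short sketch. Only the four scores spelled out in the text are encoded; the remaining rows of Table 1 are not reproduced, and the variable and function names are hypothetical. The pathway coefficient vector uses the glycemic control estimate reported in this paper (OR = 0.86) and, purely for illustration, sets the other three pathway effects to exactly zero.

```python
import math

PATHWAYS = ["one_carbon", "antioxidant", "glycemic_control", "phytoestrogen"]

# Scores stated in the text: 1 = favorable, -1 = unfavorable, 0 = unrelated.
# This is a four-row subset for illustration, not the full Table 1.
SCORES = {
    "vitamin_c": {"antioxidant": 1},
    "iron": {"antioxidant": -1},
    "carbohydrate": {"glycemic_control": -1},
    "fiber": {"glycemic_control": 1},
}

def build_z(scores, pathways):
    """One row per nutrient, one column per pathway; unlisted cells are 0."""
    return {n: [s.get(p, 0) for p in pathways] for n, s in scores.items()}

def prior_means(z, pi):
    """Second-stage prior mean for each nutrient coefficient: row of Z times pi."""
    return {n: sum(z_ij * pi_j for z_ij, pi_j in zip(row, pi))
            for n, row in z.items()}

Z = build_z(SCORES, PATHWAYS)
# log(0.86) is the reported glycemic control effect; the zeros are assumed.
pi = [0.0, 0.0, math.log(0.86), 0.0]
means = prior_means(Z, pi)
for nutrient, m in means.items():
    print(nutrient, Z[nutrient], round(math.exp(m), 2))
```

Because carbohydrate is scored −1 on the glycemic control column, its prior mean has the opposite sign to that of fiber, which is how a single pathway coefficient can imply protective and adverse directions for different nutrients.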
The first (β) and second-stage (XZπ) estimates are then combined to provide posterior estimates of the association between each nutrient (according to the pathway to which it belongs) and breast cancer risk:
$$\operatorname{logit}(p) = \alpha + XZ\pi + X\delta + W\gamma$$
where α is the intercept, the X, Z, and W matrices are the same as defined previously, the π and γ coefficients are fixed, and the δ coefficient is random with mean = 0 and variance = τ².
A semi-Bayes approach was used in the hierarchical regression analysis, with τ² set to a constant of 0.1225 (τ = 0.35). This choice allows the 95% interval for the residual effect of a 1-SD increase in nutrient intake to span an approximately 4-fold range of ORs, since exp(3.92 × τ) ≈ 4.0 when τ = 0.35 (28). For an analysis including 33 first-stage covariates (nutrients) and 4 second-stage covariates (biologic pathways), setting τ = 0.35 provides adequate 95% CI coverage for residual effects not accounted for by the first and second stage (29). All analyses were conducted using SAS Version 9.2 (SAS Institute, Cary, NC). Hierarchical regression was conducted in SAS using the GLIMMIX macro (30).
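The arithmetic behind the choice of τ can be checked directly. The sketch below assumes a normal prior on the log-OR scale centered at zero residual effect; the function name is hypothetical, and the factor 3.92 is simply 2 × 1.96.

```python
import math

def prior_or_interval(tau, center_log_or=0.0):
    """95% prior interval on the OR scale for a normal prior on the log OR."""
    half_width = 1.96 * tau
    return (math.exp(center_log_or - half_width),
            math.exp(center_log_or + half_width))

tau = 0.35
lower, upper = prior_or_interval(tau)
ratio = upper / lower  # equals exp(3.92 * tau)
print(round(tau ** 2, 4), round(lower, 2), round(upper, 2), round(ratio, 2))
```

The ratio of the upper to the lower limit depends only on τ, so the same roughly 4-fold range applies whatever prior mean the second stage assigns to a nutrient.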
We explored differences in the effect of nutrient intake on breast cancer risk by supplement use by fitting separate hierarchical models to non-supplement users and supplement users. A total of 1,986 individuals had data available regarding supplement use. The remaining 977 individuals did not respond; effects were not estimated separately for subjects missing supplement use because doing so is an inappropriate method for dealing with missing data (31). A sensitivity analysis was conducted for the hierarchical model, in which varying proportions (0-100%) of the individuals with missing supplement use were stochastically assigned to be supplement users, with the remainder categorized as non-users.
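One way to implement the stochastic assignment in this sensitivity analysis is sketched below. The exact assignment mechanism is not described in the text, so the independent random draw per subject, the labels, the seed, and the toy data are all assumptions.

```python
import random

def assign_missing(status, prop_users, rng):
    """Fill missing supplement status (None): each missing subject becomes a
    'user' with probability prop_users, otherwise a 'non_user'."""
    filled = []
    for s in status:
        if s is None:
            s = "user" if rng.random() < prop_users else "non_user"
        filled.append(s)
    return filled

# Hypothetical records (not study data): 4 users, 4 non-users, 4 missing.
status = ["user"] * 4 + ["non_user"] * 4 + [None] * 4
rng = random.Random(2024)  # fixed seed so the sketch is reproducible

scenarios = {}
for prop in (0.0, 0.25, 0.5, 0.75, 1.0):
    filled = assign_missing(status, prop, rng)
    scenarios[prop] = filled.count("user")
print(scenarios)
```

In a full analysis, the stratified hierarchical models would be refitted under each scenario and the resulting estimates compared across scenarios.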
The effect estimates (and 95% CIs) derived from the single exposure, multivariable, and hierarchical models are presented in Table 2. Interpretation of traditional dietary analyses that examine multiple nutrients individually (single exposure model) is not straightforward, given that this approach does not take into consideration multiple comparisons and high collinearity between the nutrients. The multivariable model considers all nutrients simultaneously in a single model; however, since the nutrient intakes in our study were highly correlated (data not shown), the estimates from the multivariable model must also be interpreted with caution. ORs from the multivariable model for nutrients representative of the 4 major nutrient/biological pathways (one-carbon metabolism, oxidative stress, glycemic control, phytoestrogen) appear to be inflated when compared to those derived from single exposure models. The magnitude of the effect estimate for folate intake (one-carbon metabolism), for example, increases sharply from the single exposure model (OR = 0.97; 95% CI = 0.89, 1.07) to the multivariable model (OR = 1.49; 95% CI = 1.04, 2.13), as does the effect for intake of pro-alpha carotenes (oxidative stress; single exposure model OR = 0.96; 95% CI = 0.89, 1.04; multivariable model OR = 1.47; 95% CI = 0.53, 4.11). The inverse effect associated with intake of magnesium (glycemic control) in the multivariable model also appears inflated, on the opposite side of the null (single exposure model OR = 0.88; 95% CI = 0.78, 0.99; multivariable model OR = 0.73; 95% CI = 0.51, 1.04). All estimates derived from the multivariable models are less precise than those from the single exposure models, as evidenced by the wider confidence intervals (see Supplementary Table 5 for a presentation of confidence limit ratios).
In contrast, as shown in Table 2, the effect estimates for the dietary factors in the hierarchical model, which incorporates the second-stage nutrient pathway information, were improved; the hierarchical estimates appeared to have better precision and to have provided better control for the apparent bias (due to multiple comparisons and collinearity) than those derived from the more simplistic models. In the hierarchical model, for example, the effect of folate was attenuated (OR = 1.31; 95% CI = 0.97, 1.76) compared to that from the multivariable model. Further, the estimates for pro-alpha carotenes (hierarchical OR = 1.10; 95% CI = 0.71, 1.70) and magnesium (hierarchical OR = 0.77; 95% CI = 0.57, 1.04) were also pulled towards the single exposure estimates.
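The gain in precision can be made concrete with confidence limit ratios (upper limit divided by lower limit), computed here from the confidence intervals quoted in the text for folate, pro-alpha carotenes, and magnesium. The data structure and function name are illustrative only.

```python
# 95% confidence limits (lower, upper) as reported in the text.
LIMITS = {
    "folate": {
        "single": (0.89, 1.07),
        "multivariable": (1.04, 2.13),
        "hierarchical": (0.97, 1.76),
    },
    "pro_alpha_carotenes": {
        "single": (0.89, 1.04),
        "multivariable": (0.53, 4.11),
        "hierarchical": (0.71, 1.70),
    },
    "magnesium": {
        "single": (0.78, 0.99),
        "multivariable": (0.51, 1.04),
        "hierarchical": (0.57, 1.04),
    },
}

def confidence_limit_ratio(lower, upper):
    """Upper limit divided by lower limit; smaller values mean more precision."""
    return upper / lower

clr = {
    nutrient: {model: confidence_limit_ratio(*ci) for model, ci in models.items()}
    for nutrient, models in LIMITS.items()
}
for nutrient, by_model in clr.items():
    print(nutrient, {m: round(v, 2) for m, v in by_model.items()})
```

For all three nutrients the hierarchical ratio falls between the single exposure and multivariable ratios, consistent with the hierarchical model recovering part of the precision lost by entering all nutrients at once.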
The fixed effects from the hierarchical model are presented in Table 3. The strongest association was noted for glycemic control, with a 14% decrease in risk of breast cancer per standard deviation increase of any nutrient positively affecting this pathway (fixed effects OR: 0.86; 95% CI = 0.62, 1.21), although its confidence interval overlapped with those from the other pathway-specific effects. The pathway-specific effects for the one-carbon metabolism, oxidative stress and phytoestrogen pathways were near unity.
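The conversion from the pathway OR to a percent change in odds, and the standard error implied by its confidence interval, can be sketched as follows. The calculation assumes a Wald-type interval that is symmetric on the log-odds scale; the function names are hypothetical.

```python
import math

def percent_change_in_odds(odds_ratio):
    """Percent change in odds per unit (here, per SD) increase; negative = decrease."""
    return (odds_ratio - 1.0) * 100.0

def implied_se(lower, upper):
    """Standard error of the log OR implied by a 95% Wald interval."""
    return math.log(upper / lower) / (2 * 1.96)

# Glycemic control pathway estimate reported in the text.
or_glycemic, ci_lower, ci_upper = 0.86, 0.62, 1.21
pct = percent_change_in_odds(or_glycemic)
se = implied_se(ci_lower, ci_upper)
print(round(pct, 1), round(se, 3))
```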
Table 4 provides the nutrient-specific effects from the hierarchical regression analysis stratified by supplement use. The estimates among supplement users tended to be higher than those among non-supplement users for: calcium, fiber, zinc, lutein, vitamin C, vitamin E, folate, free choline, glycerophosphocholine, phosphatidylcholine, and flavan-3-ols. However, supplement users had lower estimates than non-supplement users for: carbohydrates, cryptoxanthin, iron, riboflavin, cobalamin, pyridoxine, betaine, free phosphocholine, sphingomyelin, anthocyanidins, and flavanones. The supplement-use stratified estimates for pro-alpha carotenes, magnesium, and flavones were similar between users and non-users. Among non-supplement users, the estimate for folate was lower (hierarchical OR = 0.95; 95% CI = 0.61, 1.48) than among supplement users (hierarchical OR = 1.35; 95% CI = 0.89, 2.05). The pathway-specific effects for the stratified analysis (Table 3) did not differ substantially by supplement use.
A sensitivity analysis was conducted, given that one-third of subjects had not reported information on supplement use (N=977; approximately 33% of total subjects). Among non-supplement users, the effect estimates were robust to the assignment of subjects with missing data. Similarly, among supplement users, the effect estimates did not change substantially when the supplement use assigned to those with missing data was varied. The fixed effect estimates for the pathway-specific effects were similar across scenarios among both non-supplement and supplement users (data not shown).
Additional sensitivity analyses were undertaken to consider the impact of potentially influential model assumptions, including: changes in specification of the Z matrix; categorization of nutrient intake into quartiles; modeling only those nutrients that are exclusive to one biologic pathway; assigning each nutrient exclusively to a single pathway (no overlapping pathways per nutrient); and use of a fully Bayesian approach with a Markov chain Monte Carlo method to estimate posterior effects (23). However, none of these additional considerations substantially influenced the results presented here (data not shown).
This study is the first, to our knowledge, to evaluate the relative effects of biochemically-based nutrient pathways and breast cancer risk. Using hierarchical modeling and data from a large, population-based case-control study, we observed a decrease in breast cancer risk with increasing overall intake of nutrients involved in glycemic control, although the effect estimates were imprecise. In general, estimates from the hierarchical model were less inflated, and more precise compared to those from a multivariable model that included all nutrients, and this modeling approach allowed for estimation of an overall effect of groups of nutrients that operate along a specific biological pathway. We observed essentially null associations with breast cancer risk with increasing intake of nutrients involved in one-carbon metabolism, in phytoestrogens and antioxidants. A stratified analysis yielded similar results regardless of supplement use.
The finding of an effect for exposures related to glycemic control is in agreement with previous findings from our research group indicating that high consumption of sweets was associated with increased breast cancer risk (5). The importance of glycemic control in cancer etiology is gaining attention. Recent evaluations of dietary glycemic index/glycemic load have yielded generally null findings (32-34), yet the utility of this measure is controversial (35); instead, our analysis considers a set of nutrients in this pathway rather than this specific measure. Regular intake of foods that increase serum glucose, such as dietary carbohydrate, increases insulin activity, resulting in chronic hyperinsulinemia, which is in turn associated with increased levels of IGF-1 (36). Insulin, released from the pancreas in response to elevated serum glucose, is used for glucose transport and utilization (37, 38), protein synthesis, and cellular proliferation (39). In addition to its direct mitogenic effects, insulin enhances growth hormone (GH) stimulated insulin-like growth factor-1 (IGF-1) synthesis (36), which independently promotes tumor development by increasing cell proliferation and inhibiting apoptosis (40). Previous epidemiologic studies have shown elevated serum levels of IGF-1 to be associated with breast and colorectal cancer (41, 42), implicating this hormonal milieu in carcinogenesis. While excess dietary carbohydrate intake may be one factor in the hyperinsulinemic state (36), several other nutrients, which were considered in our hierarchical analysis, are thought to play important roles. Fiber slows the absorption of glucose in the small intestine, thereby blunting the ensuing insulin response (43), while calcium, magnesium, and zinc are important cofactors involved in glucose homeostasis and are primarily involved with insulin secretion (10-12, 44).
In the hierarchical models, although we observed a biologically plausible decreased risk of developing breast cancer in association with the glycemic control pathway, effects were not observed in the other pathways considered in this study. These include the one-carbon metabolism, antioxidant, and phytoestrogen pathways, all of which have strong biologic plausibility and for which we had previously reported inverse associations with breast cancer incidence when each was considered separately (1-4). Stratification by supplement use did not alter our findings. The reasons for these unexpected findings are unclear, but could relate to the issues that motivated us to conduct hierarchical regression modeling, namely the artifactual influences of multicollinearity and multiple comparisons, which are always of concern in epidemiologic studies of nutrition and cancer. Our findings suggest that future research efforts should also utilize methods to address these concerns.
Our analysis employed hierarchical regression, a semi-Bayesian modeling approach that allows prior information about parameter estimates to be incorporated into the model that relates multiple related exposures to a disease outcome. This technique has been useful in analyses with multiple correlated exposures (45-47) and has been successfully applied to several analyses of dietary factors and cancer risk (28, 30, 48) that considered the effect of individual foods by incorporating nutrients in the second stage. The univariate model does not allow for examination of multiple nutrients that may be highly correlated, and the multivariable model is often overparameterized. The hierarchical model improves upon the univariate and multivariable models by allowing for incorporation of multiple correlated nutrients in a single model, while simultaneously considering biologic pathways. Recently, Carmichael and colleagues (15) successfully applied it to a study of the effects of dietary exposures on neural tube defects, utilizing relevant biochemical pathways as the second-stage grouping, which is the strategy we adapted, given that no previous breast cancer study has explored this nutrient-pathway based approach. The authors of that study found that hierarchical modeling resulted in more precise and less inflated estimates of the dietary intakes in their analysis when compared to the multivariable approach. Hierarchical modeling allows the researcher to consider relationships between multiple exposures that may or may not be related to each other. Serial analysis of a set of related exposures is likely to yield spurious associations, since an association may be found for a benign exposure that is simply correlated with another one which is responsible for a true effect. Including all of the correlated exposures simultaneously in a traditional regression framework is similarly ill-advised, as the resultant multicollinearity will induce instability in the model.
The hierarchical approach employed here “shrinks” parameter estimates of exposures that operate along a similar pathway towards a common value, while allowing them to have their own individual effect on risk of breast cancer. This approach improves model stability and yields more plausible effect estimates compared to a simple multivariable technique.
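The shrinkage idea can be illustrated with the standard one-parameter semi-Bayes approximation, in which the posterior log OR is an inverse-variance weighted average of the conventional estimate and the prior mean. This is a simplified sketch and not the GLIMMIX fit used in the analysis: it treats one coefficient in isolation, ignores correlations among nutrients, and assumes a prior mean of zero. It is therefore not expected to reproduce the hierarchical estimates in Table 2. The inputs are the multivariable folate estimate and τ² = 0.1225 from the text; the function name is hypothetical.

```python
import math

def semi_bayes_shrink(beta_ml, var_ml, prior_mean, tau2):
    """Inverse-variance weighted average of an estimate and its prior mean."""
    w_data = 1.0 / var_ml
    w_prior = 1.0 / tau2
    post_var = 1.0 / (w_data + w_prior)
    post_mean = post_var * (w_data * beta_ml + w_prior * prior_mean)
    return post_mean, post_var

# Multivariable folate estimate from the text: OR = 1.49 (95% CI 1.04, 2.13).
beta_ml = math.log(1.49)
se_ml = math.log(2.13 / 1.04) / (2 * 1.96)  # SE implied by a Wald interval
var_ml = se_ml ** 2

post_mean, post_var = semi_bayes_shrink(beta_ml, var_ml, prior_mean=0.0, tau2=0.1225)
post_or = math.exp(post_mean)
print(round(post_or, 2), round(math.sqrt(post_var), 3))
```

The less precise the conventional estimate is relative to τ², the more weight the prior mean receives, which is why unstable multivariable estimates are pulled furthest toward their pathway-based prior.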
In addition to our use of a method to incorporate prior biologic knowledge into our analysis, this study benefitted from a large, population-based sample that includes pre- and post-menopausal women and from the comprehensive assessment of anthropometric, lifestyle, and dietary factors. While these notable strengths lend credibility to our findings, they should be evaluated in light of a few limitations. With an even larger sample, future studies could consider stratification by breast cancer subtype and the potential for effect modification by other important exposures such as alcohol. The use of an FFQ to ascertain usual diet has a number of well-known issues; however, FFQs have been shown to adequately rank dietary intakes across individuals (6), which is the approach utilized for the current analysis. Differential recall in case-control studies is always a possibility, in that the controls may over-report healthy behaviors with greater frequency than cases in an effort to appear more socially acceptable. However, the fact that a null effect was observed for three of the four pathways considered here would appear to be in contrast with that scenario; it would be unlikely that the cases would differentially report behaviors related to a single biochemical pathway. Additional considerations for future studies would include utilizing different methodologies, including factor analysis, principal components, and consideration of food intake. However, these alternative approaches do not incorporate prior information regarding biologic pathway, and therefore would not provide an estimation of biologic-pathway based effects.
In conclusion, this analysis is the first to consider, in a single model, nutrients and food constituents that operate along specific biological pathways involved in the etiology of breast cancer. Our findings suggest that dietary factors related to glycemic control may be relatively more important than those related to one-carbon metabolism and antioxidant activity or those with estrogenic properties. Additional research should focus on the importance of the glycemic control pathway in the etiology of breast cancer, and future analyses of diet could benefit from a similar pathway-based approach to highlight relevant biologic mechanisms, while reducing the impact of issues related to multicollinearity and multiple comparisons.
This work was supported by grants from the National Cancer Institute (R01CA109753 and 3R01CA109753–04S1) and in part by grants from the Department of Defense (BC031746 and W81XWH-06–1-0298) and the National Cancer Institute and National Institute of Environmental Health Sciences (UO1CA/ES66572, UO1CA66572, P30CA013696, P30ES009089, and P30ES10126) and the Marilyn Gentry Fellowship from the American Institute for Cancer Research. The authors declare no financial or nonfinancial conflicts.
The authors would like to thank Drs. John Witte, Charles Poole, David Richardson, and Stephen Cole for their thoughtful comments on an earlier version of this work.