Lipid-associated genetic variants discovered through genome-wide association studies (GWAS) do not account for the majority of heritability estimated for these traits. Epidemiological studies have long indicated that certain environmental factors are capable of shaping lipid distributions. However, environmental modifiers of known genotype-phenotype associations are only now emerging in the literature.
We genotyped GWAS-identified variants in samples collected for the National Health and Nutrition Examination Surveys (NHANES). NHANES is a cross-sectional survey of Americans representing non-Hispanic whites (n=2,435), non-Hispanic blacks (n=1,407), and Mexican Americans (n=1,734). Along with lipid levels, NHANES contains an abundance of environmental variables, including serum vitamin A and E levels, both of which are antioxidants that may play a role in lipid metabolism. Gene-environment interactions were modeled between either vitamin A or ln(vitamin E) and 23 GWAS-identified lipid-associated variants for HDL-C, LDL-C, and ln(TG) levels.
After adjusting for age, sex, and marginal effects, three SNPxvitamin A and six SNPxvitamin E interactions were identified at a significance threshold of p<2.2×10−3. The most significant interaction was APOB rs693xvitamin E (p=8.9×10−7) for LDL-C levels among Mexican Americans; this same interaction was significant in non-Hispanic whites (p=2.67×10−4) but not in non-Hispanic blacks (p=0.11). The nine significant interaction models individually explained 0.35–1.28% of the variation in one of the lipid traits.
Our results suggest that the vitamins A and E impact GWAS-identified associations for lipid traits; however, these significant interactions account for only a fraction of the overall variability observed for HDL-C, LDL-C, and TG levels in the general population.
The importance of both genetics and environment in shaping an individual’s lipid profile is intuitively obvious. However, the search for gene-environment interactions that influence levels of HDL-C, LDL-C, and triglycerides began only relatively recently. One driving force for expanding beyond the standard single-variant models is the observation that single-variant main effects do not account for the majority of the heritability attributed to additive genetics for most complex human traits. For the lipid traits, heritability estimates are as high as 80% [17,31,36], yet the largest and most comprehensive lipid meta-analysis to date was only able to explain about 25–30% of the genetic variance. The identification of gene-environment interactions may help account for a proportion of this “missing heritability”.
Within a statistical framework, a gene-environment interaction describes the effect of a genotype and an environmental factor that deviates from their additive effects. Within a biological framework, the environment (or its by-product) modifies the function or amount of a gene product. The latter approach to identify gene-environment interactions is difficult in outbred populations such as humans given that both genetic background and environmental exposures vary within and across populations. Model organisms are more suited to identify biological interactions, but it is difficult to automate these studies, and the findings of these experiments may not generalize to humans. In contrast, methods to identify statistical interactions can be automated, making them an attractive option for detecting gene-environment interactions important for complex human traits.
A number of candidate environmental factors affect lipoprotein phenotypes, including diet and nutrition. More specifically, the fat-soluble micronutrients vitamin E and vitamin A may influence lipid metabolism by way of their antioxidant properties. For example, vitamin E may play a role in the prevention of atherosclerosis through inhibition of LDL oxidation. While one randomized controlled trial demonstrated an inverse association between vitamin E intake and relative risk of coronary artery disease, others were unable to replicate this protective effect, as reviewed in Nicolosi et al. Discrepancies between studies may be due to the fact that vitamin E can also function as a prooxidant. The antioxidant and anti-atherogenic properties of vitamin A are less studied, although it is known that high doses of vitamin A in the form of isotretinoin increase triglycerides and cholesterol levels and lower HDL-C levels [4,26,29,43].
Despite evidence that genetic variants and environmental factors are independently associated with lipid traits, relatively few studies have been published investigating the interaction between the two [3,10,11,16,22,40]. To our knowledge, no studies explicitly testing for interactions between lipid-associated SNPs and vitamin E or vitamin A have been published. We present here an investigation of the effects of 23 lipid-associated SNPs in the context of serum levels of vitamins A and E using data from the National Health and Nutrition Examination Surveys (NHANES). Analysis of ~15,000 participants from this diverse population-based survey revealed nine significant interactions between lipid-associated SNPs and serum levels of vitamins A and E. These significant interactions explained 0.35–0.39%, 0.67–1.28%, and 0.36–0.80% of the variability in HDL-C, LDL-C, and triglyceride levels, respectively. Overall, these data provide the first steps in finding the “missing heritability” for lipid traits by accounting for nutritional variables.
Study samples were drawn from three National Health and Nutrition Examination Surveys (NHANES III, NHANES 1999–2000, and NHANES 2001–2002). Participant ascertainment and data collection for NHANES has been previously described [7,8]. Only fasting adults (age ≥ 18 years) were included in this analysis. Race/ethnicity was self-described.
All procedures were approved by the CDC Ethics Review Board and written informed consent was obtained from all participants. Because no identifying information was accessed by the investigators, this study was considered exempt from Human Subjects by Vanderbilt University’s Institutional Review Board.
Serum HDL-C, triglycerides, and total cholesterol were measured using standard enzymatic methods. LDL-C was calculated using the Friedewald equation, with missing values assigned for samples with triglyceride levels greater than 400 mg/dl. Serum levels of vitamin E and vitamin A were measured with isocratic high-performance liquid chromatography [6,9].
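As an illustration of the LDL-C calculation described above, the following is a minimal sketch of the Friedewald estimate (LDL-C = total cholesterol − HDL-C − triglycerides/5, all in mg/dl) with values set to missing when triglycerides exceed 400 mg/dl. The function name, argument names, and example values are hypothetical and are not taken from the study's SAS code.

```python
from typing import Optional


def friedewald_ldl(total_chol: float, hdl: float, triglycerides: float,
                   tg_cutoff: float = 400.0) -> Optional[float]:
    """Estimate LDL-C in mg/dl with the Friedewald equation.

    Returns None (missing) when triglycerides exceed the cutoff,
    mirroring the exclusion rule described in the text.
    """
    if triglycerides > tg_cutoff:
        return None
    # Triglycerides / 5 approximates VLDL cholesterol in mg/dl.
    return total_chol - hdl - triglycerides / 5.0


# Hypothetical participants (all values in mg/dl).
print(friedewald_ldl(200.0, 50.0, 150.0))  # 120.0
print(friedewald_ldl(240.0, 45.0, 450.0))  # None
```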
A total of 23 SNPs were considered in this analysis (Table 1). All SNPs were previously associated with HDL-C, LDL-C, and/or triglycerides in published (as of early 2009) candidate gene and genome-wide association studies [1,19,20,24,35,41] and were subsequently analyzed for single-SNP associations with lipid levels in a large meta-analysis by the Population Architecture using Genomics and Epidemiology (PAGE) study. The 23 SNPs tested for gene-environment interactions were either accessed from existing data in the Genetic NHANES database or were directly genotyped by the Epidemiological Architecture of Genes Linked to Environment (EAGLE), one of the four large population-based studies of the PAGE network, using Sequenom or Illumina BeadXpress. Genotyping was performed in the Vanderbilt DNA Resources Core and in the laboratory of Dr. Jonathan Haines. In addition to genotyping experimental NHANES samples, we genotyped blind duplicates provided by CDC and HapMap controls (n=360). All EAGLE SNPs considered here were genotyped in all three NHANES (NHANES III, NHANES 1999–2000, and NHANES 2001–2002), had minor allele frequencies >5% in all three racial/ethnic populations, passed CDC quality control metrics, and are available for secondary analyses through NCHS/CDC.
Regression modeling was used to investigate the effect of interactions between lipid-associated variants and vitamin levels on HDL-C, LDL-C, and triglycerides. Gene-environment interactions were modeled using a multiplicative interaction term between the environmental variable and the additively-encoded SNP. All models were adjusted for the main effect of the SNP and the environmental variable, along with age and sex. Triglycerides and vitamin E levels were natural-log transformed due to a skewed, non-normal distribution. HDL-C, LDL-C, and vitamin A levels were left as continuous and untransformed. All statistical analyses were conducted unweighted and remotely in SAS v9.2 (SAS Institute, Cary, NC) using the Analytic Data Research by Email (ANDRE) portal of the CDC Research Data Center in Hyattsville, MD. Associations were deemed significant if the p-value was less than or equal to the Bonferroni corrected threshold of 2.2×10−3 (=0.05/23 SNPs). Aggregate statistics related to this work will be available via dbGaP as part of the PAGE study.
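The model described above can be sketched as an ordinary least squares fit with a multiplicative SNP × ln(vitamin E) term, adjusted for both main effects, age, and sex. The sketch below is only an illustration in Python using the standard library; the study itself was run in SAS, and the simulated participants, coefficient values, and all function and variable names here are assumptions made for the example.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def design_row(snp, vit_e, age, sex):
    """Intercept, additive SNP (0/1/2), ln(vitamin E), SNP x ln(vitamin E), age, sex."""
    ln_e = math.log(vit_e)
    return [1.0, float(snp), ln_e, snp * ln_e, float(age), float(sex)]


# Simulated (hypothetical) participants; coefficients are made up for the demo.
rng = random.Random(0)
true_beta = [100.0, 2.0, 5.0, 1.5, 0.3, -4.0]
rows, ldl = [], []
for _ in range(500):
    row = design_row(rng.choice([0, 1, 2]), rng.uniform(500, 2000),
                     rng.uniform(18, 80), rng.choice([0, 1]))
    rows.append(row)
    ldl.append(sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0, 0.5))

beta_hat = ols(rows, ldl)
print("estimated interaction coefficient:", beta_hat[3])

# Bonferroni-corrected threshold for 23 SNPs, as stated in the text.
bonferroni_alpha = 0.05 / 23
print("Bonferroni threshold:", bonferroni_alpha)
```

The fourth coefficient is the interaction term; in the study, its p-value is what was compared against the Bonferroni threshold. Computing that p-value requires the coefficient's standard error and a t distribution, which this sketch omits.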
Table 2 displays descriptive statistics for the key variables in this study. Both vitamin A and vitamin E levels were significantly different among the three racial/ethnic groups (p<0.001, one-way ANOVA). Non-Hispanic whites had higher mean levels of both vitamin A and vitamin E (60.6 µg/dl and 1,322 µg/dl, respectively) compared to non-Hispanic blacks (53.1 µg/dl and 1,002 µg/dl) and Mexican Americans (52.8 µg/dl and 1,135 µg/dl). Non-Hispanic blacks and Mexican Americans had similar mean vitamin A levels, although vitamin E levels were higher in Mexican Americans.
It is important to note that vitamins A and E were highly correlated with the majority of lipid levels in all three NHANES populations (Table 3). More specifically, vitamin A was associated with all three lipid traits in the majority of participants. For triglycerides, the amount of variance explained (R2) by vitamin A was as high as 14% in non-Hispanic whites. R2 was smaller for the other two lipid traits (max R2<5% between LDL-C and vitamin A in Mexican Americans; Table 3) although it was still larger than the average amount of variance explained by single common genetic variants (~3%). Vitamin E was also very strongly correlated with LDL-C and triglyceride levels (p<4.05×10−45) across all racial/ethnic groups. Furthermore, vitamin E levels explained 17–24% of the variance in LDL-C levels and 25–40% of the variance in triglyceride levels (Table 3).
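For a single predictor, the variance explained (R2) reported above equals the squared Pearson correlation between the predictor and the outcome. The following minimal sketch shows that calculation; the function name and the toy values are hypothetical and do not come from NHANES.

```python
def r_squared(x, y):
    """Squared Pearson correlation, i.e. R2 of a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Toy data: a weak-to-moderate linear relationship.
vitamin = [1.0, 2.0, 3.0]
lipid = [1.0, 3.0, 2.0]
print(r_squared(vitamin, lipid))  # 0.25
```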
We tested for gene-environment interaction effects between our 23 lipid-associated variants and vitamins A and E on HDL-C, LDL-C, and triglyceride levels. A total of nine gene-environment interactions were statistically significant at p<2.2×10−3 and are summarized in Table 4. Full association results are reported in Supplementary Tables S1–S6. The association between LDL-C and APOB rs693xvitamin E in Mexican Americans was the most significant at p=8.94×10−7. This same interaction was significant in non-Hispanic whites (p=2.67×10−4) but not in non-Hispanic blacks (p=0.11, Table S5). Additionally, other interactions with this APOB variant (rs693xvitamin A and rs693xvitamin E) were significantly associated with triglyceride levels among non-Hispanic whites at p=2.16×10−3 and 4.65×10−5, respectively.
Interactions between ANGPTL3 rs1748195 and both vitamin A and E were associated with HDL-C levels in non-Hispanic whites (p=1.16×10−3 and p=2.06×10−3). The ANGPTL3 rs1748195xvitamin A interaction trended towards significance in non-Hispanic blacks (p=0.01) but was not associated with HDL-C in Mexican Americans (p=0.64, Table S1). Similarly, the rs1748195xvitamin E interaction was not associated with HDL-C in the other two populations.
Two interactions with a variant in PCSK9 are also listed in Table 4. The PCSK9 rs11206510xvitamin A interaction was associated with LDL-C in Mexican Americans at p=7.65×10−5. In addition, the PCSK9 rs11206510xvitamin E interaction was associated with transformed triglycerides in non-Hispanic whites at p=1.27×10−3. Lastly, the only significant gene-environment interaction observed in non-Hispanic blacks was between the APOA1/C3/A4/A5 cluster variant rs3135506 and vitamin E, which was associated with triglyceride levels at p=2.45×10−4.
The nine significant interaction models individually explained 0.35–1.28% of the variation in one of the lipid traits. Interactions rs693xvitamin E and rs11206510xvitamin A had the greatest R2 values and contributed to 1.28% and 1.26%, respectively, of the variation in LDL-C among Mexican Americans. The seven other interaction terms had R2 values <1%.
In this study we have identified three novel SNPxvitamin A and six novel SNPxvitamin E interactions. The largest share of the significant interactions involved triglycerides (4/9), and most were observed among non-Hispanic whites (6/9). Our most significant finding (APOB rs693xvitamin E), however, explained less than 1.3% of the variance in LDL-C among Mexican Americans, a trait that is up to 80% heritable. In comparison, the effect of age and sex together accounted for 5.9% of the variance in LDL-C among Mexican Americans.
All of the genes implicated here play key roles in lipid metabolism. The gene products of APOB, apoB-48 and apoB-100, are the main apolipoproteins of chylomicrons and LDL particles, respectively. ANGPTL3 encodes a protein which can suppress lipoprotein lipase (LPL) activity, leading to increases in plasma triglycerides and HDL-C. PCSK9 encodes proprotein convertase subtilisin/kexin type 9, a protein that binds the LDL receptor and induces its degradation. Lastly, the APOA1/C3/A4/A5 gene cluster lies within a 17kb region on chromosome 11. Proteins made by this gene cluster are major constituents of very low density lipoprotein (VLDL) and/or HDL, act to inhibit LPL activity, and influence dietary fat absorption and chylomicron synthesis.
Both vitamin E and A are incorporated into lipoproteins and are delivered to peripheral tissues. Additionally, both are found exclusively in plasma lipoproteins (VLDL, LDL, and HDL). The interdependence of these vitamins and lipids (as demonstrated in Table 3) suggests that the interactions described in this study may either simply reflect the strong correlation between vitamins and lipids or be biologically relevant. In support of the latter interpretation, micronutrients such as vitamin A and E have previously been implicated in affecting the expression of important lipid-metabolizing genes [15,16,27,28,33]. For example, Mooradian et al. demonstrated that high concentrations of vitamin E were associated with significant decreases in apoA-I expression (which is sensitive to the oxidative state of the cell) in hepatic HepG2 cells by reducing apoA-I promoter activity.
It has been argued that gene-environment heterogeneity may be, in part, to blame for the lack of replication among GWAS and among different ancestral populations [23,32]. In the single-SNP PAGE meta-analysis detailed in Dumitrescu et al., APOB rs693 was strongly associated with LDL-C in European Americans (p=3.38×10−21), marginally associated in African Americans (p=0.02), but not associated in Mexican Americans/Hispanics (p=0.18). However, in this analysis, which represents a subset of the PAGE study sample, the main effect of rs693 was significantly associated in Mexican Americans (p=1.17×10−6, Table 4) after adjusting for the interaction with vitamin E. Accounting for environmental modifiers in genetic studies of lipid levels may not only uncover new biology but also improve the generalizability of findings from genome-wide association studies.
In interpreting our findings, we should consider several aspects. First, NHANES is a cross-sectional study and, therefore, we are unable to determine the temporal sequence of our results. Second, the issue of sample size and the ‘curse of dimensionality’ is relevant to this study. As the number of factors under study increases (as with the addition of interaction terms), so does the number of strata. With a set sample size, increasing the number of terms in the model quickly increases the degrees of freedom and reduces the per-stratum sample size, thus decreasing statistical power. For this reason, even with relatively large sample sizes in NHANES, we had to restrict our analysis to SNPs with minor allele frequencies greater than 5%. To better study less-common variants, collaborative studies and/or other non-regression based approaches (such as multifactor dimensionality reduction) may be appropriate, although they are not without their own limitations. Lastly, other potential confounding environmental factors, such as physical activity and alcohol consumption, were not included in the analysis.
A major strength of the study is that NHANES systematically collects environmental exposures in a diverse population. It is important to keep in mind that, beyond sample size, the power to detect gene-environment interactions is influenced by the accuracy of the measurement of the outcome and the environmental exposure. In general, environmental variables are notoriously difficult to collect and quantify. Most environmental factors are assessed by questionnaire, which can lead to certain biases, including under-reporting of risky behaviors. Therefore, biomarkers as quantitative measures of the environmental exposures are preferred. Measures of dietary intake may be assessed by collection of daily food diaries or 24-hour dietary recalls. From these recall data, calculation of fat, vitamin, and mineral content is available in NHANES but these estimates are subject to poor recall. However, serum vitamin A and E levels are easily measured from a blood draw and may be used as a measure of dietary compliance.
The results presented here highlight the fact that effect sizes of gene-environment interactions tend to be small and large sample sizes are needed to detect them. Nevertheless, understanding the mechanism of the interaction between these lipid-associated variants and environmental factors, such as smoking and dietary vitamin E and A intake, is imperative to determining the etiology of a poor lipid profile and could, therefore, have implications in clinical care.
Genotyping in NHANES was supported in part by The Population Architecture Using Genomics and Epidemiology (PAGE) study, which is funded by the National Human Genome Research Institute (NHGRI). Data included in this report resulted from the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) Study, as part of the NHGRI PAGE study (U01HG004798-01). We at EAGLE would like to thank Dr. Geraldine McQuillan and Jody McLean for their help in accessing the Genetic NHANES data. The Vanderbilt University Center for Human Genetics Research, Computational Genomics Core provided computational and/or analytical support for this work. The NHANES DNA samples are stored and plated by the Vanderbilt DNA Resources Core, managed by Cara Sutcliffe. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the National Institutes of Health or the Centers for Disease Control and Prevention.
LD and DCC contributed to conception and design of the study, interpretation of data, and drafting the manuscript. PM, MA, NS-B, HJ, NI, and HD contributed to the collection of the data, and RG and KBG contributed to the analysis of the data.