Nutritional genomics offers a way to optimize human health and the quality of life. It is an attractive endeavor, but one with substantial challenges. It encompasses almost all known aspects of science, ranging from the genomes of humans, plants and microorganisms, to the highest levels of food science, analytical science, computing and statistics of large systems, as well as human behavior. The underlying biochemistry that is targeted by the principal issues in nutritional genomics is described and entails genomics, transcriptomics, proteomics and metabolomics. A major feature relevant to nutritional genomics is the single nucleotide polymorphisms in genes that interact with nutrients and other bioactive food components. These genetic changes may lead to alterations in absorption, metabolism and functional responses to bioactive nutritional factors. Bioactive food components may also regulate gene expression at the transcriptome, protein abundance and/or protein turnover levels. Even if all of these variables are known, additional variables to be taken into account include the nutritional variability of the food (unprocessed and processed), the amount that is actually eaten, and the eating-related behaviors of those consuming the food. These challenges are explored within the context of soy intake. Finally, the importance of international co-operation in nutritional genomics research is presented.
Although many people have become familiar with the term genomics, which was the flagship of science at the National Institutes of Health at the transition into the 21st century, other –omics have also been introduced into our vocabulary. Perhaps naïvely, it was assumed by many in the 1980s and 1990s that if the human genes could be defined, then the causes of diseases and syndromes could be understood sufficiently well to develop strategies (gene replacement/repair, optimal therapeutics or lifestyle changes involving what we eat) that would improve human health. When the sequencing approaches adopted by the National Human Genome Research Institute (1) and latterly Celera (2) yielded only 20,000–24,000 genes instead of the 80,000–100,000 that were expected, many investigators were prompted to rethink how the cell really operates. Besides genomics, many other –omics (transcriptomics, metabolomics, physiological genomics, proteomics, epigenomics, and now nutritional genomics) have been born. Most recently, data from the ENCODE (ENCyclopedia Of DNA Elements) project have suggested that the concept of a gene and the resulting transcriptome may require substantial revision (3,4). The goal of the ENCODE project is to identify all the functional elements in the human genome sequence. Results reported in June 2007 revealed that the majority of the genome is transcribed, including non-protein encoding regions, and that genes extensively overlap each other (3).
This review describes the underlying biochemistry, introduces the variation in the human genome and the role of nutrition associated with nutritional genomics, and presents how the timing of exposure to specific dietary components influences which parts of the genome are expressed. As an example, the role of dietary polyphenols in health is discussed from a nutritional genomics and dietetics point of view. Finally, the importance of the national and international efforts that are in progress to deliver the fruits of this new field is briefly discussed. For overviews of the impact of nutritional genomics on dietetics, see Afman and Müller (5) and Trujillo et al. (6).
In the classical biochemical paradigm, genetic information encoded in genes in DNA (deoxyribonucleic acid) is transcribed by RNA (ribonucleic acid) polymerases to form messenger RNAs (the transcriptome) (Fig. 1). Recent research in the ENCODE project has revealed that, rather than information being drawn from one gene, the transcribed RNA may be a product of more than one adjacent gene (7). The mRNAs are exported from the nucleus and form complexes with ribosomes; these ribonuclear proteins synthesize polypeptides using the mRNAs as templates (Fig. 1). The triplet codon sequence of the mRNA is translated into amino acids one at a time to form polypeptides (Fig. 2). The polypeptides fold to form proteins with a wide variety of properties that are essential for life. These include cytoskeletal structures (actin, tubulin, etc.), membrane transporters, and enzymes that utilize externally derived compounds (glucose, amino acids, fats) to generate energy to power the cell and to synthesize biochemically important intermediates (e.g., amino acids, Coenzyme A thioesters, deoxyribonucleotides and ribonucleotides) needed for the synthesis of the materials required for cell division (that include DNA and RNA) (Fig. 1). Although for multicellular organisms the total available genes are the same for each cell, only a fraction of them is expressed to create the cellular phenotype (the effective genome of that cell type over the life of the cell). Of these expressed genes, while many are converted to proteins (structural proteins, enzymes for intermediary metabolism) that are used throughout the lifetime of a cell, others that are needed to control the timing of the cell cycle are only present transiently. The set of proteins recovered at any moment in the life of a cell is termed the proteome.
A complex set of small molecules in a cell represents its metabolome. The metabolome can be measured in a cell, in tissues (brain, heart, kidney, liver, muscle, ovary, testis, etc.), or in biological fluids (serum/plasma, urine, bile, etc.). It is a function of the genes available in the genome, the expressed genes in the transcriptome, the transporters that move extracellular and intracellular compounds across the cell membrane, and the catalytic activity and organization of the enzymes within a cell. The metabolome is constantly changing. Maintaining the elements of the metabolome within certain ranges is called homeostasis. For some compounds the range that is normal is very narrow, e.g., intracellular ATP (adenosine triphosphate). For others, e.g., plasma glucose, it is broader. However, for glucose there is a lower limit at which various mechanisms are called into play to supply glucose and maintain homeostasis. As glucose levels rise after a meal, insulin released from the pancreas signals cells to increase glucose transport and to store glucose as glycogen and fat. Persistently high levels of plasma glucose, caused by low levels of insulin secretion or by failure to respond to insulin, lead to the complications of diabetes mellitus.
Humans use food as the fuel to run their metabolic engines–the food keeps the lights on and allows the metabolic system to create the pieces that sustain a cell so it can perform its necessary functions, and for undifferentiated cells to undergo division. Is there a particular set of foods needed to sustain metabolism? Fortunately, the answer is no. Foods consist of many different polymers of carbohydrates, fats and proteins. These are broken down to their sugar, fatty acid and amino acid monomeric units through hydrolysis by saccharidases, lipases, and proteases released into the intestine from the pancreas. The value of a food can vary. In order to be optimal, the food should generate sufficient glucose and provide essential fatty acids and amino acids. It is unusual for one food to satisfy all these requirements; thus a diet built on a variety of food sources is usually necessary. For some foods, external fermentation (pre-digestion by microorganisms) is used to make a food more bioavailable or palatable. Milk is converted to cheese and yogurt, whereas soybeans are converted to soy sauce, miso, tempeh and soy paste. An advantage of fermentation can be a shift of the amino acid content towards a more optimal value due to the addition of proteins from the microorganisms to the food.
Besides saccharides, fatty acids and amino acids, the human body has requirements for other essential compounds in the diet including water, vitamins and minerals. Other bioactive food components, many of which are phytochemicals (plant chemicals), are being identified as important to human health. Thus, the complete diet is a complex matrix of food components often supplemented with herbals, botanicals and dietary supplements containing bioactive components and vitamins/minerals. Delivering that complex matrix is the challenge for the dietitian who not only determines the nutrients that need to be combined to make a healthy diet, but also formulates methods based on knowledge of several sciences that ensure that the individual foods are safe and palatable (within the standards of the community in which they operate).
So, if humans have a given set of genes, why aren’t we all the same? And if we all ate a healthy diet, why aren’t we all uniformly healthy? The answer to both questions is that, although Homo sapiens is a discrete species, there are differences in our individual genomes. In the case of well-described single-gene disorders, there are distinct mutations, deletions and additions in certain genes that lead to the absence or dysfunction of the proteins derived from them. Since each parent contributes one copy of every gene (an allele) to their children, in most cases (but not always) both alleles have to be dysfunctional in order for the associated condition to become manifest.
In addition, there are other site-specific differences that occur throughout the genome, on average every 300 bases. These are termed single nucleotide polymorphisms (SNPs). Even if they lie in the open reading frame that is transcribed into messenger RNA and translated into a protein, the differences may be silent if they do not lead to a change in the amino acid that is translated from each codon. For instance, if the codon for glycine shown in Figure 2 (GGU) is mutated at the third nucleotide to form GGA, GGC or GGG, all the resulting codons will still be translated as glycine. On the other hand, for lysine (AAA in Figure 2), while mutation to AAG has no effect, mutation to AAU or AAC results in a change in translation to asparagine. This amino acid remains hydrophilic, but it loses the charge possible on the lysine residue. The outcome may be to alter rather than destroy the enzyme activity of the protein. Mutations of the first or second nucleotides in a triplet codon almost always cause a change in the amino acid that is translated. Gene polymorphisms may also occur in non-coding DNA regions that regulate expression of a gene. If the mutation stops gene expression, the outcome is severe and may result in classification as a genetic disease. However, a mutation that modulates the expression of a gene without abolishing it (i.e., expression remains non-zero) may be much harder to detect, since its effects may fall within what is regarded as physiologically normal.
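The codon examples above can be checked mechanically. The following is a minimal sketch in Python that uses only the portion of the standard genetic code needed for the glycine and lysine examples; the function name `third_base_variants` is ours and purely illustrative.

```python
# Partial codon table (standard genetic code), limited to the codons
# discussed in the text: glycine, lysine and asparagine.
CODON_TABLE = {
    "GGU": "Gly", "GGA": "Gly", "GGC": "Gly", "GGG": "Gly",
    "AAA": "Lys", "AAG": "Lys",
    "AAU": "Asn", "AAC": "Asn",
}


def third_base_variants(codon):
    """Classify every single-base change at the third codon position.

    Returns a dict mapping each mutant codon to "silent" or to a
    string such as "Lys->Asn" describing the amino acid change.
    """
    original = CODON_TABLE[codon]
    effects = {}
    for base in "ACGU":
        mutant = codon[:2] + base
        if mutant == codon:
            continue  # skip the unmutated codon
        new = CODON_TABLE[mutant]
        effects[mutant] = "silent" if new == original else f"{original}->{new}"
    return effects


if __name__ == "__main__":
    print("GGU:", third_base_variants("GGU"))
    print("AAA:", third_base_variants("AAA"))
```

For GGU, all three third-base changes are reported as silent; for AAA, the change to AAG is silent, whereas AAU and AAC are reported as a lysine-to-asparagine substitution, in agreement with the text.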
Proteins have a wide range of sizes from ~50 amino acids (e.g., insulin) to >2,500 (e.g., fatty acid synthase – accession number P49327 at http://www.expasy.org). These proteins are encoded by coding sequences of 150 to >7,500 nucleotides (three nucleotides per amino acid). Accordingly, some genes have no SNPs, whereas others have several. For individuals with SNPs in multiple genes, the number of SNP combinations is restricted – the observed blocks of SNPs are termed haplotypes, combinations of alleles at multiple linked loci that are transmitted together. The pattern of SNPs is a reflection of genetic inheritance, and certain haplotypes may be typical of people living within a particular geographic region. Since the Y-chromosome in males is present in only one copy, it is not subject to shuffling via recombination and the pattern of SNPs therein establishes the paternal bloodline and the geographic origins of an individual’s patrilineal ancestors.
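The link between sequence length and the number of SNPs can be made concrete with a rough calculation. The sketch below takes the average spacing of one SNP per 300 bases and the 150 and 7,500 nucleotide lengths quoted above; the Poisson model for the chance of finding no SNP is our simplifying assumption, not a claim made in the literature cited here, and the function names are illustrative only.

```python
import math

SNP_SPACING = 300  # average number of bases between SNPs, as quoted in the text


def expected_snps(length_nt):
    """Expected number of SNPs in a sequence of the given length."""
    return length_nt / SNP_SPACING


def prob_no_snp(length_nt):
    """Chance of no SNP, ASSUMING SNPs occur independently (Poisson model)."""
    return math.exp(-expected_snps(length_nt))


if __name__ == "__main__":
    for length in (150, 7500):
        print(
            f"{length} nt: expected SNPs = {expected_snps(length):.1f}, "
            f"P(no SNP) = {prob_no_snp(length):.3g}"
        )
```

Under these assumptions a 150-nucleotide sequence is expected to carry 0.5 SNPs and has a better than even chance of carrying none, whereas a 7,500-nucleotide sequence is expected to carry about 25.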
As noted above, SNPs typically lead to altered function of the protein product rather than severe impairment or total loss of function. Therefore, the population of a country may carry a high preponderance of a particular SNP. Of course, with the intermixing of populations that has occurred through migration, new SNPs may be introduced into a population. An example of a gene with a well-known SNP relevant to nutrition and disease is the gene that encodes the enzyme methylenetetrahydrofolate reductase (MTHFR) (accession number P42898 at http://www.expasy.org). This gene has a polymorphism at nucleotide 677 (C or T) resulting in an alanine (677C) or valine (677T) in the encoded protein. Thus, an individual may carry a CC, CT/TC, or TT genotype. The TT genotype is common in northern China (20%), southern Italy (26%) and Mexico (32%) (8). Its frequency is low in those of African ancestry (9). MTHFR is important in the remethylation of homocysteine to methionine. It catalyzes the conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate. Unlike many of the other mutations in MTHFR, the 677TT genotype lowers, but does not abolish, enzymatic activity. Thus, individuals with this genotype have a mild form of hyperhomocysteinemia since they convert homocysteine to methionine less efficiently than those with the common genotype. Alcohol consumption also increases total plasma homocysteine. The effects of both the MTHFR 677TT genotype and alcohol on plasma homocysteine can be offset by increasing the intake of folate (10). Interestingly, 677TT individuals have a lowered risk of colon cancer if they have a folate-supplemented diet (11). This polymorphism exemplifies how the disadvantages of a specific genotype, once known, can be offset by an adjustment in nutrient intake. This association, and the subsequent interventions, is described as nutrient-gene interaction and is part of the field of study of nutritional genomics. It provides an explanation for why a nutritional recommendation that is optimal for a large group may not benefit an individual in the group.
In the above scenario, the mutations in the coding regions of the gene of interest are the dominant issue. However, as noted in Fig. 1, there are feedback mechanisms from the distal part of the gene-transcript-protein pathway. A mutant protein may lead to alteration of the way the expression of genetic information is regulated. Similarly, small molecules produced by concerted enzyme action in the various pathways of the cell can feed back to regulate the activity of the protein and also gene expression via hormone-sensitive receptors. For example, endogenously produced steroids and eicosanoids interact with a variety of receptors that, once activated, alter the expression of a large number of relevant genes. Receptor activation is not restricted to endogenous compounds; some ligands are derived directly from the diet. Plant-derived estrogens (phytoestrogens), such as genistein, coumestrol and zearalenone, bind to the estrogen receptor (12) and may switch on a set of genes similar to that switched on by 17β-estradiol, the physiologic estrogen (13–15). Thus the diet becomes an important regulator of gene expression with the potential to offset the effects of the set of SNPs that an individual may have. The matrix in which the phytoestrogen is delivered may also have a role in gene expression. Su et al. (16) showed that the effects of soy protein isolate (containing the isoflavones daidzein, genistein and glycitein and their β-glycosides as well as other soy matrix components) and genistein on gene expression in rat mammary epithelial cells were essentially independent. Therefore, some of the benefits of a phytochemical may be lost if it is removed from its usual matrix.
Microarray analysis is used to assess changes in the transcriptome. While microarray technology has improved substantially over the past 10 years, it still suffers from being too expensive to overcome the major limitation of its use – the small number of replicates versus the number of parameters (genes) being tested. Under the best analytical circumstances, the number of expected differences between a control and treatment group under the null hypothesis (i.e., there is no difference) with α set at 0.05 for an array where there are 10,000 features (genes) is 500. This issue is not peculiar to microarray analysis – it also applies to proteomic and metabolomic analyses. Early publications of DNA microarray data focused on genes whose expression changed (up or down) 2-fold. These parameters may have led to selecting for genes whose expression changes considerably, but without biological importance (17). The key issue is to have sufficient biological replicates that the variances can be calculated. Another way of looking at microarray data is to determine whether a whole pathway is affected, which may help to identify where a critical gene or its gene product is located.
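The figure of 500 expected differences follows directly from multiplying the number of features by α. The short Python sketch below reproduces that arithmetic and, as an illustration only, simulates a single null experiment. The Bonferroni threshold is included as one standard correction for multiple testing; it is our addition and is not a method discussed in the studies cited here. All function names are hypothetical.

```python
import random


def expected_false_positives(n_features, alpha):
    """Expected number of features called 'changed' when nothing differs."""
    return n_features * alpha


def bonferroni_threshold(n_features, alpha):
    """Per-feature p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_features


def simulate_null_hits(n_features, alpha, seed=0):
    """Count p-values below alpha when every p-value is uniform (null is true)."""
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_features))


if __name__ == "__main__":
    n, alpha = 10_000, 0.05
    print("expected false positives:", expected_false_positives(n, alpha))
    print("Bonferroni cutoff:", bonferroni_threshold(n, alpha))
    print("false positives in one simulated null array:", simulate_null_hits(n, alpha))
```

The simulated count varies with the random seed but stays close to 500, which is why an unadjusted list of "changed" genes from a single small experiment should be treated with caution.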
Another way to sort out likely true-positives is to map the observed changes onto the metabolic, synthetic and signaling pathways in the cell. A group of similar changes in the same pathway may be reasonably regarded as significant. An important feature of microarray research is the standardization and recording of every aspect of an experiment (18). Because data sets that follow the guidelines laid down for such research (the MIAME [Minimum Information About a Microarray Experiment] standards) (19) are publicly available (National Center for Biotechnology Information Gene Expression Omnibus) (20), investigators can download microarray data from experiments that are similar to the ones they are planning in order to calculate the statistical power needed for their experiment. Similarly, statistical tests of the genes and pathways changed by a nutrient can be strengthened by combining data sets. This approach has not yet come into practice in proteomics and metabolomics, but presumably this will occur just as it has for microarray analysis.
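As a rough illustration of how a variance estimate taken from comparable public data could feed into a power calculation, the sketch below uses the textbook normal approximation for comparing two group means. The formula choice, the function name and every numerical input (standard deviation, effect size, α and power) are our assumptions for illustration, not values taken from the cited studies.

```python
import math
from statistics import NormalDist


def replicates_per_group(alpha, power, sd, delta):
    """Approximate biological replicates per group for a two-sided,
    two-sample comparison of means (normal approximation).

    alpha : per-gene significance level
    power : desired probability of detecting a true difference
    sd    : between-replicate standard deviation (e.g., estimated from
            public data on a similar experiment)
    delta : smallest difference in means worth detecting (same units as sd)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: a 2-fold change (1 unit on a log2 scale)
    # with a standard deviation of 1 log2 unit.
    print(replicates_per_group(alpha=0.05, power=0.80, sd=1.0, delta=1.0))
    # The same comparison with alpha divided by 10,000 features.
    print(replicates_per_group(alpha=0.05 / 10_000, power=0.80, sd=1.0, delta=1.0))
```

Tightening α to account for the number of genes tested raises the required number of replicates several-fold, which illustrates why cost limits the statistical power of many array experiments.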
A feature of growth and development is the programmed switching on and switching off of sets of genes that are necessary at different stages of life. Not all genes in a cell are “on” at any one time. When they are switched off, the intention is often that they stay that way. The on/off expression of a gene is regulated by the extent to which it is bound to chromatin in the nucleus. Nuclear DNA is packaged by histone proteins into a complex. The “openness” of the complex (to allow RNA polymerase to transcribe a gene) is regulated by acetylation (to relax chromatin) and methylation (to bind the DNA more tightly). Since there are several modification sites on histones, the concept of a histone code to augment that of gene transcription and translation has been proposed (21). Recent evidence suggests that nutrients/phytochemicals can alter the degree of histone acetylation. Administration of the sulphoraphane present in Brassica (broccoli) to human embryonic kidney cells and to human colorectal tumor cells inhibited histone deacetylase (22). Importantly, sulphoraphane itself was not active; rather, the activity resided in its metabolites formed after reaction with glutathione. Sulphoraphane has also been shown to inhibit histone deacetylase activity in human subjects, albeit transiently (23,24). This type of modulation of gene expression is termed epigenetics (changing expression without altering the DNA nucleotide sequence) and may turn out to be crucial in terms of the benefit of the diet in preventing chronic disease.
Polyphenols enter the diet from a wide variety of edible plants. As part of the activities of the University of Alabama at Birmingham’s Center for Nutrient-Gene Interaction, three polyphenols (the isoflavone genistein from soy, the stilbene trans-resveratrol from grapes and epigallocatechin-3-gallate from green tea) are being investigated for their roles in the prevention of breast cancer. Evidence from rodent studies suggests that the chemopreventive effects of genistein (25) and proanthocyanidin-rich grape seed extract (26) depend on the animals being exposed to genistein or soy at the time of weaning and puberty. Epidemiological data suggest that exposure to soy (in the form of tofu) in adolescence is critical for the beneficial effect of soy in lowering breast cancer risk (27,28). Similar to genistein, trans-resveratrol administered in the diet (1000 ppm) from birth to weaning in rats and then from 100 days of age caused a 50% reduction in the number of mammary tumors induced by carcinogens (29). In contrast, epigallocatechin-3-gallate administered in the drinking water had no effect in this model.
Do polyphenolics have a significant effect in preventing chronic diseases such as cancer? Many of the variables mentioned earlier that constitute nutritional genomics come into play and are not limited to the human subject. Even the soybean produces variable amounts of isoflavones. This has a genetic basis that is a function of the soybean strain, an environmental basis (temperature, humidity, altitude and sunlight) (30,31), and the time of harvesting (32). Thus a gram (or ounce) of a soybean or a soy food is not a unit of amount of isoflavones and therefore food frequency questionnaires should be augmented by analysis of what is being eaten by the subject group under study (33). Furthermore, the processing that takes place in the production of a soy food alters the composition of the isoflavones (Fig. 3). As examples, many of the Asian-style forms of soy foods (soy paste, miso, tempeh, soy sauce) are fermented products (34). Fermentation removes the glycosidic moiety that is chemically bound to isoflavones in soybeans. The unconjugated isoflavone is rapidly absorbed in the upper gut (35). In addition, fermentation leads to altered chemistry of the isoflavone, typically hydroxylation of the A-ring. Soy sauce can also be prepared by chemical treatment of soybeans. However, this leads to almost total loss of isoflavones. Most American-style forms of soy involve recovery of the soy protein fraction from soybeans without fermentation and tend to preserve the glycosidic conjugates; indeed, heating (toasting) of the soy protein can convert the isoflavone-6″-O-malonylglucoside to the 6″-O-acetylglucoside (36,37). This form must enter the colon before it can be hydrolyzed to unconjugated isoflavones, which leads to a more rapid conversion to secondary bacterial metabolites of isoflavones (dihydrodaidzein, O-desmethylangolensin and S-equol). Even the colonic bacteria are a variable, as these are a function of what we eat and drink. 
Approximately one in three individuals is an equol-producer. Some have suggested that equol production is tightly linked to the beneficial effects of a soy diet (38–41). This is believed to be due to the interaction of S-equol with the estrogen receptor beta (42). Perimenopausal women who are equol producers retain bone better than non-equol producers (38). In a recent randomized, controlled, parallel study of 62 adults with hypercholesterolemia on a step II diet containing 80 g of pasta with or without soy germ isoflavones, equol producers had improved serum lipid and other cardiovascular parameters compared to non-producers (43).
A polyphenol-rich diet is consistent with the 8–10 servings per day of fruits and vegetables recommended by both the Centers for Disease Control and Prevention (44) and the National Heart, Lung and Blood Institute (45). Isoflavones are the major polyphenols in soybeans and soy foods; proanthocyanidins are prominent in apples, grapes and many berries; catechins (flavanols) are in teas, particularly green tea; resveratrol is in peanuts and grape skin. Integrating these foods into a healthy and tasty diet is attainable. In 2007 fast food outlets and schools began to provide fruit and salad alternatives to the ubiquitous French fry. This may be in response to increasing awareness of obesity and changes in eating-related behavior by the public.
What becomes clear as the description of nutritional genomics unfolds is that it consists of numerous dimensions: the genome (the complement of genes and their mutations that are available to be expressed), the transcriptome (both protein-encoding and non-protein encoding RNAs), the proteome (primary polypeptides, their posttranslational modifications and protein complexes), the epigenome (methylated DNA and methylated and acetylated histones), the metabolome (the small endogenous molecules that create energy and are used for the synthesis of complex bioactive intermediates including lipids, complex carbohydrates, proteins, and nucleic acids), the nutribiome (food-derived xenobiotics) and the xenobiome (other compounds derived from sources outside of the body, including pollutants). Analyzing all of these variables is a substantial challenge to the investigator.
As for all studies of populations, their genes and the factors from the diet that interact with them, a major hurdle to surmount is the need to have sufficient statistical power to be able to draw firm conclusions. Researchers need a large enough study group to satisfy the statistical requirements and sufficient grant support to carry out such large studies. The ability to conduct suitably powered clinical studies is a problem even in the USA and at the most prestigious institutions. Koushik et al. (46), using data gathered from the Nurses’ Health Study and the Health Professionals Follow-up Study cohorts, were able to show in a study of 376 men and women with colorectal cancer and 849 control subjects that patients with the MTHFR 677TT genotype had a reduced risk of colorectal cancer (odds ratio 0.66; 95% confidence interval, 0.43–1.00), but could not discern a relationship with dietary methyl status. They concluded that their work lacked the requisite statistical power.
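The borderline nature of the confidence interval quoted above can be illustrated with a back-calculation. The sketch below assumes that the published interval is a Wald interval, symmetric on the logarithmic scale; that is a common construction, but it is our assumption about how the interval was derived, and the published figures are rounded. The function names are illustrative only.

```python
import math
from statistics import NormalDist


def implied_log_or_se(lower, upper, level=0.95):
    """Standard error of log(odds ratio) implied by a Wald confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (math.log(upper) - math.log(lower)) / (2 * z)


def wald_p_value(odds_ratio, se):
    """Two-sided p-value for the null hypothesis that the odds ratio is 1."""
    z = math.log(odds_ratio) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    se = implied_log_or_se(0.43, 1.00)
    print(f"implied SE of log(OR): {se:.3f}")
    print(f"approximate two-sided p-value: {wald_p_value(0.66, se):.3f}")
```

Under this assumption the implied p-value lies just above 0.05, consistent with an upper confidence limit of 1.00 and with the authors' own conclusion that the study lacked the requisite statistical power.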
Nutrient-gene interaction is becoming an important part of public health policy throughout the world as each country strives to ensure the health and productivity of its peoples as efficiently as possible through prevention strategies. Accordingly, a group of investigators with expertise in nutritional genomics from 21 different countries and 5 continents called for an international alliance to address the most important questions (47). Coordination of research in nutritional genomics is provided by the European Nutritional Genomics Organization (48) and the Nutritional Genomics Society (49). What is envisaged by these groups is cooperative, multinational studies of the major questions in nutritional genomics. This was discussed in depth at a 2007 conference entitled “Who We Are and What We Eat: The Role of Metabolomics and Nutritional Genomics in Creating Healthful Foods and Healthy Lives” (50). The advantages of such cooperation are analogous to those provided by binocular vision or large-array radio telescopes. However, it depends on international standards for how nutritional genomics projects are to be conceived, executed and analyzed. For now, it is work in progress.
Support for the UAB Center for Nutrient-Gene Interaction in Cancer Prevention is provided by a grant-in-aid (U54 CA100949, S. Barnes, PI) from the National Cancer Institute. Support for research on botanicals and dietary supplements at the Purdue University-University of Alabama at Birmingham Botanical Center for Age-related Disease is provided by a grant (P50 AT00477, Connie M. Weaver, PI) from the National Center for Complementary and Alternative Medicine and the NIH Office of Dietary Supplements. The editorial advice provided by Ruth DeBusk during the writing of the manuscript is warmly appreciated.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.