Epigenomes are comprised, in part, of all genome-wide chromatin modifications including DNA methylation and histone modifications. Unlike the genome, epigenomes are dynamic during development and differentiation in order to establish and maintain cell type-specific gene expression states that underlie cellular identity and function. Chromatin modifications are particularly labile, providing a mechanism for organisms to respond and adapt to environmental cues. Results from studies in animal models clearly demonstrate that epigenomic variability leads to phenotypic variability including susceptibility to disease that is not recognized at the DNA sequence level. Thus, capturing epigenomic information is invaluable for comprehensively understanding development, differentiation, and disease. Herein, we provide a brief overview of epigenetic processes, how they are relevant to human health, and review studies utilizing technologies that enable epigenome mapping. We conclude by describing feasible applications of epigenome mapping, focusing on epigenome-wide association studies (eGWAS), which have the potential to revolutionize current studies of human diseases and will likely promote the discovery of novel diagnostic, preventative, and treatment strategies.
In contrast to the genome, which remains largely unchanged in most cells of an organism, epigenomes are the product of a gradual commitment of cell lineages to more constrained patterns of gene expression throughout development that is, in part, shaped by the environment. An epigenome can be defined as the combination of all genome-wide chromatin modifications in any given cell type that directs its unique gene expression pattern. Epigenomes are labile during development1, are responsive to extrinsic factors2, are altered in disease3, and even differ among individuals with identical genetic composition4.
The principal chromatin modifications include DNA methylation and post-translational modifications of histones that package the DNA in nucleosomal units. Although both DNA methylation and histone modifications appear to be meiotically and/or mitotically heritable, only the former epigenetic process is backed with strong mechanistic support for heritability5. As we discuss in this review, accumulating evidence from mammalian studies indicates that variability of epigenetic modifications of chromatin during development and in response to distinct environmental factors directly contributes to adult phenotypic variability and disease susceptibility that could not previously be accounted for by DNA sequence alone. Thus, characterizing genome-wide chromatin modification patterns (i.e. epigenome mapping) may aid in the discovery of disease-causing genes in humans for which non-genetic factors are clearly involved and confound DNA sequence-based association studies of disease6, 7. Epigenome mapping across different cell types and developmental periods from normal individuals, in a manner analogous to the efforts of sequencing the human genome, is an essential prerequisite.
Technological advances allowed characterization of the first human epigenome in CD4+ T cells, which describes 38 histone modifications including both histone methylation and acetylation8–10. Similar technologies have now enabled truly genome-wide analyses of DNA methylation11, 12. Taking advantage of these and other technologies, the recently launched International Human Epigenome Consortium (IHEC) aims to map 1000 reference epigenomes within a decade13. Herein, we provide a brief overview of epigenetic processes and how they are relevant to human health and review initial studies utilizing epigenome mapping. In particular, we discuss the evidence that epigenetic processes offer a mechanistic link between genetic determinants and environment during development, and address how results of epigenome mapping might be applied to measure previously unobservable inter-individual variability that could account for differences in disease susceptibility.
In general, epigenetic processes including DNA methylation and histone modifications are thought to modulate the accessibility of cis-DNA elements to trans-acting factors by regulating chromatin structure14. In plants, RNA interference contributes to epigenetic regulation; however, whether there is a conserved mammalian equivalent remains unclear15, and this topic will not be discussed further here. Similarly, there is considerable debate over whether chromatin remodeling1 should be considered an epigenetic mechanism, as it is unknown whether this process can transmit memory of cell fate from one generation to the next.
Epigenetic regulation of chromatin structure consequently influences gene expression. This type of control is exemplified by the phenomenon of genomic imprinting, the mono-allelic expression of genes in a parent-of-origin-dependent manner. Critical cis-elements that regulate expression of imprinted genes called imprinting control regions (ICRs) exhibit parental-specific chromatin modifications including DNA methylation and histone modifications that govern their activity16. Similarly, as a form of dosage compensation between female and male eutherians, X-chromosome inactivation (XCI), mediated primarily by Xist and Tsix genes, is accompanied by chromosome-specific histone modifications associated with heterochromatin and gene silencing at the two-cell stage of early embryonic development17. Once established, XCI is maintained by DNA methylation-mediated gene silencing in cells of the embryo proper. As genomic imprinting and XCI have been reviewed extensively elsewhere16, 17, we focus on the global distribution and functional importance of DNA methylation and histone modifications as they relate to cellular phenotype.
Perhaps the best understood epigenetic mechanism is DNA methylation, which in mammals occurs almost exclusively within 5’-cytosine-guanine-3’ dinucleotides (CpGs), although CpNpG methylation has also been detected18. DNA methylation is typically associated with gene silencing, either by affecting the binding of methylation-sensitive DNA binding proteins and/or by interacting with various modifications of histone proteins that alter DNA accessibility at promoters19. Once established by the de novo methyltransferases DNMT3a and DNMT3b20, DNA methylation is maintained through mitosis primarily by the DNMT1 enzyme, which associates with PCNA and the replication foci and has a significant preference for hemi-methylated DNA following DNA replication21. This mechanism allows for perpetuation of the DNA methylation state in newly formed cells, a well-established mode of epigenetic inheritance.
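The maintenance step described above can be illustrated with a toy model. The following Python sketch is not drawn from any cited study: the function names, the boolean encoding of CpG methylation, and the `fidelity` parameter are all assumptions made for illustration. It represents each CpG on a strand as methylated or not, produces an unmethylated daughter strand at replication, and then copies the parental mark onto hemi-methylated sites, as a DNMT1-like activity would.

```python
import random


def replicate(parent_strand):
    """Semi-conservative replication: the new strand starts fully unmethylated."""
    return [False] * len(parent_strand)


def maintain(parent_strand, daughter_strand, fidelity=1.0, rng=None):
    """Copy methylation onto hemi-methylated CpGs (toy DNMT1-like step).

    `fidelity` is a hypothetical per-site probability that a hemi-methylated
    CpG is restored; it is an illustrative parameter, not a measured value.
    """
    rng = rng or random.Random(0)
    restored = []
    for parent_me, daughter_me in zip(parent_strand, daughter_strand):
        hemi_methylated = parent_me and not daughter_me
        if hemi_methylated and rng.random() < fidelity:
            restored.append(True)  # mark copied from the parental strand
        else:
            restored.append(daughter_me)  # unmethylated sites stay unmethylated
    return restored


# Toy pattern over five CpG sites (True = methylated).
pattern = [True, False, True, True, False]
daughter = maintain(pattern, replicate(pattern))
no_maintenance = maintain(pattern, replicate(pattern), fidelity=0.0)
```

With perfect fidelity the daughter strand reproduces the parental pattern, whereas without any maintenance activity the pattern is lost on the new strand, which is the sense in which this mechanism supports mitotic inheritance.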
Several studies suggest multiple functional roles for DNA methylation including silencing transposable elements22, mediating developmental gene regulation23, and reducing transcriptional noise24. Indeed, DNA methylation in mammals is essential for embryonic development25, differentiation26, and cell cycle control27. Additionally, DNA methylation plays critical roles in maintaining transcriptional silencing of genes on the inactive X-chromosome and imprinted genes28. Human diseases have been associated with abnormal DNA methylation patterns including cancer29, ICF (immunodeficiency, centromeric instability, facial anomalies) syndrome, ATRX (alpha-thalassemia, mental retardation) syndrome, and fragile X syndrome30. Even mutations in MeCP2, a gene that encodes a protein that ‘reads’ DNA methylation signals and is thought to act as a transcriptional silencer, are associated with the autism spectrum disorder Rett syndrome31. In plants, DNA methylation deficiency results in spurious transcription initiation from cryptic sites, demonstrating a role for methylation in reducing transcriptional noise24. A central theme of these findings is that DNA methylation functions to maintain a repressed chromatin state, thereby stably silencing promoter activity32. It is important to note, however, that DNA methylation is not always associated with gene silencing. Some studies have demonstrated that DNA methylation can augment expression of an imprinted gene by blocking the binding of repressor proteins to silencer elements within the gene33. Furthermore, additional novel roles of DNA methylation have been suggested by recent epigenome mapping studies (discussed below).
In addition to the covalent modification of DNA, histone proteins that package nuclear DNA are known to be subjected to a wide array of post-translational modifications on specific residues along their NH2-terminal ‘tails’ that project outside of the nucleosome core. Such modifications include acetylation, methylation, phosphorylation, ubiquitination, sumoylation, and potentially others that remain to be discovered34, 35. Since these modifications contribute to chromatin conformation, the potential information stored within different combinatorial patterns of these modifications led to the hypothesis of a ‘histone code’, which proposes that specific combinations of these modifications dictate locus-specific transcriptional competence36. Although the specific functional consequences of each of these modifications are insufficiently understood, many of the enzymes responsible for adding and removing them have been identified35. In addition, alterations of the genes encoding histone-modifying enzymes and perturbations to the modification patterns they produce are associated with disease37, underscoring the importance of histone modifications in normal development. For example, the Polycomb-group protein EZH2, a histone H3 lysine 27 (H3K27) methyltransferase, was overexpressed in prostate cancer38, while global loss of H4K16 acetylation and H4K20 trimethylation (H4K20me3) was observed in lymphoma and colon cancer39.
One of the best understood histone modifications is the reversible acetylation of lysine residues on histones H3 and H440. Acetylation is accomplished by specific histone acetyltransferases (HATs); it neutralizes the net positive charge of the histone tail and facilitates access of transcription factors to the underlying DNA41. While histone acetylation is clearly associated with transcriptional activation42, the reverse reaction catalyzed by histone deacetylases (HDACs) increases the tail’s positive charge, lowers the transcription potential of the underlying DNA, and is associated with a transcriptionally repressed state43. Distinct from acetylation, methylation of histones can be associated with transcriptional activation or repression depending on the specific residue targeted and degree of methylation35, 36. For example, H3K4me3 is frequently enriched at gene promoters, many of which are active44, 45. Localization of H3K4me3 at inactive promoters, however, facilitates the transient binding of HATs and HDACs to maintain the promoters in a repressed but poised state for future activation46. How histone methylation accomplishes gene repression is not fully understood; however, it is known that methylation of certain lysine residues, such as H3K9, acts as a docking site for heterochromatin protein 1 (HP1), which in turn recruits histone methyltransferases47. Other modifications are not restricted to promoters, including the ‘activating’ modification H3K36me3, which is distributed over the gene bodies of actively expressed genes8, 48. Although histone phosphorylation, ubiquitylation, and sumoylation of specific serine, threonine, or lysine residues are involved in transcriptional regulation, modifications on other residues are associated with mitosis, DNA repair, and apoptosis34. The precise functions of many of these modifications remain under investigation.
New evidence suggests the possibility that histone modifications could be mitotically heritable, though this is debated. In yeast, for example, the heterochromatic states mediated by the interactions of hypoacetylated histones and silent-information-regulator proteins or H3K9 methylation and the Swi6 chromodomain are maintained through cell division49. In flies, H3K27 and H3K4 methylation, catalyzed by Polycomb-group and trithorax-group protein complexes respectively, mediate mitotic inheritance of lineage-specific gene expression patterns50. This is in part accomplished by EED, a constituent of the Polycomb repressive complex 2 (PRC2), which binds specifically to histone tails, activates the methyltransferase activity of PRC2, and facilitates propagation of the H3K27me3 mark upon mitosis51, 52. The extent to which other histone modifications are heritable remains unclear. However, histone modifications along with DNA methylation undergo considerable changes during development53 when the epigenome is particularly vulnerable to environmental influences, as later discussed.
Recent technological advances in DNA sequencing have, in part, enabled epigenome mapping. Data derived from these technologies provide unprecedented insight into the distribution, interplay, and potential novel functions of chromatin modifications and associated proteins. While other reviews explain these technologies in greater detail54–58, we focus on the currently available methods that utilize next-generation sequencing to map epigenomes and discuss unexpected findings.
Several strategies allow detection of DNA methylation, including digestion of DNA by methylation-sensitive or -insensitive restriction endonucleases59, the chemical modification of DNA by sodium bisulfite60, immunoprecipitation of 5-methylcytosine to separate unmethylated and methylated fractions of the genome61, and enrichment of methylated DNA using DNA binding proteins62. These strategies have been coupled to high-throughput technologies. Although the genomic coverage and resolution vary among the different strategies, each offers unique advantages. Thus, these approaches are complementary and would potentially generate mutually reinforcing and comprehensive genome-wide DNA methylation data.
Of these strategies, the one that provides the best resolution of cytosine methylation incorporates sodium bisulfite treatment of DNA. Sodium bisulfite chemically converts all unmethylated cytosines to uracil by hydrolytic deamination of the 5,6-dihydrocytosine-6-sulphonate product at the C4 position of the pyrimidine ring. However, a methyl group at the C5 position inhibits this reaction, protecting methylated cytosines from conversion63, 64. Upon PCR amplification using primers corresponding to the bisulfite-converted sequences of interest, the products are traditionally cloned and sequenced to evaluate CpG methylation throughout the amplified region65. Several applications exploiting this ‘gold-standard’ bisulfite-conversion method have been adapted to evaluate methylation globally using next-generation sequencing technology including a shotgun-sequencing approach66, a reduced representation shotgun-sequencing approach67, and a targeted approach using bisulfite padlock probes68. Bisulfite treatment of genomic DNA combined with ultra-high-throughput sequencing using Illumina technology allows sensitive measurement of cytosine methylation on a genome-wide scale within specific sequence contexts at single nucleotide resolution66. This technology has been successfully applied to the Arabidopsis thaliana genome66, 69, to the mouse genome66, and more recently to the human genome11, 12, impressively accounting for >90% of all cytosines genome-wide. However, a major disadvantage of this approach is its high cost, since a very large amount of sequencing is required to yield comprehensive and accurate methylation data. To overcome this limitation, a semi-random approach termed ‘reduced representation bisulfite sequencing’ (RRBS) has been devised67.
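The logic of bisulfite-based methylation calling described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the pipeline used in any of the cited studies: the function names are hypothetical, reads are assumed to be ungapped, full-length, and already aligned to the forward strand of the reference, and the toy sequence and methylation states are invented for the example. Unmethylated cytosines are read as T after conversion and PCR, whereas methylated cytosines remain C, so the methylation level at each reference cytosine is estimated as the fraction of reads that still show C.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C is converted (read as T); methylated C is protected.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, reads):
    """Estimate per-cytosine methylation as the fraction of reads showing C."""
    levels = {}
    for i, base in enumerate(reference):
        if base != "C":
            continue
        calls = [read[i] for read in reads if read[i] in "CT"]
        if calls:
            levels[i] = calls.count("C") / len(calls)
    return levels


# Toy reference with cytosines at positions 1, 4 and 7 (0-based).
reference = "ACGTCGAC"

# Four simulated molecules: position 1 is always methylated,
# position 4 is methylated in half of them, position 7 never.
reads = [
    bisulfite_convert(reference, {1}),
    bisulfite_convert(reference, {1, 4}),
    bisulfite_convert(reference, {1}),
    bisulfite_convert(reference, {1, 4}),
]
levels = call_methylation(reference, reads)
```

The per-site fractions obtained this way are what gives the approach single-nucleotide resolution; the number of reads covering each cytosine determines how precisely each fraction is estimated, which is why comprehensive coverage demands so much sequencing.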
In RRBS, genomic DNA is first digested with specific restriction enzymes whose recognition sequences are disproportionately represented in GC-rich DNA, such as BglII or MspI; fragments of a certain size range are then selected, subjected to adaptor ligation, treated with sodium bisulfite, PCR amplified, and sequenced. Although this approach provides useful nucleotide-level quantitation of cytosine methylation, the selection of fragments in a given size range precludes whole-genome coverage and introduces a bias towards CpG-rich regions of the genome that typically include unmethylated CpG islands in normal cells. Nevertheless, RRBS has proven useful for cancer studies that require the rapid, high-throughput analysis of aberrant DNA methylation patterns in multiple samples70. An alternative approach that eliminates the potential bias of methods targeting CpG islands yet offers nucleotide-level resolution involves the use of bisulfite padlock probes (BSPPs) to capture selected locations of the genome for methylation profiling68. However, the genome-wide coverage of DNA methylation is limited by the number of BSPPs used per reaction, thus precluding a complete analysis of DNA methylation throughout the genome.
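The digestion and size-selection steps of RRBS can be mimicked in silico. The Python sketch below is illustrative only: the function names are hypothetical, the toy sequence is invented, and the size window is deliberately tiny so that it works on a 16-base example rather than reflecting any real protocol. It cuts at the MspI recognition site (CCGG, cleaved after the first C) and keeps the fragments that fall within the chosen length range.

```python
def mspi_digest(seq):
    """Cut a sequence at every MspI site (C^CGG) and return the fragments."""
    fragments = []
    start = 0
    pos = seq.find("CCGG")
    while pos != -1:
        cut = pos + 1  # MspI cleaves between the first C and the CGG
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find("CCGG", pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut site
    return fragments


def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length lies within [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence containing two MspI sites.
genome = "AACCGGTTTTCCGGAA"
fragments = mspi_digest(genome)

# Toy size window (assumption for this example only).
selected = size_select(fragments, min_len=6, max_len=10)
```

Because every retained fragment begins or ends at a CCGG site, the selected fraction is enriched for CpG-containing sequence, which illustrates both the cost advantage of the method and the coverage bias noted above.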
Other strategies avoid the use of sodium bisulfite altogether and instead enrich for methylated DNA. One approach takes advantage of an antibody targeted to single methylated cytosines in order to directly capture methylated sequences within genomic DNA. Termed methylated DNA immunoprecipitation (MeDIP), this approach allows for relative quantitation of DNA methylation genome-wide when coupled to comparative genomic hybridization microarrays, as was applied to human61 and plant71 samples. More recently, MeDIP has been adapted to high-throughput sequencing using Illumina technology (MeDIP-Seq)72, providing more complete genome-wide coverage of DNA methylation than microarrays can achieve. Another way to enrich for methylated DNA is to use a methyl-binding domain (MBD) fused to human IgG73 or MBD proteins bound to a sepharose matrix62 before hybridizing onto an array. Irrespective of the platforms used to measure output DNA methylation signals, limitations of these affinity approaches include the bias that methylated CpG-rich sequences give higher enrichments than equally methylated yet CpG-poor sequences, the inability to measure DNA methylation of individual repetitive elements, and the lower resolution of methylation detection relative to bisulfite-sequencing. To overcome these limitations, a novel approach called Methyl-MAPS (methylation mapping analysis by paired-end sequencing) has recently been developed that provides single nucleotide resolution of DNA methylation, covers >80% of CpG sites within mammalian genomes, and enables quantitation of methylation at repetitive elements in addition to single-copy loci by combining enzymatic fractionation and deep sequencing74.
These epigenome mapping techniques afforded an unprecedented opportunity to observe the distribution of DNA methylation over the majority of the genome leading to some surprising discoveries. The majority of gene promoters were depleted of DNA methylation, whereas there was markedly increased cytosine methylation within gene bodies. The large scope of DNA methylation within gene bodies, as observed for a third of all genes in the Arabidopsis genome66, 69 and for a substantial number of genes in humans11, 12, 68, 75, suggests a potentially novel role for DNA methylation that seemingly contradicts the dominant model for DNA methylation-mediated gene silencing. Specifically, higher levels of DNA methylation were detected within gene bodies of actively expressed genes compared to silent genes. Suppression of spurious initiation of transcription within highly active genes76, modulation of transcriptional elongation77, regulation of pre-mRNA splicing12, and tissue-specific alternate promoter usage (Maunakea, A.K. et al, Nature, in press) are potential novel roles of gene body methylation. That substantial gene body methylation is an evolutionarily conserved feature of eukaryotic genomes78–80 further supports a biological role and warrants continued investigation.
Another unexpected common finding of these genome-wide DNA methylation studies was that, in contrast to the long-held view that cytosines within CpG islands remain free of DNA methylation, a subset of CpG island-containing promoters were in fact methylated in normal cells81–84. Consistent with previous gene-centric studies in cancer, this normal promoter methylation was, in general, negatively correlated with the expression state of the underlying gene. In studying normal embryonic stem cell differentiation, gain or loss of promoter methylation was observed concomitant with loss or gain of expression, respectively, of the associated gene11, 12; many of these genes have previously defined roles in embryonic stem cell function. Interestingly, it was also recognized that unlike somatic cells, embryonic stem cells harbored substantial levels of non-CpG cytosine methylation11, 12. Genome-wide, non-CpG methylation was diminished in human ES cell-differentiated fibroblasts12 and significantly gained in embryonic-like induced pluripotent stem cells generated from primary human fibroblasts11, which together suggest a novel role of non-CpG methylation in the origin and maintenance of the pluripotent stem cell state.
Finally, a landmark study using a more limited genome-wide DNA methylation fingerprinting technique called AIMS observed remarkable differences in the DNA methylation profiles of monozygotic twins, particularly those that were older, had different lifestyles, and had spent less of their lives together4. These results implicate a probabilistic accumulation of epigenetic variability, likely due to environmental differences experienced by these genetically identical individuals during their lifetimes, which may be involved in the etiology of monozygotic twin discordance for common diseases and traits85. Further support of this notion comes from a recent DNA methylation profiling study that uncovered epigenetic differences that might contribute to the discordance for the autoimmune disease systemic lupus erythematosus (SLE) among monozygotic twins86.
Additional epigenome mapping of DNA methylation profiles across a larger number of cell types and individuals, coupled with global gene expression data, should reveal a more comprehensive understanding of the prevalence of normal promoter methylation, non-CpG methylation, and inter-individual variability to elucidate the epigenetic determinants of normal differentiation and disease susceptibility. Notably, the studies of DNA methylation described above focus exclusively on 5-methylcytosine. However, 5-hydroxymethylcytosine has recently been detected in mammalian DNA87, 88, yet its in vivo function remains unclear in part because current technologies, including the gold-standard bisulfite-sequencing approach, fail to distinguish between the two types of DNA modifications89. MeDIP approaches using recently introduced anti-5-hydroxymethylcytosine antibodies may overcome this critical limitation.
Most of the existing methods for studying histone modifications on a genomic scale combine the use of chromatin immunoprecipitation (ChIP) with high-throughput technologies including DNA microarrays (ChIP-chip) or massively parallel sequencing (ChIP-Seq). The ChIP technique relies on the isolation of individual chromatin fragments using an antibody specific to a particular feature of the chromatin fragments, including DNA-binding proteins, histone modifications, and nucleosomes. Since these techniques have been reviewed elsewhere56, 90, 91, we will only mention that applications of these techniques across multiple species including yeast92–94, fly95, mouse96, 97, and human44, 98–100 have revealed that distinct genomic regions such as enhancers, promoters, and gene bodies have distinct histone modification patterns8, 9, 101–104. Thus, mapping these modifications globally may provide a more precise functional annotation of genomes. In addition, detecting lesser-known histone modifications including phosphorylation and sumoylation using chemical approaches such as peptide synthesis, mutagenesis, and in vitro nucleosomal arrays could complement in vivo ChIP results in elucidating their function(s)105.
Our epigenome mapping data of 38 histone modifications including methylation and acetylation using ChIP-Seq in normal primary human CD4+ T cells have revealed remarkably consistent categorical patterns across gene regions depending on the transcriptional activity of the underlying gene8, 9. For example, histone modifications across actively expressed genes can be separated into at least four categories based on their general patterns and may correspond to their distinct functions in transcription (Fig. 1). ‘Active’ histone modification marks highly enriched within gene promoters, for instance, may be involved in transcription initiation (e.g. H3K4me3), whereas those that are intragenic may be involved in elongation, termination, or pre-mRNA splicing (e.g. H3K36me3)8, 9, 46, 106, 107. Similarly, other histone modifications across silent genes adopt distinct patterns possibly impairing or preventing transcription initiation when enriched at promoters (e.g. H3K27me3) or disrupting elongation when enriched throughout gene bodies. ‘Silent’ histone marks such as H3K27me3 seem to function dominantly over ‘active’ histone marks such as H3K4me3, as exemplified by the co-localization of both marks (i.e. bivalent domains) over inactive promoters8, 44, 97, 103. These results suggest that histone modifications delineate functional features of genes, including their structure and intrinsic regulatory elements. Indeed, an integrated analysis of these epigenome profiles with that of non-promoter DNase hypersensitive sites, putatively marking cis-regulatory elements, showed enrichment of specific histone modifications within these sites including H3K9me1 and H3K4me1/me2/me38. More generally, based on their association with gene expression status, all histone acetylations and most histone methylations we examined were considered as ‘active’ marks, while only five histone methylations (i.e. H3K9me2, H3K9me3, H3K27me2, H3K27me3 and H4K20me3) were considered as ‘repressive’ marks9. 
Additionally, previous observations indicated that certain histone acetylations and methylations are enriched at and potentially mark enhancer elements8, 100, 102; however, there is some debate over the precision of these marks. For example, one study found that enhancers are associated with H3K4me1 but not H3K4me3 marks101, whereas we and others have shown that H3K4me2/3 and H3K9me1 marks are present at many potential distal regulatory regions including well-characterized enhancers for IFNγ, Th2 cytokine genes, and FOXA18, 9, 108. These studies suggest that enhancers may be associated with a variety of epigenetic patterns in a likely context-dependent manner. It should be noted, however, that although these studies show consistent association of specific histone marks with transcriptional states, the precise functional role of these modifications in transcription and whether they are a cause or consequence of gene expression states are still unclear.
In addition to these static observations, comparative ChIP-Seq analyses in stem/progenitor cells and differentiated cells revealed the remarkably dynamic processes of histone modifications genome-wide10, 109. The co-localization of the transcriptionally permissive H3K4me3 and repressive H3K27me3 histone modification marks was detected at promoters in a subset of genes in embryonic stem cells97, 110 and somatic cells8, 44. This bivalent configuration is presumed to maintain the underlying gene in a transcriptionally silent but poised state8, 97. In ES cells, bivalent domains target and potentially regulate the expression of key developmental genes, as resolution of bivalent marks to a monovalent state appears to be associated with the establishment of lineage-specific gene expression patterns during differentiation97, 111, 112, a potentially universal paradigm applicable to multipotent cells. One recent example of such a resolution occurred exclusively on the transcriptionally competent tissue-specific imprinted allele of Grb10, whose promoter is bivalently marked in multipotent embryonic neuronal precursor cells and appears only to resolve to a monovalent state (i.e. H3K4me3) upon neuronal commitment, permitting Grb10 expression in differentiated neurons, whereas the allele remained bivalent and transcriptionally silent in all other somatic cells113. While some bivalent domains resolve upon differentiation, others arise de novo as observed in ES cell-differentiated neuronal progenitor cells45, 114. For example, in the neuronal progenitor cell stage bivalent domains were established at the promoters of glial-specific genes, including those that encode the myelin basic protein and glial fibrillary acidic protein, which resolved to a repressive chromatin state (i.e. H3K27me3) when these cells differentiated into neurons45.
In addition, genes that maintain stem cells in a pluripotent state, including Oct3/4 and Nanog, become silenced by H3K9 methylation upon differentiation into neuronal progenitors and non-bivalent promoters of lineage-specific genes exhibited changes in histone modification patterns according to their transcriptional status during differentiation45. Similar dynamics were recapitulated in hematopoietic stem cells and their more lineage-restricted erythrocyte progenitors115 (depicted in Fig. 2). Additionally, our recent epigenomic data on various T helper cell subsets demonstrated that signature cytokine genes adopted a histone modification pattern consistent with their expression status within their respective lineage upon differentiation, whereas promoters of key transcription factor genes were marked by both H3K4me3 and H3K27me3 even when they were not expressed in the cell116. These bivalently modified transcription factor genes can be induced under appropriate growth conditions and underlie the plasticity of T cells. Altogether, these studies support a recurring theme for histone modifications in restricting multipotency and defining the developmental potential of progenitor cells. These modifications also interact with DNA methylation to reinforce lineage commitment as was observed in neuronal progenitors114.
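The promoter states discussed above can be summarized as a simple classification rule. The following Python sketch is a schematic only: the function name, state labels, and the placeholder promoters (`gene_A`, `gene_B`) are assumptions for illustration and do not represent data from the cited studies. It labels a promoter by the presence or absence of H3K4me3 and H3K27me3 and shows how a bivalent promoter in a progenitor can resolve to either monovalent state upon differentiation.

```python
def classify_promoter(marks):
    """Label a promoter from the set of histone marks detected at it."""
    has_k4 = "H3K4me3" in marks
    has_k27 = "H3K27me3" in marks
    if has_k4 and has_k27:
        return "bivalent"  # silent but poised
    if has_k4:
        return "active-marked"  # monovalent H3K4me3
    if has_k27:
        return "repressed-marked"  # monovalent H3K27me3
    return "unmarked"


# Placeholder promoters in a progenitor and in a differentiated cell.
progenitor = {
    "gene_A": {"H3K4me3", "H3K27me3"},
    "gene_B": {"H3K4me3", "H3K27me3"},
}
differentiated = {
    "gene_A": {"H3K4me3"},  # resolved toward the active mark
    "gene_B": {"H3K27me3"},  # resolved toward the repressive mark
}

# Map each gene to its (progenitor, differentiated) pair of states.
transitions = {
    gene: (classify_promoter(progenitor[gene]), classify_promoter(differentiated[gene]))
    for gene in progenitor
}
```

In practice, the presence of a mark would be derived from ChIP-Seq enrichment over the promoter rather than supplied as a set, and the threshold used to call a mark present would strongly influence how many promoters are scored as bivalent.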
During and following the programmed changes to the mammalian epigenome in development, epigenetic processes respond to extrinsic factors including environment, diet, and even behavior. This plasticity is thought to allow the organism to respond and adapt quickly to external stimuli, yet also to endow the organism, and even in some cases its offspring, with the ability to ‘memorize’ contacts with such stimuli into adulthood117, 118. Recent studies in mammalian systems provide explicit examples of this metastability that contrasts with the neo-Darwinian concept of inheritance119. In particular, evidence that the fetal environment influences epigenetic processes that subsequently alter susceptibility to chronic disorders is mounting. These results have important implications for human health and provide further impetus to map human epigenomes.
Perhaps the most compelling initial data demonstrating that epigenetic modifications are sensitive to environmental stimuli come from studies of so-called metastable epialleles identified in mice. As the name suggests, metastable epialleles are alleles that are variably expressed due to epigenetic modifications that are established in early development, giving rise to heritable phenotypic mosaicism between cells and between individuals in the absence of genetic heterogeneity120. So far, such alleles include murine Avy, AxinFu, and CabpIAP, all of which are associated with contraoriented IAP insertions whose activities are regulated by DNA methylation and influence gene expression121. DNA methylation levels at these IAP elements vary stochastically and depend upon maternal nutrition and environmental exposures during early development, contributing directly to variable expressivity and phenotypic variegation122, 123. For instance, feeding agouti dams a diet high in folic acid and betaine, which are metabolized into methyl donors required for the catalytic processes of DNA and histone methylation, shifts the coat color distribution of offspring from yellow toward brown and decreases their risk for obesity by increasing DNA methylation near the viable yellow agouti (Avy) locus124. Maternal dietary supplementation with genistein found in soy produced similar results125. Surprisingly, opposite results were observed in offspring of dams fed a diet containing the endocrine active compound bisphenol A (BPA)126, linking environmental exposure to xenobiotics to potentially deleterious consequences on the epigenome and health of mammals. Indeed, fetal exposure to xenobiotics alters epigenetic states in the offspring that produce deleterious effects on the health of the offspring and on the health of their descendants through transgenerational transmission of the epigenetic (i.e. DNA methylation) perturbation in the germline127, 128.
Intriguingly, BPA-induced hypomethylation in the fetal epigenome could be abolished by maternal dietary supplementation with methyl donors, demonstrating that changes in diet can offset the potentially deleterious effects of environmental toxicants on the developing fetus. How this works at the molecular level is still not quite understood, but other examples of the plasticity of the fetal epigenome abound and are not necessarily restricted to metastable epialleles.
Animal models that mimic human pathology through alterations in maternal nutrition during pregnancy and weaning have consistently demonstrated significant effects on epigenetic modifications of chromatin and subsequent effects on offspring phenotype. Of particular interest are models recapitulating hypertension, which is a major risk factor for cardiovascular and cerebrovascular disease129 and to which genetic and environmental factors clearly contribute130. In rodents, subjecting mothers to relative undernutrition during pregnancy may predispose the developing offspring to cardiovascular disease later in life131. For example, uteroplacental insufficiency reduced the activity of DNMT1 in the kidneys of offspring, causing hypomethylation of the p53 promoter and increased p53 expression, presumably contributing to elevated p53-mediated apoptosis, thereby reducing glomeruli number and causing hypertension132. In primates, a maternal high-fat diet contributes to increased histone H3 acetylation and decreased histone deacetylase activity, consistent with increased expression of genes relevant to impaired lipid metabolism in the fetus133. These and other similar studies unequivocally show that the maternal nutritional environment can significantly impact the fetal epigenome, contributing directly to the health of the offspring.
Whereas maternal nutrition contributes to epigenetic modifications of the developing embryo in utero, maternal behavior can alter epigenetic patterns of gene expression in young neonates that, once established, also persist into adulthood. So far, examples of this phenomenon are limited to studies of rodents. Rat offspring of mothers with low pup licking and grooming (LG) behavior showed high levels of DNA methylation of the hippocampal glucocorticoid receptor (GR) gene promoter, concomitant with low expression levels of the associated gene, and exhibited poor response to stress later in life, whereas the opposite was observed in offspring whose mothers displayed more frequent LG behavior134, 135. Additionally, cross-fostering experiments, in which pups of low LG mothers were swapped with pups of high LG mothers and vice versa, convincingly revealed that the offspring GR methylation pattern was mediated by maternal behavior, rather than genetics, during a critical window of early postnatal development. That is, the DNA methylation pattern of the GR promoter was labile only during the first week after birth, but once established remained unchanged into adulthood134, 135. Whether maternal phenotype contributes to epigenetic and phenotypic variability of offspring in humans, however, has yet to be directly determined.
Taken together, results from these studies lead to the conclusion that epigenetic modifications caused by the environment of the organism at particularly vulnerable or epigenetically labile periods of development are involved in the etiology of adult disease, which may be easily prevented with dietary supplementation at these critical developmental windows. Importantly, in specific cases, the induced metastable epigenetic alteration is transgenerationally heritable. A priori knowledge as to which epigenetically labile loci are associated with disease outcome in adulthood is essential for preventative or therapeutic intervention. This can be achieved by epigenome mapping studies in development.
The above results from animal models are relevant to humans and are entirely compatible with the developmental origins of disease hypothesis, which postulates that nutrition and other environmental factors during human prenatal and early postnatal development influence ‘developmental plasticity’136 and alter susceptibility to disease including adult cardiovascular disease, type 2 diabetes, and obesity137, 138. The role of epigenetic variability in contributing to disease susceptibility in humans has largely been unexplored. Yet results of epidemiological studies implicate epigenetic mechanisms in the etiology of disease.
Analogous to rodent studies of maternal behavior and offspring’s response to stress134, 135, the severity of symptoms arising from post-traumatic stress disorder (PTSD), which influences maternal cortisol production during pregnancy, can predict the levels of cortisol excretion in babies, and women who experience severe stress during pregnancy give birth to offspring that show altered activity of the hypothalamic-pituitary-adrenal (HPA) axis, which is regulated by the GR gene, in childhood139, 140. That low offspring cortisol levels are significantly associated with maternal PTSD implicates epigenetic mechanisms in mediating this response. Also, consistent with the described animal studies supporting a major role of the intrauterine environment in the health of progeny via epigenetic alterations, lower birth weight, in African American populations in particular, predicts higher blood pressure, elevated cortisol reactivity, and early signs of diabetes in childhood141, 142 and other related cardiovascular conditions in adulthood143. Additionally, studies of multigenerational trends of birth outcomes suggest a potentially non-genetic mode of inheritance, since a mother's own fetal growth rate predicts that of her offspring144, 145. Together these and other epidemiological data provide preliminary support for the idea that environmental influences on one generation, a likely source of non-genetic variability, can have lasting effects on the phenotype of subsequent generations, thereby contributing to the inheritance of disease risk.
In cancer, substantial alterations of DNA methylation and histone modifications have been described. Global hypomethylation and site-specific gene hypermethylation of DNA are so widespread that they are now considered hallmarks of cancer146, and environmental exposure to carcinogens has been linked to these alterations147, 148. Importantly, changes in the epigenome that alter the expression of MLH1 and MSH2, tumor-suppressor mismatch repair genes, have been shown to be inherited through the germline149, 150. Also, loss of imprinting of IGF2, associated with aberrant DNA methylation of critical cis-regulatory elements, is associated with increased cancer risk in children with Beckwith-Wiedemann syndrome151 and with increased cancer incidence and family history of cancer in adults152, 153. These studies suggest that altered epigenetic processes might be causally linked to tumorigenesis. Indeed, animal models with deficient DNMT1 are highly susceptible to cancer development154, 155. The molecular mechanism(s) underlying this susceptibility require further elucidation; however, there is evidence linking the bioavailability of methyl donors, whether through diet or through variants in the activity of enzymes involved in one-carbon metabolism, to cancer risk156.
The hypothesis that epigenetic processes are key mediators that integrate and interpret environmental factors and regulate the expression of target genes involved in human disease is gaining momentum and has also been reviewed elsewhere7, 117, 157, 158. Persistent epigenetic variability that arises early in development in response to maternal nutrition and the environment is associated with disease susceptibility, at least in animal models. Incorporating epigenetic variation into genetic studies of common diseases would be a powerful approach that could take into account confounding factors, such as environment, that sequence-based studies of disease do not7. Such studies will require epigenome mapping.
Genomic studies attempting to identify susceptibility genes for diseases, particularly those involving complex systems and traits, are technically demanding and most often yield poor results, primarily because of the confounding influence of variables, such as environment, that are not directly accounted for159. Epigenome mapping provides a measurable readout of most extrinsic and intrinsic factors impacting gene expression variability related to phenotype7. Thus, integrating DNA sequence, transcriptome, and copy number variation information with epigenome mapping data will provide a powerful approach, called epigenome-wide association studies (eGWAS), towards comprehensively understanding disease. Indeed, accounting for non-genetic factors improves the statistical power of studies measuring gene expression variability in populations160. Furthermore, transcriptional variability can be inherited, presumably through epigenetic mechanisms161. To complement proposed theoretical frameworks7, 162 and ongoing eGWAS efforts163, 164 for integrating genetic and epigenetic information on a genome-wide scale, we briefly highlight three feasible additional applications of eGWAS given current technology.
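As a rough illustration of the association step that an eGWAS implies, the sketch below tests, site by site, whether DNA methylation levels differ between affected and unaffected individuals. It is not a method taken from the studies cited above. The permutation test, the Bonferroni correction, the function names, the site names, and all methylation values are assumptions made purely for illustration. A real analysis would also need to integrate genotype, expression, and tissue information as described in the text.

```python
import random
from statistics import mean


def permutation_pvalue(cases, controls, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean methylation."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(cases) - mean(controls))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign case/control labels
        diff = abs(mean(pooled[:n_cases]) - mean(pooled[n_cases:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


def scan_sites(methylation, is_case, alpha=0.05, n_perm=2000):
    """Test every site and flag those passing a Bonferroni threshold.

    methylation: dict mapping a site name to per-individual methylation
                 fractions (0..1), in the same order as is_case.
    is_case:     list of booleans, True for affected individuals.
    Returns (site, mean difference, p-value, significant) sorted by p-value.
    """
    threshold = alpha / len(methylation)  # Bonferroni over all tested sites
    results = []
    for site, values in methylation.items():
        cases = [v for v, c in zip(values, is_case) if c]
        controls = [v for v, c in zip(values, is_case) if not c]
        p = permutation_pvalue(cases, controls, n_perm=n_perm)
        diff = mean(cases) - mean(controls)
        results.append((site, diff, p, p <= threshold))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: six affected individuals followed by six unaffected.
is_case = [True] * 6 + [False] * 6
methylation = {
    "site_A": [0.82, 0.79, 0.85, 0.80, 0.78, 0.83,   # cases: high methylation
               0.21, 0.25, 0.19, 0.22, 0.24, 0.20],  # controls: low methylation
    "site_B": [0.50, 0.52, 0.48, 0.51, 0.49, 0.50,
               0.49, 0.51, 0.50, 0.52, 0.48, 0.50],  # no group difference
    "site_C": [0.30, 0.35, 0.32, 0.31, 0.34, 0.33,
               0.33, 0.31, 0.34, 0.30, 0.36, 0.32],  # negligible difference
}

results = scan_sites(methylation, is_case)
for site, diff, p, significant in results:
    flag = "*" if significant else ""
    print(f"{site}\tdiff={diff:+.3f}\tp={p:.4f}\t{flag}")
```

A permutation test was chosen here only because it needs no distributional assumptions and no third-party libraries. With genome-wide data, a regression framework that adjusts for covariates and a less conservative multiple-testing correction would likely be more appropriate.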
There is now precedent for inter-individual epigenetic variability4, 170, which may influence disease susceptibility7, 86. In fact, three classes of epigenetic variation and examples of each have been described119. As we have discussed, tools for detecting global epigenetic diversity are already in place, with many of them continually being improved upon. We expect that a large number of normal and disease epigenomes will be determined soon, as the IHEC is currently underway13. Such efforts will provide valuable resources for enabling epigenome-wide association studies. One of the challenges we will face is how to integrate this vast amount of epigenomic data with other datasets that include gene expression, DNA sequence, and even high-throughput genome-wide 3-dimensional chromatin structural information171, 172 in order to extract useful biological insights. This may be resolved as the genome-wide techniques continue to improve in sensitivity, throughput, and cost-efficiency, and as relevant computational tools become available and standardized. Epigenome mapping in normal and disease states will revolutionize our understanding of normal development, provide key mechanistic insights into the processes of cellular differentiation and gene regulatory networks, and, importantly, enhance our knowledge of the epigenetic contribution to disease, thereby enabling the discovery of novel diagnostic, preventative, and treatment strategies.
Sources of Funding
This work was supported by the Division of Intramural Research Program of the NIH, National Heart, Lung, and Blood Institute.