The epigenome plays a pivotal role as the interface between genome and environment. True genome-wide assessments of epigenetic marks, such as DNA methylation (methylomes) or chromatin modifications (chromatinomes), are now possible, either through high-throughput arrays or increasingly by second-generation DNA sequencing methods. Data at this level of resolution allow detailed questions to be posed about the changes that occur with development, lineage and tissue specificity and, significantly, those caused by environmental influences such as ageing, stress, diet, hormones or toxins. Common complex traits are under variable levels of genetic influence and, additionally, epigenetic effect. The detection of pathological epigenetic alterations will reveal additional insights into their aetiology and into how environmental modulation of this mechanism may occur. Due to the reversibility of these marks, the potential for sequence-specific targeted therapeutics exists. This review surveys recent epigenomic advances and their current and prospective application to the study of common diseases.
Epigenomic control enables cells, whilst possessing identical genomes, to differentiate into more than 200 distinct types within the human body. Without this system of specialization, achieved by the coordination of gene activity and nuclear architecture, cells would be unable to proceed correctly through development and to perform their vital functions synergistically. To maintain this lineage-specific identity, the mechanism must be conserved through somatic replication. Epigenetics is therefore defined as this process: the mitotically stable heritable transfer of information that does not require a mutagenic change of the underlying nucleotide sequence. Broadly, this term is used to encapsulate DNA methylation, histone variants and post-translational modifications, and non-coding RNAs. Strictly, however, a definitive mechanism for mitotic heritability is currently clear only for the first of these, DNA methylation, on which this review will predominantly focus.
The mechanism by which the epigenetic state can be established has recently been defined and divided into three broad processes or mechanistic steps, as detailed in Berger et al. These involve the interplay of, firstly, an epigenator, which can be an environmental change or trigger for the cell that occurs prior to any modification to that cell’s epigenome. Secondly, there is an epigenetic initiator; these include DNA-binding proteins and non-coding RNAs, which show sequence-specific responses set in motion by the aforementioned epigenator. Finally, there is the epigenetic maintainer, being the persistent marks such as DNA methylation and post-translational histone modifications that influence the structure and expression of the genome. Aberrations of any facet of this system have the potential to lead to non-viability or subtle phenotypic variation.
The epigenetic device of DNA methylation is the addition of a methyl group to the 5-carbon of the pyrimidine base cytosine. Normal embryonic developmental differentiation, gene transcription, genomic imprinting, X chromosome inactivation, genome stability by repression of repetitive sequences, and chromatin structure are all, to varying degrees, dependent upon this mechanism. In humans this mark occurs almost exclusively in the context of a CpG dinucleotide, of which there are only ~28 million in the haploid genome. This under-representation is due to the hypermutability of methylated cytosines and the consequently inadequate repair system. However, those regions of relatively high CpG density, CpG Islands (CGIs), are found to be predominantly hypomethylated, and these regions co-localize with over 70% of gene promoter regions, which are for the most part those of housekeeping or key developmental genes. This lack of methylation is usually indicative of the active state of these promoters. Conversely, gene bodies are hypermethylated in expressed genes, so this requisite is coupled with the disadvantage of increased mutability within critical coding sequence.
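The under-representation of CpG dinucleotides described above is commonly summarized as an observed/expected (o/e) CpG ratio. The following is a minimal sketch, not a method taken from this review: it assumes the widely used definition (number of CpGs multiplied by sequence length, divided by the product of the C and G counts), and the function name `cpg_observed_expected` is hypothetical. Values well below 1 indicate CpG depletion, whereas CpG islands typically show values closer to 1.

```python
def cpg_observed_expected(seq: str) -> float:
    """Return the observed/expected CpG ratio of a DNA sequence.

    Assumed definition (not from the review text):
        o/e = (#CpG * length) / (#C * #G)
    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    # "CG" cannot overlap itself, so str.count gives the true CpG count.
    n_cpg = seq.count("CG")
    if n_c == 0 or n_g == 0:
        return 0.0
    return n_cpg * len(seq) / (n_c * n_g)


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    for example in ("CCCCGGGG", "CGCGCGCG", "ATATATAT"):
        print(example, cpg_observed_expected(example))
```

In practice such a ratio would be computed in sliding windows along a chromosome rather than on short toy strings; the window size and any island thresholds are analysis choices that this sketch does not fix.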
During mammalian development there are dramatic changes in DNA methylation . Primordial germ cells begin with very low levels, then with gametogenesis parental imprinting tags are established, with substantially methylated but differing methylomes in the sperm and egg. In the preimplantation early embryo there is a wave of genome-wide demethylation that occurs, which is rapid in the paternal genome, except for centromeric, repetitive and paternally imprinted genes, with a comparative slow process occurring in the maternal genome . This is then followed by heavy de novo methylation, particularly in the somatic lineages as these are established, but to a lesser extent in the trophoblast leading to the placenta and yolk sac, and excludes the primordial germ cells.
Non-CpG cytosine methylation has been identified at a high level in stem cells and reprogrammed progenitor cells, indicating that loss of this form of methylation may be critical in the path from pluripotency to differentiation [12, 13]. The total level of global methylation and the degree of non-CpG methylation are inversely proportional to the level of differentiation. As yet, however, no mitotically heritable mechanism for this non-CpG methylation has been identified, so the question remains as to whether these marks are truly epigenetic. Also of interest is the recent discovery of a further DNA modification, 5-hydroxymethyl-2′-deoxycytidine (hmdC), which was initially identified at significant levels in the brain (0.2% in granule cells and 0.6% in Purkinje cells). Whether this alteration plays a critical role within the CNS, or additionally in other tissues, remains for now terra incognita.
Histone modification and DNA methylation are coordinated and correlated processes. This is illustrated by the observation that the CpG-binding protein Cfp1 specifically binds to non-methylated CGIs and directly leads to the formation of peaks of the promoter chromatin mark H3K4me3. Furthermore, the ability to deliver de novo DNA methylation to a target promoter by zinc finger peptides has been demonstrated, with subsequent effects on expression via formation of a repressive chromatin signature. Additionally, non-methylated CGIs are preferentially recognized by the KDM2A protein, which is an H3K36-specific lysine demethylase, thereby completely depleting this chromatin mark at these loci.
At least eight different types of post-translational histone modification exist, with an ever-increasing number of amino acid residues able to be affected by these alterations. Although they are involved in epigenetic processes, definitive heritability in their own right is uncertain and it has been proposed that RNA may be a possible determinant in this process . Mechanisms have been proposed for direct transmission including the recycling of histones following replication and the interplay of trans-acting factors .
Within an individual, tissue-specific methylation differences can be clearly identified, but inter-individual epigenomic variation in these tissue-specific patterns also exists. Epialleles, or allelic variation in epigenetic state, can be divided into three logical classes: obligatory, those purely determined by genotype; facilitative, those that can only occur in conjunction with a certain genotype but in a probabilistic manner; and finally pure epialleles, those that are not connected to genotype.
Allele-specific methylation (ASM) in non-imprinted regions of the genome is acknowledged and is associated with allele-specific expression (ASE). Cis-sequence genetic effects are recognized to be influential in this, with these combined haplotypes and epitypes termed ‘hepitypes’. Both cis- and trans-genetic effects on DNA methylation, and additionally on expression, were recently identified in the brain of mice. In the human genome, ASM patterns are widespread and, furthermore, the genetic polymorphic background with respect to CpG-creating or CpG-abrogating SNPs (CpG-SNPs) also has a considerable effect on this. In Haplotype-Specific Methylation (HSM), the coordinated phase of CpG-SNPs, without detectable influence on the non-variant surrounding CpGs, can produce methylation differences detectable over kilobases (Bell et al., manuscript in Review). Additionally, a recent study by McDaniell et al. identified allele-specific chromatin states that were due to underlying genetic variation and thereby heritable, although it could not differentiate between the cause of this heritability being genetic or epigenetic.
Whilst tissue-specific DNA methylation and chromatin states are used to delineate cell type and function, this epigenetic position can be modified, for example, externally by diet or toxins in the environment, or internally by hormones. The dramatic effect of gross environmental influence was first identified by early studies of dietary manipulation, for example in rats fed a carcinogenic methyl-deficient diet, which showed hypomethylation of hepatic nuclear DNA.
The epigenome therefore performs the role of interface between the environment and the genome. These alterations can be considered in two major, though overlapping, categories: those that occur at critical early windows of epigenetic calibration and those that accumulate over time. The former includes the periconceptional period, which is important during the reprogramming phase of the epigenome. In a study investigating individuals 60 years after the Dutch Hunger Winter, DNA methylation levels in the IGF2 imprinted gene locus were reduced in those who had been exposed prenatally in comparison to their same-sex siblings. These persistent lifelong methylation differences were identified specifically in those with periconceptional exposure to the famine. However, other early stages of embryonic development, and even later pregnancy and post-natal influences, may also lead to longstanding effects on methylomes. Although imprinting indicates that a trans-generational epigenetic transfer mechanism is possible, generational ‘spillover’ at other loci has only been potentially, but not definitively, documented via the maternal line in agouti mice, probably due to the strong erasure that occurs in sperm.
The common disease genetic and epigenetic model, as put forward by Feinberg  proposes epigenetic modulation due to environmental influences can affect the outcome of deleterious genes. These epigenetic changes may accrue over years due to diet, stress, ageing or environmental exposures and subsequently affect gene regulation over time and possibly disease susceptibility. Gradual loss from epigenetic set points, due to accumulated environmental influence, may affect the regulation of susceptibility gene variants with subsequent changes in the expressivity in these complex diseases.
Aberration of the normal DNA methylation pattern is a noted hallmark of cancer. Some striking abnormalities that develop in the epigenome of malignant cells have aided our understanding of the pathogenic potential and processes that can occur via epigenetic means. Classically, DNA methylation is reduced globally, hypomethylating oncogene promoters, reducing defence against repetitive sequences (leading to susceptibility to genome instability and potentiating chromosomal structural translocation), and additionally decreasing gene body methylation with subsequent effects on transcription. Locus hypermethylation of critical tumour-suppressor gene promoters is also significant in the neoplastic process. Past low-resolution studies have indicated both that global hypomethylation is an early event in cancer and that the degree of this process can be associated with cancer stage [34, 35]. Recent, more comprehensive analysis has indicated that CpG island shores (regions 2 kb either side of CGIs) may play an even more dynamic role in these promoter region methylation changes. It is additionally noteworthy that these shore regions are also those that are more significant in tissue-specific variability.
Although progressive hypomethylation is observed globally, some loci are hypothesized to be specifically involved in the initiation of carcinogenesis, or transient early methylation aberrations may be selected for their propensity to instability, later producing the most aggressive tumour cell populations. This initial demethylation and early gene silencing can lead to cell canalization down certain oncogenic pathways. This may be due to a number of events, including the passage of premalignant genomes towards senescence before they acquire the ability to escape it, an adaptation to hypoxia, which is experienced very early in tumour development [40–42], or increased cellular plasticity and acquisition of the ability for lineage switching or a more stem-cell-like state.
This improvement in the resolution of examining the cancer methylome has started to draw out previously undetected occurrences, such as, already mentioned shores  and specific hypomethylation of satellite repeats in peripheral nerve sheath tumours (Feber et al., manuscript in Review) and this will become more detailed, phenotype- and pathway-specific as further cancer methylomes are analysed.
Abnormalities in the imprinting process are well defined in rare developmental disorders such as Prader Willi and Angelman Syndrome , but, additionally aberration of the normal imprinted DNA methylation state has been identified in cancer tissues . Additionally the existence of rare constitutional epimutations, soma-wide allele specific promoter methylation, has been revealed in hereditary cancer syndromes, specifically for critical mismatch repair gene pathways . These display additional clinical complexity due to possible occurrence of non-Mendelian inheritance with reversibility between generations.
Ageing-specific changes in the epigenome were first recognized with the identification of reactivation of X-inactivated genes. The accumulation of alterations, with the divergence of DNA methylation and histone acetylation states, was then detailed over a lifetime by comparing monozygotic twins of increasing age. However, it could not be resolved whether these modifications are caused by accumulated environmental influence or by the accrual of small defects in transmission through mitotic replication over time, termed ‘epigenetic drift’. The latter is possible due to the reduced stringency of the corrective mechanisms for epigenetic compared to genetic aberrations, due to less deleterious evolutionary pressure. Intra-individual changes over time in global DNA methylation were shown to group within familial clusters. Christensen et al. identified that CpG methylation devolves with time from the fixed extremes commonly found in the genome, with CpG islands gaining methylation and methylated CpGs elsewhere losing this mark.
This observation has become far more interesting with the use of higher-resolution DNA methylation arrays, which enable a more detailed exploration of where these differences may reside. Teschendorff et al. identified, by the examination of over 900 individuals, that a DNA methylation ageing signature could be identified in the promoters of Polycomb Group Target genes (PCGTs). Polycomb Group Proteins (PCGs) are involved in the repression of genes involved in stem cell differentiation, and PCGTs are additionally known to be hypermethylated in cancer, thus leading to the proposal that age-related DNA methylation changes may predispose to neoplasia by inducing a more stem-cell-like state. Rakyan et al. also identified a comparable age-induced DNA methylation alteration in different tissue types. This signal was identified in bivalent chromatin domain promoters, where both repressive and activating histone marks are found together: histone H3 lysine 27 methylation and H3 lysine 4 methylation, respectively. These co-locate with developmentally critical transcription factors, including PCGs, and with regions of the genome that again predominantly hypermethylate in cancer. The bivalent chromatin marks keep these promoters poised, and therefore keep these cells capable of switching to different gene pathways of differentiation, until the point in time at which they commit to a particular lineage.
Thus, a possible epigenomic mechanism connecting ageing-specific changes with a late-onset disease indicates that this type of pathogenic pathway may exist in other adult-onset illnesses. An epigenetic role may assist in explaining the progressive and quantitative nature of most common diseases. Chronic diseases such as psychiatric illness, which have been predominantly intractable to genomic analysis, notwithstanding the difficulties of precise phenotyping, may be model cases for environmental influences affecting the epigenome. Adverse early life events are considerable risk factors for later-onset psychiatric morbidity, and murine models have identified DNA methylation changes that affect long-term hormonal alterations together with features mimicking depression. The epigenetic modulations induced by these experiences appear to become hardwired, with lasting effects on established neurons indicating their possible role in behavioural and psychiatric disorders. The critical role of histone modification in neuronal morphology has also been demonstrated by further murine work on addiction. Specifically, cocaine-induced structural changes were identified within the Nucleus Accumbens, in the striatum of the basal ganglia, which is known to play an important role in reward and pleasure systems. Reduced global levels of H3K9 dimethylation due to repression of the lysine dimethyltransferase G9a within this brain region, regulated by a cocaine-induced transcription factor, subsequently led to addictive-like behaviour. In humans, DNA methylation differences have been identified in the frontal cortex of those diagnosed with major psychotic symptoms, in regions linked to the disease aetiology, and differences due to long-term anti-psychotic treatment were also discerned. Any common complex disease with strong genetic and environmental aspects would be an excellent target for in-depth epigenomic analysis.
Monozygotic twins reduce genetic heterogeneity to an almost negligible level. These models have been ideal in the past for heritability studies, utilising concordance, but also have obvious great benefit in focusing on pure epigenetic changes, especially in those rare sets that are discordant for disease status. This form of analysis in the autoimmune disease systemic lupus erythematosus (SLE)  identified a decrease in methylation across a number of genes associated with the phenotype. These were enriched for those with a known role in immune function, thus characterizing this disease, but not excluding the possibility of their involvement in the pathogenesis.
The re-examination of GWA study SNP data with respect to parent of origin effects has facilitated the identification of a common susceptibility type 1 diabetes SNP within an imprinted region . In an Icelandic cohort, this analysis found breast cancer, basal-cell carcinoma and type 2 diabetes (T2D) SNP associations , with one of the latter showing a differentially methylated CTCF-binding site and decreased methylation.
The integration of epigenomic information in complex trait genomic analysis can help to reveal further functional insights within an associated locus. This is shown in a study of open chromatin sites in T2D by sequencing of DNA from formaldehyde-assisted isolation of regulatory elements (FAIRE-seq) in pancreatic islets. An open location was mapped to a T2D-associated SNP in the TCF7L2 gene and was identified to be allele-specific and to alter enhancer activity. Analysis of common histone marks, also in human pancreatic islets, detected regulatory elements in the location of known T2D SNP associations. Within regions of strong linkage disequilibrium, allele-specific histone modifications or methylation data may enable narrowing down of critical regions and formulation of novel functional hypotheses.
DNA sequence is acknowledged to be significant in influencing methylation levels due to density of CpGs and allele-specific patterns are strongly contributed to by variation due to CpG-SNPs . Consequently the loss and gain of CpGs over time is proposed to be a significant evolutionary device . The predominant trend within the genome is to lose methylated cytosines and this destruction of a CpG dinucleotide by a SNP has been shown to lead to significant cis-methylation effects . Additionally this mechanism has been seen to be even involved in the generation of transcription factor binding sites . Functionally critical CpGs are often not within CpG island differentially methylated regions . Therefore, this adaptive role of CpG-SNPs  may be even more dramatic in those regions with a comparatively low density of CpGs, such as the tissue- and cancer-implicated CpG shores  or in low-CpG dense promoters identified as those which are more tissue specific , or enhancers . The investigation of the evolutionary aspects of genetically driven modulation of epigenetic variability, through for example, the polymorphic nature of CpG-SNP dinucleotide sequences, will help define human population epigenomics.
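The effect of a CpG-SNP on the local CpG content can be checked directly from the flanking sequence. The following is a minimal sketch under stated assumptions, not a method from the studies cited above: the function name `classify_cpg_snp` and its return labels are hypothetical, and only the immediately adjacent bases on the forward strand are considered (a CpG on the reverse strand is also a CpG on the forward strand, because the dinucleotide is its own reverse complement).

```python
def _cpg_sites(context: str) -> set:
    """Return the start indices of every CG dinucleotide in a short context."""
    return {i for i in range(len(context) - 1) if context[i:i + 2] == "CG"}


def classify_cpg_snp(left: str, ref: str, alt: str, right: str) -> str:
    """Classify a SNP by its effect on CpG dinucleotides.

    left, right: the single bases flanking the variant position.
    ref, alt:    the reference and alternative alleles (single bases).
    Returns one of the hypothetical labels:
        "creates", "abrogates", "shifts" or "none".
    """
    ref_sites = _cpg_sites((left + ref + right).upper())
    alt_sites = _cpg_sites((left + alt + right).upper())
    gained = alt_sites - ref_sites
    lost = ref_sites - alt_sites
    if gained and lost:
        return "shifts"      # one CpG is destroyed and a neighbouring one is made
    if gained:
        return "creates"
    if lost:
        return "abrogates"
    return "none"


if __name__ == "__main__":
    # A C>T transition at a CpG, the predominant loss described in the text.
    print(classify_cpg_snp("A", "C", "T", "G"))
```

A real analysis would take the flanking bases from a reference genome and the alleles from a variant file; those inputs are outside the scope of this sketch.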
Although DNA methylation changes are tissue specific, as methylation profiling becomes increasingly high resolution the ability to identify surrogate markers of specific illnesses, even in easily accessible peripheral blood will become increasingly possible. This fingerprint is indicative of disease processes elsewhere in the body, whether these are due to specific inflammatory perturbations or other related pathology. Examples of such processes have been suggested in studies of ovarian cancer  and type 1 diabetes nephropathy .
Analysis of DNA methylation is currently a trade-off between resolution and throughput. Array approaches include those custom designed for detailed assessment of particular genomic regions by direct or enrichment techniques, or the current ‘genome-wide’ Illumina Infinium array, which allows individual cytosine evaluation, interrogating 27578 cytosines at CpG loci throughout the genome, in the plausible promoter region of 14495 genes. This array is therefore limited to assessing promoters, and usually at only two CpG locations each. The bisulphite conversion effectively creates a pseudo-SNP, although the continuous nature of the measure adds further complexity, as it is not just a simple case of three genotype calls. However, this array is soon to be superseded by a 450k version with dramatically increased coverage of CGIs, as well as CGI shores and gene bodies.
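The continuous nature of the array measure can be illustrated with the beta value conventionally derived from the methylated and unmethylated probe intensities. The following is a minimal sketch: the formula and the stabilizing offset of 100 are assumptions based on common practice for this type of array and are not stated in this review, and the function name `beta_value` is hypothetical.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 100.0) -> float:
    """Estimate the methylation level at one CpG from two probe intensities.

    Assumed formula (common convention, not taken from the review text):
        beta = M / (M + U + offset)
    The offset stabilizes the estimate when both intensities are low.
    The result lies between 0 (unmethylated) and 1 (fully methylated).
    """
    if methylated < 0 or unmethylated < 0 or offset < 0:
        raise ValueError("intensities and offset must be non-negative")
    total = methylated + unmethylated + offset
    if total == 0:
        return 0.0
    return methylated / total


if __name__ == "__main__":
    # Hypothetical intensities for three CpG sites.
    for m, u in ((5000.0, 300.0), (2500.0, 2400.0), (200.0, 6000.0)):
        print(m, u, round(beta_value(m, u), 3))
```

Unlike a genotype call, which takes one of three discrete states, this value varies continuously with the fraction of methylated molecules in the sample, which is why downstream statistics differ from those used for SNP arrays.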
Enrichment or representative analysis includes the use of methods such as methylated DNA immunoprecipitation (MeDIP) with a 5-methylcytosine antibody  or reduced representation by the use of restriction enzymes followed by bisulphite conversion (RRBS)  and subsequent use of second generation sequencing. Assessment by DNA-seq based methods has the added advantage of the ability to utilize SNPs within fragments to determine ASM on particular haplotypic backgrounds.
Currently performed only on an individual scale, second-generation bisulphite sequencing (BiS-seq) is the next step in this progression. Whilst currently cost-prohibitive and analytically challenging, due to the difficulty of mapping to a reduced representative genome, the recent publication of the first individual human BiS-seq genomes indicates its promise.
One caveat, however, is that the bisulphite conversion reaction (see Glossary in Appendix 1) does not distinguish between 5-methyl and 5-hydroxymethyl modifications. Also, in any Taq-polymerase-based reaction the chemical by-product of the latter, 5-methylenesulphonate, can even affect the process by stalling the polymerase, especially if two or more are adjacent or nearly adjacent. This will not, however, be an issue for the next, or third-generation, sequencers, which will directly assay single-molecule DNA and thereby directly assess DNA epigenetic marks [75, 76].
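The caveat above can be made concrete with a toy in silico conversion. The following is a minimal sketch, assuming an idealized, fully efficient reaction followed by amplification, so that each unmodified cytosine is read as thymine (via uracil) while both 5-methyl and 5-hydroxymethyl cytosines are read as cytosine. The function name `bisulphite_convert` and the modification labels are hypothetical.

```python
def bisulphite_convert(seq: str, modified: dict = None) -> str:
    """Return the sequence read out after idealized bisulphite conversion.

    seq:      DNA sequence of the strand being converted.
    modified: maps 0-based positions of modified cytosines to a label,
              e.g. {1: "5mC"} or {1: "5hmC"}; the label itself is not
              used, because both modifications protect the base.
    Unmodified C is read as T (C -> U -> T after amplification).
    """
    modified = modified or {}
    converted = []
    for position, base in enumerate(seq.upper()):
        if base == "C" and position not in modified:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


if __name__ == "__main__":
    template = "ACGTCG"
    methyl = bisulphite_convert(template, {1: "5mC"})
    hydroxymethyl = bisulphite_convert(template, {1: "5hmC"})
    # Both modifications give the same read-out, so they cannot be told apart.
    print(methyl, hydroxymethyl, methyl == hydroxymethyl)
```

The sketch ignores incomplete conversion and the polymerase stalling mentioned above; it shows only why the two marks are indistinguishable in the resulting sequence.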
Additionally chromatin sequencing analysis by chromatin immunoprecipitation (ChIP-seq) may enable a further detailed analysis of any individual of the multiple number of chromatin modifications adding further dimensions to epigenomic analyses .
Epigenomic analysis on a true genome-wide scale in large cohorts in multiple phenotypes and tissues is becoming increasingly obtainable. However, statistically significant epigenetic associations should be currently judged in the same light as genetic associations. That is on the statistical robustness of the association with phenotype, and by replication, not definitive functional proof if this is not directly apparent or obtainable. Although an epigenetic association may give a clearer indication of the mechanism, than perhaps a non-coding genetic association, it must be viewed just as one further step along in the dissection of complex traits, by perhaps enabling the proposition of further functional hypotheses. Obviously in the future the integration of multifaceted high resolution epigenomic assessment, with genomic and transcriptomic data will increase power to delineate and define these subtle disease processes .
The ability to accurately attain tissue-specific epigenomes in normal cells thereby defining reference methylomes, and in pathological specimens, will lead to increased insight into the perturbations that can occur due to environmental influences and disease processes. This knowledge, in combination with improved awareness of genetic susceptibility, will dramatically enhance our understanding of the aetiology of common complex diseases. Hopefully, this will lead to the identification of novel critical pathways and new pharmacological targets.
Moreover, the intersection of the most scientifically progressive areas of biomedicine, genomics and imaging, may even arise if in vivo imaging of epigenetic changes in cells, tissues and eventually intact organisms becomes possible. Due to the reversibility of pathogenic epigenetic marks, pharmacological agents such as the first epigenetic drug, 5-azacytidine, are available and many more have recently or are about to reach the clinic [79, 80]. Furthermore, due to sequence specificity some ‘epimutations’ may even be amenable to the development of a targeted therapeutic intervention.
CGB and SB are funded by the Wellcome Trust (084071). SB is additionally a Royal Society-Wolfson Research Merit Award Holder.
Christopher G. Bell is a Genetic Pathologist and Postdoctoral Researcher in the Medical Genomics group at the University College London Cancer Institute, and his research is focused on the epigenomics of common diseases.
Stephan Beck is Professor of Medical Genomics at University College London and is interested in the genomics and epigenomics of phenotypic plasticity in health and disease. [http://www.ucl.ac.uk/cancer/research-groups/medical-genomics/].
BiSulphite Conversion: Chemical modification of DNA, used to assess DNA methylation, whereby all cytosines are converted to uracil unless they are methylated.
CpG Dinucleotide: A cytosine followed directly by a guanine on the same DNA strand, joined via a phosphate bond.
CpG Islands: Regions of higher than expected density of CpG dinucleotides in the genome, frequently co-located with promoters.
CpG-SNPs: Single Nucleotide Polymorphisms that reside at either base of a CpG dinucleotide, therefore creating or abrogating methylation ability.
Epigenome: Genome-wide epigenetic state.
Epigenator: Environmental change or trigger for the cell that occurs prior to any epigenetic modification.
Epigenetic Initiator: DNA binding proteins and non coding-RNA that direct formation of epigenetic marks.
Epigenetic Maintainer: Persistent epigenetic marks such as DNA methylation or Post-Translational Histone Modifications.
DNA methylation: The addition of a methyl group to the 5-carbon of cytosine.
Chromatinome: Genome-wide Chromatin state.
Methylome: Genome-wide DNA Methylation state.
Post-Translational Histone Modifications: Chemical modifications of histones that are involved in regulating the chromatin structure.