Despite the success of genome-wide association studies (GWAS) in identifying loci associated with common diseases, a significant proportion of the causality remains unexplained. Recent advances in genomic technologies have placed us in a position to initiate large-scale studies of human disease-associated epigenetic variation, specifically variation in DNA methylation (DNAm). Such Epigenome-Wide Association Studies (EWAS) present novel opportunities but also create new challenges that are not encountered in GWAS. We discuss EWAS study design, cohort and sample selections, statistical significance and power, confounding factors, and follow-up studies. We also discuss how integration of EWAS with GWAS can help to dissect complex GWAS haplotypes for functional analysis.
Elucidating the genetic and non-genetic determinants of human complex diseases represents one of the principal challenges of biomedical research. In recent years, genome-wide association studies (GWAS) have uncovered >800 single nucleotide polymorphism (SNP) associations for more than 150 diseases and other traits1. Although the complete genetic basis is not yet known for any human complex disease, re-sequencing of exomes, and ultimately whole genomes, holds promise to identify most of the remaining causal genetic variations. However, there is now increasing interest in exploring how non-genetic variation, including epigenetic factors, could influence complex disease aetiology2-4.
The epigenome of a cell is highly dynamic, being governed by a complex interplay of genetic and environmental factors5. Normal cellular function relies on the maintenance of epigenomic homeostasis, which is further highlighted by numerous reported associations between epigenomic perturbations and human diseases, notably cancer4. However, most studies of such associations to date have been performed either with inadequate genome coverage (e.g. tens to hundreds of loci) but adequate sample size, or approaching genome-wide coverage (thousands of loci) but inadequate sample sizes. Consequently, for any given human complex disease, we remain unaware of the proportion of phenotypic variation that is attributable to inter-individual epigenomic variation. This problem can only be elucidated by large-scale, systematic epigenomic equivalents of GWAS – epigenome-wide association studies (EWAS) as first proposed in 20086. At least for DNAm, technology is now available that is directly comparable in resolution and throughput to the highly successful GWAS chips that allow genotyping of around 500K SNPs.
But how does one conduct an EWAS? In addition to considerations that are common to both GWAS and EWAS (e.g. adequate technology and sample size), the design of EWAS has specific considerations with respect to sample selection. DNAm patterns are specific to tissues and developmental stages, and also change over time. Furthermore, EWAS associations can be causal as well as consequential for the phenotype in question - a difference from GWAS that presents considerable challenges. Here, we discuss these considerations in the context of designing and analyzing an effective EWAS, keeping in mind that EWAS are likely to evolve, as did GWAS, as information and experience accumulate.
Epigenetic information in mammals can be transmitted in multiple forms5, including mitotically stable DNAm, post-translational modifications of histone proteins, and non-coding RNAs (ncRNAs). For DNAm, the predominant form is methylation of cytosines in the context of cytosine-guanine dinucleotides (CpG). However, recent results suggest that CpH methylation (where H = C/A/T) may be more common than previously appreciated7,8. Catalysed by Ten Eleven Translocation (TET) methylcytosine dioxygenases, 5-hydroxymethylation9,10 of cytosines (hmC) is yet another form of DNAm. Although details are still unclear, increasing evidence suggests a role of hmC in gene regulation and differentiation11. Histone modifications include, to name but a few, mono-, di- or tri-methylation, acetylation, and citrullination of one or more amino acids in the N-terminal tails of core histones5. More recently, it has been discovered that ncRNAs can self-propagate and be transmitted independently of the underlying DNA; in other words, they can ‘epigenetically’ transmit regulatory information12,13. ncRNAs include short micro RNAs (miRNA), Piwi-interacting RNAs (piRNA), large intervening non-coding RNAs (lincRNA) and others12.
The full spectrum of epigenetic marks is currently unknown, but is potentially enormous, considering that the diploid human epigenome contains >10^8 Cs of which >10^7 are CpGs, and >10^8 histone tails, that can all potentially vary. The best-studied epigenetic mark is DNAm and Box 1 illustrates the most common features and contexts in which DNAm varies. DNAm variation at a single CpG site is known as a methylation variable position (MVP), which can be considered as the epigenetic equivalent of a SNP14. Very rarely, CpGs on only one of the two strands of DNA per allele are methylated. This is known as hemi-methylation, and probably reflects post-replication lag in DNAm maintenance in proliferating cells. If DNAm is altered at multiple adjacent CpG sites, this is referred to as a differentially methylated region (DMR); DMRs vary considerably in length: they are typically <1Kb but can exceed 1Mb15. Until recently, MVPs and DMRs were mostly studied in the context of core promoters, CpG islands (CGIs) and imprinted differentially methylated regions (iDMRs); however, it is becoming increasingly clear that DNAm is highly dynamic even outside of such regions. For example, a recent study found that tissue- and cancer-specific DMRs preferentially occur in regions adjacent to CGIs, so-called CGI shores16. DNAm also plays a key role in silencing repeat elements, which may also impact on disease aetiology17,18.
This rapidly increasing list of features is not meant to be complete but intends to illustrate the key loci and contexts in which DNAm is known to vary.
Methylation variable position (MVP). A CpG site that shows differential methylation e.g. between different disease states as illustrated below. Given recent findings on non-CpG methylation, potentially all Cs could be MVPs.
Differentially methylated region (DMR). A region of the genome at which multiple adjacent CpG sites show differential methylation. DMRs can occur in many different contexts such as:
iDMR - imprinting-specific differentially methylated region
tDMR - tissue-specific differentially methylated region
rDMR - reprogramming-specific differentially methylated region
cDMR - cancer-specific differentially methylated region
aDMR - ageing-specific differentially methylated region
Variably methylated region (VMR). These are defined by increased variability rather than gain/loss of DNAm.
Allele-specific methylation (ASM). These are positions or regions that vary in DNAm depending on the parent-of-origin, the presence of a polymorphism or as a result of a stochastic event.
Haplotype-specific methylation (HSM). This is a differentially methylated region that is defined by a set of co-inherited SNPs (a haplotype).
CpG islands (CGIs). These are regions enriched for CpG sites. The majority of CGIs are unmethylated in all cell types.
CGI shores. These are regions immediately adjacent to CGIs and display higher variation in DNAm than CGIs despite their lower density of CpG sites.
The figure below shows different types of DNAm variation that can be identified with EWAS. For the purpose of this simplified illustration, the cases and controls are assumed to have methylated or unmethylated CpG states only. Real samples will contain populations of different cells and hence display much more heterogeneous methylation levels across the full dynamic range between 0% and 100%.
The role of DNAm variation in complex disease has mainly been explored in the context of cancer, in what may be considered as early EWAS. Findings from these studies have been extensively discussed4,19, the key general conclusions being that tumour development is associated with gain of DNAm at CGIs, loss-of-imprinting, and epigenetic remodelling of repeat elements, particularly loss of DNAm at satellite DNA20,21. For non-malignant common complex diseases such as diabetes or autoimmunity, the epigenetic component is only just beginning to be investigated. Observations that support an epigenetic component in these diseases include the following. First, monozygotic twin (MZ) concordance for any complex disease is almost never 100% and recent small-scale EWAS of MZ twins discordant for systemic lupus erythematosus22 and autism spectrum disorders23 have found intra-MZ pair disease-associated epigenetic differences. Second, for several complex diseases, e.g. Type 1 Diabetes24, the incidence is rising in the general population and frequently altered in migrant populations, suggesting a role for non-genetic factors. Third, epidemiological evidence suggests that a sub-optimal in utero/early childhood environment can impact on disease outcomes (such as type 2 diabetes) in adulthood, a phenomenon termed developmental reprogramming25. Currently, the prime candidate for the molecular memory of the in utero environment is epigenetic modifications, including DNAm26-28.
As mentioned above, epigenetic variation can be causal for disease or can arise as a consequence of disease. Epigenetic variation could arise either directly or indirectly as a consequence of disease, and examples could include long-term alterations in immune-related cells in autoimmune disorders, altered metabolic regulation in type 2 diabetes, or somatic mutation-induced epigenetic alterations in cancer. However, distinguishing this from epigenetic variation that is causative of the disease process is not straightforward (as we discuss in greater detail below), but is critical since it will help elucidate the functional role of the disease-associated variation and potential utility in terms of diagnostics or therapeutics. A key step towards this goal is to determine whether the variation is present prior to any overt signs of disease. In this regard it is useful to consider how such epigenetic variation could arise prior to disease. Firstly, it could be inherited and hence be present in all tissues including the germline (i.e. transgenerational epigenetic inheritance), although the extent of this phenomenon is not fully known. Secondly, it could arise stochastically and be present soma-wide if it happens in early (e.g. in utero) development29,30 or be limited to one or a few tissues31,32 if it were to happen post-natally or during adult life. Thirdly, it could be environmentally-induced, either by adult life-style related factors such as diet or smoking33, or even in utero i.e. developmental reprogramming (described above).
It is also possible that the underlying genotype influences epigenetic variation, as recently demonstrated by several studies34-39. Loci harbouring genetic variants that influence methylation state have been termed methylation quantitative trait loci (methQTLs)34. For most methQTLs, the correlation with genotype is most pronounced in cis. There is some evidence that genetic variation can also influence epigenetic states in trans, but this does not seem to be as prevalent as cis-effects38. Also, it is important to note that in most of these previous studies, the true causative genetic variant was not unequivocally identified, and the majority of methQTLs didn’t demonstrate a strict one-to-one relationship between cis-genotype and epigenotype. Rather, a given genotype generates an increased probability of methylation. Feinberg and Irizarry have recently argued for the existence of genetic variants in mouse and human genomes that do not change the mean phenotype but rather the variability of phenotype; this could be mediated epigenetically via variably methylated regions (VMR, see also Box 1)2. The existence of methQTLs provides a strong argument for integrated GWAS/EWAS to uncover genotypes that exert their function through epigenetic variation (discussed later).
methQTLs can also affect allele-specific methylation (ASM, see also Box 1). In this context, the steady-state methylation levels differ across the two alleles within the same cell. However, ASM can also occur in the absence of any specific genotype-epigenotype correlations. For example, parental imprinting, X-inactivation and random mono-allelic methylation are all instances of ASM that are not due to differences in underlying genotype between methylated and unmethylated alleles.
Finally, it is also worth considering the possibility that in some cases disease-associated epigenetic variation could arise prior to disease-onset, but still not be causative for the disease per se. This type of epiphenomenon could be due to confounding, when an environmental factor such as smoking, or a genetic variant, induces both aberrant epigenetic states and disease.
These potential relationships between epigenetic variation and complex disease have important implications for the design and analysis of EWAS. First, they will determine the most relevant tissue and cell types to be sampled. Second, ‘reverse causation’ and confounding are particular issues for EWAS study design. Despite the considerable evidence of epigenetic perturbations in cancer4, and emerging evidence in other non-malignant diseases22,23,40-42, none of these studies has been able to conclusively distinguish causal from consequential epigenetic variants, a problem that has long been recognised43. Although any EWAS association with disease is potentially an advance, being able to identify the direction of causality will greatly aid in determining the utility of the epigenetic variation, e.g. as a marker of disease progression, as a target for reversal by treatment with an epi-drug, or as a measure of drug response by monitoring the kinetics of drug-induced epigenetic changes.
One of the major developments that enabled large-scale GWAS was the arrival of powerful but affordable genetic profiling technologies, in particular SNP arrays. Only recently have epigenomic profiling technologies reached the stage that large-scale EWAS have begun to be feasible. This requires that the mark/molecule is stable, amenable to high-throughput analysis, easily accessible in routine clinical samples, and that automatable whole-genome profiling methods are available. Currently, DNAm (and specifically CpG methylation) is the most suitable mark for EWAS. Other epigenetic marks may be as or more important, but are neither yet as easily accessible as DNAm in clinical specimens nor as amenable to high-throughput processing. In addition, there are numerous well-established correlations between different epigenetic marks, and hence profiling DNAm can, albeit indirectly, provide information about histone modification states and RNA dynamics5.
In principle, both sequencing- and array-based profiling technologies can be used for EWAS. The most common of these technologies have been extensively reviewed44 and independently benchmarked45,46, and are listed in Box 2. As is typical for this type of study, the choice comes down to balancing coverage, resolution, accuracy, specificity, throughput and cost47. Ultimately sequencing-based technologies are likely to prevail, but array-based methods like those used for GWAS are in our view currently the most suitable methods for EWAS. As described in Box 2, there are options for custom and off-the-shelf platforms covering the choices described above. Of these, the recently released Illumina 450K Infinium Methylation BeadChip looks in our view most promising for the first wave of EWAS, offering a good balance of genome-wide coverage (>450K CpG sites), resolution (single base pair) and throughput (12 samples per chip and up to 96 samples per run).
Lack of suitable technology has been a major bottleneck for EWAS in the past. Fortunately, this is no longer the case and a variety of both array- and sequencing-based methods are now readily available. As these have already been extensively reviewed44,47,80 and benchmarked45,46,81,82, they are only briefly described here, along with some additional technologies that may also be suitable for EWAS, as guidance on the variety of choices available.
CHARM83: Comprehensive High-Throughput Relative Methylation; utilizes methylation-sensitive restriction enzymes.
Infinium84: The Infinium assay uses two different bead types (for methylated and unmethylated DNA) to detect CpG methylation of bisulfite treated DNA; utilizes chemical conversion of DNA.
Technologies that can be used in conjunction with arrays or sequencing:
HELP-chip/seq85: HpaII tiny fragment Enrichment by Ligation-mediated PCR; utilizes methylation-sensitive restriction enzymes.
MethylCap-chip/seq86: Methyl capture using the methyl binding domain of protein MeCP2; utilizes affinity enrichment.
BS-seq8: Whole-genome Bisulfite Sequencing; utilizes chemical conversion of DNA.
RRBS91: Reduced Representation Bisulfite Sequencing; utilizes chemical conversion of DNA.
Of these, the BS-seq approach - bisulfite conversion of randomly fragmented DNA followed by sequencing - provides the highest level of coverage and resolution, negligible bias towards CpG dense regions, and a direct read-out of non-CpG methylation92,93. Like all methods based on bisulfite conversion, BS-seq is not able to distinguish between methylated and hydroxymethylated cytosine bases94. With the exception of the reduced representation method (RRBS), which provides about 10% genome coverage, BS-seq is currently too expensive for EWAS profiling, although costs keep falling rapidly. Affinity-based enrichment methods such as MeDIP-, MethylCap- and MBD-seq are more economical and highly automatable95 but are less quantitative and don’t provide single base resolution. In our view, the recently released Infinium 450K BeadArrays seem well suited for EWAS profiling with respect to throughput, cost, resolution and accuracy. However, like other non-sequencing-based methods, this assay is susceptible to certain polymorphisms not known or considered at the time the array was designed.
Of course, the trade-off with all these methods is that many CpG sites are not profiled. As there is no epigenomic equivalent of the HapMap project which helped elucidate some of the genetic variation in the human genome77,78, we are not aware of the level of normal epigenetic variation that exists in human populations, or even which sites are the most relevant for disease aetiology. A true understanding of complex disease epigenomics will therefore only be realized when whole-genome methods become more affordable, possibly using techniques such as nanopore96 and single molecule real-time97 sequencing which are currently being developed. These will allow direct (i.e. no bisulfite, restriction or enrichment modifications required) and simultaneous determination of DNA methylation, DNA hydroxymethylation and DNA sequence in a single reaction.
In this section, we discuss the most informative study designs for EWAS with respect to types of study subjects and addressing the issue of reverse causation. Figure 1 illustrates some of the advantages and disadvantages for the four examples discussed.
The most commonly used GWAS design involves unrelated individuals recruited on the basis of their phenotype (e.g. cases and controls). Many case-control samples are already available, in some cases with genotype and expression data that can be integrated with epigenomic data. However, a retrospective study cannot determine whether the identified epigenetic variants are due to disease-associated genetic differences, post-disease processes or disease-associated drug interventions. Early examples of using case-control studies to identify associations between epigenetic variation and clinically relevant phenotypes have included studies on metabolic dysfunction48 and treatment with tamoxifen49.
Family-based study designs could be useful in EWAS that aim to identify transgenerational transmission of epigenetic marks (Box 3). It has recently been demonstrated that feeding F0 male mice either a high-fat or low-protein diet from weaning to the time of mating results in F1 offspring with altered metabolic phenotypes28,50. Given that the sperm passes on very little, if any, cytoplasmic material to the offspring, these examples suggest the transgenerational transmission of epigenetic variants induced by the sub-optimal diet of the F0 males. A similar strategy using epigenomic profiling of parent-offspring trios could be used in humans. For example, if there is evidence to suggest that paternal environment influences phenotypic outcomes in the offspring, then one could perform integrated epigenomic and genomic profiling in the offspring to identify altered epigenetic variants, and the genetic information could be used to eliminate the possibility that genetic modifiers are causing the epigenetic variation. Such study designs will need to employ profiling methods able to detect allele-specific differences, be adequately powered, and have reliable measures of parental environmental exposures.
In mammals, epigenetic states are extensively reprogrammed between generations, and this is associated with the reinstatement of the pluripotent state that exists in very early development. However, a few studies have shown that occasionally epigenetic states are not completely reprogrammed, resulting in the transgenerational transmission of epigenetic states. The strongest evidence for this phenomenon in mammals comes from various mouse models such as Avy, and AxinFu (Refs29,30). In these models, the characteristic phenotype is associated with DNA methylation variation at the relevant locus. Interestingly, these states are not always completely reprogrammed between generations, thereby resulting in the range of phenotypes in the offspring being influenced by the phenotype of the parent, even in the absence of genetic heterogeneity. Establishing transgenerational epigenetic inheritance in humans is a far more challenging task since the outbred nature of human populations means that it is difficult to distinguish true epigenetic inheritance from the inheritance of genetic variants that determine variable epigenetic states. Nevertheless, several reports suggest that transgenerational epigenetic inheritance in humans may occur. If true, then we may need to reconsider whether some estimates of heritability are confounded by transgenerational epigenetic inheritance. For example, a given epigenetic state may be induced in the germline by environmental factors such as diet, and these states are passed on to the next generation, ultimately influencing phenotypic outcomes98. Indeed, in rats it has recently been demonstrated that a high-fat diet in fathers alters beta islet function in the daughters28. The true extent of this phenomenon is expected to become clearer in coming years.
MZ twins discordant for a disease of interest represent a useful resource for EWAS as any identified disease-associated epigenetic variant cannot be due to germline genetic variation32,51. However, unless the twins are recruited longitudinally, which is rarely possible, these studies cannot be used to distinguish between cause and consequence for the reasons discussed earlier. Recruiting large numbers of discordant MZ twins for a well-powered study is a potential problem, but some large twin resources are available (see under Links).
Longitudinal cohort designs follow initially disease-free people (ideally from birth) over the course of many years, recording disease events and other phenotypic changes and taking biological samples. They are expensive to establish, but many such studies are already underway, some involving appropriate tissues for EWAS (see under Links). For example, the British 1946 birth cohort52 offers samples and data spanning 65 years so far for over 5000 individuals. Two major advantages of such studies, compared with many case-control designs, are the avoidance of confounding due to differences in the recruitment of cases and controls, and of bias due to case-control differences in the measurement of risk factors. Longitudinal studies can also be invaluable for establishing the temporal origins and stability of disease-associated epigenetic variation, and hence help to distinguish causal from consequential epigenetic variants. If environmental influences are also recorded, it may be possible to relate these to epigenetic changes.
Longitudinal cohorts of disease discordant MZ twins would convey the additional advantage of ruling out genetic influences on disease-associated epigenetic variation, but such cohorts are rarely available for EWAS of common diseases. A compromise two-phase study design, involving a disease-discordant MZ twin cohort for the discovery phase and a different longitudinal cohort for the replication phase is discussed below.
In GWAS, most tissue types are suitable for identifying germline genetic variation and DNA extracted from patient blood or blood cell-derived cell lines is usually used. However, disease-associated epigenetic variation can be tissue-specific. Since the majority of EWAS use live individuals, DNA samples can only be easily accessed from certain sources such as blood, buccal cells, saliva, hair follicles, urine and faeces. Blood and blood cell subtypes, for instance, are relevant for autoimmune diseases or blood-based cancers, and any tissue will suffice if the epigenetic variant is present soma-wide (as will be the case if induced during developmental reprogramming in early embryogenesis).
However, for many diseases alternative tissue sources need to be explored. These could include assaying cell-free serum DNA which comprises DNA from proliferating cells that is shed into the blood (as happens for most cancers), or post-mortem DNA, which is however less suitable if the aim is to establish causality. In fact, until epigenomic profiling can be routinely performed non-invasively (e.g. through imaging techniques53) and/or using very small tissue biopsies54, it will remain challenging to perform effective EWAS for brain-based and certain other diseases.
Another important issue is tissue heterogeneity. All tissues are composed of multiple cell types (e.g. blood contains >50 distinct cell types). If the disease-associated variation is restricted to a certain cell type that represents only a small proportion of the tissue sampled, then the variation may not be detected. The disease state itself can also alter the composition of cell types in a tissue (e.g. inflamed tissue will have a slightly different composition of cell types than non-inflamed tissue), and hence measured epigenetic differences between cases and controls may only reflect differences in cell type composition and not true epigenetic differences.
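The composition effect described above can be made concrete with a small arithmetic sketch. It assumes that the methylation level measured in a tissue sample is the fraction-weighted average of the levels in its constituent cell types; the cell types, fractions and methylation values below are hypothetical and chosen only for illustration.

```python
def bulk_methylation(cell_fractions, cell_type_methylation):
    """Tissue-level methylation at one CpG site, computed as the
    fraction-weighted average of per-cell-type methylation levels."""
    if set(cell_fractions) != set(cell_type_methylation):
        raise ValueError("cell types must match in both inputs")
    total = sum(cell_fractions.values())
    weighted = sum(
        cell_fractions[c] * cell_type_methylation[c] for c in cell_fractions
    )
    return weighted / total


# Hypothetical per-cell-type methylation at one CpG site.
# These values are IDENTICAL in cases and controls.
site_methylation = {"lymphocyte": 0.90, "neutrophil": 0.30}

# Hypothetical cell-type proportions; only the mixture differs.
control_mix = {"lymphocyte": 0.40, "neutrophil": 0.60}
case_mix = {"lymphocyte": 0.30, "neutrophil": 0.70}

control_level = bulk_methylation(control_mix, site_methylation)
case_level = bulk_methylation(case_mix, site_methylation)

print(f"controls: {control_level:.2f}, cases: {case_level:.2f}")
```

In this toy example the measured level drops from 0.54 in controls to 0.48 in cases, although no cell type has changed its methylation state, which is why cell-sorted samples or recorded cell counts are valuable when interpreting case-control differences.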
Finally, blood-spot (or Guthrie) cards are another valuable source of DNA. These are routinely created in many developed countries immediately after birth using either cord- or heel prick blood. Biobanks that include DNA and possibly other tissue, as well as phenotypic information, have been set up in several countries (see Links for examples).
There isn’t a single EWAS design that will suit all purposes, but rather the most suitable design depends on the required outcome. This is best illustrated in the form of two hypothetical examples, from the many possible EWAS designs that could be conducted:
Let’s assume that we are interested in identifying DNAm variants that arise prior to the onset of an autoimmune disease. We could start by performing genome-wide DNA methylation analysis of MZ twins discordant for the disease to identify disease-associated MVPs in immune-effector cells (i.e. a disease-relevant blood cell subset) that cannot be due to genetic variation. Then, we could take these MVPs and assay them in the same type of immune-effector cells from a prospective cohort, to look at DNAm at these sites in unrelated individuals sampled both before and after disease onset. Any MVPs that can be validated prior to disease onset are then candidate causal variations, and cannot be due to post-disease effects such as long-term medication or immune-related effects. Key follow-up studies could include correlation with gene expression and other epigenetic marks to investigate the affected pathways. Overall, this EWAS design combines analysis of a disease-relevant tissue from two independent cohorts that allow for discovery and validation of MVPs and elimination of various confounding factors.
Several cancer studies have identified epigenetic variants that can potentially be used to monitor disease progression and even response to treatment4. Some of these variants were detected by assaying DNA shed by the primary tumour into the patient’s serum, hence providing a relatively straightforward means of assessing progression55. An EWAS could also measure the DNAm state in serum from singleton patients that suffer from a given form of cancer, prior to, during, and following drug treatment. This could potentially identify epigenetic markers that predict the best response to treatment in real time. The root cause of the cancer-associated epigenetic variants (i.e. genetic or environmental) need not be known, nor would the primary tumour need to be directly analyzed, for the variant to be an effective measure of progression or response.
In 2005, just as the GWAS wave was about to break, Wang et al. published an influential review56 arguing for large sample sizes to detect small effects, and they highlighted the role of both minor allele frequency (MAF) and effect size in determining the power of a test of SNP association. They also discussed predictions from population genetics theory of the MAF spectrum over SNPs within a population, and the (limited) theory and data to predict effect size distributions. The corresponding arguments are no less compelling for EWAS, but the relevant parameters are even more difficult to predict, because of the paucity of data and relevant theory. DNA alleles do not typically vary across cells, and can now be typed with very low error rates. By contrast, methylation states may be tissue-specific, and can vary over cells within a tissue, over alleles within a cell (ASM) and in rare cases over DNA strands within an allele (hemi-methylation). Thus, for a tissue sample from one individual, the methylation state measured at a CpG site lies between zero and one, since it is an average over cells, alleles and strands and is further blurred by measurement error. Here, we use the limited available information about frequency spectra of DNA methylation variants, and their effect sizes for common disease, to tentatively propose power calculations under three scenarios. It remains unclear how realistic the proposed scenarios are but we hope at least to stimulate further discussion and investigation into this important aspect of EWAS study design.
A recent methylome analysis reported that on average 68% of CpG sites were methylated in human peripheral blood mononuclear cells57. There was great variation across genomic contexts: CpG sites in regions of high CpG density were almost always unmethylated, as were CGIs and 5′-UTRs; by contrast, 3′-UTRs, introns and repetitive elements were predominantly methylated. The rate of ASM was estimated to be between 0.3% and 0.6% (more than that attributable to imprinting alone). Hemi-methylation was found to be very rare (<0.2% which included non-CpG methylation and incomplete bisulfite conversion). The methylation spectrum was not symmetric: there were few sites close to being 100% methylated, but almost entirely unmethylated sites were not uncommon.
In Figure 2 (a,b) we have hypothesized methylation spectra for three different classes of individuals (“methylated”, “intermediate” and “unmethylated”) in order to generate overall frequency spectra in cases and controls. These form the basis of the power simulations, reported in Table 1. The difference in mean methylation rate between cases and controls provides a popular summary of effect size, but it does not reflect differences in variances or other features of the methylation spectrum. It also does not reflect the relative magnitude of methylation rates, whereas if a rare epigenotype in controls is almost absent in cases, this is likely to be more important than the same difference of mean rates for a more common epigenotype.
Odds ratios are well-established measures of genetic effect sizes for binary phenotypes. If we regard the mean methylation rate at a site in cases (or controls) to represent the methylation probability for a randomly-chosen DNA strand in the case (or control) tissue samples, then we can compute a methylation odds ratio. We call this methOR; it is the same as the ordinary OR except that the sampling unit is a DNA strand, rather than an individual. Thus, the methOR is the odds for a random DNA strand in the tissue sample from a random case to be methylated, divided by the same odds for controls. This provides a measure of effect size that incorporates relative magnitudes but, like the difference in mean rates, it does not capture differences between cases and controls in other features of the methylation spectrum, such as its variance. As for other odds ratios, methOR is comparable across prospective and retrospective studies, and its value only measures association and does not imply causation.
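As a minimal sketch of the methOR definition above, the following Python snippet computes the odds ratio from two mean methylation rates. The function names and the numerical rates are hypothetical and chosen only for illustration; they are not taken from Table 1 or Figure 2.

```python
def odds(p: float) -> float:
    """Odds corresponding to a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("methylation rate must lie strictly between 0 and 1")
    return p / (1.0 - p)


def meth_or(mean_cases: float, mean_controls: float) -> float:
    """Methylation odds ratio: odds of methylation in cases over controls."""
    return odds(mean_cases) / odds(mean_controls)


# Hypothetical mean methylation rates at one CpG site (illustrative only).
mean_cases, mean_controls = 0.30, 0.25
print(f"difference in means: {mean_cases - mean_controls:.3f}")
print(f"methOR: {meth_or(mean_cases, mean_controls):.3f}")

# The same 0.05 difference at a rarely methylated site gives a larger methOR,
# which is the sense in which methOR reflects relative magnitudes.
print(f"methOR at low rates: {meth_or(0.07, 0.02):.3f}")
```

The second call illustrates why two sites with equal differences in mean rate can have quite different methORs.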
Table 1 gives simulation-based power estimates for three sets of methylation spectra from Figure 2. They have similar methORs, while the case-control differences in mean methylation rates are the same for (a) and (b) but not (c). The fact that the power values differ between (a) and (b) emphasizes that there is no single-number measure of effect size since power depends on the entire methylation spectra in cases and controls. However, for the logistic regression analysis conducted in our simulations, methOR gives a better guide to power than the difference in rates. When methOR is around 1.25, a sample size of 800 cases + 800 controls is adequate to achieve 80% power at a significance level of α = 10⁻⁶ for scenario (c), but not (a) or (b) (see next section for a discussion of genome-wide significance for EWAS). When methOR is around 1.5, a sample size of 400 + 400 gives 80% power at α = 10⁻⁶ for (b) and (c), but not (a).
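The general shape of a simulation-based power estimate can be sketched as follows. This is an illustration under stated assumptions and not the procedure behind Table 1: the three classes of individuals are modelled with hypothetical Beta distributions, the class frequencies in cases and controls are invented, and a two-sided z-test on mean methylation levels is swapped in for the logistic regression so that the sketch needs only the Python standard library.

```python
import random
from statistics import NormalDist

# Hypothetical Beta(a, b) shapes for the three classes of individuals:
# "unmethylated", "intermediate" and "methylated" (not the Figure 2 spectra).
CLASS_BETAS = [(2, 18), (10, 10), (18, 2)]


def draw_levels(rng, n, class_probs):
    """Draw n methylation levels from a mixture of the three Beta classes."""
    shapes = rng.choices(CLASS_BETAS, weights=class_probs, k=n)
    return [rng.betavariate(a, b) for a, b in shapes]


def mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def simulate_power(n_cases, n_controls, case_probs, control_probs,
                   alpha=1e-6, n_rep=100, seed=1):
    """Fraction of simulated studies whose two-sided z-test has p < alpha."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    hits = 0
    for _ in range(n_rep):
        m1, v1 = mean_and_var(draw_levels(rng, n_cases, case_probs))
        m0, v0 = mean_and_var(draw_levels(rng, n_controls, control_probs))
        z = (m1 - m0) / (v1 / n_cases + v0 / n_controls) ** 0.5
        hits += abs(z) > z_crit
    return hits / n_rep


# Hypothetical class frequencies (unmethylated, intermediate, methylated).
controls = (0.6, 0.3, 0.1)
cases = (0.5, 0.3, 0.2)
for n in (400, 800):
    print(n, simulate_power(n, n, cases, controls))
```

Changing the class frequencies or Beta shapes while holding the difference in means fixed gives a quick way to see how power depends on the whole spectrum rather than on a single summary.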
Very little is currently known about actual differences in methylation spectra at epigenetic variants implicated in disease, and recommendations about sample size will need to evolve with emerging data. A recent report58 on the effects of smoking on methylation identified one very strong association at a CpG site located in F2RL3, for which the median methylation rates were 95% for never-smokers and 83% for heavy smokers, giving a difference of 12% and methOR=2.7. Methylation status was much less variable in never-smokers than in heavy smokers (inter-quartile ranges 0.94-0.96 and 0.78-0.88, respectively). For such a strong effect the sample size of 65 heavy smokers and 56 non-smokers was adequate to detect the association, but smoking is known to be among the most important environmental factors for health and other effect sizes of interest are likely to be much smaller. If we regard 1.5 to be a target methOR value, then it would not seem cost-effective to pursue an EWAS with fewer than 400 cases and 400 controls, and 800 of each would be preferable to achieve good power. This is much less than the 2,000 cases and controls that became the de facto standard minimum sample size for GWAS following the Wellcome Trust Case Control Consortium (WTCCC) study59, reflecting the fact that effect sizes for EWAS and GWAS are not directly comparable. It seems likely that effect sizes and hence power will vary substantially according to genomic context, in which case genome-wide ranking by p-values is unsatisfactory60 and Bayesian measures of support that take power into account are more appropriate. Currently, however, there remains little information to inform Bayesian prior distributions of effect sizes.
In GWAS, the establishment of genome-wide thresholds for significance is complicated by correlations between the genotyped SNPs61. In EWAS, there are analogous correlations among DNAm sites in DMRs, but these correlations typically extend to at most a few kilobases, though to date they have only been reported in non-disease contexts. Based on what we discussed above on co-methylation, ASM and hemi-methylation, the vast majority of CpG methylation can be expected to be symmetric across strands and across alleles in somatic cells. Thus, the ~28 million CpG sites in the haploid human genome correspond, due to correlation within DMRs and methylation symmetry, to substantially fewer independent methylation states. If a set of 500K CpG sites were evenly spaced, the average spacing between sites may be large enough to allow an assumption of independence, in which case a significance level α = 10⁻⁶ per site gives a probability of about 0.39 of at least one false positive (= global type 1 error rate) under the null, and this might be regarded as a liberal threshold for a possible EWAS association. If 5 million CpG sites were assayed, we would expect 5 false positives under the null at this α level. Correlation among neighbouring sites means that a specific calculation is required to identify a stringent standard for epigenome-wide significance (global type 1 error < 0.05), which will typically lie between 10⁻⁸ and 10⁻⁷.
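The arithmetic behind such thresholds can be checked with a short Python sketch. It assumes fully independent sites, which is the simplifying assumption made above and not a property of real data, and it uses a Bonferroni bound as one conventional way to obtain a global type 1 error below 0.05; the function names are hypothetical.

```python
import math


def expected_false_positives(n_sites: int, alpha: float) -> float:
    """Expected number of false positives under the null."""
    return n_sites * alpha


def prob_any_false_positive(n_sites: int, alpha: float) -> float:
    """P(at least one false positive) for independent tests under the null."""
    # 1 - (1 - alpha)^n_sites, computed stably for very small alpha.
    return -math.expm1(n_sites * math.log1p(-alpha))


def bonferroni_threshold(n_sites: int, global_alpha: float = 0.05) -> float:
    """Per-site level that bounds the global type 1 error by global_alpha."""
    return global_alpha / n_sites


for n_sites in (500_000, 5_000_000):
    print(
        f"{n_sites:>9,d} sites: "
        f"expected FP at 1e-6 = {expected_false_positives(n_sites, 1e-6):.2f}, "
        f"P(any FP) = {prob_any_false_positive(n_sites, 1e-6):.3f}, "
        f"Bonferroni level = {bonferroni_threshold(n_sites):.1e}"
    )
```

With correlated neighbouring sites the effective number of independent tests is smaller than the number of sites assayed, so the Bonferroni level is conservative.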
GWAS can be affected by two sources of confounding. Firstly, with retrospective ascertainment there is a risk of systematic differences between cases and controls in the handling or processing of samples (known as technical confounding, which includes batch effects)62,63. Similar problems are possible for EWAS. Secondly, confounding can arise because the ancestry of cases differs systematically from that of controls (known as population structure and cryptic relatedness)64. This causes confounding in GWAS because any polygenic contribution to disease causation is correlated with ancestry, and environmental exposures may also be correlated with ancestry for example due to different geographic locations of ancestors. Whether or not “polyepigenetic” effects exist seems unclear, but environmental exposures correlated with ancestry seem likely to impact epigenetic studies.
Unlike in GWAS, environmental factors can also directly confound an EWAS by affecting both epigenotype and phenotype, which can inflate type 1 error and exaggerate effect size estimates. Potential confounders such as age65 and smoking behaviour should, if possible, be adjusted for in a regression analysis. Even if a measured covariate is not a confounder but, for example, has an independent effect on phenotype, adjusting for it can allow better delineation of the direct epigenetic effect.
Fortunately, the large numbers of SNPs in a GWAS allow many possibilities to detect and correct confounding63, including genome-wide adjustment of association statistics, regression adjustment using principal coordinates and mixed regression models64. Similar methods are likely to be effective to detect and adjust for confounding in EWAS. For example, leading principal coordinates of genome-wide methylation states may encapsulate unmeasured confounders, so if these are also correlated with phenotype then it may be appropriate to include these as covariates in a regression analysis, as is common for GWAS analyses. Indeed, if GWAS data are also available on the EWAS study individuals, it may be appropriate to adjust for leading principal coordinates of both genetic and epigenetic states.
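To illustrate how a leading principal coordinate of methylation states might pick up an unmeasured confounder, the sketch below extracts per-individual scores on the first principal component by power iteration. It is a toy example under stated assumptions: the data are simulated with a hypothetical batch shift, the function name is invented, and the pure-Python linear algebra is used only to keep the example self-contained (a real analysis would use a numerical library and far more sites).

```python
import random


def leading_pc_scores(matrix, n_iter=200, seed=0):
    """Unit-norm scores of each row (individual) on the leading principal component.

    matrix: list of rows, one per individual, of methylation levels per site.
    Scores are defined only up to sign.
    """
    n, m = len(matrix), len(matrix[0])
    col_means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centred = [[row[j] - col_means[j] for j in range(m)] for row in matrix]
    # Gram matrix of centred individuals; its top eigenvector is proportional
    # to the scores on the first principal component.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centred] for ri in centred]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(n_iter):  # power iteration
        w = [sum(g * x for g, x in zip(row, v)) for row in gram]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


# Simulated data: 20 individuals x 50 sites, with a hypothetical unmeasured
# confounder (e.g. processing batch) shifting 25 sites in half the individuals.
rng = random.Random(42)
batch = [0] * 10 + [1] * 10
data = [[0.5 + (0.2 if b == 1 and j < 25 else 0.0) + rng.gauss(0.0, 0.05)
         for j in range(50)] for b in batch]
scores = leading_pc_scores(data)
for b, s in zip(batch, scores):
    print(b, round(s, 3))
```

In this toy setting the scores separate the two batches, so including them as a covariate in the association regression would absorb the batch effect.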
The values in Table 1 assume a single-stage study but as discussed above the possibilities of confounding, correlation with genotype and of reverse causation often argue for a two-stage study design, for example including a discordant MZ twin stage followed by a longitudinal cohort stage. In simple settings it is optimal if the sample size in each stage is inversely proportional to the square root of the cost per individual in that stage66. The question arises as to whether the second stage should assay all the sites from the first stage, or whether costs can be reduced by assaying in stage 2 only a limited set of “hits” from stage 1. The relatively low cost and additional information argue for the former strategy in general, unless stage 1 is large enough to eliminate all but a handful of potential hits. In either case it is broadly speaking optimal to conduct a single, joint analysis of results from both stages. If stage 1 involves MZ twin pairs, a paired analysis may be appropriate (such as a paired t-test) if there is substantially more variation among than within twin pairs. A combined two-sample case-control analysis is then not appropriate, but it is straightforward to combine test statistics from the two stages using standard meta-analysis techniques.
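The allocation rule and the combination of stage-specific results can be sketched in Python as follows. Several choices here are assumptions made for illustration: a fixed total budget is used to set the scale of the allocation, the per-individual costs are invented, and a weighted Stouffer z-score combination stands in for "standard meta-analysis techniques" (other methods, such as inverse-variance weighting of effect estimates, would serve equally well).

```python
import math
from statistics import NormalDist


def allocate_two_stage(budget, cost_stage1, cost_stage2):
    """Sample sizes with n_i proportional to 1/sqrt(cost_i) that spend the budget."""
    roots = (math.sqrt(cost_stage1), math.sqrt(cost_stage2))
    scale = budget / sum(roots)
    return tuple(scale / r for r in roots)


def combine_z(z_scores, weights):
    """Weighted Stouffer combination of per-stage z statistics.

    Returns the combined z and its two-sided p-value.
    """
    num = sum(w * z for w, z in zip(weights, z_scores))
    den = math.sqrt(sum(w * w for w in weights))
    z = num / den
    p = 2.0 * NormalDist().cdf(-abs(z))
    return z, p


# Hypothetical costs per individual: twin stage 400, cohort stage 100.
n1, n2 = allocate_two_stage(budget=100_000, cost_stage1=400, cost_stage2=100)
print(f"stage 1: {n1:.0f} individuals, stage 2: {n2:.0f} individuals")

# Hypothetical z statistics, e.g. from a paired t-test in stage 1 and a
# case-control regression in stage 2, weighted by sqrt(sample size).
z, p = combine_z([3.0, 4.0], [math.sqrt(n1), math.sqrt(n2)])
print(f"combined z = {z:.2f}, two-sided p = {p:.2e}")
```

Because the two stages use different test statistics, combining on the z-score scale avoids having to pool the raw data into a single two-sample analysis.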
Particularly in the early days of GWAS, replication of hits in an independent study was important in weeding out false positives that arose through technical or design flaws in the initial study. Arguably GWAS study design has improved to the extent that replication is less crucial now since there are many checks available on the quality of the primary study, but replication is still seen as highly desirable and is typically relatively easy to achieve. Ideally replication should be carried out by an independent group of researchers, preferably using a different study design and different laboratory techniques, yet studying the same polymorphism in the same population and with the same phenotype definition. In practice it is impossible to demand all this, and what constitutes a satisfactory compromise is a matter of debate, although there are some broad points of consensus67. For EWAS, the same issues arise and in addition the issues of correlation with genotype and reverse causation should both be addressed in replicate analyses. Thus replication is potentially more demanding for EWAS than GWAS, yet limited availability of tissue samples and study subjects means that replication will be harder to achieve. As EWAS begin to develop it would be inappropriate for reviewers and editors to impose overly strict replication requirements analogous to those used in the current mature phase of GWAS. In particular, we should avoid any encouragement for researchers to hold back samples or resources from the primary study in order to use them later to claim “replication”. Lessons should be learned from the GWAS experience: the primary study needs to be well powered, and rigorous quality checks imposed on the EWAS data. If replication is not immediately feasible this should not preclude publication, but the need for further confirmation of results should be acknowledged.
The appropriate level of tolerance of false positives from the primary study depends on several factors, including the costs of follow-up analyses. If these costs are not excessive it may be optimal to initially tolerate some false positives in order to minimise false negatives. The field of EWAS needs to develop similarly to GWAS, with standards tightening over time with progressive learning from accumulated experience.
The ultimate aim of EWAS, like GWAS, is to provide a better understanding of disease aetiology, and to lead to the development of novel therapeutics and diagnostics. Typical follow-up experiments to determine the aetiological role of disease-associated epigenetic variation could include examining its correlation with other epigenetic modifications and how they collectively affect gene expression. This could be achieved using ChIP-seq experiments, either for the many histone modifications known to correlate with DNAm68 or for transcription factors whose binding may be modulated – positively or negatively – by methylation at their target sites69. If a large effect size can be determined for a single site, then one could validate the link to the disease-associated phenotype by modulating the expression of the gene in question either in in vitro systems or model organism studies. However, a more likely scenario is of many disease-associated epigenetic variants each conferring only a small disease risk, as is suggested by the few small-scale EWAS to date22,23,40-42. In this case, it may be more fruitful to use approaches that integrate both computational and experimental methodologies to look at perturbations of entire transcriptional networks. The issue of reverse causation is also important in post-EWAS experiments, both in terms of which variants to follow up, and the experimental approaches.
Even if the aetiological role of any identified epigenetic variant proves elusive, it may still be possible to use such variants as predictive biomarkers. In this regard, the combination of chemical stability and ontogenetic plasticity makes DNAm ideally suited as a biomarker. Translating any molecular marker including DNAm differences into clinically informative biomarkers has turned out to be more challenging70 than had been expected but progress has been made. Following earlier setbacks, a multicentre study identified, validated and replicated hypermethylation at SEPT9 as a blood-based DNAm biomarker for colorectal cancer in 2008 (ref. 71), leading to a commercial test in early 2010 (ref. 72). But enthusiasm is tempered with caution, as illustrated by the problems encountered by the cancer community in identifying biomarkers that predict which patients would benefit from a particular therapy70. The main problem has been the inability to select patients with a molecularly well-defined disease phenotype, due in large part to the heterogeneity of cancer tissues. Molecular heterogeneity is also an issue, though expected to be less important, for the common diseases that are being targeted by the first wave of EWAS.
Based on this experience, a systematic approach such as the recently launched OncoTrack project (see under Links) is needed to advance the field. Two bodies in particular – the Biomarkers Consortium and the AACR-FDA-NCI Cancer Biomarkers Collaborative – have recently issued a comprehensive report on the current state of affairs and future directions73. The response of the community has been positive, with calls like ‘Bring on the biomarkers’74 and pledges to replace the patched framework of fragmented research by a co-ordinated ‘big-science’ approach (such as OncoTrack) which has proved successful for efforts like the human and cancer genome projects. Based on this and other efforts, we can be cautiously optimistic that similar progress will also be made for epigenetic biomarkers.
The correlations that have been observed between genotype and epigenotype (methQTLs) are encouraging for the prospects of further integrated analysis. A recent study39 analysing SNPs, gene expression and DNAm in 77 HapMap cell lines identified SNPs that affect both gene expression and DNAm and provides evidence for shared genetic and epigenetic mechanisms affecting multiple QTLs. In this way, EWAS can be used to investigate genetic predispositions that exert their function through epigenetic mechanisms. A possible strategy involves designing a custom array tiled across haplotypes identified by disease-associated GWAS SNPs, profiling it for differential DNAm and analysing the data stratified by risk SNPs rather than by cases and controls. Using this strategy, a recent study75 successfully integrated GWAS and EWAS data to identify haplotype-specific DNA methylation (HSM) in a Type 2 Diabetes and Obesity susceptibility locus. In the future, it may well be possible to do similar analyses for additional and combinatorial epigenetic marks to capture certain chromatin disease states (e.g. based on altered bivalency status) that are currently not easily captured by DNAm. Using multivariate Hidden Markov analysis of recurrent and spatially coherent combinations of epigenetic marks, a recent study76 reported 51 distinct chromatin states for human T cells that look highly promising for possible integration with GWAS data of blood-based diseases.
The success of GWAS in identifying disease-associated genetic variations clearly warrants the development of complementary approaches to identify additional variations that cannot be captured with GWAS. As outlined in this article, EWAS has the potential to do just that by capturing disease-associated epigenetic variations such as differential DNA methylation.
The single most useful resource empowering GWAS was the availability of a detailed SNP map of the human genome77,78 which allowed the selection of so-called tag SNPs for comprehensive variation coverage and cost-efficient profiling. DNAm is correlated over tissue-specific blocks of CpG sites spanning up to 1 kb79. Knowledge of this block structure for different tissues and cell types has improved, and will continue to improve, the selection of CpG sites for EWAS as new methylome maps become available. Currently, such high-resolution maps are available for human embryonic stem cells, foetal fibroblasts and peripheral blood monocytes8,57, informing potential EWAS on early developmental disorders and blood-based diseases. As part of the recently launched International Human Epigenome Consortium (IHEC), 1000 reference epigenomes (including methylome maps) will be generated for human tissues and cell types over the coming years. In this context, these maps can be considered as the epigenetic equivalent to the human haplotype map and can be expected to significantly accelerate and improve our ability to conduct EWAS for many common diseases.
In addition to improving study design – for which we have discussed the key issues in this Review – the main challenge for EWAS will be access to appropriate samples. A useful starting point would be to establish the proposed Biobank Central (see under Links) which will allow researchers to electronically search for specific combinations of samples and associated data as required for EWAS. Initiation of new birth and other longitudinal cohorts should also be encouraged and existing collections should ensure that samples are suitable for EWAS and related studies that are likely to require chromatin (not just DNA) in the future. Finally, appropriately powered and designed EWAS need to be conducted to enable the development of tools for the analysis, interpretation and integration of EWAS data. To achieve this will require close cooperation between scientists, clinicians, resource providers and funding agencies as pioneered for GWAS. At the time of writing, the first wave of EWAS was still underway and an international conference (see under Links) has been arranged for later this year to discuss first results.
SB was supported by the Wellcome Trust (084071) and a Royal Society Wolfson Research Merit Award.
Biomarkers Consortium: http://www.thebiomarkersconsortium.org/
Exemplar Project: http://www.oncotrack.org/
Biobank Central: http://en.wikipedia.org/wiki/BioBank_Central
EWAS Conference: http://www.wellcome.ac.uk/conferences/epigenomics