The second general advantage of studying epigenetic patterns in twins is in identifying epigenetic variants that are linked to disease, using EWAS of disease-discordant identical twins. The disease-discordant twin approach holds great promise and has proven to be successful in identifying a number of epidemiological and environmental risk factors in complex phenotypes [21]. Disease-discordant identical twins can be seen as an ideal model, because twins are matched for most genetic variants, as well as many non-genetic effects such as early environment, maternal effects, and age and cohort effects. Furthermore, rates of twin discordance are higher than commonly believed, and are generally >50% for even the most common complex traits studied (Figure 1).
Figure 1. Monozygotic twin discordance rates for common disease. Estimates of mean monozygotic twin discordance rates from the literature and the TwinsUK cohort for a series of common diseases, such as colon cancer and breast cancer, rheumatoid arthritis (RA) ...
Several EWAS in disease-discordant twins have been published within the past year and the results show a trend: each study reported modest, but consistent, differential methylation in moderate to large numbers of genes relevant to the phenotype. We briefly describe results from three recent studies of common diseases in discordant twins, all of which were performed on the same promoter-specific DNA methylation platform (Illumina27K).
Dempster et al. examined whole-blood DNA methylation patterns in 22 monozygotic twin pairs discordant for schizophrenia or bipolar disorder. They identified many differentially methylated regions (DMRs), and pathway analysis of the top loci showed a significant enrichment for gene networks directly relevant to psychiatric disorders and neurodevelopment. The mean methylation difference between affected and unaffected co-twins was 6% at the top DMR, but varied considerably across the sample. Assuming a conservative Bonferroni-adjusted threshold (α = 1.9 × 10⁻⁶), the standard paired analysis did not surpass the multiple testing correction, but an analysis that took into account heterogeneity across families yielded genome-wide significant associations at the top DMRs.
Rakyan et al. examined DNA methylation in CD14+ monocytes from 15 type 1 diabetes (T1D) discordant monozygotic twin pairs. Assuming a conservative Bonferroni-adjusted threshold (α = 2.2 × 10⁻⁶), standard paired-analysis results did not surpass the multiple testing correction. However, the authors followed up the top 132 DMRs in four additional T1D-discordant monozygotic pairs and observed a similar direction of association effects. Pathway analysis indicated that several of the genes associated with the 132 DMRs were linked to T1D or the immune response. The authors also obtained longitudinal DNA methylation profiles in two additional datasets, which showed that the DMR variants were enriched in individuals both before and after disease onset, suggesting that the DMR effects arise early on in the etiological process that leads to T1D.
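The two Bonferroni thresholds quoted above can be related to the number of tests each one implies. The sketch below is illustrative only: it assumes a family-wise error rate of 0.05, which the text does not state, and the function names are hypothetical. The figure of 27,578 CpG sites is the nominal content of the Illumina27K array taken from the array's specification, not from the studies described here; the studies presumably tested fewer probes after quality control.

```python
# Minimal sketch of Bonferroni arithmetic for a methylation array scan.
# ASSUMPTION: family-wise alpha of 0.05 (not stated in the text).

FAMILY_ALPHA = 0.05


def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for n_tests independent tests."""
    return family_alpha / n_tests


def implied_n_tests(family_alpha: float, per_test_alpha: float) -> int:
    """Number of tests implied by a reported per-test threshold."""
    return round(family_alpha / per_test_alpha)


if __name__ == "__main__":
    # Thresholds reported for the two studies described above.
    for label, threshold in [("1.9e-6", 1.9e-6), ("2.2e-6", 2.2e-6)]:
        n = implied_n_tests(FAMILY_ALPHA, threshold)
        print(f"threshold {label} implies about {n} tests")

    # Threshold if every nominal probe on the array were tested.
    print(f"{bonferroni_alpha(FAMILY_ALPHA, 27578):.2e}")
```

Under these assumptions the two thresholds correspond to roughly 26,000 and 23,000 tested probes respectively, which would be consistent with probe filtering before analysis.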
Gervin et al. assessed DNA methylation and gene expression differences in psoriasis-discordant monozygotic twin pairs, using samples from CD4+ (17 monozygotic pairs) and CD8+ (13 monozygotic pairs) cells. The authors observed many DMRs and differentially expressed regions with small effects, which were not significant genome-wide. However, combined analysis of DNA methylation and gene expression identified genes where differences in DNA methylation were correlated with differences in gene expression, and several of the top-ranked genes were known to be associated with psoriasis. Gene ontology analysis revealed an enrichment of genes involved in biological processes associated with the immune response and in pathways comprising cytokines and chemokines, which have a clear role in psoriasis.
In each of the three studies there were many DMRs with modest effects, but these were often located in genes that are either known candidates for, or have apparent biological relevance to, the trait. These findings are especially exciting because of the overlap with molecular studies and genome-wide association study (GWAS) results, which imply that epigenetic studies of disease may prove to reveal not just markers of the disease process, but a novel approach to studying risk factors and mechanisms of complex phenotype susceptibility and progression. EWAS could therefore provide another route for the discovery of novel disease-associated SNPs. The EWAS performed to date have identified epigenetic variants with effect sizes larger than typical GWAS effects. For example, a recent DNA methylation study of smoking identified a DMR in a CpG site in the F2RL3 gene, coding for protease-activated receptor-4 (PAR4), at which median DNA methylation levels were 83% in heavy smokers and 95% in non-smokers, giving a difference of 12% methylation between the two groups [26]. This corresponds to an odds ratio of 3.9 of the epigenetic variant [27], which is approximately 3.5-fold greater than reported GWAS effects. However, EWAS findings also raise two important questions: first, why have genome-wide significant EWAS signals not yet been identified in known candidate genes; and second, are the identified changes causal or secondary to the trait?
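The odds ratio quoted for the F2RL3 example can be approximated directly from the two median methylation levels. The sketch below treats each methylation level as a proportion and takes the ratio of the corresponding odds; this is one plausible way to arrive at a value near 3.9 and is an assumption, since the calculation used in the cited reference is not described here. Function names are hypothetical.

```python
# Minimal sketch: odds ratio from two methylation proportions.
# ASSUMPTION: methylation levels are treated as simple proportions.


def odds(p: float) -> float:
    """Odds corresponding to a proportion p, with 0 < p < 1."""
    return p / (1.0 - p)


def methylation_odds_ratio(p_high: float, p_low: float) -> float:
    """Ratio of the odds of methylation in two groups."""
    return odds(p_high) / odds(p_low)


if __name__ == "__main__":
    # Median methylation: 95% in non-smokers, 83% in heavy smokers.
    or_smoking = methylation_odds_ratio(0.95, 0.83)
    print(round(or_smoking, 1))  # 3.9
```

Dividing this value by the stated 3.5-fold excess gives roughly 1.1, which is in line with the modest per-variant odds ratios commonly reported in GWAS.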
We believe that the first issue is a question of power. None of the studies so far have used large samples or high-resolution methylation (or other epigenetic) assays. Typically, studies have either used very small samples (n < 5) with high-resolution approaches such as bisulfite sequencing [28], or lower-resolution assays, such as Illumina27K, with modest sample sizes (n = 13 to 25) [4]. The power of these studies to detect disease-related differential DNA methylation effects will depend on many factors. These include variables describing the biology of DNA methylation, such as the initial trigger of the epigenetic variant and its stability through cell division, its effect size on the disease (or of the disease on the methylation variant), the coverage of the methylation assay, and sample size and study design. Kaminsky et al. estimated the power of the discordant twin study design, using a particular CpG-island microarray methylation variant in a candidate gene, and found reasonable power to detect DMRs with 15 twin pairs. However, formal power calculations for more extensive genome-wide coverage have not yet been reported in twins. Preliminary estimates from published DMRs report low (35%) to reasonable (>80%) power to detect DMRs at specific CpG sites, at methylation differences of 5 to 6% between affected and unaffected twins [4]. The observed variability of the reported methylation differences at the CpG site of interest (and the distribution of DNA methylation levels in the sample) will also impact power, as has been observed in traditional case-control DNA methylation power analysis [27].
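The dependence of power on sample size, effect size, and the variability of within-pair differences can be illustrated with a rough calculation. The sketch below uses a normal approximation to a two-sided paired test, which is an assumption and not the method used in the cited studies; it will tend to overstate power for small numbers of pairs, where a t-distribution would be more appropriate. The standard deviations of within-pair differences used in the example are hypothetical values chosen only to show the trend.

```python
# Minimal sketch: approximate power of a discordant-twin (paired) design.
# ASSUMPTION: within-pair methylation differences are roughly normal and
# a z-test approximation is acceptable; not the cited authors' method.
from math import sqrt
from statistics import NormalDist


def paired_power_normal_approx(mean_diff: float, sd_diff: float,
                               n_pairs: int, alpha: float) -> float:
    """Approximate power of a two-sided paired test at level alpha."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # Standardised effect for the mean of n_pairs within-pair differences.
    ncp = abs(mean_diff) * sqrt(n_pairs) / sd_diff
    # Probability of exceeding the critical value in either tail.
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


if __name__ == "__main__":
    alpha = 1.9e-6  # genome-wide threshold quoted earlier in the text
    # 6% mean within-pair difference; SD values are hypothetical.
    for sd in (0.04, 0.06, 0.08):
        for n in (15, 22, 50, 100):
            power = paired_power_normal_approx(0.06, sd, n, alpha)
            print(f"sd={sd:.2f} pairs={n:3d} power={power:.2f}")
```

The same mean difference can therefore be well powered or badly underpowered depending on how variable the within-pair differences are, which is the point made above about the observed variability at the CpG site of interest.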
The second disease-related differential methylation question is whether it is possible to distinguish epigenetic changes that are causal from those that arise secondary to disease. The identification of potential causal effects is exciting, but secondary effects can also help us to understand complex phenotype progression, and may lead to the determination of early diagnostic or prognostic markers. In both cases the therapeutic value of the results has great potential.
We propose two approaches to disentangle potential epigenetic cause from consequence in disease: first, integrating genetic-epigenetic data in phenotype analysis; and second, obtaining longitudinal epigenetic data before and after disease onset. Genetic-epigenetic studies would identify cases where genetic effects on the trait are potentially mediated by DNA methylation, and DNA methylation is therefore likely to be causal to the trait. In these cases genetic variants that are associated with the trait would also tend to be meQTLs for the CpG site, at which DNA methylation is also associated with the phenotype. However, the proportion of CpG sites in the genome where DNA methylation is under the influence of genetic effects seems to be relatively small (albeit based on low-resolution scans so far). In addition, the majority of genetic-epigenetic effects on the phenotype may already be identified in gene mapping studies of disease, and EWAS findings would in some cases only clarify potential mechanisms of action of already-identified GWAS signals. It is also possible that the genetic variant interacts with the epigenetic variant in disease susceptibility; for example, DMR effects may affect only disease-discordant monozygotic twins of a particular genotype. However, although genetic-epigenetic disease results imply causality, this is not necessarily always the case. It is possible that genetic associations lead to the phenotype of interest, which in turn drives changes in methylation and alters gene expression as a consequence.
The most conclusive approach to disentangle potential cause versus consequence of DNA methylation changes associated with disease is to perform longitudinal studies. In this case, the underlying cause of the DNA methylation effect can be genetic or non-genetic, and should be examined before, during, and after disease onset to help understand its role in disease onset and progression. Longitudinal studies are crucial to understanding epigenetic effects in disease and should be a priority when samples are available, which sadly is often not the case.
The main goal of longitudinal DNA methylation studies is to identify whether the DNA methylation change arose prior to disease onset and is therefore likely to be causal. If that is the case, it is important to note the timing of the change both before the appearance of the phenotype and potentially during intermediate pre-clinical phenotype states prior to final disease (for example, normoglycemic, pre-diabetic, diabetic). Obtaining such data will inform the biological model of epigenetic effects on disease. For instance, is there a threshold model similar to the second hit in retinoblastoma [31], which can be applied to DNA methylation effects during phenotype onset? If a threshold model is correct, then identifying the threshold of deleterious DNA methylation changes for each phenotype will be of clinical value. If longitudinal methylation studies identify effects that are likely to be causal to disease, then another immediate question is whether reversing these methylation effects during or after disease onset can help prevent, delay, or ameliorate the disease.
On the other hand, if longitudinal studies predominantly find that observed methylation changes are probably consequences of disease, then these findings can give insights into the mechanisms involved in disease progression. A related question is whether reversal of such changes can also reverse disease or prevent exacerbation of disease symptoms. This becomes further complicated in the case of relapsing diseases such as bipolar disorder, multiple sclerosis, or psoriasis, where there is a known or unknown trigger of the condition.
In conclusion, the early twin EWAS have provided us with fascinating insights into the potential power of the identical disease-discordant twin model to find novel susceptibility genes as well as novel disease mechanisms and potential drug targets. These results call for larger samples, replication, and more in-depth analyses, including genetic-epigenetic analyses and longitudinal assays, to establish the role of epigenetic variants in disease. Epigenetic effects may also play an important role in relapsing diseases such as bipolar disorder, multiple sclerosis and psoriasis, where there is a known or unknown trigger of the condition.