Disentangling epigenetic effects from the confounding influences of genetic and/or environmental heterogeneity represents a significant barrier to elucidating the etiological role of epigenetic variation in human complex disease. Herein lies the key advance made by our study, as the T1D–MVPs we report here represent the first example of disease-associated epigenetic variation that antedates clinical disease and cannot be explained by genetic heterogeneity, pharmacological treatment, or post-disease cellular dysfunction. Our results provide a platform from which future studies can address several key issues, which we discuss below.
First is the issue of causality. In GWASs, any disease-associated genetic variant is, or is linked to, a causative variant. In EWASs, on the other hand, the direction of the cause-consequence relationship is difficult to define unless an appropriate study design is employed. Specifically, the unrelated singleton “case versus control” design commonly used in GWASs is not appropriate, because epigenetic variation found to be associated with the disease could simply be due to the disease process itself or to disease-associated genetic variation. It is for this reason that we employed the study design described here: T1D–discordant MZ twin pairs combined with longitudinally sampled pre–T1D singletons, to rule out genetic differences and establish the temporal origins of T1D–associated epigenetic variation. Using this approach, we were able to demonstrate that T1D–MVPs antedate clinical disease. However, it will be important to further explore the temporal origins of the T1D–MVPs by analyzing samples obtained before the appearance of T1D–associated autoantibodies, which could help determine whether T1D–MVPs arise even before the sub-clinical immune phase. In this regard, longitudinal birth cohorts will be invaluable. If some T1D–MVPs were found before the sub-clinical autoantibody response, this would exclude their being simply secondary to the autoantibody-associated immune process, and the hypothesis that these T1D–MVPs are causing disease would be strengthened.
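The logic of the discordant-twin comparison can be illustrated with a minimal sketch. Because the two members of an MZ pair share a genotype, the within-pair methylation difference at a CpG cannot be attributed to inherited sequence variation. The code below computes that difference and a paired t statistic for a single CpG. The function name `paired_t` and all methylation values are hypothetical, and the paired t statistic is used purely for illustration; it is not a description of the statistical analysis performed in this study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(affected, unaffected):
    """Return (mean within-pair difference, paired t statistic) for one CpG.

    affected[i] and unaffected[i] are the methylation levels (0..1) of the
    T1D twin and the disease-free co-twin of pair i.
    """
    if len(affected) != len(unaffected):
        raise ValueError("each affected twin needs exactly one co-twin")
    # Differencing within pairs cancels everything the co-twins share,
    # including their germline genotype.
    diffs = [a - u for a, u in zip(affected, unaffected)]
    if len(diffs) < 2:
        raise ValueError("need at least two twin pairs")
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all within-pair differences are identical")
    d_bar = mean(diffs)
    return d_bar, d_bar / (sd / sqrt(len(diffs)))


# Hypothetical methylation levels at one CpG in five discordant MZ pairs.
affected_twin = [0.62, 0.55, 0.71, 0.48, 0.66]
unaffected_twin = [0.58, 0.52, 0.69, 0.43, 0.63]

if __name__ == "__main__":
    d_bar, t_stat = paired_t(affected_twin, unaffected_twin)
    print(f"mean within-pair difference = {d_bar:.3f}, t = {t_stat:.2f}")
```

In this toy example the absolute differences are small (a few percentage points of methylation) but consistent in direction across pairs, which is what makes the paired statistic large. An unpaired comparison of the same values would be dominated by between-pair variation.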
Second, establishing the temporal origins of T1D–MVPs will also be useful for elucidating their biological origins. For example, evidence that T1D–MVPs exist at birth (e.g. from birth-cohort studies) would suggest stochastic or environmental factors that operate in utero. Given that we have studied MZ twins, who are genetically identical and exposed to similar environments during childhood, an early-life stochastic origin of T1D–MVPs is an attractive idea. Indeed, stochastic epigenetic variation in humans is more common than previously appreciated, as demonstrated by the recent genome-scale analysis of DNA methylation profiles in 114 monozygotic (MZ) and 80 dizygotic (DZ) twins. A potential source of stochastic epigenetic variation could be genetic variants that increase the probability of stochastic epigenetic variation in cis, as suggested by various authors. In the context of our results, this does not mean that T1D–MVPs are due to somatic genetic differences, but rather that the T1D–discordant twins may harbor germline genetic variants that are associated with increased levels of epigenetic stochasticity; indeed, we find that T1D–MVPs are less epigenetically variable in the normal MZ twins. If this occurs in the context of a genomic background that is predisposed to a given disease, then it could affect the probability of one twin developing the disease while the co-twin remains disease-free. However, it is also possible that T1D–MVPs are induced environmentally, as MZ twins are exposed to similar, but not identical, environments, and there are examples of disease-relevant environmental factors that operate in early life to influence disease risk. Given a large enough sample size and genome coverage, it might be possible to identify environmental triggers based on gene regulatory networks enriched for T1D–associated epigenetic and transcriptional variation.
Third, although we have focused on promoter-associated single CpGs here, our data suggest that larger surrounding genomic regions (i.e. differentially methylated regions, or DMRs) are affected, and it will be important to further define these regions spatially. In the near future, it should be possible to perform high-throughput sequencing-based whole-genome DNA methylomic profiling in large cohorts to: (i) identify new T1D–MVPs/DMRs, including those that might exist outside of promoter regions; (ii) help define the boundaries of the T1D–associated DMRs, if they exist; (iii) establish the hierarchy of CpG sites within a DMR in terms of functional impact, since it is possible that we have identified T1D–MVPs that are merely ‘linked’ to the most discriminative CpG site (i.e. ‘tag’-MVPs, analogous to tag-SNPs in GWASs); and (iv) profile a number of key cell types, including other immune effector cells.
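The ‘tag’-MVP idea in point (iii) can be made concrete with a small sketch. Given mean within-pair methylation differences at consecutive CpGs in a region, the code extends outward from an assayed CpG for as long as neighbouring sites differ in the same direction, which gives a crude candidate DMR. It then asks which CpG inside that region is the most discriminative. The function names, the 0.02 threshold and the difference values are all hypothetical, and genomic distance between CpGs is ignored for simplicity. This is an illustration of the concept, not a method used in this study.

```python
def extend_region(diffs, seed, min_abs=0.02):
    """Return (start, end), inclusive, of the run of CpGs around `seed`
    whose differences share the seed's direction and reach `min_abs`."""
    if not 0 <= seed < len(diffs):
        raise IndexError("seed is outside the region")
    if abs(diffs[seed]) < min_abs:
        raise ValueError("seed CpG does not reach the threshold itself")
    seed_is_gain = diffs[seed] > 0

    def matches(i):
        # Same direction as the seed and at least min_abs in magnitude.
        return abs(diffs[i]) >= min_abs and (diffs[i] > 0) == seed_is_gain

    start = seed
    while start > 0 and matches(start - 1):
        start -= 1
    end = seed
    while end < len(diffs) - 1 and matches(end + 1):
        end += 1
    return start, end


def most_discriminative(diffs, start, end):
    """Index of the CpG with the largest absolute difference in [start, end]."""
    return max(range(start, end + 1), key=lambda i: abs(diffs[i]))


# Hypothetical mean within-pair differences at eight consecutive CpGs.
# Index 3 stands for the single CpG that was assayed on the array.
region_diffs = [0.00, 0.01, 0.03, 0.04, 0.07, 0.03, -0.01, 0.00]
assayed_cpg = 3

if __name__ == "__main__":
    start, end = extend_region(region_diffs, assayed_cpg)
    top = most_discriminative(region_diffs, start, end)
    print(f"candidate DMR spans CpGs {start}..{end}; strongest CpG is {top}")
```

In this toy region the assayed CpG lies inside a four-CpG candidate DMR, but the largest difference is at a neighbouring site. The assayed CpG would therefore be acting as a ‘tag’ for that site, which is the situation that denser, sequencing-based profiling is expected to resolve.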
Fourth, we need to understand the functional outcome of T1D–MVPs at the molecular level. The most obvious impact is on gene expression, but equally important will be investigations into how the MVPs alter the local chromatin structure. For example, do they alter the binding of key transcription factors? Or do they correlate with alterations in other epigenetic marks such as histone modifications? The magnitude of the methylation differences we have identified at T1D–MVPs is relatively small compared with the DNA methylation perturbations generally observed in the context of cancer. However, given that other small-scale studies of non-malignant disease-associated methylation variation in humans also report effects of small magnitude, it is quite possible that this is the norm for complex disease-associated epigenomic variation. In this regard, it is worth drawing parallels with findings from GWASs, in which most variants individually confer a small disease risk. Therefore, studying the local chromatin architecture and gene expression will help define how DNA methylation variants of small magnitude affect molecular outcomes in a variety of key immune effector cells. This in turn will help to elucidate how T1D–MVPs, in combination with genetic and other environmental factors, are involved in T1D etiology, and whether the T1D–MVPs are causal or consequential. Of course, it is also quite possible that some T1D–MVPs are not directly involved in T1D pathogenesis, but rather are biomarkers for the disease. This would be similar to the T1D–associated GAD65, IA2 and islet cell autoantibodies, which are highly predictive of disease but for which there is no evidence of involvement in T1D etiology. Analysis of individuals before they present with autoantibodies will be key to establishing whether T1D–MVPs are valuable biomarkers for the disease that can augment the predictive power of autoantibodies and genetic variants.
It is noteworthy that a T1D–MVP signature was detected with a relatively modest number of samples and modest genome coverage, which emphasizes the power of our study design that combines MZ twins and prospectively sampled individuals, as opposed to the typical singleton ‘case versus control’ approach. Although previous complex disease epigenomic studies have correlated disease-associated epigenetic variants with changes in gene expression or temporal stability, none have been able to address the key question of temporal origins, which is critical for establishing the direction of the cause-consequence relationship between disease phenotype and epigenetic variation. Therefore, in addition to identifying a previously unappreciated molecular component of type 1 diabetes risk, we believe our study also represents one possible blueprint for future EWASs of other complex diseases.