Human induced pluripotent stem cells (hiPSCs1–3) are useful in disease modeling and drug discovery, and they promise to provide a new generation of cell-based therapeutics. To date there has been no systematic evaluation of the most widely used techniques for generating integration-free hiPSCs. Here we compare Sendai-viral (SeV)4, episomal (Epi)5 and mRNA transfection (mRNA)6 methods using a number of criteria. All methods generated high-quality hiPSCs, but significant differences existed in aneuploidy rates, reprogramming efficiency, reliability and workload. We discuss the advantages and shortcomings of each approach, and present and review the results of a survey of a large number of human reprogramming laboratories on their independent experiences and preferences. Our analysis provides a valuable resource to inform the use of specific reprogramming methods for different laboratories and different applications, including clinical translation.
One of the most provocative recent observations in cancer epigenetics is the discovery of large hypomethylated blocks, including single copy genes, in colorectal cancer, that correspond in location to heterochromatic LOCKs (large organized chromatin lysine-modifications) and LADs (lamin-associated domains).
Here we performed a comprehensive genome-scale analysis of 10 breast, 28 colon, 9 lung, 38 thyroid and 18 pancreas cancers and 5 pancreas neuroendocrine tumors, together with matched normal tissue from most of these cases and 51 premalignant lesions. We used a new statistical approach that allows the identification of large hypomethylated blocks on the Illumina HumanMethylation450 BeadChip platform.
We find that hypomethylated blocks are a universal feature of common solid human cancer, and that they occur at the earliest stage of premalignant tumors and progress through clinical stages of thyroid and colon cancer development. We also find that the disrupted CpG islands widely reported previously, including hypermethylated island bodies and hypomethylated shores, are enriched in hypomethylated blocks, with flattening of the methylation signal within and flanking the islands. Finally, we find that genes showing higher between-individual gene expression variability are enriched within these hypomethylated blocks.
Thus hypomethylated blocks appear to be a universal defining epigenetic alteration in human cancer, at least for common solid tumors.
Electronic supplementary material
The online version of this article (doi:10.1186/s13073-014-0061-y) contains supplementary material, which is available to authorized users.
Lysine-specific demethylase 1 (LSD1) is an epigenetic enzyme that oxidatively cleaves methyl groups from monomethyl and dimethyl Lys4 of histone H3 (H3K4Me1, H3K4Me2) and can contribute to gene silencing. This study describes the design and synthesis of analogues of a monoamine oxidase antidepressant, phenelzine, and their LSD1 inhibitory properties. A novel phenelzine analogue (bizine) containing a phenyl-butyrylamide appendage was shown to be a potent LSD1 inhibitor in vitro and was selective versus monoamine oxidases A/B and the LSD1 homologue, LSD2. Bizine was found to be effective at modulating bulk histone methylation in cancer cells, and ChIP-seq experiments revealed a statistically significant overlap in the H3K4 methylation pattern of genes affected by bizine and those altered in LSD1−/− cells. Treatment of two cancer cell lines, LNCaP and H460, with bizine conferred a reduction in proliferation rate, and bizine showed additive to synergistic effects on cell growth when used in combination with two out of five HDAC inhibitors tested. Moreover, neurons exposed to oxidative stress were protected by the presence of bizine, suggesting potential applications in neurodegenerative disease.
Background Gestational age at birth strongly predicts neonatal, adolescent and adult morbidity and mortality through mostly unknown mechanisms. Identification of specific genes that are undergoing regulatory change prior to birth, such as through changes in DNA methylation, would increase our understanding of developmental changes occurring during the third trimester and consequences of pre-term birth (PTB).
Methods We performed a genome-wide analysis of DNA methylation (using microarrays, specifically CHARM 2.0) in 141 newborns collected in Baltimore, MD, using novel statistical methodology to identify genomic regions associated with gestational age at birth. Bisulphite pyrosequencing was used to validate significant differentially methylated regions (DMRs), and real-time PCR was performed to assess functional significance of differential methylation in a subset of newborns.
Results We identified three DMRs at genome-wide significance levels adjacent to the NFIX, RAPGEF2 and MSRB3 genes. All three regions were validated by pyrosequencing, and RAPGEF2 also showed an inverse correlation between DNA methylation levels and gene expression levels. Although the three DMRs appear very dynamic with gestational age in our newborn sample, adult DNA methylation levels at these regions are stable and of equal or greater magnitude than those of the oldest neonate, directionally consistent with the gestational age results.
Conclusions We have identified three differentially methylated regions associated with gestational age at birth. All three nearby genes play important roles in the development of several organs, including skeletal muscle, brain and haematopoietic system. Therefore, they may provide initial insight into the basis of PTB's negative health outcomes. The genome-wide custom DNA methylation array technology and novel statistical methods employed in this study could constitute a model for epidemiologic studies of epigenetic variation.
Epigenetic epidemiology; differentially methylated regions; pre-term birth; gestational age; genome-wide DNA methylation
familial aggregation; causal heterogeneity
We compared bona fide human induced pluripotent stem cells (iPSC) derived from umbilical cord blood (CB) and neonatal keratinocytes (K). As a consequence of both incomplete erasure of tissue-specific methylation and aberrant de novo methylation, CB-iPSC and K-iPSC are distinct in genome-wide DNA methylation profiles and differentiation potential. Extended passage of some iPSC clones in culture did not improve their epigenetic resemblance to ESC, implying that some human iPSC retain a residual “epigenetic memory” of their tissue of origin.
Tumor heterogeneity is a major barrier to effective cancer diagnosis and treatment. We recently identified cancer-specific differentially DNA-methylated regions (cDMRs) in colon cancer, which also distinguish normal tissue types from each other, suggesting that these cDMRs might be generalized across cancer types. Here we show stochastic methylation variation of the same cDMRs, distinguishing cancer from normal, in colon, lung, breast, thyroid, and Wilms tumors, with intermediate variation in adenomas. Whole genome bisulfite sequencing shows these variable cDMRs are related to loss of sharply delimited methylation boundaries at CpG islands. Furthermore, we find hypomethylation of discrete blocks encompassing half the genome, with extreme gene expression variability. Genes associated with the cDMRs and large blocks are involved in mitosis and matrix remodeling, respectively. These data suggest a model for cancer involving loss of epigenetic stability of well-defined genomic domains that underlies increased methylation variability in cancer and could contribute to tumor heterogeneity.
The behavior of epigenetic mechanisms in the brain is obscured by tissue heterogeneity and disease-related histological changes. Not accounting for these confounders leads to biased results. We develop a statistical methodology that estimates and adjusts for cell-type composition by decomposing neuronal and non-neuronal differential signal. This method provides a conceptual framework for deconvolving heterogeneous epigenetic data from postmortem brain studies. We apply it to find cell-specific differentially methylated regions between prefrontal cortex and hippocampus. We demonstrate the utility of the method on both Infinium 450k and CHARM data.
DNA methylation; epigenetics; differentially methylated region; brain region; cell-type heterogeneity; deconvolution; NeuN; neuron; glia; postmortem brain; fluorescence activated cell sorting
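The idea of estimating and then adjusting for cell-type composition can be illustrated with a deliberately simplified two-component mixture. This is a sketch under stated assumptions, not the method developed in the study above: it assumes that reference methylation profiles for sorted neuronal and non-neuronal cells are available, that a bulk sample is a linear mixture of the two, and that adjustment is a plain linear regression on the estimated fraction. The function names `estimate_neuron_fraction` and `adjust_for_composition` are hypothetical.

```python
def estimate_neuron_fraction(bulk, neuron_ref, glia_ref):
    """Least-squares estimate of p in bulk ~ p * neuron_ref + (1 - p) * glia_ref.

    All three arguments are equal-length sequences of methylation
    proportions, one entry per CpG (assumed inputs, not from the source).
    """
    num = 0.0
    den = 0.0
    for b, n, g in zip(bulk, neuron_ref, glia_ref):
        diff = n - g                # how strongly this CpG separates the cell types
        num += (b - g) * diff
        den += diff * diff
    if den == 0.0:
        raise ValueError("reference profiles are identical; fraction is not identifiable")
    # Clamp to the valid range for a proportion.
    return min(1.0, max(0.0, num / den))


def adjust_for_composition(values, fractions):
    """Residuals of one CpG's values after regressing on the neuron fraction."""
    n = len(values)
    mean_v = sum(values) / n
    mean_f = sum(fractions) / n
    sxx = sum((f - mean_f) ** 2 for f in fractions)
    if sxx == 0.0:
        # No variation in composition: only centre the values.
        return [v - mean_v for v in values]
    slope = sum((f - mean_f) * (v - mean_v) for f, v in zip(fractions, values)) / sxx
    return [(v - mean_v) - slope * (f - mean_f) for v, f in zip(values, fractions)]
```

In this toy setting, a bulk profile built as 30% neuronal and 70% non-neuronal signal returns an estimated fraction of 0.3, and a CpG whose values are fully explained by composition has residuals of zero after adjustment.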
Epigenetic mechanisms integrate genetic and environmental causes of disease. Comprehensive genome-wide analyses of epigenetic modifications have not demonstrated robust association with common diseases. Using Illumina HumanMethylation450 arrays on 354 ACPA-positive rheumatoid arthritis (RA) cases and 337 controls, we identified two clusters within the MHC region whose differential methylation potentially mediates genetic risk for RA. To reduce the confounding that has hampered previous epigenome-wide studies, we corrected for cellular heterogeneity by estimating and adjusting for cell-type proportions and used mediation analysis to filter out associations likely to be a consequence of disease. Four CpGs also showed association between genotype and variance of methylation in addition to mean. For both clusters, at least one CpG replicated (p<0.01), with the rest showing suggestive association, in monocytes from an independent set of 12 cases and 12 controls. Thus, DNA methylation is a potential mediator of genetic risk.
In spite of our increased understanding of how genomes are dysregulated in cancer and a plethora of molecular diagnostic tools, the front line and ‘gold standard’ detection of cancer remains the pathologist’s detection of gross changes in cellular and tissue structure, most strikingly nuclear disorganization. In fact, for over 140 years it has been noted that nuclear morphology is often disrupted in cancer. Even today, nuclear morphology measures include nuclear size, shape, DNA content (ploidy) and ‘chromatin organization’. Given the importance of nuclear shape to diagnoses of cancer phenotypes, it is surprising and frustrating that we currently lack a detailed understanding to explain these changes and how they might arise and relate to molecular events in the cell. It is an implicit hypothesis that perturbation of chromatin and epigenetic signatures may lead to alterations in nuclear structure (or vice versa) and that these perturbations lie at the heart of cancer genesis. In this review, we attempt to synthesize research leading to our current understanding of how chromatin interactions at the nuclear lamina, epigenetic modulation and gene regulation may intersect in cancer and offer a perspective on critical experiments that would help clarify how nuclear architecture may contribute to the cancerous phenotype. We also discuss the historical understanding of nuclear structure in normal cells and as a diagnostic in cancer.
Chromatin; Chromosome; Nuclear lamina; Histone; DNA methylation; Lamina Associated Domain; Epigenetics; Fluorescence in situ hybridization (FISH); Hi-C; DamID; ChIP; Cancer; Development
In honeybee societies, distinct caste phenotypes are created from the same genotype, suggesting a role for epigenetics in deriving these behaviorally different phenotypes. We found no differences in DNA methylation between irreversible worker/queen castes, but substantial differences between nurse and forager subcastes. Reverting foragers back to nurses reestablished methylation levels for a majority of genes and provided the first evidence in any organism of reversible epigenetic changes associated with behavior.
In a previous genomic analysis, using somatic methyltransferase (DNMT) knockout cells, we showed that hypomethylation decreased the expression of as many genes as were observed to increase, suggesting a previously unknown mechanism for epigenetic regulation. To address this idea, the expression of the BAG family genes was used as a model. These genes were used because their expression was decreased in DNMT1−/−, DNMT3B−/−, and double knockout cells and increased in DNMT1-overexpressing and DNMT3B-overexpressing cells. Chromatin immunoprecipitation analysis of the BAG-1 promoter in DNMT1-overexpressing or DNMT3B-overexpressing cells showed a permissive dimethyl-H3-K4/dimethyl-H3-K9 chromatin status associated with DNA-binding of CTCFL/BORIS, as well as increased BAG-1 expression. In contrast, a nonpermissive dimethyl-H3-K4/dimethyl-H3-K9 chromatin status was associated with CTCF DNA-binding and decreased BAG-1 expression in the single and double DNMT knockout cells. BORIS short hairpin RNA knockdown decreased both promoter DNA-binding and BAG-1 expression, and changed the dimethyl-H3-K4/dimethyl-H3-K9 ratio to that characteristic of a nonpermissive chromatin state. These results suggest that DNMT1 and DNMT3B regulate BAG-1 expression via insulator protein DNA-binding and chromatin dynamics by regulating histone dimethylation.
Chromatin status is characterized in part by covalent posttranslational modifications of histones that regulate chromatin dynamics and direct gene expression. BORIS (brother of the regulator of imprinted sites) is an insulator DNA-binding protein that is thought to play a role in chromatin organization and gene expression. BORIS is a cancer-germ line gene; these are genes normally present in male germ cells (testis) that are also expressed in cancer cell lines as well as primary tumors. This work identifies SET1A, an H3K4 methyltransferase, and BAT3, a cochaperone recruiter, as binding partners for BORIS, and these proteins bind to the upstream promoter regions of two well-characterized procarcinogenic genes, Myc and BRCA1. RNA interference (RNAi) knockdown of BAT3, as well as SET1A, decreased Myc and BRCA1 gene expression but did not affect the binding properties of BORIS, but RNAi knockdown of BORIS prevented the assembly of BAT3 and SET1A at the Myc and BRCA1 promoters. Finally, chromatin analysis suggested that BORIS and BAT3 exert their effects on gene expression by recruiting proteins such as SET1A that are linked to changes in H3K4 dimethylation. Thus, we propose that BORIS acts as a platform upon which BAT3 and SET1A assemble and exert effects upon chromatin structure and gene expression.
Background During the past 5 years, high-throughput technologies have been successfully used by epidemiology studies, but almost all have focused on sequence variation through genome-wide association studies (GWAS). Today, the study of other genomic events is becoming more common in large-scale epidemiological studies. Many of these, unlike the single-nucleotide polymorphism studied in GWAS, are continuous measures. In this context, the exercise of searching for regions of interest for disease is akin to the problems described in the statistical ‘bump hunting’ literature.
Methods New statistical challenges arise when the measurements are continuous rather than categorical, when they are measured with uncertainty, and when both biological signal and measurement errors are characterized by spatial correlation along the genome. Perhaps the most challenging complication is that continuous genomic data from large studies are measured throughout long periods, making them susceptible to ‘batch effects’. An example that combines all three characteristics is genome-wide DNA methylation measurements. Here, we present a data analysis pipeline that effectively models measurement error, removes batch effects, detects regions of interest and attaches statistical uncertainty to identified regions.
Results We illustrate the usefulness of our approach by detecting genomic regions of DNA methylation associated with a continuous trait in a well-characterized population of newborns. Additionally, we show that addressing unexplained heterogeneity like batch effects reduces the number of false-positive regions.
Conclusions Our framework offers a comprehensive yet flexible approach for identifying genomic regions of biological interest in large epidemiological studies using quantitative high-throughput methods.
Epigenetic epidemiology; DNA methylation; genome-wide analysis; bump hunting; batch effects
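The region-detection step of a bump-hunting analysis can be sketched in a few lines. What follows is a minimal illustration under stated assumptions rather than the pipeline described above: it assumes per-site effect estimates have already been computed, uses a simple index-based running mean in place of a genomic-distance-aware smoother, and omits batch-effect removal and the permutation step that would attach statistical uncertainty to each region. The names `running_mean`, `find_bumps`, `cutoff`, `max_gap` and `half_window` are hypothetical.

```python
def running_mean(values, half_window=1):
    """Smooth a sequence with a centred running mean, truncated at the ends."""
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out


def _summarise(run):
    """Collapse a run of (position, value) pairs into one region record."""
    vals = [v for _, v in run]
    return (run[0][0], run[-1][0], len(run), sum(vals) / len(vals))


def find_bumps(positions, estimates, cutoff, max_gap=300, half_window=1):
    """Return (start, end, n_sites, mean_effect) for each candidate region.

    A region is a run of consecutive sites whose smoothed estimate has
    absolute value >= cutoff, the same sign, and neighbouring sites no
    more than max_gap bases apart. Positions must be sorted.
    """
    smoothed = running_mean(estimates, half_window)
    bumps = []
    run = []
    for pos, val in zip(positions, smoothed):
        if abs(val) < cutoff:
            # Site below threshold: close any open region.
            if run:
                bumps.append(_summarise(run))
            run = []
            continue
        extends = (
            bool(run)
            and pos - run[-1][0] <= max_gap
            and (val > 0) == (run[-1][1] > 0)
        )
        if run and not extends:
            # Too far away or opposite sign: start a new region.
            bumps.append(_summarise(run))
            run = []
        run.append((pos, val))
    if run:
        bumps.append(_summarise(run))
    return bumps
```

In practice the cutoff would be chosen from a null distribution, for example by recomputing the regions on permuted phenotype labels, so that each detected region can be assigned a measure of uncertainty.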
It has recently been proposed that variation in DNA methylation at specific genomic locations may play an important role in the development of complex diseases such as cancer. Here, we develop 1- and 2-group multiple testing procedures for identifying and quantifying regions of DNA methylation variability. Our method is the first genome-wide statistical significance calculation for increased or differential variability, as opposed to the traditional approach of testing for mean changes. We apply these procedures to genome-wide methylation data obtained from biological and technical replicates and provide the first statistical proof that variably methylated regions exist and are due to interindividual variation. We also show that differentially variable regions in colon tumor and normal tissue show enrichment of genes regulating gene expression, cell morphogenesis, and development, supporting a biological role for DNA methylation variability in cancer.
Bump finding; Functional data analysis; Multiple testing; Preprocessing; Variably methylated regions (VMRs)
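The contrast between testing for mean changes and testing for variability changes can be made concrete with a toy calculation. This is only an illustrative sketch, not the multiple testing procedure developed above: it assumes methylation proportions at a single site in two groups and reports a difference in means alongside a ratio of sample variances, without any significance calculation. The function name `variability_summary` and the example values are hypothetical.

```python
from statistics import mean, variance


def variability_summary(group_a, group_b):
    """Compare two groups at one site by mean difference and variance ratio.

    Each group is a sequence of at least two methylation proportions.
    Returns a dict with 'mean_diff' (a minus b) and 'var_ratio' (a over b).
    """
    var_b = variance(group_b)
    if var_b == 0:
        raise ValueError("reference group has zero variance; ratio is undefined")
    return {
        "mean_diff": mean(group_a) - mean(group_b),
        "var_ratio": variance(group_a) / var_b,
    }


# Hypothetical values: same average methylation, very different spread.
tumor = [0.2, 0.5, 0.8]
normal = [0.48, 0.50, 0.52]
summary = variability_summary(tumor, normal)
```

Here a test on means would see essentially no difference between the groups, while the variance ratio is large, which is the kind of signal a differential variability analysis is designed to detect.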
Comprehensive high-throughput arrays for relative methylation (CHARM) was recently developed as an experimental platform and analytic approach to assess DNA methylation (DNAm) at a genome-wide level. Its initial implementation was for human and mouse. We adapted it for rat and sought to examine DNAm differences across tissues and brain regions in this model organism. We extracted DNA from liver, spleen and three brain regions: cortex, hippocampus and hypothalamus from adult Sprague Dawley rats. DNA was digested with McrBC, and the resulting methyl-depleted fraction was hybridized to the rat CHARM array along with a mock-treated fraction. Differentially methylated regions (DMRs) between tissue types were detected using normalized methylation log-ratios. In validating 24 of the most significant DMRs by bisulfite pyrosequencing, we detected large mean differences in DNAm, ranging from 33–59%, among the most significant DMRs in the across-tissue comparisons. The comparable figures for the hippocampus vs. hypothalamus DMRs were 14–40%, for the cortex vs. hippocampus DMRs, 12–29%, and for the cortex vs. hypothalamus DMRs, 5–35%, with a correlation of r² = 0.92 between the methylation differences in 24 DMRs predicted by CHARM and those validated by bisulfite pyrosequencing. Our adaptation of the CHARM array for the rat genome yielded highly robust results that demonstrate the value of this method in detecting substantial DNAm differences between tissues and across different brain regions. This platform should prove valuable in future studies aimed at examining DNAm differences in particular brain regions of rats exposed to environmental stimuli with potential epigenetic consequences.
epigenetics; DNA methylation; methylation array; genome-wide; rat; brain
The organization of higher order chromatin is an emerging epigenetic mechanism for understanding development and disease. We and others have previously observed dynamic changes during differentiation and oncogenesis in large heterochromatin domains such as Large Organized Chromatin K (lysine) modifications (LOCKs), of histone H3 lysine-9 dimethylation (H3K9me2) or other repressive histone posttranslational modifications. The microstructure of these regions has not previously been explored.
We analyzed the genome-wide distribution of H3K9me2 in two human pluripotent stem cell lines and three differentiated cell lines. We identified > 2,500 small regions with very low H3K9me2 signals in the body of LOCKs, which we termed euchromatin islands (EIs). EIs are 6.5-fold enriched for DNase I Hypersensitive Sites and 8-fold enriched for the binding of CTCF, the major organizer of higher-order chromatin. Furthermore, EIs are 2–6 fold enriched for differentially DNA-methylated regions associated with tissue types (T-DMRs), reprogramming (R-DMRs) and cancer (C-DMRs). Gene ontology (GO) analysis suggests that EI-associated genes are functionally related to organ system development, cell adhesion and cell differentiation.
We identify the existence of EIs as a finer layer of epigenomic architecture within large heterochromatin domains. Their enrichment for CTCF sites and DNase I hypersensitive sites, as well as their association with DMRs, suggests that EIs play an important role in normal epigenomic architecture and its disruption in disease.
Epigenetics; H3K9me2; Euchromatin islands; CTCF; DNA methylation
While genome-wide association studies are ongoing to identify sequence variation influencing susceptibility to major depressive disorder (MDD), epigenetic marks, such as DNA methylation, which can be influenced by environment, might also play a role. Here we present the first genome-wide DNA methylation (DNAm) scan in MDD. We compared 39 postmortem frontal cortex MDD samples to 26 controls. DNA was hybridized to our Comprehensive High-throughput Arrays for Relative Methylation (CHARM) platform, covering 3.5 million CpGs. CHARM identified 224 candidate regions with DNAm differences >10%. These regions are highly enriched for neuronal growth and development genes. Ten of 17 regions for which validation was attempted showed true DNAm differences; the greatest were in PRIMA1, with 12–15% increased DNAm in MDD (p = 0.0002–0.0003), and a concomitant decrease in gene expression. These results must be considered pilot data, however, as we could only test replication in a small number of additional brain samples (n = 16), which showed no significant difference in PRIMA1. Because PRIMA1 anchors acetylcholinesterase in neuronal membranes, decreased expression could result in decreased enzyme function and increased cholinergic transmission, consistent with a role in MDD. We observed decreased immunoreactivity for acetylcholinesterase in MDD brain with increased PRIMA1 DNAm, non-significant at p = 0.08.
While we cannot draw firm conclusions about PRIMA1 DNAm in MDD, the involvement of neuronal development genes across the set showing differential methylation suggests a role for epigenetics in the illness. Further studies using limbic system brain regions might shed additional light on this role.
DNA methylation is a key regulator of gene function in a multitude of both normal and abnormal biological processes, but tools to elucidate its roles on a genome-wide scale are still in their infancy. Methylation sensitive restriction enzymes and microarrays provide a potential high-throughput, low-cost platform to allow methylation profiling. However, accurate absolute methylation estimates have been elusive due to systematic errors and unwanted variability. Previous microarray preprocessing procedures, mostly developed for expression arrays, fail to adequately normalize methylation-related data since they rely on key assumptions that are violated in the case of DNA methylation. We develop a normalization strategy tailored to DNA methylation data and an empirical Bayes percentage methylation estimator that together yield accurate absolute methylation estimates that can be compared across samples. We illustrate the method on data generated to detect methylation differences between tissues and between normal and tumor colon samples.
DNA methylation; Epigenetics; Microarray
Epithelial to mesenchymal transition (EMT) is an extreme example of cell plasticity, important for normal development, injury repair, and malignant progression. Widespread epigenetic reprogramming occurs during stem cell differentiation and malignant transformation, but EMT-related epigenetic reprogramming is poorly understood. Here we investigated epigenetic modifications during TGF-β-mediated EMT. While DNA methylation was unchanged during EMT, we found global reduction of the heterochromatin mark H3-lys9 dimethylation (H3K9Me2), increase of the euchromatin mark H3-lys4 trimethylation (H3K4Me3), and increase of the transcriptional mark H3-lys36 trimethylation (H3K36Me3). These changes were largely dependent on lysine-specific demethylase-1 (Lsd1), and Lsd1 loss-of-function experiments showed marked effects on EMT-driven cell migration and chemoresistance. Genome-scale mapping revealed that chromatin changes were largely specific to large organized heterochromatin K9-modifications (LOCKs), suggesting that EMT is characterized by reprogramming of specific chromatin domains across the genome.
The KvLQT1 gene encodes a voltage-gated potassium channel. Mutations in KvLQT1 underlie the dominantly transmitted Ward-Romano long QT syndrome, which causes cardiac arrhythmia, and the recessively transmitted Jervell and Lange-Nielsen syndrome, which causes both cardiac arrhythmia and congenital deafness. KvLQT1 is also disrupted by balanced germline chromosomal rearrangements in patients with Beckwith-Wiedemann syndrome (BWS), which causes prenatal overgrowth and cancer. Because of the diverse human disorders and organ systems affected by this gene, we developed an animal model by inactivating the murine Kvlqt1. No electrocardiographic abnormalities were observed. However, homozygous mice exhibited complete deafness, as well as circular movement and repetitive falling, suggesting imbalance. Histochemical study revealed severe anatomic disruption of the cochlear and vestibular end organs, suggesting that Kvlqt1 is essential for normal development of the inner ear. Surprisingly, homozygous mice also displayed threefold enlargement by weight of the stomach resulting from mucous neck cell hyperplasia. Finally, there were no features of BWS, suggesting that Kvlqt1 is not responsible for BWS.
The epigenome consists of non–sequence-based modifications, such as DNA methylation, that are heritable during cell division and that may affect normal phenotypes and predisposition to disease. Here, we have performed an unbiased genome-scale analysis of ~4 million CpG sites in 74 individuals with comprehensive array-based relative methylation (CHARM) analysis. We found 227 regions that showed extreme interindividual variability [variably methylated regions (VMRs)] across the genome, which are enriched for developmental genes based on Gene Ontology analysis. Furthermore, half of these VMRs were stable within individuals over an average of 11 years, and these VMRs defined a personalized epigenomic signature. Four of these VMRs showed covariation with body mass index consistently at two study visits and were located in or near genes previously implicated in regulating body weight or diabetes. This work suggests an epigenetic strategy for identifying patients at risk of common disease.
The DNA of most vertebrates is depleted in CpG dinucleotides: a C followed by a G in the 5′ to 3′ direction. CpGs are the target for DNA methylation, a chemical modification of cytosine (C) heritable during cell division and the most well-characterized epigenetic mechanism. The remaining CpGs tend to cluster in regions referred to as CpG islands (CGI). Knowing CGI locations is important because they mark functionally relevant epigenetic loci in development and disease. For various mammals, including human, a widely used list of CGI is readily available from the UCSC Genome Browser. This list was derived using algorithms that search for regions satisfying a definition of CGI proposed by Gardiner-Garden and Frommer more than 20 years ago. Recent findings, enabled by advances in technology that permit direct measurement of epigenetic endpoints at a whole-genome scale, motivate the need to adapt the current CGI definition. In this paper, we propose a procedure, guided by hidden Markov models, that permits an extensible approach to detecting CGI. The main advantage of our approach over others is that it summarizes the evidence for CGI status as probability scores. This provides flexibility in the definition of a CGI and facilitates the creation of CGI lists for other species. The utility of this approach is demonstrated by generating the first CGI lists for invertebrates, and by the fact that we can create CGI lists that substantially increase overlap with recently discovered epigenetic marks. A CGI list and the probability scores, as a function of genome location, for each species are available at http://www.rafalab.org.
CpG island; Epigenetics; Hidden Markov model; Sequence analysis
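For context, the threshold-based definition that the hidden Markov model approach above aims to move beyond can be written down directly. The sketch below assumes the Gardiner-Garden and Frommer criteria as they are commonly summarized: a length of at least 200 bp, a GC content above 50%, and an observed-to-expected CpG ratio above 0.6. The exact thresholds are exposed as parameters because implementations differ, and the function names `cpg_stats` and `meets_threshold_definition` are hypothetical. It is not the probability-score procedure proposed in the paper.

```python
def cpg_stats(seq):
    """Return (gc_content, obs_exp_cpg) for a DNA sequence.

    The observed-to-expected CpG ratio is computed as
    (count of CpG * length) / (count of C * count of G).
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def meets_threshold_definition(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Check a candidate region against hard CGI thresholds (assumed values)."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and obs_exp > min_obs_exp
```

A hard yes/no rule of this kind is what makes the resulting lists sensitive to the chosen cutoffs, whereas a probability score per genomic location lets the user decide afterwards how strict the definition should be.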
Traditionally, the pathology of human disease has been focused on microscopic examination of affected tissues, chemical and biochemical analysis of biopsy samples, other available samples of convenience, such as blood, and noninvasive or invasive imaging of varying complexity, in order to classify disease and illuminate its mechanistic basis. The molecular age has complemented this armamentarium with gene expression arrays and selective analysis of individual genes. However, we are entering a new era of epigenomic profiling, i.e., genome-scale analysis of cell-heritable nonsequence genetic change, such as DNA methylation. The epigenome offers access to stable measurements of cellular state and to biobanked material for large-scale epidemiological studies. Some of these genome-scale technologies are beginning to be applied to create the new field of epigenetic epidemiology.
Epigenetics; Epidemiology; DNA methylation
Epigenomics provides the functional context of genome sequence, analogous to the functional anatomy of the human body provided by Vesalius a half millennium ago. Much of what appear to be inconclusive genetic data for common disease could therefore become meaningful in an epigenomic context.