Transcriptional misregulation is involved in the development of many diseases, especially neoplastic transformation. Distal regulatory elements, such as enhancers, play a major role in specifying cell-specific transcription patterns in both normal and diseased tissues, suggesting that enhancers may be prime targets for therapeutic intervention. By focusing on modulating gene regulation mediated by cell type-specific enhancers, there is hope that normal epigenetic patterning in an affected tissue could be restored with fewer side effects than observed with treatments employing relatively nonspecific inhibitors such as epigenetic drugs. New methods employing genomic nucleases and site-specific epigenetic regulators targeted to specific genomic regions, using either artificial DNA-binding proteins or RNA–DNA interactions, may allow precise genome engineering at enhancers. However, this field is still in its infancy and further refinements that increase specificity and efficiency are clearly required.
CRISPRs; DNA methylation; enhancers; epigenetic therapy; gene expression; genome engineering; genomic nuclease; histone modifications; TALENs; ZFNs
Developmental history shapes the epigenome and biological function of differentiated cells. Epigenomic patterns have been broadly attributed to the three embryonic germ layers. Here we investigate how developmental origin influences epigenomes. We compare key epigenomes of cell types derived from surface ectoderm (SE), including keratinocytes and breast luminal and myoepithelial cells, against neural crest-derived melanocytes and mesoderm-derived dermal fibroblasts to identify SE differentially methylated regions (SE-DMRs). DNA methylomes of neonatal keratinocytes share many more DMRs with adult breast luminal and myoepithelial cells than with melanocytes and fibroblasts from the same neonatal skin. This suggests that SE origin contributes to DNA methylation patterning, while shared skin tissue environment has limited effect on epidermal keratinocytes. Hypomethylated SE-DMRs are in proximity to genes with SE relevant functions. They are also enriched for enhancer- and promoter-associated histone modifications in SE-derived cells, and for binding motifs of transcription factors important in keratinocyte and mammary gland biology. Thus, epigenomic analysis of cell types with common developmental origin reveals an epigenetic signature that underlies a shared gene regulatory network.
Recent studies indicate that DNA methylation can be used to identify transcriptional enhancers, but no systematic approach has been developed for genome-wide identification and analysis of enhancers based on DNA methylation. We describe ELMER (Enhancer Linking by Methylation/Expression Relationships), an R-based tool that uses DNA methylation to identify enhancers and correlates enhancer state with expression of nearby genes to identify transcriptional targets. Transcription factor motif analysis of enhancers is coupled with expression analysis of transcription factors to infer upstream regulators. Using ELMER, we investigated more than 2,000 tumor samples from The Cancer Genome Atlas. We identified networks regulated by known cancer drivers such as GATA3 and FOXA1 (breast cancer), SOX17 and FOXA2 (endometrial cancer), and NFE2L2, SOX2, and TP63 (squamous cell lung cancer). We also identified novel networks with prognostic associations, including RUNX1 in kidney cancer. We propose ELMER as a powerful new paradigm for understanding the cis-regulatory interface between cancer-associated transcription factors and their functional target genes.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0668-3) contains supplementary material, which is available to authorized users.
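The ELMER abstract above describes linking the methylation state of an enhancer to the expression of nearby genes. The idea can be illustrated with a minimal Python sketch. It is not ELMER's implementation (ELMER is an R package, and its statistical test is not described in the abstract): the function names, the 20% sample fraction, and the expression-difference cutoff are all assumptions made for illustration.

```python
from statistics import mean


def expression_by_methylation(methylation, expression, fraction=0.2):
    """Mean expression in the least- and most-methylated samples.

    methylation and expression are parallel lists, one value per sample.
    fraction (an illustrative assumption) sets the size of each group.
    """
    if len(methylation) != len(expression):
        raise ValueError("methylation and expression must have equal length")
    n = len(methylation)
    k = max(1, int(n * fraction))
    # Sample indices ordered from least to most methylated at the enhancer.
    order = sorted(range(n), key=lambda i: methylation[i])
    low_group = [expression[i] for i in order[:k]]
    high_group = [expression[i] for i in order[-k:]]
    return mean(low_group), mean(high_group)


def is_candidate_target(methylation, expression, min_diff=1.0, fraction=0.2):
    """Flag a gene whose expression is higher when the enhancer is hypomethylated.

    min_diff is a hypothetical cutoff, not a value taken from the paper.
    """
    low_meth_expr, high_meth_expr = expression_by_methylation(
        methylation, expression, fraction
    )
    return (low_meth_expr - high_meth_expr) >= min_diff
```

A real analysis would replace the fixed cutoff with a significance test and multiple-testing correction across all enhancer-gene pairs.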
The role of intermediate methylation states in DNA is unclear. Here, to comprehensively identify regions of intermediate methylation and their quantitative relationship with gene activity, we apply integrative and comparative epigenomics to 25 human primary cell and tissue samples. We report 18,452 intermediate methylation regions located near 36% of genes and enriched at enhancers, exons and DNase I hypersensitivity sites. Intermediate methylation regions average 57% methylation, are predominantly allele-independent and are conserved across individuals and between mouse and human, suggesting a conserved function. These regions have an intermediate level of active chromatin marks and their associated genes have intermediate transcriptional activity. Exonic intermediate methylation correlates with exon inclusion at a level between that of fully methylated and unmethylated exons, highlighting gene context-dependent functions. We conclude that intermediate DNA methylation is a conserved signature of gene regulation and exon usage.
Many loci in the mammalian genome are intermediately methylated. Here, by comprehensively identifying these loci and quantifying their relationship with gene activity, the authors show that intermediate methylation is an evolutionarily conserved epigenomic signature of gene regulation.
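To make the notion of an intermediate methylation region concrete, the Python sketch below groups consecutive CpG sites whose methylation level falls between two cutoffs. The cutoffs (0.3 and 0.7), the minimum number of sites, and the function names are assumptions chosen for illustration. The abstract does not state how the 18,452 regions were called.

```python
def classify(level, low=0.3, high=0.7):
    """Label a methylation fraction (0 to 1) using hypothetical cutoffs."""
    if level < low:
        return "unmethylated"
    if level > high:
        return "methylated"
    return "intermediate"


def _summarize(run):
    """Return (start, end, mean level) for a run of (position, level) pairs."""
    levels = [level for _, level in run]
    return run[0][0], run[-1][0], sum(levels) / len(levels)


def intermediate_regions(cpgs, min_sites=3, low=0.3, high=0.7):
    """Merge consecutive intermediate CpGs into regions.

    cpgs is a list of (position, level) pairs sorted by position on one
    chromosome. Runs shorter than min_sites are discarded.
    """
    regions, run = [], []
    for position, level in cpgs:
        if classify(level, low, high) == "intermediate":
            run.append((position, level))
            continue
        if len(run) >= min_sites:
            regions.append(_summarize(run))
        run = []
    # Close a run that extends to the last CpG.
    if len(run) >= min_sites:
        regions.append(_summarize(run))
    return regions
```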
While significant effort has been dedicated to the characterization of epigenetic changes associated with prenatal differentiation, relatively little is known about the epigenetic changes that accompany post-natal differentiation where fully functional differentiated cell types with limited lifespans arise. Here we sought to address this gap by generating epigenomic and transcriptional profiles from primary human breast cell types isolated from disease-free human subjects. From these data we define a comprehensive human breast transcriptional network, including a set of myoepithelial- and luminal epithelial-specific intronic retention events. Intersection of epigenetic states with RNA expression from distinct breast epithelium lineages demonstrates that mCpG provides a stable record of exonic and intronic usage, whereas H3K36me3 is dynamic. We find a striking asymmetry in epigenomic reprogramming between luminal and myoepithelial cell types, with the genomes of luminal cells harbouring more than twice the number of hypomethylated enhancer elements compared with myoepithelial cells.
Epigenetic changes associated with post-natal differentiation have been poorly characterized. Here the authors generate epigenomic and transcriptional profiles from primary human breast cells, providing insights into the transcriptional and epigenetic events that define post-natal cell differentiation in vivo.
Due to the hyper-activation of WNT signaling in a variety of cancer types, there has been a strong drive to develop pathway-specific inhibitors with the eventual goal of providing a chemotherapeutic antagonist of WNT signaling to cancer patients. A new category of drugs, called epigenetic inhibitors, is being developed that holds high promise for inhibition of the WNT pathway. The canonical WNT signaling pathway initiates when WNT ligands bind to receptors, causing the nuclear localization of the co-activator β-catenin (CTNNB1), which leads to an association of β-catenin with a member of the TCF transcription factor family at regulatory regions of WNT-responsive genes. The TCF/β-catenin complex then recruits CBP (CREBBP) or p300 (EP300), leading to histone acetylation and gene activation. A current model in the field is that CBP-driven expression of WNT target genes supports proliferation whereas p300-driven expression of WNT target genes supports differentiation. The small molecule inhibitor ICG-001 binds to CBP, but not to p300, and competitively inhibits the interaction of CBP with β-catenin. Upon treatment of cancer cells, this should reduce CBP-regulated transcription, leading to reduced tumorigenicity and enhanced differentiation.
We have compared the genome-wide effects on the transcriptome after treatment with ICG-001 (the specific CBP inhibitor) versus C646, a compound that competes with acetyl-CoA for the Lys-CoA binding pocket of both CBP and p300. We found that both drugs cause large-scale changes in the transcriptome of HCT116 colon cancer cells and PANC1 pancreatic cancer cells and reverse some tumor-specific changes in gene expression. Interestingly, although the epigenetic inhibitors affect cell cycle pathways in both the colon and pancreatic cancer cell lines, the WNT signaling pathway was affected only in the colon cancer cells. Notably, WNT target genes were similarly downregulated after treatment of HCT116 with C646 as with ICG-001.
Our results suggest that treatment with a general HAT inhibitor causes similar effects on the transcriptome as does treatment with a CBP-specific inhibitor and that epigenetic inhibition affects the WNT pathway in HCT116 cells and the cholesterol biosynthesis pathway in PANC1 cells.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-8935-8-9) contains supplementary material, which is available to authorized users.
Epigenetic inhibitor; Histone acetylation; WNT signaling; C646; ICG-001; Colon cancer; Pancreatic cancer; TCF7L2; Cholesterol biosynthesis
Understanding the mechanisms underlying ErbB3 over-expression in breast cancer will facilitate the rational design of therapies to disrupt ErbB2-ErbB3 oncogenic function. While ErbB3 over-expression is frequently observed in breast cancer, the factors mediating its aberrant expression are poorly understood. In particular, the ErbB3 gene is not significantly amplified, raising the question as to how ErbB3 over-expression is achieved. In this study we demonstrate that the ZNF217 transcription factor, amplified at 20q13 in ~20% of breast tumors, regulates ErbB3 expression. Analysis of a panel of human breast cancer cell lines (n = 50) and primary human breast tumors (n = 15) demonstrated a strong positive correlation between ZNF217 and ErbB3 expression. Ectopic expression of ZNF217 in human mammary epithelial cells induced ErbB3 expression while ZNF217 silencing in breast cancer cells resulted in decreased ErbB3 expression. While ZNF217 has previously been linked with transcriptional repression due to its close association with CtBP1/2 repressor complexes, our results demonstrate that ZNF217 also activates gene expression. We demonstrate that ZNF217 recruitment to the ErbB3 promoter is CtBP1/2-independent and that ZNF217 and CtBP1/2 play opposite roles in regulating ErbB3 expression. In addition, we identify ErbB3 as one of the mechanisms by which ZNF217 augments PI-3K/Akt signaling.
ZNF217; ErbB3; CtBP2; 20q13; breast cancer
Colorectal cancer (CRC) is a leading cause of cancer-related deaths in the United States. Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with increased risk for CRC. A molecular understanding of the functional consequences of this genetic variation has been complicated because each GWAS SNP is a surrogate for hundreds of other SNPs, most of which are located in non-coding regions. Here we use genomic and epigenomic information to test the hypothesis that the GWAS SNPs and/or correlated SNPs are in elements that regulate gene expression, and identify 23 promoters and 28 enhancers. Using gene expression data from normal and tumour cells, we identify 66 putative target genes of the risk-associated enhancers (10 of which were also identified by promoter SNPs). Employing CRISPR nucleases, we delete one risk-associated enhancer and identify genes showing altered expression. We suggest that similar studies be performed to characterize all CRC risk-associated enhancers.
Previous studies identified genetic variants associated with colorectal cancer (CRC), but the functional consequences of these genetic risk factors remain poorly understood. Here, the authors report that CRC risk variants reside in promoters and enhancers and could increase colon cancer risk through gene expression regulation.
Gene expression is epigenetically regulated by a combination of histone modifications and methylation of CpG dinucleotides in promoters. In normal cells, CpG-rich promoters are typically unmethylated, marked with histone modifications such as H3K4me3, and are highly active. During neoplastic transformation, CpG dinucleotides of CG-rich promoters become aberrantly methylated, corresponding with the removal of active histone modifications and transcriptional silencing. Outside of promoter regions, distal enhancers play a major role in the cell type-specific regulation of gene expression. Enhancers, which function by bringing activating complexes to promoters through chromosomal looping, are also modulated by a combination of DNA methylation and histone modifications.
Here we use HCT116 colorectal cancer cells with and without mutations in DNA methyltransferases; the mutant cells show a 95% reduction in global DNA methylation levels. These cells are used to study the relationship between DNA methylation, histone modifications, and gene expression. We find that the loss of DNA methylation is not sufficient to reactivate most of the silenced promoters. In contrast, the removal of DNA methylation results in the activation of a large number of enhancer regions as determined by the acquisition of active histone marks.
Although the transcriptome is largely unaffected by the loss of DNA methylation, we identify two distinct mechanisms resulting in the upregulation of distinct sets of genes. One is a direct result of DNA methylation loss at a set of promoter regions and the other is due to the presence of new intragenic enhancers.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0469-0) contains supplementary material, which is available to authorized users.
The self-renewal capacity ascribed to embryonic stem cells (ESCs) is reminiscent of cancer cell proliferation, raising speculation that a common network of genes may regulate these traits. A search for general regulators of these traits yielded a set of microRNAs for which expression is highly enriched in hESCs and liver cancer cells (HCCs), but attenuated in differentiated quiescent hepatocytes. Here, we show that these microRNAs promote hESC self-renewal, as well as HCC proliferation, and when overexpressed in normally quiescent hepatocytes, induce proliferation and activate cancer signaling pathways. Proliferation in hepatocytes is mediated through translational repression of Pten, Tgfbr2, Klf11 and Cdkn1a, which collectively dysregulates the PI3K/AKT/mTOR and TGFβ tumor suppressor signaling pathways. Furthermore, aberrant expression of these miRNAs is observed in human liver tumor tissues, and induces epithelial-mesenchymal transition in hepatocytes. These findings suggest that microRNAs that are essential in normal development as promoters of ESC self-renewal are frequently up-regulated in human liver tumors, and harbor neoplastic transformation potential when they escape silencing in quiescent human hepatocytes.
Human embryonic stem cells; microRNAs; hepatocellular carcinoma cells
DNA motifs are short sequences ranging from 6 to 25 bp that can be highly variable and degenerate. One major approach for predicting transcription factor (TF) binding uses a position weight matrix (PWM) to represent the information content of regulatory sites; however, when used as the sole means of identifying binding sites, this approach suffers from the limited amount of training data available and a high rate of false-positive predictions. The ChIPMotifs program is a de novo motif-finding tool developed for ChIP-based high-throughput data, and W-ChIPMotifs is a Web application for ChIPMotifs. It combines several ab initio motif discovery tools, such as MEME, MaMF, and Weeder, and optimizes the significance of the detected motifs by using bootstrap re-sampling error estimation and a Fisher test. Using these techniques, we determined a PWM for OCT4 that is similar to the canonical OCT4 consensus sequence. In a separate study, we also used de novo motif discovery to suggest that ZNF263 binds to a 24-nt site that differs from the motif predicted by the zinc finger code in several positions.
Motif; ChIP; Position weight matrix; OCT4; ZNF263
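The position weight matrix mentioned in the abstract above can be illustrated with a short Python sketch that builds a log-odds PWM from aligned sites and scans a sequence for the best match. This is a generic textbook construction, not the ChIPMotifs procedure. The training sites are toy sequences, and the pseudocount and uniform background are assumptions.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length, uppercase aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        # Pseudocounts keep unseen bases from producing log(0).
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds for a window as wide as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    hits = [
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)
```

With only a handful of training sites the matrix is dominated by the pseudocounts, which is one face of the limited-training-data problem the abstract raises.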
Transcription factors (TFs) bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 TFs in 458 ChIP-Seq experiments. We found the combinatorial co-association of TFs to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the TF binding into a hierarchy and integrated it with other genomic information (e.g., miRNA regulation), forming a dense meta-network. Factors at different levels have different properties: for instance, top-level TFs more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs (e.g., noise-buffering feed-forward loops). Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (i.e., differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
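A feed-forward loop, one of the network motifs named above, is a triple in which regulator A targets B, B targets C, and A also targets C directly. The Python sketch below enumerates such triples in a directed edge list. The node names in the usage note are hypothetical, and the sketch only lists the motif; it does not test for enrichment.

```python
def feed_forward_loops(edges):
    """Return sorted (a, b, c) triples where a->b, b->c, and a->c all exist.

    edges is an iterable of (regulator, target) pairs.
    """
    targets = {}
    for source, target in edges:
        targets.setdefault(source, set()).add(target)

    loops = []
    for a, a_targets in targets.items():
        for b in a_targets:
            if b == a:
                continue  # ignore self-regulation
            for c in targets.get(b, ()):
                # The direct edge a->c closes the loop; all nodes must differ.
                if c != a and c != b and c in a_targets:
                    loops.append((a, b, c))
    return sorted(loops)
```

For example, the edges `("TF1", "TF2")`, `("TF2", "GENE")`, and `("TF1", "GENE")` form a single loop with TF1 at the top, TF2 in the middle, and GENE as the shared target.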
The dynamic modification of DNA and histones plays a key role in transcriptional regulation through altering the packaging of DNA and modifying the nucleosome surface. These chromatin states, also referred to as the epigenome, are distinctive for different tissues, developmental stages, and disease states and can also be altered by environmental influences. New technologies allow the genome-wide visualization of the information encoded in the epigenome. For example, the chromatin immunoprecipitation (ChIP) assay allows investigators to characterize DNA–protein interactions in vivo. ChIP followed by hybridization to microarrays (ChIP-chip) or by high-throughput sequencing (ChIP-seq) are both powerful tools to identify genome-wide profiles of transcription factors, histone modifications, DNA methylation, and nucleosome positioning. ChIP-seq technology, which can now interrogate the entire human genome at high resolution with only one lane of sequencing, has recently surpassed ChIP-chip technology for epigenomic analyses. Importantly, for the study of primary cells and tissues, epigenetic profiles can be generated using as little as 1 μg of chromatin. In this chapter, we describe in detail the steps involved in performing ChIP assays (with a focus on characterizing histone modifications in primary cells) either manually or using the IP-Star ChIP robot, followed by a detailed protocol to prepare successful libraries for Illumina sequencing. Critical quality control checkpoints are discussed. Although not a focus of this chapter, we also point the reader to several methods by which massive ChIP-seq data sets can be analyzed to extract the tremendous information contained within.
Chromatin immunoprecipitation; ChIP-seq; Next generation sequencing; Epigenomics; Histone modifications; IP-Star; ChIP robot
The last decade has seen an incredible breakthrough in technologies that allow histones, transcription factors (TFs), and RNA polymerases to be precisely mapped throughout the genome. From this research, it is clear that there is a complex interaction between the chromatin landscape and the general transcriptional machinery and that the dynamic control of this interface is central to gene regulation. However, the chromatin remodeling enzymes and general TFs cannot, on their own, recognize and stably bind to promoter or enhancer regions. Rather, they are recruited to cis regulatory regions through interaction with site-specific DNA binding TFs and/or proteins that recognize epigenetic marks such as methylated cytosines or specifically modified amino acids in histones. These “recruitment” factors are modular in structure, reflecting their ability to interact with the genome via one region of the protein and to simultaneously bind to other regulatory proteins via “effector” domains. In this chapter, we provide examples of common effector domains that can function in transcriptional regulation via their ability to (a) interact with the basal transcriptional machinery and general co-activators, (b) interact with other TFs to allow cooperative binding, and (c) directly or indirectly recruit histone and chromatin modifying enzymes.
Half of all human transcription factors are zinc finger proteins and yet very little is known concerning the biological role of the majority of these factors. In particular, very few genome-wide studies of the in vivo binding of zinc finger factors have been performed. Based on in vitro studies and other methods that allow selection of high affinity-binding sites in artificial conditions, a zinc finger code has been developed that can be used to compose a putative recognition motif for a particular zinc finger factor (ZNF). Theoretically, a simple bioinformatics analysis could then predict the genomic locations of all the binding sites for that ZNF. However, it is unlikely that all of the sequences in the human genome having a good match to a predicted motif are in fact occupied in vivo (due to negative influences from repressive chromatin, nucleosomal positioning, overlap of binding sites with other factors, etc.). A powerful method to identify in vivo binding sites for transcription factors on a genome-wide scale is the chromatin immunoprecipitation (ChIP) assay, followed by hybridization of the precipitated DNA to microarrays (ChIP-chip) or by high throughput DNA sequencing of the sample (ChIP-seq). Such comprehensive in vivo binding studies would not only identify target genes of a particular zinc finger factor, but also provide binding motif data that could be used to test the validity of the zinc finger code. This chapter describes in detail the steps needed to prepare ChIP samples and libraries for high throughput sequencing using the Illumina GA2 platform and includes descriptions of quality control steps necessary to ensure a successful ChIP-seq experiment.
Zinc fingers; chromatin immunoprecipitation; ChIP-seq; next generation sequencing
Artificial transcription factors (ATFs) and genomic nucleases based on a DNA binding platform consisting of multiple zinc finger domains are currently being developed for clinical applications. However, no genome-wide investigations into their binding specificity have been performed. We have created six-finger ATFs to target two different 18 nt regions of the human SOX2 promoter; each ATF is constructed such that it contains or lacks a super KRAB domain (SKD) that interacts with a complex containing repressive histone methyltransferases. ChIP-seq analysis of the effector-free ATFs in MCF7 breast cancer cells identified thousands of binding sites, mostly in promoter regions; the addition of an SKD increased the number of binding sites ∼5-fold, with a majority of the new sites located outside of promoters. De novo motif analyses suggest that the lack of binding specificity is due to subsets of the finger domains being used for genomic interactions. Although the ATFs display widespread binding, few genes showed expression differences; genes repressed by the ATF-SKD have stronger binding sites and are more enriched for a 12 nt motif. Interestingly, epigenetic analyses indicate that the transcriptional repression caused by the ATF-SKD is not due to changes in active histone modifications.
The ZNF217 gene, encoding a C2H2 zinc finger protein, is located at 20q13 and found amplified and overexpressed in greater than 20% of breast tumors. Current studies indicate ZNF217 drives tumorigenesis, yet the regulatory mechanisms of ZNF217 are largely unknown. Because ZNF217 associates with chromatin modifying enzymes, we postulate that ZNF217 functions to regulate specific gene signaling networks. Here, we present a large-scale functional genomic analysis of ZNF217, which provides insights into the regulatory role of ZNF217 in MCF7 breast cancer cells.
ChIP-seq analysis reveals that the majority of ZNF217 binding sites are located at distal regulatory regions associated with the chromatin marks H3K27ac and H3K4me1. Analysis of ChIP-seq transcription factor binding sites shows clustering of ZNF217 with FOXA1, GATA3 and ERalpha binding sites, supported by the enrichment of corresponding motifs for the ERalpha-associated cis-regulatory sequences. ERalpha expression highly correlates with ZNF217 in lysates from breast tumors (n = 15), and ERalpha co-precipitates ZNF217 and its binding partner CtBP2 from nuclear extracts. Transcriptome profiling following ZNF217 depletion identifies differentially expressed genes co-bound by ZNF217 and ERalpha; gene ontology suggests a role for ZNF217-ERalpha in expression programs associated with ER+ breast cancer studies found in the Molecular Signatures Database. Data-mining of expression data from breast cancer patients correlates ZNF217 with reduced overall survival.
Our genome-wide ZNF217 data suggest a functional role for ZNF217 at ERalpha target genes. Future studies will investigate whether ZNF217 expression contributes to aberrant ERalpha regulatory events in ER+ breast cancer and hormone resistance.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-520) contains supplementary material, which is available to authorized users.
Breast cancer; ZNF217; ERalpha; GATA3; FOXA1; ChIP-seq; RNA-seq; Endocrine resistance
C2H2 zinc fingers are found in several transcriptional regulators in the immune system. However, these proteins usually contain more fingers than are needed for stable DNA binding, suggesting that different fingers regulate different genes and functions. Mice lacking finger 1 or finger 4 of Ikaros exhibited distinct subsets of the phenotypes of Ikaros-null mice. Most notably, the two fingers controlled different stages of lymphopoiesis and finger 4 was selectively required for tumor suppression. The distinct phenotypes suggest that only a small number of Ikaros target genes are critical for each of its biological functions. Subdivision of phenotypes and targets by mutagenesis of individual fingers will facilitate efforts to understand how members of this prevalent family regulate development, immunity and disease.
Variability in the quality of antibodies to histone post-translational modifications (PTMs) presents a widely recognized hindrance in epigenetics research. Here, by using antibody engineering technologies, we produced recombinant antibodies directed to the trimethylated lysine residues of histone H3 with high specificity and affinity and no lot-to-lot variation. These recombinant antibodies performed well in common epigenetics applications, and their high specificity enabled us to identify positive and negative correlations among histone PTMs.
Genome-wide association studies (GWAS) have revolutionized the field of cancer genetics, but the causal links between increased genetic risk and onset/progression of disease processes remain to be identified. Here we report the first step in such an endeavor for prostate cancer. We provide a comprehensive annotation of the 77 known risk loci, based upon highly correlated variants in biologically relevant chromatin annotations, and identify 727 such potentially functional SNPs. We also provide a detailed account of possible protein disruption, microRNA target sequence disruption and regulatory response element disruption of all correlated SNPs at . 88% of the 727 SNPs fall within putative enhancers, and many alter critical residues in the response elements of transcription factors known to be involved in prostate biology. We define as risk enhancers those regions with enhancer chromatin biofeatures in prostate-derived cell lines with prostate-cancer correlated SNPs. To aid the identification of these enhancers, we performed genome-wide ChIP-seq for H3K27-acetylation, a mark of actively engaged enhancers, as well as the transcription factor TCF7L2. We analyzed in depth three variants in risk enhancers, two of which show significantly altered androgen sensitivity in LNCaP cells. These include rs4907792, which is in linkage disequilibrium () with an eQTL for NUDT11 (on the X chromosome) in prostate tissue, and rs10486567, the index SNP in intron 3 of the JAZF1 gene on chromosome 7. The SNP rs4907792 lies within a critical residue of a strong consensus androgen response element that is interrupted in the protective allele, resulting in a 56% decrease in its androgen sensitivity, whereas rs10486567 affects both NKX3-1 and FOXA-AR motifs where the risk allele results in a 39% increase in basal activity and a 28% fold-increase in androgen stimulated enhancer activity.
Identification of such enhancer variants and their potential target genes represents a preliminary step in connecting risk to disease process.
In the following work we provide a complete summary annotation of functional hypotheses relating to risk identified by genome wide association studies of prostate cancer. In addition, we present new genome-wide profiles for H3K27-acetylation and TCF7L2 binding in LNCaP cells. We also introduce the concept of a risk enhancer, and characterize two novel androgen-sensitive enhancers whose activity is specifically affected by prostate-cancer risk SNPs. Our findings represent a preliminary approach to systematic identification of causal variation underlying cancer risk in the prostate.
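The annotation step described above, asking which risk-correlated SNPs fall inside putative enhancers, reduces to an interval-overlap query. The Python sketch below shows one simple way to do it. The SNP names, coordinates, and half-open interval convention in the usage note are invented for illustration and are not taken from the study.

```python
def snps_in_regions(snps, regions):
    """Return names of SNPs that fall inside any region.

    snps: list of (name, chromosome, position).
    regions: list of (chromosome, start, end), treated as half-open
    intervals [start, end). This convention is an assumption.
    """
    by_chrom = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = []
    for name, chrom, position in snps:
        intervals = by_chrom.get(chrom, [])
        if any(start <= position < end for start, end in intervals):
            hits.append(name)
    return hits


def fraction_in_regions(snps, regions):
    """Fraction of SNPs overlapping at least one region (0.0 if no SNPs)."""
    if not snps:
        return 0.0
    return len(snps_in_regions(snps, regions)) / len(snps)
```

Given hypothetical enhancers `[("chr7", 1000, 2000), ("chrX", 500, 900)]` and SNPs `snpA` at chr7:1500, `snpB` at chr7:2000, and `snpC` at chrX:600, the overlapping set is `snpA` and `snpC`, because position 2000 lies just outside the half-open interval. For genome-scale data, a sorted index or an interval tree would replace the linear scan.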
Transposable element (TE) derived sequences comprise half of our genome and DNA methylome, and are presumed densely methylated and inactive. Examination of the genome-wide DNA methylation status within 928 TE subfamilies in human embryonic and adult tissues revealed unexpected tissue-specific and subfamily-specific hypomethylation signatures. Genes proximal to tissue-specific hypomethylated TE sequences were enriched for functions important for the tissue type and their expression correlated strongly with hypomethylation of the TEs. When hypomethylated, these TE sequences gained tissue-specific enhancer marks including H3K4me1 and occupancy by p300, and a majority exhibited enhancer activity in reporter gene assays. Many such TEs also harbored binding sites for transcription factors that are important for tissue-specific functions and exhibited evidence for evolutionary selection. These data suggest that sequences derived from TEs may be responsible for wiring tissue type-specific regulatory networks, and have acquired tissue-specific epigenetic regulation.
A major limitation of computational tools for analyzing ChIP-seq data is that most of them ignore non-unique tags (NUTs) that match the human genome, even though NUTs comprise up to 60% of all raw tags in ChIP-seq data. Effectively utilizing these NUTs would increase the sequencing depth and allow more accurate detection of enriched binding sites, which in turn could lead to more precise and significant biological interpretations. In this study, we have developed a computational tool, LOcating Non-Unique matched Tags (LONUT), to improve the detection of enriched regions from ChIP-seq data. Our LONUT algorithm applies a linear and polynomial regression model to establish an empirical score (ES) formula that considers two influential factors: the distance of NUTs to peaks identified using uniquely matched tags (UMTs) and the enrichment score for those peaks. As a result, each NUT is assigned to a unique location on the reference genome. The newly located tags from the set of NUTs are combined with the original UMTs to produce a final set of combined matched tags (CMTs). LONUT was tested on many different datasets representing three different characteristics of biological data types. The detected sites were validated using de novo motif discovery and ChIP-PCR. We demonstrate the specificity and accuracy of LONUT and show that our program not only improves the detection of binding sites for ChIP-seq, but also identifies additional binding sites.
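The assignment step described above can be sketched in Python: each candidate location of a non-unique tag is scored from its distance to a UMT-derived peak and that peak's enrichment, and the tag is placed at the best-scoring location. The score function below is a placeholder invented for illustration; LONUT's actual ES formula comes from its regression model and is not given in the abstract. The `scale` parameter and all names are likewise assumptions.

```python
def distance_to_peak(position, peak_start, peak_end):
    """Distance in bp from a position to a peak (0 if inside the peak)."""
    if position < peak_start:
        return peak_start - position
    if position > peak_end:
        return position - peak_end
    return 0


def placeholder_score(distance, enrichment, scale=200.0):
    """Hypothetical score: rises with enrichment, decays with distance."""
    return enrichment / (1.0 + distance / scale)


def assign_tag(candidates, peaks, scale=200.0):
    """Pick one location for a non-unique tag, or None if nothing scores.

    candidates: list of (chromosome, position) where the tag aligns.
    peaks: list of (chromosome, start, end, enrichment) called from UMTs.
    """
    best_location, best_score = None, 0.0
    for chrom, position in candidates:
        for peak_chrom, start, end, enrichment in peaks:
            if peak_chrom != chrom:
                continue
            distance = distance_to_peak(position, start, end)
            current = placeholder_score(distance, enrichment, scale)
            if current > best_score:
                best_location, best_score = (chrom, position), current
    return best_location
```

Tags whose candidate locations lie on chromosomes without any peak stay unassigned, which keeps them from adding noise to the combined tag set.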
DNA methylation and repressive histone modifications cooperate to silence promoters. One mechanism by which regions of methylated DNA could acquire repressive histone modifications is via methyl DNA-binding transcription factors. The zinc finger protein ZBTB33 (also known as Kaiso) has been shown in vitro to bind preferentially to methylated DNA and to interact with the SMRT/NCoR histone deacetylase complexes. We have performed bioinformatic analyses of Kaiso ChIP-seq and DNA methylation datasets to test a model whereby binding of Kaiso to methylated CpGs leads to loss of acetylated histones at target promoters.
Our results suggest that, contrary to expectations, Kaiso does not bind to methylated DNA in vivo but instead binds to highly active promoters that are marked with high levels of acetylated histones. In addition, our studies suggest that DNA methylation and nucleosome occupancy patterns restrict access of Kaiso to potential binding sites and influence cell type-specific binding.
We propose a new model for the genome-wide binding and function of Kaiso whereby Kaiso binds to unmethylated regulatory regions and contributes to the active state of target promoters.
DNA methylation; Zinc finger proteins; Histone modifications; Transcription factor binding; Epigenetics; Transcriptional regulation