Genome-wide association studies (GWAS) have identified >100 independent susceptibility loci for prostate cancer, including the hot spot at 8q24. However, how genetic variants at this locus confer disease risk hasn’t been fully characterized. Using circularized chromosome conformation capture (4C) coupled with next-generation sequencing and an enhancer at 8q24 as “bait”, we identified genome-wide partners interacting with this enhancer in cell lines LNCaP and C4-2B. These 4C-identified regions are distributed in open nuclear compartments, featuring active histone marks (H3K4me1, H3K4me2 and H3K27Ac). Transcription factors NKX3-1, FOXA1 and AR (androgen receptor) tend to occupy these 4C regions. We identified genes located at the interacting regions, and found them linked to positive regulation of mesenchymal cell proliferation in LNCaP and C4-2B, and several pathways (TGF beta signaling pathway in LNCaP and p53 pathway in C4-2B). Common genes (e.g. MYC and POU5F1B) were identified in both prostate cancer cell lines. However, each cell line also had exclusive genes (e.g. ELAC2 and PTEN in LNCaP and BRCA2 and ZFHX3 in C4-2B). In addition, BCL-2 identified in C4-2B might contribute to the progression of androgen-refractory prostate cancer. Overall, our work reveals key genes and pathways involved in prostate cancer onset and progression.
Considerable progress towards an understanding of complex diseases has been made in recent years due to the development of high-throughput genotyping technologies. Using microarrays that contain millions of single-nucleotide polymorphisms (SNPs), Genome Wide Association Studies (GWASs) have identified SNPs that are associated with many complex diseases or traits. For example, as of February 2015, 2111 association studies have identified 15,396 SNPs for various diseases and traits, with the number of identified SNP-disease/trait associations increasing rapidly in recent years. However, it has been difficult for researchers to understand disease risk from GWAS results. This is because most GWAS-identified SNPs are located in non-coding regions of the genome. It is important to consider that the GWAS-identified SNPs serve only as representatives for all SNPs in the same haplotype block, and it is equally likely that other SNPs in high linkage disequilibrium (LD) with the array-identified SNPs are causal for the disease. Because it was hoped that disease-associated coding variants would be identified if the true causal SNPs were known, investigators have expanded their analyses using LD calculation and fine-mapping. However, such analyses also identified risk-associated SNPs located in non-coding regions. Thus, the GWAS field has been left with the conundrum as to how a single-nucleotide change in a non-coding region could confer increased risk for a specific disease. One possible answer to this puzzle is that the variant SNPs cause changes in gene expression levels rather than causing changes in protein function.
This review provides a description of (1) advances in genomic and epigenomic approaches that incorporate functional annotation of regulatory elements to prioritize the disease risk-associated SNPs that are located in non-coding regions of the genome for follow-up studies, (2) various computational tools that aid in identifying gene expression changes caused by the non-coding disease-associated SNPs, and (3) experimental approaches to identify target genes of, and study the biological phenotypes conferred by, non-coding disease-associated SNPs.
GWAS; Enhancers; Non-coding SNPs; Genome engineering
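The idea that an array SNP stands in for other SNPs in high LD can be made concrete with a small calculation. The sketch below is illustrative only: it estimates r² as the squared Pearson correlation of genotype dosages (0, 1 or 2 copies of the alternate allele), which is a common approximation and not necessarily the LD calculation used in any study discussed here. The function name and the toy genotypes are assumptions.

```python
from statistics import mean


def ld_r2(dosages_a, dosages_b):
    """Approximate LD (r^2) between two SNPs from per-individual dosages.

    Each list holds 0, 1 or 2 (copies of the alternate allele), in the
    same individual order. This is the squared Pearson correlation.
    """
    if len(dosages_a) != len(dosages_b):
        raise ValueError("both SNPs need genotypes for the same individuals")
    mean_a, mean_b = mean(dosages_a), mean(dosages_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Toy genotypes (invented): the array SNP and a perfectly linked neighbour.
array_snp = [0, 1, 2, 1, 0, 2]
linked_snp = [0, 1, 2, 1, 0, 2]
print(ld_r2(array_snp, linked_snp))
```

When r² is close to 1, the two SNPs are statistically indistinguishable in the association data, which is why the array SNP alone cannot be assumed to be the causal variant.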
Recent research on disparate psychiatric disorders has implicated rare variants in genes involved in global gene regulation and chromatin modification, as well as many common variants located primarily in regulatory regions of the genome. Understanding precisely how these variants contribute to disease will require a deeper appreciation for the mechanisms of gene regulation in the developing and adult human brain. The PsychENCODE project aims to produce a public resource of multidimensional genomic data using tissue- and cell type–specific samples from approximately 1,000 phenotypically well-characterized, high-quality healthy and disease-affected human post-mortem brains, as well as functionally characterize disease-associated regulatory elements and variants in model systems. We are beginning with a focus on autism spectrum disorder, bipolar disorder and schizophrenia, and expect that this knowledge will apply to a wide variety of psychiatric disorders. This paper outlines the motivation and design of PsychENCODE.
Transcriptional misregulation is involved in the development of many diseases, especially neoplastic transformation. Distal regulatory elements, such as enhancers, play a major role in specifying cell-specific transcription patterns in both normal and diseased tissues, suggesting that enhancers may be prime targets for therapeutic intervention. By focusing on modulating gene regulation mediated by cell type-specific enhancers, there is hope that normal epigenetic patterning in an affected tissue could be restored with fewer side effects than observed with treatments employing relatively nonspecific inhibitors such as epigenetic drugs. New methods employing genomic nucleases and site-specific epigenetic regulators targeted to specific genomic regions, using either artificial DNA-binding proteins or RNA–DNA interactions, may allow precise genome engineering at enhancers. However, this field is still in its infancy and further refinements that increase specificity and efficiency are clearly required.
CRISPRs; DNA methylation; enhancers; epigenetic therapy; gene expression; genome engineering; genomic nuclease; histone modifications; TALENs; ZFNs
Developmental history shapes the epigenome and biological function of differentiated cells. Epigenomic patterns have been broadly attributed to the three embryonic germ layers. Here we investigate how developmental origin influences epigenomes. We compare key epigenomes of cell types derived from surface ectoderm (SE), including keratinocytes and breast luminal and myoepithelial cells, against neural crest-derived melanocytes and mesoderm-derived dermal fibroblasts to identify SE differentially methylated regions (SE-DMRs). DNA methylomes of neonatal keratinocytes share many more DMRs with adult breast luminal and myoepithelial cells than with melanocytes and fibroblasts from the same neonatal skin. This suggests that SE origin contributes to DNA methylation patterning, while shared skin tissue environment has limited effect on epidermal keratinocytes. Hypomethylated SE-DMRs are in proximity to genes with SE relevant functions. They are also enriched for enhancer- and promoter-associated histone modifications in SE-derived cells, and for binding motifs of transcription factors important in keratinocyte and mammary gland biology. Thus, epigenomic analysis of cell types with common developmental origin reveals an epigenetic signature that underlies a shared gene regulatory network.
Recent studies indicate that DNA methylation can be used to identify transcriptional enhancers, but no systematic approach has been developed for genome-wide identification and analysis of enhancers based on DNA methylation. We describe ELMER (Enhancer Linking by Methylation/Expression Relationships), an R-based tool that uses DNA methylation to identify enhancers and correlates enhancer state with expression of nearby genes to identify transcriptional targets. Transcription factor motif analysis of enhancers is coupled with expression analysis of transcription factors to infer upstream regulators. Using ELMER, we investigated more than 2,000 tumor samples from The Cancer Genome Atlas. We identified networks regulated by known cancer drivers such as GATA3 and FOXA1 (breast cancer), SOX17 and FOXA2 (endometrial cancer), and NFE2L2, SOX2, and TP63 (squamous cell lung cancer). We also identified novel networks with prognostic associations, including RUNX1 in kidney cancer. We propose ELMER as a powerful new paradigm for understanding the cis-regulatory interface between cancer-associated transcription factors and their functional target genes.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0668-3) contains supplementary material, which is available to authorized users.
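The general idea of linking an enhancer to a target gene through methylation and expression can be sketched in a few lines. The example below is a simplified stand-in and is not ELMER's actual statistic or API: it ranks hypothetical nearby genes by the Pearson correlation between one enhancer probe's methylation and each gene's expression across samples, on the assumption that loss of enhancer methylation accompanies higher target expression. All names and values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


def rank_candidate_targets(probe_methylation, expression_by_gene):
    """Return (correlation, gene) pairs, most negative correlation first."""
    scored = [
        (pearson(probe_methylation, expression), gene)
        for gene, expression in expression_by_gene.items()
    ]
    return sorted(scored)


# Toy data (invented): one enhancer probe and two nearby genes, five samples.
methylation = [0.9, 0.7, 0.5, 0.3, 0.1]
expression = {
    "gene_X": [1.0, 2.0, 3.0, 4.0, 5.0],  # rises as methylation falls
    "gene_Y": [2.0, 2.0, 3.0, 2.0, 2.0],  # unrelated to methylation
}
for r, gene in rank_candidate_targets(methylation, expression):
    print(gene, round(r, 3))
```

A real analysis would also need a significance estimate and correction for multiple testing, which this sketch omits.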
The role of intermediate methylation states in DNA is unclear. Here, to comprehensively identify regions of intermediate methylation and their quantitative relationship with gene activity, we apply integrative and comparative epigenomics to 25 human primary cell and tissue samples. We report 18,452 intermediate methylation regions located near 36% of genes and enriched at enhancers, exons, and DNase I hypersensitivity sites. Intermediate methylation regions average 57% methylation, are predominantly allele-independent, and are conserved across individuals and between mouse and human, suggesting a conserved function. At enhancers, these regions have an intermediate level of active chromatin marks and their associated genes have intermediate transcriptional activity. Exonic intermediate methylation correlates with exon inclusion at a level between that of fully methylated and unmethylated exons, highlighting gene context-dependent functions. We conclude that intermediate DNA methylation is a conserved signature of gene regulation and exon usage.
Many loci in the mammalian genome are intermediately methylated. Here, by comprehensively identifying these loci and quantifying their relationship with gene activity, the authors show that intermediate methylation is an evolutionarily conserved epigenomic signature of gene regulation.
While significant effort has been dedicated to the characterization of epigenetic changes associated with prenatal differentiation, relatively little is known about the epigenetic changes that accompany post-natal differentiation where fully functional differentiated cell types with limited lifespans arise. Here we sought to address this gap by generating epigenomic and transcriptional profiles from primary human breast cell types isolated from disease-free human subjects. From these data we define a comprehensive human breast transcriptional network, including a set of myoepithelial- and luminal epithelial-specific intronic retention events. Intersection of epigenetic states with RNA expression from distinct breast epithelium lineages demonstrates that mCpG provides a stable record of exonic and intronic usage, whereas H3K36me3 is dynamic. We find a striking asymmetry in epigenomic reprogramming between luminal and myoepithelial cell types, with the genomes of luminal cells harbouring more than twice the number of hypomethylated enhancer elements compared with myoepithelial cells.
Epigenetic changes associated with post-natal differentiation are poorly characterized. Here the authors generate epigenomic and transcriptional profiles from primary human breast cells, providing insights into the transcriptional and epigenetic events that define post-natal cell differentiation in vivo.
Due to the hyper-activation of WNT signaling in a variety of cancer types, there has been a strong drive to develop pathway-specific inhibitors with the eventual goal of providing a chemotherapeutic antagonist of WNT signaling to cancer patients. A new category of drugs, called epigenetic inhibitors, is being developed that holds high promise for inhibition of the WNT pathway. The canonical WNT signaling pathway initiates when WNT ligands bind to receptors, causing the nuclear localization of the co-activator β-catenin (CTNNB1), which leads to an association of β-catenin with a member of the TCF transcription factor family at regulatory regions of WNT-responsive genes. The TCF/β-catenin complex then recruits CBP (CREBBP) or p300 (EP300), leading to histone acetylation and gene activation. A current model in the field is that CBP-driven expression of WNT target genes supports proliferation whereas p300-driven expression of WNT target genes supports differentiation. The small molecule inhibitor ICG-001 binds to CBP, but not to p300, and competitively inhibits the interaction of CBP with β-catenin. Upon treatment of cancer cells, this should reduce expression of CBP-regulated transcription, leading to reduced tumorigenicity and enhanced differentiation.
We have compared the genome-wide effects on the transcriptome after treatment with ICG-001 (the specific CBP inhibitor) versus C646, a compound that competes with acetyl-coA for the Lys-coA binding pocket of both CBP and p300. We found that both drugs cause large-scale changes in the transcriptome of HCT116 colon cancer cells and PANC1 pancreatic cancer cells and reverse some tumor-specific changes in gene expression. Interestingly, although the epigenetic inhibitors affect cell cycle pathways in both the colon and pancreatic cancer cell lines, the WNT signaling pathway was affected only in the colon cancer cells. Notably, WNT target genes were similarly downregulated after treatment of HCT116 with C646 as with ICG-001.
Our results suggest that treatment with a general HAT inhibitor causes similar effects on the transcriptome as does treatment with a CBP-specific inhibitor and that epigenetic inhibition affects the WNT pathway in HCT116 cells and the cholesterol biosynthesis pathway in PANC1 cells.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-8935-8-9) contains supplementary material, which is available to authorized users.
Epigenetic inhibitor; Histone acetylation; WNT signaling; C646; ICG-001; Colon cancer; Pancreatic cancer; TCF7L2; Cholesterol biosynthesis
Understanding the mechanisms underlying ErbB3 over-expression in breast cancer will facilitate the rational design of therapies to disrupt ErbB2-ErbB3 oncogenic function. While ErbB3 over-expression is frequently observed in breast cancer, the factors mediating its aberrant expression are poorly understood. In particular, the ErbB3 gene is not significantly amplified, raising the question as to how ErbB3 over-expression is achieved. In this study we demonstrate that the ZNF217 transcription factor, amplified at 20q13 in ~20% of breast tumors, regulates ErbB3 expression. Analysis of a panel of human breast cancer cell lines (n = 50) and primary human breast tumors (n=15) demonstrated a strong positive correlation between ZNF217 and ErbB3 expression. Ectopic expression of ZNF217 in human mammary epithelial cells induced ErbB3 expression while ZNF217 silencing in breast cancer cells resulted in decreased ErbB3 expression. While ZNF217 has previously been linked with transcriptional repression due to its close association with CtBP1/2 repressor complexes, our results demonstrate that ZNF217 also activates gene expression. We demonstrate that ZNF217 recruitment to the ErbB3 promoter is CtBP1/2-independent and that ZNF217 and CtBP1/2 play opposite roles in regulating ErbB3 expression. In addition, we identify ErbB3 as one of the mechanisms by which ZNF217 augments PI-3K/Akt signaling.
ZNF217; ErbB3; CtBP2; 20q13; breast cancer
Colorectal cancer (CRC) is a leading cause of cancer-related deaths in the United States. Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with increased risk for CRC. A molecular understanding of the functional consequences of this genetic variation has been complicated because each GWAS SNP is a surrogate for hundreds of other SNPs, most of which are located in non-coding regions. Here we use genomic and epigenomic information to test the hypothesis that the GWAS SNPs and/or correlated SNPs are in elements that regulate gene expression, and identify 23 promoters and 28 enhancers. Using gene expression data from normal and tumour cells, we identify 66 putative target genes of the risk-associated enhancers (10 of which were also identified by promoter SNPs). Employing CRISPR nucleases, we delete one risk-associated enhancer and identify genes showing altered expression. We suggest that similar studies be performed to characterize all CRC risk-associated enhancers.
Previous studies identified genetic variants associated with colorectal cancer (CRC), but the functional consequences of these genetic risk factors remain poorly understood. Here, the authors report that CRC risk variants reside in promoters and enhancers and could increase colon cancer risk through gene expression regulation.
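The step of asking whether risk-associated SNPs fall inside regulatory elements amounts to a coordinate overlap. The following is a minimal sketch under stated assumptions: SNP identifiers, element names and coordinates are invented, intervals are half-open (start included, end excluded), and a real analysis would use annotated promoter and enhancer sets and an indexed interval structure.

```python
def snps_in_regions(snps, regions):
    """Map each SNP id to the names of regions that contain its position.

    snps:    iterable of (snp_id, chrom, pos)
    regions: iterable of (name, chrom, start, end), half-open [start, end)
    """
    hits = {}
    for snp_id, chrom, pos in snps:
        hits[snp_id] = [
            name
            for name, region_chrom, start, end in regions
            if region_chrom == chrom and start <= pos < end
        ]
    return hits


# Hypothetical coordinates, for illustration only.
example_snps = [
    ("snp_1", "chr8", 1050),
    ("snp_2", "chr8", 5000),
    ("snp_3", "chr11", 1050),
]
example_regions = [
    ("enhancer_A", "chr8", 1000, 1200),
    ("promoter_B", "chr11", 900, 1100),
]
print(snps_in_regions(example_snps, example_regions))
```

SNPs with an empty list lie outside every annotated element and would not be prioritized by this criterion alone.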
Gene expression is epigenetically regulated by a combination of histone modifications and methylation of CpG dinucleotides in promoters. In normal cells, CpG-rich promoters are typically unmethylated, marked with histone modifications such as H3K4me3, and are highly active. During neoplastic transformation, CpG dinucleotides of CG-rich promoters become aberrantly methylated, corresponding with the removal of active histone modifications and transcriptional silencing. Outside of promoter regions, distal enhancers play a major role in the cell type-specific regulation of gene expression. Enhancers, which function by bringing activating complexes to promoters through chromosomal looping, are also modulated by a combination of DNA methylation and histone modifications.
Here we use HCT116 colorectal cancer cells with and without mutations in DNA methyltransferases; the mutant cells show a 95% reduction in global DNA methylation levels. These cells are used to study the relationship between DNA methylation, histone modifications, and gene expression. We find that the loss of DNA methylation is not sufficient to reactivate most of the silenced promoters. In contrast, the removal of DNA methylation results in the activation of a large number of enhancer regions as determined by the acquisition of active histone marks.
Although the transcriptome is largely unaffected by the loss of DNA methylation, we identify two distinct mechanisms resulting in the upregulation of distinct sets of genes. One is a direct result of DNA methylation loss at a set of promoter regions and the other is due to the presence of new intragenic enhancers.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0469-0) contains supplementary material, which is available to authorized users.
The self-renewal capacity ascribed to embryonic stem cells (ESCs) is reminiscent of cancer cell proliferation, raising speculation that a common network of genes may regulate these traits. A search for general regulators of these traits yielded a set of microRNAs for which expression is highly enriched in hESCs and liver cancer cells (HCCs), but attenuated in differentiated quiescent hepatocytes. Here, we show that these microRNAs promote hESC self-renewal, as well as HCC proliferation, and when overexpressed in normally quiescent hepatocytes, induce proliferation and activate cancer signaling pathways. Proliferation in hepatocytes is mediated through translational repression of Pten, Tgfbr2, Klf11 and Cdkn1a, which collectively dysregulates the PI3K/AKT/mTOR and TGFβ tumor suppressor signaling pathways. Furthermore, aberrant expression of these miRNAs is observed in human liver tumor tissues, and induces epithelial-mesenchymal transition in hepatocytes. These findings suggest that microRNAs that are essential in normal development as promoters of ESC self-renewal are frequently up-regulated in human liver tumors, and harbor neoplastic transformation potential when they escape silencing in quiescent human hepatocytes.
Human embryonic stem cells; microRNAs; hepatocellular carcinoma cells
DNA motifs are short sequences of 6 to 25 bp that can be highly variable and degenerate. One major approach for predicting transcription factor (TF) binding uses a position weight matrix (PWM) to represent the information content of regulatory sites; however, when used as the sole means of identifying binding sites, this approach suffers from the limited amount of training data available and a high rate of false-positive predictions. The ChIPMotifs program is a de novo motif finding tool developed for ChIP-based high-throughput data, and W-ChIPMotifs is a Web application for ChIPMotifs. It combines various ab initio motif discovery tools such as MEME, MaMF, and Weeder and optimizes the significance of the detected motifs by using bootstrap re-sampling error estimation and a Fisher test. Using these techniques, we determined a PWM for OCT4 that is similar to the canonical OCT4 consensus sequence. In a separate study, we also use de novo motif discovery to suggest that ZNF263 binds to a 24-nt site that differs from the motif predicted by the zinc finger code in several positions.
Motif; ChIP; Position weight matrix; OCT4; ZNF263
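To make the PWM idea concrete, the sketch below builds a log-odds matrix from a handful of aligned sites and scans a sequence for the best-scoring window. It is a generic illustration, not the ChIPMotifs implementation: the pseudocount of 1, the uniform background of 0.25 per base, the function names and the toy sites are all assumptions.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM (list of per-position dicts) from aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(width):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds for one window of the PWM's width."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_match(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Toy aligned sites, invented for illustration.
toy_sites = ["ATGCAAAT", "ATGCAAAT", "ATGCATAT", "ATGTAAAT"]
toy_pwm = build_pwm(toy_sites)
score, start = best_match(toy_pwm, "GGGGATGCAAATCCCC")
print(start, round(score, 2))
```

With only four training sites, many unrelated windows can still score well by chance, which illustrates why a PWM used alone yields a high false-positive rate.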
Transcription factors (TFs) bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 TFs in 458 ChIP-Seq experiments. We found the combinatorial co-association of TFs to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the TF binding into a hierarchy and integrated it with other genomic information (e.g., miRNA regulation), forming a dense meta-network. Factors at different levels have different properties: for instance, top-level TFs more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs (e.g., noise-buffering feed-forward loops). Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (i.e., differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
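A feed-forward loop, one of the network motifs mentioned above, is a triple in which X regulates Y, Y regulates Z, and X also regulates Z directly. The sketch below enumerates such triples in a small directed network; the node names and edges are invented, and testing whether a motif is enriched (by comparison with randomized networks) is outside its scope.

```python
def feed_forward_loops(edges):
    """Return sorted (x, y, z) triples with edges x->y, y->z and x->z."""
    targets = {}
    for source, target in edges:
        if source != target:  # ignore self-regulation
            targets.setdefault(source, set()).add(target)
    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            for z in targets.get(y, ()):
                if z != x and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


# Toy regulatory network (invented): regulator -> target.
toy_edges = [
    ("TF_A", "TF_B"),
    ("TF_B", "gene_C"),
    ("TF_A", "gene_C"),
    ("TF_B", "gene_D"),
]
print(feed_forward_loops(toy_edges))
```

Here TF_A, TF_B and gene_C form a loop, while gene_D does not take part because TF_A has no direct edge to it.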
The dynamic modification of DNA and histones plays a key role in transcriptional regulation through altering the packaging of DNA and modifying the nucleosome surface. These chromatin states, also referred to as the epigenome, are distinctive for different tissues, developmental stages, and disease states and can also be altered by environmental influences. New technologies allow the genome-wide visualization of the information encoded in the epigenome. For example, the chromatin immunoprecipitation (ChIP) assay allows investigators to characterize DNA–protein interactions in vivo. ChIP followed by hybridization to microarrays (ChIP-chip) or by high-throughput sequencing (ChIP-seq) are both powerful tools to identify genome-wide profiles of transcription factors, histone modifications, DNA methylation, and nucleosome positioning. ChIP-seq technology, which can now interrogate the entire human genome at high resolution with only one lane of sequencing, has recently surpassed ChIP-chip technology for epigenomic analyses. Importantly, for the study of primary cells and tissues, epigenetic profiles can be generated using as little as 1 μg of chromatin. In this chapter, we describe in detail the steps involved in performing ChIP assays (with a focus on characterizing histone modifications in primary cells) either manually or using the IP-Star ChIP robot, followed by a detailed protocol to prepare successful libraries for Illumina sequencing. Critical quality control checkpoints are discussed. Although not a focus of this chapter, we also point the reader to several methods by which massive ChIP-seq data sets can be analyzed to extract the tremendous information contained within.
Chromatin immunoprecipitation; ChIP-seq; Next generation sequencing; Epigenomics; Histone modifications; IP-Star; ChIP robot
The last decade has seen an incredible breakthrough in technologies that allow histones, transcription factors (TFs), and RNA polymerases to be precisely mapped throughout the genome. From this research, it is clear that there is a complex interaction between the chromatin landscape and the general transcriptional machinery and that the dynamic control of this interface is central to gene regulation. However, the chromatin remodeling enzymes and general TFs cannot, on their own, recognize and stably bind to promoter or enhancer regions. Rather, they are recruited to cis regulatory regions through interaction with site-specific DNA binding TFs and/or proteins that recognize epigenetic marks such as methylated cytosines or specifically modified amino acids in histones. These “recruitment” factors are modular in structure, reflecting their ability to interact with the genome via one region of the protein and to simultaneously bind to other regulatory proteins via “effector” domains. In this chapter, we provide examples of common effector domains that can function in transcriptional regulation via their ability to (a) interact with the basal transcriptional machinery and general co-activators, (b) interact with other TFs to allow cooperative binding, and (c) directly or indirectly recruit histone and chromatin modifying enzymes.
Half of all human transcription factors are zinc finger proteins and yet very little is known concerning the biological role of the majority of these factors. In particular, very few genome-wide studies of the in vivo binding of zinc finger factors have been performed. Based on in vitro studies and other methods that allow selection of high affinity-binding sites in artificial conditions, a zinc finger code has been developed that can be used to compose a putative recognition motif for a particular zinc finger factor (ZNF). Theoretically, a simple bioinformatics analysis could then predict the genomic locations of all the binding sites for that ZNF. However, it is unlikely that all of the sequences in the human genome having a good match to a predicted motif are in fact occupied in vivo (due to negative influences from repressive chromatin, nucleosomal positioning, overlap of binding sites with other factors, etc). A powerful method to identify in vivo binding sites for transcription factors on a genome-wide scale is the chromatin immunoprecipitation (ChIP) assay, followed by hybridization of the precipitated DNA to microarrays (ChIP-chip) or by high throughput DNA sequencing of the sample (ChIP-seq). Such comprehensive in vivo binding studies would not only identify target genes of a particular zinc finger factor, but also provide binding motif data that could be used to test the validity of the zinc finger code. This chapter describes in detail the steps needed to prepare ChIP samples and libraries for high throughput sequencing using the Illumina GA2 platform and includes descriptions of quality control steps necessary to ensure a successful ChIP-seq experiment.
Zinc fingers; chromatin immunoprecipitation; ChIP-seq; next generation sequencing
Artificial transcription factors (ATFs) and genomic nucleases based on a DNA binding platform consisting of multiple zinc finger domains are currently being developed for clinical applications. However, no genome-wide investigations into their binding specificity have been performed. We have created six-finger ATFs to target two different 18 nt regions of the human SOX2 promoter; each ATF is constructed such that it contains or lacks a super KRAB domain (SKD) that interacts with a complex containing repressive histone methyltransferases. ChIP-seq analysis of the effector-free ATFs in MCF7 breast cancer cells identified thousands of binding sites, mostly in promoter regions; the addition of an SKD domain increased the number of binding sites ∼5-fold, with a majority of the new sites located outside of promoters. De novo motif analyses suggest that the lack of binding specificity is due to subsets of the finger domains being used for genomic interactions. Although the ATFs display widespread binding, few genes showed expression differences; genes repressed by the ATF-SKD have stronger binding sites and are more enriched for a 12 nt motif. Interestingly, epigenetic analyses indicate that the transcriptional repression caused by the ATF-SKD is not due to changes in active histone modifications.
The ZNF217 gene, encoding a C2H2 zinc finger protein, is located at 20q13 and found amplified and overexpressed in greater than 20% of breast tumors. Current studies indicate ZNF217 drives tumorigenesis, yet the regulatory mechanisms of ZNF217 are largely unknown. Because ZNF217 associates with chromatin modifying enzymes, we postulate that ZNF217 functions to regulate specific gene signaling networks. Here, we present a large-scale functional genomic analysis of ZNF217, which provides insights into the regulatory role of ZNF217 in MCF7 breast cancer cells.
ChIP-seq analysis reveals that the majority of ZNF217 binding sites are located at distal regulatory regions associated with the chromatin marks H3K27ac and H3K4me1. Analysis of ChIP-seq transcription factor binding sites shows clustering of ZNF217 with FOXA1, GATA3 and ERalpha binding sites, supported by the enrichment of corresponding motifs for the ERalpha-associated cis-regulatory sequences. ERalpha expression highly correlates with ZNF217 in lysates from breast tumors (n = 15), and ERalpha co-precipitates ZNF217 and its binding partner CtBP2 from nuclear extracts. Transcriptome profiling following ZNF217 depletion identifies differentially expressed genes co-bound by ZNF217 and ERalpha; gene ontology suggests a role for ZNF217-ERalpha in expression programs associated with ER+ breast cancer studies found in the Molecular Signature Database. Data-mining of expression data from breast cancer patients correlates ZNF217 with reduced overall survival.
Our genome-wide ZNF217 data suggest a functional role for ZNF217 at ERalpha target genes. Future studies will investigate whether ZNF217 expression contributes to aberrant ERalpha regulatory events in ER+ breast cancer and hormone resistance.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-520) contains supplementary material, which is available to authorized users.
Breast cancer; ZNF217; ERalpha; GATA3; FOXA1; ChIP-seq; RNA-seq; Endocrine resistance
C2H2 zinc fingers are found in several transcriptional regulators in the immune system. However, these proteins usually contain more fingers than are needed for stable DNA binding, suggesting that different fingers regulate different genes and functions. Mice lacking finger 1 or finger 4 of Ikaros exhibited distinct subsets of the phenotypes of Ikaros-null mice. Most notably, the two fingers controlled different stages of lymphopoiesis and finger 4 was selectively required for tumor suppression. The distinct phenotypes suggest that only a small number of Ikaros target genes are critical for each of its biological functions. Subdivision of phenotypes and targets by mutagenesis of individual fingers will facilitate efforts to understand how members of this prevalent family regulate development, immunity and disease.
Variability in the quality of antibodies to histone post-translational modifications (PTMs) presents a widely recognized hindrance in epigenetics research. Here, by using antibody engineering technologies, we produced recombinant antibodies directed to the trimethylated lysine residues of histone H3 with high specificity and affinity and no lot-to-lot variation. These recombinant antibodies performed well in common epigenetics applications, and their high specificity enabled us to identify positive and negative correlations among histone PTMs.
Genome-wide association studies (GWAS) have revolutionized the field of cancer genetics, but the causal links between increased genetic risk and onset/progression of disease processes remain to be identified. Here we report the first step in such an endeavor for prostate cancer. We provide a comprehensive annotation of the 77 known risk loci, based upon highly correlated variants in biologically relevant chromatin annotations; we identified 727 such potentially functional SNPs. We also provide a detailed account of possible protein disruption, microRNA target sequence disruption and regulatory response element disruption of all correlated SNPs at . 88% of the 727 SNPs fall within putative enhancers, and many alter critical residues in the response elements of transcription factors known to be involved in prostate biology. We define risk enhancers as regions that have enhancer chromatin biofeatures in prostate-derived cell lines and contain prostate-cancer correlated SNPs. To aid the identification of these enhancers, we performed genome-wide ChIP-seq for H3K27-acetylation, a mark of actively engaged enhancers, and for the transcription factor TCF7L2. We analyzed in depth three variants in risk enhancers, two of which show significantly altered androgen sensitivity in LNCaP cells. These include rs4907792, which is in linkage disequilibrium () with an eQTL for NUDT11 (on the X chromosome) in prostate tissue, and rs10486567, the index SNP in intron 3 of the JAZF1 gene on chromosome 7. The variant rs4907792 lies within a critical residue of a strong consensus androgen response element that is interrupted in the protective allele, resulting in a 56% decrease in its androgen sensitivity, whereas rs10486567 affects both NKX3-1 and FOXA-AR motifs, where the risk allele results in a 39% increase in basal activity and a 28% fold-increase in androgen-stimulated enhancer activity. 
Identification of such enhancer variants and their potential target genes represents a preliminary step in connecting risk to disease process.
In the following work we provide a complete summary annotation of functional hypotheses relating to risk identified by genome-wide association studies of prostate cancer. In addition, we present new genome-wide profiles for H3K27-acetylation and TCF7L2 binding in LNCaP cells. We also introduce the concept of a risk enhancer, and characterize two novel androgen-sensitive enhancers whose activity is specifically affected by prostate-cancer risk SNPs. Our findings represent a preliminary approach to systematic identification of causal variation underlying cancer risk in the prostate.