We report testing of the specificity and utility of over 200 antibodies raised against 57 different histone modifications, in Drosophila melanogaster, Caenorhabditis elegans and human cells. While most antibodies performed well, over 25% failed specificity tests by dot blot or western blot. Among specific antibodies, over 20% failed in chromatin immunoprecipitation experiments. We advise rigorous testing of histone-modification antibodies before use and provide a website for posting new test results.
Animals have body parts made of similar cell types located at different axial positions (e.g. limbs). The identity and distinct morphology of each structure is often specified by the activity of different “master regulator” transcription factors. Although similarities in gene expression have been observed between body parts made of similar cell types, it is not known how regulatory information in the genome is differentially utilized to create morphologically diverse structures in development. Here, we use genome-wide open chromatin profiling to show that among the Drosophila appendages, the same DNA regulatory modules are accessible throughout the genome at a given stage of development, except at the loci encoding the master regulators themselves. In addition, while open chromatin profiles change over developmental time, these changes are coordinated between different appendages. We propose that master regulators create morphologically distinct structures by differentially influencing the function of the same set of DNA regulatory modules.
Dosage compensation, which regulates the expression of genes residing on the sex chromosomes, has provided valuable insights into chromatin-based mechanisms of gene regulation. The nematode Caenorhabditis elegans has adopted various strategies to down-regulate and even nearly silence the X chromosomes. This article discusses the different chromatin-based strategies used in somatic tissues and in the germline to modulate gene expression from the C. elegans X chromosomes and compares these strategies to those used by other organisms to cope with similar X-chromosome dosage differences.
In C. elegans, the two sexes have different numbers of X chromosomes. The soma and germline have evolved distinct dosage compensation mechanisms to deal with this difference in X ploidy.
Nuclear pores associate with active protein-coding genes in yeast and have been implicated in transcriptional regulation. Here, we show that in addition to transcriptional regulation, key components of C. elegans nuclear pores are required for processing of a subset of small nucleolar RNAs (snoRNAs) and tRNAs transcribed by RNA Polymerase (Pol) III. Chromatin immunoprecipitation of NPP-13 and NPP-3, two integral nuclear pore components, and importin-β IMB-1, provides strong evidence that this requirement is direct. All three proteins associate specifically with tRNA and snoRNA genes undergoing Pol III transcription. These pore components bind immediately downstream of the Pol III pre-initiation complex, but are not required for Pol III recruitment. Instead, NPP-13 is required for cleavage of tRNA and snoRNA precursors into mature RNAs, whereas Pol II transcript processing occurs normally. Our data suggest that integral nuclear pore proteins act to coordinate transcription and processing of Pol III transcripts in C. elegans.
Histones and their post-translational modifications influence the regulation of many DNA-dependent processes. Although an essential role for histone-modifying enzymes in these processes is well established, defining the specific contribution of individual histone residues remains a challenge because many histone-modifying enzymes have non-histone targets. This challenge is exacerbated by the paucity of suitable approaches to genetically engineer histone genes in metazoans. Here, we describe a facile platform in Drosophila for generating and analyzing any desired histone genotype, and we use it to test the in vivo function of three histone residues. We demonstrate that H4K20 is essential neither for DNA replication nor for completion of development, contrary to conclusions drawn from analyses of H4K20 methyltransferases. We also show that H3K36 is required for viability and H3K27 is essential for maintenance of cellular identity during development. These findings highlight the power of engineering histones to interrogate genome structure and function in animals.
A major challenge in biology is determining how evolutionarily novel characters originate; however, mechanistic explanations for the origin of new characters are almost completely unknown. The evolution of pregnancy is an excellent system in which to study the origin of novelties because mammals preserve stages in the transition from egg laying to live birth. To determine the molecular bases of this transition, we characterized the pregnant/gravid uterine transcriptome from tetrapods to trace the evolutionary history of uterine gene expression. We show that thousands of genes evolved endometrial expression during the origins of mammalian pregnancy, including genes that mediate maternal-fetal communication and immunotolerance. Furthermore, thousands of cis-regulatory elements that mediate decidualization and cell-type identity in decidualized stromal cells are derived from ancient mammalian transposable elements (TEs). Our results indicate that one of the defining mammalian novelties evolved from DNA sequences derived from ancient mammalian TEs coopted into hormone-responsive regulatory elements distributed throughout the genome.
Eviction or destabilization of nucleosomes from chromatin is a hallmark of functional regulatory elements of the eukaryotic genome. Historically identified by nuclease hypersensitivity, these regulatory elements are typically bound by transcription factors or other regulatory proteins. FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) is an alternative approach to identify these genomic regions and has proven successful in a multitude of eukaryotic cell and tissue types. Cells or dissociated tissues are crosslinked briefly with formaldehyde, lysed, and sonicated. Sheared chromatin is subjected to phenol-chloroform extraction and the isolated DNA, typically encompassing 1–3% of the human genome, is purified. We provide guidelines for quantitative analysis by PCR, microarrays, or next-generation sequencing. Regulatory elements enriched by FAIRE display high concordance with those identified by nuclease hypersensitivity or ChIP, and the entire procedure can be completed in three days. FAIRE exhibits low technical variability, which allows its use in large-scale studies of chromatin from normal or diseased tissues.
FAIRE; open chromatin; nucleosome; next-generation sequencing
We performed a systematic evaluation of how variations in sequencing depth and other parameters influence interpretation of chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiments. Using Drosophila S2 cells, we generated ChIP-seq datasets for a site-specific transcription factor (Suppressor of Hairy-wing) and a histone modification (H3K36me3). We detected a chromatin-state bias: open chromatin regions yielded higher coverage, which led to false positives if not corrected and had a greater effect on detection specificity than any base-composition bias. Paired-end sequencing revealed that single-end data underestimated ChIP library complexity at high coverage. The removal of reads originating at the same base reduced false positives while having little effect on detection sensitivity. Even at a depth of ~1 read/bp coverage of mappable genome, ~1% of the narrow peaks detected on a tiling array were missed by ChIP-seq. Evaluation of widely used ChIP-seq analysis tools suggests that adjustments or algorithm improvements are required to handle datasets with deep coverage.
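The duplicate-removal step mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the read representation (chromosome, 5' start, strand), the function name `remove_duplicate_starts`, and the choice to treat the two strands separately are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical read representation: (chromosome, 5' start position, strand).
Read = Tuple[str, int, str]


def remove_duplicate_starts(reads: Iterable[Read]) -> List[Read]:
    """Keep the first read seen for each (chromosome, start, strand) key.

    Assumption: reads on opposite strands that share a coordinate are
    treated as distinct; the abstract does not specify this detail.
    """
    seen = set()
    kept = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            kept.append(read)
    return kept


# Toy input, invented for illustration only.
reads = [
    ("chr2L", 1000, "+"),
    ("chr2L", 1000, "+"),  # same base and strand as the first read: dropped
    ("chr2L", 1000, "-"),  # same base, opposite strand: kept
    ("chr2L", 1042, "+"),
]
unique_reads = remove_duplicate_starts(reads)
print(len(unique_reads))
```

Collapsing reads this way discards true biological duplicates along with amplification artifacts, which is why its effect on sensitivity has to be checked empirically, as the abstract reports.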
Single-cell ATAC-seq detects open chromatin in individual cells. Currently data are sparse, but combining information from many single cells can identify determinants of cell-to-cell chromatin variation.
After fertilization but prior to the onset of zygotic transcription, the C. elegans zygote cleaves asymmetrically to create the anterior AB and posterior P1 blastomeres, each of which goes on to generate distinct cell lineages. To understand how patterns of RNA inheritance and abundance arise after this first asymmetric cell division, we pooled hand-dissected AB and P1 blastomeres and performed RNA-seq. Our approach identified over 200 asymmetrically abundant mRNA transcripts. We confirmed symmetric or asymmetric abundance patterns for a subset of these transcripts using smFISH. smFISH also revealed heterogeneous subcellular patterning of the P1-enriched transcripts chs-1 and bpl-1. We screened transcripts enriched in a given blastomere for embryonic defects using RNAi. The gene neg-1 (F32D1.6) encoded an AB-enriched (anterior) transcript and was required for proper morphology of anterior tissues. In addition, analysis of the asymmetric transcripts yielded clues regarding the post-transcriptional mechanisms that control cellular mRNA abundance during asymmetric cell divisions, which are common in developing organisms.
At key moments in development, asymmetric cell divisions give rise to daughter cells of differing characteristics, a process that promotes cell-type diversity in complex organisms. The first cell division of the C. elegans early embryo is a powerful model for understanding asymmetric cell division because the timing of divisions and the placement of their division planes are precise and reproducible. We surveyed the mRNA content of each daughter cell in the C. elegans 2-cell embryo using low-input RNA sequencing. We identified several hundred asymmetric transcripts and tested them for functions in development. We found that the gene neg-1 produced mRNA and protein preferentially on the anterior (head-side) of 2-cell and 4-cell stage embryos and that loss of neg-1 led to consequences in anterior morphogenesis later in development. We also analyzed the asymmetric transcripts using quantitative microscopy, bioinformatics comparisons with previously existing datasets, and RNA sequence motif discovery to gain insight into the mechanisms by which asymmetric abundance patterns arise.
The C. elegans Dosage Compensation Complex (DCC) associates with both X chromosomes of XX animals to reduce X-linked transcript levels. Five DCC members are homologous to subunits of the evolutionarily conserved condensin complex, while two non-condensin subunits are required for DCC recruitment to X. Here, we investigated the molecular mechanism of DCC recruitment and spreading along X by examining gene expression and the binding patterns of DCC subunits in different stages of development, and in strains harboring X;autosome fusions. We show that DCC binding is dynamically specified according to gene activity during development, and that the mechanism of DCC spreading is independent of X-chromosome DNA sequence. Accordingly, in X;A fusion strains DCC spreading propagates from X-linked recruitment sites onto autosomal promoters as a function of distance. Quantitative analysis of spreading suggests that the condensin-like subunits spread from recruitment sites to promoters more readily than subunits involved in initial X-targeting. Via these mechanisms, a highly conserved chromatin complex is appropriated to accomplish domain-scale transcriptional regulation during development. Similarities to the X-recognition and spreading strategies used by the Drosophila DCC suggest mechanisms fundamental to chromosome-scale gene regulation.
The binding of sequence-specific regulatory factors and the recruitment of chromatin remodeling activities cause nucleosomes to be evicted from chromatin in eukaryotic cells. Traditionally, these active sites have been identified experimentally through their sensitivity to nucleases. Here we describe the details of a simple procedure for the genome-wide isolation of nucleosome-depleted DNA from human chromatin, termed FAIRE (Formaldehyde Assisted Isolation of Regulatory Elements). We also provide protocols for different methods of detecting FAIRE-enriched DNA, including use of PCR, DNA microarrays, and next-generation sequencing. FAIRE works on all eukaryotic chromatin tested to date. To perform FAIRE, chromatin is crosslinked with formaldehyde, sheared by sonication, and phenol-chloroform extracted. Most genomic DNA is crosslinked to nucleosomes and is sequestered to the interphase, whereas DNA recovered in the aqueous phase corresponds to nucleosome-depleted regions of the genome. The isolated regions are largely coincident with the location of DNaseI hypersensitive sites, transcriptional start sites, enhancers, insulators, and active promoters. Given its speed and simplicity, FAIRE has utility in establishing chromatin profiles of diverse cell types in health and disease, isolating DNA regulatory elements en masse for further characterization, and as a screening assay for the effects of small molecules on chromatin organization.
Regulatory elements; chromatin accessibility; nucleosome occupancy; histone; FAIRE; DNase hypersensitivity; transcriptional regulation
RNA chaperones are ubiquitous, heterogeneous proteins essential for RNA structural biogenesis and function. We investigated the mechanism of chaperone-mediated RNA folding by following the time-resolved dimerization of the packaging domain of a retroviral RNA at nucleotide resolution. In the absence of the nucleocapsid (NC) chaperone, dimerization proceeded via multiple, slow-folding intermediates. In the presence of NC, dimerization occurred rapidly via a single structural intermediate. The RNA binding domain of hnRNP A1 protein (UP1), a structurally unrelated chaperone, also accelerated dimerization. Both chaperones interacted primarily with guanosine residues. Replacing guanosine with more weakly pairing inosine yielded an RNA that folded rapidly without a facilitating chaperone. These results show RNA chaperones can simplify RNA folding landscapes by weakening intramolecular interactions involving guanosine and explain many RNA chaperone activities.
Transient induction of the Src oncoprotein in a non-transformed breast cell line can initiate an epigenetic switch to a cancer cell via a positive feedback loop that involves activation of the signal transducer and activator of transcription 3 protein (STAT3) and NF-κB transcription factors.
We show that during the transformation process, nucleosome-depleted regions (defined by formaldehyde-assisted isolation of regulatory elements (FAIRE)) are largely unchanged and that both before and during transformation, STAT3 binds almost exclusively to previously open chromatin regions. Roughly a third of the transformation-inducible genes require STAT3 for induction. STAT3 and NF-κB appear to drive the regulation of different gene sets during the transformation process. Interestingly, STAT3 directly regulates the expression of NFKB1, which encodes a subunit of NF-κB, and IL6, a cytokine that stimulates STAT3 activity. Lastly, many STAT3 binding sites are also bound by FOS, and the expression of several AP-1 factors is altered during transformation in a STAT3-dependent manner, suggesting that STAT3 may cooperate with AP-1 proteins.
These observations uncover additional complexities to the inflammatory feedback loop that are likely to contribute to the epigenetic switch. In addition, gene expression changes during transformation, whether driven by pre-existing or induced transcription factors, occur largely through pre-existing nucleosome-depleted regions.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-8935-8-7) contains supplementary material, which is available to authorized users.
MCF-10A; Transformation; FAIRE; STAT3; FOS
Associating genetic variation with quantitative measures of gene regulation offers a way to bridge the gap between genotype and complex phenotypes. In order to identify quantitative trait loci (QTLs) that influence the binding of a transcription factor in humans, we measured binding of the multifunctional transcription and chromatin factor CTCF in 51 HapMap cell lines. We identified thousands of QTLs in which genotype differences were associated with differences in CTCF binding strength, hundreds of them confirmed by directly observable allele-specific binding bias. The majority of QTLs were either within 1 kb of the CTCF binding motif, or in linkage disequilibrium with a variant within 1 kb of the motif. On the X chromosome we observed three classes of binding sites: a minority class bound only to the active copy of the X chromosome, the majority class bound to both the active and inactive X, and a small set of female-specific CTCF sites associated with two non-coding RNA genes. In sum, our data reveal extensive genetic effects on CTCF binding, both direct and indirect, and identify a diversity of patterns of CTCF binding on the X chromosome.
We have systematically measured the effect of normal genetic variation present in a human population on the binding of a specific chromatin protein (CTCF) to DNA by measuring its binding in 51 human cell lines. We observed a large number of changes in protein binding that we can confidently attribute to genetic effects. The corresponding genetic changes are often clustered around the binding motif for CTCF, but only a minority are actually within the motif. Unexpectedly, we also find that at most binding sites on the X chromosome, CTCF binding occurs equally on both the X chromosomes in females at the same level as on the single X chromosome in males. This finding suggests that in general, CTCF binding is not subject to global dosage compensation, the process which equalizes gene expression levels from the two female X chromosomes and the single male X.
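As a rough illustration of what associating genotype with binding strength can look like, the sketch below fits an ordinary least-squares line of binding signal on allele dosage (0, 1, or 2 copies of the alternate allele). The additive model, the function name `regress_binding_on_genotype`, and the toy values are assumptions for illustration; the abstract does not state which statistical model was used.

```python
from typing import Sequence, Tuple


def regress_binding_on_genotype(
    genotypes: Sequence[float], binding: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of binding = intercept + slope * genotype dosage.

    Assumption: an additive genotype effect; returns (slope, intercept).
    """
    n = len(genotypes)
    if n != len(binding) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_g = sum(genotypes) / n
    mean_b = sum(binding) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("genotype does not vary across samples")
    sxy = sum((g - mean_g) * (b - mean_b) for g, b in zip(genotypes, binding))
    slope = sxy / sxx
    intercept = mean_b - slope * mean_g
    return slope, intercept


# Toy data, invented for illustration: six cell lines, one variant, one site.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_binding = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, intercept = regress_binding_on_genotype(toy_genotypes, toy_binding)
print(round(slope, 3), round(intercept, 3))
```

A real analysis would also need a significance test and correction for the number of variant-site pairs tested; those steps are omitted here.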
Kaposi's sarcoma-associated herpesvirus (KSHV) is an oncogenic gammaherpesvirus which establishes latent infection in endothelial and B cells, as well as in primary effusion lymphoma (PEL). During latency, the viral genome exists as a circular DNA minichromosome (episome) and is packaged into chromatin analogous to human chromosomes. Only a small subset of promoters, those which drive latent RNAs, are active in latent episomes. In general, nucleosome depletion (“open chromatin”) is a hallmark of eukaryotic regulatory elements such as promoters and transcriptional enhancers or insulators. We applied formaldehyde-assisted isolation of regulatory elements (FAIRE) followed by next-generation sequencing to identify regulatory elements in the KSHV genome and integrated these data with previously identified locations of histone modifications, RNA polymerase II occupancy, and CTCF binding sites. We found that (i) regions of open chromatin were not restricted to the transcriptionally defined latent loci; (ii) open chromatin was adjacent to regions harboring activating histone modifications, even at transcriptionally inactive loci; and (iii) CTCF binding sites fell within regions of open chromatin with few exceptions, including the constitutive LANA promoter and the vIL6 promoter. FAIRE-identified nucleosome depletion was similar among B and endothelial cell lineages, suggesting a common viral genome architecture in all forms of latency.
Laminopathies are diseases characterized by defects in nuclear envelope structure. A well-known example is Emery-Dreifuss muscular dystrophy, which is caused by mutations in the human lamin A/C and emerin genes. While most nuclear envelope proteins are ubiquitously expressed, laminopathies often affect only a subset of tissues. The molecular mechanisms underlying these tissue-specific manifestations remain elusive. We hypothesize that different functional subclasses of genes might be differentially affected by defects in specific nuclear envelope components.
Here we determine genome-wide DNA association profiles of two nuclear envelope components, lamin/LMN-1 and emerin/EMR-1 in adult Caenorhabditis elegans. Although both proteins bind to transcriptionally inactive regions of the genome, EMR-1 is enriched at genes involved in muscle and neuronal function. Deletion of either EMR-1 or LEM-2, another integral envelope protein, causes local changes in nuclear architecture as evidenced by altered association between DNA and LMN-1. Transcriptome analyses reveal that EMR-1 and LEM-2 are associated with gene repression, particularly of genes implicated in muscle and nervous system function. We demonstrate that emr-1, but not lem-2, mutants are sensitive to the cholinesterase inhibitor aldicarb, indicating altered activity at neuromuscular junctions.
We identify a class of elements that bind EMR-1 but do not associate with LMN-1, and these are enriched for muscle and neuronal genes. Our data support a redundant function of EMR-1 and LEM-2 in chromatin anchoring to the nuclear envelope and gene repression. We demonstrate a specific role of EMR-1 in neuromuscular junction activity that may contribute to Emery-Dreifuss muscular dystrophy in humans.
The hsp-16.2 promoter is sufficient for recruitment of hsp-16.2 to nuclear pore complexes in a manner dependent on RNA pol II and ENY-2, but not on full-length mRNA production.
Some inducible yeast genes relocate to nuclear pores upon activation, but the general relevance of this phenomenon has remained largely unexplored. Here we show that the bidirectional hsp-16.2/41 promoter interacts with the nuclear pore complex upon activation by heat shock in the nematode Caenorhabditis elegans. Direct pore association was confirmed by both super-resolution microscopy and chromatin immunoprecipitation. The hsp-16.2 promoter was sufficient to mediate perinuclear positioning under basal level conditions of expression, both in integrated transgenes carrying from 1 to 74 copies of the promoter and in a single-copy genomic insertion. Perinuclear localization of the uninduced gene depended on promoter elements essential for induction and required the heat-shock transcription factor HSF-1, RNA polymerase II, and ENY-2, a factor that binds both SAGA and the THO/TREX mRNA export complex. After induction, colocalization with nuclear pores increased significantly at the promoter and along the coding sequence, dependent on the same promoter-associated factors, including active RNA polymerase II, and correlated with nascent transcripts.
DNaseI hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers, and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify ~2.9 million DHSs that encompass virtually all known experimentally-validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation, and regulatory factor occupancy patterns. We connect ~580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is choreographed with dozens to hundreds of co-activated elements, and the trans-cellular DNaseI sensitivity pattern at a given region can predict cell type-specific functional behaviors. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation.
Centromeres are chromosomal loci that direct segregation of the genome during cell division. The histone H3 variant CENP-A (also known as CenH3) defines centromeres in monocentric organisms, which confine centromere activity to a discrete chromosomal region, and holocentric organisms, which distribute centromere activity along the chromosome length [1–3]. Because the highly repetitive DNA found at most centromeres is neither necessary nor sufficient for centromere function, stable inheritance of CENP-A nucleosomal chromatin is postulated to epigenetically propagate centromere identity [4]. Here, we show that in the holocentric nematode Caenorhabditis elegans pre-existing CENP-A nucleosomes are not necessary to guide recruitment of new CENP-A nucleosomes. This is indicated by lack of CENP-A transmission by sperm during fertilization and by removal and subsequent reloading of CENP-A during oogenic meiotic prophase. Genome-wide mapping of CENP-A location in embryos and quantification of CENP-A molecules in nuclei revealed that CENP-A is incorporated at low density in domains that cumulatively encompass half the genome. Embryonic CENP-A domains are established in a pattern inverse to regions that are transcribed in the germline and early embryo, and ectopic transcription of genes in a mutant germline altered the pattern of CENP-A incorporation in embryos. Furthermore, regions transcribed in the germline but not embryos fail to incorporate CENP-A throughout embryogenesis. We propose that germline transcription defines genomic regions that exclude CENP-A incorporation in progeny, and that zygotic transcription during early embryogenesis remodels and reinforces this basal pattern. These findings link centromere identity to transcription and shed light on the evolutionary plasticity of centromeres.
Many animal species use a chromosome-based mechanism of sex determination, which has led to the coordinate evolution of dosage-compensation systems. Dosage compensation not only corrects the imbalance in the number of X chromosomes between the sexes but also is hypothesized to correct dosage imbalance within cells that is due to monoallelic X-linked expression and biallelic autosomal expression, by upregulating X-linked genes twofold (termed ‘Ohno’s hypothesis’). Although this hypothesis is well supported by expression analyses of individual X-linked genes and by microarray-based transcriptome analyses, it was challenged by a recent study using RNA sequencing and proteomics. We obtained new, independent RNA-seq data, measured RNA polymerase distribution and reanalyzed published expression data in mammals, C. elegans and Drosophila. Our analyses, which take into account the skewed gene content of the X chromosome, support the hypothesis of upregulation of expressed X-linked genes to balance expression of the genome.
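The point about skewed X-chromosome gene content can be made concrete with a small sketch. It compares the median expression of X-linked genes with that of autosomal genes, with and without restricting to expressed genes. The expression threshold, the median-ratio statistic, the function name `x_to_autosome_ratio`, and the toy values are all assumptions for illustration and are not taken from the study.

```python
from statistics import median
from typing import Dict, Tuple


def x_to_autosome_ratio(
    expression: Dict[str, Tuple[str, float]], min_expression: float = 1.0
) -> float:
    """Median X-linked expression divided by median autosomal expression.

    expression maps gene name -> (chromosome, expression value).
    Only genes with value >= min_expression are counted (assumed cutoff).
    """
    x_values = [v for chrom, v in expression.values()
                if chrom == "X" and v >= min_expression]
    a_values = [v for chrom, v in expression.values()
                if chrom != "X" and v >= min_expression]
    if not x_values or not a_values:
        raise ValueError("no genes pass the expression filter")
    return median(x_values) / median(a_values)


# Toy data, invented for illustration: the X carries many silent genes.
toy = {
    "x1": ("X", 10.0), "x2": ("X", 12.0),
    "x3": ("X", 0.0), "x4": ("X", 0.0), "x5": ("X", 0.0),
    "a1": ("I", 10.0), "a2": ("II", 12.0), "a3": ("III", 11.0),
}
ratio_expressed = x_to_autosome_ratio(toy, min_expression=1.0)
ratio_all = x_to_autosome_ratio(toy, min_expression=0.0)
print(ratio_expressed, ratio_all)
```

In this toy case the ratio among expressed genes is balanced, while including the silent X-linked genes pulls the ratio toward zero, which is the kind of distortion that accounting for gene content is meant to avoid.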