Many animal species use a chromosome-based mechanism of sex determination, which has led to the coordinate evolution of dosage-compensation systems. Dosage compensation not only corrects the imbalance in the number of X chromosomes between the sexes but also is hypothesized to correct dosage imbalance within cells that is due to monoallelic X-linked expression and biallelic autosomal expression, by upregulating X-linked genes twofold (termed ‘Ohno’s hypothesis’). Although this hypothesis is well supported by expression analyses of individual X-linked genes and by microarray-based transcriptome analyses, it was challenged by a recent study using RNA sequencing and proteomics. We obtained new, independent RNA-seq data, measured RNA polymerase distribution and reanalyzed published expression data in mammals, C. elegans and Drosophila. Our analyses, which take into account the skewed gene content of the X chromosome, support the hypothesis of upregulation of expressed X-linked genes to balance expression of the genome.
We performed a systematic evaluation of how variations in sequencing depth and other parameters influence interpretation of chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiments. Using Drosophila S2 cells, we generated ChIP-seq datasets for a site-specific transcription factor (Suppressor of Hairy-wing) and a histone modification (H3K36me3). We detected a chromatin state bias: open chromatin regions yielded higher coverage, which led to false positives if not corrected and had a greater effect on detection specificity than any base-composition bias. Paired-end sequencing revealed that single-end data underestimated ChIP library complexity at high coverage. The removal of reads originating at the same base reduced false positives while having little effect on detection sensitivity. Even at a depth of ~1 read/bp coverage of mappable genome, ~1% of the narrow peaks detected on a tiling array were missed by ChIP-seq. Evaluation of widely used ChIP-seq analysis tools suggests that adjustments or algorithm improvements are required to handle datasets with deep coverage.
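The duplicate-removal step mentioned in this abstract (discarding reads that originate at the same base) can be illustrated with a minimal sketch. The read representation as a `(chromosome, strand, start)` tuple, the decision to treat the two strands separately, and the function name `remove_duplicate_reads` are assumptions made for illustration; they are not taken from the study's actual pipeline.

```python
from typing import Iterable, List, Tuple

# Hypothetical read representation: (chromosome, strand, 5' start position).
Read = Tuple[str, str, int]


def remove_duplicate_reads(reads: Iterable[Read]) -> List[Read]:
    """Keep only the first read seen at each (chromosome, strand, start).

    Assumption: reads on opposite strands at the same coordinate are
    treated as distinct, which is one common convention.
    """
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


reads = [
    ("chr2L", "+", 1000),
    ("chr2L", "+", 1000),  # same base and strand: dropped as a duplicate
    ("chr2L", "-", 1000),  # same base, opposite strand: kept
    ("chr2L", "+", 1005),
]
unique_reads = remove_duplicate_reads(reads)
print(len(reads), "reads ->", len(unique_reads), "unique start positions")
```

Collapsing to one read per start position is the simplest form of this filter; a real analysis might instead cap the number of reads allowed per position.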
Dynamic access to genetic information is central to organismal development and environmental response. Consequently, genomic processes must be regulated by mechanisms that alter genome function relatively rapidly [1-4]. Conventional chromatin immunoprecipitation (ChIP) experiments measure transcription factor (TF) occupancy [5], but are blind to kinetics and are poor predictors of TF function at a given locus. To measure TF binding dynamics genome-wide, we performed competition ChIP [6,7] with a sequence-specific S. cerevisiae transcription factor, Rap1 [8]. Rap1 binding dynamics and Rap1 occupancy were only weakly correlated (R² = 0.14), but binding dynamics were more strongly linked to function than occupancy. Long Rap1 residence was coupled to transcriptional activation, while fast binding turnover, which we term “treadmilling”, was linked to low transcriptional output. Thus, DNA-binding events that appear identical by conventional ChIP may have starkly different underlying modes of interaction that lead to opposing functional outcomes. We propose that TF binding turnover is a major point of regulation in determining the functional consequences of transcription factor binding, and is mediated in large part by control of competition between TFs and nucleosomes. Our model (Supplementary Fig. 1) predicts a clutch-like mechanism that rapidly engages a treadmilling transcription factor into a stable binding state, or vice-versa, to modulate TF function.
The C. elegans Dosage Compensation Complex (DCC) associates with both X chromosomes of XX animals to reduce X-linked transcript levels. Five DCC members are homologous to subunits of the evolutionarily conserved condensin complex, while two non-condensin subunits are required for DCC recruitment to X. Here, we investigated the molecular mechanism of DCC recruitment and spreading along X by examining gene expression and the binding patterns of DCC subunits in different stages of development, and in strains harboring X;autosome fusions. We show that DCC binding is dynamically specified according to gene activity during development, and that the mechanism of DCC spreading is independent of X-chromosome DNA sequence. Accordingly, in X;A fusion strains DCC spreading propagates from X-linked recruitment sites onto autosomal promoters as a function of distance. Quantitative analysis of spreading suggests that the condensin-like subunits spread from recruitment sites to promoters more readily than subunits involved in initial X-targeting. Via these mechanisms, a highly conserved chromatin complex is appropriated to accomplish domain-scale transcriptional regulation during development. Similarities to the X-recognition and spreading strategies used by the Drosophila DCC suggest mechanisms fundamental to chromosome-scale gene regulation.
The binding of sequence-specific regulatory factors and the recruitment of chromatin remodeling activities cause nucleosomes to be evicted from chromatin in eukaryotic cells. Traditionally, these active sites have been identified experimentally through their sensitivity to nucleases. Here we describe the details of a simple procedure for the genome-wide isolation of nucleosome-depleted DNA from human chromatin, termed FAIRE (Formaldehyde Assisted Isolation of Regulatory Elements). We also provide protocols for different methods of detecting FAIRE-enriched DNA, including use of PCR, DNA microarrays, and next-generation sequencing. FAIRE works on all eukaryotic chromatin tested to date. To perform FAIRE, chromatin is crosslinked with formaldehyde, sheared by sonication, and phenol-chloroform extracted. Most genomic DNA is crosslinked to nucleosomes and is sequestered to the interphase, whereas DNA recovered in the aqueous phase corresponds to nucleosome-depleted regions of the genome. The isolated regions are largely coincident with the location of DNaseI hypersensitive sites, transcriptional start sites, enhancers, insulators, and active promoters. Given its speed and simplicity, FAIRE has utility in establishing chromatin profiles of diverse cell types in health and disease, isolating DNA regulatory elements en masse for further characterization, and as a screening assay for the effects of small molecules on chromatin organization.
Regulatory elements; chromatin accessibility; nucleosome occupancy; histone; FAIRE; DNase hypersensitivity; transcriptional regulation
Previous studies in Saccharomyces cerevisiae established that depletion of histone H4 results in the genome-wide transcriptional de-repression of hundreds of genes. To probe the mechanism of this transcriptional de-repression, we depleted nucleosomes in vivo by conditional repression of histone H3 transcription. We then measured the resulting changes in transcription by RNA–seq and in chromatin organization by MNase–seq. This experiment also bears on the degree to which trans-acting factors and DNA–encoded elements affect nucleosome position and occupancy in vivo. We identified ∼60,000 nucleosomes genome wide, and we classified ∼2,000 as having preferentially reduced occupancy following H3 depletion and ∼350 as being preferentially retained. We found that the in vivo influence of DNA sequences that favor or disfavor nucleosome occupancy increases following histone H3 depletion, demonstrating that nucleosome density contributes to moderating the influence of DNA sequence on nucleosome formation in vivo. To identify factors important for influencing nucleosome occupancy and position, we compared our data to 40 existing whole-genome data sets. Factors associated with promoters, such as histone acetylation and H2A.z incorporation, were enriched at sites of nucleosome loss. Nucleosome retention was linked to stabilizing marks such as H3K36me2. Notably, the chromatin remodeler Isw2 was uniquely associated with retained occupancy and altered positioning, consistent with Isw2 stabilizing histone–DNA contacts and centering nucleosomes on available DNA in vivo. RNA–seq revealed a greater number of de-repressed genes (∼2,500) than previous studies, and these genes exhibited reduced nucleosome occupancy in their promoters. In summary, we identify factors likely to influence nucleosome stability under normal growth conditions and the specific genomic locations at which they act. 
We find that DNA–encoded nucleosome stability and chromatin composition dictate which nucleosomes will be lost under conditions of limiting histone protein and that this, in turn, governs which genes are susceptible to a loss of regulatory fidelity.
Chromatin is formed by wrapping 146 bp of DNA around a disc-shaped complex of proteins called histones. These protein–DNA structures are known as nucleosomes. Nucleosomes help to regulate gene transcription, because nucleosomes compete with transcription factors for access to DNA. The precise positioning and level of nucleosome occupancy are known to be vital for transcriptional regulation, but the mechanisms that regulate the position and occupancy of nucleosomes are not fully understood. Recently, many studies have focused on the role of DNA sequence and chromatin remodeling proteins. Here, we manipulate the concentration of histone proteins in the cell to determine which nucleosomes are most susceptible to changes in occupancy and position. We find that the chromatin-associated proteins Sir2 and Tup1, and the chromatin remodelers Isw2 and Rsc8, are associated with stabilized nucleosomes. Histone acetylation and incorporation of the histone variant H2A.z are the factors most highly associated with destabilized nucleosomes. Certain DNA sequence properties also contribute to stability. The data identify factors likely to influence nucleosome stability and show a direct link between changes in chromatin and changes in transcription upon histone depletion.
We propose definitions and procedures for comparing nucleosome maps and discuss current agreement and disagreement on the effect of histone sequence preferences on nucleosome organization in vivo.
Next-generation sequencing-based assays to detect gene regulatory elements are enabling the analysis of individual-to-individual and allele-specific variation of chromatin status and transcription factor binding in humans. Recently, a number of studies have explored this area, using lymphoblastoid cell lines. Around 10% of chromatin sites show either individual-level differences or allele-specific behavior. Future studies are likely to be limited by cell line accessibility, meaning that white-blood-cell-based studies are likely to continue to be the main source of samples. A detailed understanding of the relationship between normal genetic variation and chromatin variation can shed light on how polymorphisms in non-coding regions in the human genome might underlie phenotypic variation and disease.
Insulin signaling has a profound effect on longevity and the oxidative stress resistance of animals. Inhibition of insulin signaling results in the activation of DAF-16/FOXO and SKN-1/Nrf transcription factors and increased animal fitness. By studying the biological functions of the endogenous RNA interference factor RDE-4 and conserved PHD zinc finger protein ZFP-1 (AF10), which regulate overlapping sets of genes in Caenorhabditis elegans, we identified an important role for these factors in the negative modulation of transcription of the insulin/PI3 signaling-dependent kinase PDK-1. Consistently, increased expression of pdk-1 in zfp-1 and rde-4 mutants contributed to their reduced lifespan and sensitivity to oxidative stress and pathogens due to the reduction in the expression of DAF-16 and SKN-1 targets. We found that the function of ZFP-1 in modulating pdk-1 transcription was important for the extended lifespan of the age-1(hx546) reduction-of-function PI3 kinase mutant, since the lifespan of the age-1; zfp-1 double mutant strain was significantly shorter compared to age-1(hx546). We further demonstrate that overexpression of ZFP-1 caused an increased resistance to oxidative stress in a DAF-16–dependent manner. Our findings suggest that epigenetic regulation of key upstream signaling components in signal transduction pathways through chromatin and RNAi may have a large impact on the outcome of signaling and expression of numerous downstream genes.
Reduced activity of the insulin-signaling pathway genes has been associated with a longer lifespan and increased resistance to oxidative stress in animals due to the activation of important transcription factors, which act as master regulators and affect large networks of genes. The ability to manipulate insulin signaling and reduce its activity may allow activation of oxidative-stress response programs in pathological conditions, such as neuronal degeneration, where oxidative stress plays a significant role. Here, we describe a new way of inhibiting insulin signaling that exists in the nematode Caenorhabditis elegans. We find that transcription of one of the insulin-signaling genes is inhibited by mechanisms involving chromatin and RNA interference, a silencing process that depends on short RNAs. We demonstrate that mutants deficient in RNA interference are more susceptible to stress due to increased insulin signaling and that increased dosage of a chromatin-binding protein repressing insulin signaling and promoting RNA interference leads to better survival of nematodes grown under oxidative stress conditions. Since there is a clear homolog of this chromatin-binding protein in mammals, it may also act to promote resistance to oxidative stress in human cells such as neurons.
ZINBA (Zero-Inflated Negative Binomial Algorithm) identifies genomic regions enriched in a variety of ChIP-seq and related next-generation sequencing experiments (DNA-seq), calling both broad and narrow modes of enrichment across a range of signal-to-noise ratios. ZINBA models and accounts for factors that co-vary with background or experimental signal, such as G/C content, and identifies enrichment in genomes with complex local copy number variations. ZINBA provides a single unified framework for analyzing DNA-seq experiments in challenging genomic contexts.
Software website: http://code.google.com/p/zinba/
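For readers unfamiliar with the distribution that gives ZINBA its name, the following is a minimal sketch of a zero-inflated negative binomial probability mass function. It is not ZINBA's implementation: the parameter names (`pi` for the extra-zero probability, `mu` for the mean, `theta` for the dispersion) and the mean/dispersion parameterization are assumptions chosen for illustration, and the covariate modeling described in the abstract is omitted entirely.

```python
import math


def nb_log_pmf(y: int, mu: float, theta: float) -> float:
    """Log-probability of count y under a negative binomial with
    mean mu > 0 and dispersion (size) theta > 0."""
    return (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )


def zinb_pmf(y: int, pi: float, mu: float, theta: float) -> float:
    """Zero-inflated negative binomial: with probability pi the count is
    a structural zero, otherwise it is drawn from NB(mu, theta)."""
    nb = math.exp(nb_log_pmf(y, mu, theta))
    if y == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


# Illustrative parameter values only; they are not fitted to any data.
total = sum(zinb_pmf(y, pi=0.3, mu=5.0, theta=2.0) for y in range(500))
print("P(0) =", zinb_pmf(0, pi=0.3, mu=5.0, theta=2.0))
print("sum over 0..499 =", total)
```

The zero-inflation term lets the model absorb windows with no reads (for example, unmappable sequence) without distorting the count distribution fitted to the rest of the genome.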
Maintaining the proper expression of the transcriptome during development or in response to a changing environment requires a delicate balance between transcriptional regulators with activating and repressing functions. The budding yeast transcriptional co-repressor Tup1-Ssn6 is a model for studying similar repressor complexes in multicellular eukaryotes. Tup1-Ssn6 does not bind DNA directly, but is directed to individual promoters by one or more DNA-binding proteins, referred to as Tup1 recruiters. This functional architecture allows Tup1-Ssn6 to modulate the expression of genes required for the response to a variety of cellular stresses. To understand the targeting of the Tup1-Ssn6 complex, we determined the genomic distribution of Tup1 and Ssn6 by ChIP-chip. We found that most loci bound by Tup1-Ssn6 could not be explained by co-occupancy with a known recruiting cofactor and that deletion of individual known Tup1 recruiters did not significantly alter the Tup1 binding profile. These observations suggest that new Tup1 recruiting proteins remain to be discovered and that Tup1 recruitment typically depends on multiple recruiting cofactors. To identify new recruiting proteins, we computationally screened for factors with binding patterns similar to the observed Tup1-Ssn6 genomic distribution. Four top candidates, Cin5, Skn7, Phd1, and Yap6, all known to be associated with stress response gene regulation, were experimentally confirmed to physically interact with Tup1 and/or Ssn6. Incorporating these new recruitment cofactors with previously characterized cofactors now explains the majority of Tup1 targeting across the genome, and expands our understanding of the mechanism by which Tup1-Ssn6 is directed to its targets.
Chromatin in sperm is different from that in other cells, with most of the genome packaged by protamines not nucleosomes. Nucleosomes are, however, retained at some genomic sites, where they have the potential to transmit paternal epigenetic information. It is not understood how this retention is specified. Here we show that base composition is the major determinant of nucleosome retention in human sperm, predicting retention very well in both genic and non-genic regions of the genome. The retention of nucleosomes at GC-rich sequences with high intrinsic nucleosome affinity accounts for the previously reported retention at transcription start sites and at genes that regulate development. It also means that nucleosomes are retained at the start sites of most housekeeping genes. We also report a striking link between the retention of nucleosomes in sperm and the establishment of DNA methylation-free regions in the early embryo. Taken together, this suggests that paternal nucleosome transmission may facilitate robust gene regulation in the early embryo. We propose that chromatin organization in the male germline, rather than in somatic cells, is the major functional consequence of fine-scale base composition variation in the human genome. The selective pressure driving base composition evolution in mammals could, therefore, be the need to transmit paternal epigenetic information to the zygote.
In most cells, DNA is packaged by protein complexes called nucleosomes. In sperm, however, nucleosomes are only retained at a small fraction of the genome, particularly at the start sites of genes. In this work, we show that the sites at which nucleosomes are retained in sperm are specified by variation in the base composition of the human genome. At a fine scale, the human genome varies extensively in the content of GC versus AT base pairs, and we find that in both genic and non-genic regions this predicts very well where nucleosomes are retained in mature sperm. These regions include transcription start sites, especially for genes that are expressed in all cells and for genes that regulate development. We also report that regions that retain nucleosomes in sperm are likely to be protected from DNA methylation in the early embryo, suggesting a further connection between the presence of nucleosomes on the paternal genome and the establishment of gene regulation in the embryo. Based on these results, we propose that an important selective pressure on base composition evolution in mammalian genomes may be the requirement to organize chromatin in sperm in a way that facilitates gene regulation in the early embryo.
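The quantity at the center of the two summaries above, fine-scale GC content, can be computed with a short sketch like the one below. The window and step sizes, the toy sequence, and the function names are assumptions for illustration only; this computes the predictor variable, not the authors' model of nucleosome retention.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T).

    Returns NaN when the sequence contains no unambiguous bases.
    """
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / acgt


def windowed_gc(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """GC fraction in windows of length `window`, advancing by `step`.

    Each entry is (window start, GC fraction); a trailing partial
    window is skipped.
    """
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a GC-rich block followed by an AT-rich block.
toy = "GCGCGGCCGC" + "ATATTAAATA"
for start, gc in windowed_gc(toy, window=10, step=10):
    print(start, gc)
```

Ignoring ambiguous bases such as N in the denominator keeps assembly gaps from being mistaken for AT-rich sequence.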
Although Caenorhabditis elegans was the first multicellular organism with a completely sequenced genome, how this genome is arranged within the nucleus is not known.
We determined the genomic regions associated with the nuclear transmembrane protein LEM-2 in mixed-stage C. elegans embryos via chromatin immunoprecipitation. Large regions of several megabases on the arms of each autosome were associated with LEM-2. The center of each autosome was mostly free of such interactions, suggesting that they are largely looped out from the nuclear membrane. Only the left end of the X chromosome was associated with the nuclear membrane. At a finer scale, the large membrane-associated domains consisted of smaller subdomains of LEM-2 associations. These subdomains were characterized by high repeat density, low gene density, high levels of H3K27 trimethylation, and silent genes. The subdomains were punctuated by gaps harboring highly active genes. A chromosome arm translocated to a chromosome center retained its association with LEM-2, although there was a slight decrease in association near the fusion point.
Local DNA or chromatin properties are the main determinant of interaction with the nuclear membrane, with position along the chromosome making a minor contribution. Genes in small gaps between LEM-2 associated regions tend to be highly expressed, suggesting that these small gaps are especially amenable to highly efficient transcription. Although our data are derived from an amalgamation of cell types in mixed-stage embryos, the results suggest a model for the spatial arrangement of C. elegans chromosomes within the nucleus.
Chromatin plays a central role in eukaryotic gene regulation. We have performed genome-wide mapping of epigenetically-marked nucleosomes to determine their position both near transcription start sites and at distal regulatory elements including enhancers. In prostate cancer cells where androgen receptor (AR) binds primarily to enhancers, we found that androgen treatment dismisses a central nucleosome present over AR binding sites that is flanked by a pair of marked nucleosomes. A novel quantitative model built on the behavior of such nucleosome pairs correctly identified regions bound by the regulators of the immediate androgen response including AR and FoxA1. More importantly this model also correctly predicted novel binding sites for other transcription factors present following prolonged androgen stimulation including Oct1 and NKX3.1. Thus quantitative modeling of enhancer structure provides a powerful predictive method to infer the identity of transcription factors involved in cellular responses to specific stimuli.
Coprinopsis cinerea (also known as Coprinus cinereus) is a multicellular basidiomycete mushroom particularly suited to the study of meiosis due to its synchronous meiotic development and prolonged prophase. We examined the 15-hour meiotic transcriptional program of C. cinerea, encompassing time points prior to haploid nuclear fusion through tetrad formation, using a 70-mer oligonucleotide microarray. As with other organisms, a large proportion (∼20%) of genes are differentially regulated during this developmental process, with successive waves of transcription apparent in nine transcriptional clusters, including one enriched for meiotic functions. C. cinerea and the fungi Saccharomyces cerevisiae and Schizosaccharomyces pombe diverged ∼500–900 million years ago, permitting a comparison of transcriptional programs across a broad evolutionary time scale. Previous studies of S. cerevisiae and S. pombe compared genes that were induced upon entry into meiosis; inclusion of C. cinerea data indicates that meiotic genes are more conserved in their patterns of induction across species than genes not known to be meiotic. In addition, we found that meiotic genes are significantly more conserved in their transcript profiles than genes not known to be meiotic, which indicates a remarkable conservation of the meiotic process across evolutionarily distant organisms. Overall, meiotic function genes are more conserved in both induction and transcript profile than genes not known to be meiotic. However, of 50 meiotic function genes that were co-induced in all three species, 41 transcript profiles were well-correlated in at least two of the three species, but only a single gene (rad50) exhibited coordinated induction and well-correlated transcript profiles in all three species, indicating that co-induction does not necessarily predict correlated expression or vice versa. Differences may reflect differences in meiotic mechanisms or new roles for paralogs.
Similarities in induction, transcript profiles, or both, should contribute to gene discovery for orthologs without currently characterized meiotic roles.
Meiosis is the part of the sexual reproduction process in which the number of chromosomes in an organism is halved. This occurs in most plants, animals, and fungi; and many of the proteins involved are the same in the different organisms that have been studied. We wanted to ask whether the genes involved in the meiotic process are turned on and off at the same stages of meiosis in organisms that separated a long time ago. To do this we looked at three fungal species, Saccharomyces cerevisiae (baker's yeast), Schizosaccharomyces pombe (a very distantly related fungus of the same phylum), and Coprinopsis cinerea (a mushroom-forming fungus of a different phylum), which had a common ancestor 500–900 million years ago (in comparison, rats and mice separated ∼23 million years ago). We lined up meiotic stages and found that gene expression during the meiotic process was more conserved for meiotic genes than for non-meiotic genes, indicating ancient conservation of the meiotic process.
Tissue-specific transcriptional regulation is central to human disease [1]. To identify regulatory DNA active in human pancreatic islets, we profiled chromatin by FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) [2–4] coupled with high-throughput sequencing. We identified ~80,000 open chromatin sites. Comparison of islet FAIRE-seq to five non-islet cell lines revealed ~3,300 physically linked clusters of islet-selective open chromatin sites, which typically encompassed single genes exhibiting islet-specific expression. We mapped sequence variants to open chromatin sites and found that rs7903146, a TCF7L2 intronic variant strongly associated with type 2 diabetes (T2D) [5], is located in islet-selective open chromatin. We show that rs7903146 heterozygotes exhibit allelic imbalance in islet FAIRE signal, and that the variant alters enhancer activity, indicating that genetic variation at this locus acts in cis with local chromatin and regulatory changes. These findings illuminate the tissue-specific organization of cis-regulatory elements, and show that FAIRE-seq can guide identification of regulatory variants important for disease.
The extent to which variation in chromatin structure and transcription factor binding may influence gene expression, and thus underlie or contribute to variation in phenotype, is unknown. To address this question, we cataloged both individual-to-individual variation and differences between homologous chromosomes within the same individual (allele-specific variation) in chromatin structure and transcription factor binding in lymphoblastoid cells derived from individuals of geographically diverse ancestry. Ten percent of active chromatin sites were individual-specific; a similar proportion were allele-specific. Both individual-specific and allele-specific sites were commonly transmitted from parent to child, which suggests that they are heritable features of the human genome. Our study shows that heritable chromatin status and transcription factor binding differ as a result of genetic variation and may underlie phenotypic variation in humans.
Replication forks face multiple obstacles that slow their progression. By two-dimensional gel analysis, yeast forks pause at stable DNA-protein complexes, and this pausing is greatly increased in the absence of the Rrm3 helicase. We used a genome-wide approach to identify 96 sites of very high DNA polymerase binding in wild-type cells. Most of these binding sites were not previously identified pause sites. Rather, the most highly represented genomic category among high DNA polymerase binding sites was the open reading frames (ORFs) of highly transcribed RNA polymerase II genes. Twice as many pause sites were identified in rrm3 cells as in wild-type cells, because pausing in this strain occurred at both highly transcribed RNA polymerase II genes and the previously identified protein-DNA complexes. ORFs of highly transcribed RNA polymerase II genes are the first class of natural pause sites that are not exacerbated in rrm3 cells.
DNA replication; helicase; Rrm3; yeast; genome integrity; microarray; chromatin immunoprecipitation
Despite the successes of genomics, little is known about how genetic information produces complex organisms. A look at the crucial functional elements of fly and worm genomes could change that.
Active eukaryotic regulatory sites are characterized by open chromatin, and yeast promoters and transcription factor binding sites (TFBSs) typically have low intrinsic nucleosome occupancy. Here, we show that in contrast to yeast, DNA at human promoters, enhancers, and TFBSs generally encodes high intrinsic nucleosome occupancy. In most cases we examined, these elements also have high experimentally measured nucleosome occupancy in vivo. These regions typically have high G+C content, which correlates positively with intrinsic nucleosome occupancy, and are depleted for nucleosome-excluding poly-A sequences. We propose that high nucleosome preference is directly encoded at regulatory sequences in the human genome to restrict access to regulatory information that will ultimately be utilized in only a subset of differentiated cells.
Histone lysine (K) acetylation is a major mechanism by which cells regulate the structure and function of chromatin, and new sites of acetylation continue to be discovered. Here we identify and characterize histone H3K36 acetylation (H3K36ac). By mass spectrometric analyses of H3 purified from Tetrahymena thermophila and Saccharomyces cerevisiae (yeast), we find that H3K36 can be acetylated or methylated. Using an antibody specific to H3K36ac, we show that this modification is conserved in mammals. In yeast, genome-wide ChIP-chip experiments show that H3K36ac is localized predominantly to the promoters of RNA polymerase II-transcribed genes, a pattern inversely related to that of H3K36 methylation. The pattern of H3K36ac localization is similar to that of other sites of H3 acetylation, including H3K9ac and H3K14ac. Using histone acetyltransferase complexes purified from yeast, we show that the Gcn5-containing SAGA complex that regulates transcription specifically acetylates H3K36 in vitro. Deletion of GCN5 completely abolishes H3K36ac in vivo. These data expand our knowledge of the genomic targets of Gcn5, show H3K36ac is highly conserved, and raise the intriguing possibility that the transition between H3K36ac and H3K36me acts as an “acetyl/methyl switch” governing chromatin function along transcription units.