The pathways regulating the transition of mammalian cells from quiescence to proliferation are mediated by multiple miRNAs. Despite significant improvements in our understanding of miRNA targeting, the majority of miRNA regulatory networks are still largely unknown and require experimental validation.
Here we identified miR-503, miR-103, and miR-494 as negative regulators of proliferation in primary human cells. We experimentally determined their genome-wide target profiles using RNA-induced silencing complex (RISC) immunoprecipitations and gene expression profiling. Analysis of the genome-wide target profiles revealed evidence of extensive regulation of gene expression through non-canonical target pairing by miR-503. We identified the proto-oncogene DDHD2 as a target of miR-503 that requires pairing outside of the canonical 5′ seed region of miR-503, representing a novel mode of miRNA-target pairing. Further bioinformatics analysis implicated miR-503 and DDHD2 in breast cancer tumorigenesis.
Our results provide an extensive genome-wide set of targets for miR-503, miR-103, and miR-494, and suggest that miR-503 may act as a tumor suppressor in breast cancer through its direct non-canonical targeting of DDHD2.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1279-9) contains supplementary material, which is available to authorized users.
miRNA; miRNA targeting; Proliferation; Ago2 immunoprecipitation; RIP-seq; miRNA targets; miRNA target pairing; miR-503; miRNA non-canonical pairing
Heat shock transcription factor (HSF) and the promoter heat shock element (HSE) are among the most highly conserved transcriptional regulatory elements in nature. HSF mediates the transcriptional response of eukaryotic cells to heat, infection and inflammation, pharmacological agents, and other stresses. While HSF is essential for cell viability in Saccharomyces cerevisiae, oogenesis and early development in Drosophila melanogaster, extended life span in Caenorhabditis elegans, and extraembryonic development and stress resistance in mammals, little is known about its full range of biological target genes. We used whole-genome analyses to identify virtually all of the direct transcriptional targets of yeast HSF, representing nearly 3% of the genomic loci. The majority of the identified loci are heat-inducibly bound by yeast HSF, and the target genes encode proteins that have a broad range of biological functions including protein folding and degradation, energy generation, protein trafficking, maintenance of cell integrity, small molecule transport, cell signaling, and transcription. This genome-wide identification of HSF target genes provides novel insights into the role of HSF in growth, development, disease, and aging and in the complex metabolic reprogramming that occurs in all cells in response to stress.
Associating genetic variation with quantitative measures of gene regulation offers a way to bridge the gap between genotype and complex phenotypes. In order to identify quantitative trait loci (QTLs) that influence the binding of a transcription factor in humans, we measured binding of the multifunctional transcription and chromatin factor CTCF in 51 HapMap cell lines. We identified thousands of QTLs in which genotype differences were associated with differences in CTCF binding strength, hundreds of them confirmed by directly observable allele-specific binding bias. The majority of QTLs were either within 1 kb of the CTCF binding motif, or in linkage disequilibrium with a variant within 1 kb of the motif. On the X chromosome we observed three classes of binding sites: a minority class bound only to the active copy of the X chromosome, the majority class bound to both the active and inactive X, and a small set of female-specific CTCF sites associated with two non-coding RNA genes. In sum, our data reveal extensive genetic effects on CTCF binding, both direct and indirect, and identify a diversity of patterns of CTCF binding on the X chromosome.
We have systematically measured the effect of normal genetic variation present in a human population on the binding of a specific chromatin protein (CTCF) to DNA by measuring its binding in 51 human cell lines. We observed a large number of changes in protein binding that we can confidently attribute to genetic effects. The corresponding genetic changes are often clustered around the binding motif for CTCF, but only a minority are actually within the motif. Unexpectedly, we also find that at most binding sites on the X chromosome, CTCF binding occurs equally on both the X chromosomes in females at the same level as on the single X chromosome in males. This finding suggests that in general, CTCF binding is not subject to global dosage compensation, the process which equalizes gene expression levels from the two female X chromosomes and the single male X.
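The association between genotype and CTCF binding strength described above can be illustrated with a simple regression. The following is a minimal sketch, not the authors' method: it assumes genotypes are coded as allele dosage (0, 1, or 2 copies of the alternate allele) and that binding strength is a normalized ChIP-seq signal per cell line. The function name and the toy values are hypothetical.

```python
from typing import Sequence


def genotype_binding_slope(genotypes: Sequence[float], binding: Sequence[float]) -> float:
    """Least-squares slope of binding signal on allele dosage.

    A slope far from zero suggests that binding strength changes with
    genotype at this site (a candidate binding QTL). Assumed coding:
    0, 1, 2 = copies of the alternate allele.
    """
    n = len(genotypes)
    if n != len(binding) or n < 2:
        raise ValueError("need two equal-length inputs with at least 2 samples")
    mean_g = sum(genotypes) / n
    mean_b = sum(binding) / n
    # Sum of squared deviations of the genotype dosages.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("no genotype variation among samples")
    # Cross-product of genotype and binding deviations.
    sxy = sum((g - mean_g) * (b - mean_b) for g, b in zip(genotypes, binding))
    return sxy / sxx


# Toy data (hypothetical): binding rises by one unit per alternate allele.
toy_genotypes = [0, 1, 2, 0, 1, 2]
toy_binding = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
slope = genotype_binding_slope(toy_genotypes, toy_binding)
```

In practice a significance test and multiple-testing correction would be applied across all sites; those steps are omitted here.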
Chromatin consists of ordered nucleosomal arrays that are controlled by highly conserved adenosine triphosphate (ATP)-dependent chromatin remodeling complexes. One such remodeler, chromodomain helicase DNA binding protein 1 (Chd1), is believed to play an integral role in nucleosomal organization, as the loss of Chd1 is known to disrupt chromatin. However, the specificity and basis for the functional and physical localization of Chd1 on chromatin remain largely unknown.
Using genome-wide approaches, we found that the loss of Chd1 significantly disrupted nucleosome arrays within the gene bodies of highly transcribed genes. We also found that Chd1 is physically recruited to gene bodies, and that its occupancy specifically corresponds to that of the early elongating form of RNA polymerase, RNAPII Ser 5-P. Conversely, RNAPII Ser 5-P occupancy was affected by the loss of Chd1, suggesting that Chd1 is associated with early transcription elongation. Surprisingly, the occupancy of RNAPII Ser 5-P was affected by the loss of Chd1 specifically at intron-containing genes. Nucleosome turnover was also affected at these sites in the absence of Chd1. We also found that deletion of the histone methyltransferase for H3K36 (SET2) did not affect either Chd1 occupancy or nucleosome organization genome-wide.
Chd1 is specifically recruited onto the gene bodies of highly transcribed genes in an elongation-dependent but H3K36me3-independent manner. Chd1 co-localizes with the early elongating form of RNA polymerase, and affects the occupancy of RNAPII only at genes containing introns, suggesting a role in relieving splicing-related pausing of RNAPII.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-8935-7-32) contains supplementary material, which is available to authorized users.
Chromodomain helicase DNA binding protein 1 (Chd1); Chromatin remodeling; Transcription elongation; H3K36 methylation; Intron
Despite an emerging understanding of the genetic alterations giving rise to various tumors, the mechanisms whereby most oncogenes are overexpressed remain unclear. Here we have utilized an integrated approach of genome-wide regulatory element mapping via DNase-seq followed by conventional reporter assays and transcription factor binding site discovery to characterize the transcriptional regulation of the medulloblastoma oncogene Orthodenticle Homeobox 2 (OTX2). Through these studies we have revealed that OTX2 is differentially regulated in medulloblastoma at the level of chromatin accessibility, which is in part mediated by DNA methylation. In cell lines exhibiting chromatin accessibility of OTX2 regulatory regions, we found that autoregulation maintains OTX2 expression. Comparison of medulloblastoma regulatory elements with those of the developing brain reveals that these tumors engage a developmental regulatory program to drive OTX2 transcription. Finally, we have identified a transcriptional regulatory element mediating retinoid-induced OTX2 repression in these tumors. This work characterizes for the first time the mechanisms of OTX2 overexpression in medulloblastoma. Furthermore, this study establishes proof of principle for applying ENCODE datasets towards the characterization of upstream trans-acting factors mediating expression of individual genes.
We show here that singular loss of the Bright/Arid3A transcription factor leads to reprogramming of mouse embryonic fibroblasts (MEFs) and enhancement of standard four-factor (4F) reprogramming. Bright-deficient MEFs bypass senescence and, under standard embryonic stem cell (ESC) culture conditions, spontaneously form clones that in vitro express pluripotency markers, differentiate to all germ lineages, and in vivo form teratomas and chimeric mice. We demonstrate that BRIGHT binds directly to the promoter/enhancer regions of Oct4, Sox2, and Nanog to contribute to their repression in both MEFs and ESCs. Thus, elimination of the BRIGHT barrier may provide an approach for somatic cell reprogramming.
• Loss of Bright can alone reprogram or enhance conventional four-factor reprogramming
• Bright directly represses Oct4, Sox2, and Nanog
• Bright may function in somatic and embryonic stem cells to enforce differentiation
Popowski et al. show that loss of the transcription factor Bright/Arid3A induces reprogramming in mouse embryonic fibroblasts (MEFs) and enhances standard four-factor reprogramming. Bright-deficient reprogrammed cells express all pluripotency markers and are capable of forming teratomas and chimeric mice. Bright binds directly to the promoter/enhancer regions of Oct4, Sox2, and Nanog and contributes to their repression in both MEFs and embryonic stem cells.
Understanding the relationships between regulatory factor binding, chromatin structure, cis-regulatory elements and RNA-regulation mechanisms relies on accurate information about transcription start sites (TSS) and polyadenylation sites (PAS). Although several approaches have identified transcript ends in yeast, limitations of resolution and coverage have remained, and definitive identification of TSS and PAS with single-nucleotide resolution has not yet been achieved. We developed SMORE-seq (simultaneous mapping of RNA ends by sequencing) and used it to simultaneously identify the strongest TSS for 5207 (90%) genes and PAS for 5277 (91%) genes. The new transcript annotations identified by SMORE-seq showed improved distance relationships with TATA-like regulatory elements, nucleosome positions and active RNA polymerase. We found 150 genes whose TSS were downstream of the annotated start codon, and additional analysis of evolutionary conservation and ribosome footprinting suggests that these protein-coding sequences are likely to be mis-annotated. SMORE-seq detected short non-coding RNAs transcribed divergently from more than a thousand promoters in wild-type cells under normal conditions. These divergent non-coding RNAs were less evident at promoters containing canonical TATA boxes, suggesting a model where transcription initiation at promoters by RNAPII is bidirectional, with TATA elements serving to constrain the directionality of initiation.
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is widely used to detect genome-wide interactions between a protein of interest and DNA in vivo. Loci showing strong enrichment over adjacent background regions are typically considered to be sites of binding. Insufficient attention has been given to systematic artifacts inherent to the ChIP-seq procedure that might generate a misleading picture of protein binding to certain loci. We show here that unrelated transcription factors appear to consistently bind to the gene bodies of highly transcribed genes in yeast. Strikingly, several types of negative control experiments, including a protein that is not expected to bind chromatin, also showed similar patterns of strong binding within gene bodies. These false positive signals were evident across sequencing platforms and immunoprecipitation protocols, as well as in previously published datasets from other labs. We show that these false positive signals derive from high rates of transcription, and are inherent to the ChIP procedure, although they are exacerbated by sequencing library construction procedures. This expression bias is strong enough that a known transcriptional repressor like Tup1 can erroneously appear to be an activator. Another type of background bias stems from the inherent nucleosomal structure of chromatin, and can potentially make it seem like certain factors bind nucleosomes even when they don't. Our analysis suggests that a mock ChIP sample offers a better normalization control for the expression bias, whereas the ChIP input is more appropriate for the nucleosomal periodicity bias. While these controls alleviate the effect of the biases to some extent, they are unable to eliminate it completely. Caution is therefore warranted regarding the interpretation of data that seemingly show the association of various transcription and chromatin factors with highly transcribed genes in yeast.
Recombination-activating gene 1 protein (RAG1) and RAG2 are critical enzymes for initiating variable-diversity-joining (VDJ) segment recombination, an essential process for antigen receptor expression and lymphocyte development. The transcription factor BCL11A is required for B cell development, but its molecular function(s) in B cell fate specification and commitment is unknown. We show here that the major B cell isoform, BCL11A-XL, binds the RAG1 promoter and Erag enhancer to activate RAG1 and RAG2 transcription in pre-B cells. We employed BCL11A overexpression with recombination substrates in a cultured pre-B cell line as well as Cre recombinase-mediated Bcl11a lox/lox deletion in explanted murine pre-B cells to demonstrate direct consequences of BCL11A/RAG modulation on V(D)J recombination. We conclude that BCL11A is a critical component of a transcriptional network that regulates B cell fate by controlling V(D)J recombination.
DNaseI hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers, and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify ~2.9 million DHSs that encompass virtually all known experimentally-validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation, and regulatory factor occupancy patterns. We connect ~580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is choreographed with dozens to hundreds of co-activated elements, and the trans-cellular DNaseI sensitivity pattern at a given region can predict cell type-specific functional behaviors. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation.
Nucleosomes are an essential component of eukaryotic chromosomes. The impact of nucleosomes is seen not just on processes that directly access the genome such as transcription, but also on an evolutionary timescale. Recent studies in a number of organisms have provided high-resolution maps of nucleosomes throughout the genome. Computational analysis, in conjunction with many other kinds of data, has shed light on several aspects of nucleosome biology. Nucleosomes are positioned by several means, including intrinsic sequence biases, by stacking against a fixed barrier, by DNA-binding proteins and by chromatin remodelers. These studies underscore the critical organizational role of nucleosomes in all eukaryotic genomes. Here, I review recent genomic studies that shed light on the determinants of nucleosome positioning and their impact on the genome.
Nucleosome; chromatin; remodeling; epigenetic; genome packaging
The transition of mammalian cells from quiescence to proliferation is accompanied by the differential expression of several microRNAs (miRNAs) and transcription factors. However, the interplay between transcription factors and miRNAs in modulating gene regulatory networks involved in human cell proliferation is largely unknown. Here we show that the miRNA miR-22 promotes proliferation in primary human cells, and through a combination of Argonaute-2 immunoprecipitation and reporter assays, we identified multiple novel targets of miR-22, including several cell-cycle arrest genes that mediate the effects of the tumor-suppressor p53. In addition, we found that miR-22 suppresses interferon gene expression by directly targeting high mobility group box-1 and interferon regulatory factor (IRF)-5, preventing activation of IRF3 and NF-κB, which are activators of interferon genes. The expression of interferon genes is elevated in quiescent cells and their expression is inhibitory for cell proliferation. In addition, we find that miR-22 is activated by the transcription factor Myc when quiescent cells enter proliferation and that miR-22 inhibits the Myc transcriptional repressor MXD4, mediating a feed-forward loop to elevate Myc expression levels. Our results implicate miR-22 in downregulating the anti-proliferative p53 and interferon pathways and reveal a new transcription factor–miRNA network that regulates the transition of primary human cells from quiescence to proliferation.
Single nucleotide polymorphisms (SNPs) have been associated with many aspects of human development and disease, and many non-coding SNPs associated with disease risk are presumed to affect gene regulation. We have previously shown that SNPs within transcription factor binding sites can affect transcription factor binding in an allele-specific and heritable manner. However, such analysis has relied on prior whole-genome genotypes provided by large external projects such as HapMap and the 1000 Genomes Project. This requirement limits the study of allele-specific effects of SNPs in primary patient samples from diseases of interest, where complete genotypes are not readily available.
In this study, we show that we are able to identify SNPs de novo and accurately from ChIP-seq data generated in the ENCODE Project. Our de novo identified SNPs from ChIP-seq data are highly concordant with published genotypes. Independent experimental verification of more than 100 sites estimates our false discovery rate at less than 5%. Analysis of transcription factor binding at de novo identified SNPs revealed widespread heritable allele-specific binding, confirming previous observations. SNPs identified from ChIP-seq datasets were significantly enriched for disease-associated variants, and we identified dozens of allele-specific binding events in non-coding regions that could distinguish between disease and normal haplotypes.
Our approach combines SNP discovery, genotyping and allele-specific analysis, but is selectively focused on functional regulatory elements occupied by transcription factors or epigenetic marks, and will therefore be valuable for identifying the functional regulatory consequences of non-coding SNPs in primary disease samples.
SNPs; Transcription factors; ChIP-seq; Genotyping; Allele-specific
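The allele-specific binding analysis described in the abstract above can be illustrated with a small example. This is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes that, at a heterozygous SNP, ChIP-seq reads carrying each allele are counted, and that an exact two-sided binomial test against an expected 50:50 ratio is used to flag imbalance. The abstract does not say which test was used; the function name and read counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic read counts.

    Tests whether the split between reference and alternate reads at a
    heterozygous site departs from the expected proportion p (assumed 0.5).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    # Probability of each possible reference-read count under the null.
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum every outcome at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical site: all 10 reads carry the reference allele.
p_value = binomial_two_sided_p(10, 0)  # 2 / 1024 = 0.001953125
```

A real analysis would also need to account for mapping bias toward the reference allele and to correct for testing many sites, both of which this sketch ignores.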
Understanding the molecular basis for phenotypic differences between humans and other primates remains an outstanding challenge. Mutations in non-coding regulatory DNA that alter gene expression have been hypothesized as a key driver of these phenotypic differences. This has been supported by differential gene expression analyses in general, but not by the identification of specific regulatory elements responsible for changes in transcription and phenotype. To identify the genetic source of regulatory differences, we mapped DNaseI hypersensitive (DHS) sites, which mark all types of active gene regulatory elements, genome-wide in the same cell type isolated from human, chimpanzee, and macaque. Most DHS sites were conserved among all three species, as expected based on their central role in regulating transcription. However, we found evidence that several hundred DHS sites were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific DHS site gains are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within these DHS sites are linked with species-specific changes in chromatin accessibility. Together, these indicate that the regulatory elements identified here are genetic contributors to transcriptional and phenotypic differences among primate species.
The human genome shares a remarkable amount of genomic sequence with our closest living primate relatives. Researchers have long sought to understand what regions of the genome are responsible for unique species-specific traits. Previous studies have shown that many genes are differentially expressed between species, but the regulatory elements contributing to these differences are largely unknown. Here we report a genome-wide comparison of active gene regulatory elements in human, chimpanzee, and macaque, and we identify hundreds of regulatory elements that have been gained or lost in the human or chimpanzee genomes since their evolutionary divergence. These elements contain evidence of natural selection and correlate with species-specific changes in gene expression. Polymorphic DNA bases in transcription factor motifs that we found in these regulatory elements may be responsible for the varied biological functions across species. This study directly links phenotypic and transcriptional differences between species with changes in chromatin structure.
Next-generation sequencing-based assays to detect gene regulatory elements are enabling the analysis of individual-to-individual and allele-specific variation of chromatin status and transcription factor binding in humans. Recently, a number of studies have explored this area, using lymphoblastoid cell lines. Around 10% of chromatin sites show either individual-level differences or allele-specific behavior. Future studies are likely to be limited by cell line accessibility, meaning that white-blood-cell-based studies are likely to continue to be the main source of samples. A detailed understanding of the relationship between normal genetic variation and chromatin variation can shed light on how polymorphisms in non-coding regions in the human genome might underlie phenotypic variation and disease.
The E2F family of transcription factors has important roles in cell cycle progression. E2F4 is an E2F family member that has been proposed to be primarily a repressor of transcription, but the scope of its binding activity and functions in transcriptional regulation is not fully known. We used ChIP sequencing (ChIP-seq) to identify around 16 000 E2F4 binding sites which potentially regulate 7346 downstream target genes with wide-ranging functions in DNA repair, cell cycle regulation, apoptosis, and other processes. While just over half of all E2F4 binding sites (56%) occurred near transcription start sites (TSSs), ∼20% of sites occurred more than 20 kb away from any annotated TSS. These distal sites showed histone modifications suggesting that E2F4 may function as a long-range regulator, which we confirmed by functional experimental assays on a subset. Overexpression of E2F4 and its transcriptional cofactors of the retinoblastoma (Rb) family and its binding partner DP-1 revealed that E2F4 acts as an activator as well as a repressor. E2F4 binding sites also occurred near regulatory elements for miRNAs such as let-7a and mir-17, suggestive of regulation of miRNAs by E2F4. Taken together, our genome-wide analysis provided evidence of versatile roles of E2F4 and insights into its functions.
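The partition of binding sites by distance from the nearest annotated TSS can be sketched as follows. This is an illustrative example only: it assumes sites and TSSs are given as integer coordinates on a single chromosome, and it uses the 20 kb cutoff from the abstract to label distal sites. The abstract does not define how close a site must be to count as "near" a TSS, so no proximal cutoff is assumed. The function names and coordinates are hypothetical.

```python
from bisect import bisect_left
from typing import Sequence

DISTAL_CUTOFF_BP = 20_000  # "more than 20 kb away from any annotated TSS"


def distance_to_nearest_tss(site: int, sorted_tss: Sequence[int]) -> int:
    """Distance in bp from a binding site to the closest TSS.

    sorted_tss must be non-empty and sorted in ascending order.
    """
    if not sorted_tss:
        raise ValueError("at least one TSS coordinate is required")
    i = bisect_left(sorted_tss, site)
    # Only the TSSs immediately left and right of the site can be closest.
    neighbours = sorted_tss[max(0, i - 1): i + 1]
    return min(abs(site - t) for t in neighbours)


def is_distal(site: int, sorted_tss: Sequence[int]) -> bool:
    """True if the site lies more than 20 kb from every TSS."""
    return distance_to_nearest_tss(site, sorted_tss) > DISTAL_CUTOFF_BP


# Hypothetical coordinates on one chromosome.
tss_positions = [1_000, 50_000]
labels = {site: is_distal(site, tss_positions) for site in (1_200, 30_000, 80_000)}
```

The binary search keeps each lookup logarithmic in the number of TSSs, which matters when classifying thousands of binding sites against a whole-genome annotation.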
The extent to which variation in chromatin structure and transcription factor binding may influence gene expression, and thus underlie or contribute to variation in phenotype, is unknown. To address this question, we cataloged both individual-to-individual variation and differences between homologous chromosomes within the same individual (allele-specific variation) in chromatin structure and transcription factor binding in lymphoblastoid cells derived from individuals of geographically diverse ancestry. Ten percent of active chromatin sites were individual-specific; a similar proportion were allele-specific. Both individual-specific and allele-specific sites were commonly transmitted from parent to child, which suggests that they are heritable features of the human genome. Our study shows that heritable chromatin status and transcription factor binding differ as a result of genetic variation and may underlie phenotypic variation in humans.
ArrayPlex is a software package that centrally provides a large number of flexible toolsets useful for functional genomics, including microarray data storage, quality assessments, data visualization, gene annotation retrieval, statistical tests, genomic sequence retrieval and motif analysis. It uses a client-server architecture based on open source components, provides graphical, command-line, and programmatic access to all needed resources, and is extensible by virtue of a documented application programming interface. ArrayPlex is available at .
Although chromatin structure is known to affect transcriptional activity, it is not clear how broadly patterns of changes in histone modifications and nucleosome occupancy affect the dynamic regulation of transcription in response to perturbations. The identity and role of chromatin remodelers that mediate some of these changes are also unclear. Here, we performed temporal genome-wide analyses of gene expression, nucleosome occupancy, and histone H4 acetylation during the response of yeast (Saccharomyces cerevisiae) to different stresses and report several findings. First, a large class of predominantly ribosomal protein genes, whose transcription was repressed during both heat shock and stationary phase, showed strikingly contrasting histone acetylation patterns. Second, the SWI/SNF complex was required for normal activation as well as repression of genes during heat shock, and loss of SWI/SNF delayed chromatin remodeling at the promoters of activated genes. Third, Snf2 was recruited to ribosomal protein genes and Hsf1 target genes, and its occupancy of this large set of genes was altered during heat shock. Our results suggest a broad and direct dual role for SWI/SNF in chromatin remodeling, during heat shock activation as well as repression, at promoters and coding regions.
Regulation of cell cycle progression is fundamental to cell health and reproduction, and failures in this process are associated with many human diseases. Much of our knowledge of cell cycle regulators derives from loss-of-function studies. To reveal new cell cycle regulatory genes that are difficult to identify in loss-of-function studies, we performed a near-genome-wide flow cytometry assay of yeast gene overexpression-induced cell cycle delay phenotypes. We identified 108 genes whose overexpression significantly delayed the progression of the yeast cell cycle at a specific stage. Many of the genes are newly implicated in cell cycle progression, for example SKO1, RFA1, and YPR015C. The overexpression of RFA1 or YPR015C delayed the cell cycle at G2/M phases by disrupting spindle attachment to chromosomes and activating the DNA damage checkpoint, respectively. In contrast, overexpression of the transcription factor SKO1 arrests cells at G1 phase by activating the pheromone response pathway, revealing new cross-talk between osmotic sensing and mating. More generally, 92%–94% of the genes exhibit distinct phenotypes when overexpressed as compared to their corresponding deletion mutants, supporting the notion that many genes may gain functions upon overexpression. This work thus implicates new genes in cell cycle progression, complements previous screens, and lays the foundation for future experiments to define more precisely roles for these genes in cell cycle progression.
All cells require proper cell cycle regulation; failure leads to numerous human diseases. Cell cycle mechanisms are broadly conserved across eukaryotes, with many key regulatory genes known. Nonetheless, our knowledge of regulators is incomplete. Many classic studies have analyzed yeast loss-of-function mutants to identify cell cycle genes. Studies have also implicated genes based upon their overexpression phenotypes, but the effects of gene overexpression on the cell cycle have not been quantified for all yeast genes. We individually quantified the effect of overexpression on cell cycle progression for nearly all (91%) of yeast genes, and we report the 108 genes causing the most significant and reproducible cell cycle defects, most of which have not been previously observed. We characterize three genes in more detail, implicating one in chromosomal segregation and mitotic spindle formation. A second affects mitotic stability and the DNA damage checkpoint. Curiously, overexpression of a third gene, SKO1, arrests the cell cycle by activating the pheromone response pathway, with cells mistakenly behaving as if mating pheromone is present. These results establish a basis for future experiments elucidating precise cell cycle roles for these genes. Similar assays in human cells could help further clarify the many connections between cell cycle control and cancers.
The Myc oncoprotein is a transcription factor involved in a variety of human cancers. Overexpression of Myc is associated with malignant transformation. In normal cells, Myc is induced by mitotic signals, and in turn, it regulates the expression of downstream target genes. Although diverse roles of Myc have been predicted from many previous studies, detailed functions of Myc targets are still unclear. By combining chromatin immunoprecipitation (ChIP) and promoter microarrays, we identified a total of 1469 Myc direct target genes, the majority of which are novel, in HeLa cells and human primary fibroblasts. We observed dramatic changes of Myc occupancy at its target promoters in foreskin fibroblasts in response to serum stimulation. Among the targets of Myc, 107 were nuclear encoded genes involved in mitochondrial biogenesis. Genes with important roles in mitochondrial replication and biogenesis, such as POLG, POLG2, and NRF1 were identified as direct targets of Myc, confirming a direct role for Myc in regulating mitochondrial biogenesis. Analysis of target promoter sequences revealed a strong preference for Myc occupancy at promoters containing one of several described consensus sequences, CACGTG, in vivo. This study thus sheds light on the transcriptional regulatory networks mediated by Myc in vivo.
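The promoter sequence analysis mentioned above centers on the consensus sequence CACGTG. As a minimal sketch, assuming promoter sequences are available as plain DNA strings, occurrences of this motif can be counted as shown below. The function names and the toy sequence are hypothetical, and this is not the analysis procedure used in the study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Reverse complement of an uppercase or lowercase A/C/G/T string."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def count_motif(sequence: str, motif: str = "CACGTG") -> int:
    """Count occurrences of motif in sequence, allowing overlaps."""
    seq = sequence.upper()
    target = motif.upper()
    width = len(target)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == target)


# CACGTG equals its own reverse complement, so scanning one strand
# finds the sites on both strands.
assert reverse_complement("CACGTG") == "CACGTG"

# Toy promoter fragment (hypothetical) containing two copies of the motif.
toy_promoter = "ttCACGTGaaCACGTG"
n_sites = count_motif(toy_promoter)  # 2
```

For motifs that are not palindromic, both the sequence and its reverse complement would need to be scanned.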
The eukaryotic genome is packaged into chromatin, and chromatin modification and remodeling play an important role in transcriptional regulation, DNA replication, recombination and repair. Recent findings have shown that various post-translational histone modifications cooperate to recruit different effector proteins that bring about mobilization of the nucleosomes and cause distinct downstream consequences. The combination of chromatin immunoprecipitation (ChIP) using antibodies directed against the core histones or specific histone modifications, with high-resolution tiling microarray analysis allows the examination of nucleosome occupancy and histone modification status genome-wide. Comparing genome-wide chromatin status with global gene expression patterns can reveal causal connections between specific patterns of histone modifications and the resulting gene expression. Here, we describe current methods based on recent advances in microarray technology to conduct such studies.
S. cerevisiae; chromatin remodeling; chromatin immunoprecipitation; tiling microarray
The eukaryotic genome is packaged as chromatin with nucleosomes comprising its basic structural unit, but the detailed structure of chromatin and its dynamic remodeling in terms of individual nucleosome positions have not been completely defined experimentally for any genome. We used ultra-high-throughput sequencing to map the remodeling of individual nucleosomes throughout the yeast genome before and after a physiological perturbation that causes genome-wide transcriptional changes. Nearly 80% of the genome is covered by positioned nucleosomes occurring in a limited number of stereotypical patterns in relation to transcribed regions and transcription factor binding sites. Chromatin remodeling in response to physiological perturbation was typically associated with the eviction, appearance, or repositioning of one or two nucleosomes in the promoter, rather than broader region-wide changes. Dynamic nucleosome remodeling tends to increase the accessibility of binding sites for transcription factors that mediate transcriptional changes. However, specific nucleosomal rearrangements were also evident at promoters even when there was no apparent transcriptional change, indicating that there is no simple, globally applicable relationship between chromatin remodeling and transcriptional activity. Our study provides a detailed, high-resolution, dynamic map of single-nucleosome remodeling across the yeast genome and its relation to global transcriptional changes.
The eukaryotic genome is packed in a systematic hierarchy to accommodate it within the confines of the cell's nucleus. This packing, however, presents an impediment to the transcription machinery when it must access genomic DNA to regulate gene expression. A fundamental aspect of genome packing is the spooling of DNA around nucleosomes (structures formed from histone proteins), which must be dislodged during transcription. In this study, we identified all the nucleosome displacements associated with a physiological perturbation causing genome-wide transcriptional changes in the eukaryote Saccharomyces cerevisiae. We isolated nucleosomal DNA before and after subjecting cells to heat shock, then used ultra-high-throughput sequencing to identify the ends of these DNA fragments and, thereby, the locations of nucleosomes along the genome. We identified localized patterns of nucleosome displacement at gene promoters in response to heat shock, and found that nucleosome eviction was generally associated with gene activation and nucleosome appearance with gene repression. Nucleosome remodeling generally improved the accessibility of DNA to transcriptional regulators mediating the response to stresses like heat shock. However, not all nucleosomal remodeling was associated with transcriptional changes, indicating that the relationship between nucleosome repositioning and transcriptional activity is not merely a reflection of competing access to DNA.
Ultra-high-throughput sequencing is used to show that distinct, localized patterns of nucleosome repositioning at promoters underlie the genome-wide transcriptional response to a physiological stimulus.
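The step of locating nucleosomes from sequenced fragment ends can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: it assumes a nucleosomal footprint of about 147 bp, so each read's 5' end is shifted by 73 bp toward the fragment interior to estimate the nucleosome center. The function name `dyad_counts` and the read coordinates are hypothetical.

```python
from collections import Counter

HALF_NUCLEOSOME = 73  # assumption: half of a ~147 bp nucleosomal footprint


def dyad_counts(read_starts):
    """Estimate nucleosome centers from (position, strand) pairs of read 5' ends.

    Plus-strand reads mark the left edge of a fragment and minus-strand reads
    mark the right edge, so each is shifted toward the fragment interior.
    """
    counts = Counter()
    for pos, strand in read_starts:
        if strand == "+":
            counts[pos + HALF_NUCLEOSOME] += 1
        elif strand == "-":
            counts[pos - HALF_NUCLEOSOME] += 1
        else:
            raise ValueError(f"unknown strand: {strand!r}")
    return counts


# Hypothetical toy reads, not data from the study.
reads = [(100, "+"), (246, "-"), (101, "+"), (400, "+")]
counts = dyad_counts(reads)
print(counts.most_common(1))
```

Reads from both strands of the same nucleosome converge on a shared center, so peaks in these counts suggest positioned nucleosomes. Comparing the peaks before and after a perturbation would then indicate eviction, appearance, or repositioning.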
Cis-acting transcriptional regulatory elements in mammalian genomes typically contain specific combinations of binding sites for various transcription factors. Although some cis-regulatory elements have been well studied, the combinations of transcription factors that regulate normal expression levels for the vast majority of the 20,000 genes in the human genome are unknown. We hypothesized that it should be possible to discover transcription factor combinations that regulate gene expression in concert by identifying over-represented combinations of sequence motifs that occur together in the genome. In order to detect combinations of transcription factor binding motifs, we developed a data mining approach based on the use of association rules, which are typically used in market basket analysis. We scored each segment of the genome for the presence or absence of each of 83 transcription factor binding motifs, then used association rule mining algorithms to mine this dataset, thus identifying frequently occurring pairs of distinct motifs within a segment.
Support for most pairs of transcription factor binding motifs was highly correlated across different chromosomes, although pair significance varied. Known true positive motif pairs showed higher association rule support, confidence, and significance than background. Our subsets of high-confidence, high-significance mined pairs of transcription factors showed enrichment for co-citation in PubMed abstracts relative to all pairs, and the predicted associations were often readily verifiable in the literature.
Functional elements in the genome where transcription factors bind to regulate expression in a combinatorial manner are more likely to be predicted by identifying statistically and biologically significant combinations of transcription factor binding motifs than by simply scanning the genome for the occurrence of binding sites for a single transcription factor.
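The association rule approach described above can be made concrete with a minimal sketch that computes support and confidence for motif pairs from per-segment presence/absence data. It is not the authors' implementation: the function name `mine_motif_pairs` and the toy segments are hypothetical, and lift is used here as a simple stand-in for the significance measure, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def mine_motif_pairs(segments, min_support=0.0):
    """Score every co-occurring motif pair across genome segments.

    segments: list of sets, each holding the motifs present in one segment.
    Returns {(a, b): {"support", "confidence", "lift"}} with a < b alphabetically;
    confidence is the fraction of segments containing a that also contain b.
    """
    n = len(segments)
    single = Counter()
    pair = Counter()
    for motifs in segments:
        present = sorted(set(motifs))
        single.update(present)
        pair.update(combinations(present, 2))

    rules = {}
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        rules[(a, b)] = {
            "support": support,
            "confidence": count / single[a],
            "lift": support / ((single[a] / n) * (single[b] / n)),
        }
    return rules


# Hypothetical toy segments, not data from the study.
segments = [{"SP1", "NFY"}, {"SP1", "NFY", "E2F"}, {"SP1"}, {"E2F"}]
rules = mine_motif_pairs(segments)
for pair_key, stats in sorted(rules.items()):
    print(pair_key, stats)
```

A lift above 1 indicates that two motifs co-occur more often than expected if they were independent, which is the kind of over-representation the approach is intended to detect.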
Cell lines have been used to study cancer for decades, but truly quantitative assessment of their performance as models is often lacking. We used gene expression profiling to quantitatively assess the gene expression of nine cell line models of cervical cancer.
We find wide variation in the extent to which different cell culture models mimic late-stage invasive cervical cancer biopsies. The lowest agreement was from monolayer HeLa cells, a common cervical cancer model; the highest agreement was from primary epithelial cells and the C4-I and C4-II cell lines. In addition, culturing HeLa and SiHa cell lines in an organotypic environment significantly increased their correlation to cervical cancer. We also find wide variation in agreement when we consider how well individual biological pathways model cervical cancer. Cell lines whose expression was anti-correlated with cervical cancer were also identified and should be avoided.
Using gene expression profiling and quantitative analysis, we have characterized nine cell lines with respect to how well they serve as models of cervical cancer. Applying this method to individual pathways, we identified the appropriateness of particular cell lines for studying specific pathways in cervical cancer. This study will allow researchers to choose a cell line with the highest correlation to cervical cancer at a pathway level. This method is applicable to other cancers and could be used to identify the appropriate cell line and growth condition to employ when studying other cancers.
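The ranking of cell line models by agreement with tumor expression can be sketched as follows. This is an illustration rather than the authors' method: it assumes Pearson correlation as the agreement measure, which the abstract does not specify, and the model names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("profiles must not be constant")
    return cov / sqrt(vx * vy)


# Hypothetical toy profiles over the same four genes, not data from the study.
tumor = [2.0, 4.0, 6.0, 8.0]
models = {
    "model_A": [1.0, 2.0, 3.0, 4.0],
    "model_B": [4.0, 3.0, 2.0, 1.0],
    "model_C": [1.0, 3.0, 2.0, 4.0],
}

# Rank models from highest to lowest correlation with the tumor profile.
ranked = sorted(
    ((pearson(profile, tumor), name) for name, profile in models.items()),
    reverse=True,
)
for r, name in ranked:
    print(f"{name}: r = {r:.2f}")
```

Restricting the gene list to the members of a single pathway before computing the correlation would give the pathway-level comparison described above. Models with negative correlation correspond to the anti-correlated cell lines that the abstract recommends avoiding.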