The vast non-coding portion of the human genome is awash in functional elements and disease-causing regulatory variants. The principles defining the relationships between these elements and distal target genes remain unknown. Promoters and distal elements can engage in looping interactions that have been implicated in gene regulation [1]. Here we have applied chromosome conformation capture carbon copy, 5C [2], to comprehensively interrogate interactions between transcription start sites (TSSs) and distal elements in 1% of the human genome representing the ENCODE pilot project regions [3]. 5C maps were generated for GM12878, K562 and HeLa-S3 cells and results were integrated with data from the ENCODE consortium [4]. In each cell line we discovered >1,000 long-range interactions between promoters and distal sites that include elements resembling enhancers, promoters and CTCF-bound sites. We observed significant correlations between gene expression, promoter-enhancer interactions and the presence of enhancer RNAs. Long-range interactions display striking asymmetry with a bias for interactions with elements located ~120 kb upstream of the TSS. Long-range interactions are often not blocked by sites bound by CTCF and cohesin, implying that many of these sites do not demarcate physically insulated gene domains. Further, only ~7% of looping interactions are with the nearest gene, suggesting that genomic proximity is not a simple predictor for long-range interactions. Finally, promoters and distal elements are engaged in multiple long-range interactions to form complex networks. Our results start to place genes and regulatory elements in three-dimensional context, revealing their functional relationships.
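As an illustration of the nearest-gene statistic quoted above, the following minimal sketch computes the fraction of looping interactions in which the interacting promoter is also the TSS closest to the distal element. The input format, the function names and the toy coordinates are assumptions made for this example and are not taken from the study; the study's own definition of "nearest gene" may differ in detail.

```python
def nearest_tss(element_pos, tss_positions):
    """Return the name of the gene whose TSS lies closest to a distal element.

    tss_positions: dict mapping gene name -> TSS coordinate (hypothetical format).
    """
    return min(tss_positions, key=lambda name: abs(tss_positions[name] - element_pos))


def fraction_with_nearest_gene(interactions, tss_positions):
    """Fraction of (gene, element_pos) interactions where gene is the nearest TSS.

    Assumes a non-empty list of interactions on a single chromosome.
    """
    hits = sum(
        1 for gene, element_pos in interactions
        if nearest_tss(element_pos, tss_positions) == gene
    )
    return hits / len(interactions)


# Toy data (invented coordinates, for illustration only).
tss = {"geneA": 100_000, "geneB": 300_000, "geneC": 600_000}
loops = [
    ("geneA", 120_000),  # element is closest to geneA: counts as nearest-gene loop
    ("geneA", 550_000),  # element is closest to geneC
    ("geneC", 280_000),  # element is closest to geneB
    ("geneB", 90_000),   # element is closest to geneA
]
fraction = fraction_with_nearest_gene(loops, tss)
print(fraction)
```

A low value of this fraction on real data would correspond to the observation that linear proximity alone is a poor predictor of looping partners.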
The mouse X-inactivation center (Xic) orchestrates initiation of X inactivation by controlling the expression of the non-coding Xist transcript. The full extent of Xist’s regulatory landscape, however, remains to be defined. Here we use Chromosome Conformation Capture Carbon-Copy and super-resolution microscopy to analyse the spatial organisation of a 4.5 Mb region including Xist. We uncover a series of discrete 200 kb to 1 Mb topologically associating domains (TADs), present both before and after cell differentiation and on the active and inactive X. These domains align with several domain-wide epigenomic features as well as co-regulated gene clusters. Disruption of a TAD boundary causes ectopic chromosomal contacts and long-range transcriptional mis-regulation. Xist/Tsix illustrates the spatial segregation of oppositely regulated chromosomal neighborhoods, with their promoters lying in two adjacent TADs, each containing their known positive regulators. This led to the identification of a distal regulatory region of Tsix producing a novel long intervening RNA, Linx, within its TAD. In addition to uncovering a new principle of the cis-regulatory architecture of mammalian chromosomes, our study sets the stage for the full genetic dissection of the Xic.
Transposable elements (TEs) and DNA repeats are commonly targeted by DNA and histone methylation to achieve epigenetic gene silencing. We isolated mutations in two Arabidopsis genes, AtMORC1 and AtMORC6, which cause de-repression of DNA-methylated genes and TEs, but no losses of DNA or histone methylation. AtMORC1 and AtMORC6 are members of the conserved Microrchidia (MORC) adenosine triphosphatase (ATPase) family, predicted to catalyze alterations in chromosome superstructure. The atmorc1 and atmorc6 mutants show decondensation of pericentromeric heterochromatin, increased interaction of pericentromeric regions with the rest of the genome, and transcriptional defects that are largely restricted to loci residing in pericentromeric regions. Knockdown of the single MORC homolog in Caenorhabditis elegans also impairs transgene silencing. We propose that the MORC ATPases are conserved regulators of gene silencing in eukaryotes.
The extent to which the three-dimensional organization of the genome contributes to chromosomal translocations is an important question in cancer genomics. We have now generated a high-resolution Hi-C spatial organization map of the G1-arrested mouse pro-B cell genome and mapped translocations from target DNA double-strand breaks (DSBs) within it via high-throughput genome-wide translocation sequencing. RAG endonuclease-cleaved antigen-receptor loci are dominant translocation partners for target DSBs regardless of genomic position, reflecting high-frequency DSBs at these loci and their co-localization in a fraction of cells. To directly assess spatial proximity contributions, we normalized genomic DSBs via ionizing radiation. Under these conditions, translocations were highly enriched in cis along single chromosomes containing target DSBs and within other chromosomes and sub-chromosomal domains in a manner directly related to pre-existing spatial proximity. Our studies reveal the power of combining two high-throughput genomic methods to address long-standing questions in cancer biology.
Translocations; 3D nuclear organization; DNA double-strand breaks; genome stability
To complement the human Encyclopedia of DNA Elements (ENCODE) project and to enable a broad range of mouse genomics efforts, the Mouse ENCODE Consortium is applying the same experimental pipelines developed for human ENCODE to annotate the mouse genome.
ENCODE Project; mouse genome; DNaseI hypersensitive sites; histone modifications; transcriptome; transcription factor binding sites; comparative genomics; ChIP-seq; RNA-seq
A major challenge in systems biology is to understand the gene regulatory networks that drive development, physiology and pathology. Interactions between transcription factors and regulatory genomic regions provide the first level of gene control. Gateway-compatible yeast one-hybrid (Y1H) assays present a convenient method to identify and characterize the repertoire of transcription factors that can bind a DNA sequence of interest. To delineate genome-scale regulatory networks, however, large sets of DNA fragments need to be processed at high throughput and high coverage. Here, we present “enhanced” Y1H (eY1H) assays that utilize a robotic mating platform with a set of improved Y1H reagents and automated readout quantification. We demonstrate that eY1H assays provide excellent coverage and identify interacting transcription factors for multiple DNA fragments in a short amount of time. eY1H assays will be an important tool for gene regulatory network mapping in Caenorhabditis elegans and other model organisms, as well as humans.
Gateway-compatible yeast one-hybrid (Y1H) assays provide a convenient gene-centered (DNA-to-protein) approach to identify the repertoire of transcription factors that can bind a DNA sequence of interest. We present a set of Y1H resources, including clones for 988 of 1,434 (69%) predicted human transcription factors, for the interrogation of interactions using either low or high-throughput settings. These approaches detect both known and novel interactions between human DNA regions and transcription factors.
Recent technological advances in the field of chromosome conformation capture are facilitating tremendous progress in the ability to map the three-dimensional (3D) organization of chromosomes at a resolution of several kb and at the scale of complete genomes. Here we review progress in analyzing chromosome organization in human cells by building 3D models of chromatin based on comprehensive chromatin interaction datasets. We describe recent experiments that suggest that long-range interactions between active functional elements are sufficient to drive folding of local chromatin domains into compact globular states. We propose that chromatin globules are commonly formed along chromosomes, in a cell-type-specific pattern, as a result of frequent long-range interactions among active genes and nearby regulatory elements. Further, we speculate that increasingly longer-range interactions can drive aggregation of groups of globular domains. This process would yield a compartmentalized chromosome conformation, consistent with recent observations obtained with genome-wide chromatin interaction mapping.
Recent advances in sequencing technologies have uncovered a world of RNAs that do not code for proteins, known as non-protein coding RNAs, that play important roles in gene regulation. Along with histone modifications and transcription factors, non-coding RNA is part of a layer of transcriptional control on top of the DNA code. This layer of components and their interactions specifically enables (or disables) the modulation of three-dimensional folding of chromatin to create a context for transcriptional regulation that underlies cell-specific transcription. In this perspective, we propose a structural and functional hierarchy, in which the DNA code, proteins and non-coding RNAs act as context creators to fold chromosomes and regulate genes.
The spatial organization of chromosomes inside the cell nucleus is still poorly understood. This organization is guided by intra- and interchromosomal contacts and by interactions of specific chromosomal loci with relatively fixed nuclear “landmarks” such as the nuclear envelope and the nucleolus. New molecular genome-wide mapping techniques have begun to uncover both types of molecular interactions, providing insights into the fundamental principles of interphase chromosome folding.
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein-coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well-annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.
Meiotic recombination occurs between one chromatid of each maternal and paternal homolog (homolog bias) versus between sister chromatids (sister bias). Physical DNA analysis reveals that meiotic cohesin/axis component Rec8 promotes sister bias, likely via its cohesion activity. Two meiosis-specific axis components, Red1/Mek1 kinase, counteract this effect. With this precondition satisfied, other molecules directly specify homolog bias per se. Rec8 also acts positively to maintain homolog bias during crossover recombination. These observations point to sequential release of double-strand break ends from association with their sister. Red1 and Rec8 are found to play distinct roles for sister cohesion, DSB formation and recombination progression kinetics. Also, the two components are enriched in spatially distinct domains of axial structure that develop prior to DSB formation. We propose that Red1 and Rec8 domains provide functionally complementary environments whereby inputs evolved from DSB repair and late-stage chromosome morphogenesis are integrated to give the complete meiotic chromosomal program.
Transcription factors control cell-specific gene expression programs through interactions with diverse coactivators and the transcription apparatus. Gene activation may involve DNA loop formation between enhancer-bound transcription factors and the transcription apparatus at the core promoter, but this process is not well understood. We report here that Mediator and Cohesin physically and functionally connect the enhancers and core promoters of active genes in embryonic stem cells. Mediator, a transcriptional coactivator, forms a complex with Cohesin, which can form rings that connect two DNA segments. The Cohesin loading factor Nipbl is associated with Mediator/Cohesin complexes, providing a means to load Cohesin at promoters. DNA looping is observed between the enhancers and promoters occupied by Mediator and Cohesin. Mediator and Cohesin occupy different promoters in different cells, thus generating cell-type-specific DNA loops linked to the gene expression program of each cell.
We developed a general approach that combines Chromosome Conformation Capture Carbon Copy with the Integrated Modeling Platform to generate high-resolution three-dimensional models of chromatin at the Mb scale. We applied this approach to the ENm008 domain on human chromosome 16 containing the α-globin locus, which is expressed in K562 cells and silenced in lymphoblastoid cells (GM12878). The models accurately reproduce the known looping interactions between the α-globin genes and their distal regulatory elements. Further, we find that the domain folds into a single globular conformation in GM12878 cells, whereas two globules are formed in K562 cells. The central cores of these globules are enriched for transcribed genes, whereas non-transcribed chromatin is more peripheral. We propose that globule formation represents a higher-order folding state related to clustering of transcribed genes around shared transcription machineries, as observed by microscopy.
The three-dimensional folding of chromosomes compartmentalizes the genome and can bring distant functional elements, such as promoters and enhancers, into close spatial proximity [2-6]. Deciphering the relationship between chromosome organization and genome activity will aid in understanding genomic processes, like transcription and replication. However, little is known about how chromosomes fold. Microscopy is unable to distinguish large numbers of loci simultaneously or at high resolution. To date, the detection of chromosomal interactions using chromosome conformation capture (3C) and its subsequent adaptations required the choice of a set of target loci, making genome-wide studies impossible [7-10].
We developed Hi-C, an extension of 3C that is capable of identifying long-range interactions in an unbiased, genome-wide fashion. In Hi-C, cells are fixed with formaldehyde, causing interacting loci to be bound to one another by means of covalent DNA-protein cross-links. When the DNA is subsequently fragmented with a restriction enzyme, these loci remain linked. A biotinylated residue is incorporated as the 5' overhangs are filled in. Next, blunt-end ligation is performed under dilute conditions that favor ligation events between cross-linked DNA fragments. This results in a genome-wide library of ligation products, corresponding to pairs of fragments that were originally in close proximity to each other in the nucleus. Each ligation product is marked with biotin at the site of the junction. The library is sheared, and the junctions are pulled down with streptavidin beads. The purified junctions can subsequently be analyzed using a high-throughput sequencer, resulting in a catalog of interacting fragments.
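The catalog of interacting fragments is typically summarized as a contact matrix. The sketch below shows one simple way this could be done for a single chromosome, by counting sequenced ligation junctions per pair of fixed-size genomic bins. The input format (pairs of mapped coordinates), the function name and the toy coordinates are assumptions made for illustration; this is not the published processing pipeline.

```python
def build_contact_matrix(read_pairs, bin_size, chrom_length):
    """Count ligation junctions per pair of fixed-size genomic bins.

    read_pairs: iterable of (pos1, pos2) mapped coordinates on one chromosome
    (hypothetical input format). Returns a symmetric list-of-lists matrix.
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in read_pairs:
        i, j = pos1 // bin_size, pos2 // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


# Toy example: three ligation products on a 3 Mb chromosome, 1 Mb bins.
pairs = [(100_000, 2_500_000), (150_000, 2_900_000), (1_200_000, 1_300_000)]
contacts = build_contact_matrix(pairs, bin_size=1_000_000, chrom_length=3_000_000)
for row in contacts:
    print(row)
```

Each entry counts how often two bins were captured in the same ligation product, which serves as a proxy for how frequently the two regions were in close spatial proximity.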
Direct analysis of the resulting contact matrix reveals numerous features of genomic organization, such as the presence of chromosome territories and the preferential association of small gene-rich chromosomes. Correlation analysis can be applied to the contact matrix, demonstrating that the human genome is segregated into two compartments: a less densely packed compartment containing open, accessible, and active chromatin and a more dense compartment containing closed, inaccessible, and inactive chromatin regions. Finally, ensemble analysis of the contact matrix, coupled with theoretical derivations and computational simulations, shows that, at the megabase scale, the Hi-C data are consistent with a fractal globule conformation.
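The correlation analysis can be illustrated with a minimal sketch: compute the Pearson correlation between every pair of rows of a contact matrix, then split the bins by the sign of the leading eigenvector of the resulting correlation matrix. Several choices here are assumptions made for this example: the input is taken to be already normalized for genomic distance, power iteration stands in for a full principal component analysis, and all function names and the toy matrix are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_matrix(matrix):
    """Correlate the contact profile of every bin with that of every other bin."""
    return [[pearson(r1, r2) for r2 in matrix] for r1 in matrix]


def leading_eigenvector(sym, iterations=200):
    """Approximate the dominant eigenvector of a symmetric matrix by power iteration."""
    n = len(sym)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(sym[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def assign_compartments(matrix):
    """Label each bin 0 or 1 according to the sign of the leading eigenvector."""
    vec = leading_eigenvector(correlation_matrix(matrix))
    return [0 if x >= 0 else 1 for x in vec]


# Toy normalized matrix: bins 0, 1, 4, 5 contact each other preferentially,
# as do bins 2 and 3 (invented values, for illustration only).
same, diff = 2.0, 0.5
group = [0, 0, 1, 1, 0, 0]
toy = [[same if group[i] == group[j] else diff for j in range(6)] for i in range(6)]
labels = assign_compartments(toy)
print(labels)
```

The sign of the eigenvector is arbitrary, so the labels only separate the two groups of bins. Deciding which group corresponds to open, active chromatin requires independent data such as gene density or chromatin accessibility.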
We describe Hi-C, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. We constructed spatial proximity maps of the human genome with Hi-C at a resolution of 1 Mb. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. At the megabase scale, the chromatin conformation is consistent with a fractal globule, a knot-free conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus. The fractal globule is distinct from the more commonly used globular equilibrium model. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.
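One way to distinguish polymer models from contact data is the scaling of contact probability P(s) with genomic distance s. A fractal globule predicts P(s) proportional to s^-1, whereas an equilibrium globule predicts s^-3/2. The sketch below estimates the exponent by a least-squares fit in log-log space. The function name and the synthetic data are assumptions made for illustration; real data would require averaging contact counts over all bin pairs at each distance first.

```python
import math


def fit_scaling_exponent(distances, contact_probs):
    """Least-squares slope of log(P) versus log(s), i.e. alpha in P(s) ~ s**alpha."""
    xs = [math.log(s) for s in distances]
    ys = [math.log(p) for p in contact_probs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data constructed to follow s**-1 exactly (not measured values).
distances = [1e6, 2e6, 4e6, 8e6]
probs = [1.0 / s for s in distances]
alpha = fit_scaling_exponent(distances, probs)
print(round(alpha, 6))
```

On real contact maps the fitted exponent depends on the range of distances included in the fit, so the range should be reported alongside the value.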
Identification of regulatory elements and their target genes is complicated by the fact that regulatory elements can act over large genomic distances. Identification of long-range acting elements is particularly important in the case of disease genes as mutations in these elements can result in human disease. It is becoming increasingly clear that long-range control of gene expression is facilitated by chromatin looping interactions. These interactions can be detected by chromosome conformation capture (3C). Here, we employed 3C as a discovery tool for identification of long-range regulatory elements that control the cystic fibrosis transmembrane conductance regulator gene, CFTR. We identified four elements in a 460-kb region around the locus that loop specifically to the CFTR promoter exclusively in CFTR-expressing cells. The elements are located 20 and 80 kb upstream, and 109 and 203 kb downstream, of the CFTR promoter. These elements contain DNase I hypersensitive sites and histone modification patterns characteristic of enhancers. The elements also interact with each other and the latter two activate the CFTR promoter synergistically in reporter assays. Our results reveal novel long-range acting elements that control expression of CFTR and suggest that 3C-based approaches can be used for discovery of novel regulatory elements.
To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular.
Long-range genetic control is an inherent feature of genes harbouring a highly complex spatiotemporal expression pattern, requiring a combined action of multiple cis-regulatory elements such as promoters, enhancers, and silencers. Consequently, disruption of the long-range genetic control of a target gene by genomic rearrangements of regulatory elements may lead to aberrant gene transcription and disease. To date, the contribution of mutated regulatory elements to human disease has not been studied frequently. Here, we explored the contribution of genetic changes in potentially cis-regulatory elements of the FOXL2 gene in blepharophimosis syndrome (BPES), a developmental monogenic condition of the eyelids and ovaries. We identified a de novo very subtle deletion of 7.4 kb causing BPES. Moreover, we studied the functional capacities and chromosome conformation of the deleted region in FOXL2 expressing cellular systems. Interestingly, the chromosome conformation analysis demonstrated the close proximity of the 7.4 kb deleted fragment and two other conserved regions with the FOXL2 core promoter, and the necessity of their integrity for correct FOXL2 expression. Finally, our study revealed the smallest distant deletion causing monogenic disease and emphasized the importance of mutation screening of cis-regulatory elements in human genetic disease.
The organization of eukaryotic genomes is characterized by the presence of distinct euchromatic and heterochromatic sub-nuclear compartments. In Saccharomyces cerevisiae heterochromatic loci, including telomeres and silent mating type loci, form clusters at the nuclear periphery. We have employed live cell 3-D imaging and chromosome conformation capture (3C) to determine the contribution of nuclear positioning and heterochromatic factors in mediating associations of the silent mating type loci. We identify specific long-range interactions between HML and HMR that are dependent upon silencing proteins Sir2p, Sir3p, and Sir4p as well as Sir1p and Esc2p, two proteins involved in establishment of silencing. Although clustering of these loci frequently occurs near the nuclear periphery, colocalization can occur equally at more internal positions and is not affected in strains deleted for membrane anchoring proteins yKu70p and Esc1p. In addition, appropriate nucleosome assembly plays a role, as deletion of ASF1 or combined disruption of the CAF-1 and HIR complexes abolishes the HML-HMR interaction. Further, silencer proteins are required for clustering, but complete loss of clustering in asf1 and esc2 mutants had only minor effects on silencing. Our results indicate that formation of heterochromatic clusters depends on correctly assembled heterochromatin at the silent loci and, in addition, identify an Asf1p-, Esc2p-, and Sir1p-dependent step in heterochromatin formation that is not essential for gene silencing but is required for long-range interactions.
Chromosomes are non-randomly positioned inside cells, and this organization is relevant for genome regulation. Spatial clustering of heterochromatic loci provides a striking example of nuclear compartmentalization. In S. cerevisiae, the presence of heterochromatic sub-nuclear domains has been well established, but their mechanisms of formation are not fully understood. Here, we analyzed the DNA elements and protein complexes that are critical for formation of heterochromatic clusters. We focused on heterochromatic regions on chromosome III—the two telomeres, as well as the silent mating type loci HML and HMR, located on the left and right end of the chromosome, respectively. We employed live cell 3-D imaging and chromosome conformation capture (3C) and found that these loci specifically interact most prominently near silencer elements that flank the loci. Analysis of a panel of mutants showed that complexes involved in silencing are also involved in long-range interactions. Interestingly, we find that heterochromatic interactions are mechanistically distinct from silencing and independent of tethering to the nuclear periphery. Our results indicate that formation of heterochromatic clusters depends on correctly assembled heterochromatin, and point to a step in heterochromatin formation that is not essential for gene silencing but is required for long-range interactions between heterochromatic loci.
Analysis of the spatial organization of chromosomes reveals complex three-dimensional networks of chromosomal interactions. These interactions impact gene expression at multiple levels, including long-range control by distant enhancers and repressors, coordinated expression of genes and acquisition of epigenetic states. Major challenges now include deciphering the mechanisms by which loci come together and understanding the functional consequences of these often transient associations.
Gene expression is controlled by regulatory elements that can be located far away along the chromosome or in some cases even on other chromosomes. Genes and regulatory elements physically associate with each other resulting in complex genome-wide networks of chromosomal interactions. Here we describe several well-characterized cases of long-range interactions involved in activation and repression of transcription. We speculate on how these interactions may affect gene expression and outline possible mechanisms that may facilitate encounters between distant elements. Finally, we propose that a genome-wide network analysis may provide new insights into the logic of long-range gene regulation.
GC-rich and AT-rich chromatin domains display distinct chromatin conformations and are marked by distinct patterns of histone modifications, and the histone deacetylase Rpd3p is an attenuator of these differences.
Base composition varies throughout the genome and is related to the organization of chromosomes into distinct domains (isochores). Isochore domains differ in gene expression levels, replication timing, levels of meiotic recombination and chromatin structure. The molecular basis for these differences is poorly understood.
We have compared GC- and AT-rich isochores of yeast with respect to chromatin conformation, histone modification status and transcription. Using 3C analysis we show that, along chromosome III, GC-rich isochores have a chromatin structure that is characterized by lower chromatin interaction frequencies compared to AT-rich isochores, which may point to a more extended chromatin conformation. In addition, we find that throughout the genome, GC-rich and AT-rich genes display distinct levels of histone modifications. Interestingly, elimination of the histone deacetylase Rpd3p differentially affects conformation of GC- and AT-rich domains. Further, deletion of RPD3 activates expression of GC-rich genes more strongly than AT-rich genes. Analyses of effects of the histone deacetylase inhibitor trichostatin A, global patterns of Rpd3p binding and effects of deletion of RPD3 on histone H4 acetylation confirmed that conformation and activity of GC-rich chromatin are more sensitive to Rpd3p-mediated deacetylation than AT-rich chromatin.
We find that GC-rich and AT-rich chromatin domains display distinct chromatin conformations and are marked by distinct patterns of histone modifications. We identified the histone deacetylase Rpd3p as an attenuator of these base composition-dependent differences in chromatin status. We propose that GC-rich chromatin domains tend to occur in a more active conformation and that Rpd3p activity represses this propensity throughout the genome.