Genetic and epigenetic inheritance through mitosis is critical for dividing cells to maintain their state. This process occurs in the context of large-scale reorganization of chromosome conformation during prophase, leading to the formation of mitotic chromosomes, and during the reformation of the interphase nucleus in telophase and early G1. This review highlights how studies over the last 5 years that combine chromosome conformation capture with classical models of chromosome organization, based on decades of microscopic observations, are providing new insights into the three-dimensional organization of chromatin inside the interphase nucleus and within mitotic chromosomes. One striking observation is that interphase genome organization displays cell type-specific features that are related to cell type-specific gene expression, whereas mitotic chromosome folding appears universal and tissue invariant. This raises the question of whether an epigenetic memory for genome folding is needed. Herein, the two different folding states of mammalian genomes are reviewed, and models are then discussed in which instructions for cell type-specific genome folding are locally encoded in the linear genome and transmitted through mitosis, e.g., as open chromatin sites with or without continuous binding of transcription factors. In the next cell cycle these instructions are used to re-assemble protein complexes on regulatory elements, which then drive three-dimensional folding of the genome from the bottom up through local action and self-assembly into higher-order levels of cell type-specific organization. In this model, no explicit epigenetic memory for cell type-specific chromosome folding is required.
Chromatin looping; Chromosome conformation capture; Chromosome folding; Epigenetic inheritance; Mitotic chromosome; Nucleus
Understanding the topological configurations of chromatin may reveal valuable insights into how the genome and epigenome act in concert to control cell fate during development. Here we generate high-resolution architecture maps across seven genomic loci in embryonic stem cells and neural progenitor cells. We observe a hierarchy of 3-D interactions that undergo marked reorganization at the sub-Mb scale during differentiation. Distinct combinations of CTCF, Mediator, and cohesin show widespread enrichment in looping interactions at different length scales. CTCF/cohesin anchor long-range constitutive interactions that form the topological basis for invariant sub-domains. Conversely, Mediator/cohesin together with pioneer factors bridge short-range enhancer-promoter interactions within and between larger sub-domains. Knockdown of Smc1 or Med12 in ES cells results in disruption of spatial architecture and down-regulation of genes found in cohesin-mediated interactions. We conclude that cell type-specific chromatin organization occurs at the sub-Mb scale and that architectural proteins shape the genome in hierarchical length scales.
Mitotic chromosomes are among the most recognizable structures in the cell, yet for over a century their internal organization has remained largely unsolved. We applied chromosome conformation capture methods, 5C and Hi-C, across the cell cycle and revealed two alternative three-dimensional folding states of the human genome. We show that the highly compartmentalized and cell-type-specific organization described previously for non-synchronous cells is restricted to interphase. In metaphase, we identify a homogeneous folding state, which is locus-independent, common to all chromosomes, and consistent among cell types, suggesting a general principle of metaphase chromosome organization. Using polymer simulations, we find that metaphase Hi-C data is inconsistent with classic hierarchical models, and is instead best described by a linearly organized, longitudinally compressed array of consecutive chromatin loops.
Despite advances in DNA-sequencing technology, assembly of complex genomes remains a major challenge, particularly for genomes sequenced using short reads, which yield highly fragmented assemblies. Here we show that genome-wide in vivo chromatin interaction frequency data, which are measurable with chromosome conformation capture–based experiments, can be used as genomic distance proxies to accurately position individual contigs without requiring any sequence overlap. We also use these data to construct approximate genome scaffolds de novo. Applying our approach to incomplete regions of the human genome, we predict the positions of 65 previously unplaced contigs, in agreement with alternative methods in 26/31 cases attempted in common. Our approach can theoretically bridge any gap size and should be applicable to any species for which global chromatin interaction data can be generated.
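The idea of using interaction frequency as a proxy for genomic distance can be illustrated with a toy example. The sketch below is not the method of the abstract above; it only assumes that a contig contacts its true neighbourhood most often, and it places the contig at the scaffold bin with the highest locally averaged contact count. The function name `place_contig`, the `window` parameter, and the input layout (one contact count per ordered scaffold bin) are hypothetical choices made for illustration.

```python
def place_contig(contacts, window=3):
    """Guess where an unplaced contig belongs along an ordered scaffold.

    contacts: list of contact counts between the contig and each scaffold
              bin, in scaffold order (hypothetical input layout).
    window:   number of neighbouring bins averaged to smooth noisy counts.
    Returns the index of the bin with the highest smoothed contact count.
    """
    if not contacts:
        raise ValueError("contacts must contain at least one bin")
    half = window // 2
    best_index, best_score = 0, float("-inf")
    for i in range(len(contacts)):
        # Clip the smoothing window at the scaffold ends.
        lo, hi = max(0, i - half), min(len(contacts), i + half + 1)
        score = sum(contacts[lo:hi]) / (hi - lo)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


if __name__ == "__main__":
    # Made-up counts that peak around the fifth bin.
    example = [1, 2, 5, 20, 35, 18, 4, 1]
    print("most likely bin:", place_contig(example))
```

A real analysis would need to model the decay of contact frequency with distance and correct for coverage biases; the smoothed maximum used here is only the simplest stand-in for that step.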
Mammalian genomes encode genetic information in their linear sequence, but appropriate expression of their genes requires chromosomes to fold into complex three-dimensional structures. Transcriptional control involves the establishment of physical connections among genes and regulatory elements, both along and between chromosomes. Recent technological innovations in probing the folding of chromosomes are providing new insights into the spatial organization of genomes and its role in gene regulation. It is emerging that folding of large complex chromosomes involves a hierarchy of structures, from chromatin loops that connect genes and enhancers to larger chromosomal domains and nuclear compartments. The larger these structures are along this hierarchy, the more stable they are within cells, while becoming more stochastic between cells. Here, we review the experimental and theoretical data on this hierarchy of structures, and propose a key role for the recently discovered topologically associating domains.
Spatial organization of chromatin plays an important role at multiple levels of genome regulation. On a global scale its function is evident in processes like metaphase and chromosome segregation. On a detailed level, long-range interactions between regulatory elements and promoters are essential for proper gene regulation. Microscopic techniques like FISH can detect chromatin contacts, although the resolution is generally low, making detection of enhancer-promoter interactions difficult. The 3C methodology allows for high-resolution analysis of chromatin interactions. 3C is now widely used and has revealed that long-range looping interactions between genomic elements are widespread. However, studying chromatin interactions in large genomic regions by 3C is very labor intensive. This limitation is overcome by the 5C technology. 5C is an adaptation of 3C, in which the concurrent use of thousands of primers permits the simultaneous detection of millions of chromatin contacts. The design of the 5C primers is critical, since it determines which and how many chromatin interactions will be examined in the assay. Starting material for 5C is a 3C template. To make a 3C template, chromatin interactions in living cells are crosslinked using formaldehyde. Next, chromatin is digested and subsequently ligated under conditions favoring ligation events between crosslinked fragments. This yields a genome-wide 3C library of ligation products representing all chromatin interactions in vivo. 5C then employs multiplex ligation-mediated amplification to detect, in a single assay, up to millions of unique ligation products present in the 3C library. The resulting 5C library can be analyzed by microarray analysis or deep sequencing. The observed abundance of a 5C product is a measure of the interaction frequency between the two corresponding chromatin fragments.
The power of the 5C technique described in this chapter is the high-throughput, high-resolution and quantitative way in which the spatial organization of chromatin can be examined.
Chromosome conformation capture; chromatin looping; chromatin structure; long-range gene regulation; high-throughput
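The statement that the abundance of a 5C product measures interaction frequency can be made concrete with a small tally. The following is a minimal sketch, assuming each sequenced ligation product has already been resolved to a pair of fragment identifiers; the function name `interaction_counts` and the identifiers `F1`, `R2`, `R3` are hypothetical and do not come from any published 5C software.

```python
from collections import Counter


def interaction_counts(ligation_products):
    """Count how often each pair of fragments was observed ligated together.

    ligation_products: iterable of (fragment_a, fragment_b) identifier pairs,
                       one per sequenced product (hypothetical input format).
    Returns a Counter keyed by the sorted pair, so that (a, b) and (b, a)
    are treated as the same interaction.
    """
    counts = Counter()
    for frag_a, frag_b in ligation_products:
        key = tuple(sorted((frag_a, frag_b)))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Made-up products: the F1/R2 pair is seen twice, in either orientation.
    products = [("F1", "R2"), ("R2", "F1"), ("F1", "R3")]
    for pair, n in sorted(interaction_counts(products).items()):
        print(pair, n)
```

Raw counts of this kind are only a starting point; they still carry primer and fragment-level biases that would have to be normalized before comparing interactions.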
How DNA is organized in three dimensions inside the cell nucleus and how that affects the ways in which cells access, read and interpret genetic information are among the longest standing questions in cell biology. Using newly developed molecular, genomic, and computational approaches based on the chromosome conformation capture technology (such as 3C, 4C, 5C and Hi-C) the spatial organization of genomes is being explored at unprecedented resolution. Interpreting the increasingly large chromatin interaction datasets is now posing novel challenges. Here we describe several types of statistical and computational approaches that have recently been developed to analyze chromatin interaction data.
Chromosome conformation capture; chromatin looping; long-range gene regulation; chromatin domains; 3D modeling; polymer physics; genomics; integrative modeling; topology; fractal globule
Chromosomes are folded in intricate ways inside cells and their spatial organization is intimately related to regulation of gene expression. Expression of genes can be controlled by regulatory elements that are located at large genomic distances from their target genes (in cis), or even on different chromosomes (in trans). Regulatory elements can act at large genomic distances by engaging in direct physical interactions with their target genes resulting in the formation of chromatin loops. Thus, genes and their regulatory elements come in close spatial proximity irrespective of their relative genomic positions. Analysis of interactions between genes and elements will reveal which elements regulate each gene, and will provide fundamental insights into the spatial organization of chromosomes in general.
Long-range cis- and trans-interactions can be studied at high resolution using the Chromosome Conformation Capture (3C) technology. 3C employs formaldehyde crosslinking to trap physical interactions between loci located throughout the genome. Crosslinked cells are then solubilized and chromatin is digested by a restriction enzyme. After digestion, the chromatin is ligated at very low DNA concentrations. These conditions favor intramolecular ligation over intermolecular ligation, and thus result in selective ligation of interacting (and crosslinked) genomic elements. The crosslinks are reversed, the DNA is purified, and interaction frequencies between specific chromosomal loci are determined by using PCR to quantify the amount of the corresponding ligation product. This chapter describes detailed protocols for 3C analysis of chromosomes from the yeast Saccharomyces cerevisiae and from mammalian cells.
DNA; chromatin looping; long-range gene regulation; trans-regulation; formaldehyde crosslinking; spatial organization
We have determined the three-dimensional (3D) architecture of the Caulobacter crescentus genome by combining genome-wide chromatin interaction detection, live-cell imaging, and computational modeling. Using chromosome conformation capture carbon copy (5C) technology, we derive ~13 Kb resolution 3D models of the Caulobacter genome. These models illustrate that the genome is ellipsoidal with periodically arranged arms. The parS sites, a pair of short contiguous sequence elements involved in chromosome segregation, are positioned at one pole of this structure, where they nucleate a compact chromatin conformation. Both 5C and imaging experiments demonstrate that placing these sequence elements at new genomic positions yields large-scale rotations of the genome within the cell. Utilizing automated fluorescent imaging, we orient the genome within the cell and illustrate that within the resolution of our data the parS proximal region is the only portion of the genome stably attached to the cell envelope. Our approach provides an experimental paradigm for deriving insight into the cis-determinants of 3D genome architecture.
In eukaryotes, genome organization can be observed on many levels and at different scales. This organization is important not only to reduce chromosome length but also for the proper execution of various biological processes. High-resolution mapping of spatial chromatin structure was made possible by the development of the chromosome conformation capture (3C) technique. 3C uses chemical cross-linking followed by proximity-based ligation of fragmented DNA to capture frequently interacting chromatin segments in cell populations. Several 3C-related methods capable of higher-throughput chromosome conformation mapping were reported afterwards. These techniques include the 3C-carbon copy (5C) approach, which offers the advantage of being highly quantitative and reproducible. We provide here a reference protocol for the production of 5C libraries analyzed by next-generation sequencing or on microarrays. A procedure used to verify that 3C library templates are of the high quality required to produce superior 5C libraries is also described. We believe that this comprehensive, detailed protocol will help guide researchers in probing spatial genome organization and its role in various biological processes.
Chromatin; transcription; epigenetics; genome organization; structure
We describe a method, Hi-C, to comprehensively detect chromatin interactions in the mammalian nucleus. This method is based on Chromosome Conformation Capture, in that chromatin is crosslinked with formaldehyde, then digested, and re-ligated in such a way that only DNA fragments that are covalently linked together form ligation products. The ligation products contain information not only about where they originated in the genomic sequence but also about where they reside, physically, in the 3D organization of the genome. In Hi-C, a biotin-labeled nucleotide is incorporated at the ligation junction, making it possible to enrich for chimeric DNA ligation junctions when modifying the DNA molecules for deep sequencing. The compatibility of Hi-C with next-generation sequencing platforms makes it possible to detect chromatin interactions on an unprecedented scale. This advance gives Hi-C the power to explore both chromatin biophysics and the implications of chromatin structure for the biological functions of the nucleus. A massively parallel survey of chromatin interactions provides the previously missing dimension of spatial context to other genomic studies. This spatial context will provide a new perspective to studies of chromatin and its role in genome regulation in normal conditions and in disease.
Chromatin Conformation; Nucleus; Genomics
Extracting biologically meaningful information from chromosomal interactions obtained with genome-wide chromosome conformation capture (3C) analyses requires elimination of systematic biases. We present a pipeline that integrates a strategy for mapping of sequencing reads and a data-driven method for iterative correction of biases, yielding genome-wide maps of relative contact probabilities. We validate ICE (Iterative Correction and Eigenvector decomposition) on published Hi-C data, and demonstrate that eigenvector decomposition of the obtained maps provides insights into local chromatin states, global patterns of chromosomal interactions, and the conserved organization of human and mouse chromosomes.
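The iterative correction step can be sketched as a matrix-balancing loop. The code below is a minimal illustration in plain Python, not the published pipeline: it assumes a small, symmetric, dense matrix of raw contact counts and repeatedly divides out per-bin factors until every non-empty row has the same sum. The function name `iterative_correction` and the parameters `max_iter` and `tol` are hypothetical; read mapping, filtering of low-coverage bins, and the eigenvector decomposition are all omitted.

```python
def iterative_correction(counts, max_iter=200, tol=1e-8):
    """Balance a symmetric contact matrix so non-empty rows have equal sums.

    counts: square, symmetric list of lists of raw contact counts.
    Returns (corrected, bias), where
    corrected[i][j] == counts[i][j] / (bias[i] * bias[j]).
    """
    n = len(counts)
    corrected = [[float(x) for x in row] for row in counts]
    bias = [1.0] * n
    for _ in range(max_iter):
        row_sums = [sum(row) for row in corrected]
        nonzero = [s for s in row_sums if s > 0]
        if not nonzero:
            break  # an all-zero matrix has nothing to balance
        mean = sum(nonzero) / len(nonzero)
        # Bins without any contacts keep a factor of 1 and are left untouched.
        delta = [s / mean if s > 0 else 1.0 for s in row_sums]
        for i in range(n):
            for j in range(n):
                corrected[i][j] /= delta[i] * delta[j]
            bias[i] *= delta[i]
        # Stop once the per-bin update factors are all close to 1.
        if max(abs(d - 1.0) for d in delta) < tol:
            break
    return corrected, bias


if __name__ == "__main__":
    # Toy matrix: a uniform-coverage map distorted by per-bin biases 1, 2, 4.
    raw = [[2, 2, 4],
           [2, 8, 8],
           [4, 8, 32]]
    balanced, bias = iterative_correction(raw)
    print("row sums:", [round(sum(row), 6) for row in balanced])
    print("relative biases:", [round(b / bias[0], 6) for b in bias])
```

The corrected values are relative contact probabilities: only their ratios are meaningful, because the overall scale is set by the mean row sum of the input.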
Summary: The R/Bioconductor package HiTC facilitates the exploration of high-throughput 3C-based data. It allows users to import and export ‘C’ data, to transform, normalize, annotate and visualize interaction maps. The package operates within the Bioconductor framework and thus offers new opportunities for future development in this field.
Availability and implementation: The R package HiTC is available from the Bioconductor website. A detailed vignette provides additional documentation and help for using the package.
Supplementary data are available at Bioinformatics online.
DNaseI hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers, and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify ~2.9 million DHSs that encompass virtually all known experimentally-validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation, and regulatory factor occupancy patterns. We connect ~580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is choreographed with dozens to hundreds of co-activated elements, and the trans-cellular DNaseI sensitivity pattern at a given region can predict cell type-specific functional behaviors. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation.
The genome is extensively transcribed into long intergenic noncoding RNAs (lincRNAs), many of which are implicated in gene silencing [1,2]. Potential roles of lincRNAs in gene activation are much less understood [3,4,5]. Development and homeostasis require coordinate regulation of neighboring genes through a process termed locus control [6]. Some locus control elements and enhancers transcribe lincRNAs [7,8,9,10], hinting at possible roles in long-range control. In vertebrates, 39 Hox genes, encoding homeodomain transcription factors critical for positional identity, are clustered in four chromosomal loci; the Hox genes are expressed in nested anterior-posterior and proximal-distal patterns co-linear with their genomic position from 3′ to 5′ of the cluster [11]. Here we identify HOTTIP, a lincRNA transcribed from the 5′ tip of the HOXA locus that coordinates the activation of multiple 5′ HOXA genes in vivo. Chromosomal looping brings HOTTIP into close proximity to its target genes. HOTTIP directly binds the adaptor protein WDR5 and targets WDR5/MLL complexes across HOXA, driving histone H3 lysine 4 trimethylation and gene transcription. Induced proximity is necessary and sufficient for HOTTIP activation of its target genes. Thus, by serving as key intermediates that transmit information from higher order chromosomal looping into chromatin modifications, lincRNAs may organize chromatin domains to coordinate long-range gene activation.
The vast non-coding portion of the human genome is awash in functional elements and disease-causing regulatory variants. The principles defining the relationships between these elements and distal target genes remain unknown. Promoters and distal elements can engage in looping interactions that have been implicated in gene regulation [1]. Here we have applied chromosome conformation capture carbon copy, 5C [2], to comprehensively interrogate interactions between transcription start sites (TSSs) and distal elements in 1% of the human genome representing the ENCODE pilot project regions [3]. 5C maps were generated for GM12878, K562 and HeLa-S3 cells and results were integrated with data from the ENCODE consortium [4]. In each cell line we discovered >1,000 long-range interactions between promoters and distal sites that include elements resembling enhancers, promoters and CTCF-bound sites. We observed significant correlations between gene expression, promoter-enhancer interactions and the presence of enhancer RNAs. Long-range interactions display striking asymmetry with a bias for interactions with elements located ~120 Kb upstream of the TSS. Long-range interactions are often not blocked by sites bound by CTCF and cohesin, implying that many of these sites do not demarcate physically insulated gene domains. Further, only ~7% of looping interactions are with the nearest gene, suggesting that genomic proximity is not a simple predictor for long-range interactions. Finally, promoters and distal elements are engaged in multiple long-range interactions to form complex networks. Our results start to place genes and regulatory elements in three-dimensional context, revealing their functional relationships.
The mouse X-inactivation center (Xic) orchestrates initiation of X inactivation by controlling the expression of the non-coding Xist transcript. The full extent of Xist’s regulatory landscape, however, remains to be defined. Here we use Chromosome Conformation Capture Carbon-Copy and super-resolution microscopy to analyse the spatial organisation of a 4.5 Mb region including Xist. We uncover a series of discrete 200 kb to 1 Mb topologically associating domains (TADs), present both before and after cell differentiation and on the active and inactive X. These domains align with several domain-wide epigenomic features as well as co-regulated gene clusters. Disruption of a TAD boundary causes ectopic chromosomal contacts and long-range transcriptional mis-regulation. Xist/Tsix illustrates the spatial segregation of oppositely regulated chromosomal neighborhoods, with their promoters lying in two adjacent TADs, each containing their known positive regulators. This led to the identification of a distal regulatory region of Tsix producing a novel long intervening RNA, Linx, within its TAD. In addition to uncovering a new principle of the cis-regulatory architecture of mammalian chromosomes, our study sets the stage for the full genetic dissection of the Xic.
Transposable elements (TEs) and DNA repeats are commonly targeted by DNA and histone methylation to achieve epigenetic gene silencing. We isolated mutations in two Arabidopsis genes, AtMORC1 and AtMORC6, which cause de-repression of DNA-methylated genes and TEs, but no losses of DNA or histone methylation. AtMORC1 and AtMORC6 are members of the conserved Microrchidia (MORC) adenosine triphosphatase (ATPase) family, predicted to catalyze alterations in chromosome superstructure. The atmorc1 and atmorc6 mutants show decondensation of pericentromeric heterochromatin, increased interaction of pericentromeric regions with the rest of the genome, and transcriptional defects that are largely restricted to loci residing in pericentromeric regions. Knockdown of the single MORC homolog in Caenorhabditis elegans also impairs transgene silencing. We propose that the MORC ATPases are conserved regulators of gene silencing in eukaryotes.