Recent studies have shown that chromosomes in a range of organisms are compartmentalized in different types of chromatin domains. In mammals, chromosomes form compartments that are composed of smaller Topologically Associating Domains (TADs). TADs are thought to represent functional domains of gene regulation but much is still unknown about the mechanisms of their formation and how they exert their regulatory effect on embedded genes. Further, similar domains have been detected in other organisms, including flies, worms, fungi and bacteria. Although in all these cases these domains appear similar as detected by 3C-based methods, their biology appears to be quite distinct with differences in the protein complexes involved in their formation and differences in their internal organization. Here we outline our current understanding of such domains in different organisms and their roles in gene regulation.
topologically associating domain; long-range gene regulation; chromatin folding
Oncogenes are activated through well-known chromosomal alterations, including gene fusion, translocation and focal amplification. Recent evidence that the control of key genes depends on chromosome structures called insulated neighborhoods led us to investigate whether proto-oncogenes occur within these structures and if oncogene activation can occur via disruption of insulated neighborhood boundaries in cancer cells. We mapped insulated neighborhoods in T-cell acute lymphoblastic leukemia (T-ALL), and found that tumor cell genomes contain recurrent microdeletions that eliminate the boundary sites of insulated neighborhoods containing prominent T-ALL proto-oncogenes. Perturbation of such boundaries in non-malignant cells was sufficient to activate proto-oncogenes. Mutations affecting chromosome neighborhood boundaries were found in many types of cancer. Thus, oncogene activation can occur via genetic alterations that disrupt insulated neighborhoods in malignant cells.
One Sentence Summary
Proto-oncogenes can be activated by genetic alterations that disrupt 3D chromosome structure.
Mammalian interphase chromosomes interact with the nuclear lamina (NL) through hundreds of large Lamina Associated Domains (LADs). We report a method to map NL contacts genome-wide in single human cells. Analysis of nearly 400 maps reveals a core architecture of gene-poor LADs that contact the NL with high cell-to-cell consistency, interspersed by LADs with more variable NL interactions. The variable contacts tend to be cell-type specific and are more sensitive to changes in genome ploidy than the consistent contacts. Single-cell maps indicate that NL contacts involve multivalent interactions over hundreds of kilobases. Moreover, we observe extensive intra-chromosomal coordination of NL contacts, even over tens of megabases. Such coordinated loci exhibit preferential interactions as detected by Hi-C. Finally, consistency of NL contacts is inversely linked to gene activity in single cells, and correlates positively with the heterochromatic histone modification H3K9me3. These results highlight fundamental principles of single cell chromatin organization.
Early B cell development is characterized by large-scale Igh locus contraction prior to V(D)J recombination to facilitate a highly diverse Ig repertoire. However, the molecular architecture that mediates locus contraction remains unclear. We have combined high-resolution chromosome conformation capture (3C) techniques with 3D DNA FISH to identify three conserved topological sub-domains. Each of these topological folds encompasses a major VH gene family that become juxtaposed in pro-B cells via Mb-scale chromatin looping. The transcription factor Pax5 organizes the sub-domain that spans the VHJ558 gene family. In its absence the J558 VH genes fail to associate with the proximal VH genes, thereby providing a plausible explanation for reduced VHJ558 gene rearrangements in Pax5-deficient pro-B cells. We propose that Igh locus contraction is the cumulative effect of several independently controlled chromatin sub-domains that provide the structural infrastructure to coordinate optimal antigen receptor assembly.
We describe a Hi-C based method, Micro-C, in which micrococcal nuclease is used instead of restriction enzymes to fragment chromatin, enabling nucleosome resolution chromosome folding maps. Analysis of Micro-C maps for budding yeast reveals abundant self-associating domains similar to those reported in other species, but not previously observed in yeast. These structures, far shorter than topologically-associating domains in mammals, typically encompass one to five genes in yeast. Strong boundaries between self-associating domains occur at promoters of highly transcribed genes and regions of rapid histone turnover that are typically bound by the RSC chromatin-remodeling complex. Investigation of chromosome folding in mutants confirms roles for RSC, “gene looping” factor Ssu72, Mediator, H3K56 acetyltransferase Rtt109, and the N-terminal tail of H4 in folding of the yeast genome. This approach provides detailed structural maps of a eukaryotic genome, and our findings provide insights into the machinery underlying chromosome compaction.
Dosage compensation mechanisms provide a paradigm to study the contribution of chromosomal conformation towards targeting and spreading of epigenetic regulators over a specific chromosome. By using Hi-C and 4C analyses we show that high-affinity sites (HAS), landing platforms of the male-specific lethal (MSL) complex, are enriched around topologically associating domain (TAD) boundaries on the X chromosome and harbor more long-range contacts in a sex-independent manner. Ectopically expressed roX1 and roX2 RNA target HAS on the X chromosome in trans and, via spatial proximity, induce spreading of the MSL complex in cis, leading to increased expression of neighboring autosomal genes. We show that the MSL complex regulates nucleosome positioning at HAS, thus acting locally rather than influencing the overall chromosomal architecture. We propose that sex-independent three-dimensional conformation of the X chromosome poises it for exploitation by the MSL complex, thereby facilitating spreading in males.
Over the last decade, development and application of a set of molecular genomic approaches based on the chromosome conformation capture method (3C), combined with increasingly powerful imaging approaches, have enabled high resolution and genome-wide analysis of the spatial organization of chromosomes. The aim of this paper is to provide guidelines for analyzing and interpreting data obtained with genome-wide 3C methods such as Hi-C and 3C-seq that rely on deep sequencing to detect and quantify pairwise chromatin interactions genome-wide.
The three-dimensional organization of a genome plays a critical role in regulating gene expression, yet little is known about the machinery and mechanisms that determine higher-order chromosome structure1,2. Here we perform genome-wide chromosome conformation capture analysis, FISH, and RNA-seq to obtain comprehensive 3D maps of the Caenorhabditis elegans genome and to dissect X-chromosome dosage compensation, which balances gene expression between XX hermaphrodites and XO males. The dosage compensation complex (DCC), a condensin complex, binds to both hermaphrodite X chromosomes via sequence-specific recruitment elements on X (rex sites) to reduce chromosome-wide gene expression by half3–7. Most DCC condensin subunits also act in other condensin complexes to control the compaction and resolution of all mitotic and meiotic chromosomes5,6. By comparing chromosome structure in wild-type and DCC-defective embryos, we show that the DCC remodels hermaphrodite X chromosomes into a sex-specific spatial conformation distinct from autosomes. Dosage-compensated X chromosomes consist of self-interacting domains (~1 Mb) resembling mammalian Topologically Associating Domains (TADs)8,9. TADs on X have stronger boundaries and more regular spacing than on autosomes. Many TAD boundaries on X coincide with the highest-affinity rex sites and become diminished or lost in DCC-defective mutants, thereby converting the topology of X to a conformation resembling autosomes. rex sites engage in DCC-dependent long-range interactions, with the most frequent interactions occurring between rex sites at DCC-dependent TAD boundaries. These results imply that the DCC reshapes the topology of X by forming new TAD boundaries and reinforcing weak boundaries through interactions between its highest-affinity binding sites. As this model predicts, deletion of an endogenous rex site at a DCC-dependent TAD boundary using CRISPR/Cas9 greatly diminished the boundary. 
Thus, the DCC imposes a distinct higher-order structure onto X while regulating gene expression chromosome wide.
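The notion of boundary strength used above can be made concrete with an insulation-style score. The sketch below is a simplified illustration, not the analysis used in the study: the function names `insulation_profile` and `weakest_insulation`, the `window` parameter, and the dense list-of-lists contact matrix are all assumptions. For each position it averages the contacts that cross that position within a small window; a low value suggests that the flanking regions rarely interact, which is what a strong domain boundary would look like.

```python
def insulation_profile(matrix, window):
    """Mean contact frequency crossing each position of a binned contact matrix.

    Position i is the junction between bin i-1 and bin i. The score averages
    matrix[a][b] for the `window` bins upstream (a) and downstream (b) of it.
    Hypothetical helper for illustration only.
    """
    n = len(matrix)
    profile = {}
    for i in range(window, n - window + 1):
        values = [matrix[a][b]
                  for a in range(i - window, i)
                  for b in range(i, i + window)]
        profile[i] = sum(values) / len(values)
    return profile


def weakest_insulation(profile):
    """Return the position with the lowest crossing-contact score."""
    return min(profile, key=profile.get)


if __name__ == "__main__":
    # Toy matrix: two domains of three bins each, with frequent contacts
    # inside a domain (10) and rare contacts between domains (1).
    toy = [[10 if (i < 3) == (j < 3) else 1 for j in range(6)] for i in range(6)]
    scores = insulation_profile(toy, window=2)
    print(scores, weakest_insulation(scores))
```

Comparing such profiles between wild-type and mutant matrices would be one way to express a boundary becoming "diminished or lost", although real analyses typically work on normalized matrices and larger windows.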
Mating type switching in yeast occurs through gene conversion between the MAT locus and one of two silent loci (HML or HMR) on opposite ends of the chromosome. MATa cells choose HML as template, while MATα cells use HMR. The Recombination Enhancer (RE), located on the left arm, regulates this process. One long-standing hypothesis is that switching is guided by mating type-specific, and possibly RE-dependent, chromosome folding. Here we use Hi-C, 5C, and live cell imaging to characterize the conformation of chromosome III in both mating types. We discovered a mating type-specific conformational difference in the left arm. Deletion of a 1 kb subregion within the RE, which is not necessary during switching, abolished mating type-dependent chromosome folding. The RE is therefore a composite element with one subregion essential for donor selection during switching, and a separate region involved in modulating chromosome conformation.
Chromosome conformation; Mating type switching; long-range interactions; Recombination Enhancer; multicolor fluorescence microscopy
HiC-Pro is an optimized and flexible pipeline for processing Hi-C data from raw reads to normalized contact maps. HiC-Pro maps reads, detects valid ligation products, performs quality controls and generates intra- and inter-chromosomal contact maps. It includes a fast implementation of the iterative correction method and is based on a memory-efficient data format for Hi-C contact maps. In addition, HiC-Pro can use phased genotype data to build allele-specific contact maps. We applied HiC-Pro to different Hi-C datasets, demonstrating its ability to easily process large data in a reasonable time. Source code and documentation are available at http://github.com/nservant/HiC-Pro.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0831-x) contains supplementary material, which is available to authorized users.
Chromosome conformation; Hi-C; Bioinformatics pipeline; Normalization
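The iterative correction step mentioned in the HiC-Pro abstract can be illustrated with a minimal sketch. This is not HiC-Pro's implementation: the function name `iterative_correction`, its parameters `n_iter` and `tol`, and the dense list-of-lists input are assumptions made for illustration. The idea shown is matrix balancing: each entry of a symmetric contact matrix is repeatedly divided by the product of its row and column coverage factors until all non-empty bins have roughly equal total coverage.

```python
def iterative_correction(matrix, n_iter=50, tol=1e-6):
    """Balance a symmetric contact matrix (illustrative sketch only).

    Returns (balanced, bias), where balanced[i][j] is approximately
    matrix[i][j] / (bias[i] * bias[j]). Bins with no contacts are left as-is.
    """
    n = len(matrix)
    balanced = [row[:] for row in matrix]  # work on a copy of the input
    bias = [1.0] * n
    for _ in range(n_iter):
        row_sums = [sum(row) for row in balanced]
        nonzero = [s for s in row_sums if s > 0]
        if not nonzero:
            break
        mean = sum(nonzero) / len(nonzero)
        # Coverage of each bin relative to the mean coverage.
        factors = [s / mean if s > 0 else 1.0 for s in row_sums]
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= factors[i] * factors[j]
        bias = [b * f for b, f in zip(bias, factors)]
        # Stop once every factor is close to 1 (coverage is uniform).
        if max(abs(f - 1.0) for f in factors) < tol:
            break
    return balanced, bias


if __name__ == "__main__":
    # Toy counts generated as the product of per-bin biases 1, 2 and 3.
    counts = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
    balanced, bias = iterative_correction(counts)
    print(balanced)
    print(bias)
```

Convergence is not guaranteed for every input (a purely diagonal matrix, for example, can oscillate under this update), which is why the loop is capped by `n_iter`. A production pipeline would also use a sparse representation rather than nested lists.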
Higher-order chromatin structure is often perturbed in cancer and other pathological states. Although several genetic and epigenetic differences have been charted between normal and breast cancer tissues, changes in higher-order chromatin organization during tumorigenesis have not been fully explored. To probe the differences in higher-order chromatin structure between mammary epithelial and breast cancer cells, we performed Hi-C analysis on MCF-10A mammary epithelial and MCF-7 breast cancer cell lines.
Our studies reveal that the small, gene-rich chromosomes chr16 through chr22 in the MCF-7 breast cancer genome display decreased interaction frequency with each other compared to the inter-chromosomal interaction frequency in the MCF-10A epithelial cells. Interestingly, this finding is associated with a higher occurrence of open compartments on chr16–22 in MCF-7 cells. Pathway analysis of the MCF-7 up-regulated genes located in altered compartment regions on chr16–22 reveals pathways related to repression of WNT signaling. There are also differences in intra-chromosomal interactions between the cell lines; telomeric and sub-telomeric regions in the MCF-10A cells display more frequent interactions than are observed in the MCF-7 cells.
We show evidence of an intricate relationship between chromosomal organization and gene expression between epithelial and breast cancer cells. Importantly, this work provides a genome-wide view of higher-order chromatin dynamics and a resource for studying higher-order chromatin interactions in two cell lines commonly used to study the progression of breast cancer.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0768-0) contains supplementary material, which is available to authorized users.
Hi-C; Chromosome Conformation Capture; Breast Cancer; Topologically Associating Domain; TAD; Telomere
We have examined the three-dimensional organization of the yeast genome during quiescence by a chromosome capture technique as a means of understanding how genome organization changes during development. For exponentially growing cells we observe high levels of inter-centromeric interaction but otherwise a predominance of intrachromosomal interactions over interchromosomal interactions, consistent with aggregation of centromeres at the spindle pole body and compartmentalization of individual chromosomes within the nucleoplasm. Three major changes occur in the organization of the quiescent cell genome. First, intrachromosomal associations increase at longer distances in quiescence as compared to growing cells. This suggests that chromosomes undergo condensation in quiescence, which we confirmed by microscopy by measurement of the intrachromosomal distances between two sites on one chromosome. This compaction in quiescence requires the condensin complex. Second, inter-centromeric interactions decrease, consistent with prior data indicating that centromeres disperse along an array of microtubules during quiescence. Third, inter-telomeric interactions significantly increase in quiescence, an observation also confirmed by direct measurement. Thus, survival during quiescence is associated with substantial topological reorganization of the genome.
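The first observation above, that intrachromosomal contacts increase at longer distances in quiescence, is the kind of statement that can be read off a distance-decay profile. The following is a minimal sketch of that calculation, assuming a binned, symmetric intrachromosomal contact matrix stored as a list of lists; the function names and the `min_distance` cutoff are hypothetical and are not taken from the study.

```python
def mean_contacts_by_distance(matrix):
    """Average contact frequency at each bin separation d = |i - j|, d >= 1."""
    n = len(matrix)
    profile = {}
    for d in range(1, n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        profile[d] = sum(diagonal) / len(diagonal)
    return profile


def long_range_fraction(profile, min_distance):
    """Share of the summed profile contributed by separations >= min_distance."""
    total = sum(profile.values())
    if total == 0:
        return 0.0
    far = sum(v for d, v in profile.items() if d >= min_distance)
    return far / total


if __name__ == "__main__":
    # Toy matrices (made-up numbers): same short-range contacts,
    # more long-range contacts in the second condition.
    growing = [[0, 8, 2, 1], [8, 0, 8, 2], [2, 8, 0, 8], [1, 2, 8, 0]]
    quiescent = [[0, 8, 4, 3], [8, 0, 8, 4], [4, 8, 0, 8], [3, 4, 8, 0]]
    for name, m in (("growing", growing), ("quiescent", quiescent)):
        profile = mean_contacts_by_distance(m)
        print(name, profile, long_range_fraction(profile, min_distance=2))
```

A higher long-range fraction in one condition would be consistent with a more compact chromosome, but the matrices must be normalized comparably before such a comparison is meaningful.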
Eukaryotic genomes are folded into three-dimensional structures, such as self-associating topological domains, the borders of which are enriched in cohesin and CCCTC-binding factor (CTCF) required for long-range interactions1-7. How local chromatin interactions govern higher-order folding of chromatin fibers and the function of cohesin in this process remain poorly understood. Here we perform genome-wide chromatin conformation capture (Hi-C) analysis8 to explore the high-resolution organization of the Schizosaccharomyces pombe genome, which despite its small size exhibits fundamental features found in other eukaryotes9. Our analyses of wild type and mutant strains reveal key elements of chromosome architecture and genome organization. On chromosome arms, small regions of chromatin locally interact to form “globules”. This feature requires a function of cohesin distinct from its role in sister chromatid cohesion. Cohesin is enriched at globule boundaries and its loss causes disruption of local globule structures and global chromosome territories. By contrast, heterochromatin, which loads cohesin at specific sites including pericentromeric and subtelomeric domains9-11, is dispensable for globule formation but nevertheless affects genome organization. We show that heterochromatin mediates chromatin fiber compaction at centromeres and promotes prominent interarm interactions within centromere-proximal regions, providing structural constraints crucial for proper genome organization. Loss of heterochromatin relaxes constraints on chromosomes, causing an increase in intra- and inter-chromosomal interactions. Together, our analyses uncover fundamental genome folding principles that drive higher-order chromosome organization crucial for coordinating nuclear functions.
A new level of chromosome organization, Topologically Associating Domains (TADs), was recently uncovered by chromosome-conformation-capture (3C) techniques. To explore TAD structure and function, we developed a polymer model that can extract the full repertoire of chromatin conformations within TADs from population-based 3C data. This model predicts actual physical distances and to what extent chromosomal contacts vary between cells. It also identifies interactions within single TADs that stabilize boundaries between TADs and allows us to identify and genetically validate key structural elements within TADs. Combining the model’s predictions with high-resolution DNA FISH and quantitative RNA FISH for TADs within the X-inactivation center (Xic), we dissect the relationship between transcription and spatial proximity to cis-regulatory elements. We demonstrate that contacts between potential regulatory elements occur in the context of fluctuating structures rather than stable loops and propose that such fluctuations may contribute to asymmetric expression in the Xic during X inactivation.
Genetic and epigenetic inheritance through mitosis is critical for dividing cells to maintain their state. This process occurs in the context of large-scale re-organization of chromosome conformation during prophase leading to the formation of mitotic chromosomes, and during the reformation of the interphase nucleus during telophase and early G1. This review highlights how recent studies over the last 5 years employing chromosome conformation capture combined with classical models of chromosome organization based on decades of microscopic observations, are providing new insights into the three-dimensional organization of chromatin inside the interphase nucleus and within mitotic chromosomes. One striking observation is that interphase genome organization displays cell type-specific features that are related to cell type-specific gene expression, whereas mitotic chromosome folding appears universal and tissue invariant. This raises the question of whether or not there is a need for an epigenetic memory for genome folding. Herein, the two different folding states of mammalian genomes are reviewed and then models are discussed wherein instructions for cell type-specific genome folding are locally encoded in the linear genome and transmitted through mitosis, e.g., as open chromatin sites with or without continuous binding of transcription factors. In the next cell cycle these instructions are used to re-assemble protein complexes on regulatory elements which then drive three-dimensional folding of the genome from the bottom up through local action and self-assembly into higher order levels of cell type-specific organization. In this model, no explicit epigenetic memory for cell type-specific chromosome folding is required.
Chromatin looping; Chromosome conformation capture; Chromosome folding; Epigenetic inheritance; Mitotic chromosome; Nucleus
Understanding the topological configurations of chromatin may reveal valuable insights into how the genome and epigenome act in concert to control cell fate during development. Here we generate high-resolution architecture maps across seven genomic loci in embryonic stem cells and neural progenitor cells. We observe a hierarchy of 3-D interactions that undergo marked reorganization at the sub-Mb scale during differentiation. Distinct combinations of CTCF, Mediator, and cohesin show widespread enrichment in looping interactions at different length scales. CTCF/cohesin anchor long-range constitutive interactions that form the topological basis for invariant sub-domains. Conversely, Mediator/cohesin together with pioneer factors bridge short-range enhancer-promoter interactions within and between larger sub-domains. Knockdown of Smc1 or Med12 in ES cells results in disruption of spatial architecture and down-regulation of genes found in cohesin-mediated interactions. We conclude that cell type-specific chromatin organization occurs at the sub-Mb scale and that architectural proteins shape the genome in hierarchical length scales.
Mitotic chromosomes are among the most recognizable structures in the cell, yet for over a century their internal organization has remained largely unsolved. We applied chromosome conformation capture methods, 5C and Hi-C, across the cell cycle and revealed two alternative three-dimensional folding states of the human genome. We show that the highly compartmentalized and cell-type-specific organization described previously for non-synchronous cells is restricted to interphase. In metaphase, we identify a homogenous folding state, which is locus-independent, common to all chromosomes, and consistent among cell types, suggesting a general principle of metaphase chromosome organization. Using polymer simulations, we find that metaphase Hi-C data is inconsistent with classic hierarchical models, and is instead best described by a linearly-organized longitudinally compressed array of consecutive chromatin loops.
Despite advances in DNA-sequencing technology, assembly of complex genomes remains a major challenge, particularly for genomes sequenced using short reads, which yield highly fragmented assemblies. Here we show that genome-wide in vivo chromatin interaction frequency data, which are measurable with chromosome conformation capture–based experiments, can be used as genomic distance proxies to accurately position individual contigs without requiring any sequence overlap. We also use these data to construct approximate genome scaffolds de novo. Applying our approach to incomplete regions of the human genome, we predict the positions of 65 previously unplaced contigs, in agreement with alternative methods in 26/31 cases attempted in common. Our approach can theoretically bridge any gap size and should be applicable to any species for which global chromatin interaction data can be generated.
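The idea of using interaction frequency as a proxy for genomic distance can be illustrated with a deliberately simple placement rule. The sketch below is not the method described in the abstract; it assumes that the contact counts between one unplaced contig and the ordered bins of each candidate chromosome are already available, and it places the contig at the bin with the highest locally smoothed count. The function name `place_contig`, the `profiles` dictionary layout, and the `window` parameter are hypothetical.

```python
def place_contig(profiles, window=1):
    """Pick the (chromosome, bin index) where a contig interacts most.

    profiles maps a chromosome name to a list of contact counts between the
    contig and that chromosome's bins, in genomic order. Counts are averaged
    over `window` bins on each side to damp single-bin noise.
    """
    best = None
    for chrom, counts in profiles.items():
        for i in range(len(counts)):
            lo = max(0, i - window)
            hi = min(len(counts), i + window + 1)
            smoothed = sum(counts[lo:hi]) / (hi - lo)
            if best is None or smoothed > best[0]:
                best = (smoothed, chrom, i)
    if best is None:
        raise ValueError("no bins supplied")
    return best[1], best[2]


if __name__ == "__main__":
    # Made-up contact counts for one contig against two chromosomes.
    toy_profiles = {
        "chrA": [1, 2, 1, 0],
        "chrB": [0, 3, 9, 4, 1],
    }
    print(place_contig(toy_profiles))
```

The rule relies on the general tendency of contact frequency to fall off with genomic distance, so the strongest signal is expected near the true location. A realistic approach would need to account for coverage biases and bin mappability, which this sketch ignores.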
Mammalian genomes encode genetic information in their linear sequence, but appropriate expression of their genes requires chromosomes to fold into complex three-dimensional structures. Transcriptional control involves the establishment of physical connections among genes and regulatory elements, both along and between chromosomes. Recent technological innovations in probing the folding of chromosomes are providing new insights into the spatial organization of genomes and its role in gene regulation. It is emerging that folding of large complex chromosomes involves a hierarchy of structures, from chromatin loops that connect genes and enhancers to larger chromosomal domains and nuclear compartments. The larger these structures are along this hierarchy, the more stable they are within cells, while becoming more stochastic between cells. Here, we review the experimental and theoretical data on this hierarchy of structures, and propose a key role for the recently discovered Topologically Associating Domains.
Spatial organization of chromatin plays an important role at multiple levels of genome regulation. On a global scale its function is evident in processes like metaphase and chromosome segregation. On a detailed level, long-range interactions between regulatory elements and promoters are essential for proper gene regulation. Microscopic techniques like FISH can detect chromatin contacts, although the resolution is generally low, making detection of enhancer-promoter interactions difficult. The 3C methodology allows for high-resolution analysis of chromatin interactions. 3C is now widely used and has revealed that long-range looping interactions between genomic elements are widespread. However, studying chromatin interactions in large genomic regions by 3C is very labor intensive. This limitation is overcome by the 5C technology. 5C is an adaptation of 3C, in which the concurrent use of thousands of primers permits the simultaneous detection of millions of chromatin contacts. The design of the 5C primers is critical, since this will determine which and how many chromatin interactions will be examined in the assay. Starting material for 5C is a 3C template. To make a 3C template, chromatin interactions in living cells are crosslinked using formaldehyde. Next, chromatin is digested and subsequently ligated under conditions favoring ligation events between crosslinked fragments. This yields a genome-wide 3C library of ligation products representing all chromatin interactions in vivo. 5C then employs multiplex ligation mediated amplification to detect, in a single assay, up to millions of unique ligation products present in the 3C library. The resulting 5C library can be analyzed by microarray analysis or deep sequencing. The observed abundance of a 5C product is a measure of the interaction frequency between the two corresponding chromatin fragments.
The power of the 5C technique described in this chapter is the high-throughput, high-resolution and quantitative way in which the spatial organization of chromatin can be examined.
Chromosome conformation capture; chromatin looping; chromatin structure; long-range gene regulation; high-throughput
How DNA is organized in three dimensions inside the cell nucleus and how that affects the ways in which cells access, read and interpret genetic information are among the longest standing questions in cell biology. Using newly developed molecular, genomic, and computational approaches based on the chromosome conformation capture technology (such as 3C, 4C, 5C and Hi-C) the spatial organization of genomes is being explored at unprecedented resolution. Interpreting the increasingly large chromatin interaction datasets is now posing novel challenges. Here we describe several types of statistical and computational approaches that have recently been developed to analyze chromatin interaction data.
Chromosome conformation capture; chromatin looping; long-range gene regulation; chromatin domains; 3D modeling; polymer physics; genomics; integrative modeling; topology; fractal globule
Chromosomes are folded in intricate ways inside cells and their spatial organization is intimately related to regulation of gene expression. Expression of genes can be controlled by regulatory elements that are located at large genomic distances from their target genes (in cis), or even on different chromosomes (in trans). Regulatory elements can act at large genomic distances by engaging in direct physical interactions with their target genes resulting in the formation of chromatin loops. Thus, genes and their regulatory elements come in close spatial proximity irrespective of their relative genomic positions. Analysis of interactions between genes and elements will reveal which elements regulate each gene, and will provide fundamental insights into the spatial organization of chromosomes in general.
Long-range cis- and trans- interactions can be studied at high resolution using the Chromosome Conformation Capture (3C) technology. 3C employs formaldehyde crosslinking to trap physical interactions between loci located throughout the genome. Crosslinked cells are then solubilized and chromatin is digested by a restriction enzyme. After digestion the chromatin is subjected to ligation under very dilute DNA concentrations. These conditions favor intramolecular ligation over intermolecular ligation, and thus result in selective ligation of interacting (and crosslinked) genomic elements. The crosslinks are reversed, the DNA is purified, and interaction frequencies between specific chromosomal loci can be determined by quantifying the amounts of corresponding ligation product that is formed using PCR. This chapter describes detailed protocols for 3C analysis of yeast Saccharomyces cerevisiae and mammalian chromosomes.
DNA; chromatin looping; long-range gene regulation; trans-regulation; formaldehyde crosslinking; spatial organization