Enhancers are traditionally viewed as DNA sequences located some distance from a promoter that act in cis and in an orientation-independent fashion to increase utilization of specific promoters and thereby regulate gene expression. Much progress has been made over the last decade toward understanding how these distant elements interact with target promoters, but how transcription is enhanced remains an object of active inquiry. Recent reports convey the prevalence and diversity of enhancer transcription and transcripts and support both as key factors with mechanistically distinct, but not mutually exclusive roles in enhancer function. Decoupling the causes and effects of transcription on the local chromatin landscape and understanding the role of enhancer transcripts in the context of long-range interactions are challenges that require additional attention. In this review we focus on the possible functions of enhancer transcription by highlighting several recent eRNA papers and, within the context of other enhancer studies, speculate on the role of enhancer transcription in regulating differential gene expression.
Enhancer; chromatin; noncoding RNA; transcription; mediator; cohesin; histone modifications; enhancer transcripts; eRNA
Noncoding RNA; long noncoding RNA; small RNA
The human genome contains numerous large tandem repeats, many of which remain poorly characterized. Here we report a novel transfer RNA (tRNA) tandem repeat on human chromosome 1q23.3 that shows extensive copy number variation with 9–43 repeat units per allele and displays evidence of meiotic and mitotic instability. Each repeat unit consists of a 7.3-kb GC-rich sequence that binds the insulator protein CTCF and bears the chromatin hallmarks of a bivalent domain in human embryonic stem cells. A tRNA-containing tandem repeat composed of at least three 7.6-kb GC-rich repeat units resides within a syntenic region of mouse chromosome 1. However, DNA sequence analysis reveals that, apart from the tRNA genes, which account for less than 6% of a repeat unit, the remaining 7.2 kb is not conserved, with the notable exception of a 24-base-pair sequence corresponding to the CTCF binding site, suggesting an important role for this protein at the locus.
Williams syndrome transcription factor (WSTF) is a multifaceted protein that is involved in several nuclear processes, including replication, transcription, and the DNA damage response. WSTF participates in a chromatin-remodeling complex with the ISWI ATPase, SNF2H, and is thought to contribute to the maintenance of heterochromatin, including at the human inactive X chromosome (Xi). WSTF is encoded by BAZ1B, one of twenty-eight genes that are hemizygously deleted in the genetic disorder Williams-Beuren syndrome (WBS).
To explore the function of WSTF, we performed zinc finger nuclease-assisted targeting of the BAZ1B gene and isolated several independent knockout clones in human cells. Our results show that, while heterochromatin at the Xi is unaltered, new inappropriate areas of heterochromatin spontaneously form and resolve throughout the nucleus, appearing as large DAPI-dense staining blocks, defined by histone H3 lysine-9 trimethylation and association of the proteins heterochromatin protein 1 and structural maintenance of chromosomes flexible hinge domain containing 1. In three independent mutants, the expression of a large number of genes was impacted, both up and down, by WSTF loss.
Given the inappropriate appearance of regions of heterochromatin in BAZ1B knockout cells, it is evident that WSTF performs a critical role in maintaining chromatin and transcriptional states, a property that is likely compromised by WSTF haploinsufficiency in WBS patients.
The human X-linked macrosatellite DXZ4 is a large tandem repeat located at Xq23 that is packaged into heterochromatin on the male X chromosome and female active X chromosome and, in response to X chromosome inactivation, is organized into euchromatin bound by the insulator protein CCCTC-binding factor (CTCF) on the inactive X chromosome (Xi). The purpose served by this unusual epigenetic regulation is unclear, but it suggests a Xi-specific gain of function for DXZ4. Other less extensive bands of euchromatin can be observed on the Xi, but the identity of the underlying DNA sequences is unknown. Here, we report the identification of two novel human X-linked tandem repeats, located 58 Mb proximal and 16 Mb distal to the macrosatellite DXZ4. Both tandem repeats are entirely contained within the transcriptional unit of novel spliced transcripts. Like DXZ4, the tandem repeats are packaged into Xi-specific CTCF-bound euchromatin. These sequences undergo frequent CTCF-dependent interactions with DXZ4 on the Xi, implicating DXZ4 as an epigenetically regulated Xi-specific structural element and providing the first putative functional attribute of a macrosatellite in the human genome.
Replicating the genome prior to each somatic cell division requires not only precise duplication of the genetic information but also accurate reestablishment of the epigenetic signatures that instruct how the genetic material is to be interpreted in the daughter cells. The mammalian inactive X chromosome (Xi), which is faithfully inherited in a silent state in each daughter cell, provides an excellent model of epigenetic regulation. While much is known about the early stages of X chromosome inactivation, much less is understood with regard to retaining the Xi chromatin through somatic cell division. Here we report that the WSTF-ISWI chromatin remodeling complex (WICH) associates with the Xi during late S-phase as the Xi DNA is replicated. Elevated levels of WICH at the Xi are restricted to late S-phase and appear before BRCA1 and γ-H2A.X. The sequential appearance of WICH and BRCA1/γ-H2A.X implicates each as performing important but distinct roles in the maturation and maintenance of heterochromatin at the Xi.
The X-linked macrosatellite DXZ4 is a large homogeneous tandem repeat that in females adopts an alternative chromatin organization on the primate X chromosome in response to X-chromosome inactivation. It is packaged into heterochromatin on the active X chromosome but into euchromatin and bound by the epigenetic organizer protein CTCF on the inactive X chromosome. Because its DNA sequence diverges rapidly beyond the New World monkeys, the existence of DXZ4 outside the primate lineage is unknown.
Here we extend our comparative genome analysis and report the identification and characterization of the mouse homolog of the macrosatellite. Furthermore, we provide evidence of DXZ4 in a conserved location downstream of the PLS3 gene in a diverse group of mammals, and reveal that DNA sequence conservation is restricted to the CTCF binding motif, supporting a central role for this protein at this locus. However, many features that characterize primate DXZ4 differ in mouse, including the overall size of the array, the mode of transcription, the chromatin organization, and the conservation of DNA sequence and length between adjacent repeat units. Ctcf binds Dxz4, but this binding is not exclusive to the inactive X chromosome, as evidenced by association in some males and equal binding to both X chromosomes in trophoblast stem cells.
Characterization of Dxz4 reveals substantial differences in the organization of DNA sequence, chromatin packaging, and the mode of transcription, so the potential roles performed by this sequence in mouse have probably diverged from those on the primate X chromosome.
Histone variants are non-allelic protein isoforms that play key roles in diversifying chromatin structure. The known number of such variants has greatly increased in recent years, but the lack of naming conventions for them has led to a variety of naming styles, multiple synonyms and misleading homographs that obscure variant relationships and complicate database searches. We propose here a unified nomenclature for variants of all five classes of histones that uses consistent but flexible naming conventions to produce names that are informative and readily searchable. The nomenclature builds on historical usage and incorporates phylogenetic relationships, which are strong predictors of structure and function. A key feature is the consistent use of punctuation to represent phylogenetic divergence, making explicit the relationships among variant subtypes that have previously been implicit or unclear. We recommend that by default new histone variants be named with organism-specific paralog-number suffixes that lack phylogenetic implication, while letter suffixes be reserved for structurally distinct clades of variants. For clarity and searchability, we encourage the use of descriptors that are separate from the phylogeny-based variant name to indicate developmental and other properties of variants that may be independent of structure.
DXZ4 is an X-linked macrosatellite composed of 12–100 tandemly arranged 3-kb repeat units. In females, it adopts opposite chromatin arrangements at the two alleles in response to X-chromosome inactivation. In males and on the active X chromosome, it is packaged into heterochromatin, but on the inactive X chromosome (Xi), it adopts a euchromatic conformation bound by CTCF. Here we report that the ubiquitous transcription factor YY1 associates with the euchromatic form of DXZ4 on the Xi. The binding of YY1 close to CTCF is reminiscent of that at other epigenetically regulated sequences, including sites of genomic imprinting, and at the X-inactivation centre, suggesting a common mode of action in this arrangement. As with CTCF, binding of YY1 to DXZ4 in vitro is not blocked by CpG methylation, yet in vivo both proteins are restricted to the hypomethylated form. In several male carcinoma cell lines, DXZ4 can adopt a Xi-like conformation in response to cellular transformation, characterized by CpG hypomethylation and binding of YY1 and CTCF. Analysis of a male melanoma cell line and normal skin cells from the same individual confirmed that a transition in chromatin state occurred in response to transformation.
Macrosatellites are some of the most polymorphic regions of the human genome, yet many remain uncharacterized despite the association of some arrays with disease susceptibility. This study sought to explore the polymorphic nature of the X-linked macrosatellite DXZ4. Four aspects of DXZ4 were explored in detail, including tandem repeat copy number variation, array instability, monomer sequence polymorphism and array expression. DXZ4 arrays contained between 12 and 100 3.0-kb repeat units with an average array containing 57. Monomers were confirmed to be arranged in uninterrupted tandem arrays by restriction digest analysis and extended fiber FISH, and therefore DXZ4 encompasses 36–288 kb of Xq23. Transmission of DXZ4 through three generations in three families displayed a high degree of meiotic instability (8.3%), consistent with other macrosatellite arrays, further highlighting the unstable nature of these sequences in the human genome. Subcloning and sequencing of complete DXZ4 monomers identified numerous single nucleotide polymorphisms and alleles for the three microsatellite repeats located within each monomer. Pairwise comparisons of DXZ4 monomer sequences revealed that repeat units from an array are more similar to one another than those originating from different arrays. RNA fluorescence in situ hybridization revealed significant variation in DXZ4 expression both within and between cell lines. DXZ4 transcripts could be detected originating from both the active and inactive X chromosome. Expression levels of DXZ4 varied significantly between males, but did not relate to the size of the array, nor did inheritance of the same array result in similar expression levels. Collectively, these studies provide considerable insight into the polymorphic nature of DXZ4, further highlighting the instability and variation potential of macrosatellites in the human genome.
Comparative sequence analysis is a powerful means with which to identify functionally relevant non-coding DNA elements through conserved nucleotide sequence. The macrosatellite DXZ4 is a polymorphic, uninterrupted, tandem array of 3-kb repeat units located exclusively on the human X chromosome. While not obviously protein coding, its chromatin organization suggests differing roles for the array on the active and inactive X chromosomes.
In order to identify important elements within DXZ4, we explored preservation of DNA sequence and chromatin conformation of the macrosatellite in primates. We found that DXZ4 DNA sequence conservation beyond New World monkeys is limited to the promoter and CTCF binding site, although DXZ4 remains a GC-rich tandem array. Investigation of chromatin organization in macaques revealed that DXZ4 in males and on the active X chromosome is packaged into heterochromatin, whereas on the inactive X, DXZ4 was euchromatic and bound by CTCF.
Collectively, these data suggest an important conserved role for DXZ4 on the X chromosome involving expression, CTCF binding and tandem organization.
Macrosatellites are some of the largest variable number tandem repeats in the human genome, but what role these unusual sequences perform is unknown. Their importance to human health is clearly demonstrated by the 4q35 macrosatellite D4Z4 that is associated with the onset of the muscle degenerative disease facioscapulohumeral muscular dystrophy. Nevertheless, many other macrosatellite arrays in the human genome remain poorly characterized.
Here we describe the organization, tandem repeat copy number variation, transmission stability and expression of four macrosatellite arrays in the human genome: the TAF11-Like array located on chromosome 5p15.1, the SST1 arrays on 4q28.3 and 19q13.12, the PRR20 array located on chromosome 13q21.1, and the ZAV array at 9q32. All are polymorphic macrosatellite arrays that, at least for TAF11-Like and SST1, show evidence of meiotic instability. With the exception of the SST1 array, which is ubiquitously expressed, all are expressed at high levels in the testis and to a lesser extent in the brain.
Our results extend the number of characterized macrosatellite arrays in the human genome and provide the foundation for formulation of hypotheses to begin assessing their functional role in the human genome.
Chromosomal replication results in the duplication not only of DNA sequence but also of the patterns of histone modification, DNA methylation, and nucleoprotein structure that constitute epigenetic information. Pericentromeric heterochromatin in human cells is characterized by unique patterns of histone and DNA modification. Here, we describe association of the Mi-2/NuRD complex with specific segments of pericentromeric heterochromatin consisting of Satellite II DNA located on human chromosomes 1, 9 and 16 in some, but not all, cell types. This association is linked in part to DNA replication and chromatin assembly, and may suggest a role in these processes. Mi-2/NuRD accumulation is independent of Polycomb association and is characterized by a unique pattern of histone modification. We propose that Mi-2/NuRD constitutes an enzymatic component of a pathway for assembly and maturation of chromatin utilized by rapidly proliferating lymphoid cells for replication of constitutive heterochromatin.
Knockdown of the insulator factor CCCTC binding factor (CTCF), which binds XL9, an intergenic element located between HLA-DRB1 and HLA-DQA1, was found to diminish expression of these genes. The mechanism involved interactions between CTCF and class II transactivator (CIITA), the master regulator of major histocompatibility complex class II (MHC-II) gene expression, and the formation of long-distance chromatin loops between XL9 and the proximal promoter regions of these MHC-II genes. The interactions were inducible and dependent on the activity of CIITA, regulatory factor X, and CTCF. RNA fluorescence in situ hybridizations show that both genes can be expressed simultaneously from the same chromosome. Collectively, the results suggest a model whereby both HLA-DRB1 and HLA-DQA1 loci can interact simultaneously with XL9, and describe a new regulatory mechanism for these MHC-II genes involving the alteration of the general chromatin conformation of the region and their regulation by CTCF.
One of several features acquired by chromatin of the inactive X chromosome (Xi) is enrichment for the core histone H2A variant macroH2A within a distinct nuclear structure referred to as a macrochromatin body (MCB). In addition to localizing to the MCB, macroH2A accumulates at a perinuclear structure centered at the centrosome. To better understand the association of macroH2A1 with the centrosome and the formation of an MCB, we investigated the distribution of macroH2A1 throughout the somatic cell cycle. Unlike Xi-specific RNA, which associates with the Xi throughout interphase, the appearance of an MCB is predominantly a feature of S phase. Although the MCB dissipates during late S phase and G2 before reforming in late G1, macroH2A1 remains associated during mitosis with specific regions of the Xi, including at the X inactivation center. This association yields a distinct macroH2A banding pattern that overlaps with the site of histone H3 lysine-4 methylation centered at the DXZ4 locus in Xq24. The centrosomal pool of macroH2A1 accumulates in the presence of an inhibitor of the 20S proteasome. Therefore, targeting of macroH2A1 to the centrosome is likely part of a degradation pathway, a mechanism common to a variety of other chromatin proteins.
XIST; macroH2A; chromatin; centrosome; aggresome
Chromatin on the mammalian inactive X chromosome differs in a number of ways from that on the active X. One protein, macroH2A, whose amino terminus is closely related to histone H2A, is enriched on the heterochromatic inactive X chromosome in female cells. Here, we report the identification and localization of a novel and more distant histone variant, designated H2A-Bbd, that is only 48% identical to histone H2A. In both interphase and metaphase female cells, using either a myc epitope–tagged or green fluorescent protein–tagged H2A-Bbd construct, the inactive X chromosome is markedly deficient in H2A-Bbd staining, while the active X and the autosomes stain throughout. In double-labeling experiments, antibodies to acetylated histone H4 show a pattern of staining indistinguishable from H2A-Bbd in interphase nuclei and on metaphase chromosomes. Chromatin fractionation demonstrates association of H2A-Bbd with the histone proteins. Separation of micrococcal nuclease–digested chromatin by sucrose gradient ultracentrifugation shows cofractionation of H2A-Bbd with nucleosomes, supporting the idea that H2A-Bbd is incorporated into nucleosomes as a substitute for the core histone H2A. This finding, in combination with the overlap with acetylated forms of H4, raises the possibility that H2A-Bbd is enriched in nucleosomes associated with transcriptionally active regions of the genome. The distribution of H2A-Bbd thus distinguishes chromatin on the active and inactive X chromosomes.
histones; X chromosome inactivation; euchromatin; histone H4 acetylation; macroH2A
Chromatin on the inactive X chromosome (Xi) of female mammals is enriched for the histone variant macroH2A that can be detected at interphase as a distinct nuclear structure referred to as a macrochromatin body (MCB). Green fluorescent protein-tagged and Myc epitope-tagged macroH2A readily form an MCB in the nuclei of transfected female, but not male, cells. Using targeted disruptions, we have identified two macrochromatin domains within macroH2A that are independently capable of MCB formation and association with the Xi. Complete removal of the non-histone C-terminal tail does not reduce the efficiency of association of the variant histone domain of macroH2A with the Xi, indicating that the histone portion alone can target the Xi. The non-histone domain by itself is incapable of MCB formation. However, when directed to the nucleosome by fusion to core histone H2A or H2B, the non-histone tail forms an MCB that appears identical to that of the endogenous protein. Mutagenesis of the non-histone portion of macroH2A localized the region required for MCB formation and targeting to the Xi to an ∼190 amino