1.  Molecular Characterization and Chromosomal Distribution of a Species-Specific Transcribed Centromeric Satellite Repeat from the Olive Fruit Fly, Bactrocera oleae 
PLoS ONE  2013;8(11):e79393.
Satellite repetitive sequences that accumulate in the heterochromatin constitute a large fraction of a genome and, due to their properties, are suggested to be implicated in centromere function. Current knowledge of the heterochromatic regions of the genome of Bactrocera oleae, the major pest of the olive tree, is practically nonexistent. In our effort to explore the repetitive DNA portion of the B. oleae genome, a novel satellite sequence designated BoR300 was isolated and cloned. The present study describes the genomic organization, abundance and chromosomal distribution of BoR300, which is organized in tandem, forming arrays of 298 bp-long monomers. Sequence analysis showed an AT content of 60.4%, a CENP-B-like motif and a high curvature value based on predictive models. Comparative analysis among randomly selected monomers demonstrated a high degree of sequence homogeneity (88–97%) of BoR300 repeats, which are present at approximately 3,000 copies per haploid genome, accounting for about 0.28% of the total genomic DNA, based on two independent qPCR approaches. In addition, expression of the repeat was confirmed through RT-PCR, by which BoR300 transcripts were detected in both sexes. Fluorescence in situ hybridization (FISH) of BoR300 on mitotic metaphases and polytene chromosomes revealed signals at the centromeres of two of the six chromosomes, indicating a chromosome-specific centromeric localization. Moreover, BoR300 is not conserved in the closely related Bactrocera species tested and is also absent in other dipterans; rather, it is restricted to the B. oleae genome. This species-specificity makes the BoR300 satellite a good candidate as a probe for identifying the insect among its relatives at early developmental stages.
PMCID: PMC3828357  PMID: 24244494
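The abundance figures in the abstract above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' method: the function names are hypothetical, the toy sequence is invented for illustration, and the genome size it derives is only what the rounded numbers in the abstract (3,000 copies, 298 bp, 0.28%) imply, not a value reported in the abstract.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    at_bases = sum(1 for base in seq if base in "AT")
    return 100.0 * at_bases / len(seq)


def satellite_span_bp(copies: int, monomer_bp: int) -> int:
    """Total length occupied by a tandem repeat (copies x monomer length)."""
    return copies * monomer_bp


def implied_genome_size_bp(span_bp: int, fraction_percent: float) -> float:
    """Genome size implied if span_bp makes up fraction_percent of the genome."""
    return span_bp / (fraction_percent / 100.0)


# Figures taken from the abstract (rounded values).
span = satellite_span_bp(copies=3000, monomer_bp=298)
genome = implied_genome_size_bp(span, fraction_percent=0.28)

# Toy sequence, invented purely to exercise at_content().
toy_at = at_content("AATTGC")

print(f"BoR300 span: {span / 1e3:.0f} kb")
print(f"Implied haploid genome size: {genome / 1e6:.0f} Mb")
print(f"AT content of toy sequence: {toy_at:.1f}%")
```

Because the inputs are rounded, the implied genome size should be read as an order-of-magnitude consistency check only.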
2.  Loss of WSTF results in spontaneous fluctuations of heterochromatin formation and resolution, combined with substantial changes to gene expression 
BMC Genomics  2013;14:740.
Williams syndrome transcription factor (WSTF) is a multifaceted protein that is involved in several nuclear processes, including replication, transcription, and the DNA damage response. WSTF participates in a chromatin-remodeling complex with the ISWI ATPase, SNF2H, and is thought to contribute to the maintenance of heterochromatin, including at the human inactive X chromosome (Xi). WSTF is encoded by BAZ1B, and is one of twenty-eight genes that are hemizygously deleted in the genetic disorder Williams-Beuren syndrome (WBS).
To explore the function of WSTF, we performed zinc finger nuclease-assisted targeting of the BAZ1B gene and isolated several independent knockout clones in human cells. Our results show that, while heterochromatin at the Xi is unaltered, new inappropriate areas of heterochromatin spontaneously form and resolve throughout the nucleus, appearing as large DAPI-dense staining blocks, defined by histone H3 lysine-9 trimethylation and association of the proteins heterochromatin protein 1 and structural maintenance of chromosomes flexible hinge domain containing 1. In three independent mutants, the expression of a large number of genes was impacted, both up and down, by WSTF loss.
Given the inappropriate appearance of regions of heterochromatin in BAZ1B knockout cells, it is evident that WSTF performs a critical role in maintaining chromatin and transcriptional states, a property that is likely compromised by WSTF haploinsufficiency in WBS patients.
PMCID: PMC3870985  PMID: 24168170
3.  The macrosatellite DXZ4 mediates CTCF-dependent long-range intrachromosomal interactions on the human inactive X chromosome 
Human Molecular Genetics  2012;21(20):4367-4377.
The human X-linked macrosatellite DXZ4 is a large tandem repeat located at Xq23 that is packaged into heterochromatin on the male X chromosome and female active X chromosome and, in response to X chromosome inactivation, is organized into euchromatin bound by the insulator protein CCCTC-binding factor (CTCF) on the inactive X chromosome (Xi). The purpose served by this unusual epigenetic regulation is unclear, but suggests an Xi-specific gain of function for DXZ4. Other less extensive bands of euchromatin can be observed on the Xi, but the identity of the underlying DNA sequences is unknown. Here, we report the identification of two novel human X-linked tandem repeats, located 58 Mb proximal and 16 Mb distal to the macrosatellite DXZ4. Both tandem repeats are entirely contained within the transcriptional unit of novel spliced transcripts. Like DXZ4, the tandem repeats are packaged into Xi-specific CTCF-bound euchromatin. These sequences undergo frequent CTCF-dependent interactions with DXZ4 on the Xi, implicating DXZ4 as an epigenetically regulated Xi-specific structural element and providing the first putative functional attribute of a macrosatellite in the human genome.
PMCID: PMC3459461  PMID: 22791747
4.  Multiple Protein Domains Contribute to Nuclear Import and Cell Toxicity of DUX4, a Candidate Pathogenic Protein for Facioscapulohumeral Muscular Dystrophy 
PLoS ONE  2013;8(10):e75614.
DUX4 (Double Homeobox Protein 4) is a nuclear transcription factor encoded at each D4Z4 unit of a tandem-repeat array at human chromosome 4q35. DUX4 constitutes a major candidate pathogenic protein for facioscapulohumeral muscular dystrophy (FSHD), the third most common form of inherited myopathy. Low-level expression of DUX4 compromises cell differentiation in myoblasts, and its overexpression induces apoptosis in cultured cells and living organisms. In this work we explore potential molecular determinants of DUX4 mediating nuclear import and cell toxicity. Deletion of the hypothetical monopartite nuclear localization sequences RRRR23, RRKR98 and RRAR148 (i.e. NLS1, NLS2 and NLS3, respectively) only partially delocalizes DUX4 from the cell nuclei. Nuclear entry guided by NLS1, NLS2 and NLS3 does not follow the classical nuclear import pathway mediated by α/β importins. NLS and homeodomain mutants of DUX4 are dramatically less cell-toxic than the wild-type molecule, independently of their subcellular localization. A triple ΔNLS1-2-3 deletion mutant is still partially localized in the nuclei, indicating that additional sequences in DUX4 contribute to nuclear import. Deletion of ≥111 amino acids from the C-terminus of DUX4, on a ΔNLS1-2-3 background, almost completely re-localizes DUX4 to the cytoplasm, indicating that the C-terminal tail contributes to subcellular trafficking of DUX4. Also, C-terminal deletion mutants of DUX4 on an NLS wild-type background are less toxic than wild-type DUX4. Results reported here indicate that DUX4 possesses redundant mechanisms to ensure nuclear entry and that its various transcription-factor-associated domains play an essential role in cell toxicity.
PMCID: PMC3792938  PMID: 24116060
5.  Inactive DNMT3B Splice Variants Modulate De Novo DNA Methylation 
PLoS ONE  2013;8(7):e69486.
Inactive DNA methyltransferase (DNMT) 3B splice isoforms are associated with changes in DNA methylation, yet the mechanisms by which they act remain largely unknown. Using biochemical and cell culture assays, we show here that the inactive DNMT3B3 and DNMT3B4 isoforms bind to and regulate the activity of catalytically competent DNMT3A or DNMT3B molecules. DNMT3B3 modestly stimulated the de novo methylation activity of DNMT3A and also counteracted the stimulatory effects of DNMT3L, therefore leading to subtle and contrasting effects on activity. DNMT3B4, by contrast, significantly inhibited de novo DNA methylation by active DNMT3 molecules, most likely due to its ability to reduce the DNA binding affinity of co-complexes, thereby sequestering them away from their substrate. Immunocytochemistry experiments revealed that, in addition to their effects on the intrinsic catalytic function of active DNMT3 enzymes, DNMT3B3 and DNMT3B4 drive distinct types of chromatin compaction and patterns of histone 3 lysine 9 tri-methylation (H3K9me3) deposition. Our findings suggest that regulation of active DNMT3 members through the formation of co-complexes with inactive DNMT3 variants is a general mechanism by which DNMT3 variants function. This may account for some of the changes in DNA methylation patterns observed during development and disease.
PMCID: PMC3716610  PMID: 23894490
6.  Elevated Expression of H19 and Igf2 in the Female Mouse Eye 
PLoS ONE  2013;8(2):e56611.
The catalogue of genes expressed at different levels in the two sexes is growing, and the mechanisms underlying sex differences in regulation of the mammalian transcriptomes are being explored. Here we report that the expression of the imprinted non-protein-coding maternally expressed gene H19 was female-biased specifically in the mouse eye (1.9-fold, p = 3.0E−6) while not being sex-biased in other somatic tissues. The female-to-male expression fold-change of H19 fell in the range expected from an effect of biallelic versus monoallelic expression. Recently, the possibility of sex-specific parent-of-origin allelic expression has been debated. This led us to hypothesize that H19 might be expressed biallelically in the female mouse eye, thus escaping its silencing imprint on the paternal allele specifically in this tissue. We therefore performed a sex-specific imprinting assay of H19 in female and male eyes derived from a cross between Mus musculus and Mus spretus. However, this analysis demonstrated that H19 was exclusively expressed from the maternal gene copy, disproving the escape hypothesis. Instead, this supports the view that the female-biased expression of H19 is the result of upregulation of the single maternal gene copy. Furthermore, if H19 had been expressed from both gene copies in the female eye, an associated downregulation of Insulin-like growth factor 2 (Igf2) would have been expected, since H19 and Igf2 compete for a common enhancer element located in the H19/Igf2 imprinted domain. On the contrary, we found that Igf2 was also significantly upregulated in the female eye (1.2-fold, p = 6.1E−3), in further agreement with the conclusion that H19 is monoallelically elevated in females. The female-biased expression of H19 and Igf2 specifically in the eye may contribute to our understanding of sex differences in normal as well as abnormal eye physiology and processes.
PMCID: PMC3577879  PMID: 23437185
7.  B-Chromosome Ribosomal DNA Is Functional in the Grasshopper Eyprepocnemis plorans 
PLoS ONE  2012;7(5):e36600.
B-chromosomes are frequently argued to be genetically inert elements, but activity for some particular genes has been reported, especially for ribosomal RNA (rRNA) genes whose expression can easily be detected at the cytological level by the visualization of their phenotypic expression, i.e., the nucleolus. The B24 chromosome in the grasshopper Eyprepocnemis plorans frequently shows a nucleolus attached to it during meiotic prophase I. Here we show the presence of rRNA transcripts that unequivocally came from the B24 chromosome. To detect these transcripts, we designed primers specifically anchoring at the ITS-2 region, so that the reverse primer was complementary to the B chromosome DNA sequence including a differential adenine insertion being absent in the ITS2 of A chromosomes. PCR analysis carried out on genomic DNA showed amplification in B-carrying males but not in B-lacking ones. PCR analyses performed on complementary DNA showed amplification in about half of B-carrying males. Joint cytological and molecular analysis performed on 34 B-carrying males showed a close correspondence between the presence of B-specific transcripts and of nucleoli attached to the B chromosome. In addition, the molecular analysis revealed activity of the B chromosome rDNA in 10 out of the 13 B-carrying females analysed. Our results suggest that the nucleoli attached to B chromosomes are actively formed by expression of the rDNA carried by them, and not by recruitment of nucleolar materials formed in A chromosome nucleolar organizing regions. Therefore, B-chromosome rDNA in E. plorans is functional since it is actively transcribed to form the nucleolus attached to the B chromosome. This demonstrates that some heterochromatic B chromosomes can harbour functional genes.
PMCID: PMC3343036  PMID: 22570730
8.  Late Replication Domains in Polytene and Non-Polytene Cells of Drosophila melanogaster 
PLoS ONE  2012;7(1):e30035.
In D. melanogaster polytene chromosomes, intercalary heterochromatin (IH) appears as large dense bands scattered in euchromatin and comprises clusters of repressed genes. IH displays distinctly low gene density, indicative of its particular regulation. Genes embedded in IH replicate late in the S phase and become underreplicated. We asked whether localization and organization of these late-replicating domains is conserved in a distinct cell type. Using published comprehensive genome-wide chromatin annotation datasets (modENCODE and others), we compared IH organization in salivary gland cells and in a Kc cell line. We first established the borders of 60 IH regions on a molecular map; these regions contain underreplicated material and encompass ∼12% of the Drosophila genome. We showed that in Kc cells repressed chromatin constituted 97% of the sequences that corresponded to IH bands. This chromatin is depleted for ORC-2 binding and largely replicates late. Differences in replication timing between the cell types analyzed are local and affect only sub-regions but never whole IH bands. As a rule, such differentially replicating sub-regions display open chromatin organization, which apparently results from cell-type-specific expression of the underlying genes. We conclude that repressed chromatin organization of IH is generally conserved in polytene and non-polytene cells. Yet IH domains do not function as transcription- and replication-regulatory units, because differences in transcription and replication between cell types are not domain-wide; rather, they are restricted to small “islands” embedded in these domains. IH regions can thus be defined as a special class of domains with low gene density, which have narrow temporal expression patterns and so display relatively conserved organization.
PMCID: PMC3254639  PMID: 22253867
9.  YY1 associates with the macrosatellite DXZ4 on the inactive X chromosome and binds with CTCF to a hypomethylated form in some male carcinomas 
Nucleic Acids Research  2011;40(4):1596-1608.
DXZ4 is an X-linked macrosatellite composed of 12–100 tandemly arranged 3-kb repeat units. In females, it adopts opposite chromatin arrangements at the two alleles in response to X-chromosome inactivation. In males and on the active X chromosome, it is packaged into heterochromatin, but on the inactive X chromosome (Xi), it adopts a euchromatic conformation bound by CTCF. Here we report that the ubiquitous transcription factor YY1 associates with the euchromatic form of DXZ4 on the Xi. The binding of YY1 close to CTCF is reminiscent of that at other epigenetically regulated sequences, including sites of genomic imprinting, and at the X-inactivation centre, suggesting a common mode of action in this arrangement. As with CTCF, binding of YY1 to DXZ4 in vitro is not blocked by CpG methylation, yet in vivo both proteins are restricted to the hypomethylated form. In several male carcinoma cell lines, DXZ4 can adopt a Xi-like conformation in response to cellular transformation, characterized by CpG hypomethylation and binding of YY1 and CTCF. Analysis of a male melanoma cell line and normal skin cells from the same individual confirmed that a transition in chromatin state occurred in response to transformation.
PMCID: PMC3287207  PMID: 22064860
10.  Obesity Risk Gene TMEM18 Encodes a Sequence-Specific DNA-Binding Protein 
PLoS ONE  2011;6(9):e25317.
Transmembrane protein 18 (TMEM18) has previously been connected to cell migration and obesity. However, the molecular function of the protein has not yet been described. Here we show that TMEM18 localises to the nuclear membrane and binds to DNA in a sequence-specific manner. The protein binds DNA with its positively charged C-terminus, which also contains a nuclear localisation signal. An increase in the amount of TMEM18 in cells suppresses expression from a reporter vector with the TMEM18 target sequence. TMEM18 is a small protein of 140 residues and is predicted to be mostly alpha-helical with three transmembrane parts. As a consequence, DNA binding by TMEM18 would bring the chromatin very near to the nuclear membrane. We speculate that this close perinuclear localisation of TMEM18-bound DNA might repress transcription from it.
PMCID: PMC3182218  PMID: 21980424
11.  Variation in Array Size, Monomer Composition and Expression of the Macrosatellite DXZ4 
PLoS ONE  2011;6(4):e18969.
Macrosatellites are some of the most polymorphic regions of the human genome, yet many remain uncharacterized despite the association of some arrays with disease susceptibility. This study sought to explore the polymorphic nature of the X-linked macrosatellite DXZ4. Four aspects of DXZ4 were explored in detail, including tandem repeat copy number variation, array instability, monomer sequence polymorphism and array expression. DXZ4 arrays contained between 12 and 100 3.0 kb repeat units with an average array containing 57. Monomers were confirmed to be arranged in uninterrupted tandem arrays by restriction digest analysis and extended fiber FISH, and therefore DXZ4 encompasses 36–288 kb of Xq23. Transmission of DXZ4 through three generations in three families displayed a high degree of meiotic instability (8.3%), consistent with other macrosatellite arrays, further highlighting the unstable nature of these sequences in the human genome. Subcloning and sequencing of complete DXZ4 monomers identified numerous single nucleotide polymorphisms and alleles for the three microsatellite repeats located within each monomer. Pairwise comparisons of DXZ4 monomer sequences revealed that repeat units from an array are more similar to one another than those originating from different arrays. RNA fluorescence in situ hybridization revealed significant variation in DXZ4 expression both within and between cell lines. DXZ4 transcripts could be detected originating from both the active and inactive X chromosome. Expression levels of DXZ4 varied significantly between males, but did not relate to the size of the array, nor did inheritance of the same array result in similar expression levels. Collectively, these studies provide considerable insight into the polymorphic nature of DXZ4, further highlighting the instability and variation potential of macrosatellites in the human genome.
PMCID: PMC3081327  PMID: 21544201
12.  Characterization of DXZ4 conservation in primates implies important functional roles for CTCF binding, array expression and tandem repeat organization on the X chromosome 
Genome Biology  2011;12(4):R37.
Comparative sequence analysis is a powerful means with which to identify functionally relevant non-coding DNA elements through conserved nucleotide sequence. The macrosatellite DXZ4 is a polymorphic, uninterrupted, tandem array of 3-kb repeat units located exclusively on the human X chromosome. While not obviously protein coding, its chromatin organization suggests differing roles for the array on the active and inactive X chromosomes.
In order to identify important elements within DXZ4, we explored preservation of DNA sequence and chromatin conformation of the macrosatellite in primates. We found that DXZ4 DNA sequence conservation beyond New World monkeys is limited to the promoter and CTCF binding site, although DXZ4 remains a GC-rich tandem array. Investigation of chromatin organization in macaques revealed that DXZ4 in males and on the active X chromosome is packaged into heterochromatin, whereas on the inactive X, DXZ4 was euchromatic and bound by CTCF.
Collectively, these data suggest an important conserved role for DXZ4 on the X chromosome involving expression, CTCF binding and tandem organization.
PMCID: PMC3218863  PMID: 21489251
13.  Genome-Wide Analysis of the Chromatin Composition of Histone H2A and H3 Variants in Mouse Embryonic Stem Cells 
PLoS ONE  2014;9(3):e92689.
Genome-wide distribution of the majority of H2A and H3 variants (H2A, H2A.X, H2A.Z, macroH2A, H3.1, H3.2 and H3.3) was simultaneously investigated in mouse embryonic stem cells by chromatin immunoprecipitation sequencing. Around the transcription start site, histone variant distribution differed between genes possessing promoters of high and low CpG density, regardless of their expression levels. In the intergenic regions, regulatory elements were enriched in H2A.Z and H3.3, whereas repeat elements were abundant in H2A and macroH2A, and H3.1, respectively. Analysis of H2A and H3 variant combinations composing nucleosomes revealed that the H2A.Z and H3.3 combinations were present at a higher frequency throughout the genome than the other combinations, suggesting that H2A.Z and H3.3 associate preferentially with each other to form nucleosomes independently of genome region. Finally, we found that chromatin was unstable only in regions where it was enriched in both H2A.Z and H3.3, but was markedly stable in regions in which only H3.3 was abundant. Therefore, histone variant composition is an important determinant of chromatin structure, which is associated with specific genomic functions.
PMCID: PMC3962432  PMID: 24658136
14.  Expression, tandem repeat copy number variation and stability of four macrosatellite arrays in the human genome 
BMC Genomics  2010;11:632.
Macrosatellites are some of the largest variable number tandem repeats in the human genome, but what role these unusual sequences perform is unknown. Their importance to human health is clearly demonstrated by the 4q35 macrosatellite D4Z4 that is associated with the onset of the muscle degenerative disease facioscapulohumeral muscular dystrophy. Nevertheless, many other macrosatellite arrays in the human genome remain poorly characterized.
Here we describe the organization, tandem repeat copy number variation, transmission stability and expression of four macrosatellite arrays in the human genome: the TAF11-Like array located on chromosome 5p15.1, the SST1 arrays on 4q28.3 and 19q13.12, the PRR20 array located on chromosome 13q21.1, and the ZAV array at 9q32. All are polymorphic macrosatellite arrays that, at least for TAF11-Like and SST1, show evidence of meiotic instability. With the exception of the SST1 array, which is ubiquitously expressed, all are expressed at high levels in the testis and to a lesser extent in the brain.
Our results extend the number of characterized macrosatellite arrays in the human genome and provide the foundation for formulation of hypotheses to begin assessing their functional role in the human genome.
PMCID: PMC3018141  PMID: 21078170
15.  An Improved Canine Genome and a Comprehensive Catalogue of Coding Genes and Non-Coding Transcripts 
PLoS ONE  2014;9(3):e91172.
The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant, particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 Mb of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ∼175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ∼3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ∼20,700 high-confidence protein coding loci, we found ∼4,600 antisense transcripts overlapping exons of protein coding genes, ∼7,200 intergenic multi-exon transcripts without coding potential, which are likely candidates for long intergenic non-coding RNAs (lincRNAs), and ∼11,000 transcripts that were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts.
PMCID: PMC3953330  PMID: 24625832
16.  The Mi-2/NuRD complex associates with pericentromeric heterochromatin during S phase in rapidly proliferating lymphoid cells 
Chromosoma  2009;118(4):445-457.
Chromosomal replication results in the duplication not only of DNA sequence but also of the patterns of histone modification, DNA methylation, and nucleoprotein structure that constitute epigenetic information. Pericentromeric heterochromatin in human cells is characterized by unique patterns of histone and DNA modification. Here, we describe association of the Mi-2/NuRD complex with specific segments of pericentromeric heterochromatin consisting of Satellite II DNA located on human chromosomes 1, 9 and 16 in some, but not all cell types. This association is linked in part to DNA replication and chromatin assembly, and may suggest a role in these processes. Mi-2/NuRD accumulation is independent of Polycomb association and is characterized by a unique pattern of histone modification. We propose that Mi-2/NuRD constitutes an enzymatic component of a pathway for assembly and maturation of chromatin utilized by rapidly proliferating lymphoid cells for replication of constitutive heterochromatin.
PMCID: PMC2808998  PMID: 19296121
17.  The insulator factor CTCF controls MHC class II gene expression and is required for the formation of long-distance chromatin interactions 
Knockdown of the insulator factor CCCTC binding factor (CTCF), which binds XL9, an intergenic element located between HLA-DRB1 and HLA-DQA1, was found to diminish expression of these genes. The mechanism involved interactions between CTCF and class II transactivator (CIITA), the master regulator of major histocompatibility complex class II (MHC-II) gene expression, and the formation of long-distance chromatin loops between XL9 and the proximal promoter regions of these MHC-II genes. The interactions were inducible and dependent on the activity of CIITA, regulatory factor X, and CTCF. RNA fluorescence in situ hybridizations show that both genes can be expressed simultaneously from the same chromosome. Collectively, the results suggest a model whereby both HLA-DRB1 and HLA-DQA1 loci can interact simultaneously with XL9, and describe a new regulatory mechanism for these MHC-II genes involving the alteration of the general chromatin conformation of the region and their regulation by CTCF.
PMCID: PMC2292219  PMID: 18347100
18.  Epigenetic Control of SPI1 Gene by CTCF and ISWI ATPase SMARCA5 
PLoS ONE  2014;9(2):e87448.
CCCTC-binding factor (CTCF) can both activate and inhibit transcription by forming chromatin loops between regulatory regions and promoters. In this regard, Ctcf binding on non-methylated DNA and its interaction with the Cohesin complex result in differential regulation of the H19/Igf2 locus. Similarly, a role for CTCF has been established in normal hematopoietic development; however, its involvement in leukemia remains elusive. Here, we show that Ctcf binds to the imprinting control region (ICR) of H19/Igf2 in AML blasts. We also demonstrate that Smarca5, which also associates with the Cohesin complex, facilitates Ctcf binding to its target sites on DNA. Furthermore, Smarca5 supports Ctcf functionally and is needed for the enhancer-blocking effect at the ICR. We next asked whether CTCF and SMARCA5 control the expression of key hematopoiesis regulators. In normally differentiating myeloid cells, both CTCF and SMARCA5, together with members of the Cohesin complex, are recruited to the SPI1 gene, a key hematopoiesis regulator and leukemia suppressor. Due to DNA methylation, CTCF binding to the SPI1 gene is blocked in AML blasts. Upon AZA-mediated DNA demethylation of human AML blasts, CTCF and SMARCA5 are recruited to the −14.4 Enhancer of the SPI1 gene and block its expression. Our data provide new insight into complex SPI1 gene regulation, now involving additional key epigenetic factors, CTCF and SMARCA5, that control PU.1 expression at the −14.4 Enhancer.
PMCID: PMC3911986  PMID: 24498324
19.  Cell cycle–dependent localization of macroH2A in chromatin of the inactive X chromosome 
The Journal of Cell Biology  2002;157(7):1113-1123.
One of several features acquired by chromatin of the inactive X chromosome (Xi) is enrichment for the core histone H2A variant macroH2A within a distinct nuclear structure referred to as a macrochromatin body (MCB). In addition to localizing to the MCB, macroH2A accumulates at a perinuclear structure centered at the centrosome. To better understand the association of macroH2A1 with the centrosome and the formation of an MCB, we investigated the distribution of macroH2A1 throughout the somatic cell cycle. Unlike Xi-specific RNA, which associates with the Xi throughout interphase, the appearance of an MCB is predominantly a feature of S phase. Although the MCB dissipates during late S phase and G2 before reforming in late G1, macroH2A1 remains associated during mitosis with specific regions of the Xi, including at the X inactivation center. This association yields a distinct macroH2A banding pattern that overlaps with the site of histone H3 lysine-4 methylation centered at the DXZ4 locus in Xq24. The centrosomal pool of macroH2A1 accumulates in the presence of an inhibitor of the 20S proteasome. Therefore, targeting of macroH2A1 to the centrosome is likely part of a degradation pathway, a mechanism common to a variety of other chromatin proteins.
PMCID: PMC2173542  PMID: 12082075
XIST; macroH2A; chromatin; centrosome; aggresome
20.  A Novel Chromatin Protein, Distantly Related to Histone H2a, Is Largely Excluded from the Inactive X Chromosome 
The Journal of Cell Biology  2001;152(2):375-384.
Chromatin on the mammalian inactive X chromosome differs in a number of ways from that on the active X. One protein, macroH2A, whose amino terminus is closely related to histone H2A, is enriched on the heterochromatic inactive X chromosome in female cells. Here, we report the identification and localization of a novel and more distant histone variant, designated H2A-Bbd, that is only 48% identical to histone H2A. In both interphase and metaphase female cells, using either a myc epitope–tagged or green fluorescent protein–tagged H2A-Bbd construct, the inactive X chromosome is markedly deficient in H2A-Bbd staining, while the active X and the autosomes stain throughout. In double-labeling experiments, antibodies to acetylated histone H4 show a pattern of staining indistinguishable from H2A-Bbd in interphase nuclei and on metaphase chromosomes. Chromatin fractionation demonstrates association of H2A-Bbd with the histone proteins. Separation of micrococcal nuclease–digested chromatin by sucrose gradient ultracentrifugation shows cofractionation of H2A-Bbd with nucleosomes, supporting the idea that H2A-Bbd is incorporated into nucleosomes as a substitute for the core histone H2A. This finding, in combination with the overlap with acetylated forms of H4, raises the possibility that H2A-Bbd is enriched in nucleosomes associated with transcriptionally active regions of the genome. The distribution of H2A-Bbd thus distinguishes chromatin on the active and inactive X chromosomes.
PMCID: PMC2199617  PMID: 11266453
histones; X chromosome inactivation; euchromatin; histone H4 acetylation; macroH2A
21.  Histone variant macroH2A contains two distinct macrochromatin domains capable of directing macroH2A to the inactive X chromosome 
Nucleic Acids Research  2001;29(13):2699-2705.
Chromatin on the inactive X chromosome (Xi) of female mammals is enriched for the histone variant macroH2A that can be detected at interphase as a distinct nuclear structure referred to as a macrochromatin body (MCB). Green fluorescent protein-tagged and Myc epitope-tagged macroH2A readily form an MCB in the nuclei of transfected female, but not male, cells. Using targeted disruptions, we have identified two macrochromatin domains within macroH2A that are independently capable of MCB formation and association with the Xi. Complete removal of the non-histone C-terminal tail does not reduce the efficiency of association of the variant histone domain of macroH2A with the Xi, indicating that the histone portion alone can target the Xi. The non-histone domain by itself is incapable of MCB formation. However, when directed to the nucleosome by fusion to core histone H2A or H2B, the non-histone tail forms an MCB that appears identical to that of the endogenous protein. Mutagenesis of the non-histone portion of macroH2A localized the region required for MCB formation and targeting to the Xi to an ∼190 amino acid region.
PMCID: PMC55781  PMID: 11433014
22.  Structural Variation-Associated Expression Changes Are Paralleled by Chromatin Architecture Modifications 
PLoS ONE  2013;8(11):e79973.
Copy number variants (CNVs) influence the expression of genes that map not only within the rearrangement, but also to its flanks. To assess the possible mechanism(s) underlying this “neighboring effect”, we compared intrachromosomal interactions and histone modifications in cell lines of patients affected by genomic disorders and control individuals. Using chromosome conformation capture (4C-seq), we observed that a set of genes flanking the Williams-Beuren Syndrome critical region (WBSCR) were often looping together. The newly identified interacting genes include AUTS2, mutations of which are associated with autism and intellectual disabilities. Deletion of the WBSCR disrupts the expression of this group of flanking genes, as well as long-range interactions between them and the rearranged interval. We also pinpointed concomitant changes in histone modifications between samples.
We conclude that large genomic rearrangements can lead to chromatin conformation changes that extend far away from the structural variant, thereby possibly modulating expression globally and modifying the phenotype.
GEO Series accession numbers: GSE33784, GSE33867.
PMCID: PMC3827143  PMID: 24265791
23.  Direct Visualization of the Highly Polymorphic RNU2 Locus in Proximity to the BRCA1 Gene 
PLoS ONE  2013;8(10):e76054.
Although the breast cancer susceptibility gene BRCA1 is one of the most extensively characterized genetic loci, much less is known about its upstream variable number tandem repeat element, the RNU2 locus. RNU2 encodes the U2 small nuclear RNA, an essential splicing element, but this locus is missing from the human genome assembly due to the inherent difficulty in the assembly of repetitive sequences. To fill the gap between RNU2 and BRCA1, we have reconstructed the physical map of this region by re-examining genomic clone sequences in public databases, which allowed us to precisely localize the RNU2 array 124 kb telomeric to BRCA1. By performing FISH analyses on combed DNA, we measured for the first time the exact number of repeats carried by each of the two alleles in 41 individuals and found a range of 6–82 copies and a level of heterozygosity of 98%. The precise localization of the RNU2 locus in the genome reference assembly and the implementation of a new technical tool to study it will make the detailed exploration of this locus possible. This recently neglected macrosatellite could be valuable for evaluating the potential role of structural variations in disease due to its location next to a major cancer susceptibility gene.
PMCID: PMC3795722  PMID: 24146815
24.  Functional Characterization of cis-Elements Conferring Vascular Vein Expression of At4g34880 Amidase Family Protein Gene in Arabidopsis 
PLoS ONE  2013;8(7):e67562.
The expression of the At4g34880 gene, which encodes an amidase in Arabidopsis, was characterized in this study. A 1.5 kb promoter region upstream of the start codon of the gene (referred to as AmidP) was fused with the uidA (GUS) reporter gene and transformed into Arabidopsis plants to determine its spatial expression. The results indicated that AmidP drove GUS expression in the vascular system, predominantly in leaves. Truncation analysis of AmidP demonstrated that the VASCULAR VEIN ELEMENT (VVE) motif, a 176 bp sequence (−1500 to −1324), was necessary and sufficient to direct vascular vein-specific GUS expression in the transgenic plants. Tandem copies of VVE increased vascular system expression, and 5′ and 3′ deletions of the VVE motif in combination with a truncated −65 CaMV 35S minimal promoter showed that an 11 bp cis-acting element, named the DOF2 domain, played an essential role in vascular vein-specific expression. It was also observed that other cis-acting elements within the VVE region are associated with the specificity or strength of GUS activity in the vascular system.
PMCID: PMC3699661  PMID: 23844031
25.  Genome-Wide Identification of Chromatin Transitional Regions Reveals Diverse Mechanisms Defining the Boundary of Facultative Heterochromatin 
PLoS ONE  2013;8(6):e67156.
Due to the self-propagating nature of the heterochromatic modification H3K27me3, chromatin barrier activities are required to demarcate the boundary and prevent it from encroaching into euchromatic regions. Studies in Drosophila and vertebrate systems have revealed several important chromatin barrier elements and their respective binding factors. However, epigenomic data indicate that the binding of these factors is not exclusive to chromatin boundaries. To gain a comprehensive understanding of facultative heterochromatin boundaries, we developed a two-tiered method to identify the Chromatin Transitional Region (CTR), i.e. the nucleosomal region that shows the greatest transition rate of the H3K27me3 modification as revealed by ChIP-Seq. This approach was applied to identify CTRs in Drosophila S2 cells and human HeLa cells. Although many insulator proteins have been characterized in Drosophila, fewer than half of the CTRs in S2 cells are associated with known insulator proteins, indicating that unknown mechanisms remain to be characterized. Our analysis also revealed that the peak binding of insulator proteins is usually 1–2 nucleosomes away from the CTR. Comparison of CTR-associated insulator protein binding sites with those in heterochromatic regions revealed that boundary-associated binding sites are distinctively flanked by nucleosome destabilizing sequences, which correlates with significantly decreased nucleosome density and increased binding intensities of co-factors. Interestingly, several subgroups of boundaries have enhanced H3.3 incorporation but a reduced nucleosome turnover rate. Our genome-wide study reveals that diverse mechanisms are employed to define the boundaries of facultative heterochromatin. In both Drosophila and mammalian systems, only a small fraction of insulator protein binding sites co-localize with H3K27me3 boundaries.
However, boundary-associated insulator binding sites are distinctively flanked by nucleosome destabilizing sequences, which correlates with significantly decreased nucleosome density and increased binding of co-factors.
PMCID: PMC3696093  PMID: 23840609
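The abstract above defines a CTR as the nucleosomal region with the greatest transition rate of H3K27me3 signal. The following is a minimal sketch of that general idea only, not the authors' two-tiered method, whose details are not given in the abstract. It assumes the ChIP-Seq signal has already been summarized as one enrichment value per nucleosome; the function names, the `flank` parameter, and the example values are all hypothetical.

```python
from statistics import mean


def transition_scores(signal, flank=3):
    """Score each candidate boundary position i by the absolute difference
    between the mean signal of the `flank` values before i and the
    `flank` values starting at i."""
    scores = {}
    for i in range(flank, len(signal) - flank + 1):
        left = mean(signal[i - flank:i])
        right = mean(signal[i:i + flank])
        scores[i] = abs(right - left)
    return scores


def find_transition_position(signal, flank=3):
    """Return the position with the largest change in mean signal.
    Ties resolve to the first (lowest) position."""
    scores = transition_scores(signal, flank)
    if not scores:
        raise ValueError("signal too short for the chosen flank size")
    return max(scores, key=scores.get)


# Hypothetical per-nucleosome H3K27me3 enrichment values (made up for
# illustration): high inside the domain, low outside it.
example_signal = [9.0, 8.5, 9.2, 8.8, 9.1, 2.0, 1.5, 1.8, 1.2, 1.6]
boundary = find_transition_position(example_signal, flank=3)
print(boundary)
```

Comparing flanking window means rather than adjacent single values is a design choice of this sketch, intended to make the score less sensitive to noise at individual nucleosomes. A real analysis would also need normalization and a significance threshold.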

