Related Articles

1.  modMine: flexible access to modENCODE data 
Nucleic Acids Research  2011;40(D1):D1082-D1088.
In an effort to comprehensively characterize the functional elements within the genomes of the important model organisms Drosophila melanogaster and Caenorhabditis elegans, the NHGRI model organism Encyclopaedia of DNA Elements (modENCODE) consortium has generated an enormous library of genomic data along with detailed, structured information on all aspects of the experiments. The modMine database (http://intermine.modencode.org) described here has been built by the modENCODE Data Coordination Center to allow the broader research community to (i) search for and download data sets of interest among the thousands generated by modENCODE; (ii) access the data in an integrated form together with non-modENCODE data sets; and (iii) facilitate fine-grained analysis of the above data. The sophisticated search features are possible because of the collection of extensive experimental metadata by the consortium. Interfaces are provided to allow both biologists and bioinformaticians to exploit these rich modENCODE data sets now available via modMine.
doi:10.1093/nar/gkr921
PMCID: PMC3245176  PMID: 22080565
2.  Sequence-Specific Targeting of Dosage Compensation in Drosophila Favors an Active Chromatin Context 
PLoS Genetics  2012;8(4):e1002646.
The Drosophila MSL complex mediates dosage compensation by increasing transcription of the single X chromosome in males approximately two-fold. This is accomplished through recognition of the X chromosome and subsequent acetylation of histone H4K16 on X-linked genes. Initial binding to the X is thought to occur at “entry sites” that contain a consensus sequence motif (“MSL recognition element” or MRE). However, this motif is only ∼2 fold enriched on X, and only a fraction of the motifs on X are initially targeted. Here we ask whether chromatin context could distinguish between utilized and non-utilized copies of the motif, by comparing their relative enrichment for histone modifications and chromosomal proteins mapped in the modENCODE project. Through a comparative analysis of the chromatin features in male S2 cells (which contain MSL complex) and female Kc cells (which lack the complex), we find that the presence of active chromatin modifications, together with an elevated local GC content in the surrounding sequences, has strong predictive value for functional MSL entry sites, independent of MSL binding. We tested these sites for function in Kc cells by RNAi knockdown of Sxl, resulting in induction of MSL complex. We show that ectopic MSL expression in Kc cells leads to H4K16 acetylation around these sites and a relative increase in X chromosome transcription. Collectively, our results support a model in which a pre-existing active chromatin environment, coincident with H3K36me3, contributes to MSL entry site selection. The consequences of MSL targeting of the male X chromosome include increase in nucleosome lability, enrichment for H4K16 acetylation and JIL-1 kinase, and depletion of linker histone H1 on active X-linked genes. Our analysis can serve as a model for identifying chromatin and local sequence features that may contribute to selection of functional protein binding sites in the genome.
Author Summary
The genomes of complex organisms encompass hundreds of millions of base pairs of DNA, and regulatory molecules must distinguish specific targets within this vast landscape. In general, regulatory factors find target genes through sequence-specific interactions with the underlying DNA. However, sequence-specific factors typically bind only a fraction of the candidate genomic regions containing their specific target sequence motif. Here we identify potential roles for chromatin environment and flanking sequence composition in helping regulatory factors find their appropriate binding sites, using targeting of the Drosophila dosage compensation complex as a model. The initial stage of dosage compensation involves binding of the Male Specific Lethal (MSL) complex to a sequence motif called the MSL recognition element [1]. Using data from a large chromatin mapping effort (the modENCODE project), we successfully identify an active chromatin environment as predictive of selective MRE binding by the MSL complex. Our study provides a framework for using genome-wide datasets to analyze and predict functional protein–DNA binding site selection.
doi:10.1371/journal.pgen.1002646
PMCID: PMC3343056  PMID: 22570616
3.  Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data 
PLoS Computational Biology  2011;7(11):e1002190.
We present a network framework for analyzing multi-level regulation in higher eukaryotes based on systematic integration of various high-throughput datasets. The network, namely the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We identified the target genes and target miRNAs for a set of TFs based on the ChIP-Seq binding profiles, and the predicted targets of miRNAs using annotated 3′UTR sequences and conservation information. Making use of the system-wide RNA-Seq profiles, we classified transcription factors into positive and negative regulators and assigned a sign for each regulatory interaction. Other types of edges such as protein-protein interactions and potential intra-regulations between miRNAs based on the embedding of miRNAs in their host genes were further incorporated. We examined the topological structures of the network, including its hierarchical organization and motif enrichment. We found that transcription factors downstream of the hierarchy distinguish themselves by being expressed more uniformly across tissues, having more interacting partners, and being more likely to be essential. We found an over-representation of notable network motifs, including a feed-forward loop (FFL) in which a miRNA cost-effectively shuts down a transcription factor and its target. We used data of C. elegans from the modENCODE project as a primary model to illustrate our framework, but further verified the results using two other data sets. As more and more genome-wide ChIP-Seq and RNA-Seq data become available in the near future, our methods of data integration have various potential applications.
Author Summary
The precise control of gene expression lies at the heart of many biological processes. In eukaryotes, the regulation is performed at multiple levels, mediated by different regulators such as transcription factors and miRNAs, each distinguished by different spatial and temporal characteristics. These regulators are further integrated to form a complex regulatory network responsible for the orchestration. The construction and analysis of such networks is essential for understanding the general design principles. Recent advances in high-throughput techniques like ChIP-Seq and RNA-Seq provide an opportunity by offering a huge amount of binding and expression data. We present a general framework to combine these types of data into an integrated network and perform various topological analyses, including its hierarchical organization and motif enrichment. We find that the integrated network possesses an intrinsic hierarchical organization and is enriched in several network motifs that include both transcription factors and miRNAs. We further demonstrate that the framework can be easily applied to other species like human and mouse. As more and more genome-wide ChIP-Seq and RNA-Seq data are going to be generated in the near future, our methods of data integration have various potential applications.
doi:10.1371/journal.pcbi.1002190
PMCID: PMC3219617  PMID: 22125477
4.  Regulation of Heterochromatin Assembly on Unpaired Chromosomes during Caenorhabditis elegans Meiosis by Components of a Small RNA-Mediated Pathway 
PLoS Genetics  2009;5(8):e1000624.
Many organisms have a mechanism for down regulating the expression of non-synapsed chromosomes and chromosomal regions during meiosis. This phenomenon is thought to function in genome defense. During early meiosis in Caenorhabditis elegans, unpaired chromosomes (e.g., the male X chromosome) become enriched for a modification associated with heterochromatin and transcriptional repression, dimethylation of histone H3 on lysine 9 (H3K9me2). This enrichment requires activity of the cellular RNA-directed RNA polymerase, EGO-1. Here we use genetic mutation, RNA interference, immunofluorescence microscopy, fluorescence in situ hybridization, and molecular cloning methods to identify and analyze three additional regulators of meiotic H3K9me2 distribution: CSR-1 (a Piwi/PAZ/Argonaute protein), EKL-1 (a Tudor domain protein), and DRH-3 (a DEAH/D-box helicase). In csr-1, ekl-1, and drh-3 mutant males, we observed a reduction in H3K9me2 accumulation on the unpaired X chromosome and an increase in H3K9me2 accumulation on paired autosomes relative to controls. We observed a similar shift in H3K9me2 pattern in hermaphrodites that carry unpaired chromosomes. Based on several assays, we conclude that ectopic H3K9me2 accumulates on paired and synapsed chromosomes in these mutants. We propose alternative models for how a small RNA-mediated pathway may regulate H3K9me2 accumulation during meiosis. We also describe the germline phenotypes of csr-1, ekl-1, and drh-3 mutants. Our genetic data suggest that these factors, together with EGO-1, participate in a regulatory network to promote diverse aspects of development.
Author Summary
DNA within a cell's nucleus is packaged together with proteins into a higher order structure called chromatin. In its simplest form, chromatin consists of DNA and a set of proteins called histones, arranged so that the DNA strand is wrapped around histone protein clusters. This basic chromatin structure can be modified in various ways to regulate access to the genetic information encoded in the DNA. Such regulation is critical for cellular function and development of the organism. As cells form gametes, they undergo a specialized type of cell division called meiosis. During meiosis, chromatin is regulated in specific ways to ensure proper development of the embryo. During meiosis in the nematode C. elegans, the chromatin structure of the single male X chromosome depends on an RNA-directed RNA polymerase called EGO-1. Here, we identify three more regulators of meiotic chromatin, the proteins CSR-1, EKL-1, and DRH-3. Our data suggest that these proteins collaborate with EGO-1 to ensure that paired chromosomes (autosomes and hermaphrodite X chromosomes) are regulated correctly and in a manner distinct from the male X chromosome. Our findings suggest that these four proteins participate in a mechanism to ensure proper gene expression for gamete formation.
doi:10.1371/journal.pgen.1000624
PMCID: PMC2726613  PMID: 19714217
5.  Chromosome-Biased Binding and Gene Regulation by the Caenorhabditis elegans DRM Complex 
PLoS Genetics  2011;7(5):e1002074.
DRM is a conserved transcription factor complex that includes E2F/DP and pRB family proteins and plays important roles in development and cancer. Here we describe new aspects of DRM binding and function revealed through genome-wide analyses of the Caenorhabditis elegans DRM subunit LIN-54. We show that LIN-54 DNA-binding activity recruits DRM to promoters enriched for adjacent putative E2F/DP and LIN-54 binding sites, suggesting that these two DNA–binding moieties together direct DRM to its target genes. Chromatin immunoprecipitation and gene expression profiling reveal conserved roles for DRM in regulating genes involved in cell division, development, and reproduction. We find that LIN-54 promotes expression of reproduction genes in the germline, but prevents ectopic activation of germline-specific genes in embryonic soma. Strikingly, C. elegans DRM does not act uniformly throughout the genome: the DRM recruitment motif, DRM binding, and DRM-regulated embryonic genes are all under-represented on the X chromosome. However, germline genes down-regulated in lin-54 mutants are over-represented on the X chromosome. We discuss models for how loss of autosome-bound DRM may enhance germline X chromosome silencing. We propose that autosome-enriched binding of DRM arose in C. elegans as a consequence of germline X chromosome silencing and the evolutionary redistribution of germline-expressed and essential target genes to autosomes. Sex chromosome gene regulation may thus have profound evolutionary effects on genome organization and transcriptional regulatory networks.
Author Summary
X chromosomes differ in number between the sexes and differ from autosomes in their associated proteins and gene regulatory properties. In C. elegans both X chromosomes are partially silenced in hermaphrodite germlines. Germline-expressed and essential genes are autosome-enriched and are thought to have fled the X chromosome during evolution because silencing these genes would result in sterility or lethality. We discovered that the C. elegans DRM complex, which controls transcription of genes implicated in development and cancer, avoids the X chromosome. We first describe how DNA–binding components of the DRM complex together recognize DNA sequences upstream of its target genes, and we describe that DRM controls different target genes in the germline versus the soma. We show that the DRM binding motif, the genes bound by DRM, and the embryonic genes regulated by DRM are all under-represented on the X chromosome. Interestingly, compromising DRM function in the germline enhances X chromosome silencing, and we discuss how autosome-bound DRM might regulate X-linked genes in trans. We propose that autosome-enriched binding of DRM co-evolved with the redistribution of its germline-expressed and essential target genes to autosomes. Our data highlight how X chromosome gene regulation may impact both the genomic distribution of gene sets and their transcriptional regulators.
doi:10.1371/journal.pgen.1002074
PMCID: PMC3093354  PMID: 21589891
6.  The Caenorhabditis elegans Myc-Mondo/Mad Complexes Integrate Diverse Longevity Signals 
PLoS Genetics  2014;10(4):e1004278.
The Myc family of transcription factors regulates a variety of biological processes, including the cell cycle, growth, proliferation, metabolism, and apoptosis. In Caenorhabditis elegans, the “Myc interaction network” consists of two opposing heterodimeric complexes with antagonistic functions in transcriptional control: the Myc-Mondo:Mlx transcriptional activation complex and the Mad:Max transcriptional repression complex. In C. elegans, Mondo, Mlx, Mad, and Max are encoded by mml-1, mxl-2, mdl-1, and mxl-1, respectively. Here we show a similar antagonistic role for the C. elegans Myc-Mondo and Mad complexes in longevity control. Loss of mml-1 or mxl-2 shortens C. elegans lifespan. In contrast, loss of mdl-1 or mxl-1 increases longevity, dependent upon MML-1:MXL-2. The MML-1:MXL-2 and MDL-1:MXL-1 complexes function in both the insulin signaling and dietary restriction pathways. Furthermore, decreased insulin-like/IGF-1 signaling (ILS) or conditions of dietary restriction increase the accumulation of MML-1, consistent with the notion that the Myc family members function as sensors of metabolic status. Additionally, we find that Myc family members are regulated by distinct mechanisms, which would allow for integrated control of gene expression from diverse signals of metabolic status. We compared putative target genes based on ChIP-sequencing data in the modENCODE project and found significant overlap in genomic DNA binding between the major effectors of ILS (DAF-16/FoxO), DR (PHA-4/FoxA), and Myc family (MDL-1/Mad/Mxd) at common target genes, which suggests that diverse signals of metabolic status converge on overlapping transcriptional programs that influence aging. Consistent with this, there is over-enrichment at these common targets for genes that function in lifespan, stress response, and carbohydrate metabolism. Additionally, we find that Myc family members are also involved in stress response and the maintenance of protein homeostasis. 
Collectively, these findings indicate that Myc family members integrate diverse signals of metabolic status, to coordinate overlapping metabolic and cytoprotective transcriptional programs that determine the progression of aging.
Author Summary
Transcription factors are essential proteins that regulate the expression of genes and play an important role in most biological processes. The results of our study presented here demonstrate for the first time a role in aging for a small family of transcription factors in the nematode worm Caenorhabditis elegans. Importantly, these proteins have close relatives in higher organisms, including humans, that influence metabolism and cell replication and have been implicated in the development of cancer. Moreover, the loss of one homologue has also been implicated in Williams-Beuren syndrome, a disease characterized in part by signs of premature aging. Our data demonstrate that these transcription factors function within insulin/IGF-1 signaling and dietary restriction, two highly conserved pathways that link nutrient sensing to longevity. Taken together, our findings provide exciting new insight into a family of proteins that may be essential for linking nutrient sensing to longevity and have implications for the improvement of human healthspan.
doi:10.1371/journal.pgen.1004278
PMCID: PMC3974684  PMID: 24699255
7.  Cloud-based uniform ChIP-Seq processing tools for modENCODE and ENCODE 
BMC Genomics  2013;14:494.
Background
Funded by the National Institutes of Health (NIH), the aim of the Model Organism ENCyclopedia of DNA Elements (modENCODE) project is to provide the biological research community with a comprehensive encyclopedia of functional genomic elements for both model organisms C. elegans (worm) and D. melanogaster (fly). With a total size of just under 10 terabytes of data collected and released to the public, one of the challenges faced by researchers is to extract biologically meaningful knowledge from this large data set. While the basic quality control, pre-processing, and analysis of the data has already been performed by members of the modENCODE consortium, many researchers will wish to reinterpret the data set using modifications and enhancements of the original protocols, or combine modENCODE data with other data sets. Unfortunately this can be a time consuming and logistically challenging proposition.
Results
In recognition of this challenge, the modENCODE DCC has released uniform computing resources for analyzing modENCODE data on Galaxy (https://github.com/modENCODE-DCC/Galaxy), on the public Amazon Cloud (http://aws.amazon.com), and on the private Bionimbus Cloud for genomic research (http://www.bionimbus.org). In particular, we have released Galaxy workflows for interpreting ChIP-seq data which use the same quality control (QC) and peak calling standards adopted by the modENCODE and ENCODE communities. For convenience of use, we have created Amazon and Bionimbus Cloud machine images containing Galaxy along with all the modENCODE data, software and other dependencies.
Conclusions
These resources provide a framework for running consistent and reproducible analyses on modENCODE data, ultimately allowing researchers to spend more of their time using modENCODE data and less time moving it around.
doi:10.1186/1471-2164-14-494
PMCID: PMC3734164  PMID: 23875683
8.  Condensin controls recruitment of RNA polymerase II to achieve nematode X-chromosome dosage compensation 
eLife  2013;2:e00808.
The X-chromosome gene regulatory process called dosage compensation ensures that males (1X) and females (2X) express equal levels of X-chromosome transcripts. The mechanism in Caenorhabditis elegans has been elusive due to improperly annotated transcription start sites (TSSs). Here we define TSSs and the distribution of transcriptionally engaged RNA polymerase II (Pol II) genome-wide in wild-type and dosage-compensation-defective animals to dissect this regulatory mechanism. Our TSS-mapping strategy integrates GRO-seq, which tracks nascent transcription, with a new derivative of this method, called GRO-cap, which recovers nascent RNAs with 5′ caps prior to their removal by co-transcriptional processing. Our analyses reveal that promoter-proximal pausing is rare, unlike in other metazoans, and promoters are unexpectedly far upstream from the 5′ ends of mature mRNAs. We find that C. elegans equalizes X-chromosome expression between the sexes, to a level equivalent to autosomes, by reducing Pol II recruitment to promoters of hermaphrodite X-linked genes using a chromosome-restructuring condensin complex.
DOI: http://dx.doi.org/10.7554/eLife.00808.001
eLife digest
In many species, including humans, females have two X chromosomes whereas males have only one. To ensure that females do not end up with a double dose of the proteins encoded by genes on the X chromosome, animals employ a strategy called dosage compensation to control the expression of X-linked genes.
The mechanisms underlying dosage compensation vary between species, but they typically involve a regulatory complex that binds to the X chromosomes of one sex to modify gene expression. In the nematode worm Caenorhabditis elegans—which consists of hermaphrodites (XX) and males (XO)—this regulatory complex, called the dosage compensation complex (DCC), binds to both X chromosomes of XX individuals, reducing gene expression from each by 50%. DCC shares many subunits with a protein complex called condensin, which regulates the structure of chromosomes to achieve proper chromosome segregation. However, it is unclear exactly how the DCC controls the expression of X-linked genes.
For a gene to be expressed, an enzyme called RNA polymerase II must bind to the gene’s promoter—a stretch of DNA upstream of the protein-coding part of the gene—so that it can begin transcribing the DNA into RNA. Promoters have been difficult to define in C. elegans, but Kruesi et al. devised a strategy to map transcription start sites, and hence promoters, throughout the worm genome. The strategy integrates the results of two methods: One measures the extent and orientation of each gene’s transcribed region, and the other locates the distinctive cap structures that mark the true 5′ ends of newly made RNAs.
Using this new promoter information, coupled with genome-wide measurements of the levels of newly synthesized transcripts from wild-type and dosage-compensation-defective animals, they showed that C. elegans achieves dosage compensation by reducing the recruitment of RNA polymerase II to the promoters of X-linked genes in XX individuals.
Kruesi et al. also identified a second regulatory mechanism that acts in both sexes to increase the level of transcription of genes on the X chromosome. This ensures that after dosage compensation, genes on the X chromosome are expressed at a similar level to those on the autosomes (all chromosomes other than X and Y).
As well as shedding light on the mechanism by which dosage compensation occurs in C. elegans, the study by Kruesi et al. provides a valuable data set on transcription start sites in the worm, and puts forward a general strategy that could be used to map these sites in other species.
DOI: http://dx.doi.org/10.7554/eLife.00808.002
doi:10.7554/eLife.00808
PMCID: PMC3687364  PMID: 23795297
dosage compensation; transcription; X-chromosome and autosome balance; transcription start site identification technology; X chromosome; C. elegans
9.  Flynet: a genomic resource for Drosophila melanogaster transcriptional regulatory networks 
Bioinformatics  2009;25(22):3001-3004.
Motivation: The highly coordinated expression of thousands of genes in an organism is regulated by the concerted action of transcription factors, chromatin proteins and epigenetic mechanisms. High-throughput experimental data for genome wide in vivo protein–DNA interactions and epigenetic marks are becoming available from large projects, such as the model organism ENCyclopedia Of DNA Elements (modENCODE) and from individual labs. Dissemination and visualization of these datasets in an explorable form is an important challenge.
Results: To support research on Drosophila melanogaster transcription regulation and make the genome wide in vivo protein–DNA interactions data available to the scientific community as a whole, we have developed a system called Flynet. Currently, Flynet contains 101 datasets for 38 transcription factors and chromatin regulator proteins in different experimental conditions. These factors exhibit different types of binding profiles ranging from sharp localized peaks to broad binding regions. The protein–DNA interaction data in Flynet was obtained from the analysis of chromatin immunoprecipitation experiments on one color and two color genomic tiling arrays as well as chromatin immunoprecipitation followed by massively parallel sequencing. A web-based interface, integrated with an AJAX based genome browser, has been built for queries and presenting analysis results. Flynet also makes available the cis-regulatory modules reported in literature, known and de novo identified sequence motifs across the genome, and other resources to study gene regulation.
Contact: grossman@uic.edu
Availability: Flynet is available at https://www.cistrack.org/flynet/.
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btp469
PMCID: PMC2773252  PMID: 19656951
10.  Identification of Functional Elements and Regulatory Circuits by Drosophila modENCODE 
Roy, Sushmita | Ernst, Jason | Kharchenko, Peter V. | Kheradpour, Pouya | Negre, Nicolas | Eaton, Matthew L. | Landolin, Jane M. | Bristow, Christopher A. | Ma, Lijia | Lin, Michael F. | Washietl, Stefan | Arshinoff, Bradley I. | Ay, Ferhat | Meyer, Patrick E. | Robine, Nicolas | Washington, Nicole L. | Di Stefano, Luisa | Berezikov, Eugene | Brown, Christopher D. | Candeias, Rogerio | Carlson, Joseph W. | Carr, Adrian | Jungreis, Irwin | Marbach, Daniel | Sealfon, Rachel | Tolstorukov, Michael Y. | Will, Sebastian | Alekseyenko, Artyom A. | Artieri, Carlo | Booth, Benjamin W. | Brooks, Angela N. | Dai, Qi | Davis, Carrie A. | Duff, Michael O. | Feng, Xin | Gorchakov, Andrey A. | Gu, Tingting | Henikoff, Jorja G. | Kapranov, Philipp | Li, Renhua | MacAlpine, Heather K. | Malone, John | Minoda, Aki | Nordman, Jared | Okamura, Katsutomo | Perry, Marc | Powell, Sara K. | Riddle, Nicole C. | Sakai, Akiko | Samsonova, Anastasia | Sandler, Jeremy E. | Schwartz, Yuri B. | Sher, Noa | Spokony, Rebecca | Sturgill, David | van Baren, Marijke | Wan, Kenneth H. | Yang, Li | Yu, Charles | Feingold, Elise | Good, Peter | Guyer, Mark | Lowdon, Rebecca | Ahmad, Kami | Andrews, Justen | Berger, Bonnie | Brenner, Steven E. | Brent, Michael R. | Cherbas, Lucy | Elgin, Sarah C. R. | Gingeras, Thomas R. | Grossman, Robert | Hoskins, Roger A. | Kaufman, Thomas C. | Kent, William | Kuroda, Mitzi I. | Orr-Weaver, Terry | Perrimon, Norbert | Pirrotta, Vincenzo | Posakony, James W. | Ren, Bing | Russell, Steven | Cherbas, Peter | Graveley, Brenton R. | Lewis, Suzanna | Micklem, Gos | Oliver, Brian | Park, Peter J. | Celniker, Susan E. | Henikoff, Steven | Karpen, Gary H. | Lai, Eric C. | MacAlpine, David M. | Stein, Lincoln D. | White, Kevin P. | Kellis, Manolis
Science (New York, N.Y.)  2010;330(6012):1787-1797.
To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.
doi:10.1126/science.1198374
PMCID: PMC3192495  PMID: 21177974
11.  Diverse Chromatin Remodeling Genes Antagonize the Rb-Involved SynMuv Pathways in C. elegans 
PLoS Genetics  2006;2(5):e74.
In Caenorhabditis elegans, vulval cell-fate specification involves the activities of multiple signal transduction and regulatory pathways that include a receptor tyrosine kinase/Ras/mitogen-activated protein kinase pathway and synthetic multivulva (SynMuv) pathways. Many genes in the SynMuv pathways encode transcription factors including the homologs of mammalian Rb, E2F, and components of the nucleosome-remodeling deacetylase complex. To further elucidate the functions of the SynMuv genes, we performed a genome-wide RNA interference (RNAi) screen to search for genes that antagonize the SynMuv gene activities. Among those that displayed a varying degree of suppression of the SynMuv phenotype, 32 genes are potentially involved in chromatin remodeling (called SynMuv suppressor genes herein). Genetic mutations of two representative genes (zfp-1 and mes-4) were used to further characterize their positive roles in vulval induction and relationships with Ras function. Our analysis revealed antagonistic roles of the SynMuv suppressor genes and the SynMuv B genes in germline-soma distinction, RNAi, somatic transgene silencing, and tissue-specific expression of pgl-1 and the lag-2/Delta genes. The opposite roles of these SynMuv B and SynMuv suppressor genes in transcriptional regulation were confirmed in somatic transgene silencing. We also report the identification of ten new genes in the RNAi pathway and six new genes in germline silencing. Among the ten new RNAi genes, three encode homologs of proteins involved in both protein degradation and chromatin remodeling. Our findings suggest that multiple chromatin remodeling complexes are involved in regulating the expression of specific genes that play critical roles in developmental decisions.
Synopsis
In animal cells, DNA and genes are packed into a structure called chromatin. Chromatin-modifying protein complexes play a critical role in the regulation of gene expression. These complexes can alter the chemical and structural properties of the chromosome leading to either the repression or activation of gene expression. How these different complexes coordinate to regulate animal development remains to be explored. Several developmental processes in the nematode Caenorhabditis elegans present excellent model systems to study the functions of chromatin modifications. Using a genome-wide screen, the authors have identified 32 genes that encode potential chromatin-modifying proteins that antagonize the function of another set of transcription regulators including homologs of the mammalian Rb tumor suppressor and components of other chromatin-modifying complexes. The antagonistic roles of these two sets of genes have been observed in a variety of cellular and developmental processes, including organ development and expression of genes in particular tissues. This work indicates that multiple chromatin-modifying complexes are involved in maintaining proper expression of many genes that are critical for precise developmental decisions. Studies on these worm genes should shed light on the roles of the mammalian counterparts in development and related human diseases.
doi:10.1371/journal.pgen.0020074
PMCID: PMC1463046  PMID: 16710447
12.  The modENCODE Data Coordination Center: lessons in harvesting comprehensive experimental details 
The model organism Encyclopedia of DNA Elements (modENCODE) project is a National Human Genome Research Institute (NHGRI) initiative designed to characterize the genomes of Drosophila melanogaster and Caenorhabditis elegans. A Data Coordination Center (DCC) was created to collect, store and catalog modENCODE data. An effective DCC must gather, organize and provide all primary, interpreted and analyzed data, and ensure the community is supplied with the knowledge of the experimental conditions, protocols and verification checks used to generate each primary data set. We present here the design principles of the modENCODE DCC, and describe the ramifications of collecting thorough and deep metadata for describing experiments, including the use of a wiki for capturing protocol and reagent information, and the BIR-TAB specification for linking biological samples to experimental results. modENCODE data can be found at http://www.modencode.org.
Database URL: http://www.modencode.org.
doi:10.1093/database/bar023
PMCID: PMC3170170  PMID: 21856757
13.  Functional modularity of nuclear hormone receptors in a Caenorhabditis elegans metabolic gene regulatory network 
We present the first gene regulatory network (GRN) that pertains to post-developmental gene expression. Specifically, we mapped a transcription regulatory network of Caenorhabditis elegans metabolic gene promoters using gene-centered yeast one-hybrid assays. We found that the metabolic GRN is enriched for nuclear hormone receptors (NHRs) compared with other gene-centered regulatory networks, and that these NHRs organize into functional network modules. The NHR family has greatly expanded in nematodes; C. elegans has 284 NHRs, whereas humans have only 48. We show that the NHRs in the metabolic GRN have metabolic phenotypes, suggesting that they do not simply function redundantly. The mediator subunit MDT-15 preferentially interacts with NHRs that occur in the metabolic GRN. We describe an NHR circuit that responds to nutrient availability and propose a model for the evolution and organization of NHRs in C. elegans metabolic regulatory networks.
Physical and/or regulatory interactions between transcription factors (TFs) and their target genes are essential to establish body plans of multicellular organisms during development, and these interactions have been studied extensively in the context of GRNs. The precise control of differential gene expression is also of critical importance to maintain physiological homeostasis, and many metabolic disorders such as obesity and diabetes coincide with substantial changes in gene expression. Much work has focused on the GRNs that control metazoan development; however, the design principles and organization of the GRNs that control systems physiology remain largely unexplored.
In this study, we present the first gene-centered GRN that includes ∼70 genes involved in C. elegans metabolism and physiology, 100 TFs and more than 500 protein–DNA interactions between them. The resulting metabolic GRN is enriched for NHRs, compared with other gene-centered regulatory networks. NHRs are well-known regulators of lipid metabolism in mammals. The transcriptional activity of NHRs can be modified by diffusible ligands, which allows these TFs to function as molecular sensors and rapidly alter the expression of their target genes. Interestingly, NHRs comprise the largest family of TFs in nematodes; the C. elegans genome encodes 284 NHRs, most of which are uncharacterized. Furthermore, their organization in GRNs has not yet been investigated. In our study, we show that the C. elegans NHRs that we retrieved in the metabolic GRN organize into network modules, and that most of these NHRs function to maintain lipid homeostasis in the nematode. Interestingly, network modularity has been proposed to facilitate rapid and robust changes in gene expression. Our results suggest that the C. elegans metabolic GRN may have evolved by combining NHR family expansion with the specific modular wiring of NHRs to enable the rapid adaptation of the animal to different environmental cues.
NHRs can interact with transcriptional cofactors such as chromatin remodeling complexes and Mediator components. For instance, the C. elegans Mediator subunit, MDT-15, can interact with NHR-49 to regulate the expression of its target genes. To find all the TFs that MDT-15 can interact with, we performed systematic yeast two-hybrid assays with MDT-15 versus 755 full-length TFs. We found that MDT-15 preferentially associates with NHRs, and specifically with those NHRs that confer a metabolic phenotype and that occur in the metabolic GRN. This illustrates the central role of MDT-15 in the regulation of metabolic gene expression.
Using a variety of genetic and biochemical approaches, we characterized NHR-86 in more detail. NHR-86 participates in one of the two NHR modules, and has a high-flux capacity; that is, it has both a high incoming and a high outgoing degree. We obtained an nhr-86 mutant and generated an NHR-86 antibody, and showed that NHR-86 functions as an auto-repressor in vivo and that nhr-86 mutant animals store abnormally high levels of body fat.
Finally, we discovered a novel NHR circuit that responds to nutrient availability. In this circuit, NHR-45 regulates the activity of the nhr-178 promoter in two distinct physiologically important tissues: the intestine and the hypodermis. Both of these NHRs are required to maintain lipid homeostasis in C. elegans. The expression of nhr-178 is responsive to the nutritional status of the animal, which switches between ON and OFF states in the hypodermis. We found that NHR-45 activity is necessary to control this switch in the hypodermis. Interestingly, NHR-45 has opposite effects on the activity of the nhr-178 promoter in these tissues: NHR-45 activates this promoter in the intestine, but represses it in the hypodermis.
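The notion of a high-flux node (high incoming and high outgoing degree) can be made concrete with a minimal sketch. Everything below is illustrative: the node names, the toy edge list and the degree thresholds are assumptions, not data or cutoffs from the study. Edges point from a regulator to the gene whose promoter it binds, and the self-loop stands in for auto-regulation.

```python
from collections import Counter

def degrees(edges):
    """Return (in_degree, out_degree) Counters for a directed edge list.

    Each edge is (regulator, target): the regulator binds the target's promoter.
    """
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    return in_deg, out_deg

def high_flux(edges, min_in=2, min_out=2):
    """Nodes whose incoming AND outgoing degree both reach the thresholds.

    The thresholds are arbitrary illustrative values, not taken from the paper.
    """
    in_deg, out_deg = degrees(edges)
    nodes = set(in_deg) | set(out_deg)
    return sorted(n for n in nodes
                  if in_deg[n] >= min_in and out_deg[n] >= min_out)

# Hypothetical toy network; ("hub", "hub") models an auto-regulatory edge.
toy_edges = [
    ("tfA", "hub"), ("tfB", "hub"), ("hub", "hub"),
    ("hub", "gene1"), ("hub", "gene2"), ("tfA", "gene1"),
]
print(high_flux(toy_edges))
```

In this toy network only `hub` qualifies: `tfA` has outgoing edges but no incoming ones, and `gene1` has incoming edges but no outgoing ones.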
Altogether our study leads to a model in which the expansion of the NHR family, TFs that have the capacity to act as fast molecular sensors, is combined with a modular network organization to enable rapid and robust responses to various environmental cues.
Gene regulatory networks (GRNs) provide insights into the mechanisms of differential gene expression at a systems level. GRNs that relate to metazoan development have been studied extensively. However, little is still known about the design principles, organization and functionality of GRNs that control physiological processes such as metabolism, homeostasis and responses to environmental cues. In this study, we report the first experimentally mapped metazoan GRN of Caenorhabditis elegans metabolic genes. This network is enriched for nuclear hormone receptors (NHRs). The NHR family has greatly expanded in nematodes: humans have 48 NHRs, but C. elegans has 284, most of which are uncharacterized. We find that the C. elegans metabolic GRN is highly modular and that two GRN modules predominantly consist of NHRs. Network modularity has been proposed to facilitate a rapid response to different cues. As NHRs are metabolic sensors that are poised to respond to ligands, this suggests that C. elegans GRNs evolved to enable rapid and adaptive responses to different cues by a concurrence of NHR family expansion and modular GRN wiring.
doi:10.1038/msb.2010.23
PMCID: PMC2890327  PMID: 20461074
C. elegans; gene regulatory network; metabolism; nuclear hormone receptor; transcription factor
14.  Global Quantitative Modeling of Chromatin Factor Interactions 
PLoS Computational Biology  2014;10(3):e1003525.
Chromatin is the driver of gene regulation, yet understanding the molecular interactions underlying chromatin factor combinatorial patterns (or the “chromatin codes”) remains a fundamental challenge in chromatin biology. Here we developed a global modeling framework that leverages chromatin profiling data to produce a systems-level view of the macromolecular complex of chromatin. Our model utilizes maximum entropy modeling with regularization-based structure learning to statistically dissect dependencies between chromatin factors and produce an accurate probability distribution of chromatin code. Our unsupervised quantitative model, trained on genome-wide chromatin profiles of 73 histone marks and chromatin proteins from modENCODE, enabled making various data-driven inferences about chromatin profiles and interactions. We provided a highly accurate predictor of chromatin factor pairwise interactions validated by known experimental evidence, and for the first time enabled higher-order interaction prediction. Our predictions can thus help guide future experimental studies. The model can also serve as an inference engine for predicting unknown chromatin profiles; we demonstrated that with this approach we can leverage data from well-characterized cell types to help understand less-studied cell types or conditions.
Author Summary
Chromatin, like many other molecular biological systems, is composed of multiple interacting factors. Our knowledge about chromatin factors is mostly qualitative, and such qualitative knowledge can be insufficient for predicting collective behaviors. It's also extremely challenging to study collective behaviors involving multiple interacting factors through genetic and biochemical experiments. An alternative approach is to leverage large-scale genome-wide chromatin profiles and statistical modeling to create predictive models and infer underlying interaction mechanisms based on these observed high-throughput data. In this study, we developed a novel maximum entropy-based modeling approach to quantitatively capture interactions between chromatin factors at the same genomic location, which we see as a step toward quantitative understanding of chromatin organization that involves a system of multiple interacting factors. We applied this quantitative model to successfully infer functional properties of chromatin including interactions between chromatin factors. Furthermore, the model predicts unmeasured chromatin profiles with high accuracy based on their inferred dependencies with other factors within and across cell types. Thus our modeling approach effectively utilizes large-scale chromatin profiles to dissect chromatin factor interactions and to make data-driven inferences about chromatin regulation.
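As a rough illustration of how a maximum entropy model can capture dependencies between factors at one genomic location and then predict an unmeasured factor, the sketch below uses the textbook pairwise form P(x) ∝ exp(Σ h_i x_i + Σ J_ij x_i x_j) over binary presence/absence variables. This form, the three-factor toy system and all parameter values are assumptions for illustration; the abstract does not specify the exact model, and no fitting or regularization step is shown.

```python
import itertools
import math

def maxent_distribution(h, J):
    """Exact distribution of a pairwise maximum-entropy model over binary factors.

    h[i]      : bias for factor i being present (x_i = 1)
    J[(i, j)] : coupling between factors i and j; a missing entry means no
                direct dependency, which sparsity-inducing regularization
                would encourage during fitting (fitting is not shown here).
    """
    n = len(h)
    states = list(itertools.product((0, 1), repeat=n))
    weights = []
    for x in states:
        score = sum(h[i] * x[i] for i in range(n))
        score += sum(c * x[i] * x[j] for (i, j), c in J.items())
        weights.append(math.exp(score))
    z = sum(weights)  # partition function, by brute-force enumeration
    return {x: w / z for x, w in zip(states, weights)}

def conditional_presence(dist, target, observed):
    """P(x_target = 1 | observed), where observed maps factor index -> 0 or 1."""
    match = [(x, p) for x, p in dist.items()
             if all(x[i] == v for i, v in observed.items())]
    total = sum(p for _, p in match)
    return sum(p for x, p in match if x[target] == 1) / total

# Hypothetical parameters: factors 0 and 1 are positively coupled, factor 2 is independent.
h = [-1.0, -1.0, -1.0]
J = {(0, 1): 2.0}
dist = maxent_distribution(h, J)
p_with = conditional_presence(dist, target=1, observed={0: 1})
p_without = conditional_presence(dist, target=1, observed={0: 0})
print(f"P(factor1 | factor0 present) = {p_with:.3f}")
print(f"P(factor1 | factor0 absent)  = {p_without:.3f}")
```

Brute-force enumeration only works for a handful of factors; a model over 73 factors would need approximate inference, which is outside the scope of this sketch.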
doi:10.1371/journal.pcbi.1003525
PMCID: PMC3967939  PMID: 24675896
15.  The Genomic Distribution and Function of Histone Variant HTZ-1 during C. elegans Embryogenesis 
PLoS Genetics  2008;4(9):e1000187.
In all eukaryotes, histone variants are incorporated into a subset of nucleosomes to create functionally specialized regions of chromatin. One such variant, H2A.Z, replaces histone H2A and is required for development and viability in all animals tested to date. However, the function of H2A.Z in development remains unclear. Here, we use ChIP-chip, genetic mutation, RNAi, and immunofluorescence microscopy to interrogate the function of H2A.Z (HTZ-1) during embryogenesis in Caenorhabditis elegans, a key model of metazoan development. We find that HTZ-1 is expressed in every cell of the developing embryo and is essential for normal development. The sites of HTZ-1 incorporation during embryogenesis reveal a genome wrought by developmental processes. HTZ-1 is incorporated upstream of 23% of C. elegans genes. While these genes tend to be required for development and occupied by RNA polymerase II, HTZ-1 incorporation does not specify a stereotypic transcription program. The data also provide evidence for unexpectedly widespread independent regulation of genes within operons during development; in 37% of operons, HTZ-1 is incorporated upstream of internally encoded genes. Fewer sites of HTZ-1 incorporation occur on the X chromosome relative to autosomes, which our data suggest is due to a paucity of developmentally important genes on X, rather than a direct function for HTZ-1 in dosage compensation. Our experiments indicate that HTZ-1 functions in establishing or maintaining an essential chromatin state at promoters regulated dynamically during C. elegans embryogenesis.
Author Summary
To fit within a cell's nucleus, DNA is wrapped around protein spools composed of the histones H3, H4, H2A, and H2B. One spool and the DNA wrapped around it are called a nucleosome, and all of the packaged DNA in a cell's nucleus is collectively called “chromatin.” Chromatin is important because it modulates access to information encoded in the underlying DNA. Spools with specialized functions can be created by replacing a typical histone component with a variant version of the histone protein. Here, we examine the distribution and function of the C. elegans histone H2A variant H2A.Z (called HTZ-1) during development. We demonstrate that HTZ-1 is required for proper development, and that embryos are dependent on a contribution of HTZ-1 from their mothers for survival. We mapped the location of HTZ-1 incorporation genome-wide and found that HTZ-1 binds upstream of 23% of genes, which tend to be genes that are essential for development and occupied by RNA polymerase. Fewer sites of HTZ-1 incorporation were found on the X chromosome, probably due to an under-representation of essential genes on X rather than a direct role for HTZ-1 in X-chromosome dosage compensation. Our study reveals how the genome is remodeled by HTZ-1 to allow the proper regulation of genes critical for development.
doi:10.1371/journal.pgen.1000187
PMCID: PMC2522285  PMID: 18787694
16.  Defining the budding yeast chromatin-associated interactome 
We report here the first large-scale affinity purification and mass spectrometry (AP-MS) study of chromatin-associated proteins, in which over 100 different baits involved in chromatin biology were studied by modified chromatin immunopurification (mChIP)-MS. In particular, focus was placed on poorly studied chromatin binding proteins, such as transcription factors, which have been underrepresented in previous AP-MS studies. mChIP-MS analysis of transcription factors identified dense networks of proteins associated with chromatin that were composed of specific transcriptional co-activators, information not accessible through the use of classical AP-MS methods. Finally, we demonstrate that novel protein–protein interactions identified in this study by mChIP have functional implications, exemplified by the detailed study of both the ubiquitination of the proline isomerase Cpr1 and of histone chaperones involved in the regulation of the HTA1-HTB1 promoter. Our work demonstrates the value of targeted interactome studies, in which affinity purification methods are adapted to the needs of specific baits, as is the case for chromatin binding proteins.
The maintenance of cellular fitness requires living organisms to integrate multiple signals into coordinated outputs. Central to this process is the regulation of the expression of the genetic information encoded into DNA. As a result, there are numerous constraints imposed on gene expression. The access to DNA is restricted by the formation of nucleosomes, in which DNA is wrapped around histone octamers to form chromatin wherein the volume of DNA is considerably reduced. As such, nucleosome positioning is critical and must be defined precisely, particularly during transcription (Workman, 2006). Furthermore, nucleosomes can be actively assembled/disassembled by histone chaperones and can be made to ‘slide' along DNA by the actions of chromatin remodelers. Moreover, the histone proteins are heavily regulated at the expression level and by extensive post-translational modifications (PTMs) (Campos and Reinberg, 2009). Histone PTMs have also been shown to help recruit numerous chromatin-associated factors in accordance with the histone code (Strahl and Allis, 2000). Although our understanding of chromatin and its roles has improved, we still have limited knowledge of the chromatin-associated protein complexes and their interactions.
The characterization of biological systems and of specific subdomains within them, such as chromatin, remains a difficult task. An efficient approach to gain insight into the function of a protein is to define its interactome. The underlying principle of protein interaction mapping is that proteins found to interact must be involved in common processes and localization, i.e., guilt by association. The large-scale mapping of protein interactions makes it possible to annotate proteins of unknown function, implicate proteins of known function in different processes and derive new hypotheses. This is possible because most proteins do not act in isolation but rather as part of complexes, and thus possess interaction partners that can now be detected with the right tools. AP-MS has emerged as a powerful tool for characterizing protein–protein interactions and biological systems in general (Gingras et al, 2007; Gstaiger and Aebersold, 2009).
Recently, we reported the development of a novel affinity purification approach termed mChIP, which was designed to improve the characterization of the interactomes of DNA binding proteins (Lambert et al, 2009). The mChIP method consists of a single affinity purification step, whereby chromatin-associated proteins are isolated from mildly sonicated and gently clarified cellular extracts using magnetic beads coated with antibodies (Lambert et al, 2009; Figure 1A). As such, the mChIP approach maintains chromatin fragments in solution, enabling their specific purification, something not previously possible in classical AP-MS methods (Lambert et al, 2009).
In this study, we report the utilization of mChIP followed by MS for the characterization of more than 100 proteins and their associated protein networks (Figure 1B). We initially focused on DNA-associated proteins that had been poorly characterized in past AP-MS studies, such as transcription factors. In addition, many histone modifiers, such as lysine acetyl transferases (KAT) and lysine methyl transferases, critical components of chromatin function and regulation, were also studied by mChIP. This resulted in raw non-redundant mChIP-MS data containing ∼9000 protein–protein interactions between ∼900 proteins. Following a two-step curation process designed to remove common contaminants and proteins not specifically associated with the baits under study, a high confidence mChIP-MS data set was produced containing 2966 protein–protein interactions between 724 proteins (Figure 1B). It is important to note that our curation strategy was capable of maintaining the majority of the protein–protein interactions identified in previous AP-MS studies, while removing the bulk of protein–protein interactions not related to chromatin biology. Further analysis of the mChIP-MS data set revealed that for most baits tested, mChIP-MS resulted in the identification of more interaction partners than classical TAP-MS.
Visualization of the mChIP-MS data set was achieved by generating heat maps from two-dimensional hierarchical clustering of the bait–prey interactions. This revealed numerous clusters within our data set supporting functional relationships. For instance, in the mChIP analysis, the highly homologous heat-shock-inducible transcription factors Msn2 and Msn4 clustered with different transcriptional co-activators. Importantly, our analysis also revealed key differences in the co-activators associated with Msn2 and Msn4 relevant to their function. Another example that we explore in greater detail is the Cpr1 proline isomerase, a known member of the Set3 complex (Pijnappel et al, 2001). mChIP-MS analysis of Cpr1 revealed an extended network of associated proteins, including the E3 ubiquitin ligase Bre1 and its association partner Lge1 (Figure 5A). This association raised the possibility of a direct action of Bre1/Lge1 on Cpr1 to ubiquitinate it. In targeted experiments, we observed that Cpr1 is in fact ubiquitinated in a process involving Bre1/Lge1 (Figure 5E), confirming their functional relationship. As such, mChIP is capable of uncovering novel protein–protein interactions with physiological impacts.
In this study, we report how the use of an AP-MS method designed for a given class of proteins (chromatin-associated proteins) can help uncover numerous novel protein–protein interactions. Furthermore, our work detected dense chromatin-associated protein networks being co-purified with multiple transcription factors and other DNA binding proteins. The fact that even in the best-characterized model organism Saccharomyces cerevisiae, thousands of novel protein–protein interactions can be detected supports our view that targeted interactome studies are worthwhile and desirable. As such, the budding yeast interactome can still be considered incomplete and warrants further study.
We previously reported a novel affinity purification (AP) method termed modified chromatin immunopurification (mChIP), which permits selective enrichment of DNA-bound proteins along with their associated protein network. In this study, we report a large-scale study of the protein network of 102 chromatin-related proteins from budding yeast that were analyzed by mChIP coupled to mass spectrometry. This effort resulted in the detection of 2966 high confidence protein associations with 724 distinct preys. mChIP resulted in significantly improved interaction coverage as compared with classical AP methodology for ∼75% of the baits tested. Furthermore, mChIP successfully identified novel binding partners for many lower abundance transcription factors for which conventional AP methodologies had previously failed. mChIP was also used to perform targeted studies, particularly of Asf1 and its associated proteins, to gain an understanding of the physical interplay between Asf1 and two other histone chaperones, Rtt106 and the HIR complex.
doi:10.1038/msb.2010.104
PMCID: PMC3018163  PMID: 21179020
affinity purification; chromatin-associated protein networks; mass spectrometry; nucleosome assembly factor Asf1; protein–DNA interaction
17.  Identification and Properties of 1,119 Candidate LincRNA Loci in the Drosophila melanogaster Genome 
Genome Biology and Evolution  2012;4(4):427-442.
The functional repertoire of long intergenic noncoding RNA (lincRNA) molecules has begun to be elucidated in mammals. Determining the biological relevance and potential gene regulatory mechanisms of these enigmatic molecules would be expedited in a more tractable model organism, such as Drosophila melanogaster. To this end, we defined a set of 1,119 putative lincRNA genes in D. melanogaster using modENCODE whole transcriptome (RNA-seq) data. A large majority (1.1 of 1.3 Mb; 85%) of these bases were not previously reported by modENCODE as being transcribed. Significant selective constraint on the sequences of these loci predicts that virtually all have sustained functionality across the Drosophila clade. We observe biases in lincRNA genomic locations and expression profiles that are consistent with some of these lincRNAs being involved in the regulation of neighboring protein-coding genes with developmental functions. We identify lincRNAs that may be important in the developing nervous system and in male-specific organs, such as the testes. LincRNA loci were also identified whose positions, relative to nearby protein-coding loci, are equivalent between D. melanogaster and mouse. This study predicts that the genomes of not only vertebrates, such as mammals, but also an invertebrate (fruit fly) harbor large numbers of lincRNA loci. Our findings now permit exploitation of Drosophila genetics for the investigation of lincRNA mechanisms, including lincRNAs with potential functional analogues in mammals.
doi:10.1093/gbe/evs020
PMCID: PMC3342871  PMID: 22403033
long intergenic noncoding RNAs; modENCODE; transcriptional regulation; evolution; development
18.  Genome-Wide Tissue-Specific Gene Expression, Co-expression and Regulation of Co-expressed Genes in Adult Nematode Ascaris suum 
Background
Caenorhabditis elegans has traditionally been used as a model for studying nematode biology, but its small size limits the ability of researchers to perform some experiments such as high-throughput tissue-specific gene expression studies. However, the dissection of individual tissues is possible in the parasitic nematode Ascaris suum due to its relatively large size. Here, we take advantage of the recent genome sequencing of Ascaris suum and the ability to physically dissect its separate tissues to produce wide-scale tissue-specific nematode RNA-seq datasets, including data on three non-reproductive tissues (head, pharynx, and intestine) in both male and female worms, as well as four reproductive tissues (testis, seminal vesicle, ovary, and uterus). We obtained fundamental information about the biology of diverse cell types and potential interactions among tissues within this multicellular organism.
Methodology/Principal Findings
Overexpression and functional enrichment analyses identified many putative biological functions enriched in each tissue studied, including functions which have not been previously studied in detail in nematodes. Putative tissue-specific transcription factors and corresponding binding motifs that regulate expression in each tissue were identified, including the intestine-enriched ELT-2 motif/transcription factor previously described in nematode intestines. Constitutively expressed and novel genes were also characterized, with the largest number of novel genes found to be overexpressed in the testis. Finally, a putative acetylcholine-mediated transcriptional network connecting biological activity in the head to the male reproductive system is described using co-expression networks, along with a similar ecdysone-mediated system in the female.
Conclusions/Significance
The expression profiles, co-expression networks and co-expression regulation of the 10 tissues studied and the tissue-specific analysis presented here are a valuable resource for studying tissue-specific biological functions in nematodes.
Author Summary
Tissue-specific gene expression provides fundamental information about the biology of diverse cell types within an organism and interactions among tissues within multicellular organisms. However, such studies are experimentally challenging in smaller organisms such as many nematode species, including the species (Caenorhabditis elegans) that is widely used in biomedical research. Ascaris suum (the large roundworm of swine), however, is of particular interest as a model nematode because it is large enough to allow for the dissection of individual tissues, and equally important because it is closely related to A. lumbricoides, which infects ∼1 billion people worldwide. Here, we build significantly on the previous tissue-specific gene expression research in A. suum by producing the first nematode RNA-seq dataset that spans multiple specific tissues, including three non-reproductive and two reproductive tissues in both male and female A. suum worms. This analysis provides significant details on the biological functions occurring within each of these tissues, which have not been previously explored. It also provides insight into specific gene regulation pathways active in each of the tissues, which have broad applicability across other nematodes, including both non-parasitic and parasitic species.
doi:10.1371/journal.pntd.0002678
PMCID: PMC3916258  PMID: 24516681
19.  H4K20me1 Contributes to Downregulation of X-Linked Genes for C. elegans Dosage Compensation 
PLoS Genetics  2012;8(9):e1002933.
The Caenorhabditis elegans dosage compensation complex (DCC) equalizes X-chromosome gene dosage between XO males and XX hermaphrodites by two-fold repression of X-linked gene expression in hermaphrodites. The DCC localizes to the X chromosomes in hermaphrodites but not in males, and some subunits form a complex homologous to condensin. The mechanism by which the DCC downregulates gene expression remains unclear. Here we show that the DCC controls the methylation state of lysine 20 of histone H4, leading to higher H4K20me1 and lower H4K20me3 levels on the X chromosomes of XX hermaphrodites relative to autosomes. We identify the PR-SET7 ortholog SET-1 and the Suv4-20 ortholog SET-4 as the major histone methyltransferases for monomethylation and di/trimethylation of H4K20, respectively, and provide evidence that X-chromosome enrichment of H4K20me1 involves inhibition of SET-4 activity on the X. RNAi knockdown of set-1 results in synthetic lethality with dosage compensation mutants and upregulation of X-linked gene expression, supporting a model whereby H4K20me1 functions with the condensin-like C. elegans DCC to repress transcription of X-linked genes. H4K20me1 is important for mitotic chromosome condensation in mammals, suggesting that increased H4K20me1 on the X may restrict access of the transcription machinery to X-linked genes via chromatin compaction.
Author Summary
In many animals, males have one X chromosome and females have two. However, the same amount of gene expression from X chromosomes is needed in the two sexes. The process of dosage compensation (DC) globally regulates X-chromosome gene expression to make it equal between the sexes, and it occurs in different ways in different animals. In mammals, one X chromosome in females is randomly inactivated, leaving one active X chromosome. In contrast, in the nematode worm C. elegans, the two X chromosomes in hermaphrodites are repressed two-fold to match gene expression to the single X chromosome in males. Previous work in C. elegans identified proteins required for DC that bind to the X chromosome, but their mode of action is not known. Here we show that DC proteins lead to higher levels of histone H4 lysine 20 monomethylation (H4K20me1) on hermaphrodite X chromosomes and that H4K20me1 functions in repressing X-chromosome gene expression. This shows that histone modification is an important aspect of the mechanism of dosage compensation. Together with previous work linking H4K20me1 to chromatin structure regulation, our results suggest that dosage compensation might lower gene expression on hermaphrodite X chromosomes by compacting them.
doi:10.1371/journal.pgen.1002933
PMCID: PMC3441679  PMID: 23028348
20.  Identification and characterization of alternative splicing in parasitic nematode transcriptomes 
Parasites & Vectors  2014;7:151.
Background
Alternative splicing (AS) of mRNA is a vital mechanism for enhancing genomic complexity in eukaryotes. Spliced isoforms of the same gene can have diverse molecular and biological functions and are often differentially expressed across various tissues, times, and conditions. Thus, AS has important implications in the study of parasitic nematodes with complex life cycles. Transcriptomic datasets are available from many species, but data must be revisited with splice-aware assembly protocols to facilitate the study of AS in helminths.
Methods
We sequenced cDNA from the model worm Caenorhabditis elegans using 454/Roche technology for use as an experimental dataset. Reads were assembled with Newbler software, invoking the cDNA option. Several combinations of parameters were tested and assembled transcripts were verified by comparison with previously reported C. elegans genes and transcript isoforms and with Illumina RNAseq data.
Results
Thoughtful adjustment of program parameters increased the percentage of assembled transcripts that matched known C. elegans sequences, decreased mis-assembly rates (i.e., cis- and trans-chimeras), and improved the coverage of the geneset. The optimized protocol was used to update de novo transcriptome assemblies from nine parasitic nematode species, including important pathogens of humans and domestic animals. Our assemblies indicated AS rates in the range of 20-30%, typically with 2-3 transcripts per AS locus, depending on the species. Transcript isoforms from the nine species were translated and searched for similarity to known proteins and functional domains. Some 21 InterPro domains, including several involved in nucleotide and chromatin binding, were statistically correlated with AS genetic loci. In most cases, the Roche/454 data explored in this study are the only sequences available from the species in question; however, the recently published genome of the human hookworm Necator americanus provided an additional opportunity to validate our results.
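The two summary figures quoted above (an AS rate and the number of transcripts per AS locus) can be computed from any locus-to-transcript mapping. The sketch below is a minimal illustration under stated assumptions: the function name, the toy loci and the rule "a locus is AS if it has more than one distinct transcript" are hypothetical and do not reproduce how the study parsed its Newbler assemblies.

```python
def as_summary(locus_to_transcripts):
    """Return (AS rate, mean transcripts per AS locus).

    Assumption: a locus counts as alternatively spliced (AS) when it has
    more than one distinct assembled transcript.
    """
    counts = [len(set(ts)) for ts in locus_to_transcripts.values()]
    as_counts = [c for c in counts if c > 1]
    rate = len(as_counts) / len(counts) if counts else 0.0
    mean_isoforms = sum(as_counts) / len(as_counts) if as_counts else 0.0
    return rate, mean_isoforms

# Hypothetical toy assembly: 8 loci, 2 of which have multiple isoforms.
toy = {
    "locus1": ["t1"],
    "locus2": ["t2a", "t2b"],
    "locus3": ["t3"],
    "locus4": ["t4a", "t4b", "t4c"],
    "locus5": ["t5"],
    "locus6": ["t6"],
    "locus7": ["t7"],
    "locus8": ["t8"],
}
rate, mean_isoforms = as_summary(toy)
print(f"AS rate: {rate:.0%}, transcripts per AS locus: {mean_isoforms:.1f}")
# prints "AS rate: 25%, transcripts per AS locus: 2.5"
```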
Conclusions
Our optimized assembly parameters facilitated the first survey of AS among parasitic nematodes. The nine transcriptome assemblies, their protein translations, and basic annotations are available from Nematode.net as a resource for the research community. These should be useful for studies of specific genes and gene families of interest as well as for curating draft genome assemblies as they become available.
doi:10.1186/1756-3305-7-151
PMCID: PMC3997825  PMID: 24690220
Parasitic nematodes; Transcriptomes; Alternative splicing; Next-generation sequencing
21.  Caenorhabditis elegans chromosome arms are anchored to the nuclear membrane via discontinuous association with LEM-2 
Genome Biology  2010;11(12):R120.
Background
Although Caenorhabditis elegans was the first multicellular organism with a completely sequenced genome, how this genome is arranged within the nucleus is not known.
Results
We determined the genomic regions associated with the nuclear transmembrane protein LEM-2 in mixed-stage C. elegans embryos via chromatin immunoprecipitation. Large regions of several megabases on the arms of each autosome were associated with LEM-2. The center of each autosome was mostly free of such interactions, suggesting that these central regions are largely looped out from the nuclear membrane. Only the left end of the X chromosome was associated with the nuclear membrane. At a finer scale, the large membrane-associated domains consisted of smaller subdomains of LEM-2 associations. These subdomains were characterized by high repeat density, low gene density, high levels of H3K27 trimethylation, and silent genes. The subdomains were punctuated by gaps harboring highly active genes. A chromosome arm translocated to a chromosome center retained its association with LEM-2, although there was a slight decrease in association near the fusion point.
Conclusions
Local DNA or chromatin properties are the main determinants of interaction with the nuclear membrane, with position along the chromosome making a minor contribution. Genes in small gaps between LEM-2 associated regions tend to be highly expressed, suggesting that these small gaps are especially amenable to highly efficient transcription. Although our data are derived from an amalgamation of cell types in mixed-stage embryos, the results suggest a model for the spatial arrangement of C. elegans chromosomes within the nucleus.
doi:10.1186/gb-2010-11-12-r120
PMCID: PMC3046480  PMID: 21176223
22.  Efficient yeast ChIP-Seq using multiplex short-read DNA sequencing 
BMC Genomics  2009;10:37.
Background
Short-read high-throughput DNA sequencing technologies provide new tools to answer biological questions. However, high cost and low throughput limit their widespread use, particularly in organisms with smaller genomes such as S. cerevisiae. Although ChIP-Seq in mammalian cell lines is replacing array-based ChIP-chip as the standard for transcription factor binding studies, ChIP-Seq in yeast is still underutilized compared to ChIP-chip. We developed a multiplex barcoding system that allows simultaneous sequencing and analysis of multiple samples using Illumina's platform. We applied this method to analyze the chromosomal distributions of three yeast DNA binding proteins (Ste12, Cse4 and RNA Pol II) and a reference sample (input DNA) in a single experiment and demonstrate its utility for rapid and accurate results at reduced costs.
Results
We developed a barcoding ChIP-Seq method for the concurrent analysis of transcription factor binding sites in yeast. Our multiplex strategy generated high-quality data that were indistinguishable from data obtained with non-barcoded libraries. None of the barcoded adapters induced differences relative to a non-barcoded adapter when applied to the same DNA sample. We used this method to map the binding sites for Cse4, Ste12 and Pol II throughout the yeast genome and found 148 binding targets for Cse4, 823 targets for Ste12 and 2508 targets for Pol II. Cse4 was strongly bound to all yeast centromeres, as expected, and the remaining non-centromeric targets correspond to genes that are highly expressed in rich media. The presence of Cse4 non-centromeric binding sites had not been reported previously.
Conclusion
We designed a multiplex short-read DNA sequencing method to perform efficient ChIP-Seq in yeast and other small genome model organisms. This method produces accurate results with higher throughput and reduced cost. Given constant improvements in high-throughput sequencing technologies, increasing multiplexing will be possible to further decrease costs per sample and to accelerate the completion of large consortium projects such as modENCODE.
doi:10.1186/1471-2164-10-37
PMCID: PMC2656530  PMID: 19159457
23.  Holocentromeres are dispersed point centromeres localized at transcription factor hotspots 
eLife  2014;3:e02025.
Centromeres vary greatly in size and sequence composition, ranging from ‘point’ centromeres with a single cenH3-containing nucleosome to ‘regional’ centromeres embedded in tandemly repeated sequences to holocentromeres that extend along the length of entire chromosomes. Point centromeres are defined by sequence, whereas regional and holocentromeres are epigenetically defined by the location of cenH3-containing nucleosomes. In this study, we show that Caenorhabditis elegans holocentromeres are organized as dispersed but discretely localized point centromeres, each forming a single cenH3-containing nucleosome. These centromeric sites co-localize with kinetochore components, and their occupancy is dependent on the cenH3 loading machinery. These sites coincide with non-specific binding sites for multiple transcription factors (‘HOT’ sites), which become occupied when cenH3 is lost. Our results show that the point centromere is the basic unit of holocentric organization in support of the classical polycentric model for holocentromeres, and provide a mechanistic basis for understanding how centromeric chromatin might be maintained.
DOI: http://dx.doi.org/10.7554/eLife.02025.001
eLife digest
During cell division, the chromosomes in the original cell must be replicated and these ‘sister chromosomes’ must then be divided equally between the two new daughter cells. At first, the sister chromosomes are held together near a region called the centromere, which is important because the microtubules that pull the sister chromosomes apart attach themselves to the centromere. In many cases, the centromere is a small region near the middle of the chromosomes, which produces a classic X shape. However, in some organisms centromeres span the entire length of the chromosomes. There are at least 13 plant and animal lineages with such holocentromeres.
Inside the nucleus of cells, DNA is wrapped around molecules called histones. There are five major families of histones, and histones belonging to one of these families—the H3 histones—are replaced by cenH3 variant histones at both conventional centromeres and holocentromeres. There are many unanswered questions about holocentromeres. In particular, do holocentromeres truly extend along the full length of the chromosomes, or are they found at a large number of specific sites?
Now Steiner and Henikoff have studied the distribution of cenH3 in the genome of the worm C. elegans to investigate holocentromeres in greater detail. These experiments showed that the holocentromere in C. elegans is actually made of about 700 individual centromeric sites distributed along the length of the chromosomes. Each of these sites contains just one nucleosome that contains cenH3, and these sites are likely to be the sites that microtubules attach to during cell division. Surprisingly, the same sites can also act as so-called ‘HOT sites’: these sites are bound by many proteins that are involved in regulating the process by which genes are expressed as proteins, which suggests a link between centromeres and these regulatory proteins.
The work of Steiner and Henikoff describes how centromeric nucleosomes are distributed across the genome, but why and how cenH3 ends up at these particular 700 sites remains an open question.
DOI: http://dx.doi.org/10.7554/eLife.02025.002
doi:10.7554/eLife.02025
PMCID: PMC3975580  PMID: 24714495
CenH3; CENP-A; holocentromere; point centromere; transcription factor hotspots; C. elegans
24.  Caenorhabditis elegans Histone Methyltransferase MET-2 Shields the Male X Chromosome from Checkpoint Machinery and Mediates Meiotic Sex Chromosome Inactivation 
PLoS Genetics  2011;7(9):e1002267.
Meiosis is a specialized form of cellular division that results in the precise halving of the genome to produce gametes for sexual reproduction. Checkpoints function during meiosis to detect errors and subsequently to activate a signaling cascade that prevents the formation of aneuploid gametes. Indeed, asynapsis of a homologous chromosome pair elicits a checkpoint response that can in turn trigger germline apoptosis. In a heterogametic germ line, however, sex chromosomes proceed through meiosis with unsynapsed regions and are not recognized by checkpoint machinery. We conducted a directed RNAi screen in Caenorhabditis elegans to identify regulatory factors that prevent recognition of heteromorphic sex chromosomes as unpaired and uncovered a role for the SET domain histone H3 lysine 9 histone methyltransferase (HMTase) MET-2 and two additional HMTases in shielding the male X from checkpoint machinery. We found that MET-2 also mediates the transcriptional silencing program of meiotic sex chromosome inactivation (MSCI) but not meiotic silencing of unsynapsed chromatin (MSUC), suggesting that these processes are distinct. Further, MSCI and checkpoint shielding can be uncoupled, as double-strand breaks targeted to an unpaired, transcriptionally silenced extra-chromosomal array induce checkpoint activation in germ lines depleted for met-2. In summary, our data uncover a mechanism by which repressive chromatin architecture enables checkpoint proteins to distinguish between the partnerless male X chromosome and asynapsed chromosomes thereby shielding the lone X from inappropriate activation of an apoptotic program.
Author Summary
Meiosis results in the generation of non-identical haploid gametes and maintenance of chromosome number during sexual reproduction. Precise meiotic chromosome segregation is essential for life, and in humans errors in this process contribute to aneuploidy or failure in meiosis, which manifests as spontaneous abortions or infertility. Cellular surveillance pathways monitor the steps of meiosis; and, if homologous chromosomes fail to pair and recombine, checkpoint machinery responds by eliciting signals to induce apoptosis. However, in many species males possess a single X chromosome that is transcriptionally silenced, accumulates repressive histone marks, and is not recognized as partnerless by meiotic checkpoints. Here, we used C. elegans to investigate how the male X is precluded from checkpoint signaling and uncovered a role for conserved chromatin-remodeling proteins that block checkpoints and mediate meiotic silencing. Our data elucidate the molecular mechanisms by which chromatin architecture influences both transcriptional silencing and checkpoint response to breaks on unpaired sex chromosomes, and we propose a model by which repressive chromatin modifiers directly block meiotic checkpoints from accessing the male X chromosome.
doi:10.1371/journal.pgen.1002267
PMCID: PMC3164706  PMID: 21909284
25.  Dynamic Chromatin Organization during Foregut Development Mediated by the Organ Selector Gene PHA-4/FoxA 
PLoS Genetics  2010;6(8):e1001060.
Central regulators of cell fate, or selector genes, establish the identity of cells by direct regulation of large cohorts of genes. In Caenorhabditis elegans, foregut (or pharynx) identity relies on the FoxA transcription factor PHA-4, which activates different sets of target genes at various times and in diverse cellular environments. An outstanding question is how PHA-4 distinguishes between target genes for appropriate transcriptional control. We have used the Nuclear Spot Assay and GFP reporters to examine PHA-4 interactions with target promoters in living embryos and with single cell resolution. While PHA-4 was found throughout the digestive tract, binding and activation of pharyngeally expressed promoters was restricted to a subset of pharyngeal cells and excluded from the intestine. An RNAi screen of candidate nuclear factors identified emerin (emr-1) as a negative regulator of PHA-4 binding within the pharynx, but emr-1 did not modulate PHA-4 binding in the intestine. Upon promoter association, PHA-4 induced large-scale chromatin de-compaction, which, we hypothesize, may facilitate promoter access and productive transcription. Our results reveal two tiers of PHA-4 regulation. PHA-4 binding is prohibited in intestinal cells, preventing target gene expression in that organ. PHA-4 binding within the pharynx is limited by the nuclear lamina component EMR-1/emerin. The data suggest that association of PHA-4 with its targets is a regulated step that contributes to promoter selectivity during organ formation. We speculate that global re-organization of chromatin architecture upon PHA-4 binding promotes competence of pharyngeal gene transcription and, by extension, foregut development.
Author Summary
Central regulators of cell fate establish the identity of cells by direct regulation of large cohorts of genes. In Caenorhabditis elegans, foregut (or pharynx) identity relies on the FoxA transcription factor PHA-4, which activates different target genes in different cellular environments. An outstanding question is how PHA-4 distinguishes between target genes for appropriate transcriptional control. Here we examine PHA-4 interactions with target promoters in living embryos and with single-cell resolution. While PHA-4 was found throughout the digestive tract, binding and activation of pharyngeally expressed promoters was restricted to a subset of pharyngeal cells and excluded from the intestine. An RNAi screen identified emerin (emr-1) as a negative regulator of PHA-4 binding within the pharynx. Upon promoter association, PHA-4 induced large-scale chromatin de-compaction, which, we hypothesize, facilitates promoter access. Our results reveal two tiers of PHA-4 regulation. PHA-4 binding is prohibited in intestinal cells and is limited in the pharynx by the nuclear lamina component EMR-1/emerin. The data suggest that association of PHA-4 with its targets is a regulated step that contributes to promoter selectivity during organ formation. We speculate that global re-organization of chromatin architecture upon PHA-4 binding promotes competence of pharyngeal gene transcription and, by extension, foregut development.
doi:10.1371/journal.pgen.1001060
PMCID: PMC2920861  PMID: 20714352
