The C. elegans cell lineage provides a unique opportunity to look at how cell lineage affects patterns of gene expression. We developed an automatic cell lineage analyzer that converts high-resolution images of worms into a data table showing fluorescence expression with single cell resolution. We generated expression profiles of 93 genes in 363 specific cells from L1 stage larvae and found that cells with identical fates can be formed by different gene regulatory pathways. We used molecular signatures to find repeating cell fate modules within the cell lineage and to create a molecular differentiation map, which shows points in the cell lineage when developmental fates of daughter cells begin to diverge. These results demonstrate insights that become possible using computational approaches to analyze quantitative expression from many genes in parallel using a digital gene expression atlas.
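As a minimal sketch of the idea behind a molecular differentiation map, the snippet below compares the expression profiles of sister cells and reports how far they have diverged. The table contents and the helper names `parent_of` and `sister_divergence` are hypothetical, and Euclidean distance is an assumed stand-in for whatever measure the analysis actually uses. The rule that a daughter's name is its parent's name plus one letter applies to sublineage names such as ABal, not to founder cells.

```python
import math
from collections import defaultdict

# Hypothetical expression table: cell name -> expression levels for three genes.
# Values are invented for illustration only.
expression = {
    "ABala": [1.0, 0.0, 2.0],
    "ABalp": [1.0, 0.0, 2.0],
    "ABara": [4.0, 0.0, 0.0],
    "ABarp": [0.0, 3.0, 0.0],
}

def parent_of(cell):
    """Assumed naming rule: a daughter's name is its parent's name plus one letter."""
    return cell[:-1]

def sister_divergence(table):
    """Return {parent: Euclidean distance between its two daughters' profiles}."""
    daughters = defaultdict(list)
    for cell in sorted(table):
        daughters[parent_of(cell)].append(cell)
    result = {}
    for parent, cells in daughters.items():
        if len(cells) != 2:
            continue  # skip parents without exactly two profiled daughters
        a, b = (table[c] for c in cells)
        result[parent] = math.dist(a, b)
    return result

divergence = sister_divergence(expression)
for parent, d in sorted(divergence.items()):
    print(f"{parent}: sister divergence = {d:.2f}")
```

In this toy table the ABal daughters share a profile while the ABar daughters differ, so only the ABar division would be flagged as a point where fates begin to diverge.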
Discovering the structure and dynamics of transcriptional regulatory events in the genome with cellular and temporal resolution is crucial to understanding the regulatory underpinnings of development and disease. We determined the genomic distribution of binding sites for 92 transcription factors (TFs) and regulatory proteins across multiple stages of C. elegans development by performing 241 ChIP-seq experiments. Integrating regulatory binding and cellular-resolution expression data yielded a spatiotemporally-resolved metazoan TF binding map. Using this map, we explore developmental regulatory circuits that encode combinatorial logic at the levels of co-binding and co-expression of TFs, characterizing (1) the genomic coverage and clustering of regulatory binding, (2) the binding preferences of and biological processes regulated by TFs, (3) the global TF co-associations and genomic subdomains that suggest shared patterns of regulation, and (4) key TFs and TF co-associations for fate specification of individual lineages and cell-types.
Transcription Factor; Gene Regulation; ChIP-seq; Cellular Expression; Development
The spatial and temporal control of transgene expression is an important tool in C. elegans biology. We previously described a method for evoking gene expression in arbitrary cells by using a focused pulsed infrared laser to induce a heat shock response (Churgin et al., 2013). Here we describe detailed methods for building and testing a system for performing single-cell heat shock. Steps include setting up the laser and associated components, coupling the laser beam to a microscope, and testing heat shock protocols. All steps can be carried out using readily available off-the-shelf components.
C. elegans; transgenes; heat shock; lasers
The coupling of transgenes to heat shock promoters is a widely applied method for regulating gene expression. In C. elegans, gene induction can be controlled temporally through timing of heat shock and spatially via targeted rescue in heat shock mutants. Here, we present a method for evoking gene expression in arbitrary cells, with single-cell resolution. We use a focused pulsed infrared laser to locally induce a heat shock response in specific cells. Our method builds on and extends a previously reported method using a continuous-wave laser. In our technique, the pulsed laser illumination enables a much higher degree of spatial selectivity because of diffusion of heat between pulses. We apply our method to induce transient and long-term transgene expression in embryonic, larval, and adult cells. Our method allows highly selective spatiotemporal control of transgene expression and is a powerful tool for model organism biology.
transgene induction; heat shock; C. elegans
Cells perform a wide variety of functions that are facilitated, in part, by adopting unique shapes. Many of the genes and pathways that promote cell fate specification have been elucidated. However, relatively few transcription factors have been identified that promote shape acquisition after fate specification. Here we show that the Nkx5/HMX homeodomain protein MLS-2 is required for cellular elongation and shape maintenance of two tubular epithelial cells in the C. elegans excretory system, the duct and pore cells. The Nkx5/HMX family is highly conserved from sea urchins to humans, with known roles in neuronal and glial development. MLS-2 is expressed in the duct and pore, and defects in mls-2 mutants first arise when the duct and pore normally adopt unique shapes. MLS-2 cooperates with the EGF-Ras-ERK pathway to turn on the LIN-48/Ovo transcription factor in the duct cell during morphogenesis. These results reveal a novel interaction between the Nkx5/HMX family and the EGF-Ras pathway and implicate a transcription factor, MLS-2, as a regulator of cell shape.
Caenorhabditis elegans; tubulogenesis; morphogenesis; cytoskeleton
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
MicroRNAs (miRNAs) have been found to regulate gene expression across eukaryotic species, but the function of most miRNA genes remains unknown. Here we describe how the analysis of the expression patterns of a well-conserved miRNA gene, mir-57, at cellular resolution for every minute during early development of Caenorhabditis elegans provided key insights into its function. Remarkably, mir-57 expression shows strong positional bias but little tissue specificity, a pattern reminiscent of Hox gene function. Despite the minor defects produced by a loss of function mutation, overexpression of mir-57 causes dramatic posterior defects, which also mimic the phenotypes of mutant alleles of a posterior Hox gene, nob-1, an Abd homolog. More importantly, nob-1 expression is found in the same two posterior AB sublineages as those expressing mir-57 but with an earlier onset. Intriguingly, nob-1 functions as an activator for mir-57 expression; it is also a direct target of mir-57. In agreement with this, loss of mir-57 function partially rescues the nob-1 allele defects, indicating a negative feedback regulatory loop between the miRNA and Hox gene to provide positional cues. Given the conservation of the miRNA and Hox gene, the regulatory mechanism might be broadly used across species. The strategy used here to explore mir-57 function provides a path to dissect the regulatory relationship between genes.
miRNAs are small RNAs found in many multicellular species that inhibit gene expression. Many of them play important roles in cancer and cell fate determination, but the function of most miRNAs is uncertain. Using live cell imaging and automated expression analysis, we found that a miRNA gene, mir-57, is expressed in a position-dependent rather than tissue-dependent manner. Hox genes also regulate cell fate patterning along the anterior-posterior (a-p) axis across different tissues. By investigating interactions between genes of these classes expressed in mir-57 expressing cells, we demonstrated by both genetic analysis and gene expression assays that a negative feedback loop between a posterior Hox gene, nob-1, and mir-57 regulates posterior cell fate determination in C. elegans. On the one hand, the Hox gene is required for normal activation of mir-57 expression, and on the other, the Hox gene functions as a direct target of and is repressed by the miRNA. Given the conservation of the two genes, a negative feedback loop between Hox and miRNA genes might be broadly used across species to regulate cell fate along the a-p axis. Detailed expression analysis may provide a general way to dissect the regulatory role of miRNAs.
Transcription factors are key components of regulatory networks that control development, as well as the response to environmental stimuli. We have established an experimental pipeline in Caenorhabditis elegans that permits global identification of the binding sites for transcription factors using chromatin immunoprecipitation and deep sequencing. We describe and validate this strategy, and apply it to the transcription factor PHA-4, which plays critical roles in organ development and other cellular processes. We identified thousands of binding sites for PHA-4 during formation of the embryonic pharynx, and also found a role for this factor during the starvation response. Many binding sites were found to shift dramatically between embryos and starved larvae, from developmentally regulated genes to genes involved in metabolism. These results indicate distinct roles for this regulator in two different biological processes and demonstrate the versatility of transcription factors in mediating diverse biological roles.
The C. elegans transcription factor PHA-4 is a member of the highly conserved FOXA family of transcription factors. These factors act as master regulators of organ development by controlling how genes are turned off and on as tissues are formed. Additionally, they regulate genes in response to nutrient levels and control both longevity and survival of the organism. However, the extent to which these factors control similar or distinct gene targets for each of these functions is unknown. For this reason, we have used chromatin immunoprecipitation followed by deep sequencing (ChIP–Seq) to define the target binding sites of PHA-4 on a genome-wide scale, when it is functioning either as an organ identity regulator or in response to environmental stress. Our data clearly demonstrate distinct sets of biologically relevant target genes for the transcription factor PHA-4 under these two different conditions. Not only have we defined PHA-4 targets, but we have also established an experimental ChIP–Seq pipeline to facilitate the identification of binding sites for many transcription factors in the future.
Image analysis is an essential component in many biological experiments that study gene expression, cell cycle progression, and protein localization. A protocol for tracking the expression of individual C. elegans genes was developed that collects image samples of a developing embryo by 3-D time lapse microscopy. In this protocol, a program called StarryNite performs the automatic recognition of fluorescently labeled cells and traces their lineage. However, due to the amount of noise present in the data and the challenges introduced by the increasing number of cells in later stages of development, this program is not error free. In the current version, the error correction (i.e., editing) is performed manually using a graphical interface tool named AceTree, which was developed specifically for this task. For a single experiment, this manual annotation task takes several hours.
In this paper, we reduce the time required to correct errors made by StarryNite. We target one of the most frequent error types (movements annotated as divisions) and train a support vector machine (SVM) classifier to decide whether a division call made by StarryNite is correct or not. We show, via cross-validation experiments on several benchmark data sets, that the SVM successfully identifies this type of error. A new version of StarryNite that includes the trained SVM classifier is available at http://starrynite.sourceforge.net.
We demonstrate the utility of a machine learning approach to error annotation for StarryNite. In the process, we also provide some general methodologies for developing and validating a classifier with respect to a given pattern recognition task.
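The cross-validation procedure mentioned above can be sketched as follows. This is a simplified illustration, not the StarryNite implementation: a nearest-centroid rule stands in for the SVM so that the example needs only the standard library, and the two features per division call, the labels, and all data values are invented.

```python
def centroid(rows):
    """Mean feature vector of a non-empty list of rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def train(features, labels):
    """Nearest-centroid stand-in for the SVM: one centroid per class."""
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    return {y: centroid(rows) for y, rows in by_class.items()}

def predict(model, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: sq_dist(model[y]))

def cross_validate(features, labels, k):
    """k-fold cross-validation; fold i holds out every k-th sample starting at i."""
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, len(features), k))
        train_x = [x for i, x in enumerate(features) if i not in test_idx]
        train_y = [y for i, y in enumerate(labels) if i not in test_idx]
        model = train(train_x, train_y)
        correct += sum(predict(model, features[i]) == labels[i] for i in test_idx)
    return correct / len(features)

# Invented data: label 1 = true division, label 0 = movement miscalled as division.
features = [[0.9, 1.1], [1.0, 0.9], [1.1, 1.0], [0.8, 1.0],
            [0.1, 0.2], [0.2, 0.1], [0.0, 0.1], [0.1, 0.0]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

accuracy = cross_validate(features, labels, k=4)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Each sample is held out exactly once, so the reported accuracy reflects only predictions on data the model did not see during training.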
Comparative genomic analysis of important signaling pathways in C. briggsae and C. elegans reveals both conserved features and differences. To build a framework to address the significance of these features, we determined the C. briggsae embryonic cell lineage using the tools StarryNite and AceTree. We traced both cell divisions and cell positions for all cells through all but the last round of cell division and for selected cells through the final round. We found the lineage to be remarkably similar to that of C. elegans. Not only did the founder cells give rise to similar numbers of progeny, but the relative cell division timing and positions were also largely maintained. These lineage similarities appear to give rise to similar cell fates as judged both by the positions of lineally-equivalent cells and by the patterns of cell deaths in both species. However, some reproducible differences were seen, e.g., the P4 cell cycle length is more than 40% longer in C. briggsae than in C. elegans (p < 0.01). The extensive conservation of embryonic development between such divergent species suggests that substantial evolutionary distance between these two species has not altered these early developmental cellular events, although the developmental defects of trans-species hybrids suggest that the details of the underlying molecular pathways have diverged sufficiently so as to not be interchangeable.
C. briggsae; C. elegans; embryo; cell lineage; signaling pathway
As a fundamental process of development, cell proliferation must be coordinated with other processes such as fate differentiation. Through statistical analysis of individual cell cycle lengths of the first eight out of ten rounds of embryonic cell division in C. elegans, we identified synchronous and invariantly ordered divisions that are tightly associated with fate differentiation. Our results suggest a three-tier model for fate control of cell cycle pace: the primary control of cell cycle pace is established by lineage and the founder cell fate, then fine-tuned by tissue and organ differentiation within each lineage, then further modified by individualization of cells as they acquire unique morphological and physiological roles in the variant body plan. We then set out to identify the pace-setting mechanisms in different fates. Our results suggest that ubiquitin-mediated degradation of CDC-25.1 is a rate-determining step for the E (gut) and P3 (muscle and germline) lineages but not others, even though CDC-25.1 and its apparent decay have been detected in all lineages. Our results demonstrate the power of C. elegans embryogenesis as a model to dissect the interaction between differentiation and proliferation, and an effective approach combining genetic and statistical analysis at single-cell resolution.
statistics; single cell; fate differentiation; cdc25; Skp1-related
The invariant lineage of the nematode Caenorhabditis elegans has potential as a powerful tool for the description of mutant phenotypes and gene expression patterns. We previously described procedures for the imaging and automatic extraction of the cell lineage from C. elegans embryos. That method uses time-lapse confocal imaging of a strain expressing histone-GFP fusions and a software package, StarryNite, that processes the thousands of images and produces output files describing the location and lineage relationship of each nucleus at each time point.
We have developed a companion software package, AceTree, which links the images and the annotations using tree representations of the lineage. This facilitates curation and editing of the lineage. AceTree also contains powerful visualization and interpretive tools, such as space filling models and tree-based expression patterning, that can be used to extract biological significance from the data.
By pairing a fast lineaging program written in C with a user interface program written in Java we have produced a powerful software suite for exploring embryonic development.
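A tree representation of the kind described above might look like the following minimal sketch. It is written in Python for brevity and is not AceTree's actual data model: the `Nucleus` class, its fields, the record format, and the time points are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Nucleus:
    """One node of a lineage tree (hypothetical structure, not AceTree's)."""
    name: str
    birth_time: int  # time point at which the nucleus first appears
    children: List["Nucleus"] = field(default_factory=list)

def build_tree(records):
    """Build a tree from (name, parent_name, birth_time) records.

    Records must list a parent before its daughters; the root has parent None.
    """
    nodes: Dict[str, Nucleus] = {}
    root: Optional[Nucleus] = None
    for name, parent, birth in records:
        node = Nucleus(name, birth)
        nodes[name] = node
        if parent is None:
            root = node
        else:
            nodes[parent].children.append(node)
    return root

def leaves(node):
    """Names of terminal cells below a node, in depth-first order."""
    if not node.children:
        return [node.name]
    out = []
    for child in node.children:
        out.extend(leaves(child))
    return out

# Early divisions of the embryo; the time points are invented.
records = [
    ("P0", None, 0),
    ("AB", "P0", 1), ("P1", "P0", 1),
    ("ABa", "AB", 2), ("ABp", "AB", 2),
    ("EMS", "P1", 3), ("P2", "P1", 3),
]
tree = build_tree(records)
print(leaves(tree))
```

Keeping each nucleus as a node with explicit daughters makes edits local: correcting a miscalled division only requires changing the `children` list of one node.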
Previous work has implicated heat shock transcription factor 1 (HSF1) as the primary transcription factor responsible for the transcriptional response to heat stress in mammalian cells. We characterized the heat shock response of mammalian cells by measuring changes in transcript levels and assaying binding of HSF1 to promoter regions for candidate heat shock genes chosen by a combination of genome-wide computational and experimental methods. We found that many heat-inducible genes have HSF1 binding sites (heat shock elements, HSEs) in their promoters that are bound by HSF1. Surprisingly, for 24 heat-inducible genes, we detected no HSEs and no HSF1 binding. Furthermore, of 182 promoters with likely HSE sequences, we detected HSF1 binding at only 94 of these promoters. Also unexpectedly, we found 48 genes with HSEs in their promoters that are bound by HSF1 but that nevertheless did not show induction after heat shock in the cell types we examined. We also studied the transcriptional response to heat shock in fibroblasts from mice lacking the HSF1 gene. We found 36 genes in these cells that are induced by heat as well as they are in wild-type cells. These results provide evidence that HSF1 does not regulate the induction of every transcript that accumulates after heat shock, and our results suggest that an independent posttranscriptional mechanism regulates the accumulation of a significant number of transcripts.
The genome-wide program of gene expression during the cell division cycle in a human cancer cell line (HeLa) was characterized using cDNA microarrays. Transcripts of >850 genes showed periodic variation during the cell cycle. Hierarchical clustering of the expression patterns revealed coexpressed groups of previously well-characterized genes involved in essential cell cycle processes such as DNA replication, chromosome segregation, and cell adhesion along with genes of uncharacterized function. Most of the genes whose expression had previously been reported to correlate with the proliferative state of tumors were found herein also to be periodically expressed during the HeLa cell cycle. However, some of the genes periodically expressed in the HeLa cell cycle do not have a consistent correlation with tumor proliferation. Cell cycle-regulated transcripts of genes involved in fundamental processes such as DNA replication and chromosome segregation seem to be more highly expressed in proliferative tumors simply because they contain more cycling cells. The data in this report provide a comprehensive catalog of cell cycle regulated genes that can serve as a starting point for functional discovery. The full dataset is available at http://genome-www.stanford.edu/Human-CellCycle/HeLa/.