RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.
mel-28 (maternal-effect-lethal-28) encodes a conserved protein required for nuclear envelope function and chromosome segregation in Caenorhabditis elegans. Because mel-28 is a strict maternal-effect lethal gene, its function is required in the early embryo but appears to be dispensable for larval development. We wanted to test the idea that mel-28 has postembryonic roles that are buffered by the contributions of other genes. To find genes that act coordinately with mel-28, we performed an RNA interference-based genetic interaction screen using mel-28 and wild-type larvae. We screened 18,364 clones and identified 65 genes that cause sterility in mel-28 but not wild-type worms. Some of these genes encode components of the nuclear pore. In addition, we identified genes involved in dynein and dynactin function, vesicle transport, and cell-matrix attachments. By screening mel-28 larvae, we have bypassed the requirement for mel-28 in the embryo, uncovering pleiotropic functions for mel-28 later in development that are normally provided by other genes. This work contributes toward revealing the gene networks that underlie cellular processes and reveals roles for a maternal-effect lethal gene later in development.
synthetic sterility; C. elegans; gonad; germline; mel-28
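The hit-selection logic of the screen described above (clones that cause sterility in mel-28 but not in wild-type worms) can be sketched as a set difference. This is a minimal illustration only, not the authors' analysis pipeline: the function name and the clone identifiers are hypothetical, and real scoring would involve replicates and thresholds that the abstract does not describe.

```python
def synthetic_sterile_hits(sterile_in_mutant, sterile_in_wild_type):
    """Return clones scored sterile in the mutant background but not in wild type."""
    return sorted(set(sterile_in_mutant) - set(sterile_in_wild_type))


# Hypothetical clone identifiers, for illustration only.
mutant_sterile = ["clone_A", "clone_B", "clone_C"]
wild_type_sterile = ["clone_B"]

# clone_B is sterile in both backgrounds, so it is not a synthetic interaction.
hits = synthetic_sterile_hits(mutant_sterile, wild_type_sterile)
print(hits)
```

Clones that are sterile in both backgrounds are excluded because their effect does not depend on the mel-28 genotype.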
High-content screening for gene profiling has generally been limited to single cells. Here, we explore an alternative approach—profiling gene function by analyzing effects of gene knockdowns on the architecture of a complex tissue in a multicellular organism. We profile 554 essential C. elegans genes by imaging gonad architecture and scoring 94 phenotypic features. To generate a reference for evaluating methods for network construction, genes were manually partitioned into 102 phenotypic classes, predicting functions for uncharacterized genes across diverse cellular processes. Using this classification as a benchmark, we developed a robust computational method for constructing gene networks from high-content profiles based on a network context-dependent measure that ranks the significance of links between genes. Our analysis reveals that multi-parametric profiling in a complex tissue yields functional maps with a resolution similar to genetic interaction-based profiling in unicellular eukaryotes—pinpointing subunits of macromolecular complexes and components functioning in common cellular processes.
We present a hierarchical principle for object recognition and its application to automatically classify developmental stages of C. elegans animals from a population of mixed stages. The object recognition machine consists of four hierarchical layers, each composed of units upon which evaluation functions output a label score, followed by a grouping mechanism that resolves ambiguities in the score by imposing local consistency constraints. Each layer then outputs groups of units, from which the units of the next layer are derived. Using this hierarchical principle, the machine builds up successively more sophisticated representations of the objects to be classified. The algorithm segments large and small objects, decomposes objects into parts, extracts features from these parts, and classifies them with a support vector machine (SVM). We are using this system to analyze phenotypic data from C. elegans high-throughput genetic screens, and our system overcomes a previous bottleneck in image analysis by achieving near real-time scoring of image data. The system is in current use in a functioning C. elegans laboratory and has processed over two hundred thousand images for lab users.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
Three-prime untranslated regions (3′UTRs) of metazoan messenger RNAs (mRNAs) contain numerous regulatory elements, yet remain largely uncharacterized. Using polyA capture, 3′ rapid amplification of complementary DNA (cDNA) ends, full-length cDNAs, and RNA-seq, we defined ∼26,000 distinct 3′UTRs in Caenorhabditis elegans for ∼85% of the 18,328 experimentally supported protein-coding genes and revised ∼40% of gene models. Alternative 3′UTR isoforms are frequent, often differentially expressed during development. Average 3′UTR length decreases with animal age. Surprisingly, no polyadenylation signal (PAS) was detected for 13% of polyadenylation sites, predominantly among shorter alternative isoforms. Trans-spliced (versus non–trans-spliced) mRNAs possess longer 3′UTRs and frequently contain no PAS or variant PAS. We identified conserved 3′UTR motifs, isoform-specific predicted microRNA target sites, and polyadenylation of most histone genes. Our data reveal a rich complexity of 3′UTRs, both genome-wide and throughout development.
The molecular underpinnings of the oocyte-to-embryo transition are poorly understood. Here we show that two protein tyrosine phosphatase-like (PTPL) family proteins, EGG-4 and EGG-5, are required for key events of the oocyte-to-embryo transition in Caenorhabditis elegans. The predicted EGG-4 and EGG-5 amino acid sequences are 99% identical and their functions are redundant. In embryos lacking EGG-4 and EGG-5 we observe defects in meiosis, polar body formation, the block to polyspermy, F-actin dynamics, and eggshell deposition. During oogenesis, EGG-4 and EGG-5 assemble at the oocyte cortex with the previously identified regulators or effectors of the oocyte-to-embryo transition EGG-3, CHS-1 and MBK-2 [1, 2]. All of these molecules share a complex interdependence with regard to their dynamics and subcellular localization. Shortly after fertilization, EGG-4 and EGG-5 are required to properly coordinate a redistribution of CHS-1 and EGG-3 away from the cortex during meiotic anaphase I. Therefore, EGG-4 and EGG-5 are not only required for critical events of the oocyte-to-embryo transition but also link the dynamics of the regulatory machinery with the advancing cell cycle.
The cell biological events that guide early embryonic development occur with great precision within species but can be quite diverse across species. How these cellular processes evolve and which molecular components underlie evolutionary changes are poorly understood. To begin to address these questions, we systematically investigated early embryogenesis, from the one- to the four-cell embryo, in 34 nematode species related to C. elegans. We found 40 cell-biological characters that captured the phenotypic differences between these species. By tracing the evolutionary changes on a molecular phylogeny, we found that these characters evolved multiple times and independently of one another. Strikingly, all these phenotypes are mimicked by single-gene RNAi experiments in C. elegans. We use these comparisons to hypothesize the molecular mechanisms underlying the evolutionary changes. For example, we predict that a cell polarity module was altered during the evolution of the Protorhabditis group and show that PAR-1, a kinase localized asymmetrically in C. elegans early embryos, is symmetrically localized in the one-cell stage of Protorhabditis group species. Our genome-wide approach identifies candidate molecules, and thereby modules, associated with evolutionary changes in cell-biological phenotypes.
Nematoda; C. elegans; embryogenesis; early development; phenotypic analysis; cell polarity; phenotypic plasticity
Caenorhabditis elegans is one of the most prominent model systems for embryogenesis. However, it has been impractical to collect large amounts of precisely staged embryos. Thus, early C. elegans embryogenesis has not been amenable to most modern high-throughput genomics or biochemistry assays. To overcome this problem, we devised a method to collect large amounts of staged C. elegans embryos by fluorescence-activated cell sorting (termed eFACS). eFACS can in principle be applied to all embryonic stages. As a proof of principle, we show that a single eFACS run routinely yields tens of thousands of almost perfectly staged one-cell embryos. Since the earliest embryonic events are driven by post-transcriptional regulation, we combined eFACS with next-generation sequencing to profile the embryonic expression of small, non-coding RNAs. We discovered complex and orchestrated changes in the expression between and within almost all classes of small RNAs, including miRNAs and 26G-RNAs, during embryogenesis.
Despite the successes of genomics, little is known about how genetic information produces complex organisms. A look at the crucial functional elements of fly and worm genomes could change that.
Many protein-protein interactions are mediated through independently folding modular domains. Proteome-wide efforts to model protein-protein interaction or “interactome” networks have largely ignored this modular organization of proteins. We developed an experimental strategy to efficiently identify interaction domains and generated a domain-based interactome network for proteins involved in C. elegans early embryonic cell divisions. Minimal interacting regions were identified for over 200 proteins, providing important information on their domain organization. Furthermore, our approach increased the sensitivity of the two-hybrid system, resulting in a more complete interactome network. This interactome modeling strategy revealed new insights into C. elegans centrosome function and is applicable to other biological processes in this and other organisms.
Nucleoporins are components of the nuclear pore, which is required for nucleo-cytoplasmic transport. We report a role for a subclass of nucleoporins in orienting the mitotic spindle in C. elegans embryos. RNAi-mediated depletion of any of five putative nucleoporins, npp-1, npp-3, npp-4, npp-11, and npp-13, leads to indistinguishable spindle orientation defects. Transgenic worms expressing NPP-1::GFP or NPP-11::GFP show GFP localization at the nuclear envelope, consistent with their predicted function. NPP-1 interacts with the other nucleoporins in yeast two-hybrid assays, suggesting that the proteins affect spindle orientation by a common process. The failed orientation phenotype of npp-1(RNAi) is at least partially epistatic to the ectopic spindle rotation in the AB blastomere of par-3 mutant embryos. This suggests that NPP-1 contributes to the mechanics of spindle orientation. However, NPP-1 is also required for PAR-6 asymmetry at the two-cell stage, indicating that nucleoporins may be required to define cortical domains in the germ line blastomere P1. Nuclear envelope structure is abnormal in npp-1(RNAi) embryos but the envelope maintains its integrity and most nuclear proteins we assayed accumulate normally. These findings raise the possibility that these nucleoporins may have direct roles in orienting the mitotic spindle and the maintenance of cell polarity.
nucleoporins; spindle orientation; Caenorhabditis elegans; embryo
The actin cytoskeleton plays critical roles in early development in Caenorhabditis elegans. To further understand the complex roles of actin in early embryogenesis we use RNAi and in vivo imaging of filamentous actin (F-actin) dynamics.
Using RNAi, we found processes that are differentially sensitive to levels of actin during early embryogenesis. Mild actin depletion causes defects in cortical ruffling, pseudocleavage, and establishment of polarity, while more severe depletion causes defects in polar body extrusion, cytokinesis, chromosome segregation, and eventually, egg production. These defects indicate that actin is required for proper oocyte development, fertilization, and a wide range of important events during early embryogenesis, including proper chromosome segregation. In vivo visualization of the cortical actin cytoskeleton shows dynamics that parallel but are distinct from the previously described myosin dynamics. Two distinct types of actin organization are observed at the cortex. During asymmetric polarization to the anterior, or the establishment phase (Phase I), actin forms a meshwork of microfilaments and focal accumulations throughout the cortex, while during the anterior maintenance phase (Phase II) it undergoes a morphological transition to asymmetrically localized puncta. The proper asymmetric redistribution is dependent on the PAR proteins, while both asymmetric redistribution and morphological transitions are dependent upon PFN-1 and NMY-2. Just before cytokinesis, actin disappears from most of the cortex and is only found around the presumptive cytokinetic furrow. Finally, we describe dynamic actin-enriched comets in the early embryo.
During early C. elegans embryogenesis, actin plays more roles and its organization is more dynamic than previously described. Morphological transitions of F-actin, from meshwork to puncta, as well as asymmetric redistribution, are regulated by the PAR proteins. Results from this study provide new insights into the cellular and developmental roles of the actin cytoskeleton.
Three-prime untranslated regions (3′UTRs) are widely recognized as important post-transcriptional regulatory regions of mRNAs. RNA-binding proteins and small non-coding RNAs such as microRNAs (miRNAs) bind to functional elements within 3′UTRs to influence mRNA stability, translation and localization. These interactions play many important roles in development, metabolism and disease. However, even in the most well-annotated metazoan genomes, 3′UTRs and their functional elements are not well defined. Comprehensive and accurate genome-wide annotation of 3′UTRs and their functional elements is thus critical. We have developed an open-access database, available at http://www.UTRome.org, to provide a rich and comprehensive resource for 3′UTR biology in the well-characterized, experimentally tractable model system Caenorhabditis elegans. UTRome.org combines data from public repositories and a large-scale effort we are undertaking to characterize 3′UTRs and their functional elements in C. elegans, including 3′UTR sequences, graphical displays, predicted and validated functional elements, secondary structure predictions and detailed data from our cloning pipeline. UTRome.org will grow substantially over time to encompass individual 3′UTR isoforms for the majority of genes, new and revised functional elements, and in vivo data on 3′UTR function as they become available. The UTRome database thus represents a powerful tool to better understand the biology of 3′UTRs.
To initiate studies on how protein-protein interaction (or “interactome”) networks relate to multicellular functions, we have mapped a large fraction of the Caenorhabditis elegans interactome network. Starting with a subset of metazoan-specific proteins, more than 4000 interactions were identified from high-throughput, yeast two-hybrid (HT-Y2H) screens. Independent coaffinity purification assays experimentally validated the overall quality of this Y2H data set. Together with already described Y2H interactions and interologs predicted in silico, the current version of the Worm Interactome (WI5) map contains ∼5500 interactions. Topological and biological features of this interactome network, as well as its integration with phenome and transcriptome data sets, lead to numerous biological hypotheses.
Though posttranscriptional regulation is important for early embryogenesis, little is understood regarding control of mRNA decay during development. Previous work defined two major pathways by which normal transcripts are degraded in eukaryotes. However, it is not known which pathways are key in mRNA decay during early patterning or whether developmental transcripts are turned over via specific pathways. Here we show that Caenorhabditis elegans Dcp2 is localized to distinct foci during embryogenesis, reminiscent of P-bodies, the sites of mRNA degradation in yeast and mammals. However, the decapping enzyme of the 3′ to 5′ transcript decay system (DcpS) localizes throughout the cytoplasm, suggesting this degradation pathway is not highly organized. In addition, we find that Dcp2 is localized to P-granules, showing that Dcp2 is stored and/or active in these structures. However, RNAi of these decapping enzymes has no obvious effect on embryogenesis. In contrast, we find that nuclear cap binding proteins (CBP-20 and CBP-80), eIF4G, and PAB-1 are absolutely required for development. Together, our data provide further evidence that pathways of general mRNA metabolism can be remarkably organized during development, with two different decapping enzymes localized in distinct cytoplasmic domains.
RNA interference (RNAi) is being used in large-scale genomic studies as a rapid way to obtain in vivo functional information associated with specific genes. How best to archive and mine the complex data derived from these studies provides a series of challenges associated with both the methods used to elicit the RNAi response and the functional data gathered. RNAiDB (RNAi Database; http://www.rnai.org) has been created for the archival, distribution and analysis of phenotypic data from large-scale RNAi analyses in Caenorhabditis elegans. The database contains a compendium of publicly available data and provides information on experimental methods and phenotypic results, including raw data in the form of images and streaming time-lapse movies. Phenotypic summaries together with graphical displays of RNAi to gene mappings allow quick intuitive comparison of results from different RNAi assays and visualization of the gene product(s) potentially inhibited by each RNAi experiment based on multiple sequence analysis methods. RNAiDB can be searched using combinatorial queries and using the novel tool PhenoBlast, which ranks genes according to their overall phenotypic similarity. RNAiDB could serve as a model database for distributing and navigating in vivo functional information from large-scale systematic phenotypic analyses in different organisms.
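Ranking genes by overall phenotypic similarity, as PhenoBlast is described as doing above, can be illustrated with a small sketch. The abstract does not specify PhenoBlast's scoring scheme, so the Jaccard index used here is an assumed stand-in rather than the tool's actual method, and the gene names and phenotype labels are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of phenotype labels."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_phenotypic_similarity(query, profiles):
    """Rank all other genes by similarity to the query gene's phenotype set."""
    query_profile = profiles[query]
    scored = [
        (gene, jaccard(query_profile, profile))
        for gene, profile in profiles.items()
        if gene != query
    ]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical RNAi phenotype profiles, for illustration only.
profiles = {
    "gene_1": {"embryonic lethal", "sterile"},
    "gene_2": {"embryonic lethal", "sterile", "slow growth"},
    "gene_3": {"slow growth"},
    "gene_4": {"embryonic lethal"},
}

ranking = rank_by_phenotypic_similarity("gene_1", profiles)
for gene, score in ranking:
    print(gene, round(score, 3))
```

A set-based measure treats every phenotype as equally informative; a weighted score that emphasizes rare phenotypes would be a natural alternative under different assumptions.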