Caenorhabditis elegans is one of the most prominent model systems for embryogenesis. However, it has been impractical to collect large numbers of precisely staged embryos. Thus, early C. elegans embryogenesis has not been amenable to most modern high-throughput genomics or biochemistry assays. To overcome this problem, we devised a method to collect large numbers of staged C. elegans embryos by Fluorescence-Activated Cell Sorting (termed eFACS). eFACS can in principle be applied to all embryonic stages. As a proof of principle, we show that a single eFACS run routinely yields tens of thousands of almost perfectly staged one-cell embryos. Since the earliest embryonic events are driven by post-transcriptional regulation, we combined eFACS with next-generation sequencing to profile the embryonic expression of small, non-coding RNAs. We discovered complex and orchestrated changes in expression between and within almost all classes of small RNAs, including miRNAs and 26G-RNAs, during embryogenesis.
The nematode Caenorhabditis elegans is one of the best-explored model organisms for developmental biology. The mechanistic basis of embryogenesis in C. elegans has been dissected by describing the entire cell lineage1 and by numerous molecular and genetic analyses. Various key proteins involved in early cell division as well as hundreds of essential genes required for early embryogenesis and their knock-down phenotypes have been described2-8. However, a true understanding of embryogenesis will require knowledge of stage-specific gene expression. Modern high-throughput technologies such as deep sequencing, proteomics and their many applications can be used, for example, to identify and quantify the transcriptome, protein levels, and protein-protein interactions on a genome-wide scale. Studying the progression of embryogenesis with many of these methods requires large numbers of precisely staged embryos to yield enough RNA or other material. However, collecting such samples is currently not possible. Isolated embryos represent mixtures of developmental stages, ranging from the early one-cell zygote to the larva that is about to hatch, with approximately 600 cells. To date, staged embryos are usually obtained by hand using a mouth pipette, making it impractical to apply large-scale techniques that typically require tens of thousands of embryos. While methods to collect cells from dissociated embryos exist9, there is currently no method to collect specific embryonic stages in large quantities. Methods have been described to obtain large numbers of semi-synchronized embryos by blocking their development with fluorodeoxyuridine10, or to obtain young embryos from hermaphrodites that have just begun to produce mature oocytes11. Although these methods can yield reasonable quantities of young embryos, the collected embryos are not synchronous and these approaches cannot be used to select specific developmental stages.
Here we describe a method to collect large numbers of staged embryos by Fluorescence-Activated Cell Sorting (eFACS). eFACS can be applied to any embryonic stage in which a specific fluorescent marker protein can be stably expressed, and it resolves embryonic stages with sufficient yield for high-throughput technologies that require large amounts of starting material.
In C. elegans embryos, some specific zygotic transcription of protein-coding genes is initiated at the 4-cell stage, although pharmacological and genetic experiments have suggested that zygotic genes are not required until later in embryogenesis12,13. Maternal components seem sufficient to direct the embryo through the initial cleavage rounds up to approximately the onset of gastrulation. Interference with key enzymes involved in the RNAi pathway leads to numerous defects including embryonic lethality, suggesting functional roles for non-coding RNAs in embryogenesis14-16. It is unknown which small RNA populations (miRNAs, 21U-RNAs, endogenous siRNAs, 26G-RNAs) are present in the early embryo, and it is unclear how the complexity and composition of the transcriptome change within the very first cell cycles17-20. We thus set out to use eFACS in combination with deep sequencing to profile small RNA expression during early embryogenesis.
The strain used for our eFACS experiments expresses an OMA-1::GFP fusion protein under control of the oma-1 promoter21. The OMA-1::GFP fluorescence is detected in developing oocytes and the one-cell stage embryo. The GFP signal rapidly decreases in the two-cell embryo and is too weak to be detected in the embryo after the 4-cell stage21. These characteristics make the strain useful for selecting one-cell stage embryos by fluorescence.
After passing the embryos through a FACS we observed a GFP-positive (fluorescent) population that was distinct from background. Around 3-7% of embryos within the sample have high GFP fluorescence (Fig. 1a). Specifically sorting this population yields a sample of ~70% one-cell stage embryos contaminated with older embryos (Fig. 1b,d). We investigated whether we could sort twice (hereafter referred to as resorting) to obtain an even higher enrichment in one-cell embryos. The first embryonic cleavages progress rapidly and allow a time window of only 40 minutes for sorting living embryos1. After this time a mixed embryo population is depleted of one-cell stage embryos. Even with extensive cooling to delay cell division we were unable to resort to achieve further enrichment (see Discussion). However, methanol fixation of embryos solved this problem and allowed further enrichment of the desired population (Fig. 1b). This procedure routinely yielded ~60,000 embryos as a practically pure one-cell stage sample (>98%) (Fig. 1c,e). Most of those resorted one-cell embryos resided in the pronuclear migration and pseudocleavage part of the first cell cycle (Fig. 1c,e). Selecting a less GFP-fluorescent population (Supplementary Fig. 1a) and resorting (Supplementary Fig. 1b) yielded a mixture of two-to-four-cell embryos with some contamination (15% one-cell, 5% eight-cell, <2% older stages) (Supplementary Fig. 1e).
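The enrichment achieved by sorting and resorting can be illustrated with a simple odds-based calculation. The sketch below is not the analysis used in this study. It assumes that the one-cell fraction before sorting equals the midpoint (5%) of the reported 3-7% GFP-high fraction and that each sorting pass multiplies the odds of an embryo being one-cell by the same factor. Both are simplifying assumptions, and all function names are hypothetical.

```python
# Illustrative odds-based model of repeated sorting; not the authors' analysis.

def odds(p):
    """Convert a purity (fraction between 0 and 1) to odds."""
    return p / (1.0 - p)

def purity(o):
    """Convert odds back to a purity fraction."""
    return o / (1.0 + o)

def enrichment_factor(purity_before, purity_after):
    """Fold change in odds achieved by one sorting pass."""
    return odds(purity_after) / odds(purity_before)

def purity_after_passes(start_purity, factor, passes):
    """Purity after `passes` sorts that each multiply the odds by `factor`."""
    return purity(odds(start_purity) * factor ** passes)

# ASSUMPTION: starting one-cell fraction = midpoint of the 3-7% GFP-high range.
start = 0.05
# ~70% purity after a single sort is taken from the text.
factor = enrichment_factor(start, 0.70)
second_pass = purity_after_passes(start, factor, 2)

print(f"odds enrichment per pass: {factor:.1f}")
print(f"predicted purity after resorting: {second_pass:.3f}")
```

Under these assumptions a second pass with the same odds enrichment would be expected to give a purity of roughly 99%, which is in the same range as the >98% observed after resorting fixed embryos.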
To study composition and dynamics of small RNAs in early embryogenesis we obtained altogether six samples (four by eFACS) of embryos covering various developmental stages (Fig. 2) from which small RNA libraries were generated and deep sequenced (Methods). Using our mapping pipeline (Methods), between 52% and 83% of reads could be mapped to the genome (Fig. 2; further details in Supplementary Table 1).
The methanol fixation of embryos enabled us to sort twice and further enrich embryo populations. To test if methanol fixation altered miRNA expression, we compared expression profiles for 10 miRNAs between fixed and non-fixed embryos by qRT-PCRs (Supplementary Fig. 2). Relative expression of these miRNAs was unaffected by fixation. Further, to test if sorting fixed embryos is comparable to living-sorted embryos we compared sequencing-based estimates of miRNA expression between these samples. We expect some differences between these samples since we know that contrary to the fixed resorted eFACS sample (>98% one-cell embryos), the living, once sorted sample is contaminated by ~30% mixed embryos.
We found that miRNA expression is overall highly correlated between these samples (Fig. 3a, log expression, Pearson 0.86), although we observed substantial scatter and some miRNAs which were absent in the fixed-sorted sample. We suspected that these miRNAs are expressed only in older embryos and are therefore not detected in the one-cell stage sample. To test this hypothesis, we first directly measured miRNA expression in an independently obtained, living mixed embryo sample by sequencing (‘mixed embryos’ Fig. 2). We then respectively subtracted miRNA expression (quantified by normalized sequencing reads, Methods) of this mixed embryo sample from miRNA expression from both sorted samples. The small remaining set of miRNAs had strongly reduced scatter, correlated almost perfectly between both samples (Fig. 3b, Pearson 0.94), and was more strongly expressed in the fixed sample. Thus, these miRNAs (which contain, among others, all miRNAs from the miR-35 cluster) are likely one-cell embryo specific. Furthermore, miRNA expression fold-changes during embryonic development quantified by eFACS and deep sequencing can accurately reflect in vivo fold-changes. We will present further evidence for this based on qRT-PCR validation below.
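The background subtraction and correlation described above can be sketched as follows. This is a minimal illustration, not the pipeline described in Methods: the miRNA names and expression values are invented toy numbers, the function names are hypothetical, and dropping miRNAs whose residual expression is not positive is an assumption about how the "remaining set" is defined.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def subtract_background(sample, background):
    """Subtract mixed-embryo expression; keep only positive residuals (assumption)."""
    residual = {}
    for name, value in sample.items():
        remaining = value - background.get(name, 0.0)
        if remaining > 0.0:
            residual[name] = remaining
    return residual

def log_correlation(a, b):
    """Pearson correlation of log10 expression over miRNAs present in both samples."""
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        raise ValueError("need at least two shared miRNAs")
    return pearson([math.log10(a[m]) for m in shared],
                   [math.log10(b[m]) for m in shared])

# Toy normalized read counts (hypothetical, for illustration only).
fixed_sorted = {"mirA": 120.0, "mirB": 15.0, "mirC": 900.0, "mirD": 4.0, "mirE": 60.0}
live_sorted = {"mirA": 80.0, "mirB": 40.0, "mirC": 600.0, "mirD": 30.0, "mirE": 45.0}
mixed = {"mirA": 20.0, "mirB": 50.0, "mirC": 100.0, "mirD": 35.0, "mirE": 10.0}

fixed_residual = subtract_background(fixed_sorted, mixed)
live_residual = subtract_background(live_sorted, mixed)
r = log_correlation(fixed_residual, live_residual)
print(sorted(fixed_residual), round(r, 3))
```

In this toy example the miRNAs that are more abundant in the mixed sample drop out, and the remaining ones correlate closely between the two sorted samples, mirroring the reasoning applied to Fig. 3b.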
Perhaps surprisingly, eFACS revealed that ~60% of all known miRNAs are already expressed in the one-cell embryo (Fig. 3a and Supplementary Table 5). We selected 16 miRNAs, with read counts covering three orders of magnitude, for independent validation by qRT-PCR (Fig. 3c, marked in red) on hand-picked, living one-cell embryos and confirmed the expression of all of them (Supplementary Table 8). We then computed fold-changes of miRNA expression from sequencing data between one-cell embryos and our post-gastrulation sample according to a logistic model (Methods). These fold-changes were also directly assayed by qRT-PCR for the 16 miRNAs on independently hand-picked, living embryos from corresponding developmental stages. Fold-changes determined by sequencing and qRT-PCR correlated well (Fig. 3d, Pearson 0.85). However, there are marked differences for some miRNAs (see Discussion). We next examined expression changes of all miRNAs between one-cell, two-to-four-cell, and post-gastrulation samples. The smallest changes were visible across the first cell divisions (one-cell stage to two-to-four-cell stage). However, miR-48 seemed to decrease >5-fold from one-cell to two-to-four-cell embryos (Supplementary Fig. 4a). The strongest miRNA expression changes were observed upon gastrulation, where several miRNAs were highly expressed for the first time (Supplementary Fig. 4b). Nevertheless, we also observed miRNAs that peaked in expression in the early embryo, including the miR-35 cluster (miR-35-41), the miR-61 cluster (miR-61, miR-250) and miRNA-1829b/c. As noted above, we already observed these miRNAs to be enriched in the one-cell embryo (Fig. 3b). Altogether, we conclude that these miRNAs are markers for very early embryogenesis.
To discover potentially novel miRNAs, we mined our pooled datasets with miRDeep, an algorithm that identifies Dicer hairpin products such as miRNAs in deep sequencing data22. miRDeep reported 19 novel miRNAs (Methods, Supplementary Table 2). 16 were supported by detected star strands. Further, precursors of two novel miRNAs fell exactly between adjacent coding exons, strongly suggesting that they are mirtrons. We observed expression from 7506 of 15341 known 21U-RNA loci (Supplementary Tables 6 and 7). Reads mapping to known 21U-RNA loci derived almost exclusively from the sense strand, had almost always a 5′ Uracil, and their length distribution sharply peaked at 21 nt. We discovered 389 novel 21U-RNAs (Supplementary Table 4). Their genomic distribution followed the published pattern20,23 with additional dispersed genomic loci.
We next compared the expression of all known classes of small RNAs during embryogenesis. However, we note that we most likely only observe small RNAs with a 5′ monophosphate due to the cloning protocol (Methods). Overall, we observed strong, orchestrated changes in the composition of small RNAs between the sequenced samples (Fig. 5 and Supplementary Fig. 2). Older embryos are dominated by miRNAs while in very early stages additional small RNA classes are observed, including mitochondrial tRNA in one-cell and two-to-four-cell embryos as well as a sizable fraction of rRNA. The rRNA and tRNA-derived fractions in all samples have a uniform length distribution and are thus likely degradation products. 21U-RNAs are highly expressed in early embryos but difficult to detect in older embryos. We also observed differential expression of endo-siRNAs and 26G-RNAs (see below).
The length distribution of reads mapping sense or antisense to exons or introns of mRNA transcripts varied distinctly (Fig. 6). Sense reads were distributed uniformly, suggesting that they originated from degraded mRNAs. Reads mapping antisense to exons were dominated by 22 nt and 26 nt reads with a strong bias for a 5′ Uracil or Guanine, respectively (consistent with previous reports17,20). We will refer to the corresponding small RNAs as endogenous siRNAs (endo-siRNAs). Perhaps surprisingly, most one-cell embryo endo-siRNAs mapped to mitochondrial enzymes. The majority of these mRNAs are known to be upregulated in rrf-1, eri-1, rde-3 and dcr-1 mutants, which suggests that they are under control of small RNAs (Supplementary Table 3). We also consistently observed possible degradation products of mitochondrial tRNAs in the early embryo but not in other samples (Fig. 5). Interestingly, we found more endo-siRNA of length ~22 nt in the one- and two-to-four-cell stage embryos, while endo-siRNAs of length ~26 nt dominated in the older samples. Additionally, we observed in older embryos a two-fold enrichment of reads mapping antisense to 3′UTRs (27-32%) when compared to one-cell or two-to-four-cell stages (15% 3′UTR reads).
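A read-length and 5′-nucleotide profile of the kind described above can be computed with a few lines of code. The following is a minimal sketch on synthetic reads, not the mapping pipeline from Methods. The function names are hypothetical, and treating T in sequencing output as U is an assumption about the input format.

```python
from collections import Counter, defaultdict

def profile_reads(reads):
    """Count reads per length and, for each length, per 5' nucleotide."""
    length_counts = Counter()
    first_base_by_length = defaultdict(Counter)
    for read in reads:
        if not read:
            continue  # skip empty records
        # ASSUMPTION: DNA-alphabet reads; report a leading T as U.
        first = read[0].upper().replace("T", "U")
        length_counts[len(read)] += 1
        first_base_by_length[len(read)][first] += 1
    return length_counts, first_base_by_length

def first_base_fraction(first_base_by_length, length, base):
    """Fraction of reads of a given length that start with `base`."""
    counts = first_base_by_length.get(length, Counter())
    total = sum(counts.values())
    return counts[base] / total if total else 0.0

# Synthetic reads for illustration only: three 22-mers and one 26-mer.
toy_reads = ["T" + "A" * 21, "U" + "C" * 21, "A" * 22, "G" + "A" * 25]
lengths, first_bases = profile_reads(toy_reads)
print(dict(lengths))
print(first_base_fraction(first_bases, 22, "U"))
print(first_base_fraction(first_bases, 26, "G"))
```

Applied separately to reads mapping sense and antisense to exons, such a profile would make a uniform length distribution (suggesting degradation) easy to distinguish from distinct peaks with a 5′ nucleotide bias.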
After removing known RNA classes (see Methods), we studied the set of remaining reads. The length distribution of these RNAs peaked at 26 nt. These 26mers did not map to any annotated loci and had a strong 5′ Guanine bias (75.7%). Hereafter, we refer to 26 nt reads with a 5′G as 26G-RNAs24. While lowly present in early embryos, we observed high 26G-RNA expression in older embryos. Computational analyses revealed that 26G-RNAs mapped to several clusters in intergenic regions on different chromosomes (Fig. 6a). We validated five (out of five tested) 26G-RNAs from two clusters (Fig. 6b).
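One simple way to find genomic clusters of mapped reads is to merge hits that lie within a fixed distance of each other on the same chromosome. The sketch below illustrates that idea only; the text does not specify the clustering procedure that was used, and the 500-nt gap threshold, the function name, and the coordinates are hypothetical.

```python
def cluster_positions(hits, max_gap=500):
    """Group (chromosome, start) hits into clusters.

    Consecutive hits on the same chromosome are merged when they are at most
    `max_gap` nucleotides apart. Returns (chromosome, first, last, n_hits) tuples.
    ASSUMPTION: max_gap=500 is an arbitrary illustrative threshold.
    """
    clusters = []
    for chrom, start in sorted(hits):
        if clusters:
            last_chrom, first, last, count = clusters[-1]
            if chrom == last_chrom and start - last <= max_gap:
                # Extend the current cluster to include this hit.
                clusters[-1] = (chrom, first, start, count + 1)
                continue
        clusters.append((chrom, start, start, 1))
    return clusters

# Hypothetical mapping positions of 26-nt, 5'G reads.
toy_hits = [("X", 100), ("X", 300), ("X", 5000), ("I", 50)]
for cluster in cluster_positions(toy_hits):
    print(cluster)
```

Clusters found this way could then be intersected with gene annotation to check whether they fall into intergenic regions.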
Sorting C. elegans embryos expressing a stage specific GFP marker via FACS (eFACS) made it possible to obtain a large staged embryo population. As a proof of principle we used eFACS to obtain large samples of staged one-cell and two-to-four cell embryos. Due to the low abundance of one-cell zygotes in embryos extracted via standard bleaching, resorting is necessary to yield a pure one-cell population. Methanol fixation allows resorting. Our resorted one-cell sample had a purity of >98%. Methanol fixation did not alter miRNA expression (Supplementary Fig. 2) or mRNA expression (data not shown). Nevertheless, we cannot rule out that methanol fixation or sorting does induce some artifacts when using eFACS for other purposes. We observed some differences in miRNA expression fold-changes determined by sequencing and qRT-PCRs in independently assayed embryos (Fig. 3d), including an outstanding discrepancy for miR-58. We believe that this discrepancy can be in part explained by saturation effects in the library preparation for sequencing because miR-58 is by far the most highly expressed miRNA. This problem and biases in sequencing in general may also be responsible for other discrepancies. Moreover, we observed a shift in overall miRNA fold-changes towards increased fold-changes quantified by qRT-PCR. An inherent problem when comparing sequencing and qRT-PCR data is that both methods require normalization. Sequencing data were normalized under the assumption that net fold-changes are close to zero while qRT-PCRs were normalized to an internal standard. While both assumptions have their problems, different normalization procedures only shift the baseline of fold-changes and do not influence the relative fold-changes to each other and thus do not influence any conclusions presented in this study.
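The argument that normalization only shifts the baseline can be checked numerically. In the sketch below, log2 fold-changes are re-centered so that their median is zero, which is one possible reading of "net fold-changes close to zero" and not necessarily the procedure used in Methods. The values and names are hypothetical.

```python
import math
import statistics

def center_on_median(log2_fc):
    """Shift log2 fold-changes so that their median is zero (assumed normalization)."""
    median = statistics.median(log2_fc.values())
    return {name: value - median for name, value in log2_fc.items()}

def shift(log2_fc, offset):
    """Apply a constant baseline shift, as a different normalization would."""
    return {name: value + offset for name, value in log2_fc.items()}

def pairwise_differences(log2_fc):
    """Difference in log2 fold-change for every pair of miRNAs."""
    names = sorted(log2_fc)
    return {(a, b): log2_fc[a] - log2_fc[b]
            for i, a in enumerate(names) for b in names[i + 1:]}

# Hypothetical log2 fold-changes for three miRNAs.
raw = {"mirA": 1.5, "mirB": -0.5, "mirC": 3.0}
centered = center_on_median(raw)
shifted = shift(raw, 2.0)

same = all(
    math.isclose(pairwise_differences(raw)[pair], pairwise_differences(other)[pair])
    for other in (centered, shifted)
    for pair in pairwise_differences(raw)
)
print(same)
```

Because every value moves by the same constant, the differences between miRNAs, and therefore their ranking and their correlation with another data set, are unchanged.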
In principle, eFACS can be used to extract large samples of embryos enriched in any desired embryonic stage. Thus, eFACS opens the door to many modern high-throughput technologies to assay embryonic stage-specific gene expression. Several such investigations are already ongoing. A limitation of eFACS is that it depends on the availability of a good fluorescent marker gene for the desired embryonic stage. State-of-the-art FACS allows the simultaneous use of up to eight fluorescence channels. Strains expressing different fluorescent fusion proteins with temporally overlapping changes in gene expression could be combined, and thus it should be possible to use eFACS in situations where a single optimal marker gene is not available. Moreover, it was shown that it is possible to engineer the specific degradation of a protein at a specific time and in a specific cell type25.
Alternatively, we have shown that embryos can be sorted alive to obtain staged samples at a purity of ~70%. However, one technical constraint in eFACS is that the large size of the embryos forced us to sort at very low speeds of ~400/second. We cooled embryos (15°C) to delay cell divisions but were still unable to resort living embryos. We also experimented with lower temperature settings (4-10°C). However, these settings reduced viability (<60%) after sorting and still resulted in relatively low purity. It is entirely possible that more advanced FACS machines will allow sorting at higher speeds comparable to standard cell sorting (>20,000/second). In this case, one could sort and even resort to obtain samples of the same size and purity as our fixed embryo eFACS runs. Fixation could be omitted, and eFACS with the OMA-1::GFP strain alone could be used to obtain thousands of staged living embryos that could be allowed to develop synchronously to the embryonic stage of interest. Further improvements to this approach might also be achieved by careful animal staging11 prior to eFACS.
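The throughput constraint can be made concrete with a back-of-the-envelope calculation that uses only numbers quoted in the text: a sorting speed of ~400 embryos/second, a 40-minute window for living embryos, and a GFP-high fraction of 3-7%. The sketch assumes that the sorter runs continuously at its nominal rate for the whole window, which is optimistic. The function names are hypothetical.

```python
def embryos_processed(rate_per_second, window_minutes):
    """Number of embryos passing the sorter in the given time window."""
    return rate_per_second * window_minutes * 60

def gated_embryos(total, percent_low, percent_high):
    """Range of embryos falling into the GFP-high gate (integer percentages)."""
    return total * percent_low // 100, total * percent_high // 100

def seconds_needed(total, rate_per_second):
    """Time in seconds to pass `total` embryos at a given sorting speed."""
    return total / rate_per_second

# ASSUMPTION: continuous sorting at the nominal rate for the full 40 minutes.
total = embryos_processed(400, 40)
low, high = gated_embryos(total, 3, 7)
fast = seconds_needed(total, 20000)

print(f"embryos processed in one window: {total}")
print(f"embryos in the GFP-high gate: {low} to {high}")
print(f"time for the same input at 20,000/second: {fast} s")
```

At ~400 embryos/second a single pass therefore consumes the entire window in which living one-cell embryos are available, whereas at >20,000/second the same input would pass in under a minute, leaving time for a second sort.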
We applied eFACS to profile small RNA expression during embryogenesis. Previous large-scale studies of small RNA expression had to use samples composed of mixed-stage embryos. These studies could not detect the orchestrated and dynamic changes between and within different classes of small RNAs that we observed when comparing the one-cell embryo to later stages. Several examples illustrate this finding. Firstly, the majority of miRNAs are already expressed in the one-cell embryo, suggesting that they are maternally deposited. This raises the question of why so many miRNAs are expressed in the early embryo. Secondly, we showed that miRNAs from the miR-35 cluster are likely early embryo specific. Genetic knockouts and mutations for 95 miRNAs have been published26. Strikingly, the miR-35 cluster is the only known miRNA cluster with an embryonic lethal knockout phenotype. Thirdly, we observed many small RNAs of uniform length mapping sense to rRNAs in one-cell embryos (living-sorted or methanol-fixed); their expression was decreased in two-to-four-cell embryos, and they were virtually absent from samples of older stages, some of which were also obtained by sorting. Thus, although we do not have independent validation, it seems unlikely that the observed rRNA expression is an experimental artifact. rRNAs, unlike mRNAs, are already transcribed in the one-cell embryo13. One may speculate about a turnover of maternally and paternally provided rRNAs to zygotically transcribed rRNAs. Finally, we find consistent evidence for a turnover of mitochondrial components in the one-cell embryo. We observed degradation products of mitochondrial tRNAs in the early embryo as well as many siRNAs directed against mitochondrial enzymes. Thus, it is tempting to speculate about mechanisms that selectively degrade paternal mitochondria in early zygotes, as described in vertebrates27.
Our data allowed us to shed light on the nature of as yet virtually undescribed classes of small RNAs such as 26G-RNAs. Observations of small RNAs, in particular of length ~26 nt with a 5′ Guanine bias, have been reported earlier17,20 and were recently dubbed 26G-RNAs24. However, we found that 26G-RNAs are dynamically expressed and that they cluster in several intergenic regions. Northern blot analysis suggested that they may be generated or modified such that they appear in different sizes. Besides an increase of 26G-RNAs in older embryos we also observed an increase of 26 nt endo-siRNAs mapping antisense to coding mRNAs. We failed to computationally detect a ‘ping pong’ biogenesis mechanism28,29 between 26G-RNAs and 26 nt endo-siRNAs.
Our eFACS data and analyses raise many more open questions. However, altogether one is tempted to conclude that the complexity of small RNA expression dynamics in very early embryogenesis is comparable to the expression dynamics of protein encoding genes, and that eFACS will make a contribution towards a more complete understanding of gene regulatory networks during early animal development.
Supplementary Figure 1: eFACS of two-to-four-cell stage embryos.
Supplementary Figure 2: Comparison of miRNA expression in fixed and living mixed embryos.
Supplementary Figure 3: Validation of 16 miRNA fold changes computed from the deep sequencing data by qRT-PCRs on hand-picked living embryos.
Supplementary Figure 4: miRNA expression fold changes between the different samples.
Supplementary Table 1: Sequencing and mapping results in the samples.
Supplementary Table 2: miRNA predictions using miRDeep.
Supplementary Table 3: Top 100 reads mapping antisense to coding features.
Supplementary Table 4: Novel 21U-RNAs.
Supplementary Table 5: Sequencing miRNA expression.
Supplementary Table 6: 21U-RNAs are highly expressed in 1-cell embryos.
Supplementary Table 7: Expression of 21U-RNAs.
Supplementary Table 8: qRT-PCR results 16 miRNAs on hand-picked one-cell stage embryos.
Supplementary Table 9: List of primers and probes used in this study.
We are grateful to R. Lin for providing us with the TX189[P(oma-1)::oma-1::GFP] strain. All other strains used in this project were provided by the Caenorhabditis Genetic Center, which is funded by the National Center for Research Resources. MS gratefully acknowledges part-time funding from the Max-Delbrück-Center and New York University PhD exchange program and a travel grant from Boehringer Ingelheim Fonds. JM thanks the German Research Foundation for a fellowship in the International Research Training Group Genomics and Systems Biology of Molecular Networks (GRK 1360). FP and NR acknowledge partial funding from the National Human Genome Research Institute (ModEncode U01 HG004276) and the National Institutes of Health (R01HD046236). We thank S. Lebedeva for help with sequencing runs.
Author Contributions: FP and NR conceived, designed, and supervised the study. TC contributed to initial eFACS experiments. HPR helped with FACS machine settings and runs. WC and NL contributed to library preparations and sequencing. JM designed and performed computational studies with the exception of predicting new miRNAs (MRF). MS designed and performed the experiments. MS, JM, FP, NR analyzed the data. MS and NR wrote the paper, JM and FP edited it.