|Home | About | Journals | Submit | Contact Us | Français|
Physarum polycephalum is a unicellular eukaryote belonging to the amoebozoa group of organisms. The complex life cycle involves various cell types that differ in morphology, function, and biochemical composition. Sporulation, one step in the life cycle, is a stimulus-controlled differentiation response of macroscopic plasmodial cells that develop into fruiting bodies. Well-established Mendelian genetics and the occurrence of macroscopic cells with a naturally synchronous population of nuclei as source of homogeneous cell material for biochemical analyses make Physarum an attractive model organism for studying the regulatory control of cell differentiation. Here, we develop an approach using RNA-sequencing (RNA-seq), without needing to rely on a genome sequence as a reference, for studying the transcriptomic changes during stimulus-triggered commitment to sporulation in individual plasmodial cells. The approach is validated through the obtained expression patterns and annotations, and particularly the results from up- and downregulated genes, which correlate well with previous studies.
Physarum polycephalum (“slime mold”) is a free-living amoebozoan protist. Due to its rich cell biology, Physarum is a model organism for cell motility1 and cell differentiation.2–4 More recently in RNA editing,5 Physarum is also a model organism for DNA replication,6 optimization of cell morphology7 and behavioral intelligence.8 During its branched life cycle, Physarum differentiates into several cell types that occur in temporal order (Fig. 1). The multinucleate, macroscopic plasmodium, for example, may develop from microscopic, single nucleate amoebal cells or develop by germination of spherules.9 The number of nuclei contained in a plasmodium depends on its size. A plasmodium in a 9 cm petri dish, for example, contains approximately 107 nuclei that display natural synchrony in terms of cell cycle and differentiation status.10,11 During sporulation, the plasmodium differentiates into fruiting bodies containing spores which undergo a meiotic recombination to become the precursor cells of amoebae that may develop into gametes later on. Sporulation occurs only in starved plasmodial cells and can be experimentally triggered, e.g. by exposure to far red light or to heat shock. Well-established Mendelian genetics and the occurrence of naturally synchronous macroscopic cells as source of homogeneous cell material for biochemical analyses have made Physarum an attractive model organism for studying the regulatory control of cell differentiation. Recent advances in sequencing technologies motivated detailed analyses of transcriptome dynamics during the life cycle of Physarum.12,13 These studies were performed using RNA isolated from cell populations.
In general, differentiation follows spatial and temporal changes in transcript abundance in a cell type-specific manner. Stochastic variations in gene expression presumably do impact cell-fate decisions,14 and therefore the time-resolved analysis of gene expression patterns in individual cells would provide valuable insight as compared to averaged data from measurements obtained from cell populations.14,15 Expression patterns of single-cells have been analyzed using deep RNA sequencing (RNA seq)16 to characterize the transcriptomes of individual embryonic mouse cells separated by technically complex procedures, and relying on the mouse genomic information for transcript assembly and mapping.17,18
The Physarum genome has been sequenced but the data are still not assembled into a complete genome sequence (The Genome Institute, Washington University School of Medicine; http://genome.wustl.edu/genomes/view/physarum_polycephalum/). Therefore, here we evaluated the possibility of studying the global transcriptional changes during the differentiation of a single cell without relying on genomic information, and developed an approach to analyze the differential expression at several time points during the commitment of a plasmodial cell to sporulation.
Physarum macroplasmodia (apogamic strain WT3119) were cultured as previously described.13,20 Cells were grown and collected under two different conditions: (i) a plasmodium starved for 6 days (competent D1 and D2 individual cell samples); and (ii) a plasmodium starved for 6 days, exposed to far red light for 30 minutes, and returned to the dark for 6.5 hours (L1 and L2 photoinduced cells; Table 1). During this time period the cell becomes irreversibly committed to sporulation.21 Samples were frozen with liquid nitrogen and Poly(A)+ RNA was isolated from the total RNA samples (by two rounds of oligo(dT) affinity chromatography), and fragmented with ultra-sound (4 pulses of 30 sec at 4 °C). Subsequently, the RNA fragments were poly(A)-tailed using poly(A) polymerase, followed by treatment with tobacco acid pyrophosphatase (TAP). Then a RNA adapter was ligated to the 5′-monophosphate of the RNA. First-strand complementary DNA (cDNA) synthesis was performed using an oligo(dT)-adapter primer and the Maloney murine leukemia virus (M-MLV) reverse transcriptase. The resulting cDNAs were polymerase chain reaction (PCR)-amplified for 14–15 cycles to about 20–30 ng/μL, including distinctive 4-bp 5′-barcodes for each sample (Table 1), and using a high fidelity DNA polymerase. The PCR products were purified with the Agencourt AMPure XP kit (Beckman Coulter Genomics), and pooled in equivalent amounts. cDNAs in the range of 200 to 400 bp were fractionated from agarose gels and sequenced using the Illumina HiSeq 2000 system. Poly(A)+ RNA isolation, cDNA synthesis and cDNA library preparation as described here were carried out by vertis Biotechnologie (Freising-Weihenstephan, Germany). The 100-bp sequencing outputs were then trimmed for quality (Phred score > 33), and later assembled de novo, using velvet22 and oases (k-mers: 31, 41, 51). Later, CAP323 was employed to reduce redundancy in the assembly. The annotation of this assembly was carried out first through BLAST24 searches (e-value 1E-3) against the SwissProt25 protein database. A search for Physarum noncoding RNAs was not included due to the lack of complete gene models and a finished reference genome in this species. Afterwards, domains and protein signature patterns were associated from matches to the InterPro database, and Gene Ontology (GO) based annotations were assigned using Blast2GO,26 from annotations pertaining to orthologs (annotation e-value cutoff < 1E-6). Gene names and descriptions were filtered using the Blast Description Annotation tool from Blast2GO.26 Significant differences in GO annotations between sets of up- and downregulated genes from each cDNA library were evaluated using Fisher exact tests, as implemented in Blast2GO.
To assess the differential expression between the several single cells, the sequencing output was split using the barcode information for each sample. Then the decoded outputs were mapped to the novel assembly with Bowtie.27 Samtools28 and Tablet29 were later used to obtain mapped read counts. For expression comparisons, we obtained the following for each transcript: (i) the number of mapped reads, and (ii) the normalized expression value, as measured in reads per kilobase per million mapped reads (RPKM).30 To identify differentially expressed transcripts between starved and photoinduced cells, the non-normalized mapped read count data was analyzed using the R-based package DESeq.31 Transcript abundances for each gene were estimated as a weighted mean of mapped read counts from each replicate, normalized to the library size. P-values (adjusted for false discovery rate) were generated for each gene in pairwise comparisons between different conditions (competent and induced cells). We used the per-condition method and fit-only sharing mode. A summary of experiments and bioinformatic analyses is depicted in Figure 2.
Four datasets consisting a total of 77.07 million 100-base reads from single cell Physarum plasmodia were obtained from the Illumina sequencing (77.02 million reads with Phred score > 33; 7.12 Gb). This RNA-seq output was deposited in the European Nucleotide Archive32 (study accession ERP001220). Replicate data distributions were 1.85 and 1.82 Gb corresponding to the starved plasmodium (cDNA library replicates D1 and D2), and 1.67 and 1.78 Gb for the cells collected 6 hours after photoinduction (libraries L1 and L2; Table 1). Therefore, assuming a 10% of protein-encoding genes, we obtained a 237.32x coverage for the 300 Mb genome of Physarum.13 Later, from the de novo assembly of the trimmed sequencing output, we obtained 909,505 sequences, with a N50 contig size of 371 bp. Large cDNAs from this assembly (>500 bp) were then clustered into 16,822 contigs (N50 length: 778 bp) with CAP3, to create a comprehensive set of representative transcripts. 10,278 of these contigs included transcript abundance data in the form of mapped reads in at least one cell sample, which was used as a measure of gene expression. Transcriptionally active genes were then defined as contigs with at least one mapped read present in all four samples or differentiation stages; in this regard, we observed 8,149 transcripts with mapped reads in all libraries.
In order to assess the reproducibility of this single-cell approach, we normalized the mapped read counts for each transcriptionally active transcript as reads per kilobase per million mapped reads (RPKM).30 We then compared the RPKM-normalized data for competent plasmodia (D1 and D2 cells) and photoinduced cells (L1 and L2) that were separately processed (Fig. 3). Accordingly, cells from related developmental stages exhibit very similar transcriptomic profiles (competent cells: r = 0.99; light-induced plasmodia: r = 0.98). On the other hand, lower correlations were found between competent and photoinduced cells (r = 0.96 between D2 and L1; r = 0.97 in all other cases; Fig. 3). However, further studies involving comparisons between several cell types and cell stages are required to establish whether these lower correlations account for the variations between individual cells.
Afterwards, we estimated the differentially expressed transcripts using DESeq31 from the raw mapped reads. For these analyses, only contigs with a combined count of 300 mapped reads among all the samples were considered; i.e., 3,164 contigs were then selected that fitted these criteria. This mapped read count threshold was selected to reduce noise caused by spurious contigs and alignments. Upon normalization, the distribution of mapped reads reflected the presence of differentially expressed transcripts, and genes with other kinds of regulation, with a slightly greater set of genes with higher expression in light-induced cells (Fig. 4). Specifically, we identified 556 upregulated transcripts (P-value < 0.05; 504 of these with a false discovery rate (FDR) < 0.1) and 531 downregulated transcripts (475 with FDR < 0.1) between the photoinduced and competent cell libraries for transcriptionally active contigs with mapped reads (Fig. 4).
Subsequently, to assign functions to the novel sequences, annotations were associated to the transcriptome assembly. In this way we obtained 92,641 Gene Ontology (GO) terms, corresponding to 5,722 SwissProt orthologs, where 64,730 GO terms belong to 4,222 sequences with mapped reads. cDNAs were linked biological processes (n = 1,135), molecular functions (n = 1,558), and cellular components (n = 576). From the transcriptionally active genes with mapped reads, 231 annotated transcripts were upregulated, and 264 transcripts were downregulated (Supplementary Data). A comparison of GO annotations between sets of up- and downregulated transcripts revealed two terms exclusively featured in upregulated genes (‘symplast,’ GO:0055044; and ‘auxiliary transport protein,’ GO:0015457). Both annotations are related to the extracellular transport via pores. Conversely, six GO terms were identified only in downregulated transcripts (‘synapse,’ GO:0045202; ‘synapse part,’ GO:0044456; ‘anti-oxidant activity,’ GO:0016209; ‘translation regulator activity,’ GO:0045182; ‘immune system process,’ GO:0002376; and ‘viral reproduction,’ GO:0016032; Fig. 5). These groups of GO annotations are associated with the regulation of translation. Next, we tested for the enrichment of GO terms in up- and downregulated contigs against the full list of annotated transcripts, using the Fisher’s exact test as implemented in Blast2GO.26 In this manner, significant overrepresentation was found only in upregulated transcripts (FDR = 0.037; P-value = 0.046), with all GO terms belonging to the molecular function category of ontologies: metal ion binding (GO:0046872), calcium ion binding (GO:0005509), ion binding (GO:0043167), and cation binding (GO:0043169). All these belong to the same hierarchy of ontologies, so all these can be summarized with the lower and more specific category, i.e., the ‘calcium ion binding’ GO term. Both analyses of GO annotations correlate well with previous studies that point to the upregulation of genes associated with the ion transport in the light-induced plasmodium.13,20
Later, to evaluate the genes with similar regulation, we clustered the fully annotated transcripts for the highest statistically significant up- and down-regulation levels, as compared to the starved plasmodium cell libraries (Tables 2 and and3).3). In spite of the apparent diversity of the annotations, we inferred potential functions based on ortholog identities and gene ontology assignments. In this way, we identified upregulated transcripts encoding endopeptidases (PHYSA), phospholipases (PLDG) and stress response proteins (BPM1, NAH1), as well as genes related to biosynthetic processes (COAD, IOD1, PYR1), development (STX3), chromatin remodeling (YA27), and signaling (ARF1, CYH4, SAR1, LTBP2), that are highly expressed 6.5 hours after photoinduction (Table 2).
On the other hand, a different group of genes is downregulated upon light exposure. In this case, we found transcripts associated with actin-binding (MYS2, COMA), FMN-binding (NOS, NCPR), signaling (VWKA), sugar- (TCT1) and cation-binding proteins (XANP, BOT2), a transporter (PEP3) and transferases (SET5, HMNT), that were downregulated 6.5 hours after light induction (Table 3). These measurements of transcriptional regulation at different time points correlate well with previous results in cell pools, where actin-binding and signaling proteins were identified as core members of the regulatory network during sporulation.13
We also noticed the expression of multiple transcript isoforms in the same cell at the same time point, both in upregulated transcripts (PYR1, BPM1, PHYSA; Table 2), and in downregulated transcripts (COMA, VWKA, NOS, NCPR, XANP, MYS2, SET5; Table 3). This phenomenon has also been observed in previous single-cell studies, and has been attributed to the complexity of transcript variants.33 Whether these genes encode isoforms controlling stage-specific signaling pathways remains to be studied in detail.
To date, two studies have reported the RNA-seq analysis of transcriptomes in eukaryotic organisms, using single embryonic cells as models.17,18 In these works, both the assembly and mapping procedures were achieved using the mouse genome as a reference. Previously, we succeeded in measuring the expression on a set of 35 differentially regulated genes in populations of single-cells, using the GeXP RT-PCR platform.20 Here, using the power of RNA-seq to obtain whole transcriptomes without relying on previous genomic information, we were able to characterize a large set of expressed genes in different samples during the sporulation of Physarum, an organism without a known genomic sequence. Furthermore, in order to obtain single cells, all former studies on single-cell multiplex gene expression analysis required complex separation methods, such as pipetting cells manually, or using laser micro-dissection or fluorescence-activated cell sorting.15 In this study, we used the plasmodium, a natural macroscopic multinucleate single-cell stage from Physarum, whose culture and handling is straightforward, and for which there are several well established methods for genetic manipulation.3
By combining the power of next-generation sequencing technologies with the simplicity for obtaining single cells from Physarum, we developed an approach to characterize the whole transcriptome through the differentiation of this lower eukaryote at the single-cell level. The observed regulation patterns correlate well with previous studies on differential gene expression during the commitment to sporulation in the slime mold, particularly with respect to proteins involved in signaling and actin binding. We expect that improvements in single-cell transcriptomics, such as discrimination in sense and antisense transcripts, the ability to sequence a more diverse range of nucleic acid species, and other future developments, will help us to display a more precise picture of the regulatory network controlling differentiation in this organism.
Authors thank Jan-Thierry Wegener (Otto von Guericke University) for his help with computational issues.
This research was partially supported by the German Federal Ministry of Education and Research (FORSYS—MaCS, Magdeburg Centre for Systems Biology). IB is an International Max Planck Research School—Otto von Guericke University fellow.
WM’s institution has received grants and study support from the German Federal Ministry of Education and Research. Other authors disclose no competing interests.
Conceived and designed the research: WM, IB. Analyzed the data: IB, JL. Wrote the first draft of the manuscript: IB. Contributed to the writing of the manuscript: WM, JL. Agree with manuscript results and conclusions: WM, JL. Jointly developed the structure and arguments for the paper: WM, IB. All authors reviewed and approved of the final manuscript.
As a requirement of publication author(s) have provided to the publisher signed confirmation of compliance with legal and ethical obligations including but not limited to the following: authorship and contributorship, conflicts of interest, privacy and confidentiality and (where applicable) protection of human and animal research subjects. The authors have read and confirmed their agreement with the ICMJE authorship and conflict of interest criteria. The authors have also confirmed that this article is unique and not under consideration or published in any other publication, and that they have permission from rights holders to reproduce any copyrighted material. The external blind peer reviewers report no conflicts of interest.