Our application of MAPCeL to C. elegans embryonic body wall muscle has generated a high-quality gene expression profile that defines the embryonic muscle transcriptome. We base this conclusion on five observations. First, we have successfully isolated myo-3::GFP labeled muscle cells directly from embryos and determined that transcriptional profiles obtained from these cells are reproducible. Second, sorting of cultured muscle cell populations yields similarly reproducible data and a transcriptional profile consistent with more fully differentiated muscle cells. Third, our muscle-enriched gene lists are largely distinct from profiles of other nonmuscle cell types from C. elegans (Additional data file 6). Fourth, and conversely, the MAPCeL datasets show substantial overlap with an independent experiment in which most embryonic blastomeres were converted to body wall muscle-like cell fates in vivo and profiled on the Affymetrix array (Additional data file 10) [7]. Fifth, transgenic GFP reporters generated in this study confirmed that a majority of muscle-enriched genes were expressed in muscle in vivo. Based on these observations, we suggest that our results provide a comprehensive profile of gene expression in developing C. elegans body wall muscle cells. Moreover, the common group of 592 genes in these microarray profiles that are also specifically upregulated with the induction of embryonic muscle differentiation is likely to comprise a core group of genes with fundamental roles in myogenesis. An additional 719 genes are identified that may also contribute substantially to the myogenic program (Figure ). These lists can now be exploited in future studies of muscle development, myofibril assembly, and muscle function (Additional data file 8).
The strong concurrence of our data with known or predicted muscle proteins (Table ) underscores the potential utility of these MAPCeL profiles for identifying candidate muscle genes that can now be tested by genetic methods in this model organism. Examples include F45G2.2, an atypical member of the myosin II family that shows strong homology to the head region of known C. elegans body wall muscle myosin heavy chain genes (for instance, myo-3) but for which a muscle function has not previously been described. Muscle enrichment of tag-138 is intriguing because its vertebrate homolog HIP1 (Huntingtin interacting protein 1) mediates receptor endocytosis, is highly expressed in a variety of human tumors, and may function as an oncogene [63]. Mutations affecting components of the DGC result in human muscular dystrophies, and similar genetic defects in C. elegans also disrupt body wall muscle structure and function [49-51]. Our muscle datasets have detected potential additional members of the DGC, namely the sarcoglycan-like genes sgcb-1 and H22K11.4 and a syntrophin-like gene, stn-2, which can now be experimentally tested for related functions. In addition to confirming muscle enrichment of known neurotransmitter-gated ion channels, these data have also identified several new candidate receptors for modulating muscle activity. Indeed, enrichment of the nAChR subunit gene acr-16 in the M24 dataset led to physiological experiments that confirmed a key role for the ACR-16 receptor in acetylcholine-evoked muscle excitation [16,57]. These positive results, which are based on a limited survey of the muscle genes in our datasets, suggest that a more detailed analysis of these gene lists should reveal a substantial number of additional genes with functions in muscle.
The evolutionary conservation of sarcomere structure and function from C. elegans to mammals has made the nematode an attractive model for studies of muscle development [1-3]. Past work in the worm has been useful for understanding general concepts of myosin filament assembly [8,32,37], the molecular mechanisms that underlie the transduction of sarcomere contractile force to connected tissues [2,39], and the evolution of striated muscle specification [7]. Our gene expression profiles should similarly shed light on myogenesis in other species, including humans. Approximately 60% of transcripts enriched in at least one of the embryonic body wall muscle datasets (787/1,312) are conserved in the human genome (BLAST E-value cutoff of e-10; Additional data file 13). Although many of the transcripts in this list encode proteins with well-established roles in mammalian muscle (for example, myosin heavy chain), potential muscle functions for a substantial number of additional proteins have not been defined. Of particular interest are 32 transcripts encoding proteins with no known function in any organism ('uncharacterized conserved protein'; Additional data file 13). Our data strongly suggest that many of these novel proteins play important roles in myogenesis or muscle function and are promising targets for future studies, both in C. elegans and in other animals.
We view this study as a starting point for determining how the C. elegans muscle cell transcriptome regulates myogenesis. Our approach provides a temporal component by profiling both nascent and differentiated muscle cells, and our data include examples in which the order of transcript appearance mirrors the known order of protein assembly within the sarcomere (for example, UNC-54). It may be possible to enhance the temporal resolution of MAPCeL by profiling embryos expressing a series of muscle reporters that are activated at successive developmental time points. These data could provide clues as to how the transcriptome temporally orchestrates myogenesis to assemble myofibrils into functional sarcomeres. Profiling data collected from subsets of embryonic muscle (anal and pharyngeal muscles) could also reveal gene sets with specialized functions in these different muscle types.
The embryonic muscle profiles described in this paper show substantial overlap (about 600 genes) with an independent microarray dataset obtained from embryonic muscle cells induced by ectopic expression of the MRF-related transcription factor HLH-1 (Figure and Additional data file 10) [7]. Our MAPCeL datasets show less similarity, however (about 250 common genes), with a profile of larval muscles obtained using the mRNA tagging strategy (Additional data file 8). In this approach, an epitope-tagged poly-A binding protein was used to specifically pull down body wall muscle transcripts from L1 stage larvae [25]. It seems unlikely that this difference can be fully attributed to developmental age (embryonic versus larval) because reporter gene constructs for genes in our MAPCeL datasets showed post-embryonic expression in one or more muscle types (Figure ). One potential explanation for this disparity is a difference in the relative sensitivity of the two profiling approaches used to generate these data.
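The dataset comparisons described above amount to counting the genes shared between two enriched gene lists. A minimal sketch of that comparison follows, assuming each dataset is available as a list of gene identifiers; the function name `overlap_summary` and the toy identifiers are hypothetical placeholders and are not taken from the additional data files.

```python
def overlap_summary(list_a, list_b):
    """Count genes shared by two enriched-gene lists.

    Returns the shared identifiers plus the fraction of each
    list that the shared set represents.
    """
    a, b = set(list_a), set(list_b)
    shared = a & b
    return {
        "shared": sorted(shared),
        "n_shared": len(shared),
        "frac_of_a": len(shared) / len(a) if a else 0.0,
        "frac_of_b": len(shared) / len(b) if b else 0.0,
    }


# Toy placeholder identifiers (hypothetical, NOT the real datasets).
embryonic_list = ["gene-a", "gene-b", "gene-c", "gene-d", "gene-e"]
larval_list = ["gene-a", "gene-b", "gene-x"]

summary = overlap_summary(embryonic_list, larval_list)
print(summary["n_shared"], summary["shared"])
```

Reporting the shared fraction relative to each list separately is a deliberate choice in this sketch: when two profiling methods differ in sensitivity, the shallower list can be largely contained in the deeper one even though the absolute overlap is small.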
A recent modification of the mRNA tagging strategy that reduces background RNA could yield a deeper dataset of larval muscle-enriched transcripts [24]. The use of the mRNA tagging method to profile larval muscles is necessary because post-embryonic cells are not readily accessible to MAPCeL analysis [14,17]. mRNA tagging affords the additional benefit of providing sharply defined temporal profiles of gene expression that could potentially identify transcriptional cascades of genes that control muscle differentiation and growth during this period. Finally, mRNA tagging profiles of aging body wall muscle cells could reveal transcriptionally regulated genes associated with sarcopenia, an evolutionarily conserved process in which body wall muscles in C. elegans exhibit morphological disorganization and functional decline that resembles the progressive age-related muscle atrophy that occurs in mammals [64]. In this context, it will be interesting to compare these gene expression data with MAPCeL profiles of embryonic muscle cells maintained in culture for prolonged periods to potentially distinguish between autonomous and environmentally induced aging processes.