Spinal muscular atrophy is an autosomal recessive motor neuron disease caused by a genetic defect carried by as many as one in 75 people. Unlike most neurological disorders, the genetic basis of the disorder is known exactly. In spite of this, we have little understanding of why low levels of one protein, survival motor neuron protein, result in the specific progressive die-back of only one cell type in the body, the motor neuron. Given that all cells in the body of a patient with spinal muscular atrophy share the same low abundance of the protein throughout development, an appropriate approach is to ask how lower levels of survival motor neuron protein affect the proteome of embryonic stem cells prior to development. Convergent biostatistical analyses of a discovery proteomic analysis of these cells provide results that are consistent with the pathomechanistic fate of the developed motor neuron.
Spinal Muscular Atrophy (SMA), the most prevalent hereditary motor neuron disease, is marked by degeneration of lower motor neurons (MN) in the spinal cord and brainstem, leading to progressive weakness and skeletal muscle atrophy (Murayama et al. 1991). SMA affects 4–10 per 100,000 live births (Burd et al. 1991), with 1 in 40–75 people being carriers. It affects children's health internationally, with no apparent selective penetrance by race or gender. There is currently no cure. Clinical severity is related to the level of the disease-associated survival motor neuron (smn) protein, with the most severe subtype showing an onset between 0–6 months of age and death usually occurring by 24 months of age from respiratory complications. Currently, the only biochemical biomarker for SMA relies on SMN mRNA or protein measurements. SMA is caused by homozygous deletions or other mutations in the telomeric copy of the survival motor neuron gene (SMN1) on chromosome 5q13 (Brzustowicz et al. 1990). Ninety-five percent of SMA patients were found to have a deleted exon 7 of SMN1 (Lefebvre et al. 1995; Ogino and Wilson, 2004). The same chromosome also contains one or more duplicate copies of the SMN2 gene, found closer to the centromere (Lefebvre et al. 1995). SMN2 differs from SMN1 by at least a single nucleotide change, which alters the splicing of the transcript from SMN2 (Lorson et al. 1999; Monani et al. 1999). This results in truncated smn transcripts that lack exon 7. The copy number and expression of SMN2 can vary from patient to patient. The severity and progression of SMA vary with copy number and/or level of SMN2 (Wurth 2000).
Smn protein is present in all cell types, and all cells, with the notable exception of anterior horn cells, can tolerate low Smn levels (reviewed by Burghes and Beattie 2009). It is unclear how defects in this ubiquitously expressed SMN gene result in alpha motor neuron degeneration while leaving other tissues ostensibly unaffected. Smn is present in the nucleus and cytoplasm, in association with ribonucleoprotein (RNP) complexes that play a role in RNA processing. This complex is formed by interactions of numerous small nuclear ribonucleoproteins (U RNP1-6) and catalyzes pre-mRNA splicing (Meister et al. 2001). Smn loss affects pre-mRNA splicing in transcripts of many tissues, suggesting non-fatal effects of smn loss across cell types, not only in motor neurons (Fischer et al. 1997; Zhang et al. 2008; Bowerman et al. 2009). It is also unclear how smn affects functional integrity in anterior horn cells and their axons (Gubitz et al. 2004), although our previous work showed that smn depletion in neuronal cells leads to decreased differentiation and growth, lowered viability, mitochondrial dysfunction and apoptosis (Parker et al. 2008; Acsadi et al. 2009).
The molecular pathomechanisms resulting in the selective loss of anterior horn cells are not known, owing to the lack of information about the downstream consequences of low smn levels in motor neurons. All of the tissues of the body develop from a fertilized egg. In a patient with SMA, that development occurs with one defined difference and one defined consequence: the low abundance of smn protein causes the specific loss of motor neurons. By profiling the proteome of embryonic stem cells (ESCs) with low smn protein abundance, we expect to identify key signaling pathways and important genes and proteins that contribute to our understanding of the pathology of the condition.
FVB ESCs were maintained and expanded in complete ESC medium containing DMEM high glucose, 20% ES-qualified FBS, ES supplement (HEPES, L-glutamine and beta-mercaptoethanol), 1X nonessential amino acids and 1000 U/ml ESGRO (LIF). Cells were passaged every 2 d at a 1:4 split in 0.1% gelatin-coated flasks. Cells were plated at 3 × 10^6 in 10 cm gelatin-coated dishes in complete ESC medium and incubated overnight at 37°C and 5% CO2. On the day of transfection, when cells were 60–70% confluent, Lipofectamine 2000 and 6–8 µg of plasmid DNA were added to each dish and incubated for 4 h. The media was then removed and replaced with complete ESC media with 600–800 µg of G418 to kill any cell not transfected. Surviving colonies were removed from the dish and each placed in one well of a 24-well plate with complete media and G418. As the colonies grew they were expanded so that individual cell lines could be frozen and stored. Each clonal cell line was then assayed by QPCR and western blot (protein) to determine the extent of smn knockdown (Acsadi et al. 2011). The selected control and knockdown cell lines were then maintained in a 10 cm culture dish in RPMI 1640 media for SILAC, supplemented with 10% dialyzed fetal bovine serum, 0.46 mM L-Lys-HCl or 0.46 mM 13C6,15N2-Lys-HCl, 0.47 mM L-Arg-HCl or 0.47 mM 13C6-Arg-HCl, 200 mg/l L-proline, 2 mM glutamine, 50 µM mercaptoethanol, 100 U/ml penicillin and 100 µg/ml streptomycin in a humidified 5% CO2 atmosphere (Bendall et al. 2008). Cells were passaged three times a week and harvested for experiments after six passages in the SILAC media. Cells were harvested for analysis by washing twice with ice-cold HANKS’ solution and then scraping from the dish. Cells were recovered in 1 ml of ice-cold HANKS’, then pelleted by centrifugation and stored at −80°C as a cell pellet until analysis.
Cell pellets were resuspended in 100 µl of water, then 100 µl of 2% LiDS was added and the mixture was immediately immersed in 95°C water for a 5 min incubation. Protein in lysates was determined using a BCA protein assay (Pierce, Rockford, IL). Equal amounts of protein from SILAC heavy and light cell lysates were combined, then treated with 10 mM DTT and alkylated with 30 mM iodoacetamide before adding 10 mM additional DTT. Three sample pairs, each with heavy control and light shRNA knockdown from independent culture dishes, were fractionated by SDS-PAGE on 10% polyacrylamide gels and stained with Coomassie blue dye. Each of the three sample lanes was divided into 30 fractions, with the edges of each lane removed prior to slicing for analysis. Proteins in the gel were digested overnight with 0.04 µg trypsin per slice in buffer containing 20 mM Tris (pH 8.0) and 10% acetonitrile. Eluted peptides solubilized in 0.1% formic acid were analyzed by LC-MS/MS without further purification.
All analyses were performed on a Thermo QExactive MS (ThermoFisher Scientific, Waltham, MA). Peptides were separated by reversed phase chromatography using an Easy 1000 nano UHPLC system (Thermo) and an Acclaim PepMap 100, 75 µm × 2 cm trap with an Acclaim PepMap RSLC, 75 µm × 15 cm column (Dionex). Peptides were eluted with a 2 h gradient from 5% to 30% acetonitrile with pH maintained by 0.1% formic acid. Column effluent was analyzed directly by MS/MS using HCD fragmentation.
Mass spectra were extracted from raw files and analyzed using MaxQuant version 188.8.131.52. They were searched against the UniProt Mouse Proteome database (downloaded August 2013, approx. 43539 entries) along with the MaxQuant contaminants database, with trypsin specified as the digestion enzyme. The mass tolerances for parent ions were 20 ppm for the first search and 4.5 ppm for the second search. All fragment mass tolerances were 20 ppm. The iodoacetamide derivative of cysteine was specified as a fixed modification. Oxidation of methionine, acetylation of the N-terminus and phosphorylation of serine, threonine and tyrosine were specified as variable modifications.
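To make the tolerance settings concrete, the following minimal sketch shows how a parts-per-million tolerance translates into an absolute mass window. The function names are hypothetical and are not part of MaxQuant; the sketch only illustrates the arithmetic behind the 20 ppm and 4.5 ppm values quoted above.

```python
def ppm_window(mz, tol_ppm):
    """Return the (low, high) m/z window for a tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def within_tolerance(observed, theoretical, tol_ppm):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    error_ppm = abs(observed - theoretical) / theoretical * 1e6
    return error_ppm <= tol_ppm


if __name__ == "__main__":
    # Illustrative value only: a parent ion at m/z 1000.
    print(ppm_window(1000.0, 20))   # window for the first search
    print(ppm_window(1000.0, 4.5))  # narrower window for the second search
```

At m/z 1000, a 20 ppm tolerance corresponds to a window of roughly ±0.02, whereas 4.5 ppm corresponds to roughly ±0.0045.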
False discovery rates were calculated by searching a reversed database and were set to 0.01 for peptide-spectrum matches and 0.05 for protein identification. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony.
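The idea behind a reversed-database (target-decoy) false discovery rate can be sketched as follows. This is a generic illustration, not MaxQuant's actual procedure, which is more elaborate; the function name and the input layout (a score paired with a decoy flag) are assumptions made for the example.

```python
def score_threshold_at_fdr(matches, max_fdr):
    """Find the lowest score cutoff whose estimated FDR is within max_fdr.

    matches: iterable of (score, is_decoy) pairs, higher score = better.
    The FDR at a cutoff is estimated as decoys / targets among all
    matches scoring at or above that cutoff.
    Returns the cutoff score, or None if no cutoff qualifies.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score  # keep extending while the estimate holds
    return cutoff


if __name__ == "__main__":
    # Hypothetical scores; True marks a hit from the reversed database.
    demo = [(10, False), (9, False), (8, False), (7, True), (6, False)]
    print(score_threshold_at_fdr(demo, 0.01))
```

Under this sketch, a threshold of 0.01 corresponds to the peptide-spectrum match setting and 0.05 to the protein identification setting described above.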
Quantification was performed slice by slice: each slice was submitted as its own experiment, so that each protein was quantified up to 30 times per lane. Further analysis of the suitability of these proteomic techniques for detecting changes in the mobility of proteins in the gel was reported elsewhere as a proteomic technique paper with no biological analysis (Carruthers et al. 2015).
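Because each protein has up to 30 per-slice measurements in a lane, its abundance can be viewed as a profile along the gel. The sketch below counts the peaks in such a profile as one simple way to flag a protein that appears at more than one gel position. It is an assumption-laden illustration only: the function name, the peak rule and the 10% noise floor are hypothetical and are not taken from the method reported in Carruthers et al. (2015).

```python
def count_modes(profile, min_fraction=0.1):
    """Count local maxima in a per-slice abundance profile.

    profile: list of non-negative abundances, one per gel slice.
    A slice counts as a mode if it is higher than its left neighbour,
    at least as high as its right neighbour, and reaches at least
    min_fraction of the largest value in the profile (a crude noise floor).
    """
    top = max(profile, default=0.0)
    if top <= 0:
        return 0
    floor = top * min_fraction
    modes = 0
    last = len(profile) - 1
    for i, value in enumerate(profile):
        left = profile[i - 1] if i > 0 else 0.0
        right = profile[i + 1] if i < last else 0.0
        if value >= floor and value > left and value >= right:
            modes += 1
    return modes


if __name__ == "__main__":
    # Hypothetical profile with abundance at two distinct gel positions.
    print(count_modes([0, 0, 10, 2, 0, 0, 6, 1, 0]))
```

A profile with two or more modes would, under these assumptions, be a candidate for the multimodal subset.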
Analysis of significant differences between wild type and low-smn-abundance mouse ESCs was performed using three different “Big data” analytic tools. DAVID analysis is probably the most widely used tool, applied primarily to assess biological themes in gene array data sets (Huang et al. 2009). However, we and others reason that such a tool can also be applied to proteomic data (Li et al. 2012). Such theme-enrichment analysis relies on the currently assembled knowledge that associates a given gene or protein with a specific theme. The analysis then asks whether the members associated with a given theme are over-represented, or enriched, in the list of genes or proteins that have satisfied the criteria applied to determine a significant difference in the experiment performed. The second tool is Ingenuity Pathway Analysis (IPA). Like the DAVID analysis, IPA requires a list that has satisfied the criteria applied to determine a significant difference in the experiment performed. However, IPA then uses information concerning the causal relationships reported in the literature to further inform the likely biological significance of a coordinated change in expression/abundance of a set of targets (Krämer et al. 2014; Pan et al. 2015). Lastly, the data were analyzed using Gene Set Enrichment Analysis (GSEA). It is easy to imagine that a given experiment might yield thousands of targets, or perhaps risibly few, making the interpretation of the results of the DAVID and IPA analytical methods problematic, or the criteria applied to achieve a usable list of significant targets at best arbitrary. Unlike the previous two methods, GSEA makes use of all of the valid data measured during the experiment. The targets are ranked by the difference in their expression/abundance level produced by the experimental manipulation. How near the top of the ranked list the members of a theme appear then indicates the strength of correlation with that biological theme (Subramanian et al. 2014).
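The over-representation question asked by theme-enrichment tools can be illustrated with a one-sided hypergeometric test. The sketch below is a generic version of that test and is not DAVID's own scoring; the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_theme, n_hits, n_overlap):
    """One-sided hypergeometric p-value for over-representation.

    n_background: proteins in the background list
    n_theme:      background proteins annotated with the theme
    n_hits:       proteins in the significant list
    n_overlap:    significant proteins annotated with the theme
    Returns P(overlap >= n_overlap) when n_hits proteins are drawn
    at random, without replacement, from the background.
    """
    total = comb(n_background, n_hits)
    upper = min(n_theme, n_hits)
    tail = sum(
        comb(n_theme, k) * comb(n_background - n_theme, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 100 background proteins, 10 in the theme,
    # 20 significant proteins, 6 of which carry the theme.
    print(enrichment_p_value(100, 10, 20, 6))
```

A small p-value indicates that the theme appears in the significant list more often than random sampling from the background would be expected to produce.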
Of the 7,050 proteins identified in the experiment, 5,738 were quantified on all three lanes, 769 on only two lanes, and 491 on only one lane. It is worth stating that the quantification of a protein on only one lane does not diminish the accuracy of the measurement of the abundance of that protein. However, it does limit its usefulness for interpreting a biological experiment across a sample of three independent experiments, as performed here. Using a cutoff of p<0.05, of the 5,738 proteins quantified on all three lanes, 1,485 had a significantly different abundance between wild type and ESCs with low smn. At a False Discovery Rate of 15%, 841 proteins were significant. These were used in the first two Bioinformatic Data Analysis programs, DAVID and IPA (supp. table 1).
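The difference between a raw p-value cutoff and a false discovery rate cutoff can be illustrated with the Benjamini-Hochberg step-up procedure. The text does not state which FDR procedure was applied, so the choice of Benjamini-Hochberg here is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr):
    """Return the indices of p-values accepted at the given FDR.

    Implements the Benjamini-Hochberg step-up rule: sort the m p-values,
    find the largest rank k with p_(k) <= (k / m) * fdr, and accept the
    k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


if __name__ == "__main__":
    # Hypothetical per-protein p-values.
    demo = [0.001, 0.04, 0.03, 0.5]
    print(benjamini_hochberg(demo, 0.15))
    print(benjamini_hochberg(demo, 0.05))
```

Because the threshold scales with rank, the set accepted at a given FDR is generally smaller than the set passing the same numeric cutoff applied to raw p-values.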
For the DAVID analysis, proteins deemed significant at an FDR of 15% were compared to a background control list of UniProt targets from ESCs with lowered abundance of smn protein. The GO DAVID biological themes listed in Table 1 form a ranked list that includes defined biological themes and less clear catch-all sets with considerable overlap. We present the 20 most significant themes, an admittedly arbitrary number.
The most enriched theme was mitochondrion. Multiple entries further down the table of the most enriched themes are also related to mitochondrial function. Transit peptide refers to the directing of proteins containing an N-terminal presequence to organelles including mitochondria. Oxidoreductase and fatty acid metabolism are also related to mitochondria. Ribosomal proteins are present in all mitochondria to translate encoded mRNA. GTP-binding proteins function in many parts of the cell but are strongly associated with mitochondrial function. Similarly, mitochondrial function is intimately related to the universal cofactor nicotinamide adenine dinucleotide (NAD). Mitochondrion inner membrane refers to proteins associated with the membrane separating the matrix from the intermembrane space.
Acetylation and phosphoprotein refer to post-translational modification. The themes Ribonucleoprotein, Protein biosynthesis, Nucleotide-binding, Ribosomal protein, and Endoplasmic reticulum are also related to translational processes.
We further examined the subset of the protein list that displayed multimodal abundance across the gel slices, indicating that the same protein had measured abundances in at least one post-translationally modified form (105 proteins, see Table 2, supp. table 2). The most enriched terms were clustered around themes of post-translational modification, including acetylation, phosphoprotein, protein synthesis, ribonucleoprotein, Ribosomal protein, Ribosome and Endoplasmic reticulum. These themes were all in the top 20 most enriched themes for the larger set of proteins having a significantly different abundance in ESCs with lowered smn abundance. However, their presence as themes on this list means that the proteins are not only related to protein translation but themselves show more than one abundance across the gels, implying that they exist in more than one form in the ESCs and that their abundance is altered by lowered smn. The themes Elongation factors (proteins used in protein synthesis in the process of cell cycle and elongation) and, indeed, alternative splicing were also significantly enriched.
However, the most enriched themes also included those more directly related to the known interactome of smn itself, such as ribonucleoprotein, ribosomal protein, nucleotide binding and, most interestingly, chaperone.
IPA analysis for Molecular and cellular functions among the differentially abundant proteins highlighted themes of cellular organization and maintenance as the strongest pathways. Pathways involved in cell death and survival and in growth and proliferation were the next strongest, followed by cellular movement.
Of the associated network functions, the highest scoring included protein synthesis. This shows parallels with the DAVID analysis. The next equal strongest networks were “Post-Translational Modification, Protein Folding, Cellular Movement”. The fourth theme of “Molecular Transport, Protein Trafficking, Cellular Assembly and Organization” recalls the DAVID analysis indication of chaperone, one of the few cellular processes previously identified for smn function.
The top five entries of the “Top Tox List” are dominated by metabolism-related processes, with the second most significant being Mitochondrial dysfunction.
The IPA analysis, in keeping with the results of the DAVID analysis, highlighted the theme of post-translational modification as being significantly associated with a lowering of abundance of smn protein. The IPA analysis of those differentially abundant proteins with multimodal distribution across gel slices indicated the importance of Remodeling of Epithelial Adherens Junctions (figure 1A) and EIF2 Signaling (figure 1B) as canonical pathways. As can be seen from the figures, there is again inherent overlap between these themes.
The themes associated with Molecular and cellular functions among multimodal proteins substantially mirror those seen across all proteins at an FDR of 15%, albeit with lower numbers of contributing proteins. The exception is that, in keeping with the indicated importance of cellular signaling in the canonical pathways, “Cell-To-Cell Signaling and Interaction” was the second strongest function.
Interestingly, the Network Functions associated with the multimodal proteins showed far less overlap with those indicated for the larger set of all regulated abundant proteins. The theme of cellular assembly and organization was retained, however, with the addition of Nervous System Development and function.
The “Top Tox List” again highlights mitochondrial themes.
The Gene Set Enrichment Analysis (GSEA) used 5719 resolvable and namable targets and is divided into those pathways that were significantly decreased in association with decreased SMN protein abundance (q<0.1, Table 5A) and those that tended to an increase, even though none of the increased pathways achieved a q value less than 0.1 (Table 5B).
Twelve Reactome pathways were significantly decreased by lowering smn protein abundance. Of these, the vast majority are directly related to protein translation. The clustering of the hits to the right in figure 2A indicates the strong negative correlation of the theme 3 UTR MEDIATED TRANSLATIONAL REGULATION (figure 2B shows, for comparison, a Reactome pathway tending to a positive if non-significant correlation).
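Why hits clustered at one end of the ranked list produce a strong correlation can be seen from a running-sum enrichment score. The sketch below is a simplified, unweighted version of the statistic: the published GSEA method weights each hit by its ranking metric and assesses significance by permutation, neither of which is reproduced here. The function name and the example identifiers are hypothetical.

```python
def enrichment_score(ranked_ids, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_ids: identifiers ordered from most increased to most decreased.
    gene_set:   identifiers belonging to the pathway of interest.
    The running sum steps up at each set member and down at each
    non-member; the score is the largest deviation from zero, with sign.
    """
    members = set(gene_set) & set(ranked_ids)
    n_hit = len(members)
    n_miss = len(ranked_ids) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running = 0.0
    extreme = 0.0
    for identifier in ranked_ids:
        if identifier in members:
            running += 1.0 / n_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


if __name__ == "__main__":
    ranked = list("abcdefgh")  # hypothetical ranked protein identifiers
    print(enrichment_score(ranked, {"a", "b"}))  # members at the top
    print(enrichment_score(ranked, {"g", "h"}))  # members at the bottom
```

A set whose members sit at the bottom of the ranking, corresponding to hits at the right of the plot, drives the running sum negative before any member is encountered, which yields a negative score.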
Of the ten pathways that showed an increased, albeit non-significant, correlation, five show a direct link to mitochondrial function, one to extracellular matrix, two to metabolism and, of particular interest, post-translational protein modification.
The strength of Bioinformatic Data Analysis tools is that they remove the caprice of simply considering targets as important based on prior experience and existing knowledge of the pathophysiology of the disease under examination. However, the precision with which protein abundance was resolved makes a large change in a given experiment meaningful, and validates the interpretation of individual target measures in a way that would be inappropriate using more traditional proteomic techniques.
Eif2s3y shares a role in RNA transport with smn (Mazeyrat et al. 2001).
Odc1 has been implicated in an animal model of another motor neuron disorder, amyotrophic lateral sclerosis (Virgili et al. 2006), but to our knowledge has not previously been linked specifically to SMA.
Gli2 is a transcription factor that mediates the initial Hedgehog (Hh) signaling (Pan et al. 2009). More importantly, Gli2 is required for the initial extension of axons from spinal accessory motor neurons (Dillon et al. 2005) and is vital to the regulation of motor neuron development and spatial patterning of ventral spinal cord progenitors (Bai et al. 2004). To our knowledge, Gli2 has not previously been linked specifically to SMA.
Stmn2 is important to neuronal axon integrity via its role in regulation of microtubule dynamics and protein trafficking (Duncan et al., 2013). Regulation of Stmn2 was observed in regenerating mouse sensory neuronal axons after injury (Shin et al. 2014). It should be noted the same paper states Stmn2 is not strongly associated with motor neurons. Stmn2 has also been linked to axonal “Wallerian degeneration” but not to later onset motor neuron loss as seen in SMA (Conforti et al. 2014).
ZNF423 is critically required for retinoic acid-induced differentiation and is a marker of neuroblastoma outcome (Huang et al. 2009).
Ireb2 is a key player in regulation of vertebrate cellular iron homeostasis (Hentze et al. 2010). Zumbrennen-Bullough et al. (2014) recently reported that older Irp2−/− mice displayed iron deposition in white matter but a significant reduction of iron in neurons. Of more immediate relevance, motor neurons of Irp2−/− mice display increased ferritin and decreased Transferrin receptor protein 1 expression, and impaired mitochondrial function leading to motor neuron degeneration (Jeong et al. 2011). Iron has previously been indicated as playing a potential role in spinal cord motor neuron cell death (Yu et al. 2008). However, this is the first report tying the growing literature on iron dysregulation and neuronal degeneration specifically to spinal muscular atrophy, although Vitte et al. (2004) highlighted a concern regarding liver iron homeostasis in extreme mouse models.
Srpy4 has been highlighted as having a role in developmental signaling pathways in adult zebrafish motor neuron regeneration (Reimer et al. 2009). To our knowledge, it has not previously been linked specifically to SMA.
A proteomic profile of embryonic stem cells with low smn protein revealed thematic changes consistent with the developmental dysfunction seen in the pathophysiological development of patients with SMA. Pathways associated with mRNA splicing, protein translation, post-translational modification and, perhaps most striking, mitochondrial function and specifically mitochondrial dysfunction were highlighted by each of the bioinformatics data analyses employed. It is striking that these disease-relevant effects are observable in cells that are still embryonic stem cells. Although not demonstrated here, these cells can reasonably be expected to differentiate into the range of somatic cell types of all three germ layers thus far characterized in the stem cell literature. However, only one of those cell types shows the pathophysiological effects characteristic of spinal muscular atrophy. That themes regarded as idiosyncratic to one specific cell type are already seen at the embryonic stage has implications not only for our understanding of development but also for therapeutic intervention. Specific proteins have been highlighted that may also be of use as potential biomarkers, as well as providing insights into the still unknown cascade that results in motor neuron death.
This work was performed in the Wayne State University, Karmanos Cancer Center and Environmental Health Sciences CURES Center Proteomics Core that is supported by National Institutes of Health National Institute of Environmental Health Sciences P30 ES020957, P30 CA022453, and S10 OD010700. Stem cell culture was supported by National Institutes of Health National Institute of Neurological Disorders and Stroke 1R21NS071339 awarded to GCP and the Children’s Hospital of Michigan Sarnaik fund.
Compliance with Ethical Standards
This article does not contain any studies with human participants or animals performed by any of the authors.
The authors declare that they have no conflict of interest.