Human embryonic stem cells (hESCs) can differentiate into neural stem cells (NSCs), which can be further differentiated into neurons and glial cells. These cells therefore have great potential as a source for the treatment of neurological diseases. Membrane-associated proteins are very important in cellular signaling and recognition, and their function and activity are frequently regulated by post-translational modifications such as phosphorylation and glycosylation. To obtain information about membrane-associated proteins and their modified amino acids potentially involved in the transition from hESCs to NSCs, and to investigate potential new markers for these two cell stages, we performed large-scale quantitative membrane proteomics of hESCs and NSCs. This approach employed membrane purification followed by peptide dimethyl labeling and peptide enrichment to study the membrane subproteome as well as changes in phosphorylation and sialylation between hESCs and NSCs. Combining proteomics and modification-specific proteomics, we identified a total of 5105 proteins, of which 57% contained transmembrane domains or signal peptides. The enrichment strategy yielded a total of 10,087 phosphorylated peptides, of which 78% were identified with ≥99% confidence in site assignment, and 1810 unique formerly sialylated N-linked glycopeptides. Several proteins were identified as significantly regulated between hESCs and NSCs, including proteins involved in early embryonic and neural development. In the latter group of proteins, we identified potential NSC markers such as Crumbs 2 and several novel proteins. A motif analysis of the altered phosphosites showed a sequence consensus motif (R-X-X-pS/T) significantly up-regulated in NSCs. This motif is recognized by several kinases, among them calmodulin-dependent protein kinase-2, emphasizing a possible importance of this kinase for this cell stage.
Collectively, these data represent the most diverse set of post-translational modifications reported for hESCs and NSCs. This study revealed potential markers to distinguish NSCs from hESCs and will contribute to improving our understanding of the differentiation process.
Pluripotent embryonic stem cell (ESC)-derived neural stem cells (NSCs) can differentiate into neurons and glial cells of the central nervous system (1), including specialized neuron types such as dopaminergic neurons, representing a potential source for the treatment of neurological diseases such as Parkinson's disease. Therefore, a better understanding of the cellular processes behind the transition of hESCs into NSCs, including reliable markers for each cell type, is fundamental to moving forward with a successful regenerative cell therapy and to investigating early human neurogenesis.
Many markers have been reported for the two types of stem cells (2, 3); however, several of these markers, such as CD133 (Prominin-1), are also found in other stem or progenitor cells (4). The discovery of cell surface-specific markers for differentiated stem cells is highly relevant for future clinical applications. In particular, being able to distinguish the developmental stages of differentiation from parental stem cells to fully mature cells would allow correct manipulation and isolation of the cell type of interest. Moreover, such a study would increase our understanding of the whole process of differentiation from embryonic cells to neural cells. Because plasma membrane-associated proteins are the key interface between the cell and its surrounding environment, and because they frequently present large extracellular domains suitable for antibody detection, they are strong marker candidates. In addition, these proteins are very important in cellular signaling and in cell–cell interaction and communication, processes that are essential for cellular differentiation. Furthermore, most membrane-bound proteins involved in these processes are frequently regulated, or have their interaction partners and function altered, by post-translational modifications (PTMs) such as phosphorylation and glycosylation.
Protein phosphorylation and glycosylation are among the most common PTMs in nature, and they play an important role in many protein regulatory functions and in cellular and biological processes. Protein phosphorylation is a dynamic PTM involved in many different cell signaling events, such as cell cycle control, protein synthesis, protein degradation, and differentiation, as well as cellular proliferation and apoptosis (5). Protein glycosylation, on the other hand, has several roles in cell–cell interaction, cell-matrix interaction, communication and cellular signaling (6–8). A nine-carbon sugar unit termed sialic acid (SA) can be bound to the nonreducing ends of glycans attached to certain membrane proteins, secreted proteins and lipids (9). SA is involved in many physiological processes, such as cell–cell interaction and molecular recognition, and is important in different pathophysiological processes, including brain development and cancer metastasis (10–14).
The mass spectrometry-based proteomic approach is a powerful tool for characterization of proteins and their PTMs. Because of the low abundance of protein PTMs in comparison to their nonmodified counterpart in the cell or tissue, their detection and characterization are almost only possible by using advanced enrichment strategies. Although many studies have reported the proteome and phosphoproteome of hESCs and early differentiated stages (15–20), there are a limited number of studies available comparing the individual stages between hESCs and NSC (21, 22) especially at the PTM level. For example, Chaerkady and collaborators have reported the quantitative temporal proteomic analysis of hESC differentiation into neural cell types, including motor neurons and astrocytes (21). The authors identified a total of 1251 proteins including proteins differentially regulated during neural differentiation such as the solute carrier protein 3 member 2 (SLC3A2), a cell surface protein highly expressed only in the hESCs stage. However, the focus of this study was not on membrane bound proteins and the authors did not analyze any PTM profile during differentiation, which may also elicit important and more selective information regarding the molecular events underlying the different stages.
To obtain more information about membrane proteins involved in the transition of hESCs to NSCs, and also to investigate potential markers for the two distinct cellular stages, we performed a comprehensive quantitative mass spectrometry-based proteome and PTM-ome study of membrane fractions isolated from hESCs and NSCs. We focused the PTM study on phosphorylation and SA N-linked glycosylation. This study allowed us to identify several significantly regulated proteins in hESCs and NSCs, including proteins involved in early embryonic development as well as in neural development. In the latter group of proteins we identified Crumbs homolog 2 (CRB2) as a potential novel NSC marker. By using selected reaction monitoring (SRM) we were able to verify a number of potential markers, including CRB2 at the protein and PTM level as well as CMP-N-acetylneuraminate-poly-alpha-2,8-sialyltransferase (SIA8D) at the PTM level, across cell lines other than the one used in this study. In addition, calmodulin-dependent protein kinase-2 (CaMKII) could be an important kinase for the NSC stage, because we found the sequence recognition motif (R-X-X-pS/T) highly up-regulated in these cells. This is the consensus site for several kinases, including CaMKII. Moreover, the analysis of the regulated dataset revealed an over-representation of the extracellular matrix (ECM)-receptor pathways, which are involved in diverse processes such as differentiation and proliferation (23). To our knowledge, this comparative proteome study of membrane-associated proteins, which includes quantitative analysis of protein phosphorylation and SA N-linked glycosylation, provides the most diverse set of PTMs reported for hESCs and NSCs.
Iron-coated PHOS-select metal chelate beads were from Sigma (Steinheim, Germany). TiO2 beads were a kind gift from GL Science (Tokyo, Japan). Empore C8 extraction disk was from 3M Bioanalytical Technologies (St. Paul, MN). Ammonia solution (25%) was from Merck (Darmstadt, Germany). Protease and phosphatase inhibitor cocktails, as well as PNGase F, were from Roche (Mannheim, Germany). Modified trypsin was from Promega (Madison, WI). Lysyl Endopeptidase (Lys-C) was from Wako Pure Chemical Industries (Osaka, Japan). TSKGel Amide-80 2 mm, 3 μm particle size was obtained from Tosoh Bioscience (Stuttgart, Germany). Poros Oligo R3 and R2 reversed-phase material was from PerSeptive Biosystems (Framingham, MA). Glycolic acid, trifluoroacetic acid, and Na2CO3 were from Fluka (St. Louis, MO). Ultrapure water was from an Elga Purelab Ultra water system (Bucks, U.K.). All other chemicals and solvents used were obtained from Sigma (St. Louis, MO).
The hESC line I6 was maintained either on inactivated mouse embryonic fibroblast (MEF) feeder cells in medium composed of knockout Dulbecco's modified Eagle's medium and Ham's F12 supplemented with 20% knockout serum replacement, 2 mM nonessential amino acids, 4 mM L-glutamine, 0.1 mM β-mercaptoethanol, 50 μg/ml Pen-Strep (all from Invitrogen, Carlsbad, CA), and 4 ng/ml of basic fibroblast growth factor (bFGF; Sigma), or on Geltrex (Invitrogen)-coated dishes in defined medium or medium conditioned with MEF for 24 h, as previously described (24–26).
hESC colonies were harvested using collagenase and cultured in suspension as embryoid bodies (EBs) for 8 days in ES medium minus FGF2. EBs were then cultured for an additional 2–3 days in suspension in neural induction medium containing DMEM/F12 with Glutamax, NEAA, N2 and FGF2 (20 ng/ml) before attachment on cell culture plates coated with Geltrex. Neural rosettes that formed 2–3 days after adherent culture were isolated manually using a stretched glass Pasteur pipette and placed in fresh culture dishes. The rosettes were then dissociated into single cells using accutase (Invitrogen) and replated onto culture dishes to obtain a homogeneous population of NSCs. The NSC population was expanded in Neurobasal medium containing NEAA, 2 mM glutamine, B27 and 20 ng/ml FGF2.
Immunocytochemistry and staining procedures were as described previously (27). Briefly, cells were fixed with 4% paraformaldehyde for 20 min, blocked in blocking buffer (10% goat serum, 1% BSA, 0.1% Triton X-100) for 1 h followed by incubation with the primary antibody at 4 °C overnight in 8% goat serum, 1% BSA, 0.1% Triton X-100. Appropriately coupled secondary antibodies, Alexa488-, Alexa568- or Alexa594- (Molecular Probes, and Jackson ImmunoResearch Lab Inc.) were used for single or double labeling. All secondary antibodies were tested for cross reactivity and nonspecific immunoreactivity. The following primary antibodies were used: Nestin (611658, BD Transduction laboratories, 1:500), Sox1 (AB5768, Chemicon (Temecula, CA),1:500) and Sox2 (AB5770, Chemicon,1:500).
The quantification of immunoreactive cells in culture was performed by analyzing fluorescent images using Photoshop on a minimum of 5000 cells of at least 10 randomly chosen fields derived from 3 or more independent experiments. The number of Hoechst labeled nuclei on each image was referred as total cell number (100%).
hESCs and NSCs (derived from parental hESC lines) (1 × 10⁷ cells) were homogenized on ice in ice-cold 0.1 M Na2CO3 lysis buffer (28) supplemented with protease inhibitor (Roche Complete EDTA Free), PhosSTOP phosphatase inhibitor mixture (Roche Applied Science, Meylan, France) and 10 mM sodium pervanadate. The homogenates were sonicated for 2 × 20 s and incubated on ice for 1 h with rotation. After incubation, the lysates were centrifuged at 150,000 × g for 90 min at 4 °C to separate soluble proteins from membrane proteins (pellet). The pellets were washed twice with triethylammonium bicarbonate (TEAB) (500 mM and 50 mM, respectively) to reduce contamination from soluble protein.
The membrane fraction was resuspended in 6 M urea, 2 M thiourea, 50 mM TEAB. Proteins were reduced at a final concentration of 10 mM DTT for 1 h at room temperature and subsequently alkylated in 20 mM iodoacetamide (IAA) for 1 h at room temperature in the dark. The digestion was performed in two steps: first, 4 μl of endoproteinase Lys-C was added and the sample was incubated for 3 h at room temperature. Following incubation, the sample was diluted eightfold with 50 mM TEAB, pH 8.0, and digested with trypsin (approximately 2% (w/w)) overnight at room temperature. To stop the trypsin digestion and remove insoluble material (e.g. lipids), the samples were acidified to 1% formic acid (FA) and centrifuged at 14,000 × g for 10 min. A small amount of the supernatant was dried by vacuum centrifugation for precise quantification by amino acid composition analysis. The remaining supernatant was lyophilized before dimethyl labeling.
Tryptic peptides were differentially labeled on-column by stable isotope dimethyl labeling as described previously (29, 30). Briefly, 200 μg of each sample from hESCs and NSCs was loaded onto separate HLB columns (Waters, Milford, MA) and washed with 5 ml of 50 mM sodium phosphate buffer (pH 7.5) containing 4% formaldehyde in water (CH2O or CD2O or 13CD2O) and 0.6 M cyanoborohydride (NaBH3CN or NaBD3CN). The peptides originating from the hESC and NSC membrane fractions were mixed 1:1 and stored for further enrichment and analysis.
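The labeling arithmetic can be illustrated with a minimal Python sketch. It assumes complete labeling of the peptide N terminus and of every lysine side chain, and uses the per-site mass shifts listed with the database search parameters (+28.031, +32.056 and +36.076 Da). The function names and the example sequence are hypothetical, and special cases such as an N-terminal proline are ignored.

```python
# Mass shift per labeled amine (Da), as given in the database search settings.
DIMETHYL_SHIFT = {"light": 28.031, "intermediate": 32.056, "heavy": 36.076}


def labeling_sites(peptide: str) -> int:
    """Count the amines assumed to be labeled: the N terminus plus each Lys."""
    return 1 + peptide.upper().count("K")


def label_mass_shift(peptide: str, channel: str) -> float:
    """Total mass added to a fully labeled peptide in the given channel."""
    return labeling_sites(peptide) * DIMETHYL_SHIFT[channel]


def heavy_light_mz_spacing(peptide: str, charge: int) -> float:
    """m/z distance between the heavy and light forms of the same peptide."""
    delta = DIMETHYL_SHIFT["heavy"] - DIMETHYL_SHIFT["light"]
    return labeling_sites(peptide) * delta / charge


if __name__ == "__main__":
    example = "AAK"  # hypothetical tryptic peptide with one Lys
    print(labeling_sites(example))
    print(label_mass_shift(example, "light"))
    print(heavy_light_mz_spacing(example, charge=2))
```

The spacing between the light and heavy forms is what allows the two cell types to be quantified from the same MS scan after the 1:1 mixing.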
To separate multi- and monophosphorylated peptides from formerly sialylated glycopeptides and nonmodified peptides, the samples were subjected firstly to SIMAC enrichment (31) to isolate multiphosphorylated peptides, followed by an optimized TiO2 enrichment procedure (32–34) to purify monophosphorylated and SA containing glycopeptides from nonmodified peptides. After enzymatic deglycosylation of the TiO2 bound peptide fraction, HILIC was used to separate the deglycosylated and phosphorylated peptides.
Briefly, the sample was resuspended in SIMAC loading buffer (0.1% TFA, 50% ACN) and combined with PhosSelect IMAC beads (Sigma-Aldrich, St. Louis, MO) previously equilibrated with the loading buffer. After 30 min of incubation at room temperature under rotation, the beads were first washed with the loading buffer, followed by elution of mainly mono-phosphorylated peptides with an acidic solution (1% TFA, 20% ACN). The acidic fraction, as well as the flow through, was lyophilized, and TiO2 enrichment was performed later. The IMAC beads were then incubated with a basic solution (1% ammonium hydroxide, pH 11.3) to elute multiply phosphorylated peptides. The fraction containing mainly multiphosphorylated peptides was desalted using Poros R3 prior to nanoLC-ESI-MS/MS.
For the simultaneous enrichment of mono-phosphopeptides and SA-containing glycopeptides, TiO2 chromatography was applied essentially as described previously (35). The IMAC flow through and the mono-phosphorylated peptide fraction from SIMAC were resuspended in TiO2 loading buffer (80% ACN, 5% TFA, 1 M glycolic acid) and incubated for 30 min with TiO2 beads. The beads were then sequentially washed with i) 100 μl TiO2 loading buffer, ii) 100 μl 80% ACN, 1% TFA and iii) 60 μl 20% ACN, 0.1% TFA. Phosphopeptides and SA glycopeptides were eluted with 1% ammonium hydroxide, pH 11.3, and lyophilized. The flow through from the TiO2 column was collected and desalted on an HLB solid reversed-phase column (Waters) before capillary HILIC fractionation.
The peptides eluted from TiO2 were resuspended in 50 mM TEAB, pH 8.0, and treated with N-glycosidase F and Sialidase A overnight at 37 °C to remove N-linked glycans (33). After deglycosylation, the peptides were desalted using a Poros R3 micro-column and subsequently subjected to capillary HILIC fractionation before nanoLC-ESI-MS/MS analysis, as described below.
Before the HILIC fractionation, the samples were desalted using self-made micro-columns packed with Poros R3 reversed-phase resin. A small plug of C8 material from a 3M Empore C8 disk was inserted in the constricted end of a P10 or P200 tip. The micro-column was packed with reversed-phase resin (suspended in 100% ACN) by applying gentle air pressure with a syringe. The acidified samples were loaded onto the micro-column and washed three times with 0.1% TFA. Peptides were eluted with 50% ACN, 0.1% TFA, and 70% ACN, 0.1% TFA, respectively.
The mono-phosphorylated and deglycosylated fraction and the nonmodified peptides fraction were further fractionated using HILIC essentially as previously described (36). Briefly, peptides were resuspended in B-solvent (90% ACN, 0.1% TFA) and loaded onto a 450 μm OD x 320 μm ID x 17 cm micro-capillary column packed with TSK Amide-80 (3 μm; Tosoh Bioscience, Stuttgart, Germany) using an Agilent 1200 Series HPLC (Agilent, Santa Clara CA). The peptides were separated using a gradient from 100–60% solvent B (A = 0.1% TFA) in 30 min at a flow-rate of 6 μl/min. Fractions were collected every 1 min and combined into 10 final fractions based on the UV chromatogram and subsequently dried by vacuum centrifugation.
The samples were resuspended in 0.1% FA and loaded onto an EASY-nLC system (Thermo Scientific, Odense, Denmark). The peptides were loaded onto a 20 cm long fused silica capillary column (100 μm inner diameter) packed with ReproSil-Pur C18 AQ 3 μm reversed-phase material (Dr. Maisch, Ammerbuch-Entringen, Germany). The peptides were eluted with an organic solvent gradient from 100% phase A (0.1% FA) to 34% phase B (0.1% FA, 95% ACN) at a constant flow rate of 250 nL/min. The nano-HPLC was connected online to an LTQ-Orbitrap Velos mass spectrometer (Thermo Scientific, San Jose CA). The LTQ-Orbitrap Velos was operated in positive ion mode with data-dependent acquisition. The full scan was acquired in the Orbitrap with an automatic gain control (AGC) target value of 1 × 10⁶ ions and a maximum fill time of 500 ms. Each MS scan was acquired at a resolution of 60,000 FWHM and followed by 7 MS/MS scans of the most intense ions. For improved fragmentation of phosphorylated peptides, either multistage activation (MSA) or high-resolution HCD-MS/MS was used. The nonmodified peptides were fragmented by low-resolution CID-MS/MS or HCD-MS/MS. Ions selected for MS/MS were dynamically excluded for a duration of 45 s. All raw data were viewed in Xcalibur v2.0.7 (Thermo Scientific).
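The data-dependent acquisition logic described above (the 7 most intense ions per MS scan, each excluded for 45 s once selected) can be sketched in Python as follows. This is only an illustration of the selection rule, not the instrument firmware: the function name, the input format, and the fixed m/z bin width used to match excluded ions (`mz_tol`) are assumptions made for the example.

```python
def select_precursors(peaks, scan_time, excluded_until,
                      top_n=7, exclusion_s=45.0, mz_tol=0.01):
    """Pick up to top_n precursor m/z values from one MS scan.

    peaks          : list of (mz, intensity) tuples from the full scan
    scan_time      : acquisition time of the scan in seconds
    excluded_until : dict mapping an m/z bin to the time its exclusion ends;
                     it is updated in place so it can be reused across scans
    """
    chosen = []
    # Walk through the peaks from most to least intense.
    for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == top_n:
            break
        mz_bin = round(mz / mz_tol)  # assumed matching rule for exclusion
        if excluded_until.get(mz_bin, float("-inf")) > scan_time:
            continue  # still on the dynamic exclusion list
        chosen.append(mz)
        excluded_until[mz_bin] = scan_time + exclusion_s
    return chosen


if __name__ == "__main__":
    exclusion = {}
    scan = [(500.0, 10.0), (600.0, 30.0), (700.0, 20.0)]
    print(select_precursors(scan, 0.0, exclusion, top_n=2))
    print(select_precursors(scan, 10.0, exclusion, top_n=2))
```

Dynamic exclusion is what lets lower-abundance ions, such as modified peptides, be sampled once the dominant ions have been fragmented.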
The raw data were processed using Proteome Discoverer v1.3 (Thermo Scientific) and searched against the Swiss-Prot human v3.53 database (20,243 entries) and the NCBI Homo sapiens database (666,077 entries) using an in-house MASCOT server (v2.3, Matrix Science Ltd, London, UK). Database searches were performed with the following parameters: precursor mass tolerance of 10 ppm, product ion mass tolerance of 0.6 Da (for CID fragmentation) or 0.05 Da (for HCD fragmentation), carbamidomethylation of Cys as a fixed modification, and up to two missed cleavages for trypsin. Searches were also conducted with the following variable modifications: oxidation of Met, acetylation of the protein N terminus, phosphorylation on S/T/Y, deamidation of Asn, and light, intermediate or heavy dimethyl labeling of the peptide N terminus and Lys (+28.031, +32.056 and +36.076 Da, respectively), depending on the labeling of the replicates. Only peptides with a q-value of at most 0.01 (Percolator) (37), Mascot rank 1 and a Mascot score ≥ 22 (32) were considered for further analysis. All spectra were first searched against the Swiss-Prot human database, and all spectra that did not result in a confident identification were then resubmitted to a search against the NCBI human database to increase the coverage of the analysis. Protein annotation was obtained using ProteinCenter (Thermo Scientific, Odense, Denmark).
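The acceptance criteria and the two-step search can be summarized in a short Python sketch. The thresholds (q-value of at most 0.01, Mascot rank 1, Mascot score of at least 22) come from the text; the `PSM` record, the function names and the dictionary-based inputs are hypothetical and do not reflect the Proteome Discoverer or Mascot data model.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """One peptide-spectrum match (hypothetical minimal record)."""
    sequence: str
    q_value: float
    rank: int
    mascot_score: float


def passes_filters(psm, max_q=0.01, min_score=22.0):
    """Apply the three acceptance criteria stated in the text."""
    return psm.q_value <= max_q and psm.rank == 1 and psm.mascot_score >= min_score


def two_pass_identifications(spectrum_ids, first_pass, second_pass):
    """Keep the first-database hit when it is confident; otherwise fall back
    to the second database. Both passes map spectrum id -> PSM."""
    accepted = {}
    for sid in spectrum_ids:
        hit = first_pass.get(sid)
        if hit is not None and passes_filters(hit):
            accepted[sid] = ("Swiss-Prot", hit)
            continue
        hit = second_pass.get(sid)
        if hit is not None and passes_filters(hit):
            accepted[sid] = ("NCBI", hit)
    return accepted


if __name__ == "__main__":
    swissprot = {1: PSM("AAK", 0.001, 1, 40.0), 2: PSM("CCK", 0.2, 1, 15.0)}
    ncbi = {2: PSM("DDK", 0.005, 1, 30.0)}
    for sid, (db, psm) in two_pass_identifications([1, 2, 3], swissprot, ncbi).items():
        print(sid, db, psm.sequence)
```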
All accession codes with peptide sequences containing 100% sequence identity against an individual peptide characterized by MS/MS in the database search were grouped together as a single protein description to remove protein name redundancy.
The quantitative analysis was carried out on the log2-values of the measured intensities, and the data were normalized based on the median or LOESS using the R statistics package or the DanteR package (http://www.omics.pnl.com). Significantly regulated peptides or proteins between biological replicates were determined based on a twofold change (38), and only those found regulated in at least two replicates were selected for further analysis.
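The median-based variant of this procedure can be sketched in Python as follows. The sketch is an assumption-laden illustration rather than the R or DanteR workflow used in the study: it median-centers the log2 intensities of each labeling channel, takes the NSC minus hESC difference as the log2 ratio, and calls a feature regulated when the twofold threshold is reached in the same direction in at least two replicates. LOESS normalization is not shown, and all function names are hypothetical.

```python
import math
from statistics import median


def median_center(intensities):
    """Log2-transform intensities and subtract the channel median."""
    logs = [math.log2(x) for x in intensities]
    m = median(logs)
    return [v - m for v in logs]


def log2_ratios(nsc_intensities, hesc_intensities):
    """Per-feature log2(NSC/hESC) after median-centering each channel."""
    nsc = median_center(nsc_intensities)
    hesc = median_center(hesc_intensities)
    return [n - h for n, h in zip(nsc, hesc)]


def regulation_call(replicate_log2_ratios, fold=2.0, min_replicates=2):
    """Return 'up' (higher in NSCs), 'down' (higher in hESCs) or None.

    Missing replicate values are passed as None and ignored.
    """
    threshold = math.log2(fold)
    values = [r for r in replicate_log2_ratios if r is not None]
    if sum(1 for r in values if r >= threshold) >= min_replicates:
        return "up"
    if sum(1 for r in values if r <= -threshold) >= min_replicates:
        return "down"
    return None


if __name__ == "__main__":
    print(log2_ratios([4, 8, 64], [1, 2, 4]))
    print(regulation_call([2.0, 1.2, None]))
```

The direction labels follow the convention used in the results, where up-regulated means more abundant in NSCs.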
To assign phosphosites with ≥ 99% confidence, we combined the Mascot delta score (MD-score) (39) and phosphoRS (40) (included in Proteome Discoverer 1.3). An MD-score ≥ 9 (corresponding to a 1% false localization rate, FLR) and a phosphoRS probability ≥ 99% were used as thresholds to confirm the site of phosphorylation. In addition, all phosphosites mentioned in Table I were manually checked. Only glycosylation sites within the consensus sequence N-X-S/T (where X is any amino acid except proline) were selected for further analysis.
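Both site-level filters are simple enough to express in a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, sequences are assumed to be upper-case one-letter codes, and the `require_both` switch is an assumption that lets the two localization scores be combined either conjunctively or disjunctively, since the exact combination rule is not spelled out in more detail here.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr. The lookahead
# keeps matches zero-width after the Asn so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(sequence):
    """1-based positions of Asn residues inside an N-X-S/T (X != P) sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


def confident_phosphosite(md_score, phosphors_probability,
                          md_min=9.0, prs_min=0.99, require_both=True):
    """Check the MD-score and phosphoRS thresholds for one phosphosite."""
    md_ok = md_score >= md_min
    prs_ok = phosphors_probability >= prs_min
    return (md_ok and prs_ok) if require_both else (md_ok or prs_ok)


if __name__ == "__main__":
    print(sequon_positions("ANGSANPTK"))   # second Asn is followed by Pro
    print(confident_phosphosite(12.0, 0.995))
```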
We used the motif-X algorithm (41) to determine over-represented sequence motifs in the total as well as the regulated phosphoproteome dataset, using the Homo sapiens database as background. Only phosphorylated peptides with confidently assigned phosphosites were used. The peptides were centered at the phosphorylated residues (S, T, or Y), using a window of ± 6 amino acid residues surrounding the phosphosite, and only motifs with p < 10⁻⁶ were allowed.
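The input preparation for such a motif analysis, and the check for the R-X-X-pS/T pattern discussed in this study, can be sketched in Python. This is not the motif-X algorithm itself, which performs a statistical enrichment test against the background; it only shows how a ± 6 residue window is centered on a phosphosite. The function names and the padding character used at protein termini are assumptions for the example.

```python
def site_window(protein_sequence, site_position, flank=6, pad="_"):
    """Return the 2*flank+1 residue window centered on a 1-based site.

    Windows that run past a protein terminus are padded so that the
    phosphorylated residue always sits at index `flank`.
    """
    i = site_position - 1
    left = protein_sequence[max(0, i - flank):i]
    right = protein_sequence[i + 1:i + 1 + flank]
    return left.rjust(flank, pad) + protein_sequence[i] + right.ljust(flank, pad)


def matches_rxx_st(window, flank=6):
    """True if the window has Arg at position -3 relative to a central S/T."""
    return window[flank] in "ST" and window[flank - 3] == "R"


if __name__ == "__main__":
    toy_protein = "AAARAASAAAAAAA"  # hypothetical sequence, Ser at position 7
    w = site_window(toy_protein, 7)
    print(w, matches_rxx_st(w))
```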
The SRM approach was used to validate some interesting proteins and modified peptides. The transitions were defined based on the raw data obtained from the comparative analysis. The SRM experiments were carried out with hESC and their derived NSC, as mentioned above, using two different cell lines H14 and H9 beside the I6 for the unmodified proteome where two peptides were used per protein. For the modified peptides, only H14 was used because more of the material was available for this cell line. The experiment was performed in duplicate for each cell line. The samples ESC and NSC were treated as described before to obtain the membrane fraction. The digested peptides were labeled with dimethyl labeling to get the relative quantitation using SRM (42). Subsequently, the sample from each cell line was pooled resulting in 1:1 ratio of ESC and NSC and the samples were fractionated using HILIC before SRM. For the H14, TiO2 enrichment was performed followed by deglycosylation (as described before) without HILIC fractionation. Test runs were performed to establish the retention time window for each peptide ion and to refine the transitions. The sample was resuspended in 0.1% FA and loaded onto an EASY-nLC system (Thermo Scientific, Odense, Denmark). The peptides were loaded onto a 16 cm long fused silica capillary column (75 μm inner diameter) packed with ReproSil - Pur C18 AQ 3 μm reversed-phase material (Dr. Maisch, Ammerbuch-Entringen, Germany). The peptides were eluted with an organic solvent gradient from 100% phase A (0.1% FA) to 34% phase B (0.1% FA, 95% ACN) at a constant flow rate of 300 nL/min. For the samples prefractionated using HILIC, the gradient was 120min whereas for the phosphorylated and deglycosylated peptides the gradient was 180 min. The nano-HPLC was interfaced with a triple quadrupole mass spectrometer (TSQ Vantage, Thermo Scientific, San Jose CA). 
The TSQ Vantage was operated in positive ion mode with the following parameters: spray voltage of 2300 V, capillary temperature of 200 °C, Q2 gas pressure of 1.5 mTorr argon, and Q1 and Q3 resolution set to 0.7 Da (FWHM). The software Skyline (43) (MacCoss Lab Software) was used to create the transitions, to obtain the collision energies and to process the data. Quantitation was performed in Microsoft Excel by summing the integrated peak areas of all transitions for each peptide. The final NSC/ESC ratio was obtained by dividing the summed areas of the heavy- and light-labeled forms according to the labeling scheme. NSCs derived from the H14 and H9 lines were labeled with heavy dimethyl and the ESCs with light dimethyl, whereas for the I6 cell line the labeling was reversed.
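The ratio calculation amounts to the following minimal Python sketch, which stands in for the spreadsheet step. The function name and the example peak areas are hypothetical; the `nsc_is_heavy` flag reflects the labeling scheme stated above (NSCs heavy for H14 and H9, reversed for I6).

```python
def nsc_over_esc(heavy_areas, light_areas, nsc_is_heavy=True):
    """NSC/ESC ratio for one peptide from its SRM transition peak areas.

    heavy_areas, light_areas : integrated areas of all transitions of the
                               heavy- and light-labeled peptide forms
    nsc_is_heavy             : True when NSCs carry the heavy label,
                               False for the label-swapped experiment
    """
    heavy = sum(heavy_areas)
    light = sum(light_areas)
    if heavy <= 0 or light <= 0:
        raise ValueError("summed transition areas must be positive")
    ratio = heavy / light
    return ratio if nsc_is_heavy else 1.0 / ratio


if __name__ == "__main__":
    # Hypothetical areas for three transitions of one peptide.
    print(nsc_over_esc([300.0, 100.0, 200.0], [150.0, 50.0, 100.0]))
    print(nsc_over_esc([300.0, 100.0, 200.0], [150.0, 50.0, 100.0],
                       nsc_is_heavy=False))
```

Inverting the ratio for the label-swapped line keeps all reported values in the same NSC/ESC orientation, so results from the three cell lines can be compared directly.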
The raw data and the annotated MS/MS spectra are available for download at the Proteome Commons Tranche data repository (https://proteomecommons.org/groupssearch.jsp) under the project name “Comparison of the membrane proteome and PTMome of human embryonic and neural stem cells.”
We have previously shown that a homogeneous population of NSCs can be isolated and expanded from the rosette structures induced by FGF2 during hESCs/iPSC differentiation via EB formation (24, 27). In this study, we used hESCs line I6 and as shown in Fig. 1A–1C the I6 undifferentiated ESC displayed typical hESCs morphology and >95% of our feeder-free culture of hESC expressed pluripotency markers such as Oct4 and Tra1–60. NSC were derived as previously described (24) and can be prolonged cultured while retaining their NSC identity, and maintaining the ability to differentiate into neurons and glia cells. NSC showed similar purity, with more than 95% of the cells expressing the known markers nestin, Sox1, and Sox2 (Figs. 1D–1F).
Membrane associated proteins are normally presented in low abundance, and their function and interactions with other molecules are frequently regulated by PTMs, such as phosphorylation and glycosylation. In standard shotgun proteomic studies membrane proteins are therefore underrepresented, as are the modified peptides derived by proteolysis from this subproteome. Consequently, to be able to detect membrane proteins and their associated PTMs, substantial enrichment steps have to be employed. In the present study we used a comprehensive quantitative MS-based strategy to study the membrane subproteome, as well as the phosphorylation and SA-glycosylation events changed in hESCs and NSCs (Fig. 2). This subproteome includes all insoluble proteins in the cell, including organelle membrane proteins as well as plasma membrane proteins. The hESCs and NSCs (derived from parental hESCs line) were lysed with Na2CO3 to generate linear membranes (28, 44) and the membrane-associated proteins were pelleted by ultracentrifugation. The membrane-associated proteins were digested with Lys-C and trypsin and the resulting peptides were labeled with dimethyl (29) and combined in a 1:1 ratio (hESCs/NSCs). To study the phosphorylated peptides, formerly SA-containing N-linked glycopeptides and nonmodified peptides from the membrane proteins, a comprehensive enrichment strategy was performed similar to previously described (35). Briefly, SIMAC (31) was performed to separate multiphosphorylated peptides from mono- and nonphosphorylated peptides. The multiphosphorylated peptides eluted from IMAC beads were analyzed directly by LC-MS/MS and the mono- and nonphosphorylated peptides were subject to TiO2 enrichment. The TiO2 enrich for both phosphorylated and sialylated glycopeptides as we have previously shown that TiO2 have a very high affinity toward these acidic modified peptides if a specialized loading buffer is used (33, 34, 45, 46). 
The oligosaccharide moieties were then enzymatically released by N-glycosidase F and sialidase A. Before the LC-MS/MS analysis, both the mono-phosphorylated and formerly sialylated peptides fraction and the nonmodified peptide fraction, from the TiO2 flow through, were prefractionated offline on a TSKGel Amide 80 HILIC HPLC column. The study of the modified peptides was performed with three biological replicates whereas the proteome study of nonmodified peptides was performed with two biological replicates. In addition, a validation of important findings was carried out by SRM using hESCs and their derived NSCs from two different lines H14 and H9 beside the I6 for the unmodified proteome whereas only H14 was used to validate the modified peptides.
By using the strategy described above we identified in total 5105 proteins combining the data obtained from the nonmodified peptides (supplemental Table S1) and the modified peptides (supplemental Tables S2 and S3). Fig. 3A shows the number of identified proteins and peptides in the study. Here a total of 13068 nonmodified unique peptides were identified from 3414 proteins, most of which are membrane-associated proteins (57%). However, 43% of the identified proteins do not have transmembrane domains or signal peptides localizing them to a membrane or for secretion (Fig. 3B). This suggests either a contamination with cytoplasmic proteins or that these proteins can be modified by specific lipids (47) (Lipidation) or connected to the membrane via protein–protein interactions. Lipidation plays an important role in protein function and cellular localization. For example, we found several glycosyl-phosphatidylinositol (GPI) that can serve as anchor for extracellular proteins to the plasma membrane (47), such as the bone marrow stromal antigen 2 (BST-2) (48) and CD59 glycoprotein (49). We also identified several DNA and RNA binding proteins, such as transcriptional factors, which is most likely because of their presence in larger complexes with proteins and DNA or RNA that pellets during the ultracentrifugation. Reducing the speed of ultracentrifugation could perhaps prevent this co-purification of protein–protein and protein-DNA and RNA complexes.
A total of 10,087 phosphorylated peptides (8560 nonredundant phosphorylation sites) (supplemental Table S4) were identified. To assign the phosphosites correctly we combined the Mascot delta score (39) and phosphoRS (40) algorithms. After applying MD-score ≥ 9 or phosphoRS probability ≥ 99% thresholds we identified 7140 nonredundant phosphosites with ≥ 99% confidence. These sites were found in 7862 phosphorylated peptides (supplemental Table S2) mapped to 2432 proteins (Fig. 3A), and only unique phosphopeptides with ≥ 99% confidence were considered for further analysis. The distribution of 79% (pS), 17% (pT), and 4% (pY) (Fig. 3C) was similar to those reported in other large-scale membrane phosphoproteome studies (44, 50). As depicted in Fig. 3D, the phosphopeptides were mainly singly or doubly phosphorylated. By using the N-linked glycosylation consensus motif (N-X-S/T; X ≠ P) to validate the formerly SA N-linked glycosylated peptides, we identified 1810 formerly SA-containing N-linked glycopeptides (supplemental Table S3) (corresponding to 1666 nonredundant glycosylation sites) mapped to 854 proteins. We are aware that some of the glycosites identified here, especially those in the NG motif, could result from chemical deamidation, as suggested by Palmisano et al. (51). Those sites were not taken into account during the subsequent data mining. Not surprisingly, membrane-associated proteins represented 49 and 94% of the total identified phosphorylated and formerly SA N-linked glycosylated proteins, respectively, based on transmembrane domain and signal peptide assignment (supplemental Tables S2 and S3), reflecting that N-linked glycosylation mainly occurs on proteins associated with cellular membranes such as the cell surface, endoplasmic reticulum and the Golgi apparatus. Fig. 3E depicts the overlap between the membrane proteome, phosphoproteome, and sialo-N-linked glycoproteome of the merged replicates.
Here a total of only 149 proteins were present in all three subproteomes, illustrating the significant increase in the coverage of the membrane proteome obtained by targeting not only nonmodified peptides but also phosphopeptides and SA N-linked glycopeptides. In addition, the parallel study of the nonmodified and modified peptides is important to verify if the modified site is regulated based on protein expression and degradation or alteration in site occupancy. Furthermore, 235 proteins in total were both phosphorylated and carried SA N-linked glycosylation.
Several proteins in the membrane subproteome were differentially expressed in hESCs and NSCs, and the significantly regulated ones found in at least two replicates are listed in supplemental Table S5. In this manuscript, we use the term up-regulated to describe a protein or peptide that is more abundant in NSCs, whereas down-regulated molecules are more abundant in hESCs. In total, 68 proteins were found up-regulated, whereas 113 proteins were found down-regulated. To identify protein networks involved in the differentiation we used the online database resource STRING (52). Fig. 4A depicts the interaction network of down-regulated proteins, which are involved in early development. Fibronectin (FN1) and Basigin (BSG) seem to be critical for this network, interacting strongly with several proteins found in this study. FN1 is involved in many cellular interactions with the extracellular matrix and plays an important role in cell adhesion, migration, growth, differentiation, and cytoskeleton organization (53–55). The important role of FN1 in vertebrate development was shown by the early embryonic lethality of mice with FN gene inactivation (56). FN1 is a glycosylated protein, and indeed we found a peptide containing the glycosylation site Asn528 down-regulated. Although FN1 is expressed in pluripotent cells, this protein is also known to be present in astrocytes (57). BSG is also a glycosylated transmembrane protein that plays a pivotal role in early embryogenesis during the preimplantation stage (58). BSG has been found to participate in the cell-surface orientation of solute carrier family 16 member 1 (SLC16A1) to the plasma membrane (59) and has been implicated in the induction of extracellular matrix metalloproteinases (60). In addition to the known markers of undifferentiated hESCs represented in Fig. 4A, such as alkaline phosphatase (ALPL) and DNA (cytosine-5)-methyltransferase 3 beta (DNMT3B), we also identified other known markers such as podocalyxin-like (PODXL), lin-28 homolog A (LIN28) and Sal-like protein 4 (SALL4) (61) (supplemental Table S5).
The interaction network of proteins up-regulated in NSCs is represented in Fig. 4B, where we observed two different networks that are involved in neuronal development and energy metabolism. Dihydropyrimidinase-related protein 2 (DPYSL2), also known as collapsin response mediator protein 2 (CRMP-2), and isocitrate dehydrogenase [NADP] (IDH2) seem to be key nodes in the networks. DPYSL2 is a plasma membrane-associated protein involved in neuronal differentiation, as well as in axon growth and guidance, neuronal growth cone collapse, and cell migration (62), and has also been reported to be up-regulated in NSCs in monkey (63) and rat (64). DPYSL4 and DPYSL5, proteins from the same family as DPYSL2, are also observed in this network. Moreover, we found the proteins dynactin subunit 2 (DCTN2) and CD47, which are involved in the development of the nervous system. DCTN has been implicated in synapse stability (65), whereas CD47 plays a role in cell adhesion and is suggested to be involved in memory and synaptic plasticity in the hippocampus (66). IDH2 is the isocitrate dehydrogenase found in mitochondria that catalyzes the decarboxylation of isocitrate into α-ketoglutarate, producing NADPH, which is an important anti-oxidative agent (67). The intracellular redox state and oxygen availability are factors that control the differentiation of NSCs into the other cell types of the central nervous system. Le Belle and collaborators (68) have demonstrated that ROS levels increase during the proliferation of NSCs and enhance self-renewal and neurogenesis. However, the balance of ROS levels is important because high levels of ROS may lead to cell death (69). Like IDH2, CKB, the brain-type creatine kinase, also has a role in energy metabolism (70). In addition to the abovementioned proteins represented in Fig. 4B, we also found RNA-binding protein Musashi homolog 1 (MSI1) and Nestin (NES), which are known markers for NSCs.
One of our aims in this study was to explore the changes in proteins of the membrane subproteome carrying phosphorylation and SA N-linked glycosylation, in order to identify important cellular signaling pathways and surface interactions underlying the differentiation of pluripotent hESCs into NSCs.
In total, we identified 150 formerly SA glycopeptides (152 sites) as down-regulated, whereas 119 peptides (120 sites) were up-regulated (supplemental Table S5). Several SA glycoproteins involved in neuronal development were found significantly regulated in our dataset. An example is the CMP-N-acetylneuraminate-poly-alpha-2,8-sialyltransferase (PST, ST8SiaIV, or SIA8D - Asn50), which catalyzes the poly-condensation of alpha-2,8-linked sialic acid required for the synthesis of polysialic acid (PSA), which is present on the embryonic neural cell adhesion molecule (N-CAM) and possibly other surface proteins. PSA is involved in the development of the nervous system and is specifically important for neuronal migration, differentiation, and synaptic plasticity (10). Impairments of the PSA mechanism result in severe brain damage and lack of neuronal migration. Another example of an SA glycoprotein up-regulated in NSCs is the neuronal cell adhesion molecule (NRCAM - Asn993). This protein is also involved in several events of nervous system development, such as synaptogenesis (71). It is worth mentioning that this protein is known to be expressed in neurons, so its identification could stem from the very small amount of spontaneous differentiation taking place in our cell preparation. However, further experiments are needed to confirm this finding.
Glycosylated proteins were also identified as over-represented in hESCs. A notable example is the EMILIN-1 protein, for which two different glycosylation sites (Asn455 and Asn794) changed strongly in occupancy at this stage. This protein is a connective tissue glycoprotein associated with elastic fibers (72), and its mRNA has been detected at high abundance in the ectoplacental cone and trophoblast giant cells of developing mouse embryos (73).
Protein phosphorylation is one of the most common PTMs and controls important cellular processes through the action of kinases and phosphatases, which phosphorylate and dephosphorylate proteins, respectively. We identified 287 phosphopeptides as up-regulated and 308 as down-regulated during the differentiation from hESCs to NSCs, each present in at least two of the three replicates (supplemental Table S5).
In this study, we identified eight SA N-linked glycosylation sites and 24 phosphorylation sites that are located within important protein kinases and phosphatases and whose occupancy differed significantly between hESCs and NSCs (Table I). For example, protein kinase C delta (PRKCD) (Ser645) and mitogen-activated protein kinase 6 (MAPK6 or ERK3) (Ser189) had phosphosites more phosphorylated in hESCs, whereas mitogen-activated protein kinase 1 (MAPK1 or ERK2) (Thr185 and Tyr187) had its phosphosites more phosphorylated in NSCs. PRKCD is a serine/threonine-protein kinase that plays a crucial role in apoptosis, control of growth, and differentiation (74). It is important to point out that Ser645 is localized in the protein kinase C-terminal domain and may have a regulatory activity, although the function of this site is still unknown. Interestingly, the phosphorylation of Thr185 and Tyr187 on MAPK1 and of Ser189 on MAPK6 leads to the activation of these kinases (75, 76). MAPK signaling is known to play a role in cellular proliferation, differentiation, and developmental processes (77). Indeed, Armstrong and collaborators (78) have reported the role of MAPK and ERK signaling in the maintenance of hESC pluripotency. The authors observed that MAPK6 and MAPK1 were transcriptionally down-regulated during the differentiation from ESCs to embryoid bodies (78). Regarding MAPK6, our results support this study, because the phosphorylation of the regulatory site Ser189 was found decreased during the differentiation, consistent with the inactivation of this kinase. However, our results suggest an activation of MAPK1 in NSCs, because we observed an increased phosphorylation of its regulatory sites Thr185 and Tyr187. Indeed, it has already been shown that MAPK1 activation promotes proliferation and inhibits neuronal differentiation of NSCs (79).
Regarding phosphosites within phosphatases whose occupancy changed between the two cell stages, it is worth mentioning that Ser594 on the tyrosine-protein phosphatase nonreceptor type 14 (PTPN14) and Ser507 on the protein phosphatase 1 regulatory subunit 12A (PPP1R12A) were both found more abundant in hESCs than in NSCs (Table I). PPP1R12A is a member of the major group of phosphatases in eukaryotes. Its phosphorylation seems to play a key role in insulin signal transduction, because this hormone stimulates PPP1R12A phosphorylation at several sites, including Ser507 (80).
To determine over-represented sequence motifs in the whole as well as in the regulated phosphoproteome dataset, we applied the motif-X algorithm (41) using the Homo sapiens database as background. Taking into account only the phosphopeptides whose occupancy changed significantly during the differentiation, the motifs pS/T-P and pS-X-X-E were found over-represented in both stages, whereas the motif R-X-X-pS was found over-represented only in NSCs (Fig. 5). It is worth mentioning that around 50% of all regulated peptides from both stages contain the pS/T-P sequence consensus, which is a known target for the proline-directed kinase group (CMGC), which includes the mitogen-activated protein kinases (MAPK) and cyclin-dependent kinases (CDK) (for review, see (81)). Casein kinase 2 (CK2) is known to phosphorylate Ser/Thr residues surrounded by acidic regions, as in pS-X-X-E. It is important to point out that the basic motif R-X-X-pS is a target for several kinases including CaMKII, which seems to have a key role in NSCs. This is supported by the fact that CaMKII is involved in several functions in neuronal differentiation and also in synaptic transmission and plasticity (82). Indeed, by using the NetworKIN algorithm (83), which predicts the kinases involved in the phosphorylation of a specific site, we observed that CaMKII was predicted to phosphorylate only phosphosites from the NSCs, whereas the kinases CDK2, CDK3, CK2, and p38 were predicted to be responsible for the phosphorylation of sites found in both hESCs and NSCs (Fig. 6).
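The three consensus motifs named above can be expressed as simple sequence patterns. The following is a minimal sketch, not the motif-X algorithm itself (which tests for statistical over-representation against a background); it only labels which of the three motifs a given phosphosite window matches. The 13-residue window centred on the phosphorylated residue, the function names, and the example sequences are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Iterable, List

# Assumption: each phosphosite is given as a 13-residue window with the
# phosphorylated residue at index 6 (positions -6 .. +6).
MOTIFS = {
    "pS/T-P": re.compile(r"^.{6}[ST]P"),     # proline at +1
    "pS-X-X-E": re.compile(r"^.{6}S.{2}E"),  # glutamate at +3
    "R-X-X-pS": re.compile(r"^.{3}R.{2}S"),  # arginine at -3
}


def classify_window(window: str) -> List[str]:
    """Return the names of all motifs matched by one 13-residue window."""
    if len(window) != 13:
        raise ValueError("expected a 13-residue window centred on the site")
    return [name for name, pattern in MOTIFS.items() if pattern.match(window)]


def motif_counts(windows: Iterable[str]) -> Counter:
    """Count how many windows match each motif (a window may match several)."""
    counts: Counter = Counter()
    for window in windows:
        counts.update(classify_window(window))
    return counts


# Made-up example windows, not sequences from the study.
example = ["AAARAASAAAAAA", "AAAAAATPAAAAA", "AAAAAASAAEAAA"]
print(motif_counts(example))
```

Counting matches per cell stage in this way would only describe motif frequencies; deciding whether a motif is over-represented still requires a background comparison such as the one motif-X performs.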
The analysis of proteins and modified peptides that were differentially regulated in the two cell stages revealed that the extracellular matrix (ECM)-receptor interaction pathway was over-represented in our data set (Fig. 7). This pathway is involved in the signaling that controls cell shape, motility, migration, growth, survival, and differentiation (23). Moreover, ECM proteins are components of the stem cell “niche,” which is responsible for sending signals from the receptors to the interior of the cell that control, for example, the NSC fate (proliferation, differentiation, survival, and death) (84).
Laminin (LAM) is an example of an ECM protein that binds to different receptors such as integrin and syndecan (85). We found several laminin subunits, such as beta-1 (LAMB1), alpha-1 (LAMA1), and gamma-1 (LAMC1), that were up-regulated at the proteome level during the differentiation. Interestingly, the sites Asn2038 and Asn677 on LAMA1 and LAMB1, respectively, presented a decreased SA-glycosylation level, indicating a modulation of the SA level or glycan structure of these sites. The decrease in abundance of Asn2038 and Asn677 could be caused by a decrease in sialylation or by the addition of PSA to these sites. TiO2 is believed to bind PSA very strongly, similar to highly phosphorylated peptides, and molecules modified by PSA might not be eluted from the TiO2 with the elution buffer used here. In addition, we found two SA-glycosites on the laminin subunit alpha 5 (LAMA5) that showed different abundance profiles: whereas Asn73 was found more SA-glycosylated in NSCs, Asn2303 was detected as more abundant in hESCs. Regarding the thrombospondin (THBS) proteins, the site occupancy of Asn1067 on THBS1 increased, whereas the site occupancy of Asn1069 on THBS2 seems to be decreased during the differentiation. It is important to note that we could not detect nonmodified peptides from the proteins LAMA5 and THBS2 in our proteome study. Thus, it is not possible to define whether it is the site occupancy or the protein expression that is changing during the differentiation process (Fig. 7).
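The distinction drawn here between a change in site occupancy and a change in protein expression can be illustrated with a small calculation. The sketch below is an assumption about how such a comparison might be made, not a description of the processing used in this study: it subtracts the log2 protein-level change from the log2 change of the modified peptide, and returns no value when the protein was not quantified (as for LAMA5 and THBS2). The function name, the NSC/hESC ratio convention, and all numbers are hypothetical.

```python
import math
from typing import Optional


def occupancy_log2_change(site_ratio: float,
                          protein_ratio: Optional[float]) -> Optional[float]:
    """Estimate the log2 change in site occupancy.

    site_ratio    -- NSC/hESC abundance ratio of the modified peptide
    protein_ratio -- NSC/hESC abundance ratio of the protein, taken from
                     nonmodified peptides, or None if not quantified

    Returns None when the protein ratio is missing, because the change in
    the modified peptide cannot then be separated from protein expression.
    """
    if site_ratio <= 0:
        raise ValueError("site_ratio must be positive")
    if protein_ratio is None:
        return None
    if protein_ratio <= 0:
        raise ValueError("protein_ratio must be positive")
    return math.log2(site_ratio) - math.log2(protein_ratio)


# Made-up numbers: modified peptide halves while the protein doubles,
# so the occupancy estimate drops by two log2 units.
print(occupancy_log2_change(0.5, 2.0))   # -2.0
# Protein not quantified: the two explanations cannot be distinguished.
print(occupancy_log2_change(0.5, None))  # None
```

A pattern like the first call corresponds qualitatively to a site whose modified peptide decreases while the protein itself increases, which points to a change at the site rather than in expression.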
As mentioned above, LAM, FN, and THBS are ECM proteins that bind to different integrin receptors, which are transmembrane heterodimers (αβ) (23). Many of the integrin receptors have multispecificity. For example, the integrin alpha3-beta1 (ITGA3/ITGB1) binds all of the above-cited ECM proteins. We found several alpha integrins in our study (ITGA1, ITGA3, and ITGA6) that were down-regulated in the proteome and sialiome data sets. Even though we identified the integrin beta1 (ITGB1) in our study, this protein was not found significantly regulated. Although the integrin alpha 6 (ITGA6) is highly expressed in embryonic, neural, and hematopoietic stem cells of the mouse (86), we found this protein down-regulated. These findings suggest that the expression of this protein might be species specific. Hall and collaborators (87) have observed that laminin/integrin-beta1 is necessary for NSC proliferation and survival. However, other receptors besides ITGB1 seem to be necessary for laminin to promote NSC growth (87). We believe that the syndecan (SDC) receptor may have a role in promoting NSC growth by interacting with laminin, because the syndecans are a family of transmembrane proteoglycans that act as co-receptors for ECM proteins and growth factors (85). Ser187 on SDC2 was found increasingly phosphorylated in NSCs (Fig. 7). SDC2 is found in developing neural tissues (85), and this phosphosite has already been suggested to be a substrate for protein kinase C (PKC), which is also involved in cell-matrix and cell–cell adhesion (88).
These interesting findings may suggest that the above-mentioned proteins and modification sites play a role in the differentiation process. However, it is important to stress that further studies need to be performed to evaluate their exact biological functions.
We also investigated new potential markers to distinguish NSCs from hESCs at the membrane subproteome and PTM level. Table II contains the potential markers found in this study. An example is the transmembrane glycoprotein Crumbs 2 (Crb2), which was found up-regulated in NSCs at the proteome level as well as at its glycosylation sites Asn478, Asn786, and Asn886. This apical polarity protein has recently been reported to be essential for survival and differentiation of ESC-derived neural progenitors in mice, and its up-regulation was correlated with early neural markers such as Sox1 and Musashi at the proteome level (89). In the present study, we also observed the up-regulation of Musashi-1 at the proteome level, and this finding supports our suggestion that Crb2 is a potential NSC marker.
Another potential NSC marker is the wntless protein (WLS), a conserved transmembrane protein that plays an important role in the control of the secretion of Wnt proteins (90, 91). The Wnt signaling pathway is known to be involved in several processes important for stem cells, including self-renewal, pluripotency, and differentiation (92–94). In neural development, Wnt proteins regulate neurogenesis and are important for neural stem cell self-renewal (95–97). Wnt transport to the cell surface is inhibited in the absence of WLS, which decreases neurogenesis (98).
At the PTM level, it is important to mention the prolow-density lipoprotein receptor-related protein 1 (LRP1), also known as apolipoprotein E receptor, which is involved in endocytosis and signaling functions during embryonic development (99) and in the brain (100), where it participates in synaptic plasticity. In this protein, glycosylation at Asn2127 and Asn3089 was found highly abundant in NSCs, whereas glycosylation at Asn729 was more abundant in hESCs. However, the expression level of LRP1 did not change significantly, which indicates a positive modulation of the glycosylation or sialylation level of these sites during the differentiation from hESCs to NSCs.
A number of proteins were selected for validation based on the abovementioned findings. The relative quantitation by SRM was based on the integrated peak difference between the heavy and light dimethyl labeling. At least three heavy and light transitions were selected per peptide, and two peptides per protein. To verify that the findings were not specific to the hESC line I6 and its derived NSCs, two different cell lines were used (H14 and H9). Because of the low amount of material obtained from the membrane fractions, the modified peptides were validated using only the ESC H14 line and its derived NSCs. The results from the SRM analysis are summarized in Table III, and the complete information about the monitored transitions, the integrated peak areas, and the coefficients of variation (CV) is shown in supplemental Table S6. The SRM data for each protein and PTM, showing the peptide abundance including all transitions, are illustrated in supplemental Figs. S1 to S5. In the SRM study, we were able to validate CRB2 as a potential NSC marker, as well as its glycosylation site Asn886. The validation by SRM confirmed the previous findings, and indeed our results were not specific to the ESCs and derived NSCs from the I6 line.
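The SRM quantitation described above can be sketched as a per-transition comparison of heavy and light integrated peak areas, summarized by a mean and a coefficient of variation. This is a minimal illustration under stated assumptions: the comparison is expressed as a heavy/light ratio, the CV is taken across the transitions of one peptide, and the function names and peak areas are hypothetical; the study's own calculation and values are those reported in supplemental Table S6.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def transition_ratios(heavy_areas: Sequence[float],
                      light_areas: Sequence[float]) -> List[float]:
    """Heavy/light ratio for each monitored transition of one peptide."""
    if len(heavy_areas) != len(light_areas):
        raise ValueError("heavy and light transitions must be paired")
    if len(heavy_areas) < 3:
        # The text states that at least three transitions were selected.
        raise ValueError("at least three transitions are required")
    return [h / l for h, l in zip(heavy_areas, light_areas)]


def summarize(ratios: Sequence[float]) -> Tuple[float, float]:
    """Return (mean ratio, coefficient of variation in percent)."""
    m = mean(ratios)
    cv_percent = stdev(ratios) / m * 100.0
    return m, cv_percent


# Made-up integrated peak areas for three transitions of one peptide.
heavy = [200.0, 400.0, 600.0]
light = [100.0, 200.0, 300.0]
ratio, cv = summarize(transition_ratios(heavy, light))
print(ratio, cv)
```

A low CV across transitions indicates that the transitions agree on the ratio; which label corresponds to which cell type depends on the labeling scheme and is not assumed here.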
In conclusion, we identified a total of 5105 proteins by combining the proteome and PTM-ome analyses, of which 57% contained transmembrane domains or signal peptides. The enrichment strategy demonstrated the strength of TiO2 enrichment for phosphorylated peptides and SA-containing N-linked glycopeptides, leading to the identification of 10,087 and 1890 phosphopeptides and formerly SA N-linked glycopeptides, respectively. Of the phosphopeptides, 7862 were identified with more than 99% confidence in site assignment. Moreover, we could identify significant changes in the abundance of 181 proteins, 644 phosphorylation sites, and 272 sialylated glycosylation sites in at least two replicates. Collectively, these data represent the most diverse set of proteins and PTMs reported for hESCs and NSCs. In this study, we were able to detect a number of proteins and PTMs specific for the two cell stages. In particular, sialylated glycoproteins seem to be promising cell-surface markers, where the level of sialylation can pinpoint the differentiation stage, as observed for LRP1. For instance, we could identify Crumbs 2, JAM2, GFRA1, MARCKS, PRTG, WLS, SIA8D, SULF2, IGDCC3, LRP2, PTX3, and LRRC4B as potential markers for NSCs. Several of these could be validated using SRM, including CRB2 at the protein and PTM level as well as SIA8D at the PTM level, across different cell lines besides the one used in this study.
The discovery of new surface markers is essential for the basic and clinical study of stem cells, because several available markers are ambiguous. Cell-surface protein markers can be used to distinguish different cell types, as well as different differentiation stages. Furthermore, the study of PTMs is of interest because many biological processes, such as stem cell self-renewal and differentiation, are modulated by proteins that are frequently regulated by PTMs.
To our knowledge, this is the first comparative study of hESCs and NSCs that includes quantitative data on phosphorylation and SA N-linked glycosylation. These data will contribute to improving our understanding of the differentiation process and reveal possible markers to distinguish NSCs from hESCs.
We thank Thiago Verano Braga and Alistair Edwards (Protein Research Group, Department of Biochemistry and Molecular Biology, University of Southern Denmark) for help in proofreading the manuscript. For advice on the SRM analysis, we thank Gerard Such Sanmartín, Simone Sidoli, and James Williamson. Special thanks to Anna Maria Swistowska from the Buck Institute for all help with the cells for the validation.
* This work was supported by the Lundbeck Foundation (M.R.L. Junior Group Leader Fellowship and a grant for an LTQ-Orbitrap Velos) and in part by California Institute for Regenerative Medicine Grants TR-01856 and CL1-00501 to Xianmin Zeng.
This article contains supplemental Figs S1 to S5 and Tables S1 to S6.
1 The abbreviations used are: