Cell-Surface Proteome of Embryo-Derived Stem Cell Lineages
ES, TS, XEN, and EpiSC were biotinylated with the membrane-impermeable reagent sulfo-NHS-SS-biotin (A), which binds to primary amines of cell-surface proteins (
Roesli et al., 2006). Individual lysates from biotinylated ES, TS, XEN, and EpiSC were prepared, and biotinylated proteins were affinity captured to collect cell-membrane-enriched protein samples. For comparison, proteins that did not bind to the beads were also collected. These should be depleted in cell-surface proteins but contain cytoplasmic and nuclear proteins (membrane-depleted whole cell samples). Samples were analyzed by mass spectrometry in quadruplicate using a MudPIT approach (
Taylor et al., 2009). A total of 3,432 proteins were identified (1,758 for ES; 2,391 for TS; 2,442 for XEN; 2,169 for EpiSC; see
Table S1 available online), which represents one of the largest mouse stem cell protein data sets reported (
Van Hoof et al., 2006; Wang et al., 2008a).
Gene Ontology (GO) analysis revealed that biotinylated fractions were highly enriched for plasma membrane proteins and depleted in nonplasma membrane proteins (p < 1 × 10^−13, Fisher's exact test). This suggests that cell-surface proteins had been successfully captured. However, one caveat of chemical labeling strategies and organellar purifications in general is the detection of proteins with annotations other than the organelle of interest (
Bergeron et al., 2010), which could hinder the identification of lineage-specific cell surface proteins. To address this challenge, we developed a data mining strategy to predict proteins that are localized to the cell surface. Machine-learning algorithms (
Hall et al., 2009b) and training sets of known membrane-localized and non-membrane-localized proteins were used to build a model that categorized each protein as belonging to the cell surface or not (B). Applying this stringent model to our data identified 551 proteins predicted to be localized at the cell surface (220 for ES; 222 for TS; 212 for XEN; 416 for EpiSC;
Table S2).
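The enrichment test mentioned above can be illustrated with a minimal sketch of a one-sided Fisher's exact test on a 2×2 table (captured versus unbound fraction, plasma membrane versus other GO annotation). The function name and the counts in the example are hypothetical and are not taken from the study; the source does not state which software or table layout was used.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    Assumed layout (illustrative only):
                      plasma membrane   other annotation
        biotinylated        a                 b
        unbound             c                 d
    Returns P(X >= a) under the hypergeometric null distribution.
    """
    row1 = a + b          # proteins in the biotinylated fraction
    col1 = a + c          # proteins annotated as plasma membrane
    n = a + b + c + d     # all proteins considered
    upper = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return tail / comb(n, row1)


# Hypothetical counts chosen only to show a strong enrichment.
p_value = fisher_exact_greater(300, 200, 100, 400)
print(p_value < 1e-13)
```

Using exact integer binomial coefficients avoids floating-point underflow while the tail is being summed; the single division at the end converts the result to a float.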
As expected, these data revealed a strong enrichment for functional classes that are characteristic of cell-surface proteins, including signaling receptors, cell adhesion and cell migration molecules (C and
Table S3). The data set contains shared and cell-type-specific proteins within many functional classes, thereby revealing important differences in their protein profiles (Tables 1 and 2). For example, signaling receptors known to be involved in regulating stem cell self-renewal were detected, including Lifr in ES cells, Fgfr2 in TS cells, and Fgfr1 in EpiSC. In addition, Notch receptors were identified in EpiSC, but not in ES cells, suggesting differences may exist between pluripotent cell types. Interestingly, we identified numerous Ephrin and Slit receptors in ES and XEN cells. Their known roles in cell guidance and migration could provide a mechanism to explain the process of cell sorting that occurs during EPI and PE segregation (
Brose and Tessier-Lavigne, 2000; Genander and Frisén, 2010; Plusa et al., 2008). Thus, our proteomic data set will provide an important resource of cell-surface proteins that are present on stem cells and could be used in future functional studies to interrogate the mechanisms of self-renewal and differentiation.
| Table 1. Functional Classification of the Plasma Membrane Proteins Identified in ES, TS, and XEN Cells |
| Table 2. Functional Classification of the Plasma Membrane Proteins Identified in ES Cells and EpiSC |
Protein Abundance Is a Reliable Predictor of Cell-Type Specificity
Although previous studies have shown that RNA expression alone can be used to predict cell-specific protein expression (
Gerbe et al., 2008; Plusa et al., 2008), the presence of RNA does not always correlate with the presence of the protein (
Cox et al., 2009; de Sousa Abreu et al., 2009; Lundberg et al., 2010), suggesting that RNA expression may be an unreliable indicator of cell-specific protein expression. To determine whether the set of cell-specific membrane proteins identified in the current study could have been predicted by RNA expression alone, we integrated genome-wide transcriptional profiles into the protein data set (
Table S4). Pairwise comparison of the difference in RNA and protein expression between cell types revealed a subset of cell-surface proteins that would have been identified as cell-specific using transcriptional profiling alone (D). However, a larger set of cell-surface proteins showed poor correlation between RNA and protein abundance when comparing cell types and would not be predicted to be cell-specific using transcriptional information alone (D). For this set of proteins, RNA transcripts were detected at equivalent (<2-fold difference) levels between cell types, despite robust differences in protein abundance. The cell-specific protein expression of ten proteins within this set was confirmed using antibodies (see below). Disagreement between transcript and protein abundance was prevalent when comparing two cell lines. For instance, discordance between RNA and protein for ES and TS cells occurred for 89/143 (62%) cell-surface proteins. Poor concordance also impacts the reliable identification of cell-specific membrane proteins. Thus, overall only 21 of 178 (12%) cell-specific membrane proteins would have been identified by analysis of RNA expression alone (
Table S4). As an example, levels of Pecam1 transcript, a strong marker of ES cells (
Robson et al., 2001; Vittet et al., 1996), were similar in ES and TS cells (confirmed by quantitative RT-PCR; data not shown), but Pecam1 protein was detected only in ES cells (34 spectral counts in ES cells, 0 in TS cells;
Table S4; confirmed by antibody in ). Poor concordance between protein and transcript expression for cell-surface proteins is consistent with previous studies and, together with our data, suggests that protein is a more reliable predictor of cell-type specificity than RNA expression alone. These data reinforce the need for a direct proteomic approach for protein marker discovery.
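The comparison described above can be sketched as a simple rule: a protein difference counts as predictable from RNA only if the transcript differs by at least 2-fold in the same direction. The function names, the pseudocount, and the RNA values below are assumptions made for illustration; only the Pecam1 spectral counts (34 in ES cells, 0 in TS cells) and the reported fractions come from the text.

```python
from math import log2


def fold_difference(x: float, y: float, pseudocount: float = 1.0) -> float:
    """Absolute log2 fold difference; the pseudocount guards against zero values."""
    return abs(log2((x + pseudocount) / (y + pseudocount)))


def rna_predicts_protein(rna_a: float, rna_b: float,
                         prot_a: float, prot_b: float,
                         threshold: float = 2.0) -> bool:
    """True if RNA differs by >= threshold-fold in the same direction as protein."""
    rna_changed = fold_difference(rna_a, rna_b) >= log2(threshold)
    same_direction = (rna_a - rna_b) * (prot_a - prot_b) > 0
    return rna_changed and same_direction


# Pecam1: protein counts from the text; RNA values are hypothetical stand-ins
# for "similar" transcript levels in ES and TS cells.
pecam1_predicted = rna_predicts_protein(rna_a=1000.0, rna_b=900.0, prot_a=34, prot_b=0)
print(pecam1_predicted)            # False: RNA differs by less than 2-fold

# Fractions reported in the text, recomputed as percentages.
print(round(100 * 89 / 143))       # 62
print(round(100 * 21 / 178))       # 12
```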
Cell-Surface Protein Markers Enable Isolation of Lineage-Specific Stem Cells
Many cell-surface proteins identified were unique to one or another cell line and provide an important set of lineage-specific markers (
Table S2). Comparison between ES, TS and XEN cells revealed 71 cell-surface proteins unique to ES cells, 74 to TS cells and 66 to XEN cells (A). Comparison between ES cells and EpiSC revealed 60 cell-surface proteins unique to ES cells and 256 to EpiSC (A). We sought to use these protein markers to define a cell-surface protein signature for each cell type that would enable unambiguous detection of specific cells during stem cell differentiation and reprogramming. As an initial step, we screened a panel of commercially available antibodies for those that were able to detect the lineage-specific proteins identified. Of 52 membrane proteins examined, 27 revealed the expected cell-specific cell-surface expression pattern (Figures B and B;
Figure S1) and 25 antibodies failed due to absence of signal in all assays tested or were detected as multiple bands by western blot (see
Supplemental Experimental Procedures for antibody details). Of the 27 confirmed proteins, two have been identified previously, Pecam1 in EPI/ES cells and Pdgfrα in PE/XEN cells, which provides further validation of our data (
Plusa et al., 2008; Robson et al., 2001; Vittet et al., 1996). To the best of our knowledge, the remaining 25 confirmed cell-surface proteins have not been described previously for ES, TS, XEN, or EpiSC, thereby revealing stem cell-specific expression patterns.
We tested whether the cell-surface proteins, and the antibodies that bind to them, could be used for flow cytometry. Nine antibodies gave strong and cell-specific signals: Pecam1, Cd81 antigen, and Pvrl3 for ES cells; Cdcp1 and Cd40 antigen for TS cells; Pdgfrα, Dpp4, and Robo2 for XEN cells; Cd40 antigen and Cd47 antigen for EpiSC (Figures C, 3C, and D). Importantly, combinations of these antibodies could separate a mixed population of ES, TS, and XEN cells into individual cell types by flow cytometry (D). These results were confirmed using additional ES, TS, and XEN cell lines (data not shown), demonstrating the robustness of these markers. We have, therefore, greatly expanded our knowledge of stem cell-specific cell-surface proteins and have identified combinations of antibodies that are able to separate a mixed population of cell types into their individual lineages.
Analysis of Cellular Reprogramming, ES Cells to XEN Cells
Monitoring the depletion of progenitor cells and the appearance of a new population is critical to optimization of differentiation and reprogramming protocols. To examine this further, we used our cell-specific protein markers to track changes in cell fate during conversion of ES cells into XEN cells. To achieve this, the PE transcription factor
Sox17 was overexpressed in ES cells, an approach that has been shown previously to promote ES cell to XEN cell conversion (
Niakan et al., 2010; Qu et al., 2008; Shimoda et al., 2007). We used a doxycycline-inducible system in order to study early changes in cell state upon
Sox17 expression. Using a panel of six antibodies (Pecam1, Cd81, and Pvrl3 for ES cells; Dpp4, Pdgfrα, and Robo2 for XEN cells) we observed by flow cytometry a complete conversion in cell phenotype within 8–12 days of
Sox17 induction (E;
Figure S2). The converted cells were indistinguishable by flow cytometry from embryo-derived XEN cells and their change in cell fate was confirmed using gene expression analysis (E;
Figure S2). Interestingly, on day four, approximately one-third of the cells undergoing conversion were negative for both ES cell and XEN cell markers, indicating that an initial step in the differentiation process is the downregulation of ES cell proteins and that this event precedes upregulation of XEN cell proteins. By day eight, the majority (>90%) of cells were positive for XEN cell markers with a minor proportion of negative cells. Together, these data confirm the fidelity of identified cell-specific cell-surface proteins and reveal the temporal and sequential changes in cell state that occur upon transcription factor-mediated lineage conversion.
Cell-Surface Proteins Distinguish ES Cells and EpiSC during Differentiation and Reprogramming
ES cells and EpiSC are pluripotent stem cells that recapitulate the pre- and postimplantation EPI of early mouse embryos, respectively. The two stem cell types differ in terms of gene expression profiles, growth factor requirements, epigenetic status and developmental potency (
Rossant, 2008). Better understanding of ES cells and EpiSC is important for identifying how pluripotency is regulated and may also provide clues to explain the differences between human and mouse ES cells, with the former being more akin to EpiSC. Mouse ES cells and EpiSC can be interconverted by alteration of culture conditions augmented by forced expression of key transcription factors such as
Nanog and
Klf4 (
Bao et al., 2009; Greber et al., 2010; Guo and Smith, 2010; Guo et al., 2009; Hall et al., 2009a; Hanna et al., 2010; Silva et al., 2009; Theunissen et al., 2011). However, no cell-surface markers have been shown to functionally isolate the two stem cell types from each other and instead previous reports have relied on transgene expression or cell morphology. Applying our proteomic data set to this deficit, we sought to identify cell-surface proteins that could distinguish between ES cells and EpiSC as this would allow unambiguous identification and quantification during the process of cell conversion.
Our proteomic analysis and subsequent validation by antibodies identified nine cell-specific membrane proteins that are expressed by either ES cells or EpiSC: Pecam1, Pvrl3, and Cd81 antigen for ES cells; Notch3, Cd40 antigen, Cdh10, Sirpa, Cd47 antigen, and Cdh2 for EpiSC (A and 3B).
We applied established cell culture conditions to drive the conversion of ES cells into EpiSC (
Guo et al., 2009; Zhang et al., 2010). Changes in cell fate during this process were monitored by flow cytometric analysis of Pecam1, Cd81, and Cd40. The flow analysis revealed a progressive change in cell phenotype, whereby ~75% of cells had downregulated ES cell markers by day two (C). By day five, cells were indistinguishable from embryo-derived EpiSC by flow cytometry (C). The cells could be maintained in EpiSC culture conditions and revealed a gene expression profile highly similar to EpiSC (C), thereby confirming successful cell conversion. These experiments also provided important validation of the identified protein markers and their suitability for analyzing ES cell to EpiSC differentiation.
EpiSC to ES cell reprogramming is an inefficient process (~1% in published studies) (
Guo et al., 2009) and is therefore dependent on the accurate detection and isolation of reprogrammed cells. We transferred
Nanog-expressing EpiSC into stringent ES cell conditions (termed 2i/LIF) and used flow cytometry to detect the appearance of reprogrammed ES cells. We found that an antibody combination of Pecam1 together with Cd47 or Cd40 provided the most robust readout. Reprogrammed cells (defined here as Pecam1 positive and Cd47 negative) were detected on day nine and this population increased to ~1%–5% on day 13 (D). Each cell population was isolated by flow cytometry and their gene expression profile was analyzed using qRT-PCR. Reprogrammed cells showed expression of ES cell factors
Esrrb,
Klf2, and
Fbxo15 at similar levels to embryo-derived ES cells and had downregulated EpiSC factors
Fgf5,
Cer1, and
T, suggesting successful reversion (E). To test this further, we used flow cytometry to purify reprogrammed cells and transferred the cells into 2i/LIF conditions. The reprogrammed cells formed compact ES cell-like colonies, which were positive for alkaline phosphatase activity and expressed the ES cell factor Klf4, thereby confirming their cellular identity (F). In contrast, EpiSC that failed to reprogram (defined here as Pecam1 negative and Cd47 positive) did not upregulate ES cell gene expression profiles or form alkaline phosphatase positive colonies in 2i/LIF (E and 3F). Instead, these cells showed upregulation of neural markers
Nestin and
Pax6 (E), which is consistent with a previous study that showed neural induction after EpiSC treatment with FGF inhibitors (
Greber et al., 2010). Lastly, we examined whether the cells had undergone epigenetic reprogramming by examining the methylation status of the
Dppa3 (also known as
Stella) promoter region, which is highly methylated in EpiSC and unmethylated in ES cells (
Hayashi et al., 2008). Bisulphite sequencing revealed that reprogrammed cells isolated by flow cytometry on day 12 had an unmethylated
Dppa3 promoter, whereas EpiSC that failed to reprogram remained fully methylated (F). The reprogrammed cells, therefore, share molecular features with ES cells and not with EpiSC. Taken together, these studies have identified a panel of cell-surface protein markers that are able to distinguish ES cells and EpiSC during differentiation and reprogramming. These results now enable the accurate detection and isolation of specific cell populations without the need to use transgenic reporter cell lines.
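The two populations defined above (reprogrammed: Pecam1 positive and Cd47 negative; failed to reprogram: Pecam1 negative and Cd47 positive) amount to a two-marker gate. The following is a minimal sketch of that logic only. The gate thresholds, the event intensities, and all names are hypothetical; in practice gates are set from stained controls on the cytometer.

```python
from collections import Counter


def classify_cell(pecam1: float, cd47: float,
                  pecam1_gate: float = 1000.0, cd47_gate: float = 1000.0) -> str:
    """Assign one event to a gate from two fluorescence intensities.

    Gate values are placeholders, not values from the study.
    """
    pecam1_pos = pecam1 >= pecam1_gate
    cd47_pos = cd47 >= cd47_gate
    if pecam1_pos and not cd47_pos:
        return "reprogrammed"        # Pecam1 positive, Cd47 negative
    if cd47_pos and not pecam1_pos:
        return "not_reprogrammed"    # Pecam1 negative, Cd47 positive
    return "ungated"                 # double positive or double negative


# Hypothetical (Pecam1, Cd47) intensities for five events.
events = [(5200.0, 150.0), (80.0, 7400.0), (60.0, 9100.0), (3000.0, 4000.0), (40.0, 30.0)]
counts = Counter(classify_cell(p, c) for p, c in events)
fraction_reprogrammed = counts["reprogrammed"] / len(events)
print(counts, fraction_reprogrammed)
```

Double-positive and double-negative events are kept in a separate class rather than forced into either population, which mirrors the text's definition of the two sorted populations by opposite marker states.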
Identified Cell-Surface Proteins Are Expressed in Lineage-Appropriate Manner In Vivo
Better understanding of the molecular determinants of cell fate decisions and the precise timing of lineage restriction during early embryo development is essential for effective use of stem cells. Progress toward understanding these issues is contingent on the ability to prospectively isolate and characterize each cell lineage directly from blastocysts; however, this remains a major technical challenge. Our panel of validated ES, TS, and XEN cell-surface proteins and antibodies presents an opportunity to establish conditions that could enable these new approaches.
We investigated whether the identified cell-surface proteins were expressed by their in vivo tissue of origin. Embryos were examined by immunofluorescence and costained with known lineage markers Nanog, Oct4, Cdx2, and Gata6 to verify the identity of each cell type (A). In embryonic day E4.5 blastocysts, ES cell markers Pecam1, Cd81, Plxna4, and Pvrl3 localized to the cell surface of EPI with no signal detected in PE or TE (A). XEN cell proteins Pdgfrα and Dpp4 were restricted to PE with no expression detected in EPI or TE, and TS cell proteins Cdcp1, Ggt1, and Scarb1 localized specifically to TE (A). Curiously, Robo2 and Cd40 were not detected at this stage of development (A). We therefore examined E5.5 embryos and found that the XEN cell protein Robo2 was expressed by parietal endoderm cells, which are a specific cell type derived from PE (B). In addition, the TS cell protein Cd40 was detected throughout the trophoblast of E5.5 embryos, thereby revealing a strong stage-specific expression pattern (B). Cd40 is also a protein marker of EpiSC, and consistent with this, we detected cell-surface expression of Cd40 in EPI at E5.5 (B). Further examination of additional EpiSC protein markers revealed Sirpa, Notch3, Cdh2, and Cd47 localized specifically to EPI at E5.5 (B). In contrast, ES cell proteins Pecam1, Plxna4, and Pvrl3 were not detected in E5.5 embryos (data not shown), confirming that the stage-specific fidelity of protein markers that are able to distinguish between ES cells and EpiSC in vitro is also maintained in vivo. Overall, the results validate our approach of using cell lines as models for identifying proteins in cell types that are not directly amenable to proteomic studies, such as tissues in the early embryo.
Sixteen proteins that we identified using the stem cell lines were expressed by the expected embryo lineage, thus revealing previously unappreciated expression patterns in the embryo and also providing potential cell-surface markers for prospective cell isolation.
Prospective Isolation of Lineage-Specific Cells Directly from Blastocysts by Flow Cytometry
We next examined whether it is possible to sort E4.5 blastocysts into separate lineages using the protein/antibody combinations identified (A). To achieve this, there were several significant technical hurdles to overcome. We first determined conditions that could dissociate blastocysts into single cells while maintaining cell viability (see
Experimental Procedures). Batches of 30–50 embryos were processed per experiment and ~25% of cells were recovered after single cell dissociation (~5–15 cells from each blastocyst). Cells were labeled with antibodies and subjected to flow cytometry. Cell viability was ~70% (based on propidium iodide staining) and ~30% of cells were recovered after flow cytometry. An unbiased computational analysis of the flow cytometry data defined three distinct cell populations from blastocysts, based upon the fluorescent intensity of each antibody (B). We noticed that the proportion of TE cells was reduced from ~75% in the blastocyst to ~15% of cells after flow cytometry, with the remaining cells comprising equal proportions of EPI and PE (
Figure S3A). The reduction in TE number was due to the difficulty in obtaining single viable cells; however, despite the lower numbers, sufficient TE cells were obtained for analysis in each flow cytometry experiment. These data indicate that each cell lineage had been successfully isolated, thereby representing a significant advance in our ability to analyze specific cell types within the early embryo.
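The recovery figures above imply a rough cell yield per experiment, which can be worked through as follows. This is back-of-the-envelope arithmetic, not a calculation reported in the study: it assumes the ~30% flow cytometry recovery applies to the dissociated cells (~5–15 per blastocyst) and that the sorted cells are ~15% TE with the remainder split equally between EPI and PE. The function and variable names are hypothetical.

```python
def expected_sorted_cells(n_embryos: int, cells_per_embryo: int,
                          sort_recovery: float = 0.30) -> float:
    """Approximate number of cells recovered after sorting one batch."""
    dissociated = n_embryos * cells_per_embryo   # single cells after dissociation
    return dissociated * sort_recovery


# Assumed post-sort composition: ~15% TE, remainder split equally.
composition = {"TE": 0.15, "EPI": 0.425, "PE": 0.425}

for embryos, per_embryo in [(30, 5), (50, 15)]:   # low and high ends of the ranges
    total = expected_sorted_cells(embryos, per_embryo)
    by_lineage = {lineage: round(total * frac, 1) for lineage, frac in composition.items()}
    print(embryos, per_embryo, round(total), by_lineage)
```

Under these assumptions a batch yields on the order of tens to a few hundred sorted cells, of which only a small number are TE.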
To confirm the lineage identity of each cell population, we sorted E4.5 blastocysts by flow cytometry and subjected individual cells within each population to quantitative gene expression analysis using the BioMark Fluidigm System. Principal component analysis revealed that Pecam1-positive (n = 25), Pdgfrα-positive (n = 23), and Cdcp1-positive (n = 15) cells formed three distinct clusters and each cell type could be unambiguously identified based upon expression levels of known EPI, PE, and TE genes (C and 5D;
Figure S3B). The clear separation of lineage-specific transcription factor expression suggests that each cell lineage is fully segregated in blastocysts at E4.5. These data also provide an estimate of the error rate during cell sorting. One EPI cell was falsely allocated into the PE population, and one TE cell was falsely sorted into the EPI population, resulting in an error rate of ~4%. An alternative combination of antibodies, including Cd81 and Dpp4, showed a similar trend in lineage-specific gene expression levels but the cell populations were less distinct (
Figure S3C).
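The text does not state which denominator underlies the sorting error rate, so the sketch below simply computes it two ways from the reported cell numbers: pooled over all analyzed cells, and separately for each sorted population. The dictionary keys and variable names are illustrative, and neither reading is claimed to be the authors' calculation.

```python
# Cells analyzed per sorted population (from the text).
sorted_cells = {"EPI (Pecam1+)": 25, "PE (Pdgfra+)": 23, "TE (Cdcp1+)": 15}

# Population that each misallocated cell was sorted INTO (from the text):
# one EPI cell in the PE population, one TE cell in the EPI population.
misallocated_into = {"PE (Pdgfra+)": 1, "EPI (Pecam1+)": 1}

total_cells = sum(sorted_cells.values())
pooled_rate = sum(misallocated_into.values()) / total_cells
per_population_rate = {
    population: misallocated_into.get(population, 0) / n
    for population, n in sorted_cells.items()
}

print(total_cells, round(100 * pooled_rate, 1))
print({population: round(100 * rate, 1) for population, rate in per_population_rate.items()})
```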
Lastly, we assessed whether cells isolated from E4.5 blastocysts by flow cytometry were viable, as this would enable sorted cells to undergo functional assays. To test this, each sorted cell population was transferred separately into ES, TS, and XEN cell derivation conditions. Importantly, cells isolated from all three lineages remained viable after 96 hr in culture. Differences in stem cell derivation efficiency between each isolated population also provide insight into the lineage restriction of each blastocyst cell type. ES cell colonies emerged from the Pecam1 population (EPI cells; efficiency of 19%) but no ES cell colonies developed from Cdcp1 (TE cells) or Pdgfrα (PE cells) populations (E). Conversely, numerous XEN cell colonies emerged from the Pdgfrα population (efficiency of 17%), but only one XEN cell colony from Pecam1 cells (efficiency of 1.6%) and none from Cdcp1 cells (E). From Cdcp1-positive cells, we obtained one TS cell colony (efficiency of 1%), whereas no TS cells emerged from Pecam1 or Pdgfrα-positive cells (E). The low derivation efficiency of TS colonies from Cdcp1-positive cells is not unexpected, as even established TS cell lines have lower clonal efficiency than ES and XEN cells (data not shown). These data confirm that cells remain viable after embryo dissociation and flow cytometry, thereby enabling functional studies to be applied. Furthermore, each cell type only gave rise to the appropriate stem cell lineage, indicating that EPI, PE, and TE are lineage restricted in E4.5 embryos even when transferred into conditions that are strongly selective for alternate stem cell lineages.