Quantification of protein expression in single cells promises to advance a systems-level understanding of normal development. Using a bottom-up proteomic workflow and multiplexing quantification by tandem mass tags, we recently demonstrated relative quantification between single embryonic cells (blastomeres) in the frog (Xenopus laevis) embryo. In this study, we minimize derivatization steps to enhance analytical sensitivity and use label-free quantification (LFQ) for single Xenopus cells. The technology builds on a custom-designed capillary electrophoresis microflow-electrospray ionization high-resolution mass spectrometry platform and LFQ by MaxLFQ (MaxQuant). By judiciously tailoring performance to peptide separation, ionization, and data-dependent acquisition, we demonstrate an ~75-amol (~11 nM) lower limit of detection and quantification for proteins in complex cell digests. The platform enabled the identification of 438 nonredundant protein groups by measuring 16 ng of protein digest, or <0.2% of the total protein contained in a blastomere in the 16-cell embryo. LFQ intensity was validated as a quantitative proxy for protein abundance. Correlation analysis was performed to compare protein quantities between the embryo and n = 3 different single D11 blastomeres, which are fated to develop into the nervous system. A total of 335 nonredundant protein groups were quantified in union between the single D11 cells, spanning a concentration range of 4 orders of magnitude. LFQ and correlation analysis detected expected proteomic differences between the whole embryo and blastomeres, and also found translational differences between individual D11 cells. LFQ on single cells raises exciting possibilities to study gene expression in other cells and models to help better understand cell processes on a systems biology level.
A key mission of systems cell biology is to reveal the suite of gene expression differences between cells in biological systems (1, 2), particularly at the level of proteins that perform critical cellular functions. From large amounts, usually milligrams of proteins, contemporary liquid chromatography (LC) with high-resolution mass spectrometry (HRMS) is able to characterize the encoded proteome in deep-to-complete coverage and elucidate post-translational modifications (PTMs) (3–6). Multiplexing quantification with new strategies that overcome spectral interferences (e.g. multinotch MS3 analysis (6)) provides new molecular insights into cell development. For example, passive retention has been identified as the mechanism responsible for maintaining the nuclear and cytoplasmic proteomes in the oocyte using the South African clawed frog (Xenopus laevis) (7), a powerful model in cell and developmental biology. Most recently, quantification of 10,000 proteins and 28,000 transcripts revealed molecular dynamics in previously unknown detail during Xenopus embryonic development (8). Further developments in HRMS sensitivity are expected to also enable the study of dynamic molecular changes at the level of individual embryonic cells (blastomeres) to help decipher the spatiotemporal evolution of cell heterogeneity during normal development of the vertebrate embryo.
Multiple analytical solutions extended HRMS sensitivity to single cells; see literature reviews elsewhere, including references (9–14). Microprobe sampling, matrix-assisted laser desorption ionization (13, 15, 16), or LC-HRMS (17) have measured peptides and proteins in a discovery (untargeted) setting in single molluscan, crustacean, frog, or mammalian oocytes (18), eggs (19), and nuclei (7). Additionally, mass cytometry was used to assay 34 targeted proteins at a high, ~1000 cell/s, throughput during erythropoiesis (20), and this platform was recently coupled to laser-ablation to survey 32 targeted proteins between cells in the tumor environment (21). Electrophoresis is a rapidly emerging alternative technology for high-sensitivity proteomics. Electrophoretic separation affords exquisite peak capacity, is compatible with diverse types of molecules, can be hyphenated to ESI-HRMS via various interface designs, and is scalable to single cells (22–27). Using microscale chemical separation to simplify sample chemistry, capillary electrophoresis (CE), and Fourier transform MS was used to target α- and β-globulins in 5–10 human erythrocytes (28) and carbonic anhydrase (23) in lysates diluted to single cells. A microfluidic setting extended these electrophoretic studies to higher throughput, 12 erythrocytes/min (24). Continuous developments in CE separation (29, 30) and late-generation CE nano-flow ESI interfaces (31, 32) (see reviews (33, 34)) accomplished high-sensitivity detection of proteomes (35, 36) and labile PTMs, such as phosphorylation (37). These developments enabled femtogram (zeptomole) limit of detection (38) for protein digests from cell populations. For example, we recently developed metabolomic CE micro-flow electrospray ionization (μESI) HRMS platforms (26) for measuring metabolites (27, 39) and proteins (36) in single Xenopus blastomeres. 
Using microdissection to isolate single blastomeres, tandem mass tags to enable multiplexing quantification, and bottom-up proteomics, CE-μESI-HRMS was able to quantify 130–150 different protein groups (isoforms) in common between multiple single blastomeres in the 16-cell Xenopus embryo. The resulting data uncovered proteomic cell heterogeneity between blastomere types that give rise to nervous tissue, skin, and hindgut of the frog (36).
Here we develop label-free quantification (LFQ) for single Xenopus blastomeres to ask whether translational differences are also quantifiable between blastomeres within the same cell type. We proposed that minimization of sample preparation steps that are prone to protein/peptide losses fosters higher sensitivity to enable LFQ (40) on single cells, albeit at the expense of lower sample throughput without multiplexing. To test this hypothesis, we adopted LFQ to our proteomic CE-μESI-HRMS platform using MaxLFQ. After validation of this approach, ~112 different protein groups were quantified between mid-line dorsal-animal cells (D11) in the 16-cell Xenopus embryo, which are precursors of nervous tissue (brain, spinal cord, and retina) (41). The resulting data suggest comparable expression for the majority of proteins and highly variable expression for 25 different protein groups between the D11 cells. Quantification of cell-to-cell differences within the same cell type demonstrates that proteomic HRMS is sensitive enough to facilitate new types of questions in basic and translational research.
Chemicals, solvents, and TPCK-modified trypsin were obtained in reagent grade or higher purity from Fisher Scientific (Pittsburgh, PA). Standard peptide methionine enkephalin (Met-Enk) was from Sigma Aldrich (St. Louis, MO). Samples were dissolved in LC-MS grade methanol, acetonitrile, water, or formic acid from Fisher. Bare fused silica capillaries (40/110 μm inner/outer diameter) were from Polymicro Technologies (Phoenix, AZ) and used as received.
Steinberg's media (100 and 50%) were prepared as in references (27, 36). For CE separation, the background electrolyte (BGE) was composed of 25% (v/v) acetonitrile containing 1 M formic acid in MS-grade water, which measured pH 2.3. The electrospray sheath liquid contained 50% (v/v) methanol and 0.1% (v/v) formic acid in MS-grade water, which measured pH 3.5. The lysis buffer contained 20 mM Tris-HCl (pH 7.5), 150 mM NaCl, 5 mM EDTA, and 1% sodium dodecyl sulfate.
Adult male and female frogs (Xenopus laevis) were obtained from Nasco (Fort Atkinson, WI) and maintained in a breeding colony. Protocols regarding the maintenance and handling of Xenopus were approved by the George Washington University Institutional Animal Care and Use Committee (IACUC no. A311). Embryos were obtained by in vitro fertilization and dejellied using a 2% cysteine solution following standard protocols (42). Embryos were raised in a Petri dish containing 100% Steinberg's media, and their development was monitored using a stereomicroscope. Upon reaching the 16-cell stage, single embryos were collected and transferred into a centrifuge tube for further processing. To isolate single blastomeres, embryos were collected on the same day from the same wild-type parents, i.e. the same genetic background, to reduce variability and were transferred into an agarose-coated Petri dish containing 50% Steinberg's solution at room temperature. Blastomeres were identified based on pigmentation and reference to established cell fate maps (41) and dissected via an earlier protocol (42). Each isolated single blastomere was immediately transferred into a separate 0.6 ml centrifuge tube for further processing. A total of n = 3 biological replicates were collected for the whole-embryo (different embryos processed) and single-cell (D11 cell types isolated from different embryos) measurements in this study.
To process the embryos and single blastomeres, standard bottom-up proteomic workflows (43) were downscaled to the total protein content of the sample as determined by standard BCA assay (Thermo). The specimens were lysed in the lysis buffer, facilitated by ultrasonic agitation. Proteins were reduced using dithiothreitol (25 mM final), carbamidomethylated using iodoacetamide (50 mM final), and precipitated in chilled acetone (−20 °C) over ~12 h. The precipitate was separated by centrifugation at 10,000 × g for 10 min at 4 °C, the supernatant was discarded, and the protein pellet was washed once for single cell lysates and three times for whole embryo lysates with chilled acetone (−20 °C). The proteins were suspended in 50 mM ammonium bicarbonate and digested using trypsin at ~1:50 protein/enzyme ratio with overnight incubation at 37 °C.
The tryptic digests (sample) resulting from each embryo and cell were measured in a custom-built CE-ESI platform that we previously described in detail (36). In this work, 1 μl of the digest was deposited into a sample-loading microvial, whence ~7–16 nL of material was injected into a separation fused silica capillary (~85 cm in length) filled with the BGE. Peptides were separated at +19 kV (inlet end of the capillary) and ionized in a custom-built co-axial sheath-flow CE-ESI interface operated in the micro-flow regime (CE-μESI) following our earlier designs (25–27, 36, 39). The sheath liquid rate was 1 μl/min. The operational parameters for the CE-ESI mass spectrometer were carefully selected to maintain the electrospray in the cone-jet spraying regime, which is most efficient for ionization (44): the flow rate and composition of the electrospray sheath liquid and the electrospray emitter-to-mass spectrometer orifice distance were controlled. The CE-μESI source was aligned on-axis with the sampling plate of the mass spectrometers to entrain peptide ions into the mass spectrometer.
To test and validate LFQ, peptide ions from whole-embryo lysates were detected using a high-resolution quadrupole orthogonal acceleration time-of-flight (Qq-TOF) mass spectrometer (Impact HD; Bruker Daltonics, Billerica, MA) via data-dependent acquisition (DDA). The mass spectrometer was tuned, operated at 40,000 full width at half maximum (FWHM) resolution, and externally mass-calibrated over the m/z 250–3000 range as instructed by the vendor. Experimental settings included: CE-ESI source potential, 0 V (earth ground); CE-ESI-to-orifice distance, 1 mm; orifice plate potential, −1700 V; survey scan (MS1) data acquisition rate, 2, 4, 8, or 12 Hz; collision-induced dissociation in nitrogen collision gas at 20–35 eV collision energy depending on m/z value and charge state. The tandem MS settings were: 4 Hz for signals lower than 3.2 × 10³ counts and 15 Hz for signals above 10⁵ counts per 1000 summations.
For deeper proteomic coverage, the single-cell and whole-embryo digests were also analyzed using a quadrupole orbitrap linear ion trap (q-OT-LIT) tribrid mass analyzer (Fusion; Thermo Scientific). Conditions of sample injection and peptide separation were identical to the Qq-TOF experiments. In this setup, the CE-μESI emitter was positioned ~5 mm from the orifice of the sampling cone and operated at +2,900 V spray voltage (against earth ground). Peptide ions were identified via data-dependent HRMS2. Survey scans were recorded every 3 s (cycle time) between m/z 350–1600 at ~60,000 FWHM resolution in the Orbitrap analyzer with 100 ms maximum injection time, automatic gain control (AGC) set to 5 × 10⁵ counts (C-trap), and 1 microscan. During tandem MS experiments, the least intense ions with MS2 high-pass threshold of 10³ ion counts were isolated at top speed in the quadrupole with 0.8 Da isolation window and routed for fragmentation via higher-energy collisional dissociation in the multipole cell at 30% normalized collision energy in nitrogen collision gas. The fragments were detected in the ion trap with “rapid” scan rate, 50 ms maximum injection time, AGC set to 1 × 10⁴ counts, and 1 microscan. Fragmented ions were dynamically excluded for 15 s. Ions of any charge state (including undetermined) were allowed for fragmentation in both mass spectrometers.
The CE-μESI-HRMS system was evaluated for performance based on 3–4 technical replicate analyses (same digest analyzed multiple times). CE-μESI-HRMS was validated for LFQ using the Qq-TOF mass spectrometer. Cell-to-cell and cell-to-embryo differences are reported based on measurements in the q-OT-LIT mass spectrometer. To account for innate biological variability, D11 blastomeres were collected in n = 3 biological replicates, each from a different 16-cell embryo. Each whole embryo and blastomere was measured in a randomized order. This study utilized two different mass spectrometers (Qq-TOF and q-OT-LIT) to demonstrate the broad applicability of CE-μESI-HRMS for protein analysis in single blastomeres.
The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (45) partner repository with the data set identifier “PXD004174.” For proteins that were identified based on a single peptide, tandem mass spectra are provided in the SI. To find molecular features (unique m/z versus migration time domains), peptide signals were semi-manually profiled using a custom-written script (26) in Compass DataAnalysis 4.2 (Bruker Daltonics). Primary (raw) data were searched using MaxQuant v188.8.131.52 (46) software executing the Andromeda search engine (47) against the Xenopus laevis NCBI (48) database (downloaded from Xenbase.org (49, 50) on August 19th 2015, containing 34,176 protein entries). The search parameters were: fixed modification, carbamidomethylation; variable modification, methionine oxidation and/or protein N-term acetylation; minimum peptide length, 7 amino acids; search for common contaminants, enabled. The Qq-TOF data were processed according to a protocol established elsewhere (51) using the settings: initial mass deviation for precursor ions, 70 mDa; main search for precursor ions (recalibrated), 6 mDa; fragmentation mass tolerance, <10 ppm. The q-OT-LIT data were processed using the settings: mass deviation for the main search of precursor masses, <4.5 ppm; de novo mass tolerance for tandem mass spectra, <0.25 Da. A p value of less than 0.05 (Student's t test) was chosen to indicate statistical significance. Errors were calculated as standard error of the mean (S.E.M.). Peptide and protein identifications were filtered at <1% false discovery rate (FDR) against a reversed-sequence database. Common contaminants were manually curated and excluded from the list of protein identifications reported here. Protein isoforms were grouped based on the parsimony principle in MaxQuant and are reported as protein groups. Protein interaction networks were generated using Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) ver. 10 (52) with a medium confidence score of 0.4 and K-means clustering at level 3.
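The idea behind filtering identifications at <1% FDR against a reversed-sequence database can be illustrated with a generic target-decoy sketch. This is not a description of MaxQuant internals: the function name `filter_at_fdr`, the score scale, and the simple decoys/targets estimator are assumptions made for illustration only.

```python
def filter_at_fdr(hits, max_fdr=0.01):
    """Generic target-decoy filter (illustrative; not MaxQuant's procedure).

    hits: list of (score, is_decoy) tuples; a higher score is a better match.
    Returns the scores of accepted target hits.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of top-ranked hits to retain
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among hits ranked 1..i is decoys / targets.
        if targets and decoys / targets <= max_fdr:
            keep = i
    return [score for score, is_decoy in ranked[:keep] if not is_decoy]


# Hypothetical scores: 200 targets, a decoy, 10 targets, two decoys, 8 targets.
hits = (
    [(s, False) for s in range(300, 100, -1)]
    + [(100, True)]
    + [(s, False) for s in range(99, 89, -1)]
    + [(89, True), (88, True)]
    + [(s, False) for s in range(87, 79, -1)]
)
accepted = filter_at_fdr(hits)
```

The retained list extends to the deepest rank at which the estimated FDR is still within the limit, so hits below the third decoy are dropped in this example.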
The goal of this study was to compare protein production between single blastomeres of identical cell fate. Our analytical strategy (Fig. 1) builds on a volume-limited CE-ESI system that we recently developed for detecting proteins with ~25–60 amol sensitivity in single Xenopus blastomeres (36). Using multiplexing quantification to enhance peptide abundance via tandem mass tags (TMTs), CE-ESI-HRMS was able to quantify ~130–150 different protein groups in common between different blastomere types in the 16-cell Xenopus laevis embryo. Herein we proposed that simplification of sample-handling steps prone to protein/peptide losses will enhance sensitivity to enable the quantification of single blastomeres in a label-free manner. To test this hypothesis, we eliminated peptide derivatization by TMTs and adopted label-free quantification (LFQ) by a recent approach in MaxQuant, termed MaxLFQ (40). MaxLFQ approximates protein abundance with higher signal-to-noise ratio (S/N) and quantitative accuracy for low-intensity signals by using extracted ion currents (XICs) from MS1 events, which can be acquired at a higher duty cycle and broader m/z range than discrete MS/MS events (e.g. during spectral counting) (40). These aspects were beneficial for the low-abundance peptide signals anticipated in single-cell digests in this work. The current approach, shown in Fig. 1, begins with the preparation of protein digests from single blastomeres followed by CE-μESI-HRMS analysis via DDA. Protein quantification is enhanced by balancing the duty cycle of MS/MS events underlying peptide identification and survey scans (MS1 events) underpinning quantification.
Method development and validation were performed using the 16-cell Xenopus embryo, in which blastomeres are large enough (~250 μm in spherical diameter) to aid manual cell identification and isolation. For example, the midline dorsal-animal blastomere (D11) is readily located in the embryo based on pigmentation and position along the dorsal-ventral axes (Fig. 1). Using established cell biological tools and protocols (42), we can reproducibly identify and dissect single D11 blastomeres from the embryo, which are fated to give rise to the nervous tissue (41). Blastomeres in the 16-cell embryo contain an appreciable amount, ~10 μg, of total protein (36), which aided sensitivity refinement during the instrument development portion of this work. However, ~90% of this protein content is dominated by yolk (vitellogenins) (53), essentially leaving only ~1 μg of yolk-free proteins from each D11 cell. The abundance of yolk proteins in these blastomeres may beneficially minimize adsorptive losses to low-abundance proteins during sample preparation. Although this starting protein amount is already ~100–1000-times less than typically assessed in bottom-up proteomics, we analyzed only a portion, ~1–30 ng proteins, or 0.01–0.3% of the total protein content of the blastomeres to make advances toward measuring the protein content of larger single mammalian cells.
In preparation for bottom-up protein detection, we established trace-level peptide separation and detection using HRMS (Qq-TOF). CE was carried out in bare (unmodified) fused silica capillaries filled with BGE at pH 2.3, selected to suppress nonspecific peptide adsorption on the capillary walls by minimizing the ionization of the surface silanol groups. Enhanced Joule heating and electrolysis because of higher conductivity at lower pH were minimized by addition of organic modifiers to the BGE; 25% acetonitrile containing 1 M formic acid provided optimal performance and was used throughout this study. Furthermore, we performed on-column field-amplified sample stacking to enhance the S/N by suspending protein digests in 50% acetonitrile containing 0.05% acetic acid (versus higher conductivity of the BGE). The CE-ESI-HRMS platform was then tested for quantitative performance with peptides. The under-the-curve peak area was linear over the 3 log-order concentration range tested for Met-Enk (Fig. 2A). A 45-nM solution produced S/N = ~12, which extrapolates to a lower limit of detection of ~11 nM or 75 amol for this peptide (S/N = 3), where S/N was defined as the ratio between the peak height of the signal and the root-mean-square of the noise (see Fig. 2A inset).
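The extrapolation from S/N = ~12 at 45 nM to the limit of detection at S/N = 3 can be reproduced with a short calculation. The sketch below assumes that S/N scales linearly with concentration; the function names are illustrative, and the injected volume is not measured here but back-calculated from the reported 75 amol.

```python
def extrapolate_lod_nM(conc_nM, snr_measured, snr_target=3.0):
    """Concentration expected to give snr_target, assuming S/N is proportional to concentration."""
    return conc_nM * snr_target / snr_measured


def amount_amol(conc_nM, volume_nL):
    """nM x nL = 1e-9 mol/L x 1e-9 L = 1e-18 mol, i.e. attomoles."""
    return conc_nM * volume_nL


lod_nM = extrapolate_lod_nM(45.0, 12.0)   # 45 nM at S/N 12, scaled to S/N 3
implied_volume_nL = 75.0 / lod_nM          # volume implied by the reported 75 amol
print(f"LOD = {lod_nM:.2f} nM; implied injection = {implied_volume_nL:.1f} nL")
```

The implied volume of roughly 7 nL sits at the lower end of the ~7–16 nL injections described in the methods.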
Peptide separation and quantification were robust in complex Xenopus protein digests. A ~20 ng digest from a 16-cell embryo yielded a rich base-peak electropherogram (Fig. 2B), demonstrating that appreciable molecular complexity is detectable despite this limited amount of protein digest. As efficient peptide separation is central to LFQ, we refined the peak capacity of CE-μESI-HRMS by tailoring the BGE composition and the CE separation potential for peptides from vitellogenin b1 (see supplemental Table S1A). These peptides were separated during a 20-min window with theoretical plate numbers between ~130,000–370,000 (see Fig. 2B and supplemental Table S2), comparing favorably to traditional nanoLC. Peptide detection was also quantitative in this complex sample, as demonstrated for the 10-, 13-, and 28-mer peptide ions from Vtgb1 over the 2 log-order concentration range tested (Fig. 2C). The digitizer of the mass spectrometer is expected to extend this range to 4–5 orders of magnitude. Additionally, the reproducibility of peptide separation was tested across multiple days. The Pearson cross-correlation coefficient calculated for ~230 randomly selected peptides was 0.99 across 7 days (supplemental Fig. S1). Combined, these results established sensitive, quantitative, and robust peptide detection with compatibility to limited sample amounts, setting the stage for LFQ for single Xenopus blastomeres.
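For readers unfamiliar with theoretical plate numbers, the sketch below shows one common way to compute them from a Gaussian peak, N = 8 ln 2 (t/w)², where t is the migration time and w the full width at half maximum. The text does not state which formula was used, and the migration time and peak width in the example are hypothetical values chosen only to land in the reported range.

```python
import math


def plate_number(migration_time_s, fwhm_s):
    """Theoretical plates for a Gaussian peak: N = 8 ln 2 * (t / w_half)^2."""
    return 8.0 * math.log(2.0) * (migration_time_s / fwhm_s) ** 2


# Hypothetical peak: 30 min migration time, 8 s width at half maximum.
n_plates = plate_number(30 * 60, 8.0)
print(f"N = {n_plates:,.0f} theoretical plates")
```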
Next, we designed a set of studies to extend LFQ to an increasing number of proteins in single blastomeres. The strategy was twofold. On one hand, we aimed at enhancing peptide sequencing by increasing the success rate of MS/MS events that lead to peptide identifications. On the other hand, improvements in quantification required increasing the MS1 duty cycle for recording XICs, which serve as the basis of quantification in MaxLFQ (40). Enhancing peptide charging by supercharging agents (54, 55), such as dimethyl sulfoxide (10%) and sulfolane (100 mM), was one possibility to aid peptide identifications. However, modification to the electrospray sheath liquid compromised the stability of the Taylor cone, which in turn sacrificed protein identifications (supplemental Table S3).
Data-dependent MS/MS was configured to electrophoretic separation (Fig. 3); salient parameters included the rate of MS/MS and full-MS events as well as the lower S/N threshold to trigger fragmentation. Consecutive peptides migrated through the capillary with at least ~0.256 s difference (Fig. 3A), suggesting that a 4 Hz survey (full-MS or MS1) scan rate was sufficient to recognize peptides. Indeed, faster survey scans lowered the cumulative success of protein identification (supplemental Table S3) because of lower S/N ratios resulting from less spectral averaging (data not shown). The distribution of peptide ion signal intensities was estimated to be normal in the low signal abundance range (Fig. 3A), whereas the higher-intensity domain (> ~5 × 10⁴ counts) tailed because of abundant peptides from vitellogenins and structural proteins. To help identify lower-abundance signals, the MS/MS threshold was lowered (500 counts), even though this was also anticipated to cause the fragmentation of nonpeptide signals, such as common contaminants in ESI, at the cost of MS1 events (for quantification). To counterbalance the duty cycle, MS/MS spectra were collected faster, at 15 Hz, for high-abundance signals (>5 × 10⁴ counts) as they were likely to provide higher-quality fragmentation. For low-abundance signals (<5 × 10³ counts), which were expected to fragment with lower S/N, the MS/MS rate was dynamically adjusted to 4 Hz to boost S/N via spectral averaging. With rapidly increasing peptide identification during the peptide migration window (~20–50 min), particularly in its early section, when more highly charged and smaller peptides (3+ and 4+) eluted (see Fig. 3B), this two-pronged DDA strategy enabled the identification of 74 different protein groups in 16 ng of the embryo digest using the Qq-TOF system (see proteins listed in supplemental Table S1B).
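The intensity-dependent acquisition logic described above can be sketched as a simple rate function. Only the two end points (4 Hz below 5 × 10³ counts, 15 Hz above 5 × 10⁴ counts) and the ~0.256 s peptide spacing come from the text; the log-linear interpolation between the thresholds and all names are assumptions, and this is not the instrument vendor's algorithm.

```python
import math

LOW_COUNTS, HIGH_COUNTS = 5e3, 5e4   # intensity thresholds quoted in the text
SLOW_HZ, FAST_HZ = 4.0, 15.0         # MS/MS rates quoted in the text


def msms_rate_hz(intensity_counts):
    """MS/MS acquisition rate for a precursor of the given intensity.

    Between the two thresholds the rate is interpolated on a log10 intensity
    scale; this interpolation is an assumption made for illustration.
    """
    if intensity_counts <= LOW_COUNTS:
        return SLOW_HZ   # more spectral averaging for weak signals
    if intensity_counts >= HIGH_COUNTS:
        return FAST_HZ   # strong signals fragment well, so acquire faster
    frac = math.log10(intensity_counts / LOW_COUNTS) / math.log10(HIGH_COUNTS / LOW_COUNTS)
    return SLOW_HZ + frac * (FAST_HZ - SLOW_HZ)


def min_survey_rate_hz(peak_spacing_s):
    """Survey scan rate needed to sample each consecutive peptide at least once."""
    return 1.0 / peak_spacing_s


print(msms_rate_hz(1e3), msms_rate_hz(2e4), msms_rate_hz(1e5))
print(min_survey_rate_hz(0.256))  # just under 4 Hz
```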
To validate CE-μESI-HRMS for LFQ, progressively smaller amounts of protein digests were measured on the Qq-TOF MS/MS instrument (Fig. 4A). Between 33–108 nonredundant protein groups were identified from ~2–35 ng of protein digest, corresponding to ~0.02–0.30% of the total protein content in the average blastomere in the 16-cell embryo, respectively. LFQ intensities were correlated with the total amount of protein digest (R² ≥ 0.90), as shown for Vtgb1, peptidylprolyl isomerase (Ppia), and lipovitellin (Lipo 1) in Fig. 4. These results demonstrated linear quantification also at the level of proteins using CE-μESI-HRMS.
To enhance protein quantification, we coupled CE-μESI-HRMS to late-generation q-OT-LIT HRMS. This tribrid instrument design enabled ion trapping to accumulate low-abundance signals prior to MS/MS, higher-resolution mass analysis to resolve spectral interferences (60,000 FWHM used versus 40,000 FWHM by the Qq-TOF earlier), and synchronous MS and MS/MS operation to boost the overall acquisition duty cycle. As a result, protein identifications were enhanced ~fivefold (Fig. 4A). Gene ontology annotation suggests that the identified proteins participate in catalysis, binding, and other cellular biological processes by carrying out metabolic, developmental, or regulatory mechanisms (Fig. 4B).
CE-μESI-HRMS (q-OT-LIT) was applied to compare protein expression between the whole embryo and n = 3 single D11 blastomeres; a left D11 (D11₁) and two right D11 (D11₂ and D11₃) cells were analyzed in technical duplicate. A total of 438 different protein groups were identified in union between the cells (see Table I and proteins listed in supplemental Table S1E). Among these proteins were many known to be involved in brain or spinal cord development, major derivatives of D11 blastomeres (41). For example, chaperonin containing TCP1 subunit 3 (Cct3), creatine kinase-brain (Ckb), malate dehydrogenase 1 (Mdh1), and nucleoside diphosphate kinase 2 (Nme2), which were detected in all three biological replicates, and voltage-dependent ion channel 2 (Vdac2), which was detected in two of the three biological replicates, are known to be expressed in the brain and spinal cord structures of the embryo (49, 50). Of the 438 identified proteins, a total of 335 nonredundant protein groups were quantified in union between the three D11 blastomeres, and 62 proteins were common to all biological replicates (intersection). LFQ intensities suggested that these proteins spanned a concentration range of ~4 orders of magnitude.
The LFQ intensity values allowed us to compare the translational state between the embryo and individual D11 blastomeres (Fig. 5B). In traditional cell-averaging HRMS, protein expression data can be compared based on LFQ intensities after normalization to total protein amounts or reference (e.g. stably expressed) proteins. However, these data are not readily available for single blastomeres: the total protein amount and the size of the cells are difficult or impractical to measure, especially for aspherical and rapidly dividing blastomeres. As an alternative, we proposed a correlation-based, rather than protein abundance-based, model to gauge protein expression. The LFQ intensities for each quantified protein group were log-transformed and plotted between the samples. As the relative concentration between stable proteins (not expressed or not degrading) is independent of blastomere/embryo dimensions or sample amounts analyzed by CE-HRMS, the LFQ scores for these proteins are expected to follow the correlation. In contrast, proteins with changing copy numbers are expected to deviate from the correlation, for example, because of biological events, such as differential gene expression or protein degradation, or because of sampling biases, including nonspecific adsorption on surfaces (vials, pipette tips, etc.). Indeed, high Pearson correlation coefficients (ρ ~ 0.9) calculated based on protein LFQ intensities revealed good technical reproducibility (Fig. 5A, left panel). For a systematic analysis of correlation, we computed the Euclidean distance, d, of each quantified protein from the linear regression curve (see mock sample). The resulting distances essentially served as a “proteomic ruler” with d ≈ 0 indicating stable expression and larger d values signifying variable expression (middle panel). Based on the single-cell technical replicates, a d > 0.5 was selected to mark significant dysregulation in this study. LFQ intensities were reproducible with a mean of ~20% RSD, suggesting biological significance detectable at fold change ≥ 1.5 (right panel).
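The “proteomic ruler” can be sketched in a few lines: log-transform the LFQ intensities of two samples, fit a least-squares line, and compute each protein's distance from it. The sketch assumes log10 transformation and perpendicular (point-to-line) distance, neither of which is specified in the text; the intensities are hypothetical and the function names are illustrative.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ruler_distances(lfq_a, lfq_b):
    """Perpendicular distance of each protein from the regression line in log10 space."""
    xs = [math.log10(v) for v in lfq_a]
    ys = [math.log10(v) for v in lfq_b]
    slope, intercept = fit_line(xs, ys)
    norm = math.hypot(slope, 1.0)
    return [abs(slope * x - y + intercept) / norm for x, y in zip(xs, ys)]


# Hypothetical LFQ intensities: nine proteins with equal abundance in both
# samples and one protein (last entry) that is 100-fold higher in sample B.
sample_a = [10 ** e for e in (4, 5, 6, 7, 8, 4.5, 5.5, 6.5, 7.5)] + [1e6]
sample_b = [10 ** e for e in (4, 5, 6, 7, 8, 4.5, 5.5, 6.5, 7.5)] + [1e8]

d = ruler_distances(sample_a, sample_b)
variable = [i for i, dist in enumerate(d) if dist > 0.5]  # d > 0.5 cutoff from the text
```

Because the outlier also pulls the fitted line slightly, the stable proteins end up with small but nonzero distances, which is why a cutoff well above zero is needed in practice.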
The correlation model was validated based on known molecular cell heterogeneity in the 16-cell embryo. Although expression levels were correlated for the majority of proteins, a ρ < 0.76 indicated graded translational differences between D11 blastomeres (n = 3) and the embryo (see Fig. 5B, left panel). Protein ratios were normally distributed with Gaussian medians of 0.24 between D11₁ versus embryo, 0.52 between D11₂ versus embryo, and 0.85 between D11₃ versus embryo. We ascribe these shifts to a combination of factors, including different protein amounts contained by whole 16-cell embryos and single blastomeres, heterogeneous protein content between different types of blastomeres (36), and likely size differences between D11 blastomeres. After median-normalizing the fold change values, protein expression was readily queried using volcano plots. A d > 0.5 and fold change ≥ 1.5 were taken to screen for significant dysregulation (right panel). Compared with the whole embryo, D11 blastomeres contained higher amounts for 17 protein groups and lower amounts for eight different protein groups (Table II). For example, higher LFQ intensity for actin-b and lower for vitellogenin-a1 in the D11 blastomeres indicated an advanced level of metabolic activity compared with the average embryo. This is not unexpected considering that blastomeres on the dorsal-animal side of the embryo (i.e. D11) are known to complete mitosis/cytokinesis faster than those in the vegetal hemisphere, and vegetal cells are known to contain more yolk platelets. These protein differences also agree with known blastomere mRNA (56–58) and protein heterogeneity (36) along the animal-vegetal and dorsal-ventral axes of the embryo. Additionally, many of the proteins accumulating in D11 blastomeres (e.g. Gnb2l1, Cct3, Ywhaq) have higher expression in nervous tissue (compare with data on Xenbase (49, 50)), which is the fate of the D11 blastomeres (49, 50). Combined, these results validate the utility of LFQ to meaningfully capture expression differences between cells and the embryo, which would have been hidden during whole-embryo measurements.
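The screening step (median-normalize the cell/embryo ratios, then require both d > 0.5 and fold change ≥ 1.5) can be sketched as follows. Treating decreases symmetrically (normalized ratio ≤ 1/1.5) is an assumption, as are the function names; the ratios and distances are hypothetical, with the median chosen to mimic the 0.24 shift reported for D11₁ versus embryo.

```python
from statistics import median


def median_normalize(ratios):
    """Divide every ratio by the median so the typical protein has ratio 1."""
    m = median(ratios)
    return [r / m for r in ratios]


def screen(ratios, distances, d_cut=0.5, fc_cut=1.5):
    """Indices and direction of proteins passing both the distance and fold-change cutoffs."""
    flagged = []
    for i, (fc, d) in enumerate(zip(median_normalize(ratios), distances)):
        if d > d_cut and (fc >= fc_cut or fc <= 1.0 / fc_cut):
            flagged.append((i, "higher" if fc > 1.0 else "lower"))
    return flagged


# Hypothetical cell/embryo LFQ ratios and distances from the regression line.
ratios = [0.24, 0.25, 0.23, 0.24, 0.96, 0.06]
distances = [0.02, 0.03, 0.04, 0.01, 0.90, 0.80]

print(screen(ratios, distances))
```

Normalizing to the median removes the global shift caused by unequal sample amounts, so only proteins that depart from the bulk trend are flagged.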
Last, we asked whether there are also translational differences between the individual D11 cells; a left D11 (D11₁), and two right D11 blastomeres (D11₂ and D11₃) were analyzed (each in technical duplicate) using CE-ESI-HRMS (q-OT-LIT). The analysis revealed good correlation for the majority of the proteins and notable dysregulation for others (see Fig. 5C, left panel). Based on the distribution of the d values, proteins were categorized as stably (95 proteins with 0 ≤ d ≤ 0.05) and variably (e.g. 25 proteins with d ≥ 0.5) expressed between the cells (right panel). Dysregulated proteins are listed in Table II. Among the most stably quantified proteins were products of many traditional “housekeeping genes,” including Eno1, Hadha, Hsp90, and Mdh1, considered to be invariantly expressed across tissues (59). Other proteins, such as Vdac2 and Cofilin-1, were also in this category. Cofilin-1 is an essential molecular player during vertebrate cytokinesis that accumulates in the cleavage furrow of dividing cells (60). This supports the conclusion that the D11 blastomeres were isolated in similar phases of the cell cycle from different embryos. In contrast, proteins with differential expression (see Table II) are linked to Wnt signaling, protein translation and protein folding, cell differentiation, morphology, motility, and cycle control, as well as cytoskeleton organization and energy balance, whereas others have been implicated in the development of nervous tissues, the known fates of D11 blastomeres.
Possible functional associations were predicted between proteins in the D11 blastomeres (Fig. 6). The gene names corresponding to the proteins that were quantified between D11 blastomeres were imported into STRING 10 to predict protein-protein interactions using the Xenopus silurana reference database. With K-means filtering, subnetworks of ribosomal, mitochondrial, cell structural, and metabolic activities can be distinguished in the resulting interaction map (supplemental Fig. S2). This analysis was also repeated for the proteins that exhibited similar levels between the individual D11 blastomeres (d > 0.5 in Fig. 5C, left). Associations are apparent for proteins with similar KEGG functions (Fig. 6A). For example, those involved in metabolism, oxidative phosphorylation, and the ribosome are readily recognized. Furthermore, functional interactions for the 25 most variably quantified proteins indicated associations in protein synthesis and metabolism (Fig. 6B). Many of these proteins or related transcripts are known to accumulate in the neural plate in early-stage embryos and the eye, retina, head, somites, heart, or tail-bud structures in the tadpole (see Xenbase (49, 50)). The observed translational differences between the individual D11 cells would have been lost to averaging in traditional approaches that measure an ensemble of cells. These results underscore the power of single-cell measurements to aid cell and developmental biological investigations.
LFQ by CE-μESI-HRMS is sufficiently sensitive to compare protein expression between single blastomeres in the developing embryo. By simplifying sample preparation, this strategy offers benefits for measuring samples that are precious, rare, or limited in size or when sample losses are of concern. Here we extended LFQ to single blastomeres in the 16-cell Xenopus laevis embryo, a powerful model in cell and developmental biology and health research. The platform accomplished an ~75-amol (~11 nM) lower limit of detection and was compatible with a few tens of nanoliters (nanograms), i.e. ~1000–10,000-times smaller amounts of samples than analyzed in typical bottom-up proteomic workflows. We found that CE separated peptides fast (~4 Hz sequential migration time) with separation efficiency comparable to that of contemporary nanoLC. This agrees with the growing body of investigations that demonstrate the utility of CE for fast and sensitive analysis of limited amounts of proteins (29, 36–38). Data-dependent acquisition was judiciously tailored to CE-based separation to maximize duty cycle between full-scan (MS1) events leading to quantification and MS/MS scans achieving peptide identification. New-generation tribrid HRMS (quadrupole-orbitrap-ion trap) rose to the challenge particularly well by boosting fragmentation success via ion trapping, resolving spectral interferences with higher mass resolution, and importantly, parallelizing MS/MS and survey scans for enhancing the acquisition duty cycle. The approach was able to identify 438 nonredundant protein groups and quantify 335 of these proteins in union between three D11 blastomeres by measuring ~16 ng, or <0.2% of the total protein content from each cell. These results suggest that the presented single-cell analysis technology is applicable to smaller cells and other types of cells, including blastomeres, neurons, and limited tissues.
Proteomics on single cells necessitates new considerations in data evaluation. Because embryonic cells rapidly divide and change their transcriptional and translational activities (8, 57, 61), measurement of cell size or protein content is technologically difficult or impractical. This in turn hinders the normalization of LFQ intensities during the comparison of protein abundances between samples. As an alternative, we implemented correlation analysis to compare protein levels (estimated by LFQ intensities) between blastomeres. Pearson correlation and fold-change values based on the calculated LFQ intensities helped identify stably and variably expressed proteins between blastomeres and the whole embryo. These translational cell-to-cell differences complement known molecular differences between cells in the embryo at the level of transcripts (56–58), proteins (36), and also metabolites (27, 39). These outcomes provide leverage for using correlation analysis to compare gene expression between single cells.
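The correlation step can be sketched in a few lines of Python. This is only an illustrative example under stated assumptions: the intensities are invented, the helper names are hypothetical, and the log10 transformation before computing the Pearson coefficient is an assumption (a common choice for intensities spanning several orders of magnitude), not a detail confirmed in this section.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def log_lfq_correlation(sample_a, sample_b):
    """Correlate log10 LFQ intensities over proteins quantified in both samples."""
    shared = sorted(
        p for p in sample_a
        if p in sample_b and sample_a[p] > 0 and sample_b[p] > 0
    )
    xa = [math.log10(sample_a[p]) for p in shared]
    xb = [math.log10(sample_b[p]) for p in shared]
    return pearson(xa, xb), len(shared)


# Invented intensities: sample_b holds twice the amount of every shared protein.
sample_a = {"P1": 1e4, "P2": 1e5, "P3": 1e6, "P4": 1e7, "P5": 3e5}
sample_b = {"P1": 2e4, "P2": 2e5, "P3": 2e6, "P4": 2e7, "P5": 0.0}

rho, n_shared = log_lfq_correlation(sample_a, sample_b)
print(round(rho, 6), n_shared)
```

Because a uniform difference in total protein amount only shifts all log intensities by a constant, it leaves the correlation coefficient unchanged, which is why correlation remains usable when cell size or protein content cannot be measured for normalization.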
The presented study also detected graded proteomic heterogeneity between different D11 blastomeres from three different embryos. Although protein levels were comparable for a large number of “housekeeping” and cell structural genes, a small number of proteins exhibited significant cell-to-cell variability. Independent studies by immunohistochemistry and in situ hybridization have implicated these proteins in the development of the neural plate, eye, brain, head, and somite structures of the embryo, which are the known fates of D11 blastomeres. Although addressing the origin and biological significance of the observed protein differences goes beyond the scope of this work, detection of translational heterogeneity between cells of the same “cell type” underscores the importance of single-cell measurements. High-sensitivity HRMS, such as the single-cell analysis platform presented here, supports new investigative possibilities in how spatiotemporal heterogeneity in gene expression organizes subcellular organelles (7), cells, tissues, and the whole embryo (8, 19) during normal development and disease.
New and continuing technological advances raise exciting potential to extend proteomic measurements from large Xenopus blastomeres to smaller single cells, including mammalian systems. To this end, we demonstrated the detection/quantification of a considerable number of proteins from ~20 ng protein digests from single blastomeres, approaching the total protein content of larger mammalian cells. Microscale sample preparation can help collect peptides and proteins with high sensitivity, using, for example, patch-clamp electrophysiological tools (62) and microanalysis probes (17, 63). For smaller mammalian cells containing smaller amounts of protein, addition of carrier proteins may be beneficial to minimize adsorptive losses for low-abundance proteins. To assess proteins at trace levels, various CE approaches can help enrich molecules on-column and separate them with increased peak capacity (64). New-generation interfaces that minimize/eliminate sample dilution between CE and ESI may be used to ionize peptides more efficiently (see reviews in Refs. (33, 65–67)); electrokinetically pumped sheath-flow (32, 38) and sheathless (68, 69) interfaces are promising designs in this direction. Based on recent successes in proteome coverage and post-translational analysis by CE (35, 37, 69), we expect these innovative solutions combined with new-generation mass spectrometers capable of ever-increasing sensitivity, speed, and multiplexing to further advance protein identification in the minuscule amounts of proteins afforded by single cells.
In parallel, proteomic measurements should be made faster or parallelized to empower statistics on cells and cell populations. Although we demonstrated an ability by HRMS to measure single blastomeres, the presented workflow would benefit from higher throughput. Multiplexing quantification of the proteome by, e.g. TMTs (36) and imaging HRMS (e.g. MALDI (16) and laser ablation mass cytometry (21)) deliver complementary throughput over separation-based single-cell measurements. Furthermore, lab-on-a-chip devices capable of encapsulating cells in nanoliter droplets are attractive to sort, lyse, and treat thousands to tens of thousands of cells (70), raising a potential to measure a cohort of cells that is sufficiently large to capture the overall behavior of the cell population also at the level of the proteome. We anticipate that continuous developments in cell handling, proteomic processing, and HRMS will open new doors to study systems biology at the level of the basic functional building block of life: the cell.
We thank Leonid Peshkin (Harvard Medical School, Boston, MA) for helpful discussions during the preparation of the manuscript.
Author contributions: P.N. and C.L.-B. designed the research; S.A.M. provided the Xenopus cells and commented on the manuscript; C.L.-B., S.R., and P.N. analyzed the cells; P.N. and C.L.-B. interpreted the data and wrote the manuscript.
* This work was supported by the National Science Foundation Grant DBI-1455474 (to P.N. and S.A.M.) and the George Washington University Department of Chemistry Start-Up Funds (to P.N.) and Columbian College Facilitating Funds (to P.N. and S.A.M.). The content of the presented work was solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
This article contains supplemental material.
1 The abbreviations used are: