We report the results of abundant plasma protein depletion on the analysis of underivatized N-linked glycans derived from plasma proteins by nanoLC Fourier-transform ion cyclotron resonance mass spectrometry. N-linked glycan profiles were compared between plasma samples where the six most abundant plasma proteins were depleted (n=3) through a solid-phase immunoaffinity column and undepleted plasma samples (n=3). Three exogenous glycan standards were spiked into all samples, which allowed for normalization of the N-glycan abundances. The abundances of 20 glycans varying in type, structure, composition, and molecular weight (1,200–3,700 Da) were compared between the two sets of samples. Small fucosylated non-sialylated complex glycans were found to decrease in abundance in the depleted samples (greater than or equal to tenfold) relative to the undepleted samples. Protein depletion was found to marginally affect (less than threefold) the abundance of high mannose, hybrid, and large highly sialylated complex species. The significance of these findings in terms of future biomarker discovery experiments via global glycan profiling is discussed.
The systematic study of the composition, structure, linkages, and function of all carbohydrates either released from or attached to proteins, commonly referred to as glycomics, is a promising and developing field of science. This is mainly due to the ubiquitous nature of this post-translational modification, as it has been estimated that the majority of proteins are glycosylated. In addition, glycosylation is known to regulate several critical biological processes ranging from cellular recognition and signaling [3–5] and pathogen binding [6, 7] to protein folding and function. As a result, aberrant glycosylation has been implicated in various diseases [8, 9] and thus is a promising area for biomarker discovery.
Recent biomarker studies have taken a global glycan approach in which glycans are cleaved from all glycoproteins in complex biological matrices (e.g., plasma), purified, and analyzed by mass spectrometry [10–13]. These studies allow for a more targeted approach to biomarker discovery since the number of species in the glycome is only a small subset of the entire proteome. The goal of these experiments is the identification of a marker indicative of either the onset or progression of disease, specifically for cancers in which a sensitive and specific screening test does not exist and early intervention significantly increases survival rates (e.g., ovarian). Over the last 5 years, several putative glycan markers from plasma glycoproteins have been identified for different types of cancer, including ovarian [10, 11], breast [15, 16], prostate, liver, and esophageal malignancies.
Although the mapping of an entire glycome is noteworthy, it presents a formidable challenge due to the complex biological processing by several enzymes within the Golgi apparatus, leading to an estimated ~1,700 different N-linked glycans attached to plasma proteins. This number is orders of magnitude smaller than the number of species in an analogous shotgun proteomic experiment; however, the number of identified glycans in recent glycan profiling experiments by both MALDI and LC-MS has been significantly fewer than 100 [13, 16, 21]. Some have utilized three elutions from a solid-phase extraction cartridge to pre-fractionate the glycan sample prior to nanoLC to reduce ion suppression and have reportedly identified approximately 200 glycan species. For a large sample set, such as those required for biomarker discovery investigations, pre-fractionation of glycan samples and replicate analysis by nanoLC MS dramatically increase analysis times. A potential alternative to pre-fractionation is abundant protein depletion to reduce the overall protein complexity and dynamic range of plasma and thus limit the ion suppression resulting from abundant glycans derived from the most abundant glycoproteins. Protein depletion has been used to reduce complexity and selectively enhance a number of proteomic experiments [24–27]. In addition, it is possible that a few or all of these glycans are solely derived from the different glycoforms of the abundant proteins in plasma. The answers to these questions are the subject of this study.
To the authors’ knowledge, this is the first semi-quantitative report on the effect of abundant protein depletion on global glycan profiling experiments. Plasma samples (n=3) were depleted of the six most abundant proteins (albumin, immunoglobulin G, transferrin, antitrypsin, haptoglobin, and immunoglobulin A), five of which are known to contain N-glycans. These glycan profiles were then compared to profiles originating from undepleted plasma samples. Internal standards were introduced just prior to solid-phase extraction (SPE) and provided a point of reference to compare glycan abundances between the different sample sets. For the most part, similar glycan profiles were observed between depleted and undepleted plasma samples; however, low-molecular-weight fucosylated glycans significantly decreased in concentration as a result of abundant protein depletion. The abundances of high mannose, hybrid, and highly sialylated structures were less affected by protein depletion.
Peptide-N-glycosidase F (2.5 mU/μL) was purchased from Prozyme (San Leandro, CA). Sodium dodecyl sulfate (SDS), formic acid, trifluoroacetic acid (TFA), ammonium acetate, lacto-N-difucohexaose I (LND), lacto-N-fucopentaose (LNF), and maltoheptaose were purchased from Sigma Aldrich (St. Louis, MO). HPLC-grade acetonitrile and water were obtained from Burdick & Jackson (Muskegon, MI). Pooled human plasma was purchased from Innovative Research (Novi, MI). Graphitized solid-phase extraction cartridges (Part Number 210101) were from Alltech (Deerfield, IL). The multiple affinity removal system (4.6×100 mm MARS 6 column, product number 5188-5333, Agilent Technologies) was utilized to remove 99% of the six most abundant proteins found in human plasma.
Three different aliquots of plasma (50 μL) were diluted 1:4 in buffer A (Agilent Technologies), filtered utilizing a 0.22 μm spin filter (product number 51855-5990, Agilent Technologies) at 14,000×g for 2 min, and then injected onto the MARS 6 column. The flow through, consisting of the ensemble of lower abundant proteins, was collected for each sample. The buffers were then exchanged by ultracentrifugation (250 μL 50 mM Tris–HCl pH 7.5). For plasma samples that did not undergo abundant protein depletion, 50 μL of plasma (n=3) was lyophilized and reconstituted in 250 μL of 50 mM Tris–HCl (pH 7.5). All protein samples were then denatured with heat, SDS, and β-mercaptoethanol, digested with PNGase F for 18 h at 37°C, and purified using nonporous graphitized carbon extraction cartridges. Just prior to SPE and after protein denaturation and digestion, 40 μL of a 10 μM internal standard mixture was added to the sample. The internal standard mixture consisted of LND, LNF, and maltoheptaose. Eluents were lyophilized and reconstituted in 10 μL of HPLC-grade water and combined for a total of 40 μL. Then, 5 μL of sample was diluted with 100 μL of ACN/H2O (80:20) and injected onto the LC column. The procedure used for cleavage, purification, and analysis of N-linked glycans from plasma is described in detail elsewhere.
Liquid chromatography was performed using an Eksigent nanoLC-2D system (Dublin, CA) operating under hydrophilic interaction chromatography (HILIC) conditions. Solvents A and B were 50 mM ammonium acetate (pH=4.5) and acetonitrile, respectively. A vented column configuration was used in these studies and was recently found in our laboratory to provide superior chromatography of peptides when compared to the discontinuous configuration. Four microliters of sample were injected onto a 10-μL loop and flushed out of the loop onto a ~5 cm self-packed IntegraFrit (New Objective, Woburn, MA) trap at 2 μL/min (80% ACN). During the wash phase, the nano-flow pumps were running over a 15-cm self-packed IntegraFrit “dummy” column to provide sufficient backpressure prior to the valve switching. After approximately ten trap volume washes, the ten-port valve (VICI, Houston, TX) switched in-line with the gradient and data collection commenced. Glycans were eluted at 500 nL/min from a 75 μm I.D. PicoFrit capillary column with a 15 μm I.D. tip (New Objective, Woburn, MA) packed in-house (10 cm) with 5 μm (100 Å pore size) TSK-Gel Amide80 stationary phase (Tosoh Biosciences, San Francisco, CA). After 3 min, the gradient ramped to 60% solvent A over 37 min, was held constant for 5 min, and was then brought back to initial conditions (80% B) to re-equilibrate the column for an additional 10 min. The total run time was 55 min. Glycan samples resulting from depleted plasma and undepleted plasma were analyzed in random order to reduce measurement biases. A blank (80/20 ACN:H2O) was run between samples.
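The gradient program described above can be laid out as a simple timeline to confirm that the stated segments add up to the 55-min run. The following is a minimal sketch only: the segment labels and the function name `gradient_timeline` are illustrative, and the return to initial conditions is assumed to be instantaneous because no duration is given for that step.

```python
# Gradient segments taken from the text: (label, duration in minutes).
# The labels are illustrative; the step back to 80% B is assumed instantaneous.
segments = [
    ("initial hold at 80% B", 3),
    ("ramp to 60% solvent A", 37),
    ("hold at 60% solvent A", 5),
    ("re-equilibration at 80% B", 10),
]


def gradient_timeline(segments):
    """Return (label, start, end) tuples with cumulative times in minutes."""
    timeline = []
    start = 0
    for label, duration in segments:
        end = start + duration
        timeline.append((label, start, end))
        start = end
    return timeline


timeline = gradient_timeline(segments)
total_run_time = timeline[-1][2]  # end time of the last segment, in minutes

for label, start, end in timeline:
    print(f"{start:>2}-{end:<2} min  {label}")
print(f"total run time: {total_run_time} min")
```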
Mass spectrometric analyses were performed using a hybrid linear ion trap Fourier-transform ion cyclotron resonance mass spectrometer (Thermo Fisher Scientific, San Jose, CA) equipped with a 7-T superconducting magnet. The instrument was calibrated by following the manufacturer’s standard procedure. Electrospray ionization was achieved by applying a potential of 2 kV to a liquid junction pre-column. The capillary voltage, tube lens voltage, and capillary temperature were set to 42 V, 120 V, and 250°C, respectively. Full scans were performed in the ICR cell at a resolving power of 100,000 FWHM at m/z 400, with an automatic gain control (AGC) target of 1×10^6 and a maximum injection time (IT) of 1 s. Five MS/MS scans were performed in the ion trap per full scan at a normalized collision energy of 26. For MS/MS spectra, the AGC target was set to 1×10^4 with a maximum IT of 400 ms. A dynamic exclusion of 120 s was used to avoid repeated interrogation of abundant peaks.
Glycans were identified by using GlycoWorkbench. Both manual interpretation of tandem MS spectra and SimGlycan 2 version 2.5.5 from Premier Biosoft International were used for structural elucidation. Tandem MS spectra in conjunction with the precursor ion masses were searched against all possible N-linked glycans found attached to plasma glycoproteins in the database. Not all MS/MS spectra were of high quality, and in these cases previous literature reports [16, 18, 30] were used to aid in structural assignments. Xcalibur software version 2.0.5 was used for data analysis and peak integration. Glycan abundances were calculated by normalizing the integrated areas of the extracted ion chromatograms of each plasma glycan to the integrated areas of the three internal standards. This normalization has led to high inter-sample reproducibility, as previously reported.
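The normalization step can be illustrated with a short sketch. The text does not state exactly how the three internal standard areas are combined, so dividing each glycan area by the summed area of the three standards is an assumption made here for illustration, as are the function name `normalize_to_internal_standards` and all peak areas, which are hypothetical values rather than data from this study.

```python
def normalize_to_internal_standards(glycan_areas, standard_areas):
    """Divide each glycan XIC area by the summed internal standard area.

    Using the sum of the standards is an assumption; the source only says
    that glycan areas were normalized to the three internal standards.
    """
    total_standard = sum(standard_areas.values())
    if total_standard <= 0:
        raise ValueError("internal standard areas must sum to a positive value")
    return {name: area / total_standard for name, area in glycan_areas.items()}


# Hypothetical integrated peak areas (arbitrary units), not study data.
standards = {"LND": 2.0e6, "LNF": 3.0e6, "maltoheptaose": 5.0e6}
glycans = {"glycan_A": 1.0e7, "glycan_B": 2.5e6}

normalized = normalize_to_internal_standards(glycans, standards)
for name, value in normalized.items():
    print(f"{name}: {value:.3f}")
```

Because every sample receives the same amount of internal standard, ratios computed this way can be compared between depleted and undepleted samples even when absolute signal intensities differ from run to run.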
Figure 1 summarizes the experimental design used for the depletion of the six most abundant plasma proteins. After depletion and detection at 280 nm, the flow through corresponding to the lower abundant proteins was collected and subjected to N-linked glycan profiling. This procedure was repeated three times. For comparison of N-glycan signatures between depleted and undepleted plasma, three pooled human plasma samples, without abundant protein depletion, were subjected to the same N-glycan release and purification procedure.
The two main questions we wanted to address in this study are as follows: (1) Are the majority of glycans in these global glycan profiling experiments derived from the most abundant plasma proteins? (2) Does depleting the abundant proteins afford more analyte coverage, either by reducing ion suppression or by allowing more thorough interrogation of low-abundance species via data-dependent acquisition once the more abundant glycans are removed?
Figure 2a shows two overlaid base-peak ion chromatograms and compares the absolute abundances resulting from the analysis of N-linked glycans derived from both abundant protein-depleted and undepleted plasma samples. From a qualitative inspection of Fig. 2a, the two chromatograms appear very similar in both retention time and ion abundances. Small non-sialylated glycans eluted earlier, while larger, more hydrophilic glycans were retained longer under HILIC conditions. Certain glycan species considerably decreased in ion abundance after protein depletion or were not even detected, as illustrated in the extracted ion chromatograms shown in Fig. 2b. Two of the three extracted ion chromatograms in Fig. 2b originating from the depleted sample were multiplied by 20× for adequate visualization. These three glycans are quite similar in structure as they are all low molecular weight, bi-antennary, and fucosylated.
Figure 2c displays the similarities in the ion abundance of bi- and tri-antennary sialylated glycan species between the two samples. These species, containing 1, 2, and 3 sialic acids, did not significantly differ between protein-depleted samples and undepleted samples indicating these species were mainly derived from the lower abundant plasma glycoproteins.
Based on high mass measurement accuracy (<3 ppm), approximately 50 different glycan compositions are reproducibly identified from plasma utilizing the described glycan release, purification, and analysis procedure (vide supra). A subset of these compositions (n=19) was chosen, their structures were elucidated, and they were used to evaluate the effect of protein depletion on global glycan profiling. This subset of glycans encompasses the three main classes of N-linked glycans: high mannose, complex, and hybrid structures. In addition, the glycans used for data analysis encompass a relatively broad molecular weight range (1,200–3,700 Da) and include bi-, tri-, and tetra-antennary species as shown in Fig. 3a. It is worth noting that no distinction is made or implied between the α1-3 and α1-6 arms of the core in the structures. Figure 3a displays the average abundance and 95% confidence interval (CI), normalized to the internal standards, of the 19 glycans investigated in these studies derived from plasma and abundant protein-depleted plasma, respectively. The two most abundant glycans in plasma were the same for both the depleted and undepleted plasma samples and consisted of the mono- and di-sialylated bi-antennary species. In both sets of samples, complex structures were the most prevalent, while hybrid and high mannose structures were much less common and less abundant.
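The mass measurement accuracy criterion quoted above is conventionally expressed in parts per million relative to the theoretical m/z. The sketch below shows that arithmetic; the function names and the m/z values are hypothetical and chosen only to make the calculation easy to follow, and the 3 ppm default simply mirrors the threshold stated in the text.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass measurement error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tolerance_ppm=3.0):
    """True if the absolute ppm error is below the given tolerance."""
    return abs(ppm_error(measured_mz, theoretical_mz)) < tolerance_ppm


# Hypothetical m/z values, not measurements from this study.
theoretical = 1000.0000
measured = 1000.0020

error = ppm_error(measured, theoretical)
print(f"mass error: {error:.2f} ppm")
print("accepted" if within_tolerance(measured, theoretical) else "rejected")
```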
Figure 3b summarizes the degree to which each glycan changed in abundance as a result of protein depletion. These data were calculated by taking the log2 of the ratio of the average glycan abundance resulting from the analysis of undepleted plasma and depleted protein plasma samples. As initially assessed in Fig. 2, the majority of glycans that dramatically decreased in concentration as a result of abundant protein depletion (indicated by a greater than twofold change) were low-molecular-weight fucosylated bi-antennary and bisecting GlcNAc structures (glycans #4, 5, 6, 8, 9, 11, 14, and 15). Several of these glycans were the more abundant species (>10% relative abundance) in undepleted plasma (Fig. 3a). The majority of these species were core fucosylated, with some potentially being fucosylated at the nonreducing end, as indicated by a fragment peak in the tandem MS spectra with m/z 512 corresponding to Hex1HexNAc1Fuc1 (data not shown). Interestingly, these types of glycans have been implicated in cancer development, providing moderate to high diagnostic value in several independent studies [11, 13, 16, 31, 32].
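The log2 ratio calculation can be sketched as follows. The direction of the ratio (undepleted divided by depleted, so that a positive value means a decrease after depletion) is an assumption, since the text does not state which average is the numerator; the function name and the replicate values are hypothetical.

```python
import math
from statistics import mean


def log2_ratio(undepleted, depleted):
    """log2 of mean undepleted abundance over mean depleted abundance.

    The numerator/denominator choice is an assumption for illustration.
    """
    return math.log2(mean(undepleted) / mean(depleted))


# Hypothetical normalized abundances for one glycan (n=3 per group).
undepleted_replicates = [8.0, 8.0, 8.0]
depleted_replicates = [2.0, 2.0, 2.0]

value = log2_ratio(undepleted_replicates, depleted_replicates)
fold_change = 2 ** abs(value)
print(f"log2 ratio: {value:.2f} ({fold_change:.1f}-fold)")

# A twofold change corresponds to |log2 ratio| = 1.
exceeds_twofold = abs(value) > 1.0
print("greater than twofold" if exceeds_twofold else "within twofold")
```

On this scale a twofold change maps to an absolute value of 1 and a tenfold change to roughly 3.3, which is why the log2 axis gives increases and decreases of the same magnitude equal visual weight.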
The abundances of small high mannose (glycans #1 and 2), hybrid (#3), and large heavily sialylated species (#16–19) were only marginally affected by protein depletion. This is shown by the small deviation from the dotted line in Fig. 3b, which indicates that these glycans changed in abundance by less than or equal to twofold as a result of protein depletion. These results signify that the majority of these species are derived from the ensemble of lower abundant proteins in plasma. The tetra-sialylated structure with a fucosylated core (glycan #19) was the only species found to increase in abundance as a result of protein depletion.
To answer the second question proposed in the “Results and discussion” section (vide supra), a thorough investigation of the species detected in the different sample sets was conducted. We were unable to identify any glycan species that were present in the depleted protein samples and not in the undepleted sample. Therefore, the removal of abundant plasma proteins did not yield deeper analyte coverage as it does for an analogous proteomics experiment. However, it was observed that the majority of glycan species identified in these global glycan profiling experiments are at least partially derived from the collection of lower abundant proteins. For future biomarker discovery experiments, it will be beneficial to know the species that tend to be derived from one or more of the higher abundant plasma proteins.
In this study, we investigated the effects of protein depletion on global glycan profiling by nanoLC FT-ICR mass spectrometry. The ratio of glycan abundances from depleted and undepleted samples indicated that low-molecular-weight fucosylated glycan structures were mostly derived from one or more of the six most abundant proteins found in plasma. Larger species and heavily sialylated structures were found to be mostly derived from the ensemble of lower abundant proteins. Protein depletion may prove a viable method for future glycan biomarker discovery investigations. Although it would add a sample processing step, it would allow for the removal of several of the more abundant glycans that are derived from proteins whose concentrations are known to vary severalfold in plasma, potentially enhancing glycan biomarker discovery.
The authors gratefully acknowledge the financial support provided by the National Institutes of Health (R33 CA105295), the W.M. Keck Foundation, and North Carolina State University.
The data associated with this manuscript will be available for download via Tranche (https://proteomecommons.org/tranche/) after publication.