This publication describes the development of an automated platform for the study of the plasma glycoproteome. The method consists of targeted depletion in-line with glycoprotein fractionation. A key element of this platform is that it enables high throughput sample processing in a manner that minimizes analytical bias in a clinical sample set. The system, named High Performance Multi Lectin Affinity Chromatography (HP-MLAC), is composed of a serial configuration of depletion columns containing anti-albumin antibody and Protein A, with in-line multi lectin affinity chromatography (M-LAC), which uses a mixture of three lectins: Concanavalin A (ConA), Jacalin (JAC) and Wheat Germ Agglutinin (WGA). We have demonstrated that this platform gives high recoveries for the fractionation of the plasma proteome (≥95%) and excellent stability (over 200 runs). In addition, glycoproteomes isolated using the HP-MLAC platform were shown to be highly reproducible and glycan specific, as demonstrated by re-chromatography of selected fractions and proteomic analysis of the unbound (glycoproteome 1) and bound (glycoproteome 2) fractions.
The field of clinical glycoproteomics has grown rapidly, with effort focused on glycoproteins because of their biological significance and relevance to disease. The plasma glycoproteome has significant clinical value as a source of biomarkers, and most proteins in plasma are glycosylated.1 The complexity of the plasma proteome, its wide dynamic range and glycan heterogeneity have been the major obstacles to the discovery of clinical biomarkers. Nonetheless, there is compelling motivation to continue studying this biofluid.2 There is general agreement that, to detect candidate biomarkers present at moderate to low protein concentration, it is necessary to first remove high abundance proteins.3 The most commonly used methods for simplification of the proteome utilize affinity-based techniques, which are advantageous because of their high selectivity.4 Among these, the so-called immunodepletion columns are widely used. Monoclonal and polyclonal antibodies are a promising choice because of their high specificity for removal of high abundance proteins, but they may not recognize all forms of the proteins.5 Major manufacturers of these immunoaffinity depletion columns are Agilent, GenWay and Sigma. Lectin-based capture reagents have become an important analytical tool in clinical glycoproteomics for plasma and serum. Lectins are a diverse group of carbohydrate-binding proteins, and other studies have shown that the affinity of lectins for sugars is lower than that of the corresponding antibody-antigen interactions. This property is advantageous for affinity chromatography, since elution of adsorbed proteins is more efficient and recoveries of bound proteins are generally robust.
Several laboratories have reported on the use of lectins for clinical samples; differences in lectin binding patterns have been associated with possible differences in glycosylation in disease samples.6, 7, 8 Commonly used lectins, such as Concanavalin A (ConA) or wheat germ agglutinin (WGA), have overlapping affinities for a broad range of glycan structures. It is therefore a challenge to select the appropriate lectin for the affinity selection of a given glycan or glycoprotein and to achieve complete binding of the targeted analyte.9 For these reasons we developed the multi-lectin affinity approach (M-LAC), which uses admixtures of lectins and gives rise to multivalent association with plasma glycoproteins, resulting in better capture of the plasma glycoproteome. In a recent publication, we reported up to a 10-fold enhancement in binding affinities with the multiple lectin format compared to the corresponding individual lectins (ConA, JAC and WGA).10 Furthermore, we have previously reported on the combination of abundant protein depletion with M-LAC and have shown a deeper mining of the plasma glycoproteome.6, 11
The M-LAC technology has been developed into a high performance multi lectin column (HP-MLAC) by covalent immobilization of the three lectins (ConA, WGA and JAC) on a polystyrene-divinylbenzene support matrix (POROS™)12 with good flow and pressure properties, enabling rapid affinity selection of glycoproteins from biological samples by HPLC. The HP-M-LAC column can be easily integrated with abundant protein depletion or other chromatography modes for multidimensional sample fractionation. As is becoming more appreciated in the proteomic community, effective sample preparation is an essential step in comparative proteomics studies. Our interest has therefore been the development of a sample fractionation workflow that minimizes the number of sample handling steps and the resultant losses, ex vivo proteolysis or chemical modifications. We report here that the HP-MLAC approach has been effectively integrated with protein depletion prior to the glycoprotein enrichment step and with in-line sample concentration/desalting before trypsin digestion and LC-ESI-MS analysis. To this end, we report on the development of a robust and reproducible high performance automated platform for plasma fractionation that allows high throughput sample processing for clinical proteomics.
Aldehyde POROS-20 AL (20 µm beads), POROS Protein A (PA) (POROS-PA50 resin), POROS-R1-50 resin and a POROS anti-HSA (2.0 mL) column were purchased from Applied Biosystems (Foster City, CA). Unconjugated lectins, concanavalin A (ConA), jacalin (JAC) and wheat germ agglutinin (WGA), were purchased from Vector Laboratories (Burlingame, CA). Sodium cyanoborohydride, sodium azide, sodium sulfate, sodium chloride, ultrapure tris(hydroxymethyl)aminomethane hydrochloride, glycine, guanidine hydrochloride, dithiothreitol, ammonium bicarbonate, iodoacetamide, manganese chloride, calcium chloride, and Ponceau S were purchased from Sigma (St. Louis, MO). PEEK columns were purchased from Isolation Technology (Milford, MA). Bradford protein assay kit, trifluoroacetic acid (TFA), formic acid, glacial acetic acid, HPLC grade acetonitrile, HPLC grade water, and trypsin were purchased from ThermoFisher Scientific (Waltham, MA). Plasma was purchased from Bioreclamation Inc (Long Island, NY). LC-MS columns (150 mm × 75 µm i.d.) were purchased from New Objective (Woburn, MA), and reversed phase Magic C18 (bead size 5 µm, pore size 300 Å) was purchased from Michrom Bioresources (Auburn, CA). Coomassie blue, SDS-PAGE gels and the Pro-Q Emerald glycoprotein staining kit were purchased from Invitrogen (San Diego, CA).
Protein fractionation was carried out using an automated abundant protein depletion in-line with HP-M-LAC and sample desalting/concentration with a reversed phase column (RP trap). Chromatography was performed using a Prominence 2D HPLC (Shimadzu) equipped with 2 pumps, a sample injection valve and three additional auxiliary valves which allowed operation of multiple columns. Columns were switched on or off line for sample binding, column washing and elution purposes; each valve was controlled independently from the software through contact closures.
The depletion columns used in this study were a Protein A column (4.6 mm × 50 mm) and a POROS anti-HSA column, which reduce the plasma concentration of immunoglobulin G (IgG) and albumin, respectively.
The lectins in the HP-MLAC column (4.6 × 100 mm) were cross-linked to a high pressure support following a previously described protocol.12 The RP trap was packed with POROS-R1 resin. The PA, HP-MLAC and RP trap beads were packed under high pressure into PEEK columns using a self-packing device (Applied Biosystems). The sample (50 µL) was diluted 1:4 with binding buffer and loaded onto the protein depletion columns. The automated platform was run on a Prominence 2D HPLC system (Shimadzu).
The Protein A column was placed in front of the anti-albumin column, and the two columns were connected with a union and placed on the first HPLC valve, followed by the M-LAC column on valve 2 and the RP trap on valve 3. The configuration of this process is shown in Fig. 1. The sample was introduced to the columns at a flow rate of 0.5 mL/min, and 13.75 mL of binding buffer was passed through the depletion columns, which gave enough volume to transfer the depleted plasma onto the HP-MLAC column. The HP-MLAC column was then taken offline and the protein depletion columns were eluted with 100 mM glycine, pH 2.5, at 5.0 mL/min; the fraction was collected and its protein concentration was measured using the Bradford assay.
The protein depletion columns were then washed with 5 column volumes (CV) of neutralization buffer (0.25 M Tris, 1 M NaCl, pH 7.5) at 5.0 mL/min and equilibrated with 5 CV of binding buffer at 5.0 mL/min. Proteins with no affinity for the HP-M-LAC column (unbound HP-M-LAC fraction) were washed off with 5 CV of binding buffer at 5.0 mL/min and directly captured by the RP trap, which had previously been equilibrated with 5 CV of aqueous buffer (0.1% TFA in 5% acetonitrile, 95% water). The protein was then eluted with 70% organic phase (0.085% TFA in acetonitrile) at 5 mL/min and the column was immediately re-equilibrated with the aqueous phase. The protein peak from the RP trap was collected, immediately diluted with water to bring the concentration of acetonitrile to 30%, and neutralized with 20 µL of 1 M Tris buffer, pH 9.0. The enriched glycoproteins bound to the HP-M-LAC column were then eluted with 5 CV of elution buffer (100 mM acetic acid, pH 4.0) at 5.0 mL/min and again captured on the RP trap; the trap was washed and the glycoproteins were eluted, diluted and neutralized as described above for the unbound HP-M-LAC fraction. The volume of the unbound and bound HP-M-LAC fractions eluted from the RP trap was reduced to 100 µL using vacuum centrifugation.
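As a rough planning aid, the loading and wash steps described above can be converted into volumes and run times. The following is a minimal sketch that assumes one column volume equals the geometric volume of the empty column tube, which overestimates the liquid volume of a packed bed; the function names are illustrative and are not part of the HPLC control software.

```python
import math


def column_volume_ml(inner_diameter_mm: float, length_mm: float) -> float:
    """Geometric volume of a cylindrical column in mL (1 mL = 1000 mm^3)."""
    radius_mm = inner_diameter_mm / 2.0
    return math.pi * radius_mm ** 2 * length_mm / 1000.0


def step_time_min(volume_ml: float, flow_ml_per_min: float) -> float:
    """Time needed to deliver a volume at a constant flow rate."""
    return volume_ml / flow_ml_per_min


# HP-MLAC column dimensions given in the text: 4.6 x 100 mm.
mlac_cv = column_volume_ml(4.6, 100.0)

# 5 CV wash at 5.0 mL/min, as used to wash off the unbound fraction.
wash_volume = 5 * mlac_cv
wash_time = step_time_min(wash_volume, 5.0)

# Sample transfer: 13.75 mL of binding buffer at 0.5 mL/min.
load_time = step_time_min(13.75, 0.5)

print(f"HP-MLAC column volume (geometric): {mlac_cv:.2f} mL")
print(f"5 CV wash: {wash_volume:.2f} mL in {wash_time:.2f} min")
print(f"Sample transfer: {load_time:.1f} min")
```

Under these assumptions the slow sample transfer step, not the washes, dominates the cycle time.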
Total protein concentrations were measured using the Bradford13 protein assay as per the manufacturer's instructions. Protein recoveries for both the depletion and the M-LAC fractionation steps were determined.
To validate the workflow of the HP-MLAC platform, repetitive runs were performed. The unbound and bound HP-MLAC samples were run on 1D SDS-PAGE and stained with Coomassie blue dye to visualize total proteins; glycoproteins were stained with a Schiff-base reagent according to the manufacturer's instructions (Pro-Q Emerald 300 kit, Invitrogen). Fractions were subjected to in-solution trypsin digestion and LC-MS analysis using previously described methods.6 Briefly, proteins were denatured with 6 M guanidine. The samples were reduced with 5 mM TCEP for 15 min at room temperature and alkylated with 15 mM iodoacetamide for 25 min at room temperature in the dark. The reaction was then quenched by addition of another aliquot of 5 mM DTT. After the samples were diluted with 100 mM ammonium bicarbonate, pH 8.0, to bring the guanidine-HCl concentration down to 1.2 M, trypsin (1:40 w/w) was added and the samples were incubated for 16 h at 37°C.
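The dilution and enzyme amounts in the digestion protocol above follow from simple proportionality. The sketch below assumes the sample is at 6 M guanidine when diluted, ignores the small volumes of the reducing and alkylating reagents, and reads 1:40 (w/w) as trypsin to protein; the sample volume and protein amount are hypothetical values chosen only for illustration.

```python
def diluent_volume_ul(sample_ul: float, start_molar: float, target_molar: float) -> float:
    """Volume of diluent to add so that C1 * V1 = C2 * (V1 + V_added)."""
    if not 0 < target_molar <= start_molar:
        raise ValueError("target concentration must be positive and not exceed the start")
    return sample_ul * (start_molar / target_molar - 1.0)


def trypsin_ug(protein_ug: float, protein_per_trypsin: float = 40.0) -> float:
    """Mass of trypsin for a 1:40 (w/w) enzyme-to-protein ratio."""
    return protein_ug / protein_per_trypsin


# Hypothetical sample: 20 uL containing 100 ug of protein in 6 M guanidine.
sample_ul = 20.0
protein_ug = 100.0

buffer_ul = diluent_volume_ul(sample_ul, start_molar=6.0, target_molar=1.2)
enzyme_ug = trypsin_ug(protein_ug)

print(f"Add {buffer_ul:.0f} uL of 100 mM ammonium bicarbonate (5-fold dilution)")
print(f"Add {enzyme_ug:.1f} ug of trypsin")
```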
Prior to LC-MS, the unbound and bound HP-MLAC samples were desalted using reversed phase HPLC. The peptides were separated from the undigested or partially digested proteins using a POROS R1 reversed phase column. Mobile phase A was 0.1% TFA in water, and mobile phase B was 0.085% TFA in HPLC grade acetonitrile. The digested proteins were loaded on the column at 2% solvent B and washed for 3 min to remove salts and other reagents from the trypsin digestion. The bound peptides were eluted with a step gradient: 28% solvent B for 3 min, to collect the peptides to be analyzed by nano-LC-MS/MS, and then 95% solvent B for 5 min, to elute larger peptides and partially digested and/or undigested proteins. The separation was performed at 3.0 mL/min and monitored at both 280 nm and 214 nm. Peptides eluted with 28% B were concentrated down to 50 µL using vacuum concentration.
The nano-LC-MS/MS was performed using an Eksigent system (Dublin, CA) interfaced with an LTQ linear ion trap mass spectrometer (ThermoFisher Scientific). Solvent A was 0.1% (v/v) formic acid in water and solvent B was 0.1% (v/v) formic acid in HPLC grade acetonitrile. Concentrated peptide samples were injected using an autosampler onto a C18 capillary column (150 mm × 75 µm i.d.) packed in-house with Magic C18. The flow rate was 300 nL/min; the gradient was from 5% solvent B to 40% solvent B over 105 min, then from 40% solvent B to 60% solvent B over 20 min, then from 60% solvent B to 90% solvent B over 5 min, and was then held isocratic at 90% solvent B for 15 min. The mass spectrometry method was the same as described previously.6 Briefly, the electrospray conditions were: temperature of the ion transfer tube, 245°C; spray voltage, 2.0 kV; normalized collision energy, 35%. Data dependent MS/MS analysis was carried out using MS acquisition software (Xcalibur 2.0, ThermoFisher Scientific). Each MS full scan was acquired in profile mode in the mass range between m/z 400 and 2000, followed by 7 MS/MS scans of the 7 most intense peaks. Dynamic exclusion was applied for a duration of 2 min.
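The nano-LC gradient described above can be written as a table of breakpoints with linear interpolation between them. This is only an illustrative sketch: the breakpoint times are cumulative sums of the segment durations given in the text, and the function name is hypothetical rather than part of any instrument method file.

```python
# (time in min, % solvent B) breakpoints derived from the segment durations in the text:
# 5 -> 40 % B over 105 min, 40 -> 60 % over 20 min, 60 -> 90 % over 5 min, hold 15 min.
GRADIENT = [(0.0, 5.0), (105.0, 40.0), (125.0, 60.0), (130.0, 90.0), (145.0, 90.0)]


def percent_b(t_min: float) -> float:
    """Solvent B percentage at time t_min, by linear interpolation between breakpoints."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole time range")


total_run_min = GRADIENT[-1][0]
print(f"Total gradient time: {total_run_min:.0f} min")
for t in (0.0, 52.5, 105.0, 115.0, 127.5, 140.0):
    print(f"t = {t:6.1f} min -> {percent_b(t):5.1f} % B")
```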
Protein/peptide identifications were obtained through a database search against a human proteomic database using the Sequest Cluster search engine (ver. 3.0) and stored in CPAS.14 The search was conducted against the human Swiss-Prot protein database (release 52, with 15498 protein sequences). The database consists of normal and reversed protein sequences to facilitate estimation of the false positive rate.15 Trypsin was specified as the digestion enzyme with up to two missed cleavages, and carboxyamidomethylation was designated as a fixed modification of cysteine. To minimize the level of false positive identifications, criteria that would yield an overall confidence of over 95% for peptide identification were established for filtering raw peptide identifications. This was achieved by filtering the Sequest results using the so-called HUPO criteria16 (ΔCn ≥ 0.10 and Xcorr ≥ 1.9, 2.2, and 3.75 for singly, doubly, and triply charged ions, respectively), followed by validation using Peptide Prophet analysis16 with a cutoff of 0.95 peptide probability to eliminate low confidence identifications.
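The filtering logic described above can be sketched in a few lines. This is an illustration only: it reads the score thresholds as minimum values, the record layout and the example hits are hypothetical, and the false positive estimate of twice the reversed hits divided by all accepted hits is one common convention for combined normal/reversed databases, not necessarily the calculation used in this study.

```python
from dataclasses import dataclass

# Minimum Xcorr by precursor charge state, and minimum delta Cn, as stated in the text.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.75}
DELTA_CN_MIN = 0.10
MIN_PROBABILITY = 0.95  # Peptide Prophet cutoff


@dataclass
class PeptideHit:
    peptide: str       # hypothetical placeholder sequence
    charge: int
    xcorr: float
    delta_cn: float
    probability: float
    is_reversed: bool  # True if the match is to a reversed (decoy) sequence


def passes_filters(hit: PeptideHit) -> bool:
    """Apply the charge-dependent Xcorr, delta Cn and probability thresholds."""
    xcorr_min = XCORR_MIN.get(hit.charge)
    if xcorr_min is None:  # charge states without a stated threshold are rejected
        return False
    return (hit.xcorr >= xcorr_min
            and hit.delta_cn >= DELTA_CN_MIN
            and hit.probability >= MIN_PROBABILITY)


def estimated_false_positive_rate(hits: list[PeptideHit]) -> float:
    """Assumed estimator: 2 * reversed hits / all accepted hits."""
    accepted = [h for h in hits if passes_filters(h)]
    if not accepted:
        return 0.0
    reversed_hits = sum(1 for h in accepted if h.is_reversed)
    return 2.0 * reversed_hits / len(accepted)


# Hypothetical hits for illustration only.
hits = [
    PeptideHit("AAK", 2, 2.5, 0.20, 0.99, False),   # accepted
    PeptideHit("BBK", 2, 2.0, 0.20, 0.99, False),   # rejected: Xcorr too low for 2+
    PeptideHit("CCK", 3, 4.0, 0.05, 0.99, False),   # rejected: delta Cn too low
    PeptideHit("DDK", 1, 2.0, 0.15, 0.96, True),    # accepted, reversed sequence
    PeptideHit("EEK", 3, 3.9, 0.30, 0.97, False),   # accepted
    PeptideHit("FFK", 2, 3.0, 0.30, 0.97, False),   # accepted
]
accepted = [h.peptide for h in hits if passes_filters(h)]
print(accepted, estimated_false_positive_rate(hits))
```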
In this work, we have studied the performance of an automated and high throughput proteomics platform optimized for clinical studies. The platform consists of sequential, multidimensional HPLC fractionation of plasma. The crude plasma is diluted with binding buffer (25 mM Tris, 0.5 M NaCl, 1 mM MnCl2, 1 mM CaCl2, pH 7.4, 0.05% sodium azide); this buffer composition is compatible with all the affinity columns in the platform. In the first dimension, albumin and immunoglobulin G were depleted from plasma using the corresponding affinity columns. The sample was then loaded onto the HP-MLAC column, and the unbound fraction was washed through and concentrated/desalted on the reversed phase trap column. The sample was then concentrated using the speed vac apparatus. The bound fraction from the HP-MLAC column was eluted and concentrated on the trap column in the same way. A diagram of the workflow of our approach is shown in Fig. 1.
In this manner, the plasma sample is loaded onto the depletion and HP-MLAC enrichment columns and emerges ready for trypsin digestion, minimizing sample manipulations that could introduce losses of low level proteins. We evaluated the performance of in-line protein depletion and M-LAC fractionation using normal human plasma. Five independent loadings of the human plasma sample were performed on the HP-MLAC platform. The total protein concentration in each fraction (unbound, bound and depleted plasma) was measured using the Bradford protein assay, and Table 1 summarizes the data. On average the recovery was 96% with an SD of < 3, with a major fraction of the plasma proteins being removed in the depletion step (approximately 80%). The two glycoprotein fractions were roughly equivalent (approximately 10% of the load), with an overall SD of ≤ 3 for all fractions. Compared to our previously described multi-lectin (M-LAC) immunodepletion method, this automated platform gives an improved overall recovery (up from 80% with soft gels to over 90%) and a 10-fold increase in sample throughput.
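The recovery figures discussed above are the summed protein mass of all collected fractions expressed as a percentage of the mass loaded, averaged over replicate runs. The sketch below shows that calculation; the masses are hypothetical placeholders and are not the values in Table 1, and the fraction names are chosen here for illustration.

```python
from statistics import mean, stdev


def recovery_percent(loaded_ug: float, fractions_ug: dict[str, float]) -> float:
    """Total protein recovered across fractions as a percentage of the load."""
    if loaded_ug <= 0:
        raise ValueError("loaded amount must be positive")
    return 100.0 * sum(fractions_ug.values()) / loaded_ug


# Hypothetical Bradford results (ug) for three replicate loadings.
runs = [
    {"loaded": 3000.0, "fractions": {"depletion_eluate": 2400.0, "unbound": 290.0, "bound": 280.0}},
    {"loaded": 3000.0, "fractions": {"depletion_eluate": 2350.0, "unbound": 270.0, "bound": 260.0}},
    {"loaded": 3000.0, "fractions": {"depletion_eluate": 2380.0, "unbound": 250.0, "bound": 250.0}},
]

recoveries = [recovery_percent(r["loaded"], r["fractions"]) for r in runs]
print(f"Mean recovery: {mean(recoveries):.1f} %  (SD {stdev(recoveries):.1f}, n = {len(recoveries)})")
```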
As shown previously,6 integrating depletion of abundant plasma proteins with M-LAC fractionation can improve the depth of analysis of the plasma proteome. There are a variety of commercially available affinity capture columns, such as protein A, protein G, protein L, and anti-human serum albumin. There are also a number of multicomponent immunoaffinity matrices that target the most abundant plasma proteins, for instance MARS-7 and MARS-14 (Agilent), IgY12 (GenWay), and top 20 (Sigma), which deplete the top 7, 14, 12, and 20 most abundant proteins, respectively. One concern when using affinity-based depletion strategies is the loss of potentially important biomarkers due to binding to depleted proteins or to non-specific interactions with the affinity column. For instance, HSA is known as a nonspecific binding protein because of its biological role as a carrier protein that assists in the circulation and distribution of other proteins.17 Furthermore, depletion used alone is not sufficient to detect low abundance proteins because of the complexity of plasma samples, and thus further fractionation is required. A successful clinical proteomics study requires an appropriate balance between several constraints, and in this study we focus on the issues of bias, throughput, recovery, reproducibility and depth of analysis. For example, the combination of a depletion column, such as a two-protein (2P) column, with a chromatographic step (reversed phase, strong cation exchange, gel electrophoresis, or IEF separation) will generate multiple fractions. As a result, one has to overcome the issues of high cost, extended length of the study and, most importantly, large sample requirements, sample manipulation, reproducibility, and recoveries. As a solution we developed a method that combines 2P depletion with HP-M-LAC and yields only two fractions per clinical sample.
Our results suggest that 2P depletion combined with M-LAC gives an appropriate balance between two important constraints in a clinical proteomics study, namely throughput and depth of analysis. Moreover, we illustrate the issue of potential bias with the use of affinity-depletion columns and the concern about the stability of the antibodies over time. Fig. 2 shows examples of trend analysis of the leak-through of two proteins following depletion of the top 12 most abundant proteins.
These data were obtained from an independent clinical study, where 42 plasma samples were randomized and depleted using a 12P depletion column (manuscript in preparation). While the amount of protein leakage was small (typically a few %), the trend analysis suggested that some of the depleted proteins, such as α-1 acid glycoprotein 1 and transferrin, showed increased levels later in the study. In addition, there are several other factors that present barriers to widespread use of multi-antibody depletion columns, e.g. the cost of the depletion column ($150 per analysis), limited lifetime (100–150 samples), limited loading capacity (25 to 100 µL) and buffers that are incompatible with other chromatographic steps (e.g. MARS). These considerations encouraged our laboratory to examine the lower cost, less complex depletion strategies described here.
The reproducibility of HP-MLAC was evaluated in triplicate by 1D SDS-PAGE with Coomassie Blue and Schiff-base glyco staining. As seen in Fig. 3, the protein patterns observed across replicates are reproducible. The same pattern of bands was observed with glyco staining (see Fig. 3b for representative results), which indicated that the unbound and bound fractions from the HP-MLAC were predominantly glycosylated. These results suggest that HP-MLAC fractionates the plasma proteome into two glycoproteomes. The composition of glycoproteome 1 (unbound fraction) and glycoproteome 2 (bound fraction) and their glycan specificity will be the subject of a future publication. The absence of detectable amounts of albumin and immunoglobulins in the M-LAC fractions shown in Fig. 3 demonstrated the effectiveness of the depletion columns (run 1 lanes b,c unbound and bound HP-MLAC, run 2 lanes e,f, run 3 lanes h,k respectively).
To demonstrate the reproducibility of individual glycoprotein fractionation on the M-LAC column, we performed both physical (independent plasma fractionations) and analytical replicates. The resulting glycoprotein fractions were digested with trypsin as previously described6 and analyzed by LC-MS/MS. Figure 4 shows the reproducibility of the HP-MLAC platform; the plot gives the correlation of two independent runs processed through the entire platform. As seen in the figure, the correlation coefficients for the two independent runs are 0.994 and 0.957 for the unbound and bound fractions, respectively, showing good reproducibility of the platform. In addition, we have demonstrated complete removal of albumin (as measured by spectral counts in the M-LAC fractions) and consistently high removal of IgG (≥95%). As previously reported,12 we have used acetic acid to elute glycoproteins bound to the M-LAC column, and we therefore decided to monitor the lifetime of the column. We performed an independent run of the same plasma sample three months after the column was first packed and tested in the HP-MLAC platform (approximately 200 runs), and again good reproducibility was observed, with a correlation coefficient of 0.9697 (data not shown). These results demonstrate the stability of the HP-MLAC platform using a mild acid elution step.
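A run-to-run correlation coefficient of the kind reported above can be computed as a Pearson correlation over paired per-protein measurements. The sketch below assumes the comparison is made on spectral counts for proteins identified in both runs, which the text does not state explicitly; the counts and protein labels are hypothetical.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical spectral counts per protein in two independent runs.
run_1 = {"protein_1": 42, "protein_2": 17, "protein_3": 8, "protein_4": 25, "protein_5": 3}
run_2 = {"protein_1": 40, "protein_2": 19, "protein_3": 7, "protein_4": 27, "protein_5": 4}

# Compare only proteins observed in both runs, in a fixed order.
shared = sorted(run_1.keys() & run_2.keys())
r = pearson_r([run_1[p] for p in shared], [run_2[p] for p in shared])
print(f"Pearson r over {len(shared)} shared proteins: {r:.3f}")
```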
To show the specificity of the HP-M-LAC column for the glycan subpopulation of a given glycoprotein, we re-injected the unbound and the bound fractions onto the M-LAC column (see Fig. 5) and demonstrated the reproducibility of the chromatographic process. We also evaluated the specificity of the column by analyzing the levels of individual glycoproteins present in the unbound and bound M-LAC fractions by LC-MS/MS (see methods section). The glycoprotein identification and relative distribution were highly reproducible between the 4 replicates (see Table 2, SD ≤ 6 as measured by spectral counts). Table 2 shows examples of the identified proteins, all of which are known to be glycosylated. Some glycoproteins, such as ceruloplasmin, afamin and AMBP protein, were present in similar amounts in both fractions (unbound and bound). Other proteins were identified only in the bound M-LAC fraction (for instance kininogen, lumican and plasma retinol-binding protein) or only in the unbound fraction (talin). In conclusion, the number of peptides identified per glycoprotein in the unbound and bound fractions was highly consistent over the reproducibility study (Table 2). Furthermore, the demonstrated consistency of the HP-MLAC platform is an important parameter when studying glycosylation changes in disease vs. control samples. Bottom-up proteomics workflows used in biomarker discovery involve multiple interdependent process steps.
The automated sample processing for glycoproteomic analysis presented in this study greatly minimizes such upstream variations and sample losses during sample preparation. Improvements in any of the steps in a proteomics study increase the reliability of the analytical method and have a direct effect on the quality of the data and the depth/protein coverage of the analysis. Moreover, "spectral counting" is becoming widely used as a standard method for protein semi-quantitation; in this strategy the number of peptides identified for a given protein is used as an initial survey to compare abundance differences among the clinical samples. The reproducibility of the analytical measurement is therefore crucial in comparative clinical proteomics, and it requires the run-to-run precision demonstrated with this platform.
The authors thank Dr. Tomas Rejtar, Agnes Rafalko and Arian Bane for helpful discussions and technical support. This work was supported by the National Cancer Institute (U01-CA128427-01). W. S. H. and M. H. disclose that they have a financial interest in current efforts by Northeastern University and PeptiFarma to license the M-LAC technology for biomarker discovery.