Blood plasma proteins with molecular weights greater than approximately 30 kDa are refractory to comprehensive, high-throughput qualitative characterization of microheterogeneity across human populations. Analytical techniques for obtaining high mass resolution for targeted, intact protein characterization and, separately, high sample throughput exist, but efficient means of coupling these assay characteristics remain rather limited. This article discusses the impetus for analyzing intact proteins in a targeted manner across populations and describes the methodology required to couple mass spectrometric immunoassay with electrospray ionization mass spectrometry for the purpose of qualitatively characterizing a prototypical large plasma protein, vitamin D binding protein, across populations.
One of the major goals of proteomic research is to discover and ultimately validate new protein biomarkers of disease. Arguments abound regarding the best forms, techniques, and tools of proteomics to use to accomplish this goal. The U.S. Food and Drug Administration defines a biomarker as, “A characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention” (emphasis supplied). Considering this definition, it is critical to understand that a (protein) biomarker of disease is not just “a protein”—it is something about a protein—generally either a change in concentration, amino acid sequence, and/or qualitative/quantitative changes in posttranslational modification (collectively referred to as microheterogeneity). This suggests the hypothesis that proteins represent potential biomarker platforms and that all three of the above biomarker qualifiers must ultimately be considered in a comprehensive analytical manner for all candidate proteins.
This understanding, in turn, dictates a particular analytical approach to biomarker development—namely, that targeted analysis of intact proteins by MS across populations of samples needs to be done to allow efficient operation without (what amounts to) analytical blindspots to at least two of the three aforementioned categories of protein biomarkers.
Coupled with proper sample preparation, mass spectrometers excel at providing highly detailed information towards whole-protein and peptide characterization. Generation of valuable biological data via detailed characterization of proteins (as distinguished from identification via surrogate peptides) was a major goal envisioned for modern mass spectrometers shortly after the advent of electrospray and matrix-assisted laser desorption/ionization (MALDI) techniques.1,2 With regard to the three major categories of protein biomarkers described above, only MS possesses the capability to track all three categories simultaneously (and nearly comprehensively) at a degree of simplicity and sensitivity compatible with common experimental constraints.
Mass spectra do not come with labels; thus mere acquisition of vast quantities of data does not necessarily constitute a value-added proposition. In fact, the presence of unidentified peaks in a spectrum that do not arise from any specific protein(s) under consideration (as if the spectrum were being interpreted manually), or that could stem from multiple proteins under consideration, generally only serves to confound interpretation of the spectrum, especially considering the vast potential for protein microheterogeneity.3–11 Carefully designed sample preparation steps are therefore crucial to defining how mass spectra can be interpreted and, in turn, their overall information content. (Creative sample preparation often leverages a surprisingly high degree of information toward determination of protein structure,12–16 regardless of the number of mass spectral peaks present.) Considering the potential for microheterogeneity, isolation of individual (intact) proteins (by affinity capture, chromatography, or otherwise) must be done prior to analysis by MS to optimize limits of detection out of complex biological matrices and thus maintain high sensitivity towards all major categories of protein biomarkers: Regardless of the degree of sophisticated data processing applied, great collections of raw, uninterpretable mass spectral peaks cannot, ultimately, effectively serve as meaningful biomarkers.
The precise molecular weights of full-length protein molecules contain both genetic and posttranslational modification information—i.e., both genotypic and protein phenotypic information. As a method, detection and identification of proteins via peptide surrogates lacks this comprehensive facet—allowing microheterogeneities to easily avoid detection. Analysis of intact proteins allows for examination of only a small fraction of the entire “proteome” within a limited time frame, but it yields a comprehensive single-protein analysis; we are left with a clear indication of where we ought to look next. Mass spectral analysis of intact proteins, while disadvantaged in its lack of ability to provide residue-specific information on microheterogeneities in a single experiment, does naturally provide full sequence coverage all the time with constant, near-comprehensive analysis of microheterogeneity and the option of quantitative analysis if desired,10,17–21 fulfilling the analytical need to track all three major categories of protein biomarker.
Different forms of what are considered the same protein oftentimes exist not only within a single person, but with even greater diversity across populations.7 Such variations may well serve as biomarkers if they delineate differences between healthy and diseased populations. A statistically meaningful assessment of these variations within the appropriate populations can only be gained by analyzing many samples, particularly in cases such as the one demonstrated below wherein more than 10 different forms of vitamin D binding protein (DBP) are found within the “normal” population.
Stemming from work on intact protein characterization and quantitation, mass spectrometric immunoassay (MSIA) was developed over a decade ago for the purpose of facilitating high-throughput studies of targeted, intact proteins across large numbers of samples.10,17–21 Since that time, MSIA has been used to successfully identify and verify new protein biomarkers of disease.22 In its original incarnation, MSIA was coupled to MALDI time-of-flight mass spectrometry (MALDI-TOF-MS) as an endpoint analytical technique. Without special efforts, however, MALDI-TOF-MS lacks the capacity to resolve smaller, biologically meaningful mass shifts at m/z ratios greater than about 30,000. This limitation begins to reintroduce analytical blindspots into a biomarker discovery program that is based on the analysis of intact proteins.
By increasing the charge state of intact proteins, ESI-MS, particularly when coupled with mass analyzers with high resolving power, promises to at least begin to overcome the aforementioned resolution limitation of MALDI-TOF-MS. Thus, to begin to enable routine discovery of unique qualitative features of larger proteins as biomarkers of disease, we describe here methodology for coupling MSIA with electrospray ionization MS using vitamin D binding protein (DBP) as a prototypical, high-molecular-weight plasma protein that happens to contain numerous qualitative differences not only between individuals but between allele products within a single person as well.
Polyclonal rabbit anti-human DBP (GC-globulin) antibodies were obtained from DAKO (Cat. No. A0021, Carpinteria, CA). Premixed MES-buffered saline powder packets were from Pierce (Rockford, IL). Isolation of DBP from plasma was carried out with proprietary MSIA pipette tips (MSIA-Tips) from Intrinsic Bioprobes (Tempe, AZ) derivatized with the DBP antibodies via 1,1′-carbonyldiimidazole (CDI) chemistry as described below. EDTA plasma samples were purchased from ProMedDx (Norton, MA). Protein Captrap cartridges for liquid chromatography MS (LC/MS) were obtained from Michrom Bioresources (Auburn, CA). Premade 10-mM Hepes-buffered saline (HBS-N) and 10-mM HBS-N with 3 mM EDTA and 0.005% (v/v) polysorbate 20 (HBS-EP) were purchased from Biacore (Piscataway, NJ). All other chemicals were obtained from Sigma-Aldrich (St. Louis, MO).
As previously described,11 Intrinsic Bioprobes loaded soda lime glass beads into stainless steel annealing molds and baked them into solid yet porous frits. Next, the frits were acid conditioned then treated for 12 h with 10% amino-propyltriethoxysilane. The resulting amine-functionalized frits were then equilibrated in a phosphate buffer followed by exposure to 15 kDa carboxymethyldextran and CDI followed by a final rinse in phosphate buffer to provide a highly functionalized carboxylic acid surface. Intrinsic Bioprobes provided the MSIA-Tips at this stage. Subsequently, the MSIA-Tips were rinsed thoroughly with 0.2 M hydrochloric acid to create free acid carboxyl groups (by eliminating any sodium/potassium salts). The HCl rinse was followed by an acetone rinse and drying under vacuum. MSIA-Tips were then activated with CDI23 by exposing them to a 50 g/L solution of CDI in 1-methyl-2-pyrrolidone (NMP) by continuously pipetting the solution up and down (150 μL/well; 50 μL vol; 450 cycles, or approximately 45 min) with the aid of a Beckman Multimek 96-channel automated pipettor robot. (No special precautions were taken to ensure the dryness of the NMP.) Following exposure to CDI, MSIA-Tips were blotted to remove excess NMP then rinsed twice with NMP (400 μL/well; 100 μL vol; 10 cycles), with blotting after each rinse, to eliminate excess CDI. While still slightly wet with NMP, MSIA-Tips were then immediately exposed to anti-human DBP antibodies in MES buffered saline, pH 4.7 (0.05 g/L; 150 μL/well; 50 μL volume; 50 cycles; 15 repetitions/loops, approximately 50 min total exposure time) to facilitate their covalent attachment to the MSIA-Tips. Finally, MSIA-Tips were rinsed with 1 M ethanolamine, pH 8.5 (400 μL/well; 100 μL volume; 50 cycles) to cap any remaining CDI-preactivated sites, followed by two rinses with HBS-N (400 μL/well; 100 μL volume; 50 cycles). MSIA-Tips were stored in HBS-N at 4°C and used within 3 months of preparation.
When designing experiments that involve affinity extraction, it is important to keep in mind that immobilized affinants (antibodies, binding partners, etc.) do not function by “mopping up” everything presented to them in solution, given enough time. Binding to an immobilized affinity partner is governed by the equilibrium proportion:
$$\mathrm{TipAnt}_{amt} \propto \frac{\mathrm{Tip}_{amt}\,[\mathrm{Ant}]}{K_d}$$
where $\mathrm{TipAnt}_{amt}$ is the amount of protein bound to the affinity tip or sorbent, $\mathrm{Tip}_{amt}$ is the molar quantity of binding sites available on the tip or sorbent, [Ant] is the molar concentration of antigen in solution, and $K_d$ is the equilibrium binding dissociation constant. Thus, to maximize extraction yield, immobilized binding frits or sorbents with high capacity relative to analytical detection limits should be used, together with antibodies with the lowest available $K_d$ values. Finally, the concentration of antigen in solution should be kept as high as possible.
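The dependence of extraction yield on binding capacity, antigen concentration, and Kd can be illustrated numerically. The following is a minimal sketch, assuming simple 1:1 binding at equilibrium and an antigen concentration that is not appreciably depleted by binding, in which case the occupied fraction of sites is [Ant]/(Kd + [Ant]). The function name and all example values (100 pmol of sites, the Kd values, and the concentrations) are hypothetical and are not taken from this work.

```python
def bound_amount(total_sites_mol, antigen_molar, kd_molar):
    """Moles of antigen bound at equilibrium for simple 1:1 binding.

    Assumes the free antigen concentration stays close to antigen_molar,
    i.e. binding does not appreciably deplete the solution.
    """
    if total_sites_mol < 0 or antigen_molar < 0 or kd_molar <= 0:
        raise ValueError("amounts must be non-negative and Kd must be positive")
    occupied_fraction = antigen_molar / (kd_molar + antigen_molar)
    return total_sites_mol * occupied_fraction


if __name__ == "__main__":
    sites = 1e-10  # hypothetical capacity: 100 pmol of binding sites
    for kd in (1e-9, 1e-7):  # hypothetical Kd values (mol/L)
        for conc in (1e-9, 1e-8, 1e-7):  # hypothetical antigen concentrations (mol/L)
            pmol = bound_amount(sites, conc, kd) * 1e12
            print(f"Kd={kd:.0e} M  [Ant]={conc:.0e} M  bound={pmol:.1f} pmol")
```

When [Ant] is much smaller than Kd, the bound amount in this sketch scales as capacity × [Ant] / Kd, which is why a tenfold dilution of the sample costs roughly a tenfold loss in captured protein unless the antibody has a very low Kd.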
For the experiments described here, 125 μL of EDTA plasma was mixed with 1.25 μL of 10% Tween 20 then diluted to 250 μL total volume with HBS. Samples were frozen at −80°C until use. Some plasma samples (as described) were preincubated with calcium by adding a small volume of 1 M CaCl2 (diluted in HBS) to approximately 125 μL plasma to obtain the desired Ca2+ concentration, followed by incubation at 41°C for 2–3 h, then 2-fold dilution with HBS. With the aid of a Beckman Multimek 96-channel automated pipettor robot, antibody prelinked MSIA-Tips were prerinsed (400 μL/well; 150 μL aspirate and dispense cycles; 10 cycles) with HBS then used to extract DBP from individual plasma samples at room temperature (250 μL/well; 85 μL aspirate and dispense cycles; 250 cycles, or about 20 min of exposure time). MSIA-Tips were then ejected from the robot and allowed to sit in their respective plasma samples at room temperature until they were individually washed (by drawing from a fresh reservoir of liquid and dispensing to waste) and eluted as follows: Five cycles of 150 μL of HBS-N, five cycles of 150 μL distilled water, five cycles of 150 μL of 2 M ammonium acetate/acetonitrile (3:1 v/v), and ten cycles of 150 μL distilled water. Elution was accomplished by air-drying the pipette frits then drawing 5 μL of a mixture of 100% formic acid/acetonitrile/distilled water (9/5/1 v/v/v), mixing over the pipette affinity capture frit for 20–30 sec, and dispensing into a 96-conical well polypropylene autosampler tray. Frits were then washed with an additional 5 μL distilled water which was used to dilute the eluted sample. Five microliters was injected into the LC-TOF-MS within 10 min of elution.
A trap-and-elute form of sample concentration/solvent exchange rather than traditional LC was used for these analyses. Five-microliter samples were injected by a Spark Holland Endurance autosampler in microliter pick-up mode and loaded by an Eksigent nanoLC*1D at 10 μL/min (90/10 water/acetonitrile containing 0.1% formic acid, solvent A) onto a protein captrap (Michrom Bioresources, Auburn, CA) configured for unidirectional flow on a 6-port divert valve. After 2 min, the divert valve position was automatically toggled and flow over the cartridge changed to 1 μL/min solvent A (running directly to the ESI inlet), which was immediately ramped over 8 min to 10/90 water/acetonitrile containing 0.1% formic acid. By 10.2 min the run was completed and the flow was returned to 100% solvent A.
The bulk of DBP eluted between 5.5 and 7.5 min into a Bruker MicrOTOF-Q (QTOF) mass spectrometer operating in positive ion, TOF-only mode, acquiring spectra in the m/z range of 50 to 3000. ESI settings for the Agilent G1385A capillary nebulizer ion source were as follows: End Plate Offset −500 V, Capillary −4500 V, Nebulizer nitrogen 2 Bar, Dry Gas nitrogen 3.0 L/min at 225°C. Data were acquired in profile mode at a digitizer sampling rate of 2 GHz. Spectra rate control was by summation at 1 Hz.
Approximately 1.5 min of recorded spectra were averaged across the chromatographic peak apex of DBP elution. The ESI charge-state envelope was deconvoluted with Bruker DataAnalysis v3.4 software to a mass range of 1000 Da on either side of any identified peak.
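To make the idea of a charge-state envelope concrete, the sketch below shows the textbook relationship between a protein's neutral mass and the m/z values of its protonated ions, and how the charge and mass can be recovered from two adjacent peaks. It is an illustration only and is not the algorithm implemented in the DataAnalysis software; the function names and the example mass of 51,203 Da are assumptions chosen for illustration.

```python
PROTON_MASS = 1.007276  # Da, mass of the charge-carrying proton


def mz_for_charge(neutral_mass, z):
    """m/z of an [M + zH]z+ ion for a given neutral mass and charge."""
    return (neutral_mass + z * PROTON_MASS) / z


def mass_from_adjacent_peaks(mz_lower_charge, mz_higher_charge):
    """Recover (z, neutral mass) from two adjacent charge-state peaks.

    mz_lower_charge is the peak carrying z charges (higher m/z);
    mz_higher_charge is its neighbor carrying z + 1 charges (lower m/z).
    """
    z = round((mz_higher_charge - PROTON_MASS) / (mz_lower_charge - mz_higher_charge))
    neutral_mass = z * (mz_lower_charge - PROTON_MASS)
    return z, neutral_mass


if __name__ == "__main__":
    example_mass = 51203.0  # hypothetical neutral mass in Da
    peak_30 = mz_for_charge(example_mass, 30)
    peak_31 = mz_for_charge(example_mass, 31)
    print(peak_30, peak_31)
    print(mass_from_adjacent_peaks(peak_30, peak_31))
```

For a protein of roughly 51 kDa, only ions carrying 18 or more charges fall below m/z 3000, which is why the whole envelope fits within the acquisition range described above.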
Figure 1 demonstrates the degree of structural micro-heterogeneity observed for DBP when a population of greater than 100 people [61 “healthy” and 52 with type 2 diabetes (T2D)] was examined. Several key features of DBP are notable within the mass spectra presented below.
Historically, DBP has been studied as a differentiating marker of inheritance since 1960.24 Three common alleles are known to exist along with many rare variants.25,26 The three common DBP alleles arise from two point mutations: D416E and T420K, generating the GC*1F allele (containing D416 and T420), GC*1S allele (containing E416 and T420), and the GC*2 allele (containing D416 and K420).
In the deconvoluted spectra shown in Figure 1, variant DBP genotypes (for which a gene sequence has been established) are readily determined based on mass (see Figure 1 caption for the calculated masses of the different DBP allele products). As given by the deconvoluted mass spectra, the molecular masses of most of the DBP variants observed fit the sequences of the three major DBP alleles reported in the UniProtKB/Swiss-Prot database, considering that all cysteine residues are involved in disulfide bonds. Results are unambiguous and produced in an average of about 15–20 min per sample (from raw plasma sample to deconvoluted mass spectrum) when run in batch mode. The standard deviation in mass accuracy across the 100 samples analyzed was less than 2 Da surrounding the calculated mass.
The distribution of DBP alleles in the two populations studied is given in Figure 2. These populations consisted of African-American, Caucasian, and Hispanic individuals. For the “healthy” population, n = 122 alleles, and for the T2D population, n = 104 alleles. The data show a statistical excess of the GC*1S allele (and corresponding deficit of the GC*1F allele) in the diabetic population. A chi-squared test with 2 sample donor types and 3 major DBP alleles (2 degrees of freedom, α = 0.01) gives χ2 = 49.6, p < 0.0001, and Cramér’s V = 0.474. (Cramér’s V measures the strength of association between two nominal categorical variables in a contingency table, on a scale from 0 for no association to 1 for complete association.)
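For readers who wish to reproduce this type of test, the following minimal sketch computes the chi-squared statistic, its degrees of freedom, and Cramér’s V for a contingency table using only the Python standard library. The allele counts in the example are hypothetical placeholders, not the counts underlying Figure 2, and the function names are illustrative. The p-value helper relies on the fact that, for exactly 2 degrees of freedom, the chi-squared survival function reduces to exp(−χ2/2).

```python
import math


def chi_squared(table):
    """Return (statistic, degrees of freedom, total count) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, n


def cramers_v(statistic, n, n_rows, n_cols):
    """Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    return math.sqrt(statistic / (n * (min(n_rows, n_cols) - 1)))


def p_value_two_dof(statistic):
    """Upper-tail p-value, valid only for 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


if __name__ == "__main__":
    # HYPOTHETICAL counts (rows: donor type; columns: GC*1F, GC*1S, GC*2).
    counts = [[30, 50, 20],
              [10, 70, 20]]
    stat, dof, n = chi_squared(counts)
    print(stat, dof, cramers_v(stat, n, 2, 3), p_value_two_dof(stat))
```

Applied to the statistic reported above, `p_value_two_dof(49.6)` is many orders of magnitude below 0.0001, consistent with the stated p < 0.0001.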
The mass spectra shown in Figure 1 clearly document a form of DBP that is mass shifted by +657 Da. This higher mass species is readily identified as a form of DBP (and not some other protein) because sample preparation techniques were designed to specifically isolate DBP, and because the mass shift relative to the base peak stays constant as the parent mass changes due to alterations in genotype (see Figure 1B,C,H).
Based on previous investigations described in the literature27–30 and the observed mass shift, this Δm +657 Da modification corresponds to a (NeuAc)1(Gal)1(GalNAc)1 trisaccharide.27,30–33 Most interestingly, the data collected in this study clearly and directly demonstrate that this trisaccharide glycoform does not modify the GC*2 allele product, even in heterozygous individuals possessing a second gene product that is glycosylated (see Figure 1G). The GC*2 gene product lacks this glycoform due to the mutation of the attachment site (T420) to a lysine residue.
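The assignment of the +657 Da shift can be checked with simple arithmetic. The sketch below sums standard average residue masses (monosaccharide mass minus water, as incorporated into a glycan) for the trisaccharide named above. The residue masses are general reference values rather than numbers from this study, and the function and dictionary names are illustrative.

```python
# Average residue masses in Da (monosaccharide minus H2O); general reference values.
RESIDUE_MASS = {
    "NeuAc": 291.2546,   # N-acetylneuraminic acid (sialic acid)
    "Gal": 162.1406,     # galactose (hexose)
    "GalNAc": 203.1925,  # N-acetylgalactosamine (N-acetylhexosamine)
}


def glycan_shift(composition):
    """Mass added to a protein by a glycan with the given residue counts."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


if __name__ == "__main__":
    trisaccharide = glycan_shift({"NeuAc": 1, "Gal": 1, "GalNAc": 1})
    print(round(trisaccharide, 1))  # 656.6
```

The computed value of about 656.6 Da rounds to the observed +657 Da shift, well within the mass accuracy reported for these measurements.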
DBP is susceptible to glycation. Protein glycation is defined as the covalent attachment of glucose to protein amino groups via the Maillard reaction with Amadori rearrangement resulting in a permanent 1-deoxyfructosyl adduct. Direct evidence for DBP glycation in the form of the signature +162 Da mass shift is readily seen in the deconvoluted mass spectra of some samples (e.g., as in Figure 1C). Indirect evidence that DBP is a target for glycation was first established by Jaleel et al. in 2005.34 Details regarding the degree of DBP glycation within the populations studied will be published elsewhere as part of a larger study involving additional proteins.
Several points regarding sample preparation merit mention as they directly impact the results of analyses like those described below.
Plasma sample preparation was optimized to maximize affinity extraction yield by introducing 50 mM CaCl2 into raw plasma followed by incubation at 41°C for 2.5 h prior to dilution with HBS buffer. Figure 3 shows a series of four plasma samples fortified with increasing concentrations of CaCl2. A maximal effect at 50 mM CaCl2 was observed with plasma from two different donors (results from only one donor shown). Mere addition of CaCl2 without heating could not bring about the optimal results shown in Figure 3C,D (data not shown).
Elution of proteins from MSIA-Tips is accomplished by exposure of the MSIA-Tips to an acidic solution containing one-third acetonitrile (by volume). In a comparison between 60% (v/v) formic acid, 0.4% (v/v) trifluoroacetic acid, and 40 mM HCl (all calculated by solving the appropriate equilibrium-based quadratic equation to give a final H3O+ concentration of 40–50 mM and thus pH of 1–2), the formic acid and trifluoroacetic acid solutions provided the best sensitivity. In both cases, if samples were left exposed to the atmosphere for greater than about 20 min, oxidation of DBP would begin to occur (as evidenced by mass shifts); thus, it was necessary to inject samples within about 10 min of elution from the MSIA-Tip. Ultimately, the 60% formic acid solution was chosen for the analysis of the samples discussed here because, in preliminary work, very dilute solutions of isolated protein in trifluoroacetic acid solution tended to give poor ESI-MS signal compared to when trifluoroacetic acid was absent.
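The equilibrium-based quadratic mentioned above can be sketched as follows for a monoprotic weak acid, where x = [H3O+] satisfies x² + Ka·x − Ka·C = 0. This is a rough estimate under stated assumptions: the aqueous pKa of formic acid (3.75), its density (1.22 g/mL), and its molar mass (46.03 g/mol) are general reference values not given in this article, volumes are treated as additive, and the effect of acetonitrile on the dissociation constant is ignored. The function names are illustrative.

```python
import math


def molarity_from_volume_fraction(volume_fraction, density_g_per_ml, molar_mass):
    """Approximate molarity of a neat liquid diluted to a given v/v fraction."""
    return volume_fraction * density_g_per_ml * 1000.0 / molar_mass


def hydronium_molar(acid_molar, ka):
    """Positive root of x^2 + Ka*x - Ka*C = 0 for a monoprotic weak acid."""
    return (-ka + math.sqrt(ka * ka + 4.0 * ka * acid_molar)) / 2.0


if __name__ == "__main__":
    # Assumed reference values for formic acid (not from the article).
    formic_molar = molarity_from_volume_fraction(0.60, 1.22, 46.03)
    formic_ka = 10 ** -3.75
    h3o = hydronium_molar(formic_molar, formic_ka)
    print(f"[H3O+] = {h3o * 1000:.0f} mM, pH = {-math.log10(h3o):.2f}")
```

Under these assumptions the estimate comes out on the order of 50 mM H3O+ (pH between 1 and 2), in line with the range quoted in the text.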
Perhaps the most striking effect on sample preparation that was documented during these studies was the effect of adding CaCl2 and heat to samples prior to affinity-based extraction of DBP (Figure 3). The reason for the dramatic increase in sensitivity remains unclear, but given the fact that most anticoagulants used to collect plasma bind Ca2+, the explanation may have to do with Ca2+ restoring the natural folding patterns and/or binding interactions of DBP and/or proteins with which it comes into contact, thus facilitating DBP’s binding to our immobilized antibodies. It is well known, for example, that serum albumin carries a large portion of circulating calcium. Disruption of albumin’s natural folding patterns and, thereby, binding interactions, could have a dramatic effect on the “availability” of many different proteins for affinity-based extraction.
In the data presented here, the peak at +657 Da from DBP in deconvoluted mass spectra was readily identified as the (NeuAc)1(Gal)1(GalNAc)1 trisaccharide-modified DBP. Without having performed these analyses in a manner that specifically targeted DBP (e.g., if samples had been prepared using a nonaffinity-based, broad specificity solid-phase extraction technique) such an identification (i.e., linking the heavier peak to DBP in identity) would have been difficult if not impossible. This is because the first major line of evidence that the peak represented something related to DBP came from the fact that only DBP should have been detected based upon the sample preparation process. Secondly, because no peaks arising from other proteins were present, it was quickly recognized that the peaks at 51846, 51860, and 51776 Da (Figure 1B,C,H) were related (by a common mass shift) to the unmodified forms of the various DBP allele products. Addition of extra peaks to these spectra due to a less specific sample preparation procedure would likely have confounded interpretation.
In fact, it may not even have been possible to distinguish or recognize the various genotypic forms of DBP at all without having performed a targeted analysis of samples from individual donors.
The data showing an increased frequency of the GC*1S allele (and corresponding deficit of the GC*1F allele) in the diabetic population are in agreement with studies on Polynesian35 and Japanese36 subjects with regard to linkage of DBP genotype with T2D. Interestingly, these data do not agree with studies on the relationship between DBP genotype and T2D in French Caucasians,37 European-Americans,38 and Polish39 populations. Regardless of whether or not an association exists between DBP genotype and T2D, such information simply becomes unavailable unless larger numbers of people are taken into account and analyzed individually.
Studies by Christiansen et al.40 reported substantial difficulty in finding evidence for glycosylation of DBP in a preparation of the protein from a source originating from pooled plasma. Perhaps a source consisting primarily of donors possessing the GC*2 isoform accounted for the difficulty. Regardless, one would not necessarily have been aware of the lack of glycosylation on the GC*2 allele product and the constant +657 Da mass shift based on the genotype-dictated unmodified protein mass without having data originating from the analysis of plasma from multiple, individual donors.
The knowledge that DBP was intact and completely unmodified in any artificial manner is the feature that makes interpretation of the mass spectra presented here so abundantly simple. Barring rare isobaric variations, the masses of intact proteins are certain to provide comprehensive (albeit, not completely detailed) information with regard to the nature of DBP extracted from individuals—which is more than can be guaranteed anytime a protein is broken apart. The data presented here suggest that many posttranslational modifications (PTMs) are far easier to detect while proteins are still intact. Moreover, as shown in Figure 1C, multiple PTMs were readily detected through the analysis of intact DBP. Analyses of intact proteins may not provide perfectly detailed information, but they serve as comprehensive leads, steering subsequent characterization.
The above discussion warrants a comment about analytical resolving power. As mentioned in the beginning of the article, a major impetus for analyzing larger proteins by ESI-TOF-MS was that analytical blind spots begin to reintroduce themselves as instrumental resolving power (relative to the mass of biological variants) begins to wane. This happens with MALDI-TOF-MS above the m/z range of about 30,000. The data presented here demonstrate the ability of ESI-based mass spectrometry to complement the analytical capabilities of MALDI-TOF for the analysis of intact proteins, providing much needed resolving power by employing highly multiply charged ions to “dilute out” initial ion kinetic energy dispersion. Data on DBP genotype, glycation, and arguably glycosylation would likely simply not be available without ESI-based mass spectrometry.
The raw data presented here for the GC*2 allele product show that it lacks the common (NeuAc)1(Gal)1(GalNAc)1 trisaccharide glycoform modification in contrast to other DBP allele products. The GC*2 allele product, however, does not lack glycosylation altogether: Yamamoto et al.28–31,41 have shown that the GC*2 allele product carries a (Gal)1(GalNAc)1 disaccharide which converts to DBP-macrophage activating factor (DBP-MAF) upon exposure to β-galactosidase. (Evidence for this disaccharide-modified form of the GC*2 allele product can be seen as a +365 Da mass shift from the base peak in Panel D of Figure 1.) Other allelic forms of DBP also require exposure to sialidase to generate DBP-MAF.
But since the only amino acid difference between the GC*2 and GC*1F forms of DBP is mutation of T420 to lysine (removing the site of trisaccharidic glycosylation in GC*2 allele products), shouldn’t GC*1F (and GC*1S, for that matter) also carry a disaccharide modification, considering that the site of disaccharidic glycosylation on GC*2 remains intact in GC*1 variants? The data shown here actually hint that this is the case: Deconvoluted mass spectra of DBP from homozygous individuals (Figure 1B–D) show (more clearly, at least, than heterozygous cases) a small peak at Δm = +365 Da from the native, unmodified protein (represented by the base peak). In fact, trace amounts of the GC*1S allele product modified by both glycoforms may be evidenced in Panel C of Figure 1 at m/z 52225. We have followed up on these molecular clues, and additional details on the genotype-linked O-glycosylation patterns of DBP have recently been published elsewhere.31
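The kind of mass-shift bookkeeping used in this discussion can be expressed as a short routine that labels each deconvoluted peak by its offset from the base peak. The sketch below uses the nominal shifts discussed in this article (+162 Da glycation, +365 Da disaccharide, +657 Da trisaccharide) and a 2 Da tolerance motivated by the mass accuracy reported above. The base mass of 51,203 Da is an inference (51,860 Da minus 657 Da) rather than a value quoted from the Figure 1 caption, the peak at 51,500 Da is a made-up unassigned example, and the function name is illustrative.

```python
from itertools import combinations

# Nominal mass shifts (Da) discussed in the text.
KNOWN_SHIFTS = {"glycation": 162.0, "disaccharide": 365.0, "trisaccharide": 657.0}


def annotate_peaks(base_mass, peak_masses, tolerance=2.0):
    """Label each peak by its mass shift from the base (unmodified) peak.

    Considers single modifications and pairs of different modifications.
    """
    candidates = {"unmodified": 0.0}
    candidates.update(KNOWN_SHIFTS)
    for first, second in combinations(KNOWN_SHIFTS, 2):
        candidates[first + " + " + second] = KNOWN_SHIFTS[first] + KNOWN_SHIFTS[second]

    annotations = {}
    for mass in peak_masses:
        delta = mass - base_mass
        matches = [name for name, shift in candidates.items()
                   if abs(delta - shift) <= tolerance]
        annotations[mass] = matches or ["unassigned"]
    return annotations


if __name__ == "__main__":
    base = 51203.0  # inferred base-peak mass (assumption, see lead-in)
    peaks = [51203.0, 51860.0, 52225.0, 51500.0]  # 51500.0 is hypothetical
    for mass, labels in annotate_peaks(base, peaks).items():
        print(mass, labels)
```

With these inputs the peak at 52,225 Da is labeled as carrying both the disaccharide and the trisaccharide, matching the interpretation offered above. Such labeling is only meaningful because the targeted sample preparation restricts the spectrum to forms of a single protein.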
ESI-TOF-MS data collected on intact DBP (a 51-kDa protein) in a targeted manner across populations provided functional genotypic analysis and the detection of multiple, specific DBP modifications attached to particular allele products, facilitating the detection and identification of more than 10 forms of DBP—even providing data to suggest that some of these forms may be useful as biomarkers towards T2D. With regard to technique, targeted, mass spectrometric analysis of intact protein across populations was required to accomplish this task. Considering that two major categories of protein biomarker include modification of amino acid sequence and qualitative/quantitative changes in posttranslational modification, the potential utility for this type of analytical approach remains not only overlooked, but essentially untapped.
The authors are grateful to Intrinsic Bioprobes, Inc. team members for their helpful discussions. Work supported by Arizona State University’s Technology and Research Initiative Fund and by the Agilent Technologies Foundation in the form of Research Project Gift 08US-422UR.