Plasma biomarker studies are based on the differential expression of proteins between different treatment groups or between diseased and control populations. Most mass spectrometry-based methods of protein quantitation, however, are based on the detection and quantitation of peptides, not intact proteins. For peptide-based protein quantitation to be accurate, the digestion protocols used in proteomic analyses must be both efficient and reproducible. There have been very few studies, however, where plasma denaturation/digestion protocols have been compared using absolute quantitation methods. In this paper, 14 combinations of heat, solvent [acetonitrile, methanol, trifluoroethanol], chaotropic agents [guanidine hydrochloride, urea], and surfactants [sodium dodecyl sulfate (SDS) and sodium deoxycholate (DOC)] were compared with respect to their effectiveness in improving subsequent tryptic digestion. These digestion protocols were evaluated by quantitating the production of proteotypic tryptic peptides from 45 moderate- to high-abundance plasma proteins, using tandem mass spectrometry in multiple reaction monitoring mode, with a mixture of stable-isotope labeled analogues of these proteotypic peptides as internal standards. When the digestion efficiencies of these 14 methods were compared, we found that both of the surfactants (SDS and DOC) produced an increase in the overall yield of tryptic peptides from these 45 proteins, when compared to the more commonly used urea protocol. SDS, however, can be a serious interference for subsequent mass spectrometry. DOC, on the other hand, can be easily removed from the samples by acid precipitation. In a reproducibility study done with 5 replicate digestions, DOC and SDS with a 9 h digestion time produced the highest average digestion efficiencies (~80%), with the highest average reproducibility (<5% error, defined as the relative deviation from the mean value). However, because of potential interferences resulting from the use of SDS, we recommend DOC with a 9 h digestion procedure as the optimum protocol.
Plasma is an extremely complex (10¹⁰-fold protein concentration range) and highly proteinaceous (60–80 mg/mL) matrix that is frequently used for the determination of disease biomarkers.1 Most methods of mass spectrometry-based protein quantitation rely on the quantitation of peptides as molecular surrogates for their parent proteins. A key assumption of all such methods is that a protein is converted to its component peptides reproducibly and efficiently. If the abundance of a selected peptide is used to represent the abundance of the protein, then this efficiency is assumed to be 100%. This, however, has previously been shown not to be the case.2,3 Any variation in digestion efficiency will affect the amount of peptide available for detection, which can adversely affect the reproducibility of the entire analysis and the accuracy of the results.
For multiple reaction monitoring (MRM) and other quantitative proteomics methods based on comparing the signals of native peptides to the signals of standard peptides, quantitation of a protein is based on the amounts of specific selected peptides. [Note: for a review of protein quantitation methods, see ref 4; peptide-based quantitation methods include AQUA,5,6 SISCAPA,7,8 MRM,9 and iMALDI.10-12] Thus, the completeness of digestion is not really important. In other words, it does not matter whether the entire protein is digested because only the production of these specific target peptides is important. Digestion of the remainder of the protein is not important. Whether the digestion goes to completion (i.e., whether the digestion process is “finished”) is important, but only because it affects the reproducibility of the results. If the digestion consistently and reproducibly only went to 50% completion, for example, and if this fact were known, it could be corrected for. However, the percent completion of digestion is usually not known, and if the percent completion of digestion is not reproducible, this would lead to errors in both absolute and relative quantitation measurements.
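The correction described above can be illustrated with a short sketch. It assumes that one mole of target peptide is released per mole of protein and that the digestion efficiency is known and reproducible; the function name and the example numbers are hypothetical and are not taken from this study.

```python
def corrected_protein_amount(peptide_amount, digestion_efficiency):
    """Estimate the protein amount from a measured peptide amount.

    Assumes a 1:1 peptide-to-protein stoichiometry and a known,
    reproducible digestion efficiency in the range (0, 1].
    """
    if not 0 < digestion_efficiency <= 1:
        raise ValueError("digestion_efficiency must be in (0, 1]")
    return peptide_amount / digestion_efficiency


# Hypothetical example: 10 fmol of peptide measured at 50% completion
estimate = corrected_protein_amount(10.0, 0.5)
print(f"Estimated protein amount: {estimate} fmol")
```

If the efficiency varies between digests, no single correction factor applies, which is why reproducibility matters more than completeness here.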
Why does the digestion process vary from protein to protein? There are structural reasons for variations in protein digestibilities. These include disulfide bridges, specific folds (as in prions), solubility issues (membrane proteins), glycosylation, and combinations of these features that can keep the enzyme from being able to access all or some of the cleavage sites. There are biological reasons that have led to the evolution of these structural features. Some proteins function as protease inhibitors (e.g., eppin), others are located in an enzyme-rich environment and must themselves be resistant to proteolysis. Proteolytic enzymes in particular need to be resistant to autolysis. This means that not all proteins are alike with respect to ease of proteolytic digestion, and this can lead to variabilities in digestion efficiencies between proteins. However, because enzymes are, in general, more resistant to proteolysis, it also provides a “window of opportunity” for the selection of digestion conditions where the target proteins can be unfolded but the enzyme itself is not denatured and remains active.
Previous studies comparing digestion efficiency could only monitor the “completeness” of the digestion by determining the sequence coverage of the peptides produced from target proteins3,13,14 or by comparing the number of successful protein or peptide identifications.3,14-17 “Completion” of the digestion was determined by monitoring peptide abundances to determine the point at which there was no further change in either peptide abundances or total peptide yield.
The ideal denaturing and digestion protocol would be one where complete digestion of the protein occurred in a short time period, resulting in the maximum observed peptide signal, followed by a “steady state” condition where no further change in the peptide concentrations was observed. Such a protocol would be robust in that small changes in digestion time would not be an important factor, resulting in optimum digestion reproducibility.
In this current study, by using a set of stable-isotope labeled peptides that were developed for the quantitation of proteins in human plasma, we are now able to determine and compare the absolute amounts of peptides formed from 45 plasma proteins in a single analysis, when different denaturation/digestion procedures are used. Being able to measure the absolute quantities of the peptides produced allows an accurate and quantitative comparison of the digestion efficiencies of the 14 different denaturation/digestion protocols studied.
Plasma was collected in-house from a volunteer who provided written informed consent. Blood from a healthy male donor who had fasted for 16 h was collected by venous puncture using a 21-gauge Becton Dickinson (BD) Sample Needle (Becton Dickinson, Oakville, ON). Sixty milliliters of blood was collected in twenty 3.0 mL lavender Vacutainer K2 EDTA tubes (Becton Dickinson), which contain 5.4 mg EDTA per tube. The samples were immediately centrifuged twice at 1,000 × g for 15 min at room temperature to remove any cells. The plasma was then aliquoted into 1.0 mL sterile cryotubes and immediately frozen. The collection process took less than 1 h, and the plasma samples were kept at −80 °C.
All reagents were ACS grade or higher; all solvents used, including water, were LC/MS grade. Urea Sigma Ultra, ammonium bicarbonate (ABC), sodium deoxycholate (DOC), guanidine hydrochloride (GnHCl), 2,2,2-trifluoroethanol (TFE), iodoacetamide (IAA), and formic acid were purchased from Sigma-Aldrich (Oakville, ON). Sodium dodecyl sulfate electrophoresis grade (SDS), Bond Breaker TCEP Solution (TCEP), Optima LC/MS grade acetonitrile (ACN), and Optima LC/MS grade water were purchased from Thermo-Fisher Scientific (Waltham, MA). LC/MS grade methanol was purchased from EMD Chemicals (Gibbstown, NJ). Sequencing-grade modified trypsin Gold was purchased from Promega (Madison, WI).
Proteotypic tryptic peptides containing isotopically-coded amino acids ([13C6]Arg or [13C6]Lys) representing 45 high- and moderate-abundance plasma proteins (Supporting Information Table 1) were synthesized at a 5 μmol scale using Fmoc chemistry on a Protein Technologies Prelude peptide synthesizer (Tucson, AZ). An in-house cocktail of these 45 isotopically-labeled peptide standards was created, with the concentration of the internal standard mixture balanced to give a stable-isotope labeled peptide peak area within a factor of 10 of the endogenous peptide peak area when the stable-isotope labeled peptide is added to a standard plasma tryptic digest.9
Plasma tryptic digestions were carried out using various chemical protocols and solvent protocols, alone and in combination. A common digestion method, using DOC as the denaturant, was included as a control during each experiment so that the methods could be accurately compared. Figure 1 outlines the experimental design for all of these different extraction protocols; the numbers refer to the specific stage at which denaturants were added in each protocol (Table 1).
Both raw and heat-denatured (boiled) plasma samples were prepared for urea denaturation as follows. To prevent degradation of the urea into cyanate ions that can carbamylate primary amines when exposed to heat,18 the plasma samples were heat-denatured prior to urea denaturation. A 50 μL aliquot of 10-fold diluted plasma (25 mM ABC) was heat-denatured by boiling for 5 min at 100 °C. This sample was allowed to cool on ice for 5 min, followed by chemical denaturation with 62.5 μL of 8 M urea. After approximately 5 min, the sample was diluted with 257.1 μL of 25 mM ABC, reduced with 41.1 μL of 50 mM TCEP (5 mM final concentration), and alkylated with 45.6 μL of 100 mM iodoacetamide (10 mM final concentration) as indicated below. The urea concentration was reduced to 1 M during digestion by dilution with 25 mM ABC in order to maintain the activity of the trypsin.
For urea denaturation without heat, 5 μL of undiluted human plasma was denatured with 40 μL of 9 M urea. After approximately 5 min, the urea-denatured sample was diluted with 25 mM ABC to give a 10-fold diluted plasma sample. The urea concentration during digestion was reduced to 0.72 M by dilution with 25 mM ABC to maintain the activity of the trypsin.
The GnHCl digestion protocol of Buscher et al.19 in which 1.0 M GnHCl was used was modified in accordance with the instructions in the Promega trypsin product insert.20 The GnHCl concentration was kept below 1.0 M during tryptic digestion in order to maintain the activity of the trypsin in the presence of this strong chaotropic agent. Both raw and heat-denatured (boiled) samples were prepared as in the urea protocol, with the following modifications: a 5 μL aliquot of undiluted plasma was denatured with 75.6 μL of 6 M GnHCl (in 25 mM ABC) at which point the sample was either boiled for 5 min or left on ice. Both samples were then diluted with 289.6 μL of 25 mM ABC, thereby reducing the GnHCl concentration to 0.9 M during digestion.
Both raw and heat-denatured (boiled) plasma samples were prepared as in the urea protocol with the following modifications: a 50 μL aliquot of 10-fold diluted plasma (25 mM ABC) was denatured with 62.5 μL of 0.2% w/v SDS at which point the sample was either boiled for 5 min or left on ice. Both samples were then diluted with 257.1 μL of 25 mM ABC, reduced with 41.1 μL of 50 mM TCEP (5 mM final concentration), and alkylated with 45.6 μL of 100 mM iodoacetamide (10 mM final concentration) as indicated below. The final SDS concentration during digestion was 0.025% w/v to maintain the activity of the trypsin.
Both raw and heat-denatured (boiled) plasma samples were prepared as in the urea protocol, but with the following modifications: a 50 μL aliquot of 10-fold diluted plasma (25 mM ABC) was denatured with 50 μL of 10% w/v DOC at which point the sample was either boiled for 5 min or left on ice. Both samples were then diluted with 269.6 μL of 25 mM ABC, reduced with 41.1 μL of 50 mM TCEP (5 mM final concentration), and alkylated with 45.6 μL of 100 mM iodoacetamide (10 mM final concentration) as indicated below. The DOC concentration during digestion was reduced to 1% w/v which is compatible with trypsin activity.21
A 5 μL aliquot of undiluted plasma was denatured with 45 μL of 50% v/v TFE in 25 mM ABC to give 10-fold diluted plasma. Samples were diluted with 319.6 μL of 25 mM ABC, reduced with 41.1 μL of 50 mM TCEP (5 mM final concentration), and alkylated with 45.6 μL of 100 mM iodoacetamide (10 mM final concentration) as indicated below. The TFE concentration during digestion was reduced to 5% v/v by dilution with 25 mM ABC to maintain the activity of the trypsin.
A 50 μL aliquot of 10-fold diluted plasma in 25 mM ABC was diluted with 157.6 μL of 25 mM ABC, reduced with 23.1 μL of 50 mM TCEP (5 mM final concentration), and alkylated with 25.6 μL of 100 mM iodoacetamide (10 mM final concentration) as indicated below. Immediately prior to trypsin addition, ACN was added to give a final concentration of 40% v/v during digestion.13 The solvent addition was done prior to the addition of the trypsin in order to ensure that the plasma samples and the trypsin were only exposed to 40% v/v ACN and to prevent protein precipitation.13,22
A 50-μL aliquot of 10-fold diluted plasma in 25 mM ABC was diluted with 238.6 μL of 25 mM ABC, reduced with 32.1 μL of 50 mM TCEP (5 mM final concentration), and alkylated with 35.6 μL of 100 mM iodoacetamide (10 mM final concentration) as indicated below. Prior to trypsin addition, MeOH was added to give a final concentration of 20% v/v during digestion. Solvent was added prior to the addition of the trypsin in order to ensure that plasma samples and trypsin were only exposed to 20% v/v MeOH and to prevent protein precipitation.23
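The final denaturant concentrations quoted in the protocols above follow from simple dilution arithmetic and can be checked with the sketch below. It assumes the 500 μL final digest volume reported for all samples; the function and dictionary names are illustrative only.

```python
FINAL_VOLUME_UL = 500.0  # final digest volume reported for all samples


def final_concentration(stock_conc, stock_ul, final_ul=FINAL_VOLUME_UL):
    """Dilution: C_final = C_stock * V_stock / V_final."""
    return stock_conc * stock_ul / final_ul


# (stock concentration, stock volume in uL) as stated in the protocols
stocks = {
    "urea, heated (M)": (8.0, 62.5),
    "urea, no heat (M)": (9.0, 40.0),
    "GnHCl (M)": (6.0, 75.6),
    "SDS (% w/v)": (0.2, 62.5),
    "DOC (% w/v)": (10.0, 50.0),
    "TFE (% v/v)": (50.0, 45.0),
}

for name, (conc, volume) in stocks.items():
    print(f"{name}: {final_concentration(conc, volume):.3f}")
```

With the stated volumes this gives 1.0 M and 0.72 M urea, about 0.91 M GnHCl, 0.025% SDS, 1% DOC, and 4.5% TFE, the last of which the protocol reports as 5% v/v.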
All of the denatured samples were reduced with 5 mM TCEP, which corresponds to a 22.5-fold excess over the ~26 mM concentration of protein cysteine in plasma that we have calculated from the known plasma protein concentration. The reaction was kept at 60 °C for 30 min. Samples were alkylated with 10 mM iodoacetamide (which corresponds to a 50-fold excess over the calculated plasma protein cysteine concentration24) at 37 °C for 30 min in the dark. Denatured samples not compatible with these incubation temperatures (i.e., those containing urea) were reduced and alkylated at room temperature. Modified sequencing-grade trypsin was added to the reduced and alkylated samples (43.8 μL of 0.4 mg/mL trypsin in 25 mM ABC) at a 20:1 protein/enzyme ratio, and samples were digested at 37 °C. Plasma was diluted 100-fold during the sample preparation process, resulting in a 500 μL final volume for all samples. Samples were collected in 20 μL aliquots at 0.5, 1, 2, 4, 6, 9, 12, and 23 h time points. The tryptic digestion was stopped by acidifying the sample with 80 μL of a 0.5% formic acid solution containing the cocktail of 45 SIS peptide standards. The sample was stored at −80 °C until analysis.
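The protein/enzyme ratio can be checked in the same way. The sketch assumes a total plasma protein concentration of 70 mg/mL (the midpoint of the 60–80 mg/mL range cited in the introduction) and 5 μL of undiluted plasma per 500 μL digest, as implied by the 100-fold dilution; the variable names are illustrative.

```python
PLASMA_UL = 5.0               # undiluted plasma per digest (500 uL / 100-fold)
PLASMA_PROTEIN_MG_ML = 70.0   # assumption: midpoint of the 60-80 mg/mL range
TRYPSIN_UL = 43.8             # trypsin volume added per digest
TRYPSIN_MG_ML = 0.4           # trypsin stock concentration

# uL * mg/mL = ug
protein_ug = PLASMA_UL * PLASMA_PROTEIN_MG_ML
trypsin_ug = TRYPSIN_UL * TRYPSIN_MG_ML
ratio = protein_ug / trypsin_ug

print(f"protein: {protein_ug:.0f} ug, trypsin: {trypsin_ug:.2f} ug")
print(f"protein/enzyme ratio = {ratio:.1f}:1")
```

Under these assumptions the ratio comes out very close to the stated 20:1; at 60 or 80 mg/mL it would be roughly 17:1 or 23:1.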
Prior to MS analysis, the digested samples containing DOC were centrifuged (Eppendorf centrifuge 5415D, Mississauga, ON) at room temperature for 2 min at 16,000 × g to effectively pellet DOC and remove it from the digest. All digests were then desalted and concentrated by solid phase extraction (SPE) using Waters Oasis (10 mg) columns according to the manufacturer's instructions (Waters, Mississauga, ON). Briefly, the SPE columns were conditioned and equilibrated with 1 mL MeOH and 1 mL dH2O, respectively. The sample solution (90 μL sample + 500 μL 0.1% formic acid) was then loaded onto the column, and the columns were washed with 1 mL dH2O. The samples were eluted with 200 μL 50% ACN, 0.1% formic acid and lyophilized (Thermo Electron Corporation, ModulYoD Freeze-Dryer). Lyophilized samples were rehydrated with 0.1% formic acid to give 1 μg/μL protein concentrations.
An Eksigent NanoLC-1Dplus HPLC (Dublin, CA) coupled online to an Applied Biosystems/MDS Sciex 4000 QTRAP was used for the analyses. The desalted plasma digest samples (1 μL) were injected onto a reversed-phase capillary column (75 μm i.d. × 15 cm) packed in-house with Magic C18AQ (5 μm i.d. particles, 100 Å pore size, Michrom, Auburn, CA). The flow rate was 300 nL/min. A sample loading time of 6 min at 100% solvent A (2% acetonitrile, 0.1% formic acid) was followed by a 32-min linear gradient from 0 to 23% solvent B (98% acetonitrile, 0.1% formic acid), a 9-min linear gradient from 23% to 43% solvent B, a 2-min linear gradient to 80% solvent B held for 2 min, followed by a 10-min column equilibration step with 100% solvent A.
An Applied Biosystems (AB) Sciex 4000 QTRAP (Applied Biosystems, Streetsville, ON) equipped with a nanoelectrospray ionization source and controlled by Analyst 1.5 software was used for all of the LC-MRM/MS analyses. The following parameters were used: 1900–2000 V ion spray voltage; 25 nitrogen curtain gas; 150 °C interface heater temperature; 3.5 × 10⁻⁵ Torr vacuum gauge pressure; unit resolution on Q1 and Q3.
Uncoated fused silica emitter tips (20 μm inner diameter, 10 μm tip, New Objective, Woburn, MA), 1–3 L/min Ion Source Gas (GS1), and postcolumn addition makeup flow were used. Make-up solvent (80% v/v isopropanol, 10% v/v acetonitrile) at a flow rate of 50 nL/min was added using a PicoPlus 11 syringe pump (Harvard Apparatus, Holliston, MA). The 60 min MRM acquisition method was constructed using 90 MRM pairs with the tuned DP and CE voltages, which had previously been developed for these peptides, and a default collision cell exit potential of 23 V (see Supporting Information Table 2, from ref 9). As was done in this previous study, an 8 min wide scheduled MRM retention time window with a 2 min cycle time was used for all data acquisition.
MultiQuant 1.1 (Applied Biosystems), with the integration algorithm MQL for peak integration, was used for MRM data analysis. Peak integration parameters were set with a total smoothing width of 1 point, a retention window of 120 s, a peak splitting factor of 2 points, and report largest peak enabled. Other peak integration parameters were used at default settings. Peak area ratios were generated by dividing the integrated peak areas of the extracted ion chromatograms for the natural peptide by the peak areas of the coeluting, equivalent MRM transitions of the isotopically-labeled peptide.
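The peak area ratio described above, and its use for estimating the endogenous peptide amount, can be sketched as follows. The conversion to an absolute amount assumes that the light and heavy peptides respond identically and that the spiked amount of SIS peptide is known; the function names and example values are hypothetical.

```python
def peak_area_ratio(light_area, heavy_area):
    """Ratio of the endogenous (light) to the SIS (heavy) peak area."""
    if heavy_area <= 0:
        raise ValueError("heavy_area must be positive")
    return light_area / heavy_area


def endogenous_amount(light_area, heavy_area, sis_amount):
    """Endogenous peptide amount, assuming equal response for both forms."""
    return peak_area_ratio(light_area, heavy_area) * sis_amount


# Hypothetical example: light peak is 2.5x the heavy peak, 100 fmol SIS spiked
ratio = peak_area_ratio(5.0e5, 2.0e5)
amount_fmol = endogenous_amount(5.0e5, 2.0e5, 100.0)
print(f"ratio = {ratio:.2f}, endogenous = {amount_fmol:.0f} fmol")
```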
Currently there are numerous published protocols for proteolytic digestion, involving various methods of chaotropic, surfactant, and solvent based denaturation.2,5,13,14,16,17,19,21,25-27 It is well-documented that altering trypsin digestion conditions can affect which peptides are observed from proteins in proteolytic digests and in turn the overall sequence coverage of proteins,17 as was clearly demonstrated in the characterization of the urine proteome by Adachi et al.28 Primary and secondary structure of proteins can confer differing levels of resistance to proteolysis, which can result in poor accessibility of trypsin to internal cleavage sites.29 Advantages of in-gel digestion for mass spectrometry-based identification of hydrophobic membrane proteins can largely be attributed to the denaturing power of SDS during the sample preparation step prior to gel electrophoresis. Recent hybrid methods for sample preparation have focused on combining the denaturing properties of SDS with the ease and improved recovery of in-solution digestion.30
With the intent of adapting sample preparation for MRM analysis to high-throughput liquid-handling robotics platforms, protocols were selected on the basis of their simplicity, cost effectiveness, and previous use in proteomics sample preparation, as well as on compatibility of the chemical or solvent system with tryptic activity and downstream mass spectrometric analysis.
Overnight (16 h) trypsin digestion at 37 °C is routinely used in proteomics, but this is for convenience only. Examining the tryptic digests at defined time points during the digestion process could give insights as to when digestion has actually reached completion. Decreasing the time required for tryptic digestion could dramatically impact sample throughput and time to first result, as well as reducing the amount of tryptic autolysis products.
A schematic workflow for the digestion of human plasma is shown in Figure 1. Whenever possible, all digests were prepared from a common EDTA-plasma standard sample. This human plasma sample was diluted 1/10 with 25 mM ABC and was used for all digestions to minimize errors that might result from pipetting small volumes of highly proteinaceous material. The volumes of all samples were kept identical during reduction, alkylation, and digestion. To ensure that the amount of trypsin was not a limiting factor in our experiments, all samples were digested using a 1:20 ratio of trypsin to substrate rather than the 1:50 ratio commonly cited in the literature.17,21
The 45 proteins can be divided into 4 groups depending on their digestion profiles (Supporting Information Figure 1). For most proteins, including plasma retinol binding protein and transthyretin, a maximum signal was achieved within 4 h of digestion followed by a “plateau” over the following 20 h. This is the ideal case and allows the development of a robust digestion protocol where small variations in digestion time would not have a significant impact on peptide production. The peptide abundances in a second group of analytes, which included fibrinogen γ chain and apolipoprotein A-II, continued to increase during the 24 h period studied, without ever reaching a maximum. Other proteins, like haptoglobin β and α-2-antiplasmin, also generated a maximum signal within 4 h of digestion, but the signal then decreased as the period of digestion increased. For a fourth group of proteins, which included vitamin D binding protein and complement component C9, the shape of the digestion efficiency curve depended on the digestion method used. For these 3 groups of “non-ideal” proteins, consistent digestion times are required in order to achieve reproducible results.
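The first three profile types described above can be told apart with a simple rule applied to a single time course, sketched below. The 10% tolerance is a hypothetical threshold chosen for illustration, not a criterion used in this study, and the fourth (method-dependent) group cannot be identified from one curve because it requires comparing profiles across protocols.

```python
TIME_POINTS_H = [0.5, 1, 2, 4, 6, 9, 12, 23]  # sampling times from the Methods


def classify_profile(signal, tolerance=0.10, plateau_by_h=4):
    """Assign one digestion time course to a profile type.

    `signal` holds one peak area ratio per entry of TIME_POINTS_H.
    `tolerance` (hypothetical) is the fractional drop from the maximum
    that is still treated as "no change".
    """
    if len(signal) != len(TIME_POINTS_H):
        raise ValueError("one signal value per time point is required")
    threshold = (1 - tolerance) * max(signal)
    if signal[-1] < threshold:
        return "decreasing"   # signal fell away after its maximum
    if signal[TIME_POINTS_H.index(plateau_by_h)] >= threshold:
        return "plateau"      # near-maximal by 4 h and still so at 23 h
    return "increasing"       # still climbing at the last time point


# Hypothetical time courses for illustration only
print(classify_profile([0.3, 0.6, 0.9, 1.0, 1.0, 0.98, 1.0, 0.99]))
print(classify_profile([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 1.0]))
print(classify_profile([0.4, 0.7, 0.9, 1.0, 0.9, 0.8, 0.7, 0.5]))
```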
SIS peptides were added to all time-point samples in equal amounts. To obtain the plots shown in Figure 2, peak area ratios (light:heavy) were generated as integrated peak areas of the endogenous peptide (light) to the peak areas of their coeluting SIS peptide (heavy). To provide a reference method, each set of experiments contained a DOC digest as a common standard condition (green line). A linear data-normalization step using the average response from 4 DOC data sets was used to correct the time points of other treatments and to facilitate comparisons between experiments. To generate these plots, a two point rolling average was applied in order to reduce local variation (note that rolling averaging affects only the line and not the data points in these plots).
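One possible reading of the normalization and smoothing steps above is sketched below. It assumes that the "linear" normalization is a single scale factor per experiment, chosen so that the experiment's DOC response matches the average DOC response across the 4 data sets; this interpretation, the function names, and the example numbers are assumptions rather than details given in the text.

```python
def doc_scale_factor(doc_this_experiment, doc_all_experiments):
    """Factor mapping this experiment's DOC response onto the mean DOC response."""
    mean_doc = sum(doc_all_experiments) / len(doc_all_experiments)
    return mean_doc / doc_this_experiment


def rolling_mean_2pt(values):
    """Two-point rolling average; the first point is kept unchanged."""
    if not values:
        return []
    return [values[0]] + [(a + b) / 2 for a, b in zip(values, values[1:])]


# Hypothetical DOC responses in 4 experiments, and one treatment time course
doc_responses = [0.8, 1.0, 1.2, 1.0]
factor = doc_scale_factor(doc_responses[0], doc_responses)
treatment = [0.2, 0.4, 0.8]
scaled = [x * factor for x in treatment]   # corrected data points
trend_line = rolling_mean_2pt(scaled)      # smoothed line only
print(factor, scaled, trend_line)
```

Keeping `scaled` and `trend_line` separate mirrors the note in the text that the rolling average affects only the plotted line and not the data points.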
A literature survey of plasma proteomics-related publications reveals that urea is a widely used chaotropic agent for solubilizing protein pellets, IEF sample prep, and general in-solution enzymatic digestion of proteins.31 The mechanism of protein denaturation by urea has been widely studied (ref 32 and references therein) and is thought to start with urea attachment to charged histidine and then to the positively charged amino acids and the amide groups,33 leading to disruption of hydrogen bonds, followed by solvation with water and urea.32,34 Urea, however, can degrade into ammonium cyanate when exposed to heat, which leads to carbamylation of free primary amines.
Stronger chaotropic chemicals that are incapable of modifying proteins, such as GnHCl and SDS, can also be used during tryptic digestion,19,35 but each of these chemicals presents additional challenges. The concentration required for GnHCl denaturation can severely reduce the enzymatic activity of the trypsin.36 SDS is not compatible with direct sample injection as it is not possible to remove SDS from samples with reversed-phase HPLC configurations. Even low levels of SDS (<0.01%) interfere with reversed-phase separation of peptides37,38 and suppress ionization in both MALDI and electrospray techniques, and off-line removal of SDS (for example, with SCX Ziptips) adds additional steps to the protocol and can lead to sample loss.
Sodium deoxycholate (DOC) is a bile salt surfactant with unique properties that make it compatible with easy sample preparation for MS-based proteomics. DOC is inexpensive and is easily removed prior to MS analysis or solid phase extraction, as it is acid-insoluble and precipitates at low pH levels.39 Additionally, trypsin retains its full activity in 1% w/v DOC solutions.17
Analysis of the tryptic peptides from all 45 proteins revealed that the rates of digestion for these analytes vary widely even within a single chemical denaturation method (urea, GnHCl, DOC, or SDS) (Figure 2 and Supporting Information Figure 1). Six representative plots of tryptic peptide production versus time, as a function of the chemical denaturant used, are shown in Figure 2A, and the remainder of these plots are provided as Supporting Information Figure 1.
It proved difficult to maintain the activity of trypsin in digests that contained GnHCl as a denaturant. The concentration of GnHCl was reduced to 0.9 M prior to the addition of trypsin, as recommended by Promega40 and in accordance with values found in the literature.19 However, in this study we found that GnHCl-containing digests consistently showed lower efficiencies of digestion, resulting in lower peak area ratios in the MRM analysis than with the other methods. SDS was found to increase the production of proteotypic peptides from digestion-resistant proteins when digestion was performed for longer than 9 h. SDS, however, is associated with contamination of the HPLC column and signal suppression37,38 and is therefore not the ideal denaturant. Our results indicate that DOC could be used as an alternative to SDS.
Solvent denaturation protocols are attractive as they alter the solution chemistry of the digestion by causing partial unfolding or some alteration of the protein's secondary structure. Volatile solvents can be easily removed from samples prior to MS analysis. Non-salt-based denaturation protocols have also been shown to provide reproducible and sensitive detection when compared to traditional protocols using urea.16 Russell et al.13 and Simon et al.22 reported that trypsin maintains its proteolytic activity in mixed-solvent systems containing up to 80% organic solvent. The use of organic solvents as denaturants must be optimized to determine the proper concentration suitable for both solubilization of proteins and retention of tryptic activity.
Our solvent digestion protocol using MeOH was adapted from studies indicating that the addition of MeOH enhances the activity of trypsin when compared to aqueous conditions.27,41 The solvent digestion protocol using ACN was adapted from Russell et al.13 In both protocols, solvent is added immediately prior to the addition of trypsin in order to prevent protein precipitation before digestion. A tryptic digestion protocol using 50% v/v TFE as a denaturant, reported by Adachi et al.,28 was modified for use in our experiments. As was the case for methanol and acetonitrile, consideration was given to obtaining concentrations of TFE solvent that both denature plasma proteins and are compatible with tryptic digestion. To ensure that the plasma was exposed to an appropriate concentration of TFE, denaturation was done prior to sample dilution. Sample dilution reduced the concentration of TFE to 5% v/v, which was compatible with efficient tryptic enzyme activity.
Figure 2B shows representative plots showing the production of target proteotypic peptides, from the same 6 plasma proteins as shown in Figure 2A, as a function of solvent denaturation and digestion time. All of the remaining digestion profiles are provided in Supporting Information Figure 1. DOC was included as a control treatment to ensure reproducibility and allow for comparison between solvent and chemical denaturant experiments.
Both MeOH and TFE produced tryptic digestion profiles similar to those obtained with DOC. Although other studies have reported the use of 40% acetonitrile, and even up to 80% acetonitrile, as a denaturant,42 in our experiments, a significant reduction in digestion efficiency was observed for all 45 analytes in the presence of 40% v/v acetonitrile, even though no protein precipitation was observed. However, moderate digestion of some proteins, such as haptoglobin β (Figure 2B), was observed when using acetonitrile denaturation. These results indicate that it is possible to obtain tryptic digestion efficiencies equivalent to that obtained with DOC by using solvent denaturation.
The data above indicate that tryptic digestion efficiencies equivalent to that obtained with DOC (a chemical denaturant) could be obtained with solvent denaturation, using MeOH or TFE. To determine if the digestion protocols could be further improved, solvent denaturation was combined with sodium deoxycholate denaturation. Digestion protocols were designed to examine the effects of the following combined treatments: DOC/TFE, DOC/MeOH, and TFE/MeOH (Figure 2C).
The combination of chemical and solvent denaturation did not significantly improve digestion efficiencies, compared to the digestion efficiencies obtained with sodium deoxycholate alone. The strongest combined denaturation method was found to be DOC/MeOH, with 29 proteins attaining a maximum signal under these conditions. However, it should be noted that 22 proteins attained their maximum signals with DOC alone in this experiment (Figure 3).
Although proteotypic peptides for 41 of the 45 proteins studied reached their maximum signals and improvement of overall digestion efficiency could be achieved for certain proteins, complete digestion of other proteins was never attained, and the digestion efficiencies of several proteins were even reduced when these combined systems were used (Figure 2C). For example, peptide production from both α-1-acid glycoprotein and antithrombin-III decreased 2-fold when a combined denaturation method was used. This suggests that the combination of denaturants may have reduced the efficiency of the trypsin and upset the balance between creating a strongly denaturing environment and maintaining trypsin enzyme activity. Tryptic digestion profiles of all 45 proteins in the presence of all of the chaotropic and solvent denaturation methods are provided in Supporting Information Figure 1.
Heat is often used as a strong denaturing condition in the analysis of proteins, assisting in the unfolding of proteins. Heat denaturation is routinely used in SDS-PAGE sample preparation and is widely regarded as one of the strongest methods of protein denaturation. In this portion of the study, the chemical denaturation protocols using SDS, urea, DOC, and GnHCl were modified for use with heat denaturation. Each of the four chemical denaturant protocols was applied to two aliquots of plasma, all of which were taken from the same stock sample of human EDTA-plasma, one of which was heat-treated. Our results indicated that heat denaturation does not have a dramatic effect on the rate of tryptic digestion when combined with SDS, DOC, or GnHCl denaturation. However, heat denaturation of plasma samples prior to chemical denaturation with urea improved the efficiencies of the urea digestion protocol for 32 out of the 45 proteins examined (Figure 2D and Supporting Information Figure 1).
To compare the digestion results from these different digestion protocols, we determined not only the maximum signal but the frequency at which this maximum intensity was reached using the 10 protocols that showed significant differences (protocols using heat did not show significant effect and were not included in this portion of the study). As can be seen from Figure 3, the maximum observed peptide production from 38 of the 45 proteins was obtained with either the SDS or the DOC protocols, with 41 of the 45 proteins achieving their maximum signals using these protocols. Thus, for proteins that are hard to solubilize, detergents such as SDS and DOC may be better choices of denaturants than urea or GnHCl. Over the range of denaturing conditions studied, TFE and MeOH were found to outperform DOC for 7 of the 45 proteins studied. However, the use of ACN resulted in no peptide reaching maximum peptide signal compared with the other solvents.
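The tally of how often each protocol gave the maximum signal can be sketched as below. The data structure, function name, and values are hypothetical and are shown for illustration only; they are not data from this study.

```python
from collections import Counter


def count_best_protocols(max_signal):
    """Count how often each protocol gives the highest signal.

    `max_signal` maps protein -> {protocol: maximum peak area ratio}.
    """
    return Counter(
        max(by_protocol, key=by_protocol.get)
        for by_protocol in max_signal.values()
    )


# Hypothetical values for three proteins and three protocols
example = {
    "protein A": {"DOC": 1.00, "SDS": 0.90, "urea": 0.60},
    "protein B": {"DOC": 0.80, "SDS": 0.95, "urea": 0.70},
    "protein C": {"DOC": 0.90, "SDS": 0.85, "urea": 0.50},
}
counts = count_best_protocols(example)
print(counts.most_common())
```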
It is difficult to rank these methods by digestion efficiency alone, especially since the results differ so widely from protein to protein. Although the profiles are very informative, each time point represents only a single technical replicate. Therefore, since the reproducibility of the method is of paramount importance for accurate quantitation, we chose to characterize the complete technical reproducibility of these protocols by generating 5 replicate samples of a standard plasma sample digested for 4, 9, and 16 h. We selected the 4 most promising methods on the basis of the results above (DOC, DOC/MeOH, DOC/TFE, and TFE/MeOH) and compared them to SDS (arguably the strongest denaturant in the study) and urea (representing the most commonly used method), for a total of 6 digestion protocols, 3 digestion times, and 5 replicates, giving 90 unique digests. The results of these replicate analyses are shown in Table 2 and Figures 4–7.
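The size of this replicate design follows directly from its three factors. As a quick check, the short Python sketch below enumerates the combinations; the variable names are illustrative and are not taken from the study.

```python
from itertools import product

# Protocols, digestion times, and replicate count as listed in the text.
protocols = ["DOC", "DOC/MeOH", "DOC/TFE", "TFE/MeOH", "SDS", "urea"]
digestion_hours = [4, 9, 16]
replicates = range(1, 6)  # 5 replicate digests per condition

# One entry per unique digest: (protocol, hours, replicate number).
digests = list(product(protocols, digestion_hours, replicates))
print(len(digests))  # 6 protocols x 3 times x 5 replicates
```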
To normalize data for each analyte, the maximum value of these average peak area ratios (light:heavy) for these 5-fold replicate analyses was determined, and the 90 peak area ratios were normalized by dividing each by this maximum value (from this point forward, this will be referred to as normalized data). The minimum average peak area ratio was also identified for each analyte, and all 45 proteins were ranked. The minimum average peak area ratio of an analyte is the lowest response for that analyte under all conditions and digestion lengths. This effectively describes the range of digestion efficiencies observed: the higher the minimum average peak area ratio of an analyte, the more rapid the digestion, whereas the maximum peak area ratio gives information on “completeness” of the digest.
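The normalization described above can be sketched for a single analyte as follows. This is a minimal illustration rather than the software used in the study: the function name, the dictionary layout, and the example ratios are all hypothetical.

```python
from statistics import mean

def normalize_analyte(ratios):
    """Normalize replicate light:heavy peak area ratios for one analyte.

    ratios maps a (protocol, hours) condition to its list of replicate
    peak area ratios. Every ratio is divided by the largest per-condition
    average, so the best condition averages 1.0 after normalization.
    """
    averages = {cond: mean(vals) for cond, vals in ratios.items()}
    max_avg = max(averages.values())  # relates to "completeness"
    min_avg = min(averages.values())  # lowest response over all conditions
    normalized = {
        cond: [v / max_avg for v in vals] for cond, vals in ratios.items()
    }
    return normalized, max_avg, min_avg

# Made-up ratios for two conditions (illustration only, not study data).
example = {
    ("DOC", 9): [0.80, 0.82, 0.78, 0.81, 0.79],
    ("urea", 4): [0.40, 0.42, 0.38, 0.41, 0.39],
}
normalized, max_avg, min_avg = normalize_analyte(example)
print(round(max_avg, 3), round(min_avg, 3))
```

In this invented example the urea condition averages about half of the DOC condition, so its normalized values cluster around 0.5.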
To compensate for the different levels of signal observed for the 45 analytes and to determine if greater differences exist between protocols when looking at different protein classes, proteins were ranked on the basis of minimum average peak area ratios, and the 6 most effective protocols were compared. To better reveal differences between digestion conditions, all 45 proteins were divided into three equal groups based on the range of digestion efficiencies observed: proteins that were rapidly digested (Figure 4A), proteins that were moderately digested (Figure 4B) and proteins resistant to digestion (Figure 4C).
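One way to picture the three-way split is the sketch below. It assumes that the ranking uses each protein's minimum average ratio after normalization (the text does not say explicitly whether raw or normalized minima were ranked), and the protein labels and values are invented.

```python
def split_into_thirds(min_ratio_by_protein):
    """Rank proteins by minimum average ratio and split into three groups.

    A higher minimum means the protein was already well digested under the
    least favorable condition, so it is treated as rapidly digested.
    """
    ranked = sorted(
        min_ratio_by_protein, key=min_ratio_by_protein.get, reverse=True
    )
    third = len(ranked) // 3  # 15 per group when there are 45 proteins
    return {
        "rapid": ranked[:third],
        "moderate": ranked[third:2 * third],
        "resistant": ranked[2 * third:],
    }

# Hypothetical normalized minima for six placeholder proteins.
minima = {"P1": 0.92, "P2": 0.15, "P3": 0.70, "P4": 0.55, "P5": 0.05, "P6": 0.88}
groups = split_into_thirds(minima)
print(groups)
```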
The results show that peptides from 31 of the 45 proteins reach a steady state of tryptic digestion products after 9 h of tryptic digestion and show no further increase or decrease for up to 16 h. For 25 of these 31 proteins, rapid and complete digestion was observed after only 4 h of tryptic digestion. Of the remaining 14 proteins, however, 10 actually show a decrease in signal with prolonged digestion (9 and 16 h). Proteins that are slow to digest show a progressive increase in signal over longer digestion periods, with all 6 denaturation methods. Signals from 4 of the analytes kept increasing even after 16 h of digestion and did not reach a plateau during this time period. Thus, the optimal digestion protocols and reproducibility are analyte-dependent (Figure 5).
Although no single digestion protocol produced the highest signal for all of the proteins, if the target protein is known, selection of the appropriate digestion protocol can increase analyte signal more than 3-fold. For example, although DOC produced the highest peptide signals for most of the proteins, apolipoprotein C-I showed its highest signal when SDS was used. What we have presented here is the “best” method for general shotgun proteomics. Certainly, if the target protein is known, 3–5 peptides should be selected and examined for digestion efficiencies and for MRM tuning as well, as was previously pointed out.9
As can be seen in Table 2, there is a large amount of protein to protein variability, but the protein displaying this maximum variability is not the same in every treatment. For example, zinc α-2-glycoprotein showed the overall highest variability (66%) with 4 h of digestion with urea. With other treatments, however, this protein gave % error values (defined as the relative deviation from the mean) close to the overall average. In contrast, almost all of the % errors for L-selectin are above average, while hemopexin showed very low % errors in all treatments.
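The % error used here is defined only as the relative deviation from the mean, which leaves the exact formula open. The sketch below takes one plausible reading, the mean absolute deviation of the replicates expressed as a percentage of their mean; a coefficient of variation (standard deviation over mean) would be an equally reasonable alternative. The function name and the example values are hypothetical.

```python
from statistics import mean

def percent_error(replicates):
    """Mean absolute deviation from the mean, as a percentage of the mean.

    This is an assumed interpretation of "relative deviation from the
    mean", not a formula confirmed by the source.
    """
    m = mean(replicates)
    deviations = [abs(x - m) for x in replicates]
    return 100.0 * mean(deviations) / m

# Five invented replicate peak area ratios for one protein and treatment.
print(round(percent_error([0.95, 1.00, 1.05, 1.00, 1.00]), 2))
```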
The average % errors for each treatment and digestion time are given in Figure 6B. For the 9 and 16 h digestion times, the DOC treatment showed the lowest % error (the highest reproducibility) of all 6 treatments, a % error of 4.72% and 4.30%, respectively. Even for the 4 h digestion time, the DOC treatment had a % error of only 5.79%. Urea showed poor reproducibility relative to the other treatments, with the highest (or one of the highest) % errors in all treatments and at all digestion times.
The results in Table 2 clearly demonstrate the complexity of the protein digestion process and the challenges in obtaining reproducible and effective digestion of a mixture of proteins. Looking at the % digestion of α-1-glycoprotein (classified as a Group C or “resistant” protein, in Figures 4 and 5), we find that the digestion efficiencies increased with digestion time for all treatments but reached widely differing values (from 6% with DOC/TFE to 100% with SDS). In contrast, transthyretin (another Group C protein) shows very little effect of digestion time but a large effect of the choice of denaturant (ranging from 2% to 100% efficiency).
The optimum digestion conditions for each protein can be found in Supporting Information Figure 1. However, the target protein may not always be known, as in shotgun proteomics or biomarker discovery, where the optimum method for a large group of proteins must be selected. Figure 6A–C shows the frequency distribution of digestion efficiencies (% completion) of the different treatments, for each digestion time. Looking at Figure 6A and B, the frequency distributions of digestion efficiencies shift to higher values (higher % completion) as one moves from 4 to 9 h digestion times. When one compares the distribution of % completion at 9 and 16 h digestion times (Figure 6B and C), the efficiencies for the combined treatments (DOC/MeOH and DOC/TFE) actually shift to lower % efficiencies. Comparing DOC (dark blue line) and SDS (pink line) at the three digestion times shows an increased % completion from 4 to 9 h but not much change between 9 and 16 h, whereas urea (yellow line) was not as efficient as DOC or SDS at any of the digestion times.
The comparison of the different treatments with respect to reproducibility (% errors), at the three different digestion times, is shown in Figure 6D–F. In general, increasing the digestion time from 4 to 9 h (Figure 6D and E) shifts the frequency distribution of % errors to lower values, reflecting an increase in reproducibility, probably an indication that digestion was not complete at the shorter digestion times. The shift from 9 to 16 h digestion time (Figure 6F) does not have as much effect on the frequency distributions as did the change from 4 to 9 h. The longer digestion times did reduce the % errors for the “outliers”; for example, the highest % errors for urea were 68%, 27%, and 18% at 4, 9, and 16 h, respectively.
Examining the average digestion efficiencies (Figure 7A), SDS and DOC show the best overall efficiencies at the 9 and 16 h time points, while the “combinations” of DOC/TFE or DOC/MeOH may show some promise at shorter digestion times. However, in terms of % error, these combined treatments show higher % errors than either SDS or DOC. When both digestion efficiency and % error are considered, DOC is the best overall method (at 9 and 16 h digestion times), where it shows a higher digestion efficiency and a lower % error than SDS. Under the digestion conditions used, urea shows the lowest digestion efficiency and one of the highest average % errors. Therefore, considering both digestion efficiency and reproducibility, DOC appears to be the best method for general protein digestion.
Interestingly, SDS, DOC, and urea appear to reach a “plateau” in terms of the average digestion efficiencies (~80%) and do not show much change between 4 and 16 h. The % errors for SDS and DOC, however, are reduced with increasing digestion time, with DOC showing the lowest % error (and therefore the highest reproducibility) at both the 9 and 16 h digestion times.
The recommendation of DOC as a denaturant is consistent with the investigation of Lin et al.17 regarding the use of DOC, in which they found that DOC aided solubilization and led to improvements in tryptic digestion, particularly for proteolytically resistant proteins. In contrast to the Lin study, however, we did not observe any deleterious effect of longer digestion times on the average DOC digestion efficiency or reproducibility, although we did observe this phenomenon for a few proteins. In our study, comparing the 4, 9, and 16 h DOC digestion times, the longer digestion times produced higher average reproducibilities (±5.8%, ±4.7%, and ±4.3% errors, respectively).
Our goal in studying the different digestion protocols was to determine the conditions under which the highest possible percentage of the protein has been digested in the most reproducible way, in other words, where the production of tryptic peptides has reached its highest plateau. Although this can be achieved for almost every protein, we found that the “optimum achievable” digestion conditions are protein-dependent, and there is no one common procedure that is best for all of the proteins.
Even in this relatively small set of proteins, wide variations in digestion performance were observed. For 31 of the 45 proteins, “ideal” conditions could be achieved. For these proteins, reproducible and stable levels of tryptic digestion products (as determined by MRM peak area ratios) were obtained after 9 h of tryptic digestion, and the peptide MRM ratios showed no further increase or decrease for up to 16 h at 37 °C. For 25 of these 31 proteins, the maximum MRM ratios were observed after only 4 h of tryptic digestion. For the remaining 14 proteins, however, either the signal did not reach a plateau during this 16 h time period, or the peptide signals reached a maximum and then decreased. For these “non-ideal” proteins, consistent digestion times are required in order to achieve reproducible results.
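The distinction between “ideal” and “non-ideal” proteins can be illustrated with a simple rule applied to the average MRM ratios at the three digestion times. The sketch below is only an illustration: the 10% tolerance, the function name, and the category labels are assumptions made for the example, not criteria stated in the study.

```python
def classify_profile(r4, r9, r16, tol=0.10):
    """Label a digestion profile from mean ratios at 4, 9, and 16 h.

    Two time points count as "unchanged" when they differ by no more than
    tol (a hypothetical 10%) of the larger value. Ratios must be positive.
    """
    def changed(a, b):
        return abs(b - a) > tol * max(a, b)

    if not changed(r9, r16):
        # Stable between 9 and 16 h: an "ideal" protein.
        return "plateau by 9 h" if changed(r4, r9) else "plateau by 4 h"
    # Still moving after 9 h: a "non-ideal" protein.
    return "still increasing" if r16 > r9 else "decreasing after maximum"

# Invented normalized ratios for four profile shapes.
print(classify_profile(1.00, 1.00, 1.00))
print(classify_profile(0.60, 1.00, 0.98))
print(classify_profile(0.30, 0.60, 1.00))
print(classify_profile(0.80, 1.00, 0.70))
```

For proteins falling in the last two categories, the measured signal depends on when the digestion is stopped, which is why consistent digestion times matter for them.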
Improvement in digestion efficiencies was achieved with several proteins by altering denaturation conditions. Although there is no single “best” digestion method for all 45 human plasma proteins, one must take into account the analyte signal intensity produced and the digestion time required for maximum digestion, as well as whether consistent digestion performance across all 45 analytes is needed for a particular application. Although one single digestion protocol may never be 100% efficient for all proteins, protocols can be selected and optimized for proteins of interest in order to ensure reproducibility and optimal detection. For clinical applications involving detection of biomarker proteins, digestion protocols should be selected specific to the target proteins of interest.
It is also worth noting that different protocols can yield different steady state conditions, in other words, the digestion reaches “completion”, but the digestion is still not “complete” (Supporting Information Figure 1). Transthyretin and plasma retinol binding protein are good examples of this (Figure 2A–C). Twenty-four proteins reached a steady state with nearly all of the digestion protocols (Supporting Information Figure 1). The proteotypic peptides from 21 proteins do not reach a steady state within the 24 h digestion times with most of the digestion conditions used. The proteotypic peptides from 10 proteins continued to increase with increasing digestion time and never reached a plateau under most of the digestion conditions studied. The proteotypic peptides from 5 proteins decreased with increasing digestion time under most digestion conditions, and there were 6 proteins whose digestion was so treatment-dependent that no generalization could be made. For the 3 non-steady-state groups of proteins, care must be taken to precisely control the digestion times in order to achieve reproducible digestion for comparative proteomics studies. This is another advantage of methods performing absolute quantitation using SIS peptides. On the basis of the relative peak areas of the SIS and native peptides, it would be possible to create calibration curves in order to correct for less than 100% digestion of the target protein.
Ultimately, using absolute quantitation methods, we have found that there are digestion protocols better than urea. Tryptic digestion in solutions containing urea routinely showed lower digestion efficiencies than methods using SDS or DOC, achieving maximum peptide production for only 10 out of 45 proteins (Figure 3). Both SDS and DOC have been shown to give higher tryptic peptide production, which in turn leads to increased signal intensities and therefore a more sensitive MRM assay. However, SDS contaminates MS instrumentation, interferes with chromatographic resolution, and is difficult to completely remove from sample digests. With a 9 h digestion, DOC produced one of the highest average digestion efficiencies (~80%), with the highest average reproducibility (<5% error). We therefore conclude that the best method for untargeted protein denaturation is DOC with a 9 h digestion time. The DOC denaturation protocol is also simple and amenable to automation. However, it should be stressed that, in order to maximize reproducibility between analytical runs, sample preparation must still be carefully defined and rigorous standardized operating protocols must be used.
We would like to thank Genome Canada and Genome BC for providing funding for the proteomics platform facility. This work was supported in part by National Institutes of Health grant 1U24 CA126476 (PI: Steven A. Carr, Broad Institute of Harvard and MIT) as part of the NCI's Clinical Proteomic Technologies Assessment in Cancer Program.