Methods Mol Biol. Author manuscript; available in PMC 2013 September 1.
Published in final edited form as:
PMCID: PMC3758905
NIHMSID: NIHMS500306

Protein Quantitation Using Mass Spectrometry

Abstract

Mass spectrometry is a method of choice for quantifying low-abundance proteins and peptides in many biological studies. Here, we describe a range of computational aspects of protein and peptide quantitation, including methods for finding and integrating mass spectrometric peptide peaks, and detecting interference to obtain a robust measure of the amount of proteins present in samples.

Keywords: Proteomics, Quantitation, Proteins, Peptides, Mass spectrometry

1. Introduction

Mass spectrometry (MS)-based quantitative proteomics has been applied to solve a wide variety of biological problems, and several MS-based workflows have been developed for protein and peptide quantitation (Fig. 1). In mass spectrometric quantitation methods it is usually assumed that the measured signal has a linear dependence on the amount of material in the sample for the entire range of amounts being studied. A prerequisite for accurate quantitation is that unwanted experimental variations in sample extraction, preparation, and analysis be minimized, and it is therefore critical that each step in the workflow is optimized for reproducibility.

Fig. 1
Workflows for mass spectrometry-based protein and peptide quantitation. (a) Metabolic labeling (1, 2). (b) Protein labeling (4). (c) Chimeric recombinant protein labeling (8, 9). (d) Peptide labeling (4, 5). (e) Isobaric peptide labeling (7). (f) Synthetic ...

One way of optimizing the reproducibility is to label the samples with stable isotopes, mix them together, and perform the subsequent sample-handling steps on the mixed sample. The earlier in the workflow that the stable isotope label is introduced and the samples are mixed, the smaller the effect of variations in sample handling. Metabolic labeling (1, 2) provides the earliest possible introduction of stable isotope labels into the sample (Fig. 1a). Here, labels are introduced as isotopically distinct metabolic precursors, and the samples can be mixed before all subsequent steps in the workflow. It is important to monitor the level of incorporation of the label; this can be done, for example, by using two heavy labels that are incorporated into the samples with equal efficiency (3). In cases when metabolic labeling is not feasible, the stable isotope labels can also be introduced later in the workflow (4–9) by heavy isotope labeling of proteins (Fig. 1b, c) or peptides (Fig. 1d–f). In general, stable isotope labels need to be designed carefully in order to prevent introducing systematic errors caused by dissimilar behavior of the compounds with different labels. For example, it has been observed that using hydrogen/deuterium substitution in the heavy label can affect the retention time of the labeled peptides, while 12C/13C substitution does not have any observable effect on the retention time (10).

Label-free methods (11–13) for quantitation are often used when the introduction of stable isotopes is impractical (e.g., in many animal studies) or the cost is prohibitive (e.g., in biomarker studies where a relatively large number of samples need to be analyzed). Three label-free quantitation workflows are shown in Fig. 1g–i. In these workflows the different samples are analyzed separately and it is therefore critical that each step of the workflow is carefully optimized for reproducibility. In label-free quantitation workflows, the peptide ion peaks are usually integrated and used as a measure of quantity. This allows the quantity of proteins and peptides to be compared in different samples (Fig. 1g), or the absolute quantity can be calculated using a standard curve (Fig. 1h). The peptide fragment ions can also be used for quantitation by integrating one or more of their peaks (Fig. 1i) as, for example, in Multiple Reaction Monitoring (MRM) (14). Using fragment ions for quantitation provides increased specificity because, in addition to requiring that the mass of the precursor ion be close to its predicted mass, the masses of the fragment ions are also required to be correct. Because peptides fragment in a sequence-specific manner, additional specificity can be gained by requiring that the relative intensities of the fragment ions do not deviate from the expected intensities. Alternative methods for quantitation using fragment mass spectra do not integrate peaks but are based on the results of searching protein sequence collections (see Note 1).

Currently, there are several software packages available for analysis of data from these different workflows where the quantitation is done by integrating peaks of ions that correspond to peptides or their fragments (see Note 2 for a few examples). Here, we describe how the mass spectra are processed to allow for finding the peptide peaks, detecting interference, and integrating the peaks to obtain a measure of the amount of material present in the samples.

2. Methods

Step 1: Detecting peptide peaks

Peptide peaks of interest for quantitation may range between smooth peaks with a large signal-to-noise ratio and noisy peaks that are barely above the background. The width of these peaks is, however, characteristic of the resolution of the mass spectrometer, the data acquisition parameters used, as well as the mass-to-charge ratio (m/z) of the peptide. Therefore, peaks can readily be detected by scanning the mass spectra for local maxima of the expected width (see Note 3). In addition, peptides are not observed as a single peak in mass spectrometry, but as a cluster of peaks, because stable heavy isotopes are present in small amounts in nature (e.g., 1.11% 13C) and each peptide contains many carbon atoms. The relative intensities of the peaks in these isotope clusters are characteristic of the atomic composition of the peptides and they are strongly dependent on the peptide mass (Fig. 2a–c, see Note 4).
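The scan for local maxima of the expected width can be sketched in a few lines of Python. This is a minimal illustration of the windowed-sum approach outlined in Note 3, not the implementation used by any particular software package; the function name, the `min_sum` threshold, and the toy spectrum are assumptions made for the example.

```python
def detect_peaks(intensity, width, min_sum=0.0):
    """Return indices of local maxima of the windowed intensity sum S(l).

    `width` is the expected peak width in data points (assumed constant
    here, although in practice it varies with m/z).
    """
    n = len(intensity)
    half = width // 2
    # S(l): sum of the intensities in a window of the expected width centered at l
    s = [sum(intensity[max(0, l - half):min(n, l + half + 1)]) for l in range(n)]
    peaks = []
    for l in range(1, n - 1):
        # A strict rise on the left and no rise on the right marks a local maximum
        if s[l] > s[l - 1] and s[l] >= s[l + 1] and s[l] > min_sum:
            peaks.append(l)
    return peaks

# Toy spectrum with two peaks on top of low-level noise
spectrum = [0, 1, 0, 2, 9, 20, 10, 1, 0, 1, 0, 0, 3, 12, 4, 0, 1, 0]
print(detect_peaks(spectrum, width=3))  # prints [5, 13]
```

Summing over the expected width suppresses single-point noise spikes, which is why the isolated low-intensity points in the toy spectrum are not reported as peaks.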

Fig. 2
Isotope distributions of peptides. (a–c) The isotope distribution of peptides is strongly dependent on the peptide mass (see Note 4). (d–g) Examples of peptide isotope distributions observed by LC-MS with different levels of interference ...

A majority of quantitation experiments are performed by coupling liquid chromatography with mass spectrometry, which introduces a retention time dimension. During these experiments, usually the same peptide is observed during several adjacent time points (Fig. 2d–g) with highly abundant peptides typically being observed over larger time windows than low-abundance peptides. But even with separation in both m/z and retention time, it is not uncommon to have unwanted interference between peaks from different peptides (Fig. 2e, g).
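The mass dependence of the isotope clusters described in this step can be illustrated with the first-order estimate from Note 4, which considers only 13C and treats the number of 13C atoms in a peptide as binomially distributed. The Python sketch below is an illustration under that assumption; the function name and the two carbon counts are hypothetical values chosen for the example.

```python
from math import comb

P_13C = 0.0111  # natural abundance of 13C quoted in the text (1.11%)

def c13_isotope_distribution(n_carbons, n_peaks=5, p=P_13C):
    """First-order relative intensities T_m for m = 0 .. n_peaks - 1.

    T_m is the binomial probability of finding m 13C atoms among
    n_carbons carbon atoms; other heavy isotopes are ignored.
    """
    return [comb(n_carbons, m) * p**m * (1 - p) ** (n_carbons - m)
            for m in range(n_peaks)]

small = c13_isotope_distribution(50)    # hypothetical peptide with 50 carbon atoms
large = c13_isotope_distribution(150)   # hypothetical peptide with 150 carbon atoms
for name, dist in (("50 C", small), ("150 C", large)):
    print(name, [round(t, 3) for t in dist])
```

For the smaller peptide the monoisotopic peak is the most intense, whereas for the larger one the peak containing a single 13C is higher than the monoisotopic peak.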

Step 2: Detecting interference

The following characteristics of peptide peaks can be used as filters to differentiate them from interfering and non-peptide peaks: (1) the width of individual peaks in m/z and retention time, (2) the intensity distribution of the isotope clusters, and (3) the measured peptide m/z. These characteristics are shown in Fig. 3 for two peptides. The width of individual peaks as a function of m/z is highly characteristic of the instrument parameters with very little variation and therefore a narrow peak width filter can be used. The width of individual peaks as a function of retention time (Fig. 3a–c, j–l) shows larger variation. This variation is mainly dependent on the peak intensity and the elution time, although strong peptide sequence-dependent variation can also be observed, and therefore a wider filter must be applied. High-accuracy measurement of peptide mass is a sensitive and selective filter that is highly reproducible even at the tails of the peak where the intensity is low (Fig. 3g–i, p–r). The shape of the isotope distribution is also a sensitive and selective filter that can be used to detect interference from other peaks (Fig. 3d–f, m–o). A convenient measure of the similarity of isotope distributions is the dot product (see Note 5) between them (Fig. 3f, o). The dot product can be applied to compare sets containing any number of peaks, for example, to detect interferences when a set of fragment ions is monitored in an MRM experiment. In the example shown in Fig. 3, dot product analysis of the chromatograms shown in the panels on the right shows that only the first isotope cluster corresponds to the peptide of sequence YVLTQPPSVSVAPGQTAR, while the second and third peaks are interfering peaks from peptides whose first three isotope peaks have similar m/z values but different relative intensities.
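The dot product filter can be sketched as follows. This is a minimal Python illustration of the normalized dot product described in Note 5; the theoretical intensities, the measured intensities, and the 0.95 cutoff are made-up values for the example, and a suitable cutoff would have to be chosen for each instrument and workflow.

```python
import math

def normalized_dot_product(measured, theoretical):
    """Cosine similarity between measured and theoretical intensity vectors."""
    if len(measured) != len(theoretical):
        raise ValueError("intensity vectors must have the same length")
    numerator = sum(i * t for i, t in zip(measured, theoretical))
    denominator = math.sqrt(sum(i * i for i in measured) *
                            sum(t * t for t in theoretical))
    # An all-zero vector carries no shape information; report no similarity
    return numerator / denominator if denominator > 0 else 0.0

theoretical = [0.45, 0.35, 0.20]      # hypothetical expected isotope pattern
clean = [900.0, 700.0, 400.0]         # same shape, different overall scale
interfered = [900.0, 1500.0, 400.0]   # second isotope peak inflated by another ion

CUTOFF = 0.95  # assumed threshold, for illustration only
for name, cluster in (("clean", clean), ("interfered", interfered)):
    score = normalized_dot_product(cluster, theoretical)
    print(name, round(score, 3), "accepted" if score >= CUTOFF else "rejected")
```

Because the measure is scale independent, the same function can be applied at every retention time point across a chromatographic peak, or to a set of fragment ion intensities in an MRM experiment.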

Fig. 3
Examples of the variation in mass measurements and the shape of isotope distributions. (a–i) Peptide with amino acid sequence: AADDTWEPFASGK; (j–r) Peptide with amino acid sequence: YVLTQPPSVSVAPGQTAR. Panels from top to bottom: The intensity ...

Step 3: Measuring peptide quantity

The quantity of peptides is measured by calculating the height or the area of the corresponding peaks in the ion chromatograms. Careful background subtraction is essential for accurate determination of both the height and the area of peaks (see Note 6). The advantage of using the height of the peak as the measure of quantity is the simplicity and robustness of its calculation (e.g., the average or median height for a few points around the centroid can be used). The peak height is a good measure of quantity if the width of the peak does not vary between samples and the signal is strong with little noise. In contrast, the peak area is a better measurement of quantity when there is substantial noise because many more data points are used, but it is much more sensitive to interference from other peaks because of the larger area in the m/z and retention time space that is used. The difficulty in calculating the peak area is in deciding where the peak ends and the background starts in both m/z and retention time dimensions. This determination can be very challenging for peaks with long tails. It is also important to use the same peak limits for a specific peptide in all samples. One way of circumventing the problem of finding the peak limits is to select a function and fit its parameters (e.g., centroid, width, skewness, etc.) to the peak and integrate the function. However, often it is not straightforward to find a function that fits well to all peaks in the spectrum.
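The two measures of quantity can be compared on a toy ion chromatogram. The Python sketch below assumes a constant background that has already been estimated, takes the median of a few points around the apex as the height, and uses trapezoidal integration between fixed peak limits for the area; the function names and the data are hypothetical.

```python
from statistics import median

def peak_height(intensity, apex, half_window=1, background=0.0):
    """Median intensity of a few points around the apex, minus background."""
    window = intensity[max(0, apex - half_window):apex + half_window + 1]
    return median(window) - background

def peak_area(times, intensity, start, end, background=0.0):
    """Trapezoidal area between point indices start and end (inclusive)."""
    area = 0.0
    for k in range(start, end):
        left = intensity[k] - background
        right = intensity[k + 1] - background
        area += 0.5 * (left + right) * (times[k + 1] - times[k])
    return area

times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
intensity = [2.0, 2.0, 6.0, 12.0, 6.0, 2.0, 2.0]   # peak on a background of 2

height = peak_height(intensity, apex=3, background=2.0)
area = peak_area(times, intensity, start=1, end=5, background=2.0)
print(height, area)  # prints 4.0 18.0
```

Passing the same `start` and `end` limits for a given peptide in every sample is one simple way of keeping the peak limits consistent across samples.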

Step 4: Matching peptides from different experiments

In many quantitation studies more than one experiment (i.e., replicates and/or multiple samples) is performed. This requires the matching of the peptides quantified in the different experiments. For successful matching of peptides, the retention time scales of all experiments have to be aligned, because there are always uncontrolled variations in the experimental conditions that affect the peptide retention times in a nonlinear manner. This alignment can be done by identifying peaks present in all experiments that can be used as landmarks. These peaks are matched across experiments using either their mass and retention time, or their identity as determined by tandem MS. A smooth function is fitted to the retention times of these landmarks and used for aligning the retention times of all quantified peptides. The residual difference in retention time for the landmarks can be used to estimate the uncertainty in the alignment.
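A retention time alignment based on landmarks might look like the Python sketch below. The choice of smooth function is an assumption: here the shift between a run and a reference is modeled with a Gaussian-kernel weighted average of the landmark shifts, but a spline or low-order polynomial could be used instead. The function names, the bandwidth, and the landmark times are hypothetical.

```python
import math

def fit_shift(ref_times, run_times, bandwidth):
    """Return a smooth function giving the (run - reference) shift at time t."""
    shifts = [run - ref for ref, run in zip(ref_times, run_times)]

    def shift_at(t):
        weights = [math.exp(-0.5 * ((t - run) / bandwidth) ** 2)
                   for run in run_times]
        total = sum(weights)
        if total == 0.0:
            # Far from every landmark: fall back to the nearest landmark's shift
            nearest = min(range(len(run_times)),
                          key=lambda i: abs(t - run_times[i]))
            return shifts[nearest]
        return sum(w * s for w, s in zip(weights, shifts)) / total

    return shift_at

def align(times, shift_at):
    """Map retention times of a run onto the reference time scale."""
    return [t - shift_at(t) for t in times]

ref_landmarks = [10.0, 20.0, 30.0, 40.0]   # landmark times in the reference run
run_landmarks = [10.5, 20.8, 31.2, 41.5]   # the same landmarks in another run

shift_at = fit_shift(ref_landmarks, run_landmarks, bandwidth=10.0)
residuals = [a - r for a, r in zip(align(run_landmarks, shift_at), ref_landmarks)]
rms_residual = math.sqrt(sum(d * d for d in residuals) / len(residuals))
print(round(rms_residual, 3))
```

The RMS of the landmark residuals gives a rough estimate of the alignment uncertainty, which can then be used to set the width of the retention time matching window.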

For some mass spectrometers, the m/z scale needs to be calibrated between experiments. This mass calibration can be done using the same landmarks as used for retention time alignment. When experiments are aligned in retention time and are mass calibrated, the quantified peptides can be matched within windows determined by the uncertainty in the retention time and the m/z.

The measured intensities of peptide peaks commonly vary from experiment to experiment in a global manner. It is therefore advisable to design experiments so that only a few of the quantified peptides have changes related to the hypothesis, and the majority of peptides change because of random variations in the experimental conditions. The randomly changing peptides can be used to normalize the overall intensity using either their median change in the intensity ratios or by fitting an intensity dependent smooth function to the measured intensity ratios.
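The median-based normalization can be sketched in Python as follows. The sketch assumes that most peptides do not change between the two experiments, so that the median log ratio reflects the global intensity difference; the function name and the ratios are made up for the example.

```python
import math
from statistics import median

def normalize_ratios(ratios):
    """Divide all intensity ratios by their median (computed on a log2 scale)."""
    log_ratios = [math.log2(r) for r in ratios]
    center = median(log_ratios)
    return [2 ** (lr - center) for lr in log_ratios]

# Most peptides show a global twofold offset; one peptide changes further
ratios = [1.9, 2.0, 2.1, 2.0, 8.0]
normalized = normalize_ratios(ratios)
print([round(r, 2) for r in normalized])  # prints [0.95, 1.0, 1.05, 1.0, 4.0]
```

Working on a logarithmic scale treats increases and decreases symmetrically. An intensity-dependent correction would replace the single median with a smooth function of intensity.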

Step 5: Calculating protein quantity

Protein quantity can be estimated by measuring peptide quantities. There are, however, several factors that can make the estimates of protein quantity uncertain even when highly accurate peptide quantities have been obtained. Because only a few peptides are typically measured for a given protein, these peptides might not be sufficient to define all isoforms of the protein that are present in the sample; that is, some of the peptide sequences might be shared with other proteins, making them suitable only for quantitating the group of proteins. A few peptides might also be modified, and the change in the amount of the modified and unmodified forms of the protein is often not the same. Despite these issues, a reasonable estimate of the protein quantity can often be obtained even when only a few of its peptides are quantified. When many peptides are observed for a given protein it can even be possible to calculate the variation in quantity of several isoforms.
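One simple way of turning peptide ratios into a protein-level estimate is sketched below in Python. The text does not prescribe a particular summary; taking the median ratio of the peptides that map to a single protein, and skipping shared peptides, is an assumption made for this illustration. The function name, protein labels, and ratios are hypothetical.

```python
from statistics import median

def protein_ratios(peptides):
    """Median ratio per protein, using only peptides unique to that protein.

    `peptides` is an iterable of (ratio, proteins) pairs, where `proteins`
    is the set of proteins whose sequence contains the peptide.
    """
    by_protein = {}
    for ratio, proteins in peptides:
        if len(proteins) != 1:
            continue  # shared peptide: informative only for the protein group
        (protein,) = proteins
        by_protein.setdefault(protein, []).append(ratio)
    return {protein: median(values) for protein, values in by_protein.items()}

peptides = [
    (2.0, {"A"}),
    (2.2, {"A"}),
    (1.8, {"A"}),
    (1.0, {"A", "B"}),   # shared between proteins A and B
    (0.5, {"B"}),
]
print(protein_ratios(peptides))
```

The median is less sensitive than the mean to a single peptide whose ratio is distorted by interference or by a modification.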

Step 6: Determination of the significance of the change in quantity

The significance of a measured change in quantity can be calculated if the distribution of random quantity changes (due to uncontrolled variation of experimental conditions) is known (Fig. 4a). This distribution can be obtained by analysis of technical and biological replicates. When the distribution of random quantity changes is known, the significance of a measured change in quantity can be calculated by integrating under the curve from the measured change in quantity to infinity and dividing this area by the area under the entire distribution of random changes. This value represents the probability that a change at least as large as the measured one would arise from purely random variations, that is, the probability of obtaining such a change when the null hypothesis (that the change is random) is true. The distribution of random quantity changes is strongly dependent on the experimental conditions and the workflow that is chosen. For example, for label-free quantitation the distribution of random quantity changes depends on the number of replicates obtained (Fig. 4b–g). It is important to design quantitation experiments to minimize the width of the distribution of random quantity changes to allow for detection of small nonrandom changes.
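When the distribution of random changes is available only as a finite set of replicate measurements, the integration can be approximated by counting. The Python sketch below is a one-sided empirical estimate under that assumption; the function name and the log-ratio values are made up for the example.

```python
def empirical_p_value(observed, null_changes):
    """Fraction of random changes that are at least as large as the observed one."""
    extreme = sum(1 for change in null_changes if change >= observed)
    return extreme / len(null_changes)

# Hypothetical log ratios measured between replicates (random variation only)
null_changes = [-0.3, -0.1, 0.0, 0.1, 0.2, -0.2, 0.05, 0.3, -0.05, 0.15]

p = empirical_p_value(0.25, null_changes)
print(p)  # prints 0.1
```

The smallest nonzero value such an estimate can take is one divided by the number of replicate measurements, so a large number of replicate comparisons is needed to assign small probabilities.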

Fig. 4
(a) The distribution that represents the null hypothesis, that is, that a given ratio is random. This distribution can be obtained by analysis of samples where only random variation is expected (technical and biological replicates). Then the significance ...

Acknowledgments

This work was supported by funding provided by the National Institutes of Health Grants RR00862, RR022220, NS050276, and CA126485, the Carl Trygger Foundation, and the Swedish Research Council.

Footnotes

1Alternative methods for quantitation search fragment mass spectra against a protein sequence collection and use the search results for quantitation. One method uses the number of different fragment mass spectra that identify a peptide as a measure of its quantity (15). Another method calculates a measure that is based on the fraction of the protein sequence that the identified peptides cover (16). However, these alternative methods, which are not based on peak integration, are generally less accurate when only a few fragment spectra or peptides are observed for a given protein because of the limited statistics. On the other hand, they are less sensitive to interference and can often be more robust.

2There are many software packages available for quantitation. A few examples of freely available software are described in refs. 17–24, including MaxQuant (18, 19), MSQuant (20), and Skyline (24).

3For a mass spectrum, let I(k) be the measured intensity at a point k, with 0 ≤ k < N, where N is the total number of points in the mass spectrum. The peaks are detected by calculating the sum, $S(l) = \sum_{k=l-w_l/2}^{l+w_l/2} I(k)$, over the expected peak width $w_l$ for each point, l, in the spectrum, and detecting local maxima in S(l). In cases where there is sufficient noise in the spectrum the signal-to-noise ratio is calculated by taking the ratio of the root mean square (RMS) of the intensities over the peak ($\sqrt{\frac{1}{w_l}\sum_{k=l-w_l/2}^{l+w_l/2} \left(I(k) - \hat{I}\right)^2}$, where Î is the mean intensity over the peak) and the RMS of the intensities in a nearby region where there are no peaks (see Note 6).

4Peptides are observed as clusters of peaks in mass spectrometry, because of the presence of small amounts of stable heavy isotopes in nature (e.g., 0.015% 2H, 1.11% 13C, 0.366% 15N, 0.038% 17O, 0.200% 18O, 0.75% 33S, 4.21% 34S, and 0.02% 36S). The intensities of the isotope distribution are calculated accurately by including all possible isotopes. The largest effect comes from 13C and a first order estimate of the relative peak intensities is given by $T_m = \binom{n}{m} p^m (1-p)^{n-m}$, where $T_m$ is the intensity of peak m in the distribution, m is the number of 13C, n the total number of carbon atoms in the peptide, and p is the probability for 13C (i.e., 1.11%). The isotope distribution of peptides is strongly dependent on the peptide mass because the number of atoms increases with mass, and therefore the probability increases for having one or more of the naturally occurring heavy isotopes.

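5The normalized dot product between the measured intensities, $I = (I_1, I_2, \ldots, I_n)$, and theoretical intensities, $T = (T_1, T_2, \ldots, T_n)$, of the isotope distribution is given by $\frac{\sum_{i=1}^{n} I_i T_i}{\sqrt{\sum_{i=1}^{n} I_i^2 \, \sum_{i=1}^{n} T_i^2}}$. The range of the normalized dot product is from −1 to 1 (from 0 to 1 when all intensities are non-negative, as is the case for background-free peak intensities). If the measured and theoretical intensities are identical, or differ only by an overall scale factor, the resulting dot product is 1, and any difference in shape between them will result in lower values of the dot product.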

6Low-frequency background can be removed by fitting a smooth curve to the regions of the mass spectrum where there are no peaks. This smoothing can, for example, be achieved by applying a very wide and strong smoothing function to the entire spectrum, which will result in a smooth function slightly higher than the background. Subsequently, points in the original spectrum that are far above this smooth curve are removed (i.e., the peaks). The smoothing procedure is repeated, this time without including the peaks, to produce a smooth function that will closely follow the background of the spectrum (25).
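The two-pass background estimate described in Note 6 can be sketched in Python as follows. The note does not specify the smoothing function; a wide moving average is assumed here, together with an absolute threshold for deciding which points lie far above the smooth curve. The function names, window size, and threshold are hypothetical.

```python
def smooth(values, keep, half_window, fallback):
    """Moving average over the points flagged in `keep`."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        kept = [values[k] for k in range(lo, hi) if keep[k]]
        # If every point in the window was removed as a peak, reuse the fallback
        out.append(sum(kept) / len(kept) if kept else fallback[i])
    return out

def estimate_background(intensity, half_window, threshold):
    """Smooth, drop points far above the smooth curve, then smooth again."""
    n = len(intensity)
    first_pass = smooth(intensity, [True] * n, half_window, fallback=intensity)
    keep = [intensity[k] - first_pass[k] <= threshold for k in range(n)]
    return smooth(intensity, keep, half_window, fallback=first_pass)

# Flat background of 1.0 with a single sharp peak
spectrum = [1.0] * 20
spectrum[10] = 101.0
background = estimate_background(spectrum, half_window=5, threshold=5.0)
print(background[10])  # prints 1.0
```

In the first pass the peak pulls the smooth curve above the true background near index 10; after the peak point is excluded, the second pass follows the background closely.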

References

1. Oda Y, Huang K, Cross FR, Cowburn D, Chait BT. Accurate quantitation of protein expression and site-specific phosphorylation. Proc Natl Acad Sci USA. 1999;96:6591–6.
2. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics. 2002;1:376–86.
3. Schwanhausser B, Gossen M, Dittmar G, Selbach M. Global analysis of cellular protein translation by pulsed SILAC. Proteomics. 2009;9:205–9.
4. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol. 1999;17:994–9.
5. Mirgorodskaya OA, Kozmin YP, Titov MI, Korner R, Sonksen CP, Roepstorff P. Quantitation of peptides and proteins by matrix-assisted laser desorption/ionization mass spectrometry using (18)O-labeled internal standards. Rapid Commun Mass Spectrom. 2000;14:1226–32.
6. Gerber SA, Rush J, Stemman O, Kirschner MW, Gygi SP. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc Natl Acad Sci USA. 2003;100:6940–5.
7. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, Khainovski N, Pillai S, Dey S, Daniels S, Purkayastha S, Juhasz P, Martin S, Bartlet-Jones M, He F, Jacobson A, Pappin DJ. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol Cell Proteomics. 2004;3:1154–69.
8. Beynon RJ, Doherty MK, Pratt JM, Gaskell SJ. Multiplexed absolute quantification in proteomics using artificial QCAT proteins of concatenated signature peptides. Nat Methods. 2005;2:587–9.
9. Anderson L, Hunter CL. Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol Cell Proteomics. 2006;5:573–88.
10. Yi EC, Li XJ, Cooke K, Lee H, Raught B, Page A, Aneliunas V, Hieter P, Goodlett DR, Aebersold R. Increased quantitative proteome coverage with (13)C/(12)C-based, acid-cleavable isotope-coded affinity tag reagent and modified data acquisition scheme. Proteomics. 2005;5:380–7.
11. Schulz-Knappe P, Zucht HD, Heine G, Jurgens M, Hess R, Schrader M. Peptidomics: the comprehensive analysis of peptides in complex biological mixtures. Comb Chem High Throughput Screen. 2001;4:207–17.
12. Wang W, Zhou H, Lin H, Roy S, Shaler TA, Hill LR, Norton S, Kumar P, Anderle M, Becker CH. Quantification of proteins and metabolites by mass spectrometry without isotopic labeling or spiked standards. Anal Chem. 2003;75:4818–26.
13. Wiener MC, Sachs JR, Deyanova EG, Yates NA. Differential mass spectrometry: a label-free LC-MS method for finding significant differences in complex peptide and protein mixtures. Anal Chem. 2004;76:6085–96.
14. Addona TA, Abbatiello SE, Schilling B, Skates SJ, Mani DR, Bunk DM, Spiegelman CH, Zimmerman LJ, Ham AJ, Keshishian H, Hall SC, Allen S, Blackman RK, Borchers CH, Buck C, Cardasis HL, Cusack MP, Dodder NG, Gibson BW, Held JM, Hiltke T, Jackson A, Johansen EB, Kinsinger CR, Li J, Mesri M, Neubert TA, Niles RK, Pulsipher TC, Ransohoff D, Rodriguez H, Rudnick PA, Smith D, Tabb DL, Tegeler TJ, Variyath AM, Vega-Montoto LJ, Wahlander A, Waldemarson S, Wang M, Whiteaker JR, Zhao L, Anderson NL, Fisher SJ, Liebler DC, Paulovich AG, Regnier FE, Tempst P, Carr SA. Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma. Nat Biotechnol. 2009;27:633–41.
15. Liu H, Sadygov RG, Yates JR 3rd. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem. 2004;76:4193–201.
16. Ishihama Y, Oda Y, Tabata T, Sato T, Nagasu T, Rappsilber J, Mann M. Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics. 2005;4:1265–72.
17. Li XJ, Zhang H, Ranish JA, Aebersold R. Automated statistical analysis of protein abundance ratios from data generated by stable-isotope dilution and tandem mass spectrometry. Anal Chem. 2003;75:6648–57.
18. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol. 2008;26:1367–72.
19. Cox J, Matic I, Hilger M, Nagaraj N, Selbach M, Olsen JV, Mann M. A practical guide to the MaxQuant computational platform for SILAC-based quantitative proteomics. Nat Protoc. 2009;4:698–705.
20. Mortensen P, Gouw JW, Olsen JV, Ong SE, Rigbolt KT, Bunkenborg J, Cox J, Foster LJ, Heck AJ, Blagoev B, Andersen JS, Mann M. MSQuant, an open source platform for mass spectrometry-based quantitative proteomics. J Proteome Res. 2010;9(1):393–403.
21. Khan Z, Bloom JS, Garcia BA, Singh M, Kruglyak L. Protein quantification across hundreds of experimental conditions. Proc Natl Acad Sci USA. 2009;106:15544–8.
22. Boehm AM, Putz S, Altenhofer D, Sickmann A, Falk M. Precise protein quantification based on peptide quantification using iTRAQ. BMC Bioinformatics. 2007;8:214.
23. Mason CJ, Therneau TM, Eckel-Passow JE, Johnson KL, Oberg AL, Olson JE, Nair KS, Muddiman DC, Bergen HR 3rd. A method for automatically interpreting mass spectra of 18O-labeled isotopic clusters. Mol Cell Proteomics. 2007;6:305–18.
24. MacLean B, Tomazela DM, Shulman N, Chambers M, Finney G, Frewen B, Kern R, Tabb DL, Liebler DC, Maccoss MJ. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics. 2010;26:966–8.
25. Woo EM, Fenyo D, Kwok BH, Funabiki H, Chait BT. Efficient identification of phosphorylation by mass spectrometric phosphopeptide fingerprinting. Anal Chem. 2008;80:2419–25.
26. Zhang G, Fenyo D, Neubert TA. Evaluation of the variation in sample preparation for comparative proteomics using stable isotope labeling by amino acids in cell culture. J Proteome Res. 2009;8:1285–92.