J Am Soc Mass Spectrom. Author manuscript; available in PMC 2012 April 1.
Published in final edited form as:
PMCID: PMC3145359
NIHMSID: NIHMS279735

Improving Proteome Coverage on a LTQ-Orbitrap Using Design of Experiments

Abstract

Design of experiments (DOE) was used to determine improved settings for a LTQ-Orbitrap XL to maximize proteome coverage of Saccharomyces cerevisiae. A total of nine instrument parameters were evaluated with the best values affording an increase of approximately 60% in proteome coverage. Utilizing JMP software, 2 DOE screening design tables were generated and used to specify parameter values for instrument methods. DOE 1, a fractional factorial design, required 32 methods fully resolving the investigation of six instrument parameters involving only half the time necessary for a full factorial design of the same resolution. It was advantageous to complete a full factorial design for the analysis of three additional instrument parameters. Measured with a maximum of 1% false discovery rate, protein groups, unique peptides, and spectral counts gauged instrument performance. Randomized triplicate nanoLC-LTQ-Orbitrap XL MS/MS analysis of the S. cerevisiae digest demonstrated that the following five parameters significantly influenced proteome coverage of the sample: (1) maximum ion trap ionization time; (2) monoisotopic precursor selection; (3) number of MS/MS events; (4) capillary temperature; and (5) tube lens voltage. Minimal influence on the proteome coverage was observed for the remaining four parameters (dynamic exclusion duration, resolving power, minimum count threshold to trigger a MS/MS event, and normalized collision energy). The DOE approach represents a time- and cost-effective method for empirically optimizing MS-based proteomics workflows including sample preparation, LC conditions, and multiple instrument platforms.

Keywords: LTQ-Orbitrap XL MS/MS instrument parameters, Design of experiments, Fractional-factorial design, Saccharomyces cerevisiae, Proteomics

Introduction

Mass spectrometry is at the intersection of several proteomics workflows and the diversity of its user base continues to expand. A consequence of the rising popularity and importance of mass spectrometry (MS) in biological research has been increasing demands on instrument time and performance. Because the time and cost of MS-based proteomics experiments are significant, the efficient optimization and set-up of instrument parameters remain of paramount importance when pushing the qualitative and quantitative limits of proteome analysis. Although data quality in a proteomics experiment can be defined multiple ways (e.g., proteome depth, biological relevance, quantitative accuracy), experimental outcomes are often dictated to varying degrees by common factors that include sample preparation, sample fractionation and separation, instrument settings, and post-acquisition bioinformatic platforms. Many MS proteomics laboratories, including our own, have a preferred method(s) for MS interrogation, but lack systematic studies to justify the overall optimization of the instrument method. Initiatives from the Human Proteome Organization (HUPO) Proteomics Standards Initiative and Clinical Proteomic Technology Assessment for Cancer (CPTAC) have focused on improving the reproducibility of proteomics measurements within and between laboratories by advocating the use of benchmark proteome standards. Recent studies from researchers directly involved with these initiatives have shown interesting results for measuring the performance of liquid chromatography (LC) and MS instrumentation [1], evaluating LC-MS interlaboratory performance [2], and reproducibility in generating protein identifications by LC-MS [3]. However, thus far these initiatives have not focused on MS instrument parameter optimization; rather, they have allowed each laboratory to use a “favorite method” or a standard operating protocol method. 
Investigations of high resolving power MS instrument parameters that demonstrate maximum instrument response are limited [4–7] and, furthermore, a large-scale investigation of MS instrument parameters for increased proteome coverage is absent. Herein, we report results from our efforts to systematically and efficiently explore nine MS instrument parameters on a LTQ-Orbitrap XL, gauging instrument performance using several proteomic metrics for the analysis of Saccharomyces cerevisiae.

An efficient method for investigating the effects of several MS parameters is fractional-factorial design (FracFD), which generates an experimental framework for evaluating several variables (>3) in less time than more conventional approaches such as full-factorial design (FullFD) [8, 9]. FullFD investigates one variable at a time and usually at several different values/levels to accomplish experimental objectives. A recent example of this approach was reported by Raji et al. [10] for optimizing the response of three synthetic peptides on two electrospray ionization (ESI) MS instruments. However, as the number of experimental variables increases, the process becomes more time consuming and costly. For example, an experiment with n variables/factors at two different levels requires 2^n experiments. As previously proposed by Riter et al. [9] as an effective tool for mass spectrometrists, FracFD provides a more efficient experimental approach or design of experiments (DOE) in which a carefully chosen subset of experiments is performed, simultaneously evaluating variables at two levels. These two levels, a maximum and minimum for continuous variables or two categories for categorical variables, are most beneficial to the DOE platform analysis if they are selected based upon experimental or literature reference. It is common to perform one-fourth to one-eighth the number of full-factorial experiments in a FracFD, significantly reducing the time of analysis. Recently our group successfully employed the FracFD DOE platform, reducing experimental time and cost for the development of an air amplifier to increase MS-ion abundance [11] and for the optimization of sample preparation conditions to improve the MS detection of glycans [12].
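The run counts discussed above can be illustrated with a minimal sketch. It builds a two-level full factorial in coded units (-1 for the minimum, +1 for the maximum) and a half fraction of it. The generator used here (last factor set to the product of all the others) is one common textbook choice and is an assumption; the text does not state which generator JMP selected, and the function names are hypothetical.

```python
from itertools import product
from math import prod


def full_factorial(n_factors):
    """All 2**n combinations of coded levels -1 (min) and +1 (max)."""
    return [list(run) for run in product((-1, 1), repeat=n_factors)]


def half_fraction(n_factors):
    """2**(n-1) runs: a full factorial in the first n-1 factors, with the
    last factor set to the product of the others (assumed generator)."""
    base = full_factorial(n_factors - 1)
    return [run + [prod(run)] for run in base]


full = full_factorial(6)   # every combination of six two-level factors
half = half_fraction(6)    # half as many runs for the same six factors
print(len(full), len(half))
```

In this sketch every factor still appears at each level in exactly half of the runs, and any two factor columns remain orthogonal, which is what allows main effects to be estimated from the reduced run list.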

In an effort to empirically justify the settings for several MS parameters in a standard shotgun LC-MS/MS experiment, we examined the responses of a total protein digest of S. cerevisiae as a function of nine LTQ-Orbitrap MS/MS instrument parameters in two DOE experiments. The proteomic metrics (i.e., responses) used to assess the significance of each parameter were: (1) total number of protein groups (one or more proteins identified with the same peptides and unable to be distinguished as unique); (2) unique peptides; and (3) spectral counts, and these offer quantitative feedback for the analysis of a tryptic digest of S. cerevisiae. The mass accuracy of the resultant peptides was not employed as a metric due to the outcome of database searching with different MS tolerances (±1–10 ppm). It was demonstrated that as the MS tolerance was increased, there was an initial gain in the number of proteins identified followed by very little variation in the number as the tolerance was opened (see Supplementary Figure 1). The first two responses demonstrate the proteome and protein coverage and consider database redundancy. Regarding protein quantification, label-free spectral counting affords a relative measurement of protein concentration by comparing the number of resultant MS/MS spectra from peptides associated with a specific protein [13–17]. The most advantageous instrument method would afford the highest number of total spectral counts with reproducibility maintained over S. cerevisiae replicate analyses. S. cerevisiae has a sufficiently complex and highly characterized proteome, and was the first organism with a complete annotated database of the complete proteome [18]. It is one of the most extensively analyzed organisms in proteomics research spanning the analysis of MS instrument technologies [19–26] to efforts evaluating global protein expression [24, 27–31]. While this investigation utilizes the entire S. cerevisiae proteome, we were only concerned with the relative changes in proteome coverage and the sensitivity of the measurements to detect change and reveal significant variables. The DOE method described herein represents a viable strategy for moving forward with establishing proteomics as a robust, reproducible, and translatable technique for researchers spanning multiple disciplines of biological research and technology development.

Experimental

Saccharomyces cerevisiae Sample Preparation

S. cerevisiae strain Y15696 (BY4742; MATα; his3Δ1; leu2Δ0; lys2Δ0; ura3Δ0; YIR034c::kanMX4), an auxotroph for lysine due to the lys1 gene deletion, was acquired from EuroScarf (Frankfurt, Germany). The experimental design analyzing the yeast sample is illustrated in Figure 1 and described here in more detail. The yeast was grown for 24 h in yeast peptone dextrose broth at 30 °C to exponential-phase. The culture was harvested by centrifugation at 5000 rpm for 10 min at 4 °C. The cell pellet was washed in 50 mL of 50 mM Tris-HCl (Sigma-Aldrich, St. Louis, MO, USA) and again subjected to centrifugation as described above. The yeast cell pellet was flash-frozen prior to lysis with mortar and pestle. The resulting powder was reconstituted in 50 mM Tris-HCl. Following centrifugation, the supernatant was collected and the cellular debris was discarded. A modified Bradford Assay (Coomassie Plus Assay) and a bicinchoninic acid assay, both products of Pierce (Thermo Scientific, Rockford, IL, USA), were used to approximate the total protein concentration. An in-solution tryptic digestion was completed on ~1 mg of protein and is briefly described here. Urea (Sigma-Aldrich) was added to the yeast protein solution such that the final concentration was 8 M. The denatured proteins were reduced by adding a 100 mM dithiothreitol (DTT) (BioRad, Hercules, CA, USA) solution to a final concentration of 5 mM followed by a 30 min incubation at 56 °C. The solution was then allowed to cool to room temperature, followed by alkylation by adding a 200 mM iodoacetamide (Sigma-Aldrich) solution to a final concentration of 20 mM and incubating for 30 min at room temperature in the dark. The reaction was quenched with 100 mM DTT for 30 min and then diluted with 50 mM Tris-HCl, such that the urea concentration was 2 M. Proteins were digested with trypsin (Sigma-Aldrich) at a 1:50 enzyme:substrate ratio, and the digestion was allowed to proceed overnight at 37 °C. Formic acid (Sigma-Aldrich) was added to the peptide solution to 1% of the volume. The sample was aliquoted into multiple identical fractions (by volume) and dried under reduced pressure prior to storage at −20 °C.

Figure 1
The experimental workflow investigating LTQ-Orbitrap MS/MS instrument parameters anticipating the increase of proteome coverage. JMP software afforded the generation of experiments for nanoLC-LTQ-Orbitrap MS/MS investigation of the tryptic digest of ...

NanoLC-LTQ-Orbitrap MS/MS Analysis

A nanoLC-1D (Eksigent Technologies, Dublin, CA, USA) was coupled to a LTQ-Orbitrap XL (Thermo Scientific, San Jose, CA, USA) via a continuous, vented column configuration described previously [32]. Both the trap and analytical columns were self-packed with Magic C18AQ stationary phase (5 μm particle, 200 Å pore) (Michrom Bioresources, Auburn, CA, USA) utilizing a pressurized cell. Mobile phase A and B were composed of water/acetonitrile/formic acid (98/2/0.2% and 2/98/0.2%, respectively). The solvents (Burdick and Jackson, Muskegon, MI, USA) were HPLC-grade and the formic acid (Sigma-Aldrich, St Louis, MO, USA) was MS-grade. Two μL of yeast digest (100 ng/μL in 50 mM Tris-HCl pH 8.0) was loaded onto the trap column followed by approximately 10 column washes with 2% B from Channel 1 flowing at 1.5 μL/min. The following gradient was applied at a flow-rate of 350 nL/min: 2% B (0–5 min), 2%–10% B (5–7 min), 10%–40% B (7–67 min), 40%–90% B (67–68 min), 90% B (68–78 min), 90%–2% B (78–80 min), 2% B (80–85 min). A new reconstituted sample was loaded every eight sample injections for analysis. Details of the LTQ-Orbitrap XL instrument settings and pertinent comparisons are provided in Results and Discussion (vide infra) and Supplementary Tables 1 and 2.
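For readers who want to script the gradient, the program above can be written as a list of (time, %B) breakpoints. The sketch below is illustrative only: it assumes linear ramps between the stated breakpoints, and the names GRADIENT and percent_b are hypothetical rather than part of any instrument software.

```python
# (time in min, % mobile phase B) breakpoints taken from the gradient in the text
GRADIENT = [(0, 2), (5, 2), (7, 10), (67, 40), (68, 90), (78, 90), (80, 2), (85, 2)]


def percent_b(t_min):
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-85 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Midpoint of the 60 min separation ramp (10% to 40% B between 7 and 67 min)
print(percent_b(37))
```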

Data Analysis

Shotgun proteomics data generated during this study were searched against a concatenated target-reverse S. cerevisiae database (Uniprot ver. 4932) created with Bos taurus trypsin sequence, and Homo sapiens keratin and keratin related proteins using Mascot Daemon (ver. 2.2.2, Matrix Science, Boston, MA, USA) to batch process files, Mascot Distiller (ver. 2.2.1.0, Matrix Science, Boston, MA, USA) to generate peak lists, and then Mascot (ver. 2.3.01, Matrix Science, Boston, MA, USA) to perform the searches. Carbamidomethyl (C) was set as a fixed modification and deamidation (N and Q) and oxidation (M) were set as variable modifications. Additional search settings included a maximum of 2 missed cleavages, peptide tolerance of ±5 ppm, and MS/MS tolerance of ±0.6 Da. Protein grouping, statistical filtering, and quantification (spectral counts) of the Mascot DAT files were accomplished using ProteoIQ (ver. 1.5.05, BioInquire, Athens, GA, USA), which utilizes a combination of Peptide/Protein Prophet [33, 34] and ProValT [35]. One ProteoIQ project was created for each DOE FracFD or FullFD method (32 projects for DOE 1 and 11 projects for DOE 2) and the data were filtered based on a maximum 1% protein false discovery rate (FDR).
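The idea behind filtering a target-reverse search to a maximum FDR can be sketched as follows. This is a generic illustration, not ProteoIQ's procedure, which the text does not detail: it assumes the simple estimator FDR ≈ (decoy hits) / (target hits) among everything scoring at or above a cutoff, and the function name and input format are hypothetical.

```python
def score_threshold_at_fdr(hits, max_fdr=0.01):
    """hits: iterable of (score, is_decoy) pairs.

    Walk down the score-ranked list and return the lowest score at which
    the accepted set still has estimated FDR (decoys / targets) <= max_fdr.
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy data: 50 target hits (scores 100..51), one decoy (50), two more targets
hits = [(100 - i, False) for i in range(50)] + [(50, True), (49, False), (48, False)]
print(score_threshold_at_fdr(hits, max_fdr=0.01))
```

With the toy data, the single reversed-sequence hit is enough to stop the 1% cutoff from extending below it, while a looser limit would admit the lower-scoring hits as well.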

The number of protein groups, total spectral counts, and unique peptides for each replicate as a function of the 9 LTQ-Orbitrap XL instrument parameters are shown in Supplementary Tables 3 and 4. These measurements (i.e., responses) were used to generate the outcome for the DOE screening data analysis in JMP 8.0.2 (SAS Institute, Inc., Cary, NC, USA) as illustrated in Figure 1. Half normal quantile probability plots, the complementary bar graphs, and statistical measurements afforded presentation of influencing factors.

Results and Discussion

Previous S. cerevisiae LTQ-Orbitrap MS/MS Analysis

S. cerevisiae has been used as a MS performance standard evaluating the performance of several laboratories with equivalent instrument platforms and bioinformatics [2]. Paulovich et al. [2] provides a reference S. cerevisiae dataset to the MS community for opportunity to evaluate the performance of LTQ and LTQ-Orbitrap MS/MS instruments with the S. cerevisiae NIST performance standard. The laboratories included in this study were requested to employ a “favorite” instrument method as well as a standard operating protocol with both applying a 2 h gradient for instrument performance analysis. While the instrument methods were not optimized, Paulovich et al. [2] describes that this investigation allows for laboratories to compare instrument performance and expand upon the development of optimized methods for analysis. Resultant data was processed with MyriMatch [36] as the bioinformatic platform and the absolute number of proteins and peptides identified were used as a measure of performance.

Although we requested but were unable to acquire the NIST performance standard for use in our LTQ-Orbitrap MS/MS instrument parameter investigation, their RAW data files were attainable through ProteomeCommons.org for data sharing. As a rough comparison of the analysis to our own, a dataset was randomly selected in which 120 ng of the yeast sample was loaded on the column (Orbi2_study8_-W080923_yeast_120_ft8_pc in triplicate analysis). Processing the RAW data with the more commonly employed Mascot bioinformatic platform combined with ProteoIQ, as described vide supra, resulted in 1088 protein groups, 7707 unique peptides identified, and 19,790 spectral counts. While this output exceeded our best results by approximately 2-fold, several differences are apparent between the studies, and instrument method deviations are detailed in Supplementary Table 2.

One of the more significant differences in experimental conditions in comparison to Paulovich et al. [2] is analytical separation. Paulovich et al. [2] employ a 2 h gradient whereas our methods employ a 1 h gradient. The reduced gradient length in our study is attributable to the nature and size of the DOE studies and the fact that we were primarily interested in relative changes in proteome coverage, not in setting new records in numbers of proteins identified. Gradient length is known to influence the peak capacity and consequently analyte separation and detection [37, 38]. An extended gradient decreases the probability of species overlapping, and therefore reduces the complexity of the analyte mixture entering the MS at any given time, supporting an increase in protein identifications. Second, a direct comparison of methods would require access to the NIST yeast standard.

DOE 1

Requiring only half the number of experiments and consequentially half the time (32 in triplicate versus 64 in triplicate), DOE 1 afforded the analysis of six factors (see Table 1) demonstrating effective variables by a FracFD platform. These six factors were of great interest to our group owing to our curiosity in MS data acquisition and in empirically demonstrating factors contributing to the data quality and quantity. Half normal quantile probability plots and the corresponding bar graphs were generated by JMP, affording a demonstration of influencing factors as a function of 3 responses (see Figure 2). It is clearly evidenced that the monoisotopic precursor selection (MIPS) function and the ion trap (IT) maximum ionization time, also known as maximum injection time, are significant variables; the large absolute value of the contrast and the Lenth t-ratio, and the almost zero p-value exhibit the influence of these parameters. The negative contrast values indicate that the first item specified for the categorical factor (on for MIPS) and the minimum value for the continuous factor (80 ms for IT maximum ionization time) afford the greatest influence in proteome coverage and spectral counts. MIPS affords the selection of only the monoisotopic peak, while excluding all other peaks in the same isotopic distribution, and as evidenced, significantly affects proteome coverage. As expected, IT maximum ionization time greatly influences the responses as the interplay between shorter maximum ionization times and the automatic gain control target (AGCTarget) allows for more MS/MS events to occur between precursor ion scans, and accordingly a greater number of available peptide sequences subjected for identification. Longer maximum IT ionization times may time out when little to no signal is present in the analysis, not reaching the AGCTarget and wasting time between the precursor ion scans. Consequently, with the shorter IT maximum ionization time favored, the number of MS/MS events, while considered not significant for number of proteins identified, was most effective when set to 8 events. As a result of a personal communication, the AGCTarget for both the ion trap and Orbitrap were not evaluated in this DOE investigation, but maintained at 8×10^3 and 1×10^6, respectively [39].
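The contrast and Lenth t-ratio quantities referred to above can be sketched in a few lines. This is a generic illustration of Lenth's pseudo standard error (PSE) for unreplicated two-level designs, not a reproduction of JMP's output: the contrast here is taken as the mean of (coded level × response), which is one common definition and an assumption about scaling, and the example contrast values are invented for illustration.

```python
from statistics import median


def contrasts(design, response):
    """One contrast per design column: mean of (coded level * response)."""
    n = len(design)
    return [sum(row[j] * y for row, y in zip(design, response)) / n
            for j in range(len(design[0]))]


def lenth_pse(c):
    """Lenth's pseudo standard error from a list of contrasts."""
    abs_c = [abs(x) for x in c]
    s0 = 1.5 * median(abs_c)                      # initial robust scale
    trimmed = [a for a in abs_c if a < 2.5 * s0]  # drop apparently active effects
    return 1.5 * median(trimmed)


def lenth_t_ratios(c):
    """Contrast divided by the PSE; |t| well above ~2 suggests an active factor."""
    pse = lenth_pse(c)
    return [x / pse for x in c]


# Invented contrasts: one large negative effect among small ones
example = [-40.0, 1.0, -2.0, 3.0, -1.5, 0.5, 2.5]
print(lenth_pse(example), lenth_t_ratios(example)[0])
```

With coded levels of -1 for the minimum and +1 for the maximum, a negative contrast means the response was higher at the minimum setting, which matches the reading of the negative contrasts given in the text.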

Figure 2
Half normal quantile probability plots and corresponding bar graphs for each response for the determination of significant variables for DOE 1 FracFD. The half normal quantile curves (blue line) in each plot represent the normal distribution, and data ...
Table 1
Six factors were included in DOE 1 for the analysis of improved proteome coverage by nanoLC-LTQ-Orbitrap MS/MS. The settings used for previous MS analysis are included followed by the minimum (Min.) and maximum (Max.) factor values included in the DOE ...

It is evidenced that dynamic exclusion (DE) duration influences the number of spectral counts [40], and can affect the proteome coverage of an investigation. More abundant peptides eluting off the column over a broader chromatographic peak will inherently have more opportunities for MS interrogation depending on the exclusion time period. DE duration, as displayed in Figure 2a and b, appears to be a significant variable; however, both the minimum and maximum values are favored depending on the response. The maximum factor value for DE duration, 180 s, generates an increased number of protein groups identified. Because the MS selects ions for interrogation by abundance, excluding ions for 180 s gives rise to MS interrogation of less abundant ions over a 3 min period versus that of a shorter time period and, consequently, a greater variety of species have the opportunity for MS/MS analysis. With spectral counts as the response (see Figure 2c), the minimum factor value, 30 s, affords a greater output because shorter exclusion periods allow highly abundant species to be re-sampled and place less demand on the interrogation of lower abundance species. Normalization efforts will facilitate direct comparisons in quantification, but it is important to acknowledge these results when investigating a sample with a large dynamic range.

Two factors with minimal if any contributions towards the responses, minimum count threshold and resolving power (RP) of the precursor survey scan, appear to fall closely to the limit of significance in response towards the number of unique peptides identified (Figure 2b). In the instance of minimum count threshold, the absolute value of the Lenth t-ratio is just outside the commonly significant value of two. Also, the individual p-value falls close to the 0.05 significance limit. This factor, minimum count threshold to trigger a MS/MS event, establishes the minimum amount of signal required for an ion to be selected for a MS/MS event. In principle, a larger value would instigate MS/MS interrogation of more abundant ions, potentially generating higher quality mass spectra. When deemed a significant variable (see Figure 2b), a value of 500 counts is most effective for the minimum count threshold; still, the factor does not greatly influence the proteome coverage or spectral counts. Yates and coworkers extensively evaluated the minimum count threshold and demonstrated similar results, generating no significant difference in the number of protein identifications at comparable threshold values [41]. The last factor, RP, did not significantly increase proteome coverage; however, the bar graphs suggest that a resolving power of 30,000 (FWHM) at m/z 400 may contribute to increased proteome coverage as opposed to 60,000 (FWHM). The instrument method from DOE 1 generating the best response data employed a RP of 30,000 (FWHM) at m/z 400 (see Table 2a). The maximum RP, and as a consequence the potential for increased mass accuracy, does not necessarily contribute to increased protein identifications, and Kim et al., who systematically evaluated resolving power in shotgun proteomic experiments, also demonstrated limited gain in protein identifications when comparing maximum RP [42].

Table 2
Factor and response values for DOE 1 method generating the most protein identifications with ProteoIQ. (a) As demonstrated through the half normal plots and bar graphs for DOE 1 (Figure 2), this instrument method contains the appropriate settings for ...

The resolution of FracFD, or degree of confounding, is specified prior to creation of the screening design table influencing the number of experiments required for analysis and the possible number of aliasing effects. For our purposes we selected a resolution of five, which afforded no confounding effects, equivalent to the resolution of a FullFD, though requiring only half the number of experiments and, hence, half the time. This type of resolution affords the realization of significance of each variable on the data whether or not the variables interact with each other [9]. As displayed in Figure 2, confounding factors are specified in the analysis and recognized as significant variables. However, due to the resolution specified for DOE 1, confounding factors can be confirmed as significant or insignificant based on the results of the individual factors. This confounding provides a glimpse of possible significance of interacting factors of DOE 1 analysis had a resolution of five not been performed. Most half normal plots in Figure 2 contain IT maximum ionization time confounded with MIPS, but it is clear that each individual factor and not just the confounding nature cause the variables to be significant towards the response.

While six instrument parameters were included in DOE 1, the setting for a seventh instrument parameter was also suggested from the investigation. ProteoIQ affords output of peptide discriminant value distributions gauging probability [33] as illustrated in Figure 3. Figure 3 represents discriminant value plots for the DOE 1 instrument method producing the most protein identifications (see Tables 2a and b for factor and response data), and Supplementary Figure 2 represents discriminant value plots for the DOE 1 instrument method producing the least protein identifications (see Tables 3a and b for factor and response data). Both figures conclude that peptides in the 2+ charge-state yield more positive peptide identifications versus 3+ or 4+ charge-state peptides. As mentioned in Supplementary Table 1, 1+ and unassigned charge-states are rejected from MS/MS analysis. Attributable to discriminant value distribution plots, peptides with charge-states >3+ appear to be consuming available MS/MS interrogation without giving rise to peptide identification, and accordingly charge-state 4+ may also be rejected. Although Figure 3 and Supplementary Figure 2 suggest that 2+ charge-state peptides are predominately observed and identified, further investigations are necessary to evidence if only 2+ charge-state peptides should be selected opposed to 2+ and 3+ peptides. Overall, the instrument method resulting in the most protein identifications for DOE 1 (Table 2a) resulted in 490 confident (maximum 1% FDR) protein groups and 3187 unique peptides from triplicate analysis (see Table 2b), while the method resulting in the least protein identifications for DOE 1 (Table 3a) resulted in 238 protein groups and 1694 unique peptides (see Table 3b). Evaluating LTQ-Orbitrap MS/MS instrument parameters afforded the improvement in instrument response by roughly 2-fold.
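As a quick arithmetic check of the "roughly 2-fold" figure, the ratios between the best and worst DOE 1 methods can be computed directly from the counts quoted above. The dictionary names are hypothetical; the numbers are those reported in the text.

```python
# Counts quoted in the text for the best and worst DOE 1 instrument methods
best = {"protein_groups": 490, "unique_peptides": 3187}
worst = {"protein_groups": 238, "unique_peptides": 1694}

# Fold change of best over worst for each response
fold = {name: best[name] / worst[name] for name in best}
for name, value in fold.items():
    print(f"{name}: {value:.2f}-fold")
```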

Figure 3
ProteoIQ output of peptide discriminant value distributions for the instrument method that generated the most protein identifications. The number of peptides is plotted versus the discriminant value or the measurement of peptide assignment accuracy. The ...
Table 3
Factor and response values for DOE 1 method generating the least protein identifications with ProteoIQ. (a) Comparing to the instrument method that generated the most protein identifications, the opposite parameter settings yielded the least response. ...

DOE 2

An additional DOE investigation (DOE 2) was initiated following DOE 1 data processing in order to further the investigation of instrument parameters. DOE 2 evaluated normalized collision energy (NCE), tube lens voltage, and capillary temperature (see Table 4) for increased proteome coverage using the S. cerevisiae tryptic digest. As described vide infra, our curiosity in the interplay of these factors with the resultant number of proteins identified led to the selection of parameters. NCE provides a level of energy for peptide fragmentation in the LTQ, and it is crucial to select a value in which the species is sufficiently fragmented; too little NCE will result in no fragmentation, while too much NCE may over-fragment the peptide, limiting sequence information and complicating the MS/MS spectra through generation of w_n and d_n side chain fragment ions and internal fragment ions. Within the Xcalibur software, the default setting for NCE is 35; however, Paulovich et al. [2] employed a NCE of 28, and the limited to no empirical evidence for this selection contributed to our curiosity in altering the NCE. The tube lens voltage directs ions into the ion guide, which is offset from the orifice of the detector. The redirection of ions prevents neutral species from accumulating in the MS. This voltage may be a function of the molecular weight and charge, as our group has assessed a tube lens value of 120 V for N-linked glycans (unpublished data) and variation of tube lens voltage for intact proteins influenced by the molecular mass (unpublished data). The capillary temperature influences the desolvation and other associated properties of the electrospray droplets as they travel from the ESI emitter towards the orifice of the MS and form gas-phase ions.

Table 4
Three factors were included in DOE 2 for a more complete nanoLC-LTQ-Orbitrap MS/MS parameter analysis. Similar to Table 1, the previous instrument setting, minimum and maximum factor values, and motivation driving the investigation are included. Also, ...

The equivalent motivation and experimental workflow was followed as illustrated in Figure 1, and the parameters producing the greatest number of protein identifications from DOE 1 were used (see Table 2 and Supplementary Table 1). Whereas a resolution of five for DOE 1 was accomplished in half the number of experiments as a FullFD, DOE 2 required a FullFD to accomplish the same resolution. To make for a more complete experimental design, three additional experiments were included in DOE 2 reflecting the instrument parameter settings employed for the best responses from DOE 1, as well as midpoints for tube lens (120 V) and capillary temperature (187 °C) such that time permitted (see Table 4).

Eleven experiments in triplicate were completed, resulting in half normal probability plots and bar graphs produced by JMP. Figure 4 exhibits that the tube lens voltage is a significant variable for all responses (Figure 4a, b, and c) and capillary temperature is a significant variable for two responses (Figure 4a and b). The bar graphs reveal that the minimum values for tube lens voltage (100 V) and capillary temperature (150 °C) are preferred for increased response. Tube lens voltage contributes to the identity of the species allowed to be directed towards the MS detector, and this analysis reveals that a lower voltage than previously employed is favored. Capillary temperature alters the droplet desolvation rate, and the minimum temperature favored suggests a rate limiting the thermal degradation and charge stripping that would exist if the temperature were too high. As with DOE 1, confounding factors are represented as significant factors; however, attributable to the resolution, any aliasing can be evaluated based on the individual factors. The instrument method investigated in DOE 2 providing the most protein identifications is presented in Table 5a while the response data from the triplicate analysis is presented in Table 5b. As demonstrated in the systematic characterization of LTQ-Orbitrap MS/MS instrument parameters in DOE 1, DOE 2 resulted in increased responses. A total of 570 protein groups were confidently identified in DOE 2, which is an increase of 80 protein groups over the best results from DOE 1, affording roughly 20% more proteome coverage.

Figure 4
Half normal quantile probability plots and corresponding bar graphs for each response demonstrating the significance of each factor in DOE 2 FullFD organized equivalent to Figure 2. Those factors that deviate from the half normal quantile curve (blue ...
Table 5
Factor and response values for DOE 2 instrument method generating the greatest number of protein identifications. (a) Based on the half normal plots and bar graphs for DOE 2 (Figure 4), the capillary temperature and tube lens in this method reflect the ...

Conclusions

The DOE platform afforded a systematic approach to investigating a large experimental space for nine LTQ-Orbitrap MS/MS instrument parameters. In DOE 1, the settings with significant influence on most instrument responses were a maximum ion trap (IT) ionization time of 80 ms, monoisotopic precursor selection (MIPS) on, and 8 MS/MS events. In DOE 2, a capillary temperature of 175 °C and a tube lens value of 120 V afforded the best instrument response. Overall, improvements to the instrument method afforded 570 protein groups with the best DOE 2 method versus 238 protein groups with the worst DOE 1 method. Proteome coverage increased approximately 60% while performing approximately 75% of the total experiments required for a FullFD.

These results show that LTQ-Orbitrap MS/MS parameters influence the resulting data (see Supplementary Table 1 for detailed parameter settings). Significant improvement was realized from this evaluation, although optimization for individual instruments and conditions may still be required. The objective of these initial DOE studies was to demonstrate the significance of each variable for improved proteome analysis. Although either the minimum or the maximum value of each parameter was identified as an improvement here, the preferred value may depend on the conditions and the type of high resolution MS, because each MS instrument is unique. This investigation nonetheless provides a proven foundation from which to begin optimization for increased proteome coverage. Modifications to the nanoLC and bioinformatic platforms also merit investigation and may contribute to increased proteome coverage.

Acknowledgments

The authors acknowledge the financial support of the National Institutes of Health (grant 5T32GM00-8776-08), which supports G.L.A. in the North Carolina State University Molecular Biotechnology Training Program, the National Science Foundation (grant MCB-0918611), and the W. M. Keck Foundation. The authors also thank Hunter Walker and Tim Collier for their helpful discussions.

Footnotes

Supporting Information Available: The data associated with this manuscript may be downloaded from the ProteomeCommons.org Tranche network using the following hash:

44kPwxvjy9zCSFSirSGCbFUQGMRyjztmv0a547DeJ5g+vNN2OK4lUPGloA/LhxsLLOfmPuVkiMROcijpdRCA-h7AW8hcAAAAAAAAFpw==

The hash may be used to prove exactly what files were published as part of this manuscript’s dataset, and the hash may also be used to check that the data has not changed since publication.

Electronic supplementary material: The online version of this article (doi:10.1007/s13361-011-0075-2) contains supplementary material, which is available to authorized users.

References

1. Rudnick PA, Clauser KR, Kilpatrick LE, Tchekhovskoi DV, Neta P, Blonder N, Billheimer DD, Blackman RK, Bunk DM, Cardasis HL, Ham AJL, Jaffe JD, Kinsinger CR, Mesri M, Neubert TA, Schilling B, Tabb DL, Tegeler TJ, Vega-Montoto L, Variyath AM, Wang M, Wang P, Whiteaker JR, Zimmerman LJ, Carr SA, Fisher SJ, Gibson BW, Paulovich AG, Regnier FE, Rodriguez H, Spiegelman C, Tempst P, Liebler DC, Stein SE. Performance metrics for liquid chromatography-tandem mass spectrometry systems in proteomics analyses. Mol. Cell Proteom. 2010;9(2):225–241. [PMC free article] [PubMed]
2. Paulovich AG, Billheimer D, Ham AJL, Vega-Montoto L, Rudnick PA, Tabb DL, Wang P, Blackman RK, Bunk DM, Cardasis HL, Clauser KR, Kinsinger CR, Schilling B, Tegeler TJ, Variyath AM, Wang M, Whiteaker JR, Zimmerman LJ, Fenyo D, Carr SA, Fisher SJ, Gibson BW, Mesri M, Neubert TA, Regnier FE, Rodriguez H, Spiegelman C, Stein SE, Tempst P, Liebler DC. Interlaboratory study characterizing a yeast performance standard for benchmarking LC-MS platform performance. Mol. Cell Proteom. 2010;9(2):242–254. [PubMed]
3. Tabb DL, Vega-Montoto L, Rudnick PA, Variyath AM, Ham AJL, Bunk DM, Kilpatrick LE, Billheimer DD, Blackman RK, Cardasis HL, Carr SA, Clauser KR, Jaffe JD, Kowalski KA, Neubert TA, Regnier FE, Schilling B, Tegeler TJ, Wang M, Wang P, Whiteaker JR, Zimmerman LJ, Fisher SJ, Gibson BW, Kinsinger CR, Mesri M, Rodriguez H, Stein SE, Tempst P, Paulovich AG, Liebler DC, Spiegelman C. Repeatability and reproducibility in proteomic identifications by liquid chromatography-tandem mass spectrometry. J. Proteome Res. 2010;9(2):761–776. [PMC free article] [PubMed]
4. Oberacher H, Walcher W, Huber CG. Effect of instrument tuning on the detectability of biopolymers in electrospray ionization mass spectrometry. J. Mass Spectrom. 2003;38(1):10–116. [PubMed]
5. Soule MCK, Longnecker K, Giovannoni SJ, Kujawinski EB. Impact of instrument and experiment parameters on reproducibility of ultrahigh resolution ESI FT-ICR mass spectra of natural organic matter. Org. Geochem. 2010;41(8):725–733.
6. Zhou Y, Song JZ, Choi FFK, Wu HF, Qiao CF, Ding LS, Gesang SL, Xu HX. An experimental design approach using tesponse surface techniques to obtain optimal liquid chromatography and mass spectrometry conditions to determine the alkaloids in Meconopsi species. J. Chromatogr. A. 2009;1216(42):7013–7023. [PubMed]
7. Wenner BR, Lynn BC. Factors that affect ion trap data-dependent MS/MS in proteomics. J. Am. Soc. Mass Spectrom. 2004;15(2):150–157. [PubMed]
8. Louvar JF. Simplify, experimental design. Chemical Eng. Prog. 2010;106(1):35–40.
9. Riter LS, Vitek O, Gooding KM, Hodge BD, Julian RK., Jr. Statistical design of experiments as a tool in mass spectrometry. J. Mass Spectrom. 2005;40(5):565–579. [PubMed]
10. Raji MA, Schug KA. Chemometric study of the influence of instrumental parameters on ESI-MS analyte response using full factorial design. Int. J. Mass Spectrom. 2009;279(2/3):100–106.
11. Robichaud G, Dixon RB, Potturi AS, Cassidy D, Edwards JR, Dow TA, Muddiman DC. Design, modeling, fabrication, and evaluation of the air amplifier for improved detection of biomolecules by electrospray ionization mass spectrometry. Int. J. Mass Spectrom. 2010 in press. [PMC free article] [PubMed]
12. Walker SH, Papas BN, Comins DL, Muddiman DC. Interplay of permanent charge and hydrophobicity in the electrospray ionization of glycans. Anal. Chem. 2010;82(15):6636–6642. [PubMed]
13. Bantscheff M, Schirle M, Sweetman G, Rick J, Kuster B. Quantitative mass spectrometry in proteomics: a critical review. Anal. Bioanal. Chem. 2007;389(4):1017–1031. [PubMed]
14. Gao J, Friedrichs MS, Dongre AR, Opiteck GJ. Guidelines for the routine application of the peptide hits technique. J. Am. Soc. Mass Spectrom. 2005;16(8):1231–1238. [PubMed]
15. Liu HB, Sadygov RG, Yates JR. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 2004;76(14):4193–4201. [PubMed]
16. Old WM, Meyer-Arendt K, Aveline-Wolf L, Pierce KG, Mendoza A, Sevinsky JR, Resing KA, Ahn NG. Comparison of label-free methods for quantifying human proteins by shotgun proteomics. Mol. Cell Proteom. 2005;4(10):1487–1502. [PubMed]
17. Zybailov B, Coleman MK, Florens L, Washburn MP. Correlation of relative abundance ratios derived from peptide ion chromatograms and spectrum counting for quantitative proteomic analysis using stable isotope labeling. Anal. Chem. 2005;77(19):6218–6224. [PubMed]
18. Payne WE, Garrels JI. Yeast protein database (YPD): a database for the complete proteome of Saccharomyces cerevisiae. Nucleic Acids Res. 1997;25(1):5–62. [PMC free article] [PubMed]
19. Shevchenko A, Jensen ON, Podtelejnikov AV, Sagliocco F, Wilm M, Vorm O, Mortensen P, Shevchenko A, Boucherie H, Mann M. Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two dimensional gels. Proc. Natl. Acad. Sci. USA. 1996;93(25):14440–14445. [PubMed]
20. Washburn MP, Wolters D, Yates JR. Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 2001;19(3):242–247. [PubMed]
21. Peng JM, Elias JE, Thoreen CC, Licklider LJ, Gygi SP. Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J. Proteome Res. 2003;2(1):43–50. [PubMed]
22. Nägele E, Vollmer M, Hörth P. Improved 2D nano-LC/MS for proteomics applications: a comparative analysis using yeast proteome. J. Biomol. Tech. 2004;15:15134–15143. [PMC free article] [PubMed]
23. Wei J, Sun J, Yu W, Jones A, Oeller P, Keller M, Woodnutt G, Short JM. Global proteome discovery using an online three-dimensional LC-MS/MS. J. Proteome Res. 2005;4(3):801–808. [PubMed]
24. de Godoy LMF, Olsen JV, de Souza GA, Li GQ, Mortensen P, Mann M. Status of complete proteome analysis by mass spectrometry: SILAC labeled yeast as a model system. Genome Biol. 2006;7(6):R50. [PMC free article] [PubMed]
25. Piening BD, Wang P, Bangur CS, Whiteaker J, Zhang HD, Feng LC, Keane JF, Eng JK, Tang H, Prakash A, McIntosh MW, Paulovich A. Quality control metrics for LC-MS feature detection tools demonstrated on Saccharomyces cerevisiae proteomic profiles. J. Proteome Res. 2006;5(7):1527–1534. [PubMed]
26. Picotti P, Bodenmiller B, Mueller LN, Domon B, Aebersold R. Full dynamic range proteome analysis of S. cerevisiae by targeted proteomics. Cell. 2009;138(4):795–806. [PMC free article] [PubMed]
27. de Godoy LMF, Olsen JV, Cox J, Nielsen ML, Hubner NC, Frohlich F, Walther TC, Mann M. Comprehensive mass spectrometry-based proteome quantification of haploid versus diploid yeast. Nature. 2008;455(7217):1251–1260. [PubMed]
28. Futcher B, Latter GI, Monardo P, McLaughlin CS, Garrels JI. A sampling of the yeast proteome. Mol. Cell Biol. 1999;19(11):7357–7368. [PMC free article] [PubMed]
29. Gygi SP, Rochon Y, Franza BR, Aebersold R. Correlation between protein and mRNA abundance in yeast. Mol. Cell Biol. 1999;19(3):1720–1730. [PMC free article] [PubMed]
30. Usaite R, Wohlschlegel J, Venable JD, Park SK, Nielsen J, Olsson L, Yates JR. Characterization of global yeast quantitative proteome data generated from the wild-type and glucose repression Saccharomyces cerevisiae strains: the comparison of two quantitative methods. J. Proteome Res. 2008;7(1):266–275. [PMC free article] [PubMed]
31. Washburn MP, Koller A, Oshiro G, Ulaszek RR, Plouffe D, Deciu C, Winzeler E, Yates JR. Protein pathway and complex clustering of correlated mrna and protein expression analyses in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA. 2003;100(6):3107–3112. [PubMed]
32. Andrews GL, Shuford CM, Burnett JC, Hawkridge AM, Muddiman DC. Coupling of a vented column with splitless nanoRPLC-ESI-MS for the improved separation and detection of brain natriuretic peptide-32 and its proteolytic peptides. Journal of Chromatography B-Analytical Technologies in the Biomedical and Life Sciences. 2009;877(10):948–954. [PMC free article] [PubMed]
33. Keller A, Nesvizhskii AI, Kolker E, Aebersold R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 2002;74(20):5383–5392. [PubMed]
34. Nesvizhskii AI, Keller A, Kolker E, Aebersold R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 2003;75(17):4646–4658. [PubMed]
35. Weatherly DB, Atwood JA, Minning TA, Cavola C, Tarleton RL, Orlando R. A heuristic method for assigning a false-discovery rate for protein identifications from mascot database search results. Mol. Cell Proteom. 2005;4(6):762–772. [PubMed]
36. Tabb DL, Fernando CG, Chambers MC. MyriMatch: highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis. J. Proteome Res. 2007;6(2):654–661. [PMC free article] [PubMed]
37. Liu HJ, Finch JW, Lavallee MJ, Collamati RA, Benevides CC, Gebler JC. Effects of column length, particle size, gradient length, and flow rate on peak capacity of nanoscale liquid chromatography for peptide separations. J. Chromatogr. A. 2007;1147(1):30–36. [PubMed]
38. Wang XL, Stoll DR, Schellinger AP, Carr PW. Peak capacity optimization of peptide separations in reversed-phase gradient elution chromatography: fixed column format. Anal. Chem. 2006;78(10):3406–3416. [PMC free article] [PubMed]
39. Johnson KL. 2010.
40. Zhang Y, Wen ZH, Washburn MP, Florens L. Effect of dynamic exclusion duration on spectral count based quantitative proteomics. Anal. Chem. 2009;81(15):6317–6326. [PubMed]
41. Wong CCL, Cociorva D, Venable JD, Xu T, Yates JR. Comparison of different signal thresholds on data dependent sampling in orbitrap and LTQ mass spectrometry for the identification of peptides and proteins in complex mixtures. J. Am. Soc. Mass Spectrom. 2009;20(8):1405–1414. [PMC free article] [PubMed]
42. Kim MS, Kandasamy K, Chaerkady R, Pandey A. Assessment of resolution parameters for CID-based shotgun proteomic experiments on the LTQ-Orbitrap mass spectrometer. J. Am. Soc. Mass Spectrom. 2010;21(9):1606–1611. [PMC free article] [PubMed]
43. Schlotzhauer SD. Elementary statistics using JMP. SAS Institute Inc; Cary: 2007.