Optimization of MS Instrument Parameters for Broad Mass Range
The mass range of high-sensitivity detection for TOF instruments in linear mode is generally in the 2–20 kDa range.5 In essence, mass spectrometric profiling from complex biological samples has been limited to the analysis of peptides, low-mass proteins, or protein fragments. This limitation is due to lower desorption and m/z analysis efficiency at higher mass, and is exacerbated by suboptimal sample preparation, data acquisition, and data processing procedures. For data acquisition, we have enhanced detection sensitivity in an extended mass range up to 100 kDa for Bruker MALDI-TOF/TOF mass spectrometers by optimization of specific instrument parameters.
A time-of-flight tube is the most common analyzer geometry used for MS protein profiling due to the high reproducibility and mass accuracy of resulting spectral data sets. There are numerous parameters for manipulating ion flights through a TOF system, and operators usually adjust parameters to collect data in either the low-mass or high-mass range. For instance, higher laser intensity and longer time-lag focusing (delayed extraction)19,20 are needed to enhance detection at heavy mass. The broader m/z focusing requires a compromise with optimal instrumental resolution.21 For an Ultraflex TOF instrument, for example, a 2-fold compromise in resolution helps achieve more than a 20-fold gain in sensitivity (detector SNR) for masses above 8 kDa.22 Furthermore, lower mass focusing voltage helps to preserve peak width over the broader mass range, although at the expense of a lower “optimal” resolution. For instance, at the default setting of ion source 2 voltage (see Experimental Procedures) the optimal resolution of 850 is achieved at about m/z 5000, but there is a fast increase in the signal width outside the mass focusing range (e.g., fwhm doubles at m/z 9000). With the slightly lower setting of this voltage that we used to optimize the broad mass range acquisition, the optimal resolution is lower (~500); however, the signal width changes more slowly (e.g., doubles at m/z 25 000). As a result, compared to the default focusing voltage, the peaks are broader for lower masses (below m/z 10 000), nominally the same width between m/z 20 000 and 50 000 ( range), and narrower for heavy masses. This is a compromise setting, which does not have a major impact on overall sensitivity for signals below 50 000 m/z, although the heavy mass signals may exhibit up to 2-fold enhancement of SNR per point due to partial narrowing.
Figure 1 Average of 5 protein standard spectra acquired with optimized and default (inverted) ADC offset parameters. The raw spectra are scaled to the same noise level to illustrate improvement in detection sensitivity (SNR) for low-intensity signals. Inset shows ...
More critical improvement of detection sensitivity can be achieved in the broad mass range by changing the default settings for the ADC offset and preamplifier bandwidth. A MALDI-TOF mass spectrum is usually the summation of several hundred laser shots taken at one or more positions from a sample spot. The dynamic range of intensities that can be detected by a TOF instrument after a single laser shot is limited by the detector ADC precision (8 bits in our study, Acqiris DP240). The default offset is typically set at half of the ADC scale, which effectively uses all 8 bits of ADC range (256 intensity values) for detection of positive MS signals. If no signal is present, only the positive half of bit-noise is collected (typically, the 2 lower bits are lost; Figure 1, inset). However, due to intrinsic fluctuations of the ADC base level, low-intensity signals may fall below the detection cutoff for some of the laser shots, especially when superimposed on negative noise signals. As a result, in the average spectrum of multiple shots, these low signals get preferentially clipped at their base under such settings. This is especially detrimental to the detection of naturally broad signals from overlapping peaks or intermediate and heavy masses. Figure 1 shows raw spectra of the protein standard acquired under default and optimized conditions for the same m/z range. Note, for example, how the shoulder peak and triplet between 35 and 40 kDa in Figure 1 get completely lost in the noise with the default ADC offset (inverted spectrum). Changing the offset to about 40% of the scale value helps recover “negative” bits of the baseline noise and minimizes peak base clipping for low-intensity signals (Figure 1, inset). The recovery of “negative” bits happens abruptly at any offset value lower than 49%, since negative noise is about 1% of the full ADC range for a single shot (Figure 1, inset).
However, due to (vertical) offset fluctuations, some residual clipping may still be sporadically observed for single shots even at 44–46% offset (data not shown). Any setting below 40% avoids clipping completely. However, lowering the offset by 10% from the default value causes a constant vertical shift in baseline intensity (Figure 1, inset) by about 10% of the dynamic range (~24 of 256). Thus, especially large signals may get clipped at the top for some single shots, distorting the signal shape. This was not observed in the studied samples. Hence, the compromise value of 40% was optimal to allow the broadest dynamic range of ADC (~225 for a single shot) without substantial clipping of the signal base.
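The base-clipping mechanism described above can be illustrated with a small Python simulation. This is only a sketch under stated assumptions, not the instrument firmware or our acquisition software: the noise is taken as Gaussian, the signal and noise amplitudes (1.0 and 1.5 ADC counts) are arbitrary, and the mapping from percent offset to baseline position, `(50 - offset) / 100 * 256` counts, is a hypothetical simplification. The function names are likewise illustrative.

```python
import random

ADC_LEVELS = 256  # 8-bit flash ADC, as in the text


def digitize(value, baseline_counts):
    """Shift by the baseline position, round to an ADC code, clip to 0..255."""
    code = round(value + baseline_counts)
    return min(max(code, 0), ADC_LEVELS - 1)


def averaged_peak_height(offset_percent, signal=1.0, noise_sigma=1.5,
                         shots=20000, seed=0):
    """Mean (on-peak minus off-peak) ADC code over many simulated shots.

    Assumption: a 50% offset puts the baseline at code 0, so negative noise
    excursions are clipped; lowering the offset lifts the baseline by the
    same fraction of the ADC scale.
    """
    rng = random.Random(seed)
    baseline_counts = (50.0 - offset_percent) / 100.0 * ADC_LEVELS
    on_peak = 0
    off_peak = 0
    for _ in range(shots):
        on_peak += digitize(signal + rng.gauss(0.0, noise_sigma), baseline_counts)
        off_peak += digitize(rng.gauss(0.0, noise_sigma), baseline_counts)
    return (on_peak - off_peak) / shots


if __name__ == "__main__":
    for offset in (50.0, 40.0):
        print(offset, averaged_peak_height(offset))
```

In this toy model, the averaged height of a weak peak above the baseline is noticeably reduced at the 50% offset, because the negative part of the noise is discarded on every shot, whereas at 40% the averaged height stays close to the true 1.0 count.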
Adjustment of the preamplifier filter bandwidth is another hardware setting that can enhance the sensitivity of low-intensity signals. Without this filter, the “flash” Ultraflex ADC (Acqiris DP240) samples the signals with a defined frequency without integration of the signals between sample points. This leads to lost signal per sample point for lower sampling rates. Because of limitations on the memory buffer length, data collection in a broad mass range requires using lower sampling rates. Switching on the preamplifier filter allows integration of the signal within the bandwidth and, thus, recovery of lost sensitivity per sample point. However, if the filter bandwidth is higher than the signal width, this may cause undesirable over-broadening and lost resolution. After studying the effects of three allowed bandwidth settings on the broadening of dark current noise from a single shot, we determined that a reasonable compromise setting of the preamplifier filter is at “medium” bandwidth in the acquisition software which corresponds to 200 MHz (Acqiris DP240 ADC specifications). Such a setting does not over-broaden true signal peaks and potentially helps recover some under-sampled intensities in broad mass range acquisition through integration of under-sampled signals.
Although the specific acquisition parameter optimization discussed above was performed for the Bruker Ultraflex instrument, it would be generally applicable to any TOF instrument whose detection system is based on a flash ADC and includes a preamplifier. For instance, the AB 4800 MALDI-TOF/TOF allows software adjustment of the ADC scale and percent-offset as well as preamplifier filter bandwidths. The PBS instruments do not allow such adjustments: their ADC offset is set in hardware to 20% of scale, which gives better broad mass coverage at the expense of a smaller dynamic range.
TOF–MS Data Processing
Since the described down-sampling procedure (in Experimental Procedures) involves signal integration, it had to be preceded by proper baseline subtraction to avoid contributions of a slowly varying baseline trend, while still preserving peak shapes and intensities. The manufacturer’s (Bruker) baseline subtraction algorithm (convex hull) was not adequate for heavy masses (above 20 kDa), since it apparently subtracted signals from the base of broader peaks. Such subtraction reduced sensitivity to broad heavy masses or overlapping signals. A de-trending model that we found to be more appropriate in the broad mass range was a decaying exponential, which is concave along the TOF record.
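A decaying-exponential de-trending step of this kind can be sketched in a few lines of Python. The fitting strategy below (a log-linear least-squares fit through the minima of fixed-length windows) is an assumption chosen for illustration, not necessarily the procedure used in this study; the function names and the `window` parameter are hypothetical, and the baseline is assumed to be strictly positive.

```python
import math


def fit_exponential_baseline(intensity, window=50):
    """Fit b(i) = A * exp(-k * i) through the minimum of each window.

    Window minima are used so that peaks contribute little to the fit.
    Returns (A, k). Assumes at least two windows with positive minima.
    """
    xs, ys = [], []
    for start in range(0, len(intensity) - window + 1, window):
        chunk = intensity[start:start + window]
        lowest = min(chunk)
        if lowest > 0:  # the logarithm requires positive values
            xs.append(start + chunk.index(lowest))
            ys.append(math.log(lowest))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    amplitude = math.exp(mean_y - slope * mean_x)
    return amplitude, -slope


def subtract_baseline(intensity, window=50):
    """Return the record with the fitted exponential trend removed."""
    amplitude, decay = fit_exponential_baseline(intensity, window)
    return [y - amplitude * math.exp(-decay * i)
            for i, y in enumerate(intensity)]
```

Because the model has only two parameters, it cannot follow the base of an individual broad peak, which is the property that motivates preferring it over a locally adaptive baseline for heavy masses.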
Baseline subtraction followed by integrative down-sampling processing is essentially doing the job of an “integrative ADC”. Flash ADC, set at a lower sampling rate, loses the under-sampled signal compared to higher sampling rates. Integration adds contributions from the under-sampled signal between sampling points and suppresses random noise. Similarly, integrative down-sampling postacquisition enhances the SNR per sample point proportionally to the square root of the down-sampling window length in the presence of random Gaussian noise. In the high-mass range (above 10 kDa), where both constant and quadratic down-sampling can be used, the SNR enhancement is a product of the gains from each procedure separately. In contrast to, for example, moving average integrative filtering,23 there is no signal broadening by this procedure, because each intensity point is counted once in non-overlapping windows. This procedure does not change the signal shape and width. The peak width in mass is the same before and after processing; only the point density per signal peak is smaller compared to raw data.16 After down-sampling, the intensity per point is more representative of the total signal, since the constant point density per peak is recovered.16 Hence, relative intensities of signal maxima are brought on comparable scale for visual inspection over the whole recorded mass range. This is different for the raw data, where the heavy mass peaks may appear lower just due to their broadness. This down-sampling is also a prerequisite for flawless peak detection, which assumes nominally constant peak-width. The precision of signal location and intensity, determined by peak detection algorithm, is limited by SNR. Thus, SNR enhancement also leads to improved precision of signal detection. The overall results of processing by constant and quadratic down-sampling preceded by baseline subtraction are (1) a constant point density of about 9 time steps for fwhm is recovered over the 100 kDa mass range; (2) the processed data is compressed by a factor of 20 compared to the original 1 Mb of raw data per spectrum; and (3) the resulting MS spectra have SNR values enhanced 10-fold and more above 10 kDa.
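The square-root dependence of the SNR gain on the window length can be checked numerically. The following Python sketch covers only the constant-window case and assumes a flat signal with independent Gaussian noise; the function names and the default signal level are illustrative choices, not part of our processing code.

```python
import random
import statistics


def integrate_downsample(intensity, window):
    """Sum consecutive non-overlapping windows; each point is counted once.

    A trailing remainder shorter than one window is dropped.
    """
    usable = len(intensity) - len(intensity) % window
    return [sum(intensity[i:i + window]) for i in range(0, usable, window)]


def snr(values):
    """Mean level divided by the sample standard deviation."""
    return statistics.mean(values) / statistics.stdev(values)


def snr_gain(window, n_points=50000, level=2.0, noise_sigma=1.0, seed=1):
    """Ratio of SNR after and before down-sampling a noisy flat signal."""
    rng = random.Random(seed)
    raw = [level + rng.gauss(0.0, noise_sigma) for _ in range(n_points)]
    return snr(integrate_downsample(raw, window)) / snr(raw)


if __name__ == "__main__":
    for window in (4, 25):
        print(window, snr_gain(window))
```

Summing a window of w points multiplies the signal by w and the noise standard deviation by the square root of w, so the simulated gain should approach 2 for w = 4 and 5 for w = 25. Because the windows do not overlap, neighboring output points remain statistically independent, unlike the output of a moving average.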
The improved sensitivity achieved through integrative down-sampling is illustrated in Figure 2, which shows the MALDI-TOF spectrum for serum purified with C3 magnetic beads. The bottom trace (inverted) is the raw spectrum, and the top trace is the processed spectrum after down-sampling. The spectra are scaled to the same noise level to illustrate enhanced SNR per signal point achieved by down-sampling. Note that the noise level remains constant after processing in the full m/z range. This allows the introduction of a global threshold for peak detection. The low-mass range (Figure 2a) exhibits about 3-fold SNR enhancement, mainly due to the constant-rate integrating down-sampling (Tr = 5) that we applied. The high-mass range (Figure 2b) is improved by a combination of constant and quadratic down-sampling (10-fold and more above 10 kDa). Note also that the relative intensities of peak maxima in the processed spectra are now more representative of total relative intensities of signals (e.g., higher 1+ with respect to 2+ charge states, as seen for albumin: 1+ at m/z 66 433). For unprocessed data, peaks are broader for higher m/z and can obscure relative visual estimates from peak maxima.
Figure 2 Pooled serum spectrum was processed by integrative down-sampling and is shown in low (a) and high (b) mass range compared to raw data (inverted). Processed data is scaled to the same noise level as in the raw spectrum.
Enhancement of sensitivity to heavy mass signals above 20 kDa (shown in Figure 2b) allows detection of about 15 more nonredundant signals (not counting ionization satellites) in addition to about 40 abundant peaks visible below 20 kDa. For instance, in Figure 2b, the only dominant signals above 20 kDa in the raw spectrum are the probable 1+ to 3+ charge states of albumin. After processing, many other features are now visible at higher m/z. The SNR of early mass signals is also enhanced. This allows lowering of the thresholds for peak detection and obtaining higher precision on peak locations and intensities of less abundant signals over the full MS record range of 100 kDa. Many of these low-abundance signals may be related to ionization satellites and chemical adducts, which then can be deconvoluted from parent peaks. Without the achieved sensitivity enhancement over the broad mass range, it would be impossible to discriminate between many ionization satellites and parent protein signals. For instance, if only data below 20 kDa is analyzed without enhancement of heavy mass signals, assignment of the peak cluster near 17 kDa to a particular serum protein moiety is not straightforward (e.g., myoglobin is at 16.9 kDa). After broad-mass enhancement, this cluster is easily identified as the 4+ charge state of the peak cluster surrounding the parent albumin peak near 70 kDa.
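The charge-state bookkeeping behind such assignments is simple arithmetic, sketched below in Python. The sketch assumes that each charge is carried by one added proton, so that m/z = (M + z·1.00728)/z; adducts and neutral losses are ignored, and the function names are illustrative.

```python
PROTON_MASS = 1.00728  # Da, mass of one proton


def mz_for_charge(neutral_mass, charge):
    """m/z of an ion carrying `charge` protons (assumed ionization mode)."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_ladder(singly_charged_mz, max_charge=4):
    """Expected m/z of the 1+ .. max_charge+ ions from an observed 1+ value."""
    neutral_mass = singly_charged_mz - PROTON_MASS
    return {z: mz_for_charge(neutral_mass, z)
            for z in range(1, max_charge + 1)}


if __name__ == "__main__":
    # Albumin 1+ value quoted earlier in the text
    for charge, mz in charge_ladder(66433.0).items():
        print(f"{charge}+  m/z {mz:.0f}")
```

For the albumin 1+ signal at m/z 66 433, the 4+ ion is expected near m/z 16 609, which is how a cluster around 17 kDa can be tied to the parent cluster at higher mass once the heavy mass signals are visible.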
Enrichment and Purification of Proteins
The transition from chromatographic separation to solid phase via derivatized paramagnetic beads facilitates high-throughput, robotics-based applications and is a preferred up-front step for reduction in sample complexity. Predictably, we established that robotic sample preparation was more reproducible than manual procedures and also gave higher quality spectra. Thus, all samples were processed robotically in this study. To ascertain the preferred sample preparation conditions for broad mass coverage, we evaluated a variety of parameters (bead functionalities, MALDI matrices, dilutions, and plate surfaces) to determine reproducibility, number of detected peaks, and overall SNR in spectra spanning 100 000 m/z. We experimented with solution conditions for MALDI spotting using SA, DHB, and CHCA matrices. Although DHB and SA are known to enhance heavy mass signals more effectively, we found this enhancement to be quite minimal (~10% more heavy m/z signals) and at the expense of considerable loss of low m/z signals compared to CHCA (~30% less). More importantly, DHB and SA crystal formation was less uniform on MALDI plates, which severely compromised reproducibility (>50% variability in replicate signals). Likewise, the robotic spotting on an AnchorChip plate was much more uniform compared to stainless steel. Thus, with CHCA matrix on an AnchorChip, the changes of intensities in the spectra resulting from (up to 10) replicate spotting of the same sample were within 5–10%. Therefore, CHCA matrix and the AnchorChip surface were found to be more reliable for high-throughput MS protein profiling. The sample-to-matrix dilution of 1:5 was also helpful both for better reproducibility and higher SNR in the higher mass range. Sample replicates (aliquots) were found to be the major source of spectral intensity variation, producing 20–40% variability in peak signals. This is observed to a higher degree for peaks below m/z 20 000.
We tested a number of different bead functionalities including C3, C8, WCX, WAX, IMAC, and Con A. Beads functionalized with C3 and C8 capture proteins and peptides from biological samples based on hydrophobic interaction; WCX and WAX are weak ion exchangers; IMAC preferentially captures phosphorylated proteins; and Con A enriches specific glycoproteins. Samples were processed robotically using AnchorChips with CHCA matrix (1:5 dilution). Data was collected with optimized MS acquisition settings in the 2000–100 000 m/z range. Replicate spectra from 6 spots for each affinity bead prep were averaged, baseline-subtracted, and preprocessed by down-sampling before signal detection. For quantitative comparison of processed spectra, we performed signal detection using a trivial peak detection algorithm provided by MD Anderson freeware in Matlab (Cromwell package).24 No wavelet smoothing was used from the Cromwell package, only the trivial peak detection routine. The algorithm is based on detection of the change in sign for the first difference (local maxima) at each point in a spectrum and does not include any thresholds. Although more elaborate peak detection algorithms are available, their comparison is outside the scope of this study. We found this simple detection procedure to be sufficient for quantitative analysis of our data after SNR was enhanced by optimized acquisition and preprocessing. To avoid detection of noise peaks, we added the condition that the difference between the peak and the closest minimum on either the left or right of the peak be greater than SNR = 1. SNR is defined as the ratio of measured signal intensity to the mean noise amplitude within 10·fwhm of the peaks (100 time points for the data in this study). At least three noise measurements from peakless regions of each spectrum were averaged to estimate the noise. The error of the measured SNR for detected peaks was within ±0.5, with the minimum SNR for detected signals of about 0.5. The adequate performance of this automated signal detection procedure was ensured by visual overlay of detected peaks with the corresponding spectrum. The results of signal detection and SNR statistics are summarized in Table 1 for the six studied bead functionalities.
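The detection rule described above can be written compactly. The Python sketch below is not the Cromwell routine itself (which is Matlab code) but an illustrative reimplementation of the same idea. It assumes a single noise estimate per spectrum and reads the "either left or right" condition as satisfied when at least one side exceeds the threshold; the function name and arguments are hypothetical.

```python
from bisect import bisect_left


def detect_peaks(intensity, noise, min_snr=1.0):
    """Indices of local maxima that rise above a neighboring minimum.

    A local maximum is a point where the first difference changes sign
    from positive to non-positive. It is kept if its height above the
    closest minimum on the left or on the right exceeds min_snr * noise.
    """
    diffs = [b - a for a, b in zip(intensity, intensity[1:])]
    last = len(intensity) - 1
    maxima = [i for i in range(1, last)
              if diffs[i - 1] > 0 and diffs[i] <= 0]
    minima = [i for i in range(1, last)
              if diffs[i - 1] < 0 and diffs[i] >= 0]
    # The record ends act as boundary minima so every peak has two neighbors.
    minima = [0] + minima + [last]
    peaks = []
    for p in maxima:
        idx = bisect_left(minima, p)
        left, right = minima[idx - 1], minima[idx]
        rise = max(intensity[p] - intensity[left],
                   intensity[p] - intensity[right])
        if rise / noise > min_snr:
            peaks.append(p)
    return peaks


if __name__ == "__main__":
    spectrum = [0, 0.2, 0, 5, 0, 0.3, 0.1, 0, 3, 1, 0]
    print(detect_peaks(spectrum, noise=1.0))  # [3, 8]
```

With a noise amplitude of 1.0, the small bumps at indices 1 and 5 are rejected, and only the two maxima at indices 3 and 8 are reported. A rule this simple is adequate only when the SNR has already been raised by acquisition and preprocessing, which is the situation described in the text.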
Table 1 Signal Detection Statistics for TOF–MS Spectra of Pooled Serum on C3, C8, IMAC, WCX, WAX, and Con A ClinProt Beads
About a quarter of detected signals for all bead functionalities were above 15 kDa. The median SNR for these >15 kDa masses was higher than the overall median. This illustrates an effective extension of the TOF detection range to heavy masses. Con A and WAX beads show lower efficiency in binding higher MW proteins compared to low MW species. This may be related to a higher selectivity of these bead types. WCX shows the best performance both in terms of median SNR and peak numbers for low mass, and comparatively lower intensities of high-mass signals (max(SNR) with respect to low-mass intensities in the same spectra). IMAC also provides good overall mass coverage, but shows the opposite behavior, with higher affinity to heavy mass proteins with respect to lower mass species (total median(SNR) < heavy m/z median(SNR)). C3 and C8 beads produced highly similar spectra, with less than 10% difference in both m/z peak lists and relative intensity patterns. These beads offer the best overall compromise in detection of low- versus high-mass signals (note similar maximum and median SNR values independent of mass range).
Figures 3 and 4 show processed MALDI spectra for the 6 different affinity beads. WAX (Figure 3, green) exhibits the lowest SNR and weakest affinity to heavy masses. In the high-mass range, Con A beads apparently cause attachment of their chemical component to albumin, producing broadened flat-top features (Figure 3, magenta). There are very few features in the Con A spectrum due to the more specific protein enrichment. The most abundant feature in Con A spectra was a cluster centered around m/z 7800 and its doubly charged reflection. This peak cluster was also common for all other beads that we studied. Our results show that, although valuable for selective enrichment studies, the Con A and WAX beads are not the best choice for broad range m/z survey scanning. C8 spectra (Figure 3, blue) are highly similar to C3 (Figure 4, blue) with a slightly lower overall peak number and SNR for C8, so C3 is a slightly better choice of the two.
Figure 3 Comparison of mass spectral content for three affinity capture magnetic bead surfaces using CHCA matrix: C8 (blue), WAX (green), Con A (magenta). Spectra are shown after integrative down-sampling, and are normalized to maximum intensity and vertically ...
Figure 4 Comparison of mass spectral content for three affinity capture magnetic bead surfaces using CHCA matrix: C3 (blue), IMAC (green), WCX (magenta). Spectra are shown after integrative down-sampling. The view is subdivided in two mass ranges (a) and (b), ...
WCX (Figure 4, magenta) has reduced efficiency at high m/z compared to low m/z, yet the spectral quality for lower masses is visibly the best (as the numbers in Table 1 clearly indicate). There are more peak differences observed between C3 and IMAC or WCX in the low-mass range (~30% of signals, Figure 4a) than in the high-mass range (Figure 4b). Some examples of highly abundant peaks observed for C3, but suppressed for IMAC and WCX, include the singly charged ions at m/z 8200 and 15 100 and a doublet at 13 750/13 870 with their corresponding doubly charged reflections. Conversely, the peak at m/z 5940 is 10-fold more abundant for WCX and IMAC compared to C3. The most prominent common peaks for the three surfaces (in addition to the 1+ through 4+ charge states of albumin at m/z = 66 400, 33 200, 22 100, and 16 600) include a peak doublet at m/z 6500/6700, a cluster centered around 7800, and a triplet at 8950/9160/9310. Detailed results on the common and different subsets of peaks can be found in a table in Supporting Information. We also noticed that robotic C3 preparation resulted in greater reproducibility in MS signals versus IMAC or WCX preparation. This is most likely due to the manufacturer’s protocols for IMAC and WCX, which have smaller elution volumes and a larger number of steps. Of the beads tested, we concluded that C3, IMAC, and WCX gave the most favorable and complementary protein profiles for a large mass window (WCX being more efficient in the lower m/z range, while IMAC is slightly better for higher m/z). We recommend their complementary use for high-throughput MALDI–TOF profiling of complex protein mixtures.
As expected, biofluid sample preparation protocols have a major effect on the spectral quality, as illustrated in Figure 5. IMAC affinity capture of serum was compared in 3 different systems: Ciphergen SELDI chips with sinapinic acid (SA) matrix recorded on a Ciphergen PBS II instrument, SELDI chips with SA matrix analyzed on a Bruker Ultraflex III, and ClinProt magnetic beads with CHCA matrix analyzed on an Ultraflex III. The top two traces in Figure 5 show that affinity capture on IMAC chip surfaces produces similar mass records independent of changes in detector systems (PBS II versus Ultraflex III spectrometer). Similarity is measured by the correlation of intensity patterns and peak positions, and is more than 0.95 (on a 0 to 1 scale) for the example shown. In contrast, the spectrum produced after affinity capture using magnetic beads is significantly different (lower trace in Figure 5, correlation of 0.75), especially in the intermediate- and low-mass range (<5 kDa). This is notwithstanding the fact that both chip and beads possess IMAC chemical affinity. Different protocols were used for preparation, however, which accounts for some of the observed differences. In particular, urea is used in the binding step for the IMAC chip. For instance, the peak at 12 kDa is present in the SELDI chip spectra in Figure 5, but absent in the bead spectrum. In fact, we found that the 12 kDa protein (or protein fragment) is absent in the bead spectrum whether urea is used or not (data not shown). This may be due to different requirements for binding or elution of protein/peptide when utilizing bead-based approaches.
Figure 5 Comparison of mass spectral content for IMAC affinity capture by SELDI chip with SA matrix recorded on Ciphergen PBS II instrument (magenta), by SELDI chip recorded on Bruker Ultraflex III (blue), and by ClinProt magnetic beads with CHCA matrix recorded ...
Other noticeable differences for IMAC beads include, for example, suppression of registered intensities for peak clusters around 14, 29, and 79 kDa. These m/z peak values are consistent with literature references of known abundant serum protein molecules.25–31 Also, IMAC beads show enhancement of peptide signals below 4 kDa, as well as observation of the 1+ charge states of albumin. The observed differences in ionization satellite content can be largely attributed to the change of matrix. For instance, higher charge states and fewer matrix adducts and neutral losses are characteristic of CHCA compared to SA. In general, we observed that the albumin cluster dominates the heavy mass spectra from magnetic beads, while some abundant serum proteins in the intermediate- and high-mass range are captured with lower efficiency. Thus, binding to IMAC beads is more selective compared to the IMAC chip.
In summary, we have extended the mass range of MALDI-TOF high-sensitivity detection for profiling of complex protein mixtures. This has been accomplished through the combined optimization of sample preparation procedures, MS instrument parameters, and data processing. C3, IMAC, and WCX beads were determined to be complementary and favorable for broad mass range protein detection. Key instrument parameters for enhancing sensitivity in an extended mass range included the adjustment of the ADC offset and preamplifier filter values. Data processing utilized a combination of exponential baseline subtraction and constant and quadratic down-sampling to increase SNR and precision of signal peak detection. Overall, this “broad range” enhancement will extend the utility of MS-based analysis of biological samples through the improvement of protein/peptide detection sensitivity. The improvement in the relative linear range of detection will also complement efforts to enrich for lower-abundance proteins.32 Our developed methods are relevant to applications where broad mass range detection is advantageous, such as in clinical biomarker discovery, microorganism identification,33,34 and MS imaging of biosamples.35–37