Fourier transform ion cyclotron resonance mass spectrometry has the ability to realize exceptional mass measurement accuracy (MMA); MMA is one of the most significant attributes of mass spectrometric measurements as it affords extraordinary molecular specificity. However, due to space-charge effects, the achievable MMA significantly depends on the total number of ions trapped in the ICR cell for a particular measurement, as well as relative ion abundance of a given species. Artificial neural network calibration in conjunction with automatic gain control (AGC) is utilized in these experiments to formally account for the differences in total ion population in the ICR cell between the external calibration spectra and experimental spectra. In addition, artificial neural network calibration is used to account for both differences in total ion population in the ICR cell as well as relative ion abundance of a given species, which also affords mean MMA values at the parts-per-billion level.
Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry has the ability to detect ions with unparalleled mass measurement accuracy, even when working with complex analytes such as proteins, metabolites, and other molecules found in biological, biochemical, and physical science fields.1–5 The fundamental relationship between cyclotron frequency and m/z has been under investigation since the introduction of FT-ICR MS in 1974 by Comisarow and Marshall.1 In order to reach the highest mass measurement accuracy (MMA) achievable for a given FT-ICR MS system, frequency shifts due to space-charge effects must be accounted for using external or internal calibration strategies.6, 7 While internal calibration8–13 provides the best correction for space-charge effects, it often requires specialized hardware or software.
In order to account for frequency shifts, external calibration has been utilized to improve the MMA of measurements made by FT-ICR.14 Amster and co-workers developed a calibration curve to account for the differences in ion populations between the calibration spectrum and subsequent spectra acquired for data analysis resulting in <10 parts-per-million MMA.15, 16 Based partly on both the work of Amster15, 16 and Smith17, Oberg and Muddiman reported a novel external calibration law which provided data with <5 ppm MMA.18 Amster and co-workers have also developed stepwise-external calibration, in which calibration spectra are acquired at low trapping voltages that provide low ppm mass accuracy. This is then followed by collection of spectra at more customary trapping voltages where the major peaks, which appeared in the spectrum collected at low trapping voltages, were used as “internal calibrants” to calibrate the rest of the spectrum.19
Hunt and co-workers have demonstrated the usefulness of combining external calibration with automatic gain control (AGC), where the number of ions in the ICR cell is controlled to fall within the external calibration range.20 This approach allowed them to routinely achieve MMA of <2 ppm. This external calibration strategy has been implemented on commercially available FT-ICR MS instruments equipped with AGC.20, 21 Muddiman and co-workers were able to achieve high MMA using this approach.22 The Smith and Gygi research groups commonly report MMA values of ~5 ppm when utilizing AGC for external calibration procedures.23–25 Recently, Muddiman and co-workers combined AGC with calibration laws based on multiple linear regression and achieved external calibration with mass measurement accuracy in the parts-per-billion (ppb) range.26
Artificial neural networks (ANNs) utilize back propagation techniques to establish the weights and biases needed to fit a target output using measured parameters as input data. Unlike multiple linear regression, one does not need to specify a mathematical functional relationship between the input and target data – the ANN is trained to produce a set of outputs which minimize the difference between the target data and the ANN output. ANNs have been applied in many different fields of chemistry, including various problems in spectroscopy such as the calibration of previously existing spectra as input functions in NMR spectra,27, 28 mass spectra,29–32 infrared spectra,33 and other forms.34 They have also appeared in the study of chemical sensor applications35, 36 and protein folding.37 The ability to predict an unknown quantity from known inputs without specifying the functional form in advance is extremely useful.18 Herein, data from a hybrid LTQ-FT-ICR mass spectrometer have been applied to an artificial neural network system in an attempt to generate m/z values with high mass measurement accuracy by utilizing parameters to account for space-charge effects within the ICR cell. These results are compared with previously published results that used multiple linear regression to fit such data.26
Poly(propylene glycol) with an average molecular weight of 1000 Da, ammonium acetate (>99%), and formic acid were purchased from Sigma (St. Louis, MO). HPLC grade acetonitrile and high-purity water were purchased from Burdick and Jackson (Muskegon, MI). 2-Propanol (HPLC grade) was purchased from Fisher Scientific (Fair Lawn, NJ). All materials were used as received.
A modified version of an electrospray ionization (ESI) source developed previously in this laboratory was coupled to a hybrid ThermoFisher Scientific (San Jose, CA) LTQ-FT Ultra MS equipped with an Oxford Instruments actively shielded 7 T superconducting magnet (Concord, MA). All spectra were acquired with a resolving power of 100,000 (FWHM) at m/z 400, along with AGC settings ranging from 5.0 × 10⁵ to 3.0 × 10⁶. Samples were introduced by direct infusion using a 100 μL gastight syringe (Hamilton, Las Vegas, NV) and the syringe pump on the LTQ-FT Ultra at a flow rate of 0.5 μL/min. The ESI emitter tips used were 360 μm o.d., 50 μm i.d. and tapered to 30 ± 1.0 μm i.d. (New Objective, Woburn, MA) and held at a constant potential of 2200 V for all experiments. Electrospray solutions consisted of 70:30 2-propanol/water with 0.5 mM ammonium acetate (NH4OAc).
Calibration of the instrument was completed utilizing a user-defined list of eleven monoisotopic m/z values for ammonium-adducted PPG-1000 oligomers, which ranged from m/z = 732 to m/z = 1312. This calibration was conducted using the manufacturer's protocol, which generates coefficients for five different AGC target values, of which the largest three were used: 5.0 × 10⁵, 1.0 × 10⁶, and 3.0 × 10⁶.
The frequencies were obtained utilizing diagnostic mode in the instrument software, which allowed for calibration utilizing artificial neural networks. The AGC level was first set to 5.0 × 10⁵ and five calibration spectra were recorded, along with their respective cyclotron frequencies. These frequencies provided 55 points (11 data points in each of 5 spectra) which were used to train the neural network. This was followed by the collection of five validation spectra including their cyclotron frequencies, to determine the achievable MMA through the use of neural networks. Both training and validation datasets were the result of analysis of spectra of ammonium-adducted PPG-1000 oligomers, as these experiments were for proof-of-principle. In an effort to reduce systematic error with respect to AGC levels, this procedure was then repeated for AGC target levels of 3.0 × 10⁶ and 1.0 × 10⁶ in that order. Values for total ion population (AT) and relative ion population (AS) were calculated as described previously.26
The basic computational unit of a neural network is the neuron. Our network consisted of six neurons arranged in three layers. The initial layer contained one neuron with a logsig output function. The middle layer contained four neurons each with a logsig output function. The outer layer contained a single neuron with a linear output. Each neuron computes the value of N via the equation: N = Wi*Pi+Bi and applies N to the input of its output function. Wi is a weight applied to the data by the neuron and Bi is a bias. The input data (matrix P) contains one row for each mass spectrum and one column for each experimentally measured independent variable upon which the calibration is based. Our experiments typically utilized 55 rows and either 1, 2 or 3 columns (i.e. frequency, AT and AS). A simplified diagram of the neural networks used in this work is shown in Figure 1. The maximum number of weights/biases available to our ANN was 9W/6B, 10W/6B or 11W/6B depending on the number of independent variables used.
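The layer arrangement described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names, the random initial values, and the assumption that the single first-layer neuron receives all independent variables are illustrative choices. Under that reading, the parameter count reproduces the 9W/6B, 10W/6B and 11W/6B figures quoted in the text.

```python
import math
import random


def logsig(n):
    """Logistic sigmoid, the Python counterpart of a logsig output function."""
    return 1.0 / (1.0 + math.exp(-n))


def init_network(n_inputs, n_hidden=4, seed=0):
    """Create random weights and biases for a 1 / n_hidden / 1 neuron layout (assumed layout)."""
    rng = random.Random(seed)

    def r():
        return rng.uniform(-1.0, 1.0)

    return {
        "w_in": [r() for _ in range(n_inputs)],   # first neuron: one weight per input variable
        "b_in": r(),
        "w_hid": [r() for _ in range(n_hidden)],  # each middle neuron sees the single first-layer output
        "b_hid": [r() for _ in range(n_hidden)],
        "w_out": [r() for _ in range(n_hidden)],  # output neuron: one weight per middle neuron
        "b_out": r(),
    }


def forward(net, p):
    """Propagate one input row p (frequency, optionally AT and AS) through the network."""
    a1 = logsig(sum(w * x for w, x in zip(net["w_in"], p)) + net["b_in"])
    a2 = [logsig(w * a1 + b) for w, b in zip(net["w_hid"], net["b_hid"])]
    # The output neuron is linear: weighted sum plus bias, no squashing.
    return sum(w * a for w, a in zip(net["w_out"], a2)) + net["b_out"]


def count_parameters(net):
    """Return (number of weights, number of biases)."""
    weights = len(net["w_in"]) + len(net["w_hid"]) + len(net["w_out"])
    biases = 1 + len(net["b_hid"]) + 1
    return weights, biases
```

For example, `count_parameters(init_network(3))` gives 11 weights and 6 biases for the case where frequency, AT and AS are all supplied, and `forward(init_network(3), [0.1, -0.2, 0.3])` returns a single (normalized) output value.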
For training of the neural network the input layer used the calibration data set, which was obtained through a set of known compounds (PPG-1000 oligomers) that were analyzed by FT-ICR MS. This calibration data allowed the neural network to calculate an output which was compared to the target (m/z data) and through several iterations determined the weights and biases which provided outputs closest to those provided in the calibration data set. Once the desired output is obtained, the same weights and biases are applied to the validation data, which then results in m/z values upon which to base the mass measurement accuracy for the validation data.
Regardless of the number of inputs or outputs, the number of neurons can be changed, which may change the quality of the fit provided by the artificial neural network. There is usually an optimal number of neurons that gives the best fit while keeping computational time acceptable; in this case it was determined to be 6 neurons.38 This allowed for a high-quality, smooth fit of the line through the known points, as seen in Figure 2a. With too few neurons (1 or 2), the ANN did not converge well or took an extensive amount of time to find the best solution. Conversely, too many neurons (≥8) resulted in the neural network overfitting the line, which leads to inaccurate prediction of the m/z values corresponding to particular frequency points, as shown in Figure 2b. All runs and data reported in this publication were completed utilizing 6 neurons. Statistically, the residuals were interpreted as errors and as such would be expected to follow a normal distribution. Several steps were taken to examine the residuals and evaluate whether they were normally distributed.
The data for the Neural Networks program was collected and applied using MATLAB 7.1.4 R14 SP3 for Windows XP or MATLAB (R2006a) for Red Hat Enterprise Linux 5. Input data was from AGC target values of 5.0 × 10⁵, 1.0 × 10⁶, and 3.0 × 10⁶. The trainbr neural network function from the Matlab Neural Network Toolbox was utilized. Trainbr is a network training function that updates the weight and bias values according to the Levenberg-Marquardt optimization algorithm.38 It minimizes a combination of squared errors and weights, and then determines the correct combination so as to produce a network that generalizes well. This process is called Bayesian regularization. The output also contained an estimate of the effective number of parameters needed for the fit.
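As an illustration of what a "combination of squared errors and weights" means, the sketch below evaluates the objective commonly written for Bayesian regularization, F = β·E_D + α·E_W, where E_D is the sum of squared residuals and E_W is the sum of squared weights. This is a simplified Python illustration under stated assumptions, not the trainbr implementation: here α and β are fixed arguments, whereas a Bayesian-regularized training routine re-estimates them as training proceeds, and the function name is hypothetical.

```python
def regularized_objective(residuals, weights, alpha, beta):
    """Evaluate F = beta * E_D + alpha * E_W for fixed alpha and beta (assumed fixed here)."""
    sum_sq_errors = sum(e * e for e in residuals)   # E_D: misfit to the target data
    sum_sq_weights = sum(w * w for w in weights)    # E_W: penalty on large weights
    return beta * sum_sq_errors + alpha * sum_sq_weights
```

A larger α relative to β favors smaller weights and therefore a smoother fit, which is the mechanism by which this kind of training discourages overfitting.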
The input and target data were normalized to fall between +1 and -1 using the Matlab routine mapminmax prior to initiation of the neural network training, but upon output these values are converted back to their proper numerical values. In addition, the training performance target was set to zero, and training was terminated when validation performance decreased more than two times since the last decrease in performance. Initial neural network weights and biases were assigned using random numbers generated by Matlab. Typically thirty training runs were made per data set. This allowed a wide range of initial weights and biases to be explored. Our criterion for the network which provided the best fit was that it produced the lowest mean square of residuals, termed MSE, while producing residuals which passed at least 4 out of 5 normality tests.
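The normalization step can be sketched in Python as a linear min-max mapping onto [-1, +1] together with its inverse. This mirrors the behavior described for the Matlab routine but is an independent illustration; the function names and the example frequencies are hypothetical.

```python
def fit_minmax(values, lo=-1.0, hi=1.0):
    """Record the range of the training data so the same mapping can be reused later."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        raise ValueError("cannot scale a constant column")
    return {"vmin": vmin, "vmax": vmax, "lo": lo, "hi": hi}


def apply_minmax(s, x):
    """Map a raw value into the normalized range [lo, hi]."""
    return (s["hi"] - s["lo"]) * (x - s["vmin"]) / (s["vmax"] - s["vmin"]) + s["lo"]


def reverse_minmax(s, y):
    """Convert a normalized network output back to its original units."""
    return (y - s["lo"]) * (s["vmax"] - s["vmin"]) / (s["hi"] - s["lo"]) + s["vmin"]


# Hypothetical cyclotron frequencies in Hz, for illustration only.
example_frequencies = [82000.0, 100000.0, 147000.0]
settings = fit_minmax(example_frequencies)
scaled = [apply_minmax(settings, f) for f in example_frequencies]
restored = [reverse_minmax(settings, y) for y in scaled]
```

The settings obtained from the calibration data would be reused unchanged for the validation data, so that both sets pass through the same mapping.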
Data was analyzed through variations of a script which utilized Matlab to train a neural network from one data set, and then used the trained neural network to produce m/z values from another data set, referred to as the “test” data set. These programs were based on equations previously published by Muddiman18, 26 which utilized the relative ion abundance of a given species (AS) and the total ion population (AT) of an FT-ICR MS data set and their effect on mass measurement accuracy. The first program utilized frequency alone and incorporated neither the relative ion abundance of a given species nor the total ion population. The second program incorporated the cyclotron frequency and the total ion population (AT), but not the relative ion abundance of a given species (AS). The third program incorporated the cyclotron frequency, the total ion population, and the relative ion abundance of a given species.
Key data that was determined via the Matlab Neural Networking program was saved in a data matrix which included values for the calibration data set including MMA, median, standard deviation and other values that would allow for interpretation of the neural network training. The Matlab program also exported values for the test data set, which are the values of particular interest since these are the points which the trained neural network actually predicts.
Prior to applying our ANN to FT-ICR data, the effectiveness of the ANN was evaluated by applying the network to the Filip dataset available from NIST.39 The initial 82 points were divided into 54 points for training and 28 for validation. The results are summarized in Table 1. The residuals produced by both MLR and ANN were normally distributed and in excellent agreement with the NIST certified values. The MSE values produced by ANN during training were lower than those produced by MLR; the MSE values produced by ANN during validation were about twice as large as those obtained by MLR. The effective number of parameters utilized by the ANN for the Filip dataset was 9.7 compared to the 11 used by MLR. For the FT-ICR data, the effective number of parameters utilized by the ANN was 9.7 (frequency alone), 10.6 (frequency and AT) and 12.3 (frequency, AT and AS). The fact that the effective number of parameters increases by about one each time another independent variable is added is consistent with the importance of AT and AS in the calibration of FT-ICR experiments. The importance of both AT and AS has been previously reported.18
Several tests were applied to assess whether the residuals were normally distributed. These were the Lilliefors test for residuals,40 the Shapiro-Wilk parametric hypothesis test of composite normality,41 the D'Agostino-Pearson K2 test for assessing normality of data using skewness and kurtosis,42 the Jarque-Bera test,43 and the Anderson-Darling test for assessing normality of sample data.44 The results from these tests, histograms and normal probability plots demonstrated that artificial neural networks provided a better fit of the data than the MLR method previously published: in all but one case, the residuals provided by the neural network passed more of these normality tests than the previously produced MLR results. Also of importance is the mean square of error (MSE) calculated for the results of each run of the ANN; as this value approaches zero it indicates a better fit.
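To make concrete what such a test computes, the following Python sketch implements one of the five, the Jarque-Bera test, from its textbook definition: the statistic combines sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4), and is compared against a chi-square distribution with two degrees of freedom. The p-value here is the asymptotic one; statistical packages may apply small-sample corrections, so values for a 55-point data set could differ from those reported by other software. The function name and sample data are illustrative.

```python
import math


def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value) for a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments using the biased (divide-by-n) convention.
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # The survival function of a chi-square variable with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Illustrative data: a symmetric sample and a sample with one gross outlier.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
with_outlier = [0.0] * 30 + [10.0]
jb_sym, p_sym = jarque_bera(symmetric)
jb_out, p_out = jarque_bera(with_outlier)
```

A small p-value (for example below 0.05) would count as failing the test; the outlier-laden sample above fails, while the symmetric one does not.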
Figure 3 demonstrates one comparison between the previous scheme utilizing MLR26 and the results achieved by the ANN method to perform a fit from the calibration data set. This calibration data was utilized both to generate the coefficients for the MLR equations and to train the ANNs. Figure 3 contains data acquired at an AGC level of 5.0 × 10⁵ and utilized only frequency. Figure 3A illustrates the residuals generated by the fitting of the calibration data by MLR. For this example the points on the normal probability plot deviate from the line on which residuals from a normal distribution would fall. In addition, the box plot identifies several outlier points (+) that are produced by utilizing the MLR methodology on this particular data. In contrast, Figure 3B shows that utilizing ANN to fit the same data produces residuals with a much more normal distribution. Many more data points of the normal probability plot fall along the line that represents a normal distribution of residuals, and there are no outliers present in the box plot. In addition, the residuals from ANN fits pass at least 4 of the 5 normality tests, which is an improvement over the residuals produced by MLR (failed all 5).
Included in Figure 3 are the average mass measurement accuracy (MMA Avg) and the mean square of error (MSE). If systematic error is eliminated MMA average should approach zero and the MSE should also approach zero. The MLR method was able to achieve 24 ppb for MMA average versus −0.05 ppb for the fit provided by the ANN, for this data set. In addition, the MSE value for the ANN fit is also smaller than that provided by the MLR fit of this data. Figure 3 demonstrates only one example of the many data sets that were analyzed in this study. The data for all additional experiments can be found in the Supplementary Material.
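The two summary figures can be computed as in the short Python sketch below. It uses the conventional definition of mass measurement accuracy, (measured − theoretical)/theoretical, scaled to ppm or ppb. Whether the MSE in this work was computed on raw m/z residuals or on normalized values is not stated here, so the sketch assumes raw m/z residuals; the function names are illustrative.

```python
def mma_ppm(measured_mz, theoretical_mz):
    """Signed mass measurement accuracy of one peak in parts-per-million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def summarize_fit(measured, theoretical):
    """Return (average MMA in ppb, mean square of residuals in m/z units squared)."""
    n = len(measured)
    errors_ppm = [mma_ppm(m, t) for m, t in zip(measured, theoretical)]
    mma_avg_ppb = 1000.0 * sum(errors_ppm) / n
    # Assumption: residuals are taken directly in m/z units.
    mse = sum((m - t) ** 2 for m, t in zip(measured, theoretical)) / n
    return mma_avg_ppb, mse
```

Because the MMA average is signed, positive and negative errors cancel, so a value near zero indicates the absence of systematic bias rather than small individual errors; the MSE captures the spread.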
In addition, data from the validation data set was utilized to test how well the ANN performed versus the MLR calibration method. The data from the validation data set also illustrated that the artificial neural network provided a better fit for the data. However, the MMA average provided by the ANN did not achieve the same levels as that provided by MLR, with one exception. One possible explanation is the fact that the residuals from the ANN data are normally distributed while those from the MLR are not. The MMA average for every ANN fit did remain sub-ppm for the validation data set, ranging from 602 ppb to 40 ppb. This and the rest of the data are presented in the Supplementary Information of this manuscript.
All MMA values achieved through the use of artificial neural network calibration and the application of AT and AS parameters to account for global and local space-charge effects improved upon those provided strictly by the instrument.26
The methods reported herein continue to reinforce the importance of accounting for global and local space-charge effects in FT-ICR to achieve the best possible MMA values. It was demonstrated that mean MMA values in the ppb range can be achieved across a range of AGC target levels utilizing artificial neural networks to calibrate data from a hybrid LTQ-FT-ICR MS. Taking into account the automatic gain control settings, it is important to note that the calibration provides the most improvement when used at the highest possible population of ions, as expected. Also of importance is the fact that calibration with ANN did provide a better overall fit for these data (as evidenced by the MSE of the residuals), even though the average MMA was not as good as that provided by the MLR methodology.26 When the methods described herein are executed properly, we are able to obtain average MMAs between 600 and 40 ppb.
The authors gratefully acknowledge financial support received from the National Cancer Institute, National Institutes of Health (R33 CA105295), the W.M. Keck Foundation, and North Carolina State University. In addition, D.K.W. would like to thank Merck & Co. and the ACS Division of Analytical Chemistry for their support.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Supplementary Information A summary of all data analyzed utilizing ANN and MLR is contained in the Supplementary Information. This summary consists of tabulated data as well as figures in a similar format to Figure 3 for every dataset analyzed.