Two-dimensional gel electrophoresis (2DE) offers high-resolution separation for intact proteins. However, variability in the appearance of spots can limit the ability to identify true differences between conditions. Variability can occur at a number of levels. Individual samples can differ because of biological variability. Technical variability can occur during protein extraction, processing, or storage. Another potential source of variability arises during analysis of the gels and is independent of the causes named above. We performed a study designed to focus only on the variability caused by analysis. We separated three aliquots of rat left ventricle and analyzed differences in protein abundance on the replicate 2D gels. As the samples loaded on each gel were identical, differences in protein abundance are caused by variability in separation or interpretation of the gels. Protein spots were compared across gels by quantile values to determine differences. Fourteen percent of spots had a maximum difference in intensity of 0.4 quantile values or more between replicates. We then looked individually at the spots to determine the cause of differences between the measured intensities. Reasons for differences were: failure to identify a spot (59%), differences in spot boundaries (13%), difference in the peak height (6%), and a combination of these factors (21%). This study demonstrates that spot identification and characterization make major contributions to variability seen with 2DE. Methods to highlight why measured protein spot abundance is different could reduce these errors.
Two-dimensional gel electrophoresis (2DE) is the most widely used technique for expression proteomics.1,2 A major concern with this approach, however, is variability caused by sources other than true differences between groups of samples.3–6 Biological variability and technical variability can make analysis more difficult. An additional source of variability is the visualization of protein spots and the identification of what is a spot and what are the spot boundaries.7,8 We completed an analysis recently of the variability between gels associated with different detergents.9 Significant differences were found with different combinations of detergents used for solubilizing and focusing the sample. During this analysis, it was apparent that an additional source of variability was the size and shape of the spot outline selected by the software. This area is a largely unrecognized problem in 2DE but may account for a significant component of the variability. In this study, we have further analyzed this source of variability and propose a simple method to detect it.
Rats were sacrificed, hearts were perfused with saline, and the left ventricular free wall was dissected free. Samples were frozen in liquid nitrogen and stored at −80°C until used. Frozen tissue was placed in a Biopulverizer with additional liquid nitrogen and ground to a powder. We used the combination of detergents that we previously found to have the least technical variability.9 Rat heart left ventricle (100 mg) was ground in a tissue homogenizer in 200 μl of the following buffer: 5 M urea, 2 M thiourea, 2% CHAPS, and 2% sulfobetaine 3-10, 0.2% biolytes, 1% DTT, leupeptin (1 μM), pepstatin (1 μM), aprotinin (0.3 μM), EDTA (2.5 mM), sodium orthovanadate (0.2 mM), sodium fluoride (50 mM), PMSF (2.5 mM), and benzonase (50 U/100 μl). The lysate was then sonicated on ice with three 2-s pulses with a 3-s rest between each pulse. The sonication was repeated every 15 min for 1 h, after which the lysates were centrifuged for 5 min at 750 g, 4°C, to remove debris. Protein (600 μg) was added to 600 μl of a buffer containing 7 M urea, 2 M thiourea, 2% CHAPS, and 1% amidosulfobetaine 14, 0.2% biolytes, and 1% DTT. A volume of 185 μl was used to rehydrate an 11-cm immobilized pH gradient strip (pH 3–10). Proteins were focused in a Protean isoelectric focusing cell (Bio-Rad, Hercules, CA, USA) for 100,000 volt hours with a maximum voltage of 8000 volts and a maximum current of 50 μA/strip. After focusing, strips were equilibrated sequentially in buffers containing DTT and iodoacetamide and separated by SDS-PAGE on an 8–16% gradient gel using a Criterion Dodeca cell (Bio-Rad). Gels were washed with deionized water, fixed with 10% methanol/7% acetic acid, stained overnight in the dark with Sypro Ruby (Invitrogen Molecular Probes, Carlsbad, CA, USA), destained with 10% methanol/7% acetic acid, and imaged on an FX Pro Plus fluorescent imager (Bio-Rad). Images were analyzed using PDQuest software, Version 7.1.
Spots were detected automatically and matched on the three images, followed by manual editing of spots and spot alignment by an experienced user to improve detection and eliminate artifacts. Spot intensity, gel coordinates, molecular weight (MW), isoelectric point (pI), and spot dimensions were converted to annotated gel markup language (AGML) and analyzed using a set of tools that we developed using a MATLAB toolbox. The spot intensities were normalized by expressing them as quantile values based on relative intensity.10 Spots that were not detected on a gel were assigned a quantile value of 0. For each match, the quantile difference between the most abundant and least abundant spot was determined.
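The normalization and per-match difference described above can be sketched in a few lines of Python. This is an illustration only, not the AGML/MATLAB tool used in the study: the function names `quantile_values` and `quantile_range` are hypothetical, and the sketch assumes a quantile is the fraction of detected spots on a gel whose intensity is less than or equal to that of the spot in question. The intensities in the usage example are made up.

```python
from typing import List, Optional


def quantile_values(intensities: List[Optional[float]]) -> List[float]:
    """Rank-based quantile (0 to 1) for each spot on one gel.

    Assumption: quantile = (number of detected spots with intensity <= this
    spot) / (number of detected spots). None marks a spot that was not
    detected on this gel; it is assigned 0, as in the text.
    """
    detected = [v for v in intensities if v is not None]
    n = len(detected)
    result = []
    for v in intensities:
        if v is None or n == 0:
            result.append(0.0)
        else:
            rank = sum(1 for d in detected if d <= v)
            result.append(rank / n)
    return result


def quantile_range(per_gel: List[List[float]]) -> List[float]:
    """Highest minus lowest quantile for each matched spot across gels.

    per_gel[g][i] is the quantile of match i on gel g.
    """
    return [max(values) - min(values) for values in zip(*per_gel)]


# Made-up intensities for four matched spots on three replicate gels.
gels = [
    [100.0, 50.0, 200.0, 10.0],
    [90.0, 60.0, 210.0, None],  # fourth spot not detected on gel 2
    [110.0, 40.0, 190.0, 12.0],
]
ranges = quantile_range([quantile_values(g) for g in gels])
```

In this example the fourth match receives a quantile of 0 on the second gel, so its range equals its highest quantile on the other gels. That is why an undetected spot tends to produce a large apparent difference.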
Three aliquots of the same rat heart lysates were separated by 2DE simultaneously. Protein spots were aligned across all three gels. One of the gels and an enlarged region of all three gels are shown in Figure 1. Spots are labeled with numbers and letters to show the high degree of correlation across the gels. Protein spots were well resolved, and most of the spots could be matched across all of the gels. Spot intensity values in each gel were normalized according to quantile using the AGML tool. This normalization approach ranks the proteins according to their intensity and expresses each rank as a value between 0 and 1. We have shown that this method of nonparametric normalization is superior to methods based on trusted spots or on cumulative spot intensity.10
We next analyzed the differences between the highest and lowest abundance spots for each match across the three gels. Seventy-four percent of the spots had a maximum difference in intensity of 0.3 quantile units or less between the highest and lowest abundant members of the set. However, 96 of the 664 spots (14%) had intensities that differed by 0.4 quantile values or more. To determine if larger differences between spots were associated with the MW or pI, we plotted the position on the gels of spots that differed by 0.4 or more in quantiles between the maximum and the minimum. Spots that showed increased variability were distributed across the pI and MW ranges of the gel and did not cluster in any region of the gel (not shown). We next attempted to determine the reason for the difference in spot intensity in this set of replicate gels. Differences in intensity between aligned spots could be a result of failure to identify a spot, inaccurate matching, differences in the outline of the area identified as a spot, or a true difference in intensity of the spot. As the gels were run from replicates of the same sample, any true differences in protein spot volume between gels must be caused by technical variability in the separation. To determine which of these factors was responsible for the variability in the spots with abundance differences, total abundance, dimensions of the spot, and peak intensity were exported to an Excel spreadsheet. These data were analyzed along with a manual assessment of the appearance of the spots. In the set of 96 spots in which the difference in quantile intensity was 0.4 or greater, 57 (59%) of the differences were caused by a failure to identify a spot. An example of a spot that was not detected in one gel is seen in Figure 2. Spot number 380 is identified in the first two gels but not in the third. The flat field view and the 3D view show that the spot is present in the third gel but was not identified by the software.
The omission was not recognized initially by the human user. The cause of the abundance differences was then compared in the remaining 38 spots, for which each gel had a spot. Each spot can be described in terms of its x and y dimensions, which determine its size, and by a peak intensity. These dimensions were compared. A level of twofold difference in dimensions or peak intensity was chosen as a cutoff to identify important discrepancies. Differences in peak intensity represent true changes in the amount of protein present at a given point on the gel. Differences in x or y dimensions represent true differences in the size of the spot or an inconsistency in the definition of the boundaries of the spot. Differences in the outline of the area were the cause of 12 (13%) of the errors. A true difference in the peak height was seen in 6 (6%), and a combination of differences in the peak height and spot dimensions was the cause in 20 (21%). In no cases was an error in matching of spots the cause. All errors could be attributed to some parameter of spot detection. An example of a difference in assignment of the dimensions of the spot is seen in Figure 3. Spot number 373 is seen in all three gels. The x dimension differs between them, however. In gel number 1, a shoulder of spot number 372 is identified as a part of spot number 373, accounting for the error in intensity between replicates. An example of a spot that is different in peak height between the replicates is shown in Figure 4. In this example, spot number 307 has a peak height of 227 in gel 1 and 199 in gel 2 but only 83 in gel 3.
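The decision steps described above (a quantile difference of 0.4 or more, then a missing spot, then a twofold difference in dimensions or peak intensity) can be sketched as a small Python function. This is a minimal illustration under stated assumptions, not the analysis actually performed in the study: the `Spot` class, the `classify_match` function, and the category labels are hypothetical, the twofold cutoff is treated as "at least twofold", and all dimensions and quantiles in the usage example are made up. Only the peak heights 227, 199, and 83 come from the text (spot number 307).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Spot:
    quantile: float  # normalized intensity, 0 to 1
    x_size: float    # spot width (assumed positive)
    y_size: float    # spot height (assumed positive)
    peak: float      # peak intensity (assumed positive)


def fold(values: List[float]) -> float:
    """Ratio of the largest to the smallest value."""
    return max(values) / min(values)


def classify_match(spots: List[Optional[Spot]],
                   q_threshold: float = 0.4,
                   fold_cutoff: float = 2.0) -> str:
    """Suggest why one matched spot differs across replicate gels.

    spots holds one entry per gel; None means no spot was detected there.
    """
    quantiles = [0.0 if s is None else s.quantile for s in spots]
    if max(quantiles) - min(quantiles) < q_threshold:
        return "consistent"
    if any(s is None for s in spots):
        return "spot not detected"
    dims_differ = (fold([s.x_size for s in spots]) >= fold_cutoff
                   or fold([s.y_size for s in spots]) >= fold_cutoff)
    peak_differs = fold([s.peak for s in spots]) >= fold_cutoff
    if dims_differ and peak_differs:
        return "combination"
    if dims_differ:
        return "spot boundaries"
    if peak_differs:
        return "peak height"
    return "unexplained"


# Peak heights from the text for spot number 307; other values are made up.
spot_307 = [Spot(0.90, 10, 10, 227), Spot(0.85, 10, 10, 199), Spot(0.40, 10, 10, 83)]
label = classify_match(spot_307)
```

Matches labeled "spot not detected" or "spot boundaries" are the ones a user would most want to re-examine on the gel images, since they are less likely to reflect a real difference in the amount of protein.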
This study was designed to assign the cause of variability within replicate gels from the same sample. As the sample is the same, the underlying protein abundance must be the same for all gels. We found that 14% of spots differed by 0.4 quantile units or more. The majority of the differences were a result of the failure of the software to identify a spot. The failure was not detected by an experienced human user during the initial confirmation of the spot alignment. The detection of these errors by a simple analysis of the individual parameters of the spots suggests an approach to dealing with these sources of error. The use of the data as an XML file, as we have done with AGML, makes these parameters relatively easy to check. Simple algorithms can detect why spots are different. Although there are exceptions to the rule, spots that are not detected at all or for which the x or y dimensions of the spot are changed are less likely to be true differences than spots that have differences in peak height. These algorithms can then direct the human user to re-examine spots that appear to be different but where the differences are primarily a result of differences in x or y dimensions or where differences occur because of an absence of the spot.
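To illustrate why spot parameters stored as XML are easy to check, the following Python sketch reads spot dimensions and peak height from a small XML fragment with the standard library. The element and attribute names (`gel`, `spot`, `x_size`, `y_size`, `peak`) are invented for this example and are not the actual AGML schema. The function name `read_spots` and the dimension values are likewise hypothetical. Only the peak height of 83 for spot number 307 on gel 3 comes from the text.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical XML layout for illustration only; NOT the real AGML schema.
SAMPLE = """
<gel id="3">
  <spot id="373" x_size="14" y_size="9" peak="120"/>
  <spot id="307" x_size="10" y_size="10" peak="83"/>
</gel>
"""


def read_spots(xml_text: str) -> Dict[str, Dict[str, float]]:
    """Return {spot id: {"x_size", "y_size", "peak"}} for one gel."""
    root = ET.fromstring(xml_text)
    spots = {}
    for spot in root.iter("spot"):
        spots[spot.get("id")] = {
            "x_size": float(spot.get("x_size")),
            "y_size": float(spot.get("y_size")),
            "peak": float(spot.get("peak")),
        }
    return spots


gel3 = read_spots(SAMPLE)
```

Once the parameters for each gel are in a dictionary like this, comparing the dimensions and peak height of a given spot number across gels is a simple lookup.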
This project has been funded with federal funds as part of the National Heart, Lung, and Blood Institute (NHLBI) Proteomics Initiative from the NHLBI, National Institutes of Health, under contract no. N01-HV-28181. Additional support for this project came from the Department of Veterans Affairs.
The authors have no conflict of interest.
The use of animal tissues described in this manuscript was approved by the Medical University of South Carolina Institutional Animal Care and Use Committee.