Protein extraction methods can vary widely in reproducibility and in representation of the total proteome, yet there are limited data comparing protein isolation methods. The methodical comparison of protein isolation methods is the first critical step for proteomic studies. To address this, we compared three methods for isolation, purification, and solubilization of insect proteins. The aphid Schizaphis graminum, an agricultural pest, was the source of insect tissue. Proteins were extracted using TCA in acetone (TCA-acetone), phenol, or multi-detergents in a chaotrope solution. Extracted proteins were solubilized in a multiple chaotrope solution and examined using 1-D and 2-D electrophoresis and compared directly using 2-D Difference Gel Electrophoresis (2-D DIGE). Mass spectrometry was used to identify proteins from each extraction type. We were unable to ascribe the differences in the proteins extracted to particular physical characteristics, cell location, or biological function. The TCA-acetone extraction yielded the greatest amount of protein from aphid tissues. Each extraction method isolated a unique subset of the aphid proteome. The TCA-acetone method was explored further for its quantitative reliability using 2-D DIGE. Principal component analysis showed that little of the variation in the data was a result of technical issues, thus demonstrating that the TCA-acetone extraction is a reliable method for preparing aphid proteins for a quantitative proteomics experiment. These data suggest that although the TCA-acetone method is a suitable method for quantitative aphid proteomics, a combination of extraction approaches is recommended for increasing proteome coverage when using gel-based separation techniques.
Together, genomic and proteomic approaches promise to reveal a multidimensional view of a biological system. Just as genomic studies are plagued with problems such as coverage,1 repeat sequences,2,3 and complex nucleic acid secondary structure, proteomic approaches have their fair share of limitations.4,5 For example, there is no fully characterized proteome equivalent to a fully sequenced genome, as the numbers of potential modifications to a protein that can change its function are numerous—over 300. Additionally, genomic approaches rely on nucleic acids that have highly similar chemical and physical properties and can be accomplished using amplification techniques to increase detection. There is no proteomic technique equivalent to PCR, making it necessary to look at proteins at the concentration at which they exist naturally and in the presence of a great many other proteins, some of which are present at much higher concentrations and most of which vary tremendously in their biophysical properties. Other technical issues plaguing proteomic approaches include gel-to-gel reproducibility,6 biases toward identifying similar proteins in unrelated proteomic studies,7 and reliability of protein extraction methods.8–11 The latter is the most important step in any proteomic experiment, as a reliable and comprehensive protein extraction is the closest proteomic equivalent to a fully sequenced and annotated genome. Any biological conclusions that are drawn from a proteomic study are only as strong as the data indicate—that the extracts are reproducible and rich in protein diversity.
Proteomics approaches are highly valuable for studying organisms with limited genomic resources available, as the power of MS, coupled with database similarity searching, allows the rapid identification of protein homologues in related species.12 Our study focuses on comparing protocols for the extraction of proteins from Schizaphis graminum (Sg), an aphid species of agricultural importance for which there are limited genomic resources. Aphids are plant-feeding insects that pose a worldwide agricultural problem. Besides the obvious damage done to the plant by feeding, aphids are vectors of numerous viruses that infect crop plants.13–16 Proteomics approaches to understanding the molecular mechanisms of virus transmission17–19 promise to reveal new approaches for disease management that may specifically disrupt aphid protein function and aphid-virus interactions. Furthermore, aphids harbor maternally derived endosymbionts, including Buchnera aphidicola, which are necessary for aphid survival20,21 and have been implicated in virus transmission.21 One can easily imagine using a proteomics approach to monitor potential aphid-symbiont protein interactions22 and to identify bacterial protein targets that can be disrupted to compromise aphid survival. This would not be possible with a genetics strategy. In light of the power of proteomics to reveal the molecular details of aphids as crop pests,17–19,23–25 we set out to test commonly used protein extraction methods with aphid proteins.
There are a few properties of aphids, as with all insects, that make protein extraction technically challenging. Two highly abundant proteins, chitin and actin, can interfere with the resolution of proteins of similar molecular weight (MW) and isoelectric point (pI), and they pose a dynamic range problem in protein quantitation. Analogous issues are observed for rubisco in plant extracts10 and albumin in serum extracts.26 Certain exoskeleton proteins, e.g., chitin and actin, which are not well solubilized, even by the strongest chaotropic agents, can interfere with gel electrophoresis by causing the appearance of streaks. Additionally, proteins from the endosymbiont Buchnera should be well represented in aphid protein extracts, especially the highly abundant chaperonin GroEL homologue, symbionin,24 and they may pose challenges similar to those of chitin with regard to dynamic range and isoelectric focusing interference.
To deal with these challenges, we tested and compared three protein extraction methods reported in the literature to be successful with other recalcitrant tissue types: TCA-acetone precipitation,8,27 the phenol extraction method described for plant tissues,8,10,28,29 and the multi-detergent extraction method described for cyanobacterium.30 The virtues and pitfalls of each of these approaches were determined using qualitative and quantitative gel electrophoresis methods, including 2-D Difference Gel Electrophoresis (2-D DIGE). The first 2-D DIGE experiment compared the proteins extracted by all three extraction methods. A second 2-D DIGE experiment explored the TCA-acetone extraction for reproducibility and reliability for future gel-based proteomic studies using Sg as a model system.
Parthenogenetically reproducing colonies of two genotypes of S. graminum, SC and F,31 described previously, were maintained on caged barley (Hordeum vulgare) at 20°C with an 18-h photoperiod. Plants were infested 1 week after germination with 18–20 adult aphids. Colonies were allowed to develop undisturbed for 21 days, after which all of the life stages of the aphids were collected, weighed, and frozen at –80°C in 50 mL BD-Falcon tubes (Becton Dickinson, Franklin Lakes, NJ) for later use. Care was taken to remove any plant and soil debris from the aphids before freezing, so as not to contaminate the aphid protein samples.
Prior to each type of extraction, 3 g of aphids were ground to a fine powder in liquid nitrogen using a prechilled mortar and pestle, transferred to a 50-mL BD-Falcon tube containing the respective extraction solutions, and mixed as described below. Figure 1 shows a simplified flow-chart comparing the three different extraction protocols.
Frozen aphid tissue was added directly to 10% TCA in acetone containing 2% β-mercaptoethanol (β-ME) (1 g aphid tissue:10 ml TCA-acetone w/v) and mixed by inverting the tube 10 times. Proteins were precipitated overnight for at least 12 h at –20°C. Precipitated protein was centrifuged at 5000 g for 30 min, washed three times in 10 mL of ice-cold acetone with vigorous disruption of the pellets with a glass rod between each wash, and air-dried. Pellets were frozen at –80°C until used.
Proteins were extracted in a buffer8 containing 100 mM potassium chloride (KCl), 0.1 mM PMSF, 2% β-ME, 0.7 M sucrose, 500 mM Tris, pH 7.5, 50 mM EDTA, 1% polyvinylpolypyrrolidone, and 1× HALT EDTA-free protease inhibitor cocktail (Pierce, Rockford, IL; 1 g aphid tissue:10 ml phenol extraction buffer w/v). An equal volume of Tris-buffered phenol, pH 7.5, was then added, and the extraction was shaken vigorously on a platform shaker at 4°C for 30 min. The extraction was centrifuged at 5000 g, and the upper phenol layer was removed and re-extracted twice with an equal volume of extraction buffer. The final volume of phenol recovered was typically one-third the starting volume of phenol. To precipitate the proteins, the final phenol phase was added to 5 volumes of 0.1 M ammonium acetate dissolved in methanol. The proteins were precipitated at –20°C for at least 12 h. After precipitation, the pellets were washed twice in ice-cold methanol, twice in ice-cold acetone as described for the TCA-acetone extraction, and air-dried. Pellets were stored at –80°C until used.
Proteins were extracted in a buffer containing 7 M urea, 2 M thiourea, 4% CHAPS, 2% amidosulfobetaine-14, 1% dodecyl maltoside, 20% glycerol, 200 mM KCl, 100 mM dibasic sodium phosphate, pH 7.6, and 1 mM PMSF (1 g aphid tissue:10 ml buffer w/v). Extracts were shaken moderately for 20 min at room temperature and centrifuged at 9400 g for 30 min. Supernatant was collected and added to an equal volume of 10% TCA in acetone containing 2% β-ME to precipitate the proteins overnight at –20°C. Extracts were washed, dried, and stored as described above for the TCA-acetone procedure.
Proteins from each extraction type were solubilized in rehydration buffer (7 M urea, 2 M thiourea, 4% CHAPS) and quantified using a microplate Quick Start Bradford assay (Bio-Rad, Hercules, CA) using BSA to generate a standard curve. Protein (10 μg) was boiled in 20 μl 2× SDS loading buffer32 and loaded onto precast 10-lane, 10–20% PAGE gels (Invitrogen, Carlsbad, CA) with dimensions of 8 cm × 8 cm and 1 mm thick. Gels were run at a constant 125-V for 2 h at room temperature in the SureLock XCell mini-cell (Invitrogen), fixed in 40% methanol:10% acetic acid for 30 min, and stained overnight with Colloidal blue (Invitrogen).
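The quantification step above can be sketched as a least-squares fit of a BSA standard curve followed by back-calculation of the sample concentration. This is a minimal illustration only: the standard concentrations and absorbance readings below are hypothetical, the function names are ours, and the sketch assumes the Bradford response is linear over the range used, which holds only approximately in practice.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(a595, slope, intercept):
    """Invert the standard curve to estimate protein concentration."""
    return (a595 - intercept) / slope


# Hypothetical BSA standards (mg/mL) and blank-corrected A595 readings;
# these values are illustrative and are not data from the study.
bsa_mg_per_ml = [0.0, 0.25, 0.5, 1.0, 1.5]
a595 = [0.00, 0.10, 0.20, 0.40, 0.60]

slope, intercept = fit_line(bsa_mg_per_ml, a595)

# Estimate an unknown sample, then the volume that holds the 10 ug
# loaded per 1-D lane (mg/mL is numerically equal to ug/uL).
sample_conc = concentration_from_absorbance(0.30, slope, intercept)
volume_ul_for_10_ug = 10.0 / sample_conc
print(f"{sample_conc:.2f} mg/mL, load {volume_ul_for_10_ug:.1f} uL for 10 ug")
```

In practice a sample reading outside the range of the standards would be diluted and re-read rather than extrapolated.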
To quantitatively compare the protein extractions, four replicates from each extraction type were labeled with Cy2, Cy3, or Cy5, according to the manufacturer's instructions (GE Healthcare, Piscataway, NJ). Cy-Dye-labeled samples were grouped randomly during 2-D gel electrophoresis so that each gel contained a Cy2-, a Cy3-, and a Cy5-labeled sample.
To examine the TCA-acetone extractions in further detail, three TCA-acetone technical replicates for each genotype were labeled with Cy3 and Cy5 to incorporate a dye-swap design, according to the manufacturer's protocol (GE Healthcare). A total of 150 μg protein was labeled with each dye for the three replicates, allowing for the analysis of 50 μg protein/replicate. A combined Cy2-labeled internal standard containing a mixture of equal amounts of the protein from all of the extracts in the experiment was also included on every gel to facilitate gel-to-gel normalization.
The Cy-Dye-labeled experiments described above (analytical gels) as well as nonlabeled preparative gels for each extraction type were analyzed by 2-D electrophoresis (2-DE). The analytical gels containing Cy-Dye-labeled samples were used for quantitative analysis, and the preparative gels containing nonlabeled samples were used for spot-picking. A total of 50 μg each Cy-Dye-labeled sample or 500 μg protein was loaded onto immobilized pH gradient (IPG) strips (pH 3–10 nonlinear, 24 cm; GE Healthcare) during an overnight passive rehydration of the strips, according to the manufacturer's specifications. The strips containing the Cy-Dye-labeled samples always contained a Cy2-, a Cy3-, and a Cy5-labeled sample. The first dimension was run on the IPGphor II (GE Healthcare) at 20°C with the following settings: Step 1: step and hold for 500 V, 1 h; Step 2: gradient 1000 V, 1 h; Step 3: gradient 8000 V, 3 h; and Step 4: step and hold 8000 V until 70,000 V, 8 h. Next, the IPG strips were reduced for 15 min with 64.8 mM DTT in SDS equilibration buffer (50 mM Tris-HCl, pH 8.8, 6 M urea, 30% glycerol, 2% SDS, 0.002% bromophenol blue) and then alkylated for 15 min with 135.2 mM iodoacetamide in SDS equilibration buffer. The second dimension was carried out using 12% PAGE tris-glycine gels (Jule, Inc, Milford, CT). Gels were cast 1 mm thick by 25.5 cm wide by 20.2 cm tall with an acrylamide:bis ratio of 38:1. The Ettan DALT Six system (GE Healthcare) was used to run the second dimension at 25°C with the following settings: Step 1: 10 mA/gel, 1 h; and Step 2: 40 mA/gel, 6 h or until the bromophenol blue front ran to the bottom of the gels. The preparative gels were fixed in a solution of 10% methanol and 7% acetic acid for 1 h, stained overnight in Colloidal Coomassie blue (Invitrogen), and destained in water for 12 h prior to scanning.
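The DTT and iodoacetamide concentrations quoted above are unusual molar values, and a short conversion suggests why: they appear to correspond to 1% and 2.5% (w/v) solutions, respectively. The sketch below checks this arithmetic. The molar masses are standard reference values rather than values from the text, and the % (w/v) reading is our inference, not something the protocol states.

```python
# Molar masses in g/mol; standard reference values, not taken from the text.
MOLAR_MASS = {"DTT": 154.25, "iodoacetamide": 184.96}


def percent_wv_to_millimolar(percent_wv, molar_mass):
    """Convert % (w/v), i.e. grams per 100 mL, to millimolar."""
    grams_per_liter = percent_wv * 10.0
    return grams_per_liter / molar_mass * 1000.0


dtt_mm = percent_wv_to_millimolar(1.0, MOLAR_MASS["DTT"])
iaa_mm = percent_wv_to_millimolar(2.5, MOLAR_MASS["iodoacetamide"])

print(f"1% (w/v) DTT = {dtt_mm:.1f} mM")
print(f"2.5% (w/v) iodoacetamide = {iaa_mm:.1f} mM")
```

Both results round to the concentrations given for the equilibration steps (64.8 mM and 135.2 mM).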
Gels were scanned on the Typhoon Variable Mode Imager Model 9400 (GE Healthcare) according to the manufacturer's specifications for Cy-Dyes (GE Healthcare), and Colloidal Coomassie blue (Invitrogen)-stained gels were visualized with the 632.8-nm helium-neon laser with no emission filter. DIGE gel images were analyzed using Progenesis Samespots, v. 3.1 (Nonlinear Dynamics, Newcastle Upon Tyne, UK). Fifty manual alignment seeds were added/gel (~12/quadrant), and the gels were then auto-aligned and grouped according to genotype and extraction type for analysis. Spots were selected as being differentially extracted (for the experiment to compare protein extractions) or differentially expressed (to compare the two aphid genotypes, F and SC) if they showed a >1.5-fold change in spot density and an ANOVA score of <0.05.
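The two-part selection rule above (fold change greater than 1.5 and ANOVA score below 0.05) can be expressed as a simple filter. The following is a sketch under stated assumptions, not the Progenesis Samespots implementation: the spot names, normalized volumes, and p-values are hypothetical, the p-values are taken as given rather than computed, and only two groups are compared for brevity.

```python
from statistics import mean


def fold_change(group_a, group_b):
    """Ratio of the larger group mean to the smaller one (always >= 1)."""
    mean_a, mean_b = mean(group_a), mean(group_b)
    return max(mean_a, mean_b) / min(mean_a, mean_b)


def is_differential(group_a, group_b, anova_p, min_fold=1.5, max_p=0.05):
    """Apply both thresholds; a spot must pass each one to be selected."""
    return fold_change(group_a, group_b) > min_fold and anova_p < max_p


# Hypothetical normalized spot volumes and p-values, for illustration only.
spots = {
    "spot_A": ([1.0, 1.1, 0.9], [2.0, 2.2, 1.8], 0.004),  # 2-fold, significant
    "spot_B": ([1.0, 1.1, 0.9], [1.2, 1.3, 1.1], 0.030),  # only 1.2-fold
    "spot_C": ([1.0, 1.4, 0.6], [2.0, 3.0, 1.0], 0.200),  # 2-fold, p too high
}

selected = [name for name, (a, b, p) in spots.items() if is_differential(a, b, p)]
print(selected)
```

Requiring both criteria guards against two failure modes: small but statistically consistent shifts, and large shifts driven by noisy replicates.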
Approximately 200 proteins/extraction were picked manually from the preparative gels using a 1.5-mm picking pen (The Gel Company, San Francisco, CA). Three subsets of spots were selected: if they were unique to a particular extraction; if they were differentially extracted; or if they were not differentially extracted. The gel plugs were washed twice in distilled water, once in a 1:1 mix of 100 mM ammonium bicarbonate (NH4HCO3):acetonitrile (ACN) for 10 min and once in 100% ACN for 5 min. Dehydrated gel plugs were incubated with 100 ng modified trypsin (Promega, Madison, WI) in a total volume of 30 μl 40 mM NH4HCO3, pH 7.8, in 10% ACN for 30 min at 4°C to rehydrate the gel plugs and transferred to 30°C for an overnight digestion. The digestion supernatant containing digested peptides was recovered and saved for MS analysis. Additional peptides were eluted from the gel plugs, first in 50% ACN:2.5% formic acid (FA) and then in 90% ACN:0.1% FA, freeze-dried, resuspended in 10 μl 0.1% trifluoroacetic acid (TFA), and pooled with the digestion supernatant. Peptides were desalted using a C-18, 0.2 μl Ziptip (Millipore, Billerica, MA) and freeze-dried in a vacuum concentrator. The samples were reconstituted in 3 μl 0.1% TFA in 50% ACN prior to analysis by MS. Each sample (0.5 μl) was applied to a target plate (Applied Biosystems, Foster City, CA) and mixed with 0.5 μl matrix (10 mg/ml α-cyano-4-hydroxycinnamic acid in 50% CH3CN/0.1% TFA/1 mM ammonium phosphate) using the dried droplet method.33 All MS data were obtained using a model 4700 proteomics analyzer (Applied Biosystems) with tandem time-of-flight optics using the 4000 Explorer software (Version 3.6; Applied Biosystems). Prior to analysis, the MS was calibrated externally using a six-peptide calibration standard available from Applied Biosystems (4700 Cal mix). 
Most samples were calibrated internally using two common trypsin autolysis products [at mass-to-charge ratio (m/z) values of 1045.5642 and 2211.1046 Da] as mass calibrants. The external calibration was used as the default if the trypsin autolysis products were not observed in the spectra of the samples. MS spectra were acquired across the mass range of 900–4000 Da using 1 kV-positive ions and the reflector mode with a laser power of 4100. The signal from 1600 laser shots was averaged to produce the final MS spectra. For tandem MS (MS/MS) experiments, the instrument was operated at a laser power of 5300 with the collision-induced dissociation off and metastable ion suppressor on. Calibration was external using the known fragments of angiotensin I (monoisotopic mass=1296.6853 Da) as calibrants. For each spot, the 15 most-abundant ions not appearing on the exclusion list with a minimum signal/noise ratio of 10 were selected automatically as precursor ions for MS/MS analysis. The signal from 3000 laser shots was averaged to produce each MS/MS spectrum. All m/z values reported in this study are monoisotopic.
The MS and MS/MS data collected were submitted as a combined search to Mascot (Matrix Science, Boston, MA)34 using the GPS Explorer software, V 3.5 (Applied Biosystems). The experimental data were searched against the entire National Center for Biotechnology Information (NCBI) nonredundant (nr) database, downloaded on July 1, 2007, for Buchnera proteins, and an aphid expressed sequence tag database (www.aphidbase.com) for aphid proteins. To search the data against the Acyrthosiphon pisum (pea aphid) gene models (http://www.hgsc.bcm.tmc.edu/projects/aphid/), a July 28, 2008, version of NCBI nr was downloaded to the Mascot (Matrix Science) server. The following search parameters were used: carbamidomethyl-cysteine and methionine oxidation as variable modifications and one missed tryptic cleavage. The searches were done with a mass error tolerance of 25 ppm in the MS mode and 0.15 Da in the MS/MS mode. The preliminary protein identifications obtained automatically from the software were inspected manually for confirmation prior to acceptance. Homology to known proteins was determined by searching against protein databases in NCBI with BLAST.35 Protein functional classification was determined using the PANTHER v. 6 classification system (http://www.pantherdb.org/). Predicted and hypothetical proteins were not searched using PANTHER and were instead reported as such (Table 1).
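To make the 25 ppm MS tolerance concrete, the sketch below converts a parts-per-million tolerance into an absolute window in Da and tests whether an observed peak falls inside it. The function names and the observed m/z values are hypothetical; the reference m/z is the trypsin autolysis calibrant quoted above. This illustrates the arithmetic only and is not how Mascot applies the tolerance internally.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=25.0):
    """True when the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def window_half_width_da(theoretical_mz, tol_ppm=25.0):
    """Half-width, in Da, of a ppm tolerance window at a given m/z."""
    return theoretical_mz * tol_ppm / 1e6


calibrant_mz = 2211.1046  # trypsin autolysis peak quoted in the text
half_width = window_half_width_da(calibrant_mz)
print(f"25 ppm at m/z {calibrant_mz} is +/- {half_width:.4f} Da")

# Hypothetical observed peaks, for illustration only.
print(within_tolerance(2211.1500, calibrant_mz))  # about 20.5 ppm
print(within_tolerance(2211.1700, calibrant_mz))  # about 29.6 ppm
```

Because a ppm tolerance scales with m/z, the window is narrower at the low end of the 900–4000 Da acquisition range than at the high end, whereas the 0.15 Da MS/MS tolerance is fixed.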
The pellets following precipitation from the three extraction methods had unique characteristics. The phenol pellet was white and flaky when dry. When exposed to the urea rehydration buffer, the pellet was tinged with pink, and it fully dissolved. The TCA-acetone and the multi-detergent pellets were light-brown and grainy when dry and dark-brown and viscous when exposed to rehydration buffer. Qualitatively, the phenol extraction seemed to give the cleanest and most soluble pellet.
1-D SDS-PAGE gels were used to examine the range of protein MW and to assess the presence of interfering substances in the SC aphid genotype extracts. All extraction methods revealed proteins with a wide range of MW from over 200 kDa to as low as 6.5 kDa. None of the gels showed obvious streaking or high background; therefore, they seem to be clean of interfering substances (Fig. 2). Highly reproducible 1-D gel-banding profiles were observed using the TCA-acetone and the phenol extractions. In contrast, the multi-detergent extraction failed to show reproducible 1-D gel-banding patterns; numerous major bands were present in one technical replicate and absent from the other (Fig. 2, arrows), suggesting the multi-detergent extraction protocol may need to be refined further for the extraction of aphid proteins. There were also obvious differences in the protein-banding pattern between the multi-detergent extraction and the TCA-acetone and phenol extractions, as well as slight differences between the TCA-acetone and the phenol extractions (Fig. 2). The 1-D gels were treated with a glycoprotein-specific stain to determine if there was a bias in the glycosylation state of the proteins extracted by the phenol method, as was reported in plants.10 No differences were observed (data not shown) that would explain the variation among the 1-D gels. Minor differences between 1-D gel-banding patterns have translated into major differences when plant extracts were examined using 2-DE.10 Therefore, we carried out 2-DE to further define differences between the extractions.
To examine the extracts in detail, we ran preparative 2-D gels for each extraction type and stained them with Colloidal blue (Invitrogen). Although the spot patterns obtained by all three extraction types were similar, numerous differences in the presence or absence of individual spots and charge trains were apparent. As we suspected from the 1-D gel analysis, gross differences were observed in the 2-D spot patterns of the different multi-detergent extraction technical replicates (Fig. 3). Others have reported that multi-detergent extractions of insect proteins resulted in disappointing performances in protein solubilization, resolution achieved in 2-DE, and subsequent visualization.36 Therefore, the multi-detergent method used here may not be suitable for a gel-based quantitative-comparative approach. Rather, it may be more suited to applications where the primary goal is protein discovery rather than correlating variation between samples to a biological activity.30 Further manipulation of the protocol is required for the multi-detergent method to be suitable for quantitative proteomics of aphid proteins. Quantitative approaches require minimal variation between replicates so one can attribute any change in protein expression to treatment conditions.37 Simply having a method that extracts large quantities of protein is not sufficient unless it is reproducible.
The phenol extracts lacked numerous spots present in the TCA-acetone and multi-detergent extractions (Fig. 3) but also contained its share of unique spots not found in the other extraction methods. The phenol method seemed to perform better with the extraction of proteins of neutral to basic pIs. Although all three methods extracted proteins of a wide range of MW and pI, the TCA-acetone and the multi-detergent methods were better able to extract proteins with an acidic pI. To try to determine why these biases existed, we identified ~200 proteins/extraction type using MALDI MS/MS and determined their functional classifications.
To determine the identity of proteins that were extracted by each of the extraction types, 200 spots/gel were picked and subjected to MALDI MS/MS analysis. Three classes of proteins were selected for picking: those that were present in one or two but absent in the other(s); those that were present in all but showed >1.5-fold change between the extractions (see Quantitative Differences); and those that were not extracted differentially. The MS analysis confirmed that numerous high and low abundance proteins were differentially extracted by the different methods. A summary of selected abundant proteins differentially extracted is presented in Table 2. Calreticulin, a highly abundant calcium-binding protein, and β-tubulin were absent from the phenol extraction, and HSP60 and the p0 ribosomal protein were found only in the TCA-acetone extraction (Table 2, Fig. 4). With the exception of β-tubulin, these are all abundant proteins in the endoplasmic reticulum (ER);38,39 however, the phenol extraction was not totally devoid of ER-derived proteins, as it did contain the ER-derived microsome protein S3.40
The different extraction methods were also able to differentially extract unique isoforms of certain proteins. An acidic form of bicaudal, a critical developmental regulator, was extracted by the multi-detergent method, and a more basic form was identified in the phenol extraction (Table 2). Different isoforms of enolase, an abundant enzyme involved in glycolysis, were identified as being differentially extracted. The isoform at pI 5.7 was found in all extractions. Two additional isoforms at pIs 5.5 and 5.6 were identified specifically in the phenol extraction (Table 2). These different enolase isoforms are known to have distinct binding partners and subcellular localization.41,42 This raises the possibility that the isoforms specific to the phenol extraction share a binding partner or subcellular location that is made more accessible by the phenol during protein extraction. Indeed, at least two additional Buchnera GroEL isoforms were extracted by the phenol method (Fig. 4) but were absent from the protein pools extracted by the other methods. Such mass shifts may be a result of differences in glycosylation.43 Additionally, the GroEL charge trains seen in the TCA-acetone and phenol extractions were not found in the multi-detergent extraction (Fig. 4). Taken together, there are two major points highlighted by these data. First, for any single extraction type, an “extractome” is recovered rather than the entire proteome. Even highly abundant proteins may not be extracted by some methods. Secondly, gel-based methods appear to be well-suited for the identification of different protein isoforms. Using a gel-based approach, isoforms of even low-abundance proteins were identified easily, creating the appearance of charge trains and mass shifts.
There were a few notable differences in the functional classification of the proteins extracted by the three methods (Table 1). Upwards of 10% of proteins from each extraction type were not found in the PANTHER database, and ~6% of the proteins from each extraction type that were included in the PANTHER database had no biological function classified. These do not include the predicted or hypothetical proteins that we identified, which were not searched against the PANTHER database. The TCA-acetone method extracted more proteins involved in cell structure, muscle contraction, protein complex assembly, and protein folding, as compared with the other methods (Table 1). The phenol method extracted more proteins involved in protein phosphorylation but fewer proteins involved in cell motility and no proteins involved in chromosome segregation or intracellular protein trafficking. Given that there were minor differences in the other functional categories, and the quality of the protein identifications by MS among the different extraction types was high overall, we chose the TCA-acetone method for further quantitative investigation, as it outperformed the phenol and multi-detergent extraction types on the total protein yield, was technically the simplest extraction to perform, was the most cost-efficient, and was highly reproducible.
To detect subtle differences in protein concentration and to attribute these changes to biological phenomena, the extraction methods used must be highly reproducible. Therefore, we set out to quantitatively assess and compare each method for its reproducibility and reliability in extracting aphid proteins.
First, we measured the protein yield obtained by each extraction type by the Bradford assay. The most striking difference between the protocols tested was the protein yield obtained by each extraction. The TCA-acetone extraction method resulted in a greater than two- to four-fold higher yield in protein (20.4 mg/g) as compared with the phenol (7.3 mg/g) or the multi-detergent (4.79 mg/g) extraction methods. This might be explained by the fact that the phenol and the multi-detergent extraction methods had centrifugation steps prior to precipitation that would have removed protein-rich debris (for example, exoskeleton and nuclei) found in the TCA-acetone pellets that are later solubilized in the 8 M urea rehydration buffer. In reproductive aphids, the lipid content of somatic cells can range from 58% to 76%;44 therefore, alternatively, it is possible that the enhanced performance of TCA-acetone in extracting aphid proteins correlates with its ability to delipidate and solubilize membrane proteins.45
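The "two- to four-fold" comparison follows directly from the reported yields, as the short calculation below shows. The yields are the values quoted above. The projected total for a 3 g extraction assumes that mg/g refers to milligrams of protein per gram of starting aphid tissue, which is our reading of the units.

```python
# Reported yields in mg protein per g tissue (values quoted in the text).
yields_mg_per_g = {"TCA-acetone": 20.4, "phenol": 7.3, "multi-detergent": 4.79}

reference = yields_mg_per_g["TCA-acetone"]
fold_over = {
    method: reference / value
    for method, value in yields_mg_per_g.items()
    if method != "TCA-acetone"
}
for method, fold in fold_over.items():
    print(f"TCA-acetone yields {fold:.1f}x the {method} extraction")

# Projected total protein from the 3 g of aphids used per extraction,
# assuming yield scales linearly with tissue mass.
starting_tissue_g = 3.0
projected_mg = {m: y * starting_tissue_g for m, y in yields_mg_per_g.items()}
```

The ratios come to roughly 2.8-fold over phenol and 4.3-fold over the multi-detergent method.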
To get the best representation of the aphid proteome in our future studies, we wanted to use the extraction protocol that delivered the greatest number of distinct proteins. To examine the extracts for total number of spots as well as to determine how many spots detected were differentially extracted, we labeled each extraction type with a different Cy-Dye (GE Healthcare) and used 2-D DIGE to compare the extraction protocols. Two technical replicates of each extraction type for each of the two aphid genotypes (F and SC) were included in the experiment. Numerous differences were apparent when examining a merged image of the F genotype proteins extracted by the phenol method labeled with Cy3, the TCA-acetone method labeled with Cy2, and the multi-detergent method labeled with Cy5 (Fig. 5). Similar differences were observed when examining a merged image of the SC genotype proteins extracted by the three methods (data not shown). A total of 1529 spots, from both genotypes, was included in the experiment. A power analysis, which gives the probability of seeing any real difference, revealed that 82.3% of the data were at a power of 0.8 or greater with the two replicates used for each extraction type. Three or more replicates would have given us 100% of the data at a power of 0.8 or greater. To consider a spot extracted differentially, the ANOVA score for the spot was <0.05, and there was at least a 1.5-fold change in the spot intensity for at least one extraction method compared with the other two methods. Each method extracted nonoverlapping subsets of proteins; spots were unique to each extraction, extracted by one or two out of the three methods above the 1.5-fold change threshold, or found in all extraction types below the 1.5-fold change threshold (Fig. 6). A majority of the proteins (1018 of 1388) were common to all three extractomes, but 26.7% of the total spots were differentially extracted by one or two of the three methods. 
Seventy-four spots were unique to the TCA-acetone method, 99 were unique to the phenol method, and 56 spots were unique to the multi-detergent method. The remainder were preferentially extracted by two of the three methods (Fig. 6). These results are in stark contrast to previous reports that the TCA-acetone method is sufficient for total protein extraction.27
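The spot counts above can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in the text and takes 1388 as the denominator for the percentage, since that is the total against which the 1018 common spots are reported. The count of the remaining spots, which the text attributes to preferential extraction by two of the three methods, is derived here and is not stated in the source.

```python
total_spots = 1388   # denominator used for the common-spot count in the text
common_spots = 1018  # spots found in all three extractomes
unique_spots = {"TCA-acetone": 74, "phenol": 99, "multi-detergent": 56}

differential = total_spots - common_spots
percent_differential = 100.0 * differential / total_spots

unique_total = sum(unique_spots.values())
remainder = differential - unique_total  # derived; not stated in the source

print(f"{differential} differential spots ({percent_differential:.1f}%)")
print(f"{unique_total} unique to one method, {remainder} remaining")
```

The computed 26.7% matches the reported fraction of differentially extracted spots.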
Principle components analysis (PCA), proven as a tool to analyze variation in proteomics data,46,47 was used as an exploratory tool to identify potential sources of variation in our experiments. After our data analysis revealed the numbers of spots that were unique to the different extractions, we wanted to use a blind approach to verify the sources of variation in our experiment, and this was provided by PCA. Proteins with similar expression profiles and gels of similar samples will cluster, whereas differentially expressed proteins and gels containing dissimilar samples will segregate spatially. Variation among the individual spot intensities and among the gel replicates was explored simultaneously in the PCA as spot numbers and colored dots, respectively. The first principle component, displayed along the x-axis, is mostly explained by the variation between two different aphid genotypes used in the experiment (Fig. 7). The numbers at the extremes of the x-axis are the most highly, differentially expressed spots between the F and SC genotypes. The second source of variation in this experiment can be explained by the second principle component, as shown along the y-axis. Each technical replicate for these methods cluster close together (Fig. 7) on both axes. Numbers at the top of the y-axis represent spots that are preferentially extracted by the multi-detergent extraction method, and numbers at the bottom of the y-axis represent spots that are preferentially extracted by the phenol extraction method. The black circles denote the gels containing the multi-detergent extraction replicate gels, the red circles denote the gels containing the TCA-acetone replicates, and the green circles denote the gels containing the phenol extraction replicates. Thus, the PCA confirmed that variation as a result of the differences between the multi-detergent and phenol techniques was almost as great as the variation observed between the different genotypes (27–39%, respectively). 
The variation between the TCA-acetone and the phenol methods, 17.81%, was explained by the third principal component (not shown). The variation between technical replicates was a very small fraction of the variation in this experiment (principal component 8, 0.93%).
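To make the PCA reading above concrete, the following is a minimal sketch of how the fraction of variation explained by each principal component, and the position of each gel along it, can be computed from a gels-by-spots table of normalized spot volumes. It is not the software used in this study: the function names and the toy values are hypothetical, and the eigen-decomposition is done by simple power iteration with deflation so that the example needs only the Python standard library.

```python
import math

def center_columns(rows):
    """Subtract each spot's mean (column mean) so PCA describes variation, not level."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def top_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    n = len(matrix)
    vec = [1.0 / (i + 1) for i in range(n)]  # fixed start vector keeps runs deterministic
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # nothing left to explain
            return 0.0, vec
        vec = [x / norm for x in nxt]
        value = norm
    return value, vec

def pca(rows, n_components=2):
    """Return [(explained_fraction, gel_scores), ...] for the leading components.

    rows: one list of spot volumes per gel (gels x spots).
    Assumes the gels are not all identical (total variation must be non-zero).
    """
    x = center_columns(rows)
    n = len(x)
    # Gel-by-gel Gram matrix; its eigenvalues are proportional to explained variance.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in x] for r1 in x]
    total = sum(gram[i][i] for i in range(n))
    components = []
    for _ in range(n_components):
        lam, u = top_eigenpair(gram)
        components.append((lam / total, [math.sqrt(lam) * c for c in u]))
        # Deflation: remove the component just found before looking for the next one.
        gram = [[gram[i][j] - lam * u[i] * u[j] for j in range(n)] for i in range(n)]
    return components

# Hypothetical toy data: three gels per genotype, four spots per gel.
toy_gels = [
    [10.0, 2.0, 5.0, 5.1],   # genotype F, replicate 1
    [10.2, 2.1, 5.0, 4.9],   # genotype F, replicate 2
    [9.9, 1.9, 5.1, 5.0],    # genotype F, replicate 3
    [2.0, 10.0, 5.0, 5.0],   # genotype SC, replicate 1
    [2.1, 9.8, 4.9, 5.1],    # genotype SC, replicate 2
    [1.9, 10.1, 5.1, 4.9],   # genotype SC, replicate 3
]

for index, (fraction, scores) in enumerate(pca(toy_gels), start=1):
    print(f"PC{index}: {fraction:.1%} of variation; gel scores = "
          + ", ".join(f"{s:+.2f}" for s in scores))
```

In this toy table the two genotypes differ strongly in the first two spots, so the first component carries most of the variation and separates the F gels from the SC gels by the sign of their scores, while the replicates of each genotype sit close together.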
To determine the reproducibility and reliability of the TCA-acetone extraction and subsequent DIGE analysis to detect subtle differences in protein expression37 between the aphid genotypes, three technical replicates for the two aphid genotypes were each labeled with Cy3 or Cy5, and a dye-swap design was incorporated for a total of six replicates for each genotype. The power analysis from the previous experiment indicated that we needed at least three replicates to have 100% of the data at a power of 0.8 or greater, so six replicates were more than adequate for this analysis. An all-inclusive Cy2-labeled, pooled internal standard was included to normalize gel-to-gel variation.48 Nonlinear 24 cm format IPG strips (pH 3–10; GE Healthcare) were chosen to provide a broad pI-range, high-resolution map of the aphid proteome. Gel analysis (Samespots, Nonlinear Dynamics) identified 156 proteins as differentially expressed at a threshold of 1.5-fold difference with an ANOVA P value <0.05. A power analysis revealed that 100% of our data were at a power of 0.8 or greater. PCA was again used as an exploratory tool to identify potential sources of variation in our experiment. Variation among the individual spot intensities and among the gel replicates was explored simultaneously in the PCA and is represented as spot numbers and colored dots, respectively. The primary source of variation (87%) in our experiments can be explained by the two different aphid genotypes, F and SC. This variation is plotted along the x-axis and is the first principal component. Numbers in blue are spots up-regulated in genotype F, and numbers in red are spots up-regulated in genotype SC (Fig. 8). The technical replicates spread out along the y-axis, or the second principal component, and show that 4.2% of the observed variation in the experiment was largely a result of technical variation, which includes the technical replicates and dye swaps as well as gel-to-gel variation (Fig. 8).
Thus, as we had suspected, the PCA confirmed that variation as a result of the differences between the technical replicates was a very small fraction of the total variation in this experiment.
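The selection rule described above (at least a 1.5-fold difference together with an ANOVA P value below 0.05) can be illustrated with a short sketch. The function names are hypothetical, the P value is assumed to come from the gel-analysis software rather than being computed here, and treating exactly 1.5-fold as passing the threshold is also an assumption.

```python
from statistics import mean

def normalize_to_standard(sample_volume, standard_volume):
    """Express a Cy3 or Cy5 spot volume relative to the Cy2 pooled standard on the same gel."""
    return sample_volume / standard_volume

def fold_change(group_a, group_b):
    """Signed fold change: positive if group_a is higher, negative if group_b is higher."""
    ratio = mean(group_a) / mean(group_b)
    return ratio if ratio >= 1.0 else -1.0 / ratio

def is_differential(group_a, group_b, p_value, min_fold=1.5, alpha=0.05):
    """Apply both criteria: magnitude of change and statistical significance."""
    return abs(fold_change(group_a, group_b)) >= min_fold and p_value < alpha

# Hypothetical normalized volumes for one spot in the two genotypes.
spot_in_f = [3.1, 2.9, 3.0]
spot_in_sc = [1.6, 1.5, 1.7]
print(fold_change(spot_in_f, spot_in_sc))
print(is_differential(spot_in_f, spot_in_sc, p_value=0.003))
```

Requiring both criteria keeps spots with a large but noisy change, or a significant but very small change, out of the candidate list.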
Each protein extraction isolates a distinct “extractome” and, furthermore, has a unique ability to extract certain types of aphid proteins. In a 2-D DIGE experiment, we demonstrated that each extraction type not only isolates unique proteins but also that each extraction type differentially isolates proteins found in one, two, or three of the extraction types (Figs. 4 and 6 and Table 2). The latter raises the question as to whether any one extraction method is extracting the same pool of proteins, or extracting additional or fewer proteins that resolve at the same MW and pI, resulting in different intensities of the same spot in the different extractions. Indeed, it has been shown that each spot represents several proteins,49,50 and therefore, a change in intensity of the spot among the different extraction types may represent not only differential extraction of a single protein but also differential extraction of multiple proteins. A clever technique to identify and quantify the distinct proteins migrating in a single spot is 2-D GeLC.50 This technique couples the protein-abundance index to the spot intensity to determine the mol fraction of every protein in the spot. The technique might be used to verify fold changes in the proteins deemed responsible for phenotypic differences between samples in cases where more than one protein migrates to a particular MW and pI.
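The idea of splitting one spot among several co-migrating proteins can be sketched as follows. This is only an illustration under stated assumptions, not the published 2-D GeLC procedure: it assumes the abundance index takes the exponentially modified form (10 raised to the ratio of observed to observable peptides, minus 1), it apportions the spot volume in direct proportion to the mol fractions, and the protein names and peptide counts are hypothetical.

```python
def abundance_index(observed_peptides, observable_peptides):
    """Exponentially modified protein-abundance index (assumed form of the index)."""
    return 10 ** (observed_peptides / observable_peptides) - 1

def mol_fractions(peptide_counts):
    """peptide_counts maps protein name -> (observed peptides, observable peptides)."""
    indices = {name: abundance_index(obs, total)
               for name, (obs, total) in peptide_counts.items()}
    index_sum = sum(indices.values())
    if index_sum == 0:
        raise ValueError("no peptides observed for any protein in this spot")
    return {name: value / index_sum for name, value in indices.items()}

def apportion_spot(spot_volume, fractions):
    """Naively split a spot volume among its proteins by mol fraction (a simplification)."""
    return {name: spot_volume * fraction for name, fraction in fractions.items()}

# Hypothetical spot containing two co-migrating proteins.
spot = {"protein_A": (6, 12), "protein_B": (2, 10)}
fractions = mol_fractions(spot)
print(fractions)
print(apportion_spot(1000.0, fractions))
```

With such per-protein estimates for the same spot in each extraction, a change in spot intensity could be traced to the protein or proteins actually responsible for it.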
One must be cautious when comparing results from different extraction protocols, as the PCA of the DIGE data comparing the three extraction techniques revealed almost as much variation between extraction types as exists between the proteomes of different aphid genotypes (Fig. 7). With such large differences in the proteins isolated by different extraction methods, efforts must be made to confirm independently the role of candidate proteins in the biological process that they are presumed to regulate, to avoid chasing artifacts of protein extraction. For example, an isoform of cyclophilin B was discovered as a candidate protein in multiple aphid genotypes that transmit barley yellow dwarf virus.17 The authors went on to show that cyclophilin B binds directly to barley yellow dwarf virions to confirm further the involvement of cyclophilin B in virus transmission.17 In our study, cyclophilin B was only found in the phenol extractions (Table 3); therefore, this protein, critical to the virus transmission pathway, might have been missed had the authors of the previous study used a different protein-extraction protocol.
The differences in the proteomes extracted by the three extraction types also raise the question as to whether one method of protein extraction is sufficient to scan proteomes for biomarkers associated with a specific phenotype. The TCA-acetone method certainly performed well in the quantitative 2-D DIGE experiment to assess differences between the two aphid genotypes. Almost 90% of the variation in the experiment could be attributed to variation between the genotypes and only a small fraction to variation among the technical replicates. These data show that the TCA-acetone method is an ideal first approach for extracting insect proteins. However, when these results are taken together with the DIGE data exploring the multiple extraction methods, additional approaches are recommended when investigating the aphid proteome, as in the case of screening for candidate proteins involved in virus transmission. One idea might be to use a tandem extraction protocol, for example, first using the phenol method to extract one pool of proteins, then using a TCA-acetone precipitation on the pellet generated from the phenol extraction to precipitate a different pool of proteins, and finally, combining the pellets of both for analysis. Other subfractionation schemes have been used previously to achieve greater proteome coverage.51,52 Alternatively, the proteins from different extraction procedures might be combined and analyzed simultaneously. With either suggested approach, it would be necessary to ensure statistically that each extraction step was reproducible using pilot experiments and methods such as PCA and power analyses (http://www.fixingproteomics.org/).
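As a rough guide to the kind of power analysis suggested above for pilot experiments, the sketch below estimates how many replicates per group are needed to detect a given difference, and what power a given number of replicates achieves. It uses the standard normal approximation for a two-sided, two-group comparison, which is an assumption made here for simplicity: it is not the calculation performed by the gel-analysis software, and with very few replicates a t-based calculation would be more appropriate. The function names and the example standard deviation are hypothetical.

```python
import math
from statistics import NormalDist

def replicates_needed(effect, sd, alpha=0.05, power=0.8):
    """Replicates per group to detect a mean difference `effect` given spot-level `sd`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_power = z.inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)

def achieved_power(effect, sd, n, alpha=0.05):
    """Approximate power with n replicates per group (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n)
    return z.cdf(effect / standard_error - z_alpha)

# Example: a 1.5-fold change on a log2 scale, with a hypothetical pilot-study sd of 0.2.
log2_effect = math.log2(1.5)
print(replicates_needed(log2_effect, sd=0.2))
print(achieved_power(log2_effect, sd=0.2, n=3))
```

Running the calculation per spot with the standard deviations observed in a pilot experiment shows what fraction of spots reaches the target power for a planned number of replicates.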
The authors gratefully acknowledge the support of award 2007–04567 National Research Initiative, Cooperative State Research, Education, and Extension Service, U.S. Department of Agriculture, Agricultural Research Service Current Research Information System project numbers 1907–21000-024–00D and 1907–22000-018 and National Science Foundation Division of Biological Infrastructure-0606595. We thank Kevin Howe for excellent technical assistance in maintaining the 4700 proteomics analyzer used in this study and Dawn Smith for help in maintaining aphid colonies.
*Present address: Department of Statistical Science, Cornell University, Ithaca, New York 14853.
†Present address: Department of Chemistry, Mansfield University, Mansfield, Pennsylvania 16933.