Identification of large numbers of proteins from complex biological samples is a continuing challenge in the area of quantitative proteomics. We introduce here a simple and reliable multistep mass tagging technique using our recently developed solid phase mass tagging reagents. When coupled with two-dimensional liquid chromatography/nano-electrospray ionization ion trap mass spectrometry (2D-LC/nano-ESI-MS), this method allows enhanced protein identification when tested on samples from prokaryotic and eukaryotic sources. The proteome of Escherichia coli D21 grown to either mid-exponential or stationary phase, and the membrane proteome from established breast cancer cell lines BT474 and MCF7 were used as model systems in these experiments. In both experiments, the numbers of total identified proteins are at least twice the numbers identified from a single tagging cycle. The sample complexity can be effectively reduced with corresponding increases in protein identification using the multistep method. The strategy described here represents a potentially powerful technique for large-scale qualitative and quantitative proteome research.
Most biological samples are by nature highly complex and therefore pose a great challenge for currently available analytical tools to provide robust, high-throughput analysis of the components.1 A principal objective of proteomics is the systematic identification of all proteins expressed in a cell or tissue.2 In current proteome research, the use of one-dimensional (1D) separation techniques is limited by their insufficient resolving power and peak capacity. To increase the number of protein identifications, multidimensional separations, especially multidimensional liquid chromatography (LC), have emerged as major methods for resolving complex biological matrixes or cellular extracts prior to the final MS/MS step. For example, multidimensional protein identification technology (MudPIT) is an automated method for “shotgun” proteomics that combines multidimensional liquid chromatography with electrospray ionization tandem mass spectrometry.3,4 It has been demonstrated for the large-scale identification of the proteomes of Saccharomyces cerevisiae and Plasmodium falciparum, and of posttranslational modifications in protein mixtures.2,4,5 Although this method also has limitations, it is capable of identifying hundreds of proteins and offers higher sample throughput than 1D-LC techniques.3 Other applications of multidimensional LC in proteomics have been detailed in reviews.6,7
The next challenge for proteomic techniques is the reduction of sample complexity. Multidimensional LC-MS requires the enzymatic digestion of a complex protein mixture, leading to an even more complex mixture of peptides. To reduce the complexity before the separation and also add a quantitative dimension to the proteomics analysis, a new approach to solid phase mass tagging based on simple and readily available solid phase peptide synthesis methodology was recently introduced.8 In this method, a tri-alanine peptide either containing 12C or uniformly labeled with 13C is synthesized on a methacrylate resin by standard Fmoc chemistry to obtain tri-peptides differing by 9 mass units. Linkage to the resin is achieved through an acid-labile “Rink” linker group. Peptides are derivatized using succinimidyl iodoacetate to provide a cysteine-reactive iodoacetyl moiety. Compared with soluble tagging reagents, such as “isotope-coded affinity tags” (ICAT),9 the solid phase approach permits high-stringency washing procedures because tagged peptides are covalently bound to the solid phase. These mass tagging reagents have been used to quantitatively measure the relative amounts of cysteine-containing peptides in model peptide mixtures and in mixtures of protein tryptic digests.8
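The 9-unit difference between the light and heavy tags follows from the carbon count of tri-alanine. The following is a minimal sketch of that arithmetic, assuming that only the nine alanine carbons carry the 13C label (which is consistent with the stated difference of 9 mass units). The constant and function names are illustrative and do not come from the original method.

```python
# Mass difference between one 13C and one 12C atom, in unified atomic mass units.
# 12C is 12 u by definition; 13C is approximately 13.0033548 u.
DELTA_13C_12C = 13.0033548 - 12.0

CARBONS_PER_ALANINE = 3   # alanine is C3H7NO2
ALANINES_PER_TAG = 3      # tri-alanine tag described in the text


def tag_mass_shift(labeled_carbons: int) -> float:
    """Monoisotopic mass shift for a given number of 13C-labeled carbons."""
    return labeled_carbons * DELTA_13C_12C


# Assumption: every alanine carbon in the heavy tag is 13C, and nothing else is labeled.
labeled = CARBONS_PER_ALANINE * ALANINES_PER_TAG
shift = tag_mass_shift(labeled)

print(f"labeled carbons: {labeled}")
print(f"nominal shift:   {labeled} mass units")
print(f"exact shift:     {shift:.4f} u")
```

The exact shift is slightly larger than the nominal value of 9, which matters only when matching light and heavy peaks on a high-resolution instrument.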
To increase the number of protein identifications from complex biological samples, we introduce a multistep mass tagging procedure that uses these solid phase mass tag reagents, coupled with an on-line automated 2D LC-MS platform we developed recently.10 The principal idea of this method is simple. The multistep mass tagging protocol, a core component of this design, is made possible by the solid phase format of our mass tag reagents. On-line automated 2D LC offers higher resolution and peak capacity, avoids the need for complicated switching valves, and minimizes sample loss. Therefore, sample throughput and protein identifications are potentially improved. In this work, we evaluate the ability of the method to identify more proteins using two groups of complex samples: (1) the expressed proteome of Escherichia coli D21 grown to either mid-exponential or stationary phase, and (2) membrane proteins from breast cancer cell lines BT474 and MCF7.
The E. coli K12 mutant strain D21 used in this study was obtained from the E. coli Genetic Stock Center at Yale University. The cells were incubated for 3 h (mid-exponential phase) or 18 h (stationary phase) as described elsewhere,11 after which they were harvested and stored at −20°C. The thawed bacterial suspension (1 mL) was centrifuged for 2 min in a microfuge (Eppendorf, Hamburg, FRG) to separate whole cells from the growth medium. The medium was decanted and the pellet was resuspended in 200 μL of 0.1 M Tris-acetate buffer with 0.01% (w/v) sodium dodecyl sulfate, pH 8.5. Trypsin and Staphylococcus V8 protease were each added to the suspension at a final concentration of 10 μg/mL, and tris-carboxyethylphosphine (TCEP) was added to a final concentration of 0.1 mM. The suspensions were then incubated for 6 h at 37°C, followed by centrifugation in the microfuge for 2 min. The peptide-rich supernatants were removed and the pH of each was lowered to 4.0 using glacial acetic acid. Peptides were stored at 4°C and subsequently used without further treatment. Membrane digests from the BT474 and MCF7 cell lines were prepared according to our previous publication.10
The concentrations of peptide solutions derived from the cells were determined using the CBQCA quantitative protein assay (Molecular Probes, Eugene, OR).
Synthesis of solid phase mass tags was described previously.8 The multistep mass tagging protocol introduced in this study is schematically illustrated in Figure 1. Because the samples in this study differed in complexity, different protocols were adopted. For E. coli samples, we used a simple three-cycle mass tagging protocol. In the first cycle, 100 μg of either 12C (light) or 13C (heavy) mass tag was added to 20 μg of peptide solution derived from the cells. GuHCl was added to a final concentration of 1 M and samples were incubated for 3 h at room temperature. After incubation, samples were centrifuged for 1 min in a microfuge (Eppendorf, Hamburg, FRG), and the supernatant fluid was retained for the subsequent cycle. For each of the two subsequent cycles, 100 μg of fresh mass tag was added to the supernatant fluid retained from the previous cycle and incubation was repeated as before. In all cases, samples were mixed using a vortex (Reax Top; Heidolph Instruments, Kelheim, FRG) set on continuous mixing mode to keep the resin in suspension during the incubation. In each cycle, after the supernatant was removed, resin-bound peptides tagged with either 12C or 13C were mixed together and then washed 5 times with 200 μL of H2O containing 0.02% trifluoroacetic acid for 30 min each. After washing, tagged peptides were cleaved from the beads by incubating them with 50 μL of 95% (v/v) trifluoroacetic acid, 2.5% acetonitrile, and 2.5% H2O for 1 h at room temperature. Samples were centrifuged as above and supernatants were removed and vacuum-dried in a Speed Vac (Savant Instruments, Holbrook, NY). Finally, mass tagged peptides were resuspended in 10 μL of aqueous 20% acetonitrile containing 0.02% trifluoroacetic acid.
For the more complex samples from breast cancer cell lines BT474 and MCF7, we increased the number of tagging cycles to six. In the first cycle, 200 μg of either 12C (light) or 13C (heavy) mass tag was added to 200 μg of peptide solution derived from the cell lines and incubated for 3 h. For each of the three subsequent cycles, 200 μg of fresh mass tag was added to the supernatant fluid retained from the previous cycle and incubation was repeated as in the first cycle. In the final two tagging cycles, 500 μg of mass tag was added and the reaction was carried out overnight (12 h). All subsequent steps were the same as described above. The amount of mass tag and the reaction time were increased in the final two cycles in an effort to maximize the capture of low-abundance peptides in the mixture.
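The six-cycle schedule can be written down as data, which makes the totals and the step-up in the last two cycles easy to check. This is only an illustrative sketch: the `TaggingCycle` type and `summarize` helper are hypothetical names, the values are those quoted above for a single sample (light or heavy), and washing and cleavage times are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggingCycle:
    """One tagging cycle: mass tag added (micrograms) and incubation time (hours)."""
    tag_ug: float
    hours: float


# Cycles 1-4: 200 ug of tag for 3 h; cycles 5-6: 500 ug of tag for 12 h (overnight).
SCHEDULE = [TaggingCycle(200, 3)] * 4 + [TaggingCycle(500, 12)] * 2


def summarize(schedule):
    """Return totals and the fold increase between the first and last cycle."""
    first, last = schedule[0], schedule[-1]
    return {
        "cycles": len(schedule),
        "total_tag_ug": sum(c.tag_ug for c in schedule),
        "total_hours": sum(c.hours for c in schedule),
        "tag_fold_increase": last.tag_ug / first.tag_ug,
        "time_fold_increase": last.hours / first.hours,
    }


summary = summarize(SCHEDULE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these values each sample receives 1800 μg of tag over 36 h of incubation, with a 2.5-fold increase in tag and a 4-fold increase in time for the final two cycles.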
The automated 2D LC-MS setup consists of a strong cation exchange column (7.5 cm in length, 180 μm i.d.) packed with polysulfoethyl A [poly(2-sulfoethyl aspartamide)] resin. This column was interfaced with a second fused silica column (15 cm in length, 75 μm i.d.) packed with Kromasil C18 resin through the automated auxiliary valve available on the LCQ Deca mass spectrometer. A sample (5 μL) of the tagged peptides was loaded onto the cation exchange column and runs were performed as described elsewhere.8
Data files from the chromatography runs were combined and batch searched against the E. coli or human nr database (National Center for Biotechnology Information) using the Sequest algorithm,12 contained within Thermo Electron’s Bioworks 3.1 software. The Fasta database used for the search was indexed to reflect the specificities of the enzyme cocktails used, without allowing partial cleavage. The Sequest criteria used for protein identifications were as follows. The cross-correlation (Xcorr) score thresholds were 1.5, 2.0, and 3.0 for +1, +2, and +3 charged fully digested peptides, respectively, and a threshold of 0.10 was required for the ΔCn value of each peptide. Manual review of the data was performed for all proteins except those identified by two or more peptides.
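The acceptance criteria above amount to a simple per-peptide filter followed by a per-protein check. The sketch below illustrates that logic on hypothetical records; it is not the Bioworks or Sequest interface. Three points are assumptions: thresholds are treated as inclusive, charge states other than +1 to +3 are rejected, and "two or more peptides" is read as two or more distinct peptide sequences.

```python
from collections import defaultdict
from dataclasses import dataclass

# Thresholds quoted in the text.
XCORR_MIN = {1: 1.5, 2: 2.0, 3: 3.0}   # minimum Xcorr per charge state
DELTA_CN_MIN = 0.10                    # minimum delta-Cn


@dataclass(frozen=True)
class PeptideHit:
    """Hypothetical record for one peptide match (not a Bioworks data type)."""
    protein: str
    sequence: str
    charge: int
    xcorr: float
    delta_cn: float


def passes_filter(hit: PeptideHit) -> bool:
    """Apply the charge-dependent Xcorr threshold and the delta-Cn threshold."""
    threshold = XCORR_MIN.get(hit.charge)
    if threshold is None:          # assumption: other charge states are rejected
        return False
    return hit.xcorr >= threshold and hit.delta_cn >= DELTA_CN_MIN


def proteins_needing_review(hits):
    """Return proteins supported by fewer than two distinct passing peptides."""
    peptides = defaultdict(set)
    for hit in hits:
        if passes_filter(hit):
            peptides[hit.protein].add(hit.sequence)
    return {protein for protein, seqs in peptides.items() if len(seqs) < 2}


# Toy data with made-up identifiers and scores, for illustration only.
hits = [
    PeptideHit("P1", "AACK", 2, 2.5, 0.20),
    PeptideHit("P1", "CCDK", 1, 1.6, 0.15),
    PeptideHit("P2", "DDCR", 3, 3.1, 0.12),
    PeptideHit("P3", "EECK", 3, 2.9, 0.30),   # fails the +3 Xcorr threshold
]
print(proteins_needing_review(hits))
```

In this toy set, P1 is accepted without review, P2 is flagged for manual review, and P3 is not identified at all because its only peptide fails the filter.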
We reasoned that, using a solid phase approach, the mass tagging step itself could be used as another “dimension” of separation. The reason for this is that not all cysteine-containing peptides in a complex mixture react with the tags during the first cycle of tagging, perhaps because of inherent differences in peptide reactivities. Therefore, a sample can be interrogated multiple times by adding fresh mass tags to the unreacted peptides from the previous tagging cycle. The reactions can also be manipulated by varying the time and amount of mass tags added. This concept is illustrated in Figure 1. An added advantage is that the confidence of protein identification is potentially increased, because different cysteine-containing peptides from the same protein might be identified in different mass tagging cycles.
To determine whether multistep mass tagging coupled with 2D LC-MS identifies more proteins, we applied a simple three-step protocol to the analysis of an enzymatic digest of E. coli proteins harvested from D21 cells grown to either mid-exponential or stationary phase. The former was labeled with the 12C mass tag, while the latter was labeled with the 13C tag. Figure 2 is a diagram showing a breakdown of the number of total identifications and new identifications from both cells in each mass tagging cycle. As shown in the figure, the number of identifications in the first cycle accounted for only about half the total identifications. More than 25% of the proteins identified in each cycle were new. The distribution of identified proteins across the three cycles indicates the effectiveness of a multistep mass tagging approach for identifying more proteins.
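The "total" and "new" identifications per cycle reduce to a running set difference: a protein counts as new in a cycle if it was not seen in any earlier cycle. The following is a minimal sketch of that bookkeeping. The function name and the toy protein identifiers are hypothetical and do not reproduce the data in the figure.

```python
def new_identifications(cycles):
    """For each cycle, return (total identified, newly identified) and the overall total.

    `cycles` is an ordered list of collections of protein identifiers, one per cycle.
    """
    seen = set()
    per_cycle = []
    for ids in cycles:
        ids = set(ids)
        new = ids - seen               # proteins not found in any earlier cycle
        per_cycle.append((len(ids), len(new)))
        seen |= ids
    return per_cycle, len(seen)


# Toy example with made-up identifiers, for illustration only.
toy_cycles = [
    {"A", "B", "C", "D"},
    {"C", "D", "E"},
    {"A", "F", "G"},
]
per_cycle, total = new_identifications(toy_cycles)
for number, (found, new) in enumerate(per_cycle, start=1):
    print(f"cycle {number}: {found} identified, {new} new ({new / found:.0%})")
print(f"total distinct proteins: {total}")
```

The fraction of the overall total contributed by the first cycle is then `per_cycle[0][0] / total`, which is the quantity reported as "about half" for the three-cycle experiment.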
The relative protein expression level between the two growth phases is calculated by comparing the areas under the curve in the elution profiles of the two peptides that have identical sequences but, because of isotopic substitution in the mass tagging reagent, differ in m/z (by 9.0, 4.5, and 3.0 for +1, +2, and +3 charged tagged peptides, respectively). The ratios of the expression levels of selected quantified proteins in both cells are listed in Table 1.
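The spacing between the light and heavy peaks is the tag mass difference divided by the charge, and the expression ratio is a ratio of peak areas. The sketch below shows both calculations under stated assumptions: the nominal shift of 9.0 is used, peak areas are assumed to be already integrated, and the ratio is reported as heavy over light (stationary over mid-exponential in the E. coli experiment). The function names and the example areas are illustrative only.

```python
TAG_MASS_SHIFT = 9.0  # nominal mass difference between heavy and light tags


def mz_spacing(charge: int) -> float:
    """Expected m/z spacing between light- and heavy-tagged forms of one peptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return TAG_MASS_SHIFT / charge


def expression_ratio(light_area: float, heavy_area: float) -> float:
    """Relative abundance as heavy/light from integrated elution-profile areas."""
    if light_area <= 0:
        raise ValueError("light peak area must be positive")
    return heavy_area / light_area


for z in (1, 2, 3):
    print(f"+{z}: spacing = {mz_spacing(z)} m/z")

# Made-up peak areas, for illustration only.
ratio = expression_ratio(light_area=1000.0, heavy_area=2500.0)
print(f"heavy/light = {ratio}")
```

A peptide carrying more than one tagged cysteine would show a spacing that is a multiple of these values; that case is not handled here.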
We therefore applied this method to more complex samples, the BT474 and MCF7 membrane digests, by increasing the number of mass tagging cycles and the amount of tagging reagents. The BT474 digest was labeled with the 12C tag, while the MCF7 digest was labeled with the 13C tag. The six-cycle tagging experiment yielded a total of 2024 protein identifications from the cell membrane digests. Figure 3 shows a breakdown of the number of total identifications and new identifications from both cell lines in each mass tagging cycle. As shown in the figure, the number of identifications in the first cycle accounted for only about 25% of total identifications because of the larger number of tagging cycles. In addition, more than half of the proteins identified in each cycle were new. To capture more peptides, the incubation time and the amount of mass tag were increased (by factors of 4 and 2.5, respectively) in the fifth and sixth cycles. As a result, more new proteins were identified in the fifth and sixth cycles than in the fourth cycle.
This report details the development of a new quantitative proteomics approach using multistep mass tagging coupled with 2D LC-MS. Separation of the highly complex peptide mixtures encountered in proteomics represents a considerable challenge even for the multidimensional LC-separation format because of its finite peak capacity. In such a case, a single mass tagging cycle that targets only cysteine-containing peptides, with excess mass tag and a sufficiently long reaction time, may still produce a highly complex mixture of tagged peptides. Another concern is that an excessive amount of mass tag may increase the potential for unexpected side reactions with noncysteine peptides.8
The advantages of the multistep tagging approach described here are immediately apparent. Firstly, it essentially adds another “dimension” of separation to the 2D LC system. The complexity of the tagged peptide mixture in each cycle is reduced, thereby potentially increasing the number of proteins identified and quantified. Secondly, the sample capacity is greatly increased. Theoretically, any amount of starting material can be analyzed. Therefore, this method promises to allow larger amounts of starting material than other quantitative proteomic techniques. Thirdly, the number of tagging steps on the same sample can be extended. This is not possible with soluble ICAT reagents. The increased sample loading capacity also potentially improves the chance of detecting low-abundance proteins.
To evaluate the differential protein expression data obtained in this study, we compared previously published E. coli protein and mRNA abundance data with some of the protein ratios. For example, fadA, coding for 3-ketoacyl-CoA thiolase, is known to be de-repressed and subject to increased expression during entry of cells into the stationary phase.13 This is supported by the ratio (1:2.601) obtained in this work (see Table 1). It has been shown that aconitate hydrase B (coded by acnB), a major citric acid cycle enzyme, is upregulated in the exponential phase, declining upon entry into the stationary phase of growth when a different aconitase (unidentified in this study) is expressed.14 Other selected protein expression ratios were as expected based upon comparisons with previous reports.
In summary, this communication has focused on the advancement of the solid phase mass tagging method in protein identification, and also shows that the technique allows differential quantitation of proteins between related samples. The information from such studies may be used to uncover differences in underlying biological mechanisms. Based on the advantages outlined here, the multistep mass tagging approach should see more applications in quantitative global protein profiling and offers an attractive alternative to commercially available technologies.
This work is dedicated to the memory of our colleague and mentor Csaba Horváth. It was supported by Grant No. GM 20993 from the National Institutes of Health, U.S. Department of Health and Human Services, and Grant No. IRG 58-012-42 from the American Cancer Society. This project has been funded in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, under contract No. N01-HV-28186. We acknowledge a research collaboration with Thermo Electron Corporation.