Ectopic pregnancy (EP) and normal intrauterine pregnancy (IUP) serum proteomes were quantitatively compared to systematically identify candidate biomarkers. A 3-D biomarker discovery strategy consisting of abundant protein immunodepletion, SDS gels, LC-MS/MS, and label-free quantitation of MS signal intensities identified 70 candidate biomarkers with differences between groups greater than 2.5-fold. Further statistical analyses of peptide quantities were used to select the most promising 12 biomarkers for further study, which included known EP biomarkers, novel EP biomarkers (ADAM12 and ISM2), and five specific isoforms of the pregnancy specific beta-1-glycoprotein family. Technical replicates showed good reproducibility and protein intensities from the label-free discovery analysis compared favorably with reported abundance levels of several known reference serum proteins over at least three orders of magnitude. Similarly, relative abundances of candidate biomarkers from the label-free discovery analysis were consistent with relative abundances from pilot validation assays performed for five of the 12 most promising biomarkers using label-free multiple reaction monitoring of both the patient serum pools used for discovery and the individual samples that constituted these pools. These results demonstrate that robust, reproducible, in-depth 3-D serum proteome discovery and subsequent pilot-scale validation studies can be achieved readily using label-free quantitation strategies.
Unbiased quantitative proteome comparisons of biological fluids such as serum or plasma from patients and normal controls are generally thought to have great potential for discovering novel biomarkers for diseases such as cancers and clinical conditions such as ectopic pregnancy (EP). However, such discovery studies in serum have proven to be quite difficult and frequently disappointing due to factors that are now well recognized, including: the high complexity of serum proteomes; a wide protein abundance range spanning more than 10 orders of magnitude; the presence of most clinically useful biomarkers at very low levels; a high patient-to-patient variability; and potential biases due to variations in sample collection and processing.1–4 These factors produce a challenging dichotomy when developing biomarker discovery strategies. Specifically, serum's complexity and wide dynamic range, combined with the need to detect low-abundance proteins, requires that extensive fractionation be used in order to achieve a good depth of analysis, which limits throughput. However, patient-to-patient heterogeneity requires that relatively large numbers of patient samples be analyzed. Common compromises for dealing with these opposing factors include: use of mouse or in vitro models, pooling of patient samples for the discovery phase, and/or analyzing less than ideal numbers of patients in the discovery phase followed by evaluation of candidate biomarkers in larger numbers of patients.
EP is a clinical condition where biomarkers are urgently needed to improve early diagnosis and where discovery studies must be conducted using human specimens due to the lack of a good experimental model system.5–8 An EP occurs when the embryo implants at a site other than in the uterus, typically the fallopian tube. As the fetus grows, this condition becomes life-threatening due to potential tubal rupture and internal hemorrhage. The incidence of EP is increasing due to a number of factors, and it is now the second-most-common cause of maternal death in the first trimester of pregnancy.5, 7, 9–11 Although a few serum protein markers for EP have been proposed, EP is currently diagnosed using a combination of trans-vaginal ultrasound and serial serum β-human chorionic gonadotrophin (β-hCG, gene name: CGB) levels.8–10, 12, 13 However, EP remains difficult to diagnose at an early stage, and approximately 50% of patients with this condition are initially misdiagnosed, resulting in significant morbidity and mortality.11, 14 Furthermore, nearly a third of all cases do not exhibit any clinical signs and 9% have no symptoms prior to tubal rupture.15
A wide range of protein separation methods have been used to fractionate serum proteomes with varying success.16, 17 One method used for simplifying serum or plasma is depletion of high-abundance proteins, and depletion of varying numbers of major proteins is now a first step in many serum or plasma biomarker discovery strategies.18–20 Another powerful protein fractionation method is 1-D SDS gel separation followed by in-gel trypsin digestion of slices that encompass the entire lane and LC-MS/MS. In several recent systematic comparisons of analytical methods, this method, which is often called GeLC-MS/MS, was shown to provide greater depth of analysis compared with most alternative fractionation methods.21, 22
We previously demonstrated that a 4-D analysis strategy comprised of immunodepletion of six major serum proteins, solution IEF of proteins, 1-D SDS PAGE, and LC-MS/MS, resulted in deep mining of human serum and plasma proteomes with substantial detection of known proteins in the low ng/ml range.18, 23 In a multi-laboratory comparative study, the 4-D method detected more blood proteins than any other method used in that study, although other extensive fractionation schemes with good depth of analysis for serum subsequently have been reported.16,17 While the 4-D method has good depth of analysis for serum, it is extremely low throughput due to the extensive fractionation at the protein level that results in 120 to 200 LC-MS/MS analyses per proteome. Also, the 4-D method is not readily compatible with quantitation using stable isotope tags because the most robust tagging strategies label peptides after protease cleavage; hence, any variations in the three sequential protein fractionation steps prior to trypsin digestion could result in artifactual changes that are not specific to the biology being tested.
Over the past several years, the use of label-free quantitation has gained favor as an alternative to stable isotope labeling strategies, particularly for samples such as human serum where metabolic labeling with stable isotopes is not an option. Label-free quantitation methods either estimate protein abundance changes based upon the number of MS/MS spectra observed for a given protein (spectral counts) or measure chromatographic peptide ion intensities using peak height or area.24–27 Although the use of spectral counts is simpler than global comparisons of MS signal intensities, the latter method is better suited for quantitative comparisons of low-abundance peptides because such peptides infrequently and inconsistently yield MS/MS spectra that are annotated with the correct peptide sequence.
In the current study, we used a 3-D method to systematically compare sera from patients with EP and IUP to identify candidate EP biomarkers. The 3-D method consisted of immunodepletion of 20 abundant serum proteins followed by GeLC-MS/MS analysis, with subsequent label-free quantitative comparisons using Rosetta Elucidator software28 (v3.1, Rosetta Biosoftware, Seattle, WA) to align and compare data at the MS ion intensity level. This analysis identified 70 candidate biomarkers with greater than 2.5-fold difference between the EP and IUP groups, and a high-priority biomarker subset was selected based upon the statistical probability that annotated peptides could properly classify samples into the EP or IUP group. Pilot validation of several biomarkers was conducted using label-free multiple reaction monitoring (MRM) to analyze the individual samples that constituted the pools used for the initial discovery experiments. The results demonstrate that both label-free methods were reproducible and yielded consistent relative abundance changes, which resulted in identification of novel EP biomarkers as well as specific isoforms of a previously reported EP-related protein family.
200 proof molecular biology grade ethanol, LC-MS grade formic acid, and iodoacetamide were purchased from Sigma-Aldrich (St. Louis, MO). Sodium dodecyl sulfate (SDS) and Tris were purchased from Bio-Rad (Hercules, CA). Dithiothreitol (DTT) was obtained from GE Healthcare (Piscataway, NJ). HPLC grade acetonitrile was purchased from Thomas Scientific (Swedesboro, NJ). Sequencing grade modified trypsin was purchased from Promega (Madison, WI).
Serum was collected from nine patients with an ectopic pregnancy and nine matched controls with normal intrauterine pregnancies. Specimens were matched based on gestational age (range of 4 weeks, 2 days to 11 weeks, 3 days), hCG level (3821–52430 mIU/ml) and diagnosis (EP or IUP). Blood was collected by venipuncture into BD Vacutainer red/grey serum separator tubes (BD, Franklin Lakes, NJ), allowed to clot at RT, and centrifuged. Serum was then aliquoted, frozen, and stored at −80 °C.
Samples were depleted of 20 abundant serum proteins using a ProteoPrep20 Immunodepletion Column (Sigma-Aldrich). Typically, 100 μL of serum was filtered through a 0.22 μm microcentrifuge filter and injected onto the column. The flow-through fractions containing unbound proteins were collected, pooled, and precipitated with nine volumes of 200 proof ethanol, pre-chilled to −20 °C. Ethanol supernatants were carefully removed and protein pellets were frozen and stored at −20 °C until further use. Fractions containing affinity-bound abundant proteins were collected and pooled, neutralized with 1 M NaOH, and frozen for possible future analysis.
Prior to 1-D SDS-PAGE, frozen protein pellets from ethanol precipitation of depleted serum were thawed briefly and re-suspended in 50 mM Tris-Cl, 1% SDS, pH 8.5. Samples were reduced with 20 mM DTT for 1 h at 37 °C and alkylated with 60 mM IAM in 50 mM Tris-Cl, pH 8.5 for 1 h at 37 °C. Alkylation was quenched with 50 mM DTT for 15 min at 37 °C. Following in-solution reduction and alkylation, samples were prepared for PAGE by addition of SDS sample buffer. For each sample, aliquots representing 10 μL of original serum per lane were loaded into 10-well 12% NuPAGE mini-gels (Invitrogen, Carlsbad, CA) and separated using MES running buffer until the tracking dye had migrated 2 cm. Gels were stained with Colloidal Blue (Invitrogen), and each lane was subsequently sliced into 21 uniform 1 mm slices using a custom razor-blade array. Corresponding slices from three lanes for each depleted serum sample were combined in single wells of a 96-well pierced plate (Biomachines, Inc., Carrboro, NC). Gel slices were digested overnight using 0.02 μg/mL modified trypsin. Following digestion, aliquots of corresponding fractions from three patients in each group were pooled to produce three EP and three IUP serum fraction pools. These pools and the remainder of individual sample digests were frozen and stored at −20 °C for future discovery and validation analyses, respectively.
For initial discovery of candidate biomarkers, pooled tryptic digests were analyzed in duplicate using an LTQ-Orbitrap XL mass spectrometer (Thermo Scientific, Waltham, MA) interfaced with a Nano-ACQUITY UPLC system (Waters, Milford, MA) with the column heater maintained at 40 °C. For each tryptic digest, 6 μL was injected onto a UPLC Symmetry trap column (180 μm i.d. × 2 cm packed with 5 μm C18 resin; Waters), and tryptic peptides were separated by RP-HPLC on a BEH C18 nanocapillary analytical column (75 μm i.d. × 25 cm, 1.7 μm particle size; Waters). Solvent A was Milli-Q (Millipore, Billerica, MA) water containing 0.1% formic acid, and Solvent B was ACN containing 0.1% formic acid. Peptides were eluted at 200 nL/min using an ACN gradient consisting of 5–28% B over 42 min, 28–50% B over 25.5 min, 50–80% B over 5 min, and 80% B for 4.5 min before returning to 5% B over 0.5 min. The column was re-equilibrated using 5% B at 400 nL/min for 20 min before injecting the next sample. The mass spectrometer was set to scan m/z from 400 to 2000. The full MS scan was collected at 60,000 resolution in the Orbitrap in profile mode followed by data-dependent MS/MS scans on the three most abundant ions exceeding a minimum threshold of 1000, collected in the linear trap. Monoisotopic precursor selection was enabled and charge-state screening was enabled to reject z = 1 ions. Ions subjected to MS/MS were excluded from repeated analysis for 60 s. The order of sample analysis was randomized to prevent temporal experimental bias. Mass spectrometer, HPLC, and autoinjector performance were rigorously monitored to maintain mass accuracies within 2 ppm, retention times within a ±1.0 min window, and injection volumes within ±10% to facilitate label-free pattern comparisons.
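As a quick arithmetic check on the timings above, the gradient segments and the re-equilibration step can be summed to give the approximate instrument time per injection. This is only a sketch: the segment labels and variable names are illustrative, and sample loading or autosampler overhead is not included.

```python
# Gradient segments (minutes) as listed in the LC method description.
segments = [
    ("5-28% B", 42.0),
    ("28-50% B", 25.5),
    ("50-80% B", 5.0),
    ("hold at 80% B", 4.5),
    ("return to 5% B", 0.5),
]
re_equilibration_min = 20.0  # 5% B at the higher flow rate

# Total gradient time, and gradient plus re-equilibration per injection.
gradient_min = sum(minutes for _, minutes in segments)
cycle_min = gradient_min + re_equilibration_min

print(f"gradient: {gradient_min} min, full cycle: {cycle_min} min")
```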
LC-MS and LC-MS/MS data were analyzed using the Rosetta Elucidator system. A total of 252 raw MS spectra files were imported into the system (6 depleted serum pools × 21 fractions × duplicates); LC-MS data were acquired from 0–98 min, but based on elution profiles of peptides and density of ion signals, data for the label-free comparison were trimmed to 20–75 minutes and the m/z range was trimmed to 400–1800. Retention time (RT) alignment, feature identification (discrete ion signals), and feature extraction across the entire chromatographic time window were performed by the Elucidator software, essentially as described by others.29,30 DTAs were created with BioWorks v. 3.3.1 (Thermo Scientific) using high-quality features with z >1 and <5, and having peak scores greater than 0.7 and 0.8 for RT and m/z, respectively. Peak scores, as defined in the Rosetta Elucidator System User Guide,31 are correlation coefficients that compare the shape of a feature in the time and m/z dimensions to the shape of an ideal peak, with an ideal peak having a score of 1. DTAs were searched using the SEQUEST algorithm (v. 28, rev. 13, University of Washington, Seattle, WA) with a full tryptic constraint against a human UniRef100 protein sequence database (10/23/2007, 84,662 entries) to which commonly observed “contaminants” were added (trypsin, keratins, etc.). A decoy database was produced by reversing the protein sequence of each database entry, and the entire reversed database was placed in front of the forward database. Peptide and protein information was assigned to features using the Protein and Peptide Tellers, which are Rosetta Biosoftware's re-implementations of the open-source ProteinProphet™ and PeptideProphet® programs,32, 33 respectively.
Specifically, as described in the Rosetta Elucidator System User Guide, Peptide Teller validates peptides assigned to MS/MS spectra by search engines by computing probabilities that search results are correct in the dataset based on search scores and peptide properties. Protein Teller computes probabilities that proteins were present in a sample based on the combined probabilities of their corresponding peptides. Importantly, it deals with two issues critical for protein inference: First, correct peptides often correspond to multi-hit proteins whereas incorrect peptides most often correspond to single-hit proteins. This non-random grouping of peptides with their corresponding proteins can lead to an amplification of the false positive error rate at the protein level. Protein Teller counteracts this effect by penalizing peptides corresponding to single-hit proteins at an appropriate amount learned from each data set. Second, a substantial number of identified peptides are common to multiple database entries. This is especially true for human and other higher eukaryotic species, which usually contain alternative splice forms, large, homologous protein families, and partial sequences in the databases. Protein Teller apportions common peptides among all corresponding proteins to derive the simplest list of proteins that can explain the observed peptides.31 Data were filtered using Protein Teller scores of correct identification probability > 0.95 and Peptide Teller scores > 0.8.
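The decoy construction described above (each entry reversed, with the reversed set placed ahead of the forward entries) can be sketched as follows. This is a minimal illustration and not the software actually used: the function name, the `REV_` accession prefix, and the in-memory tuple representation are assumptions made for the example.

```python
def build_target_decoy(entries, decoy_prefix="REV_"):
    """Return reversed (decoy) entries followed by the forward entries.

    entries: list of (accession, sequence) tuples.
    decoy_prefix: hypothetical tag used here to mark decoy accessions.
    """
    decoys = [(decoy_prefix + accession, sequence[::-1])
              for accession, sequence in entries]
    # Reversed database first, forward database second.
    return decoys + list(entries)


# Toy sequences for illustration only.
database = build_target_decoy([("P1", "MKTAYIAK"), ("P2", "GSHMLEDR")])
for accession, sequence in database:
    print(accession, sequence)
```

Reversal keeps the amino acid composition and length of each entry while scrambling its tryptic peptides, which is why matches to the reversed half can serve as a proxy for random matches.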
The experiment was defined in the Elucidator System as having two treatment groups (EP, IUP). Each treatment group included three pools of three individual serum samples, with two technical replicates per pool. Several strategies and tools within the Elucidator System were used to analyze the data, including differences at the annotated peptide level, the protein level, and peptide trend plots. Specifically, the 2-D visual script shown in Supplemental Figure S1 utilized peptide annotation to sum feature intensities across gel slice fractions within each sample, and peptides significantly different between groups were defined using a two-way Analysis of Variance (ANOVA) with p<0.001. Peptides were grouped into consensus proteins using Protein Teller and protein level ratios were determined using those peptides that were significantly different between groups, as defined by ANOVA.
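The step of summing feature intensities across gel slice fractions within each sample can be illustrated with a short sketch. It is not the Elucidator visual script itself; the tuple layout, function name, and toy values are assumptions made for illustration.

```python
from collections import defaultdict


def sum_across_fractions(features):
    """Sum intensities per (sample, peptide) over all gel slices.

    features: iterable of (sample, gel_slice, peptide, intensity).
    """
    totals = defaultdict(float)
    for sample, _gel_slice, peptide, intensity in features:
        totals[(sample, peptide)] += intensity
    return dict(totals)


# Toy data: one peptide seen in two adjacent slices of the same sample.
features = [
    ("EP_pool1", 4, "PEPTIDEA", 100.0),
    ("EP_pool1", 5, "PEPTIDEA", 50.0),
    ("IUP_pool1", 4, "PEPTIDEA", 30.0),
]
totals = sum_across_fractions(features)
print(totals)
```

Summing over slices matters because a protein that straddles a slice boundary would otherwise appear to change in abundance whenever the cut position shifts slightly between lanes.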
A subsequent independent manual analysis was conducted by exporting the peptide report results, which included values for technical replicates, into Microsoft Excel (Microsoft Corporation, Redmond, WA). Peptides were grouped into proteins based on protein description and pair-wise ratios between average intensities of IUP and EP were calculated for each peptide as well as the summed intensity for the protein. In addition, a further statistical test was developed independently to identify those peptides with the greatest discrimination power between groups, as summarized below.
We assumed peptide logarithmic expression levels in each sample were normally distributed and introduced two statistical measurements, sum-of-Z-score (sumZscores) and probability-of-misclassification (Pm), to objectively quantitate the separation between the two distributions. Given two normal distributions with means and variances (μ1, σ1²) and (μ2, σ2²), respectively, sumZscores computes the distance between the two means in terms of Z-scores, taking into account the widths of the distributions. Explicitly, we have the following expression for sumZscores,
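The displayed expression is not reproduced in this copy of the text. One form consistent with the verbal definition above, offered as an assumption rather than as the authors' exact formula, expresses the separation of the two means in units of each distribution's standard deviation and adds the two values, with μ1, μ2 the means and σ1, σ2 the standard deviations of the two distributions:

```latex
\mathrm{sumZscores} \;=\; \frac{\lvert \mu_1 - \mu_2 \rvert}{\sigma_1} \;+\; \frac{\lvert \mu_1 - \mu_2 \rvert}{\sigma_2}
```

Under this reading, a peptide with μ1 = 5, μ2 = 7, σ1 = 1, and σ2 = 0.5 on the logarithmic scale would have sumZscores = 2 + 4 = 6.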
On the other hand, the probability-of-misclassification (Pm) of a peptide represents the minimal theoretical error that would occur if we were to classify samples from a balanced mixture of two normal distributions into the EP or IUP group by thresholding on the logarithmic expression level of that peptide. In practice, the optimal threshold value can be found by solving a quadratic equation for the point(s) where the two normal distributions yield equal density, and then selecting the one with the lower classification error. The value for Pm is then computed as the corresponding minimal theoretical error. A detailed derivation of Pm is described in Supporting Information.
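The Pm calculation described above can be sketched as follows. This is a minimal illustration assuming equal class priors (a balanced mixture) and a single threshold; the function names are hypothetical and the code is not taken from the Supporting Information. Setting the two normal log-densities equal gives a quadratic a·x² + b·x + c = 0 in the threshold x.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def equal_density_points(mu1, s1, mu2, s2):
    """Points where the two normal densities are equal (0, 1, or 2 values)."""
    a = 1.0 / (2 * s1**2) - 1.0 / (2 * s2**2)
    b = mu2 / s2**2 - mu1 / s1**2
    c = mu1**2 / (2 * s1**2) - mu2**2 / (2 * s2**2) + math.log(s1 / s2)
    if abs(a) < 1e-12:            # equal variances: the equation is linear
        if abs(b) < 1e-12:        # identical distributions: no crossing
            return []
        return [-c / b]
    root = math.sqrt(max(b * b - 4 * a * c, 0.0))
    return [(-b - root) / (2 * a), (-b + root) / (2 * a)]


def probability_of_misclassification(mu1, s1, mu2, s2):
    """Minimal single-threshold error for a balanced two-normal mixture."""
    if mu1 > mu2:                 # make distribution 1 the lower-mean group
        mu1, s1, mu2, s2 = mu2, s2, mu1, s1
    candidates = equal_density_points(mu1, s1, mu2, s2)
    if not candidates:
        return 0.5                # indistinguishable groups

    def error(threshold):
        miss_low = 1.0 - normal_cdf(threshold, mu1, s1)   # low group called high
        miss_high = normal_cdf(threshold, mu2, s2)        # high group called low
        return 0.5 * (miss_low + miss_high)

    return min(error(t) for t in candidates)


pm_example = probability_of_misclassification(0.0, 1.0, 2.0, 1.0)
print(round(pm_example, 4))
```

With equal variances the threshold falls midway between the means, so two unit-variance groups whose means differ by 2 give Pm equal to the upper-tail probability beyond one standard deviation, about 0.159.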
Targeted LC-MS/MS analyses for proteins of interest were performed on an LTQ-Orbitrap XL mass spectrometer coupled to a Nano-ACQUITY UPLC system. Targeted analysis was used to: verify the initial peptide and protein identifications of putative biomarkers of interest, distinguish between related protein isoforms where needed, and increase the number of identified peptides where needed for subsequent quantitative assay development. Columns, solvents, and gradient used were as described above for LC-MS/MS. A list of m/z values representing the targeted peptides was generated and placed into the parent mass list of the MS method. The mass spectrometer was set to scan m/z from 360 to 2000 at 60,000 resolution in the Orbitrap followed by data-dependent ion trap MS/MS scans of up to the three most abundant ions from the parent mass list that exceeded a minimum threshold of 500. Targeted ions were monitored throughout the entire run with an m/z tolerance of ±10 ppm. Dynamic exclusion was enabled with a repeat count of 2, repeat duration of 10 s, and exclusion duration of 10 s. Monoisotopic precursor selection was not enabled, and charge-state screening was set to reject singly charged ions and ions with unknown charge state.
MRM experiments were performed on a 4000 Q TRAP hybrid triple quadrupole/linear ion trap mass spectrometer (Applied Biosystems, Foster City, CA) interfaced with a NanoACQUITY UPLC system. Chromatography was performed with Solvent A (Milli-Q water with 0.1% formic acid) and Solvent B (acetonitrile with 0.1% formic acid). Typically, 5 μL of an appropriate tryptic digest was injected in duplicate on PicoFrit columns (75-μm i.d., 15-μm tip opening; New Objective, Woburn, MA) packed in house with 25 cm of Magic C18 3-μm reversed-phase resin (Michrom Bioresources, Auburn, CA). Peptides were eluted at 300 nL/min using an acetonitrile gradient consisting of 5–35% B over 15 min, 35–70% B over 5 min, and 70% B for 5 min before returning to 5% B in 0.5 min. To minimize sample carryover, a blank was run between each sample. Data were acquired with a spray voltage of 2,800 V, curtain gas of 20 p.s.i., nebulizer gas of 10 p.s.i., and an interface heater temperature of 150 °C. At least three MRM transitions per peptide and three peptides per protein were monitored and acquired at unit resolution in both Q1 and Q3 quadrupoles to maximize specificity. Scheduled MRM also was used to reduce the number of concurrent transitions and maximize the dwell time for each transition. The MRM detection window was set at 4 min, and target scan time was set at 1 s. The final MRM method included 60 optimized transitions for five target proteins. Data analysis was performed using MultiQuant version 1.1 software (AB/MDS Sciex, Foster City, CA). The most abundant transition for each peptide was used for quantification unless interference from the matrix was observed. In these cases, another transition free of interference was chosen for quantification.
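The rule for choosing the quantification transition can be sketched as below. The text states only that another interference-free transition was chosen when the most abundant one showed interference; picking the next most abundant clean transition is an assumption made for this example, as are the field names and toy values.

```python
def pick_quant_transition(transitions):
    """Return the most abundant transition not flagged for interference.

    transitions: list of dicts with hypothetical keys
        'name' (str), 'area' (float), 'interference' (bool).
    Returns None if every transition is flagged.
    """
    clean = [t for t in transitions if not t["interference"]]
    if not clean:
        return None
    return max(clean, key=lambda t: t["area"])


# Toy peptide with three monitored transitions; the largest is flagged.
peptide_transitions = [
    {"name": "y7", "area": 9.0e5, "interference": True},
    {"name": "y6", "area": 6.5e5, "interference": False},
    {"name": "y5", "area": 2.1e5, "interference": False},
]
chosen = pick_quant_transition(peptide_transitions)
print(chosen["name"])
```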
The flow diagram in Figure 1 summarizes our 3-D method for quantitative comparisons of serum from EP and IUP patients. Major protein depletion followed by GeLC-MS/MS is an efficient approach to identify a wide range of proteins in complex biological fluids such as serum.16, 23, 34 In this study, the SDS gel separation was performed until the tracking dye migrated 2.0 cm. While performing longer gel separations and using a greater number of gel slices would further increase depth of proteome coverage, the major trade-offs are that throughput proportionally decreases and the complexity of the data set can exceed the capacity of existing software to perform quantitative comparisons.
A related strategic decision is whether patient samples should be analyzed individually or in pools. In this study we opted for a modest degree of pooling as a reasonable compromise between the total number of samples that could be analyzed and potential confounding effects of pooling (see below). However, the most critical factor in experimental design for an LC-MS-based comparative study is the method used to quantitatively compare different specimens such as human serum, which cannot be metabolically tagged with heavy isotopes. One advantage of software such as the Rosetta Elucidator system used in this study is that it can combine the intensities of signals for a given peptide across fractions as long as the peptide was annotated in at least one of the samples being compared.
In this study, depleted sera from nine EP and nine IUP patients were quantitatively compared by label-free LC-MS/MS analysis of pooled tryptic digests (Figure 1). Table 1 summarizes the scope of the experiment, which included a total of 252 LC-MS/MS runs for the discovery phase. All runs for a given gel slice were performed in a group starting at the top of gel to minimize variations in HPLC and mass spectrometer performance, although the order of performing analyses was randomized within gel slice groups to minimize the potential for experimental bias. These data produced approximately 1.1 million features, that is, discrete ion signals with unique elution times and m/z values. Retention time alignments and feature extractions across the entire chromatographic window where peptides eluted (20–75 min with a maximum 4 min window of variation) were performed within Elucidator using the Peak Teller algorithm. The software corrected for local retention time shifts across all runs for each fraction and removed noise and background. Supplemental Figures S4–S6 show retention time shifts among the 12 LC-MS/MS runs for three different gel slices run at the beginning (gel slice 1), middle (slice 10), and near the end (slice 20) of the entire experiment. Retention times typically varied by less than 1 min among the 12 runs for each fraction, with the greatest variation occurring early in the gradient where the most hydrophilic peptides eluted.
The Elucidator 2-D visual script shown in Supplemental Figure S1 was used for initial identification of apparently significant differences between EP and IUP specimens as described in Materials and Methods. This analysis resulted in identification of 70 putative candidate biomarkers (Supplemental Table S1) based on at least two identified peptides with p<0.001 (ANOVA) and at least 2.5-fold increases or decreases in the EP group compared to the IUP group. A 2.5-fold cut-off was selected for several reasons. First, label-free quantitation was expected to exhibit increased variation compared to other quantitation methods; hence a more conservative fold change of 2.5 was initially chosen rather than a typical 1.5–2.0 fold cutoff used in many proteomic studies. Second, proteins exhibiting more subtle differences in average values between clinical groups are unlikely to be good biomarkers because most blood biomarkers exhibit relatively wide ranges in concentration, even within clinical groups. However, inspection of resulting peptide intensities across all pools and duplicate LC-MS/MS runs showed that for some proteins, very large increases were observed in a single pool, with varying degrees of overlap between groups for the remaining pools. In addition, most of the putative biomarkers that were observed to be elevated in EP appeared to correlate with an observed higher hemolysis in EP pool 2 compared with all other pools. Consistent with this concern, the largest overall fold increases for EP were multiple database entries for the hemoglobin subunits (Supplemental Table S1). In addition, analysis of peptide trends for some putative biomarker candidates showed wide variations indicative of substantial noise at the peptide level for a number of proteins. 
The most common sources of this noise included: very weak signals, interference from unrelated incompletely resolved ions, minor variations in peptide elution, and/or imperfect alignment of corresponding peaks within a gel slice group.
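The initial selection criteria (at least two peptides with p<0.001 and at least a 2.5-fold increase or decrease) can be expressed as a simple filter. This is a sketch under stated assumptions: group mean intensities are positive, the ANOVA p-values have already been computed elsewhere, and the function and parameter names are hypothetical.

```python
def is_candidate(ep_mean, iup_mean, n_significant_peptides,
                 min_fold=2.5, min_peptides=2):
    """Apply the fold-change and peptide-count criteria to one protein.

    ep_mean, iup_mean: positive mean protein intensities for each group.
    n_significant_peptides: number of peptides with ANOVA p < 0.001.
    """
    # Symmetric fold change, so increases and decreases are treated alike.
    fold = max(ep_mean, iup_mean) / min(ep_mean, iup_mean)
    return n_significant_peptides >= min_peptides and fold >= min_fold


print(is_candidate(250.0, 100.0, 2))   # 2.5-fold higher in EP
print(is_candidate(100.0, 250.0, 3))   # 2.5-fold lower in EP
print(is_candidate(200.0, 100.0, 5))   # only 2-fold
```

As the hemolysis example above shows, a filter on group means alone cannot tell whether a large fold change is driven by a single pool, which is the motivation for the per-peptide statistics that follow.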
To further prioritize candidate biomarkers based on their ability to distinguish between EP and IUP, we considered two additional statistical parameters, sumZscores and Pm for each identified peptide, rather than a strict fold change cutoff to identify candidate biomarkers (see Methods and Supporting Information). Interestingly, although sumZscore and Pm are distinct and independently defined, we observed an encouraging trend governing the lower bound on sumZscore based on both the current data set (Figure 2A) and simulated data. Specifically, as we restricted Pm to lower values, that is, filtering for peptides with good Pm scores, we also guaranteed a good lower bound on sumZscore (Figure 2). Hence, there is negligible benefit to considering both parameters over considering Pm alone. To identify the highest priority candidate biomarkers, we selected those proteins where at least 80% of the identified peptides had Pm<0.3 and detectable intensities for at least eight of the 12 data sets. This analysis identified nine high-priority candidate biomarkers as shown in Table 2. In addition, three proteins from the initial candidate biomarker list (PAPPA, CSH1, and PAEP) that failed the stringent Pm statistical test were added to the high-priority candidate biomarker list due to their previously reported association with EP.8, 13
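The high-priority selection rule can be sketched as follows. The wording of the rule admits more than one reading; this example assumes that a peptide passes when it has Pm<0.3 and a detectable intensity in at least eight of the 12 data sets, and that a protein is retained when at least 80% of its peptides pass. Function names and the toy data are hypothetical.

```python
def peptide_passes(pm, intensities, pm_cutoff=0.3, min_detected=8):
    """True if Pm is below the cutoff and enough data sets have a signal."""
    detected = sum(1 for value in intensities
                   if value is not None and value > 0)
    return pm < pm_cutoff and detected >= min_detected


def is_high_priority(peptides, min_fraction=0.8):
    """peptides: list of (pm, intensities) tuples for one protein."""
    if not peptides:
        return False
    passing = sum(1 for pm, intensities in peptides
                  if peptide_passes(pm, intensities))
    return passing / len(peptides) >= min_fraction


# Toy peptides: 12 data sets each (3 pools x 2 groups x 2 replicates).
good = (0.10, [1.0e6] * 12)
poor = (0.45, [1.0e6] * 12)

print(is_high_priority([good] * 9 + [poor]))      # 9 of 10 peptides pass
print(is_high_priority([good] * 2 + [poor] * 5))  # 2 of 7 peptides pass
```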
Elucidator peptide trend plots were used to evaluate further the correlation of peptide intensities within a protein with EP and IUP and to visualize the effectiveness of our statistical tests. First, known common contaminants such as keratins and trypsin were removed and signals from duplicate analyses were averaged for all 8,438 high-confidence peptides (Peptide Teller probability > 0.8). Then, data were Z-score transformed to emphasize relative intensity changes and adjust for differences in signal intensity of different peptides. Representative peptide trends are shown in Figure 3. As expected, based upon the above analyses, ADAM12 and ISM2 show consistent differences between experimental groups and minimal variation in trends between peptides within these proteins. ISM2 had a single peptide that failed the probability of misclassification test out of 10 unique peptides annotated to this protein. In contrast, the peptide trends for PAEP were highly variable, with only two of seven peptides passing the probability of misclassification test. When peptide trends for putative biomarkers show such wide variability it becomes more difficult to predict whether such candidates will be useful biomarkers, as it is uncertain which subset of peptides most closely reflects the actual protein abundance levels. As noted above, this candidate was retained in our selected biomarker group both because it has been previously associated with EP and to test whether our probability of misclassification test is too restrictive and should be relaxed. Finally, SELENBP1 is an example of a putative biomarker from the Elucidator comparison with an overall significantly higher abundance in EP. Although most peptides annotated to this protein show a consistent trend, the difference between groups is primarily due to a very high value in the single EP sample with the most hemolysis. These data suggest that this protein will be less specific than the high-priority biomarkers discussed above.
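The Z-score transformation used for the trend plots rescales each peptide's intensities to zero mean and unit standard deviation across samples, so that strong and weak peptides can be compared on one axis. The sketch below assumes the sample standard deviation and makes no claim about whether Elucidator applies the transformation to raw or logarithmic intensities; the function name is hypothetical.

```python
import statistics


def zscore_transform(intensities):
    """Rescale one peptide's intensities across samples to Z-scores."""
    mean = statistics.mean(intensities)
    sd = statistics.stdev(intensities)     # sample standard deviation (assumed)
    if sd == 0:
        return [0.0] * len(intensities)    # flat profile: no trend to show
    return [(value - mean) / sd for value in intensities]


# Two toy peptides with the same trend but very different signal strengths.
strong = zscore_transform([1.0e8, 2.0e8, 3.0e8])
weak = zscore_transform([1.0e5, 2.0e5, 3.0e5])
print(strong, weak)
```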
Quantitative changes of all putative candidate biomarkers also were examined by summing peptide intensities for each protein. Comprehensive peptide intensity reports for aligned data, prior to combining replicates, were generated in the Elucidator System and exported to Excel for the 70 putative candidate biomarkers identified in the initial Elucidator analysis. Peptides were sorted based on annotated protein description, peptide intensities for candidate biomarkers were extracted and summed, and fold change values were calculated from combined average intensities for EP or IUP at the individual peptide and protein levels. Technical replicates for the 12 candidate biomarkers listed in Table 2 showed good reproducibility. CVs ranged from 0.25% to 89%, with 72% of samples having CVs less than 25%. The peptide sequences, individual sample intensity data, fold changes, and probability of misclassification (Pm) for these selected biomarkers are shown in Supplemental Table S2. The corresponding data for the other putative biomarkers listed in Supplemental Table S1 are shown in Supplemental Table S3.
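For reference, the coefficient of variation for a pair of technical replicates is the standard deviation divided by the mean, expressed as a percentage. The sketch below assumes the sample standard deviation was used, which the text does not specify; the function name and intensities are illustrative.

```python
import statistics


def percent_cv(replicates):
    """Coefficient of variation (%) for replicate intensities."""
    mean = statistics.mean(replicates)
    return 100.0 * statistics.stdev(replicates) / mean


# Toy duplicate injections of one summed protein intensity.
cv_example = percent_cv([9.0e6, 1.1e7])
print(round(cv_example, 1))
```

With only two replicates the standard deviation reduces to the absolute difference divided by the square root of 2, so a 20% difference between duplicates corresponds to a CV of about 14%.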
To address closely related protein isoforms, the effects of potential incorrect assignment of shared peptides to the wrong isoform were evaluated for the selected candidate biomarkers in Table 2. All protein codes returned from the Rosetta Elucidator annotation were selected and these sequences were aligned to identify common and unique peptides. In addition to the original calculation using all peptides, fold changes were re-calculated considering only significant peptides and only isoform-specific significant peptides. The fold changes were very similar for all three approaches for all the high-priority biomarkers. In addition, all peptides from Supplemental Table S2 were searched with BLAST against the same database to identify additional isoforms supported by identified peptides. Any ambiguous isoform identifications are noted in the footnote to Supplemental Table S2.
Quantitative comparisons of individual technical replicates are shown in Figure 4 for representative candidate biomarkers and several reference non-candidate serum proteins. As expected, the trends at the protein level closely parallel those at the individual peptide level for those proteins where there was minimal noise and variability for the majority of peptides, such as ADAM12 and ISM2. These data further illustrate that overall protein intensities for duplicate runs were highly consistent. Evaluation of representative non-candidate serum proteins with known normal concentrations shows very similar levels across all EP and IUP pools for the three most abundant proteins. Interestingly, there is very good agreement between known serum levels of these proteins and the observed protein intensities from the Elucidator analysis. Specifically, CFX at ~10 μg/ml, CETP at ~2 μg/ml, and TIMP2 at ~0.1 μg/ml35, 36 represent sequential order of magnitude differences in known concentrations, and the observed protein intensities are approximately 10^9, 10^8, and 10^7, respectively. This illustrates excellent agreement between known concentrations and observed relative signals using label-free quantitation. Furthermore, out of a list of approximately 40 proteins with reported concentrations between 10–100 ng/mL,35 we have identified 8 proteins, including FTL (Figure 4), indicating moderate capacity to detect proteins in the ng/ml range using this method.
There are a number of alternative methods of fractionating serum proteins after major protein depletion, including strong cation exchange or off-gel electrophoresis of peptides, or solution IEF of proteins. However, fractionation of intact proteins by 1-D SDS gels preserves information about protein size, thereby providing insights into some forms of protein processing, major post-translational modifications, or alternative splicoforms that are more likely to be missed by alternative fractionation methods.37 An interesting example in the current study is the observed molecular weight and peptide distribution of ADAM12 in serum (Figure 5). It is apparent from the distribution of unique peptides to distinct regions of the gel that ADAM12 is represented in these sera by Pro-domain and EC domain fragments, but not by either full-length protein or the intact extracellular portion of the protein. The observed domain sizes as determined by SDS gel migration are in good agreement with those previously reported for ADAM12 domains on 1-D gels.38 While the two identified fragments show similar relative abundances in the current data set, it remains to be determined whether this trend holds up when larger patient populations are evaluated. Furthermore, knowledge of the precise molecular form(s) of a protein that correlate with a disease or medical condition can be invaluable when setting up validation assays using either MRM or immunoassay-based methods.
It is well known that databases are constantly evolving as interpretations of genome sequences are refined. When the challenges of interpreting protein coding regions and gene variations are considered, it is not surprising that some database changes may not be positive, that is, some database updates in protein sequences may not correctly reflect the most commonly observed sequences in actual biological specimens. One such example is illustrated in Figure 4 where an ADAM12 peptide, SGDLWIPVK, was identified as a fully tryptic peptide since it is preceded by an R. However, in a more recent version of the database, this R has been changed to a G. Hence, this peptide would no longer be identified as a fully tryptic peptide with the most recent database version. The intensities for this peptide (Supplemental Table S2), as well as peptide trend plots (Figure 3), show that yields for this peptide in all patient samples are similar to those of other identified ADAM12 peptides, which suggests that the older version of the sequence more accurately reflects the actual common form of this protein. Further evaluation of these sequence differences in larger patient populations is needed to verify the sequence that occurs most frequently in the human population and whether the alternative sequence is a lower frequency single nucleotide polymorphism.
In an initial proof-of-principle independent test of the quantitative changes observed in the discovery phase, we used MRM analysis to further analyze the five of our 12 selected candidate biomarkers that were observed to be contained within gel slices 12–15. This group included a novel EP candidate biomarker identified in this study (ADAM12), two previously reported EP biomarkers that were ranked as high priority in the current study (CGA and CGB), and two previously reported EP biomarkers identified by the Elucidator workflow but with only a few high probability peptides (CSH1 and PAEP). For each gel slice, tryptic digests from the same nine depleted and fractionated IUP sera used in the discovery phase were pooled and used for targeted LC-MS/MS analysis in the Orbitrap mass spectrometer. A pool of IUP sera was selected because all targeted proteins of interest were observed to be higher in IUP compared with EP. Previously identified, as well as several theoretical tryptic peptides predicted to be suitable for MRM assays (no oxidation sensitive residues, readily cleavable tryptic boundaries, >6 and <25 residues), were analyzed using a parent mass list for the expected precursor ions as described in Methods. Peptides successfully identified in the targeted analysis were used to establish MRM assays. During MRM assay development using the same pooled IUP sample, at least five predicted strong transitions were tested and peptide identities were determined by the observed superposition of multiple transitions for each peptide of interest. Furthermore, the LC chromatographic systems used for the targeted analyses on the Orbitrap and the 4000Q MRM analyses were matched so that retention times were nearly identical on the two systems, thereby providing further confirmation that signals for the intended peptides were being quantitated in the MRM studies.
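The peptide selection criteria listed above (length, oxidation-sensitive residues, cleavable boundaries) lend themselves to a simple in silico filter. The following Python sketch is illustrative only and makes several assumptions: the cleavage rule (after K or R, but not before P) is a common simplification of trypsin specificity, methionine is the only residue treated as oxidation sensitive because the text does not list the residues, and the toy protein sequence is invented apart from the embedded ADAM12 peptide SGDLWIPVK. Sites prone to missed cleavage, such as adjacent K/R residues, are not modeled.

```python
OXIDATION_SENSITIVE = set("M")  # assumption: methionine only

def tryptic_peptides(protein):
    """In silico digest: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R
    return peptides

def suitable_for_mrm(peptide):
    """More than 6 and fewer than 25 residues, no oxidation-sensitive residues."""
    return 6 < len(peptide) < 25 and not (OXIDATION_SENSITIVE & set(peptide))

# Invented toy sequence for illustration
protein = "MKTAYIAKSGDLWIPVKAAMGLSEERPEPTIDESEQK"
digest = tryptic_peptides(protein)
candidates = [p for p in digest if suitable_for_mrm(p)]
```

In this toy digest, two peptides are rejected for length and one for containing methionine, leaving a single candidate. Candidates that pass such a filter would still need experimental confirmation in a targeted analysis, as described in the text.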
A scheduled MRM assay method was developed where at least three transitions per peptide and at least three peptides per protein could be confidently detected and quantified. This assay was then applied to quantitative analysis of the original EP and IUP pools as well as the nine individual EP and nine IUP sets of tryptic digests that were pooled for the original discovery experiments (Figure 6). These MRM assays were conducted using label-free quantitation, that is, integration of signals for transitions without normalizing to an internal standard peptide. While use of stable isotope-coded internal standard peptides is needed for MRM assay consistency over long time periods and for portability of assays between labs, we found that LC and instrument performance could be maintained at a consistent level for short time periods. For example, periodic evaluation of a reference protein digest in between experimental samples can be used to monitor performance and to ensure that CVs of reference peptides remain within acceptable limits. We have observed that CVs less than 25% can typically be maintained for a week or longer when low femtomole amounts of an external standard tryptic digest are injected. The use of label-free MRM analysis compared with stable isotope internal standards enables more rapid, economical initial screening of candidate biomarkers in modest-sized sample sets prior to setting up assays for the most promising candidate biomarkers using internal standards.
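Monitoring the CVs of reference peptides across interleaved injections amounts to a short calculation. The sketch below is a minimal example under stated assumptions: the peptide names and peak areas are hypothetical, the sample standard deviation is used, and the 25% limit is taken from the figure quoted in the text rather than from a defined acceptance protocol.

```python
import statistics

def percent_cv(values):
    """Coefficient of variation in percent: sample SD / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

# Hypothetical integrated peak areas for two reference peptides,
# one value per injection of the external standard digest
reference_runs = {
    "REF_PEPTIDE_1": [1.00e6, 1.10e6, 0.90e6, 1.05e6],
    "REF_PEPTIDE_2": [4.0e5, 6.5e5, 2.5e5, 5.0e5],
}
CV_LIMIT = 25.0  # percent; threshold based on the value quoted in the text

cvs = {name: percent_cv(areas) for name, areas in reference_runs.items()}
# Peptides whose variability suggests LC or instrument drift
flagged = [name for name, cv in cvs.items() if cv >= CV_LIMIT]
```

A flagged reference peptide would indicate that system performance should be checked before further experimental samples are run without internal standards.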
As illustrated in Figure 6, there is consistent agreement of relative intensity trends between the discovery phase label-free protein quantitation using the Elucidator quantitation and the results from MRM analysis of the same pools. Similarly, when individual samples were quantified and the results of the three samples comprising each pool were averaged, these values were highly consistent with those obtained for the pooled sera using both label-free methods. The MRM analyses of the five tested biomarkers on all individual samples are shown in Figure 7. Technical replicates of the individual EP and IUP samples showed good reproducibility. CVs ranged from 0 to 141%; however, only a single peptide from CSH1 in the EP set was highly variable due to low signal intensity. CVs for all other peptides were below 60% and the majority (94%) of peptides monitored had CVs less than 25%. Significance between groups was analyzed by unpaired t-test with Welch's correction (calculated using GraphPad Prism, v 5.03; GraphPad Software, La Jolla, CA). Not surprisingly, data from individual samples show more scatter and more overlap between groups than with pooled samples, with significant differences between groups for ADAM12, PAEP, and CGA (p ≤ 0.05). This partial overlap between groups, as well as substantial heterogeneity within groups, is a common problem encountered for most biomarkers. For example, as indicated above, a single serum value of β-hCG (CGB), the current best diagnostic marker for EP, cannot completely segregate between EP and IUP due to substantial overlap of the ranges for EP and IUP specimens.9 We observed similar results in this small cohort for CGB, as well as CSH1, which are not significantly different between groups. ADAM12, while significant when comparing mean intensities, also shows substantial variability among individual controls (IUP group) with some values overlapping the EP range.
Hence, there will be false positives and negatives when using any of these biomarkers in isolation. The next step is clearly to test the candidate biomarkers on a larger, independent patient cohort, both individually and in multi-biomarker panels. It is anticipated that the most definitive diagnostic test will be a multiple biomarker test, as it is unlikely that any single biomarker will have non-overlapping ranges between EP and IUP.
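For readers unfamiliar with the group comparison used above, the unpaired t-test with Welch's correction can be written out in a few lines. This is a sketch using only the Python standard library with invented intensity values. The original analysis was performed in GraphPad Prism, and the function and variable names here are illustrative. The code computes the test statistic and the Welch-Satterthwaite degrees of freedom. Obtaining a p-value additionally requires the t distribution, for example via `scipy.stats.ttest_ind` with `equal_var=False`.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n)
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Invented peptide intensities (arbitrary units) for two groups
iup = [5.0, 7.0, 9.0, 6.0, 8.0]
ep = [2.0, 3.0, 4.0, 3.0, 3.0]

t_stat, dof = welch_t(iup, ep)
```

Unlike Student's t-test, this form does not assume equal variances in the two groups, which suits data where one group shows much more scatter than the other.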
An essential feature of label-free comparisons is that technical variations in sample processing, HPLC performance, sample injection, and mass spectrometer performance must be minimized over the entire course of the experiment. This study demonstrates the feasibility of maintaining consistent performance over more than 250 LC-MS/MS runs when using a 3-D discovery method for comparing sera from EP and IUP patients. However, analysis of the large volume of resulting data is complex. One critical factor when proteomes are fractionated is that the software utilized must be capable of matching and quantifying corresponding related ion currents across adjacent fractions because slight variations in distribution of proteins or peptides across fractions are inevitable in complex samples. The Rosetta Elucidator software used in this study combines data for a given peptide across fractions provided that at least one MS/MS spectrum in each fraction resulted in the correct peptide identification. Furthermore, protein intensities are based upon the peptide identifications associated with the protein. Hence, although data alignment and quantification are conducted at the MS signal intensity level, correct annotation of peptides and grouping of peptides into consensus proteins are still critically important. Comparisons of alternative peptide score filtering and assignment of peptides to proteins showed that using the Peptide and Protein Tellers with relatively stringent filtering criteria minimized quantitative noise with identification of 70 candidate biomarkers that exhibited at least 2.5-fold differences between the EP and IUP groups. Further statistical analysis at the peptide level subsequently was used to select the most promising 12 candidate biomarkers for future validation efforts in an independent patient cohort, which included known and novel EP biomarkers.
This analysis also identified specific isoforms of some known proteins and specific proteolytically processed forms of ADAM12 that are EP biomarkers. Interestingly, label-free discovery analysis intensities for several known reference serum proteins compared favorably with their reported abundance levels, and relative abundances of candidate biomarkers from the label-free discovery analysis were consistent with label-free pilot MRM validation assay values for both serum pools and individual samples that constituted these pools. These results demonstrate robust, reproducible, in-depth 3-D serum proteome discovery, and subsequent pilot-scale validation studies readily can be achieved using label-free quantitation strategies.
This work was supported by National Institutes of Health grant HD063455 and by an institutional grant to the Wistar Institute (NCI Cancer Core Grant CA10815). We gratefully acknowledge the assistance of The Wistar Institute Proteomics Core and the Bioinformatics Core in this project, as well as the administrative assistance of Mea Fuller.