Spontaneous preterm birth (PTB) before 37 completed weeks of gestation resulting from preterm labor (PTL) is a leading contributor to perinatal morbidity and mortality. Early identification of at-risk women by reliable screening tests could alleviate this health issue; however, conventional methods for screening women at risk for PTB, such as obstetric history and clinical risk factors, uterine activity monitoring, biochemical markers, and cervical sonography, have proven unsuccessful in lowering the rate of PTB. Cervicovaginal fluid (CVF) might prove to be a useful, readily available biological fluid for identifying diagnostic PTB biomarkers. Human columnar epithelial endocervical-1 (End1) and vaginal (Vk2) cell secretomes were employed to generate a stable isotope labeled proteome (SILAP) standard to facilitate characterization and relative quantification of proteins present in CVF. The SILAP standard was prepared using stable isotope labeling by amino acids in cell culture (SILAC) of End1 and Vk2 through seven passages. The labeled secreted proteins from both cell lines were combined and characterized by liquid chromatography-tandem mass spectrometry (LC-MS/MS). A total of 1,211 proteins were identified in the End1-Vk2 SILAP standard, with 236 proteins being consistently identified in each of the replicates analyzed. Individual proteins were found to contain < 0.5 % of the endogenous unlabeled forms. Identified proteins were screened to provide a set of fifteen candidates that have either previously been identified as potential PTB biomarkers or could be linked mechanistically to PTB. Stable isotope dilution LC-multiple reaction monitoring (MRM/MS) assays were then developed for conducting relative quantification of the fifteen candidate biomarkers in human CVF samples from term and PTB cases. 
Three proteins were significantly elevated in PTB cases (desmoplakin isoform 1, stratifin, and thrombospondin 1 precursor), providing a foundation for further validation in larger patient cohorts.
PTL is characterized by the presence of regular uterine contractions causing cervical dilation before term gestation (between 20 and 37 weeks).1 Spontaneous PTB before 37 completed weeks of gestation results from PTL and is a leading contributor to perinatal morbidity and mortality.1,2 According to a report from the Institute of Medicine on PTB, approximately half a million babies are born preterm annually in the US alone.3 Infants born preterm have a significantly higher risk of contracting infections as well as suffering from neuro-developmental disorders, and this serious medical issue carries an economic burden of approximately $26 billion/year in the US.3 Early identification of at-risk women by reliable screening tests could alleviate this health issue; however, conventional methods for screening women at risk for PTB, such as obstetric history and clinical risk factors, uterine activity monitoring, biochemical markers, and cervical sonography, have proven unsuccessful in lowering the rate of PTB.2
Recent advances in the application of various proteomics-based platforms have facilitated the process of discovery of novel protein biomarkers of PTB.4-7 Mass spectrometry-based shotgun proteomics methodology for identifying tryptic peptides has become a standard method for characterizing proteomes of biological fluids as well as tissues.8-10 Further advancements in multidimensional protein separation techniques have made it possible to dig even deeper to identify low abundance proteins and thereby increase the dynamic range of detection in complex biological samples such as plasma and serum.11,12 In addition, the application of SILAC technology allows for the generation of entire labeled proteomes that can be used as internal standards to facilitate accurate quantification of peptides, and proteins thereof, in biological samples.13,14 Finally, LC-MRM/MS methodology provides speed, sensitivity, selectivity and the ability to quantitate multiple peptides simultaneously.15,16 The present study was designed to probe human CVF samples for candidate protein biomarkers by combining the quantification capabilities of SILAC to generate a Stable Isotope Labeled Proteome (SILAP) standard and LC-MRM/MS methodology with enhanced search capabilities of multidimensional LC/MS/MS.
During pregnancy, the fetus is present in the amniotic sac, and is connected to the placenta by the umbilical cord.17 The placenta is an ephemeral organ present only during pregnancy, and is implanted in the wall of the uterus, the major female reproductive organ.17 The uterus is connected on one end to the cervix, which opens into the vagina, and is attached to the fallopian tubes at the other end.17 Although the biology of PTB is not well understood, it is clear that the complex interactions of signaling pathways in these reproductive organs are ultimately responsible for controlling PTL and PTB, thereby making these organs potential sites for biomarker discovery.2,18-21 CVF is a complex mixture of secretions from the vagina, endocervix, endometrial decidua and amniochorion; and therefore serves as an important diagnostic site to monitor maternal and fetal health in pregnant women.22 Collection of CVF is minimally invasive and relatively safe, as compared to collection of amniotic fluid by amniocentesis or tissue excisions from the placenta or uterus, and therefore is expected to be more readily available and potentially useful as a source of biomarkers in pregnant women.22 End1 and Vk2 cells were grown in SILAC conditions to generate a SILAP standard containing the secreted proteins.13,14,23 The rationale behind using End1 and Vk2 cells was that they are “normal” transformed cells of human origin, have a secretory phenotype, and therefore their secreted proteins would model the proteins present in human CVF.23 The labeled End1-Vk2 secretome was characterized by three dimensional (3D)-LC-MS/MS, fifteen potential candidate protein biomarkers of PTB were chosen, and stable isotope dilution LC-MRM/MS assays were designed to accurately conduct relative quantification of these candidate biomarkers in human PTB and term CVF samples using the SILAP internal standard.
Human immortalized End1 and Vk2 cells were obtained from the American Type Culture Collection (ATCC, Manassas, VA). Cells were grown in stable isotope-labeled serum-free DMEM/F12 media containing [13C6 15N2]-lysine and [13C6 15N1]-leucine (Cambridge Isotopes, Cambridge, MA). The SILAC media was supplemented with 50 kU/L penicillin, 50 mg/L streptomycin, and 10 mL/L Stemline® keratinocyte growth supplement (Sigma-Aldrich, St. Louis, MO) containing bovine pituitary extract, epidermal growth factor, insulin, transferrin, hydrocortisone, epinephrine, and calcium chloride. Cells were passaged seven times, after which supernatant was collected every alternate day, filtered through a 0.22 μm filter, concentrated through a 5 kDa molecular weight cut-off spin-filter (Millipore, Billerica, MA), pooled and stored at -80°C until analyzed. The isotopic purity (at least 99.5 %) of the End1 and Vk2 SILAC secretomes was confirmed as described previously.14
SILAC supernatants from the End1 and Vk2 cells were each double depleted of the six high abundance plasma proteins (albumin, transferrin, haptoglobin, antitrypsin, IgG, IgA) using a multiple affinity removal system (MARS Hu6) affinity LC column (4.6 × 100 mm) from Agilent Technologies (Palo Alto, CA). Depletion was performed according to the manufacturer’s instructions. Depleted supernatants were then concentrated as described above, protein concentrations were estimated by Coomassie Protein Assay (Pierce Scientific, Milwaukee, WI), and the samples were stored at -80°C. Equal amounts of protein (100 μg/cell line) from the immuno-depleted End1 and Vk2 SILAC-labeled supernatants were mixed together in a 1:1 ratio, henceforth referred to as the End1-Vk2 SILAC secretome. Proteins were precipitated using a standard methanol/chloroform protocol. For every 100 μL of sample, 400 μL methanol, 100 μL chloroform and 300 μL water were added stepwise, with thorough mixing at each step. Samples were centrifuged for 10 min at 12,000 × g, and the upper phase was removed. The protein precipitate was washed with 300 μL of methanol and then centrifuged for 10 min at 12,000 × g. Methanol was decanted and the protein pellets were allowed to air dry. Proteins were resolubilized in a 6 M urea/2 M thiourea solution, reduced for 30 min with 20 mM dithiothreitol (DTT) at room temperature, and then alkylated for 20 min with 50 mM iodoacetamide (IAA) at room temperature in the dark. The protein mixture was then diluted tenfold with 50 mM ammonium bicarbonate buffer, to reduce the final concentration of urea below 1 M and to adjust the pH to approximately 8.5, and digested with sequencing grade modified trypsin (Promega, Madison, WI) at 37°C for 16 h. Trypsin was added at a 1:20 w/w ratio of enzyme to protein sample being digested. 
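The reagent scaling described above (precipitation volumes per 100 μL of sample, the tenfold dilution of urea, and the 1:20 w/w trypsin ratio) can be sketched as simple arithmetic. The following Python snippet is only an illustration under those stated ratios; the function names are hypothetical and are not part of the original protocol.

```python
def precipitation_volumes_ul(sample_ul):
    """Scale the per-100-uL methanol/chloroform/water recipe to a sample volume."""
    scale = sample_ul / 100.0
    return {
        "methanol": 400.0 * scale,
        "chloroform": 100.0 * scale,
        "water": 300.0 * scale,
    }


def urea_after_dilution_molar(start_molar=6.0, dilution_factor=10.0):
    """Urea concentration after diluting the resolubilization buffer."""
    return start_molar / dilution_factor


def trypsin_ug(protein_ug, protein_per_enzyme=20.0):
    """Mass of trypsin for a 1:20 (w/w) enzyme-to-protein ratio."""
    return protein_ug / protein_per_enzyme


if __name__ == "__main__":
    # 200 ug is the combined End1 + Vk2 protein amount (100 ug per cell line).
    print(precipitation_volumes_ul(100.0))
    print(urea_after_dilution_molar())   # 6 M diluted tenfold, i.e. below 1 M
    print(trypsin_ug(200.0))
```

A tenfold dilution of 6 M urea gives 0.6 M, which is consistent with the stated goal of bringing urea below 1 M before digestion.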
The tryptic digest was lyophilized in a vacuum concentrator (Jouan, Winchester, VA), acidified to pH 3 with 10 % formic acid, and diluted in strong cation exchange (SCX) mobile phase A (25 mM ammonium formate, 25 % acetonitrile, pH 3). SCX chromatography was performed on a PolySulfoethyl A column (100 mm × 2.1 mm, 5 μm 300 Å, PolyLC, The Nest Group, Southborough, MA) attached to an HP 1100 HPLC system (Agilent Technologies, Santa Clara, CA). A linear gradient was performed over 65 min from 100 % mobile phase A to 100 % mobile phase B (800 mM ammonium formate, 25 % acetonitrile, pH 5.8). For each sample, thirty-three 2-min fractions were collected and pooled into 9 fractions as follows: fractions 1 through 6 were discarded; fractions 7 through 12 were used as individual fractions; fractions 13 through 18 were pooled into a single fraction; fractions 19 through 25 were pooled into a single fraction; and fractions 25 through 32 were pooled into a single fraction. These 9 fractions were lyophilized and stored at -80°C until further analysis.
Individual SCX fractions (9 fractions/replicate; three independent replicates) from trypsin-digested End1-Vk2 SILAC secretome were resuspended in 0.1 % formic acid, 5 % acetonitrile, and further separated by online reversed-phase microflow LC and analyzed by tandem mass spectrometry in positive electrospray ionization MS/MS mode. Separation was performed at a flow rate of 16 μL/min, on a monomeric C18 Higgins column (100 mm × 0.5 mm, 5 μm, 200 Å, The Nest Group, Southborough, MA) with an Eksigent LC system equipped with an autosampler (Dublin, CA), on-line to an LTQ mass spectrometer (Thermo Scientific, Waltham, MA). Mobile phases used for separation were 0.1 % formic acid in water (A) and 0.1 % formic acid in acetonitrile (B). A linear gradient (0-50 % mobile phase B) was applied for 80 min, followed by a 10 min wash at 100 % mobile phase B and then equilibration for 10 min with 0 % mobile phase B. The LTQ was operated in full scan (m/z 300-2000) data-dependent mode set to automatically switch between MS and MS/MS acquisition. The five most intense precursor ions were dynamically isolated for fragmentation in the ion trap using collision induced dissociation (CID) with normalized collision energy of 35 %.
The MS/MS spectra were searched against an indexed human RefSeq database (version updated 11/2007, 33,439 entries) with TurboSEQUEST™ (Thermo Scientific, Waltham, MA, version 27.12) and Mascot™ (Matrix Science, Boston, MA, version 2.2.03). Strict trypsin cleavage rules with a maximum of two missed cleavages, mass accuracy of 1 Da for the precursor and fragment ions, and variable modifications of methionine oxidation, carboxyamidomethylation on cysteine, [13C6 15N1]-leucine and [13C6 15N2]-lysine were applied in the search criteria. The SEQUEST™ and Mascot™ output files were integrated into Scaffold version 2.01 (Proteome Software, Portland, OR) for validating MS/MS based peptide and protein identifications. Assignment of peptide sequences was performed using the PeptideProphet™ algorithm.24 PeptideProphet™ accounts for the distribution of scores over an entire dataset to calculate the probability of a correct assignment for every peptide. PeptideProphet™ calculates false-positive error rates at specific probability score cutoff values for each dataset.25 A PeptideProphet™ probability score of ≥ 0.5 was required in order to remove low probability peptides. At this cutoff, the estimated false-positive error rate was 10.8 %. Protein identifications were accepted with a ProteinProphet™ probability score of ≥ 0.8 and at least two identified unique peptides.26 For this dataset, a ProteinProphet™ probability score of ≥ 0.8 corresponded to a false-positive error rate of 4 %. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony. The entire work-up of the End1-Vk2 SILAC secretome is outlined in Figure 1.
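The acceptance criteria above (peptide probability ≥ 0.5, protein probability ≥ 0.8, at least two unique peptides) amount to a two-stage filter. The sketch below illustrates that logic in Python on hypothetical records; it does not reproduce how Scaffold, PeptideProphet™ or ProteinProphet™ compute or store their scores, and the data layout and function name are assumptions made for illustration only.

```python
from collections import defaultdict


def accept_proteins(peptide_hits, protein_probs,
                    min_peptide_prob=0.5, min_protein_prob=0.8,
                    min_unique_peptides=2):
    """Return protein IDs that pass the peptide- and protein-level cutoffs.

    peptide_hits: iterable of (protein_id, peptide_sequence, probability)
    protein_probs: dict mapping protein_id -> protein-level probability
    """
    unique_peptides = defaultdict(set)
    for protein_id, sequence, prob in peptide_hits:
        if prob >= min_peptide_prob:          # drop low-probability peptides
            unique_peptides[protein_id].add(sequence)
    return sorted(
        protein_id
        for protein_id, sequences in unique_peptides.items()
        if len(sequences) >= min_unique_peptides
        and protein_probs.get(protein_id, 0.0) >= min_protein_prob
    )


if __name__ == "__main__":
    # Hypothetical identifiers and sequences, for illustration only.
    hits = [("protA", "AAK", 0.90), ("protA", "LLK", 0.70),
            ("protB", "GGK", 0.95), ("protB", "GGK", 0.90)]
    probs = {"protA": 0.95, "protB": 0.99}
    print(accept_proteins(hits, probs))
```

In this toy input, the second protein is rejected because its two spectra map to the same peptide sequence, so it has only one unique peptide.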
The identified list of proteins that met the filtering criteria were analyzed for biological pathway association using the Kyoto Encyclopedia of Genes and Genomes (KEGG, release 47.1, updated 09/2008) database.27 KEGG is a database of manually curated biological pathways assigning proteins with known biological function to the different pathways, and is updated constantly. Since the raw MS/MS spectra were searched against the RefSeq database, the identified proteins were reported in terms of their gene, with their corresponding Genbank Geninfo Identifier (GI) accession numbers. The corresponding protein IDs were extracted from the GI accession numbers, and converted to the respective KEGG IDs using the Uniprot database.28 The KEGG IDs were then queried against each species-specific (Homo sapiens) biological pathway contained in the KEGG database to establish the biological pathway association of the identified proteins. The GI accession numbers that were not implicated to be associated with any known biological pathway in the KEGG database were searched against the Gene Ontology (GO) classification using the Database for Annotation, Visualization and Integrated Discovery (DAVID) Functional Annotation Tool (http://david.abcc.ncifcrf.gov) and were assigned the most relevant biological process/molecular function.
The End1-Vk2 SILAC secretome was characterized by full scan LC-MS/MS so that it could be used as a SILAP standard to quantitate candidate protein biomarkers in human CVF samples by LC-MRM/MS assay. Since the End1-Vk2 SILAP standard was stably labeled with [13C6 15N2]-lysine and [13C6 15N1]-leucine, it generated proteins and thereby peptides that were greater in mass than their endogenous counterparts due to a mass shift of 7 amu for each leucine and 8 amu for each lysine residue present in them. Therefore, the proteins and hence the peptides of the End1-Vk2 SILAP standard will be referred to as the “heavy” (isotopically labeled) forms as compared to the “light” (unlabeled) forms found naturally in the human CVF samples. Identified heavy proteins that met all the selection criteria described above were screened to create a set of fifteen candidates for designing LC-MRM/MS assays. Each of the proteins has been implicated previously as a potential biomarker of PTB (Table 1). MS/MS spectra of all the unique heavy peptides identified in these fifteen candidates were manually inspected, one best unique heavy peptide per protein was chosen, and the transition from the precursor ion to the most intense fragment ion for each of these fifteen heavy peptides was identified, thus generating a set of fifteen heavy MRM transitions. Peptides were chosen with a probability score of > 0.9, corresponding to a false-positive error rate of 2.1 %. A corresponding set of fifteen light MRM transitions for these selected peptides was generated using ProteinProspector® software (version 4.27, http://prospector.ucsf.edu). The LC-MRM/MS method was designed to analyze 10 transitions (light and heavy sets of 5 peptides) simultaneously in a single LC-MS run under the conditions described above, therefore each sample was injected three times to complete the 15-protein analysis in three sets. The order of sample injection was randomized in each MRM set to eliminate sample bias. 
Blank injections were made in each MRM set to check for any carry-over effects. The LC method used for the MRM assays was the same as that used for the LC-MS/MS analysis of End1-Vk2 SILAC secretome, as described above.
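The relationship between the light and heavy MRM transitions follows from the label mass shifts stated above (7 amu per leucine and 8 amu per lysine), divided by the charge state of the ion. The Python sketch below illustrates that arithmetic using the nominal shifts from the text; exact isotopic mass differences deviate slightly from these integers. The peptide sequence and m/z value are hypothetical, and this is not the ProteinProspector® calculation itself.

```python
# Nominal label mass shifts stated in the text (amu per labeled residue).
LABEL_SHIFT = {"L": 7.0, "K": 8.0}


def heavy_mz_shift(sequence, charge):
    """m/z difference between the heavy and light form of a peptide or fragment."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    total_shift = sum(LABEL_SHIFT.get(residue, 0.0) for residue in sequence)
    return total_shift / charge


def heavy_mz(light_mz, sequence, charge):
    """Heavy m/z computed from the light m/z of the same ion."""
    return light_mz + heavy_mz_shift(sequence, charge)


if __name__ == "__main__":
    # Hypothetical doubly charged precursor "ALEK" and its singly charged
    # y2 fragment "EK"; only residues contained in the ion contribute.
    print(heavy_mz_shift("ALEK", 2))
    print(heavy_mz_shift("EK", 1))
```

Because only labeled residues retained in a fragment contribute to its shift, the precursor and fragment of the same peptide generally shift by different amounts, which is why each transition pair has to be computed per ion.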
CVF samples were selected at random from among 2,100 subjects enrolled in a randomized controlled trial at the Hospital of the University of Pennsylvania. The study was approved by the HUP institutional review board and registered with www.clinicaltrials.gov (NCT00116974). Subjects who presented for prenatal care before 20 weeks’ gestation (confirmed by ultrasound) were enrolled in the study, and CVF samples were self-collected by subjects during routine prenatal visits between 24-28 weeks’ gestation. Samples were collected using Dacron-tipped swabs, placed in 500 μL phosphate-buffered saline, immediately flash frozen in liquid nitrogen and stored at -80°C. A nested case-control study was performed in which CVF samples were selected at random from five subjects who delivered preterm between 28-32 weeks’ gestation following spontaneous PTL and five subjects who delivered after 37 weeks’ gestation (controls). Proteins were extracted from these samples by the methanol/chloroform method as described above in the preparation of the End1-Vk2 SILAP standard, and stored at -80°C. For initial screening of fifteen candidate biomarkers by LC-MRM/MS assay, equal amounts of protein (200 μg/patient) from five control and five PTB subjects were spiked with an equal amount of undigested End1-Vk2 SILAP standard (200 μg/patient), reduced for 30 min with 20 mM DTT at room temperature, then alkylated for 20 min with 50 mM IAA at room temperature in the dark. Samples were then digested with trypsin (1:20 w/w ratio of enzyme to sample protein) at 37°C for 16 hrs, and desalted with Vydac-C18® macrospin cartridges (The Nest Group, Southborough, MA). Differences in the level of these 15 candidate biomarkers between the control and PTB subjects were compared statistically by non-parametric Mann-Whitney rank-sum test, with p-value < 0.05 considered as significant.
The flow chart depicting the generation and characterization of the End1-Vk2 SILAP standard is shown in Figure 1. LC-MS/MS analysis (full scan data dependent acquisition) identified a combined total of 1,211 proteins in the three replicates. A Venn diagram describing the distribution of proteins identified in the individual replicates is shown in Figure 2. Individually, 692 proteins were identified in replicate 1, 420 in replicate 2, and 779 in replicate 3. Of the 1,211 proteins identified, 236 were identified in all three replicates (supplementary data).
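The combined total and the consistently identified subset correspond to the union and the intersection of the per-replicate protein lists. A minimal Python sketch of that bookkeeping is given below; the identifiers are placeholders rather than proteins from this dataset, and the function name is hypothetical.

```python
def replicate_overlap(replicates):
    """Return (combined total, number identified in every replicate)."""
    sets = [set(r) for r in replicates]
    if not sets:
        return 0, 0
    combined = set().union(*sets)          # proteins seen in any replicate
    common = set.intersection(*sets)       # proteins seen in all replicates
    return len(combined), len(common)


if __name__ == "__main__":
    # Placeholder identifiers standing in for three replicate protein lists.
    rep1 = ["A", "B", "C"]
    rep2 = ["B", "C", "D"]
    rep3 = ["C", "E"]
    print(replicate_overlap([rep1, rep2, rep3]))
```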
For the 236 proteins identified in all three replicates, biological association to known functional pathways was established using the KEGG database. Since multiple GI accession numbers can correspond to a single protein entry, and vice versa, the dataset of 236 proteins actually corresponded to a total of 307 GI accession numbers (data not shown). Of the 307 GI accession numbers, 129 were found to be associated with at least one biological pathway, and several with multiple pathways. Table 2 lists the total number of GI accessions that were found associated with at least one biological pathway contained in the KEGG database.
A panel of fifteen candidate protein biomarkers was created for the development of LC-MRM/MS assays from the 236-protein dataset. The 236 identified proteins were searched against previously published studies related to either proteins or genes identified in human CVF, amniotic fluid, placental or uterine tissue samples with regards to their relevance to PTB or other pregnancy-related conditions. These comparisons were utilized to select a panel of fifteen relevant proteins for developing a quantitative LC-MRM/MS assay to measure these proteins in human CVF samples (Table 1).
Table 3 displays the MS/MS fragmentation information of the fifteen proteins (one unique peptide/protein) required for developing the LC-MRM/MS assay. The MS/MS spectra of all unique “heavy” peptides identified for the fifteen-protein panel in the LC-MS/MS experiment on the End1-Vk2 SILAP standard were manually inspected to select the single best unique “heavy” peptide per protein. For each unique “heavy” peptide selected, the charge state and the mass to charge (m/z) values for the precursor ion as well as for the most intense fragment ion generated were obtained from the MS/MS data. These details were utilized to set up the “heavy” transitions for the LC-MRM/MS assays, as shown in Table 3. The corresponding “light” transitions for monitoring the fragmentation of the endogenous counterpart of these fifteen peptides in human CVF samples were computed using the ProteinProspector® software, and are also listed in Table 3.
LC-MRM/MS assays were performed to quantitate the peptide levels (for the fifteen-protein panel) in the CVF samples obtained from five control subjects who delivered at term and five PTB subjects who delivered preterm (cases). Since each sample was spiked with the same quantity of End1-Vk2 SILAP standard, two MRM transitions were monitored for each of the fifteen peptides in the CVF samples, one corresponding to the “light” form (L, endogenous peptide present in the samples), and the other corresponding to the “heavy” form (H, labeled peptide from the spiked End1-Vk2 SILAP standard). The area ratio of the “light” to the “heavy” peptide (L/H) for each of the fifteen proteins was calculated in each sample in the two groups (control and PTB), and the mean L/H ratios for the respective groups were computed accordingly. The spiking of SILAP standard in each sample allowed for normalizing and thereby quantitating the levels of these fifteen proteins in terms of their respective peptides. The relative fold-change in expression levels of the fifteen peptides between the two patient groups was determined using the mean L/H values obtained for the PTB and control groups respectively, and the corresponding p-value for each peptide was calculated by applying the Mann-Whitney rank-sum statistical test. A p-value of < 0.05 was considered statistically significant.
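The quantification steps above (L/H area ratio per sample, fold change from the group means, and a rank-sum comparison) can be sketched in Python as follows. The peak areas are hypothetical, the function names are illustrative, and the p-value is computed as an exact two-sided permutation version of the rank-sum test with midranks for ties; the text does not state whether an exact or approximate form of the Mann-Whitney test was used, so this is an assumption.

```python
from itertools import combinations


def lh_ratios(light_areas, heavy_areas):
    """Light/heavy peak-area ratio per sample (heavy areas assumed non-zero)."""
    return [light / heavy for light, heavy in zip(light_areas, heavy_areas)]


def fold_change(case_ratios, control_ratios):
    """Ratio of the mean L/H in cases to the mean L/H in controls."""
    mean_case = sum(case_ratios) / len(case_ratios)
    mean_control = sum(control_ratios) / len(control_ratios)
    return mean_case / mean_control


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Exact two-sided p-value from the permutation distribution of the rank sum."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    expected = n_a * sum(ranks) / len(ranks)
    observed = abs(sum(ranks[:n_a]) - expected)
    sums = [sum(c) for c in combinations(ranks, n_a)]
    extreme = sum(1 for s in sums if abs(s - expected) >= observed - 1e-9)
    return extreme / len(sums)


if __name__ == "__main__":
    heavy = [1000.0] * 5                      # hypothetical heavy peak areas
    control = lh_ratios([0.0, 0.0, 0.0, 10.0, 20.0], heavy)
    ptb = lh_ratios([500.0, 900.0, 1200.0, 700.0, 2000.0], heavy)
    print(fold_change(ptb, control), exact_rank_sum_p(control, ptb))
```

With five samples per group there are only 252 possible group assignments, so the smallest attainable two-sided p-value is 2/252 (about 0.008), reached when the two groups are completely separated.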
Table 4 summarizes the LC-MRM/MS assay data for the fifteen proteins quantified in terms of their fifteen peptides (one unique peptide/protein). The fold-change data in Table 4 show that the levels of desmoplakin isoform 1, stratifin, thrombospondin 1 precursor, fatty acid binding protein 5 (psoriasis associated), extracellular matrix protein 1 isoform 1 precursor, secretory leukocyte peptidase inhibitor precursor, and plasminogen activator inhibitor-1 peptides were higher in the PTB subjects than in the control subjects. In contrast, fibronectin 1 isoform 3 preprotein, urokinase plasminogen activator preprotein, cathepsin L2 preprotein, tissue inhibitor of metalloproteinase 1 precursor, and calsyntenin 1 isoform 2 peptide levels were lower in the PTB group than in the control group. Cystatin C precursor peptide levels were the same in the two groups. Laminin alpha 3 subunit isoform 1 and lamin A/C isoform 2 peptide levels were below the limits of detection in both patient groups, and therefore their fold change could not be computed.
The p-value data from Table 4 showed that the calculated levels of three out of the fifteen proteins (desmoplakin isoform 1, stratifin, and thrombospondin 1 precursor) in the panel were found to be significantly different (p-value < 0.05) between the control and PTB groups. Desmoplakin isoform 1 peptide was quantified to be 70.7 fold higher in the PTB samples as compared to the control samples, and similarly, stratifin peptide (42.4 fold) and thrombospondin 1 precursor peptide (5.1 fold) levels were significantly greater in the PTB patient samples. Figure 3 displays the scatter-plot graph of the individual L/H area ratios calculated for desmoplakin isoform 1, stratifin, and thrombospondin 1 precursor peptides in the CVF samples obtained from the ten subjects (five controls and five PTB). Collectively, the data suggest that these three proteins could serve as potential markers for PTB, and should be investigated further in a larger cohort of patient samples.
LC-MRM/MS chromatograms for desmoplakin isoform 1 and cystatin C precursor peptides from ten CVF samples are shown in Figures 4 and 5. Chromatograms of the CVF samples obtained from the five control subjects are shown in panel A and those from the five PTB subjects are shown in panel B. Two MRM transitions are displayed in each patient sample chromatogram, obtained by monitoring the “light” (upper channel) and the “heavy” (lower channel) transitions resulting from the fragmentation of the “light” and “heavy” peptides respectively. The peak area (AA) computed for the “light” transition represents the quantity of the endogenous peptide, and thereby the corresponding protein, present in that CVF sample. This value is normalized using the peak area calculated for the corresponding “heavy” transition, which arises from the labeled peptide present in the spiked End1-Vk2 SILAP standard. This normalization corrects for sample losses that may have occurred during sample processing, since each patient sample was spiked with an equal quantity of the End1-Vk2 SILAP standard at the very beginning of the assay procedure. Figure 4A shows that the desmoplakin isoform 1 peptide was below detection limits (regarded as a zero value) in three out of five control patient samples. The remaining two control patient samples contained very low levels of this peptide. In contrast, the desmoplakin isoform 1 peptide levels were significantly elevated in all five of the PTB patient samples (Figure 4B). This differential expression suggests that desmoplakin isoform 1 protein might become elevated before PTL and PTB (Table 4). Levels of cystatin C precursor peptide did not differ when comparing control (Figure 5A, Table 4) to PTB (Figure 5B, Table 4) subjects.
A multi-dimensional LC-MS/MS platform for protein discovery along with in vitro SILAC models for peptide quantification have been employed to identify and quantify novel protein biomarkers of PTB in human biological samples (CVF). Several biological fluids, including CVF, plasma and amniotic fluid, and tissues including placenta, myometrium, cervix and uterus have all been investigated in the quest for PTB biomarkers.4-7,10,21,22,29,30 These previous studies have applied proteomics technologies to identify numerous peptides and their corresponding proteins in these biological samples by employing data dependent acquisition of the MS/MS spectra with the hope of discovering novel biomarkers of PTB.4-6,22,29,30 Operating the mass spectrometer in data dependent acquisition mode in tandem with separation of peptides by LC allows for estimating the total number of times a peptide is identified by MS/MS spectra and thereby calculating its spectral count.33 The spectral count value can provide an estimate of the relative abundance of the protein in the different samples, and this has been the basis for biomarker discovery in most of the previous efforts.33 In other words, these previous efforts have relied on qualitative analyses to establish differential expression patterns of proteins between the term and PTB samples for biomarker discovery. Not surprisingly, although these prior studies have identified and tested several potential biomarkers (such as fetal fibronectin, corticotrophin-releasing hormone, interleukin-6 (IL-6), C-reactive protein, CD163, thrombin-antithrombin complex, IL-8, SERPINH1, tumor necrosis factor-alpha, monocytic chemotactic protein-1, matrix metalloproteinase-8, ferritin, placental alkaline phosphatases, and relaxin), there are currently no reliable biomarkers for PTB in clinical use, with the exception of fetal fibronectin.2,34-50
In order to make biomarker discovery a more successful and efficient endeavor, it is imperative to study readily available biological samples in individuals before the onset of clinical symptoms, and to be able to quantitate the protein concentrations in the biological samples in question. A unique feature of the present study is that self-collected CVF samples were used from asymptomatic women at 24-28 weeks’ gestation, five of whom ultimately delivered preterm (before 32 weeks) and five of whom delivered without complications at term (after 37 weeks). In general, quantitative protein expression profiles in biological samples are based upon generating peptide standards to be spiked in the biological samples to accurately calculate each corresponding protein concentration in the samples.15,16 However, biological samples are composed of numerous proteins. For example, comprehensive proteomic analyses of human CVF identified from 150 to 685 unique proteins in three separate studies.5,22,29 These findings not only highlight the complex composition of biological samples, but also demonstrate the high variability in protein identification amongst different studies. Therefore, generating peptide standards for quantitating such a large number of proteins in complex biological samples is not only an ambitious but also an economically challenging task.
The current study has employed SILAC methodology to establish a cell model that can be employed to generate a relevant SILAP standard for precise and accurate relative quantification of proteins in complex biological samples. The biological sample investigated in this study was human CVF collected from pregnant women who later delivered at term or preterm. Since there are already several studies that have reported the protein composition of human CVF, rather than focusing efforts on protein identification in the CVF samples, the current study was designed to devise stable isotope LC-MRM/MS based methodology to quantitate multiple proteins simultaneously in the CVF samples.5,7,22,29 Two cell lines grown in SILAC conditions were used to generate a stable labeled secretome. This secretome was characterized by multi-dimensional liquid chromatography tandem mass spectrometry and employed as a SILAP standard. End1 and Vk2 cells are transformed “normal” human cell lines of endocervical and vaginal origin respectively with a secretory phenotype, and therefore were chosen as an in vitro model of proteins present in human CVF.23 The SILAP standard proved to be a rich source of potential biomarkers, with 1211 total proteins identified among all replicates analyzed. 236 of these proteins were consistently identified in all replicates (Figure 2 and Supplementary data).
In order to quantify these 236 proteins by LC-MRM/MS, it would be necessary to monitor at least 236 pairs of transitions (one pair of transition/peptide/protein; each pair for the light and heavy form of the peptide). This presents a significant challenge not only from an instrumentation perspective, but also from a data handling and data processing perspective. During the LC-MRM/MS method development on the LTQ, it was observed that ten MRM transitions could be monitored successfully in a single segment in a run without any significant loss of signal or sensitivity (data not shown). Therefore, the LC-MRM/MS assay was designed to monitor five pairs of transitions (corresponding to the light and heavy form of the peptide) per single run, and thus it would require at least 48 runs to complete the 236-protein quantification. It was also observed that, with the current methodology, at least 100 μg of protein starting material was required per run to allow for quantification. Therefore, 48 runs would require almost 5 mg of protein per human CVF sample, an almost impossible task considering the low protein yield in CVF samples. Moreover, these details are for performing only one replicate analysis. Therefore, it seemed prudent to test our methodology on a small subset of proteins and to establish proof of principle. Since this was a pilot study, it was decided to monitor one transition per peptide, and to quantify one peptide per protein, for a set of fifteen proteins to test our methodology. No doubt, further validation studies would be required to confirm this methodology, by monitoring multiple transitions/peptide and tracking multiple peptides/protein. However, this raised another issue of which fifteen proteins to select from the 236-protein dataset. For this purpose, the 236-protein dataset was subjected to KEGG pathway database analysis to determine their biological relevance. 
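The run and sample-amount estimates in this paragraph follow directly from the figures given (five transition pairs per run and at least 100 μg of protein per run). A short Python sketch of that arithmetic is shown below; the function names are hypothetical and the defaults simply restate the values in the text.

```python
import math


def runs_needed(n_proteins, pairs_per_run=5):
    """LC-MRM/MS runs required at one light/heavy transition pair per protein."""
    return math.ceil(n_proteins / pairs_per_run)


def protein_required_mg(n_runs, ug_per_run=100.0):
    """Total protein needed for one replicate analysis, in mg."""
    return n_runs * ug_per_run / 1000.0


if __name__ == "__main__":
    for n_proteins in (236, 15):
        runs = runs_needed(n_proteins)
        print(n_proteins, runs, protein_required_mg(runs))
```

For the full 236-protein dataset this gives 48 runs and 4.8 mg of protein per sample, whereas the fifteen-protein panel needs three runs, which matches the three MRM sets described in the methods.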
Table 2 summarizes the data obtained from the KEGG analysis, and shows that almost 40 % of the proteins in this dataset were associated with at least one known biological pathway. This suggests that the 236-protein dataset of the End1-Vk2 SILAP standard is a biologically relevant model. Furthermore, these proteins were compared against the published literature on female reproductive physiology with regard to PTB and other pregnancy-related conditions, in order to narrow the dataset down to the fourteen proteins considered most relevant to our study (Table 1). In addition, calsyntenin 1 was monitored (Table 1) because it is an abundant protein in both cell types that is involved in cell-cell communication (Table 2). LC-MRM/MS assays (Table 4) were designed for quantifying this initial set of fifteen proteins by adding the End1-Vk2 SILAP standard to the human CVF samples (Table 3). Seven of the fifteen proteins were elevated in PTB subjects and five were lower in PTB samples (Table 4). Two of the proteins were below the limits of detection, and were therefore considered absent in both groups, and one protein, cystatin C precursor, was unaltered between the two groups. These results highlight the utility of the current methodology for quantitatively screening multiple proteins simultaneously in complex biological samples. However, it is critical to point out the preliminary nature of this study: its primary purpose was to test our methodology of utilizing a SILAP standard as a tool for identifying and quantifying potential biomarkers simultaneously by LC-MRM/MS assays in complex biological samples such as CVF. Future studies will be designed to validate the results obtained in this pilot study. Validation will be performed by monitoring multiple transitions per peptide, tracking multiple peptides per protein, and evaluating a larger cohort of patients.
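The principle of relative quantification against a spiked SILAP standard can be sketched as follows. This is an illustrative Python example, not the data processing used in the study: the peak areas are made-up numbers, the function names are hypothetical, and the use of a ratio of group means as the fold change is an assumption.

```python
from statistics import mean


def light_to_heavy_ratio(light_area: float, heavy_area: float) -> float:
    """Ratio of the endogenous (light) peak area to the SILAP (heavy) peak area."""
    if heavy_area <= 0:
        raise ValueError("heavy standard signal must be positive")
    return light_area / heavy_area


def fold_change(ptb_ratios, term_ratios) -> float:
    """Mean light/heavy ratio in PTB samples relative to term samples."""
    return mean(ptb_ratios) / mean(term_ratios)


# Hypothetical (light, heavy) peak areas for one peptide; not study data
term_areas = [(200.0, 1000.0), (300.0, 1000.0)]
ptb_areas = [(1000.0, 1000.0), (1500.0, 1000.0)]

term_ratios = [light_to_heavy_ratio(l, h) for l, h in term_areas]
ptb_ratios = [light_to_heavy_ratio(l, h) for l, h in ptb_areas]

fc = fold_change(ptb_ratios, term_ratios)
print(f"PTB/term fold change: {fc:.2f}")
```

Because the same amount of heavy standard is added to every sample, dividing by the heavy signal corrects for run-to-run variation in sample handling and instrument response before the groups are compared.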
Because linear ion trap instruments are limited in the number of reactions that can be monitored simultaneously, the current methodology will be transferred to and validated on a triple quadrupole-based MS, which will allow a significantly higher number of reactions to be monitored simultaneously. Absolute quantification will then be performed for biomarker proteins that are validated by this approach. This future study will involve synthesizing multiple tryptic peptides for each protein, generating standard curves for their concentrations, spiking them into human CVF samples as internal standards, and thereby calculating the absolute levels of these peptides in CVF samples by LC-MRM/MS assays.
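The standard-curve step described above can be illustrated with a simple linear calibration. The sketch below is in Python and rests on stated assumptions: the concentrations and peak-area ratios are invented for illustration, the response is assumed to be linear, and the helper names are hypothetical rather than part of any planned workflow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_ratio(ratio, slope, intercept):
    """Invert the calibration line to estimate a concentration."""
    return (ratio - intercept) / slope


# Hypothetical calibration points: known peptide concentration (arbitrary
# units) against measured analyte/internal-standard peak-area ratio
cal_conc = [1.0, 2.0, 5.0, 10.0, 20.0]
cal_ratio = [0.1, 0.2, 0.5, 1.0, 2.0]

slope, intercept = fit_line(cal_conc, cal_ratio)

# Peak-area ratio measured in a hypothetical CVF sample
sample_ratio = 0.75
estimate = concentration_from_ratio(sample_ratio, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, estimate={estimate:.2f}")
```

In practice a weighted fit and a check that the sample ratio lies within the calibrated range would usually be added; they are omitted here to keep the sketch short.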
Three proteins (desmoplakin isoform 1, stratifin, thrombospondin 1 precursor) were significantly elevated (p < 0.05) in the PTB subjects as compared to the controls (Figure 3). Although desmoplakin isoform 1 has not yet been associated with any of the known biological pathways in the KEGG database (Table 1), it is well established that desmoplakin-1 is an obligate component of functional desmosomes.51 Desmosomes are intercellular junctions that tightly link adjacent cells, and are involved in regulating cell-cell communication, cell motility and cell adhesion.51 Desmoplakin-1 has previously been identified in human CVF and has also been found to be localized in the human fetal and amniotic membranes.52 Although it is unclear how desmoplakin-1 may be involved in PTB, one study reported, on the basis of spectral counts, that desmoplakin was significantly down-regulated in PTL.5 In contrast, we found that desmoplakin isoform 1 was significantly elevated, by almost 70-fold, in PTB subjects. Stratifin, also referred to as the 14-3-3σ protein, was found to be associated with cell growth and death pathways in the KEGG database (Table 1). Members of the 14-3-3 protein family, including stratifin, have been shown to be negative regulators of cell proliferation, and some of their effects arise from their ability to inhibit protein kinase C and to bind various members of the growth factor signaling pathways and cell cycle regulators.53 Although a mechanistic link between stratifin and PTB has not been established, its levels have been documented to be elevated in PTL.5 The KEGG analysis showed that thrombospondin-1 precursor is associated with biological pathways in the cell communication category as well as the signal transduction and signaling molecules and interaction categories (Table 1). 
Thrombospondin-1 is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions and can bind several extracellular proteins.54 Notably, thrombospondin-1 can bind fibronectin, one of the fifteen proteins in our study, which was observed to be lower in the PTB subjects than in the controls (Table 4). This observation raises the possibility that the two proteins interact; whether such an interaction plays any role in PTB remains to be investigated. Furthermore, in previous studies of myometrial tissue in PTL, in humans as well as in mouse models, elevated levels of thrombospondin-1 were associated with labor as well as with PTL.55,56
In summary, we have shown that a SILAP standard generated from End1 and Vk2 human cell lines can be employed to quantify multiple proteins simultaneously by LC-MRM/MS in human CVF. In theory, the SILAP standard can be used to quantify 236 proteins in human CVF samples. The pilot analysis of a set of fifteen proteins in a small set of term and PTB subjects revealed that three proteins (desmoplakin isoform 1, stratifin, thrombospondin 1 precursor) were significantly elevated in subjects who later experienced PTB (Table 4). Taken together with the published literature, these LC-MRM/MS data suggest that these three proteins have potential as biomarkers of PTB. In contrast, cystatin C precursor levels did not differ between the groups and therefore do not appear to indicate an increased risk of PTB (Table 4); this protein can thus serve as a negative control as well as a quality control biomarker to show that CVF samples were collected and stored correctly. Further studies are underway to validate the utility of desmoplakin isoform 1, stratifin, and thrombospondin 1 precursor as PTB markers in a larger cohort of subjects.
Supported by the Pennsylvania Department of Health SAP# 4100020720 and by NIH grants U01HD050088, P30ES013508, and UL1RR024134. We acknowledge the receipt of an Institute for Translational Medicine and Therapeutics Research Fellowship (K.H.Y.).