It is expected that clinically-obtainable fluids that are proximal to organs contain a repertoire of secreted proteins and shed cells reflective of the physiological state of that tissue, and thus represent potential sources for biomarker discovery, investigation of tissue-specific biology, and assay development. The prostate gland secretes many proteins into a prostatic fluid that combines with seminal vesicle fluids to promote sperm activation and function. Proximal fluids of the prostate that can be collected clinically are seminal plasma and expressed prostatic secretion (EPS) fluids. In the current study, MudPIT-based proteomics was applied to EPS obtained from nine men with prostate cancer and resulted in the confident identification of 916 unique proteins. Systematic bioinformatics analyses using publicly-available microarray data of 21 human tissues (Human Gene Atlas), the Human Protein Atlas database and other published proteomics data of shed/secreted proteins were performed to systematically analyze this comprehensive proteome. Therefore, we believe this data will be a valuable resource for the research community to study prostate biology and potentially assist in the identification of novel prostate cancer biomarkers. To further streamline this process, the entire data set was deposited to the Tranche repository for use by other researchers.
Proximal fluids are found adjacent to a given tissue or organ and contain a repertoire of secreted proteins and shed cells reflective of the physiological state of that tissue 1. Hence, proximal fluids are rapidly emerging as discovery sources of proteins, metabolites, and genetic biomarkers for cancers. Seminal plasma and expressed prostatic secretion (EPS) are prostate-proximal fluids that can be collected in the clinic. Seminal plasma consists of fluids originating not only from the prostate, but also from several other male accessory glands such as the seminal vesicles, epididymis, and Cowper’s gland. In aging men with prostatic disease, obtaining seminal fluids for diagnostic assays and screening purposes is highly dependent on patient age and disease status, and thus not an attractive assay medium. In contrast, EPS can be readily obtained in the clinic by forced (or expressed) ejection of prostatic fluids into the urethra following prostate massage. For evaluation of some types of prostatitis, a vigorous massage is done to force fluid through the urethra to collect a few drops from the tip of the penis for subsequent use in microorganism cultures. If a more gentle massage is performed in the context of a digital rectal exam (DRE) of the prostate, the EPS is then immediately collected in voided urine post-exam (termed post-DRE urine, or post-DRE EPS urine). Currently, a commercial genetic assay for prostate cancer detection, based on the presence of a non-coding RNA (PCA3), uses post-DRE EPS urines as a source of prostate-derived genetic material 2–4.
In the context of prostate cancer, undiluted EPS can be collected while the patient is under anesthesia, just prior to prostatectomy. In comparison to the clinical DRE massages, a more vigorous digital rectal massage can be done to force prostatic fluids into the urethra with subsequent collection from the penis. For prostate cancer biomarker discovery purposes, direct EPS has several ideal clinical features beyond its nature as a proximal fluid. Since EPS are collected just prior to prostatectomy, clinical information that led to the decision to perform a prostatectomy is readily available 1. Post-prostatectomy pathology staging and grading can also be obtained for comparison with the pre-surgical assessments. When linked with clinical outcome data, these direct EPS fluids thus have the potential to be used for discovery of both diagnostic and prognostic biomarkers for management of prostate cancer. A large number of pre-prostatectomy EPS fluids have been collected by our translational research group 1, which we term ‘direct EPS’ to distinguish them from other EPS designations.
In this study, we applied a MudPIT-based proteomics approach to obtain a detailed inventory of proteins present in pre-prostatectomy direct EPS samples. By using a high mass accuracy Orbitrap MS and rigorous identification criteria, we provide a high quality accounting of prostatic secretions to a degree not previously reported. To further increase the utility of the current proteomic resource, we have systematically integrated our data with a multitude of publicly available databases (i.e. human tissue mRNA microarray, the Human Protein Atlas, alternative proteomics data). Additionally, the entire proteomics data was deposited to the Tranche server (www.proteomecommons.com) for others to utilize. We hypothesize that this comprehensive proteomic analysis of direct EPS will provide an important resource to study prostate biology and/or facilitate the discovery of novel prostate cancer biomarkers.
Ultrapure-grade urea, ammonium acetate, and calcium chloride were from BioShop Canada Inc. (Burlington, ON, Canada). Ultrapure-grade iodoacetamide, DTT and formic acid were obtained from Sigma. HPLC-grade solvents (methanol, acetonitrile and water) were obtained from Fisher Scientific. Mass spectrometry-grade trypsin gold was from Promega (Madison, WI, USA). Solid-phase extraction C18 MacroSpin Columns were from The Nest Group, Inc. (Southboro, MA, USA).
Direct EPS samples were collected from patients following informed consent and use of Institutional Review Board approved protocols at Urology of Virginia, Sentara Medical Group and Eastern Virginia Medical School between 2007 and 2009 (Table 1). Direct EPS fluids were obtained by vigorous prostate massage from subjects under anesthesia just prior to prostatectomy as a result of a prostate cancer diagnosis. Each sample (0.2–1 ml) was collected from the tip of the penis in a 10 ml tube, diluted with saline to 5 ml, and stored on ice until transport to the biorepository for processing (within 1 hour of collection). For processing, any particulate material was removed by low-speed centrifugation. Aliquots (1 ml) of the supernatant were stored at −80°C. Tumor grades, staging, prostate volumes and tumor volumes 5 were determined using standard pathological procedures. Patient information was recorded, including demographics, medical history, pathology results, and risk factors, and stored in a Caisis database system. No personal information or identifiers beyond diagnosis and lab results were available to the laboratory investigators.
Direct EPS fluid corresponding to 100 µg of total protein (as measured by Bradford assay) was concentrated to ~50 µl using an Amicon Ultra spin filter (3 kDa cut-off; Millipore) and then digested as follows. The concentrated EPS was diluted with an equal volume of 8 M urea, 4 mM DTT, 50 mM Tris-HCl, pH 8.5 and incubated at 37°C for 1 hour, followed by carbamidomethylation with 10 mM iodoacetamide for 1 hour at 37°C in the dark. Samples were diluted with 50 mM ammonium bicarbonate, pH 8.5 to ~1.5 M urea. Calcium chloride was added to a final concentration of 1 mM and the protein mixture was digested with a 1:30 molar ratio of recombinant, proteomics-grade trypsin at 37°C overnight. The resulting peptide mixtures were solid phase-extracted with C18 MacroSpin Columns from The Nest Group, Inc. (Southboro, MA, USA). Samples were concentrated by speed-vac and stored at −80°C until used for MudPIT analysis.
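The urea dilution in this protocol follows from C1·V1 = C2·V2. The sketch below works through the arithmetic under stated assumptions: the concentrate is taken as exactly 50 µl (the text gives ~50 µl), volumes are treated as additive, and the small volumes contributed by iodoacetamide and calcium chloride are ignored. The function and variable names are illustrative only.

```python
def diluent_volume_ul(volume_ul, conc_molar, target_molar):
    """Volume of urea-free diluent (µl) that lowers conc_molar to target_molar.

    Uses C1 * V1 = C2 * V2 and assumes the volumes are additive.
    """
    if not 0 < target_molar <= conc_molar:
        raise ValueError("target must be positive and not exceed the starting concentration")
    final_volume_ul = volume_ul * conc_molar / target_molar
    return final_volume_ul - volume_ul


# Assumption: exactly 50 µl of concentrate (the protocol states ~50 µl).
concentrate_ul = 50.0
denaturing_buffer_ul = concentrate_ul  # "an equal volume" of the 8 M urea buffer
mixed_volume_ul = concentrate_ul + denaturing_buffer_ul
urea_after_mixing_molar = 8.0 * denaturing_buffer_ul / mixed_volume_ul

# Ammonium bicarbonate needed to reach ~1.5 M urea before adding trypsin.
bicarbonate_ul = diluent_volume_ul(mixed_volume_ul, urea_after_mixing_molar, 1.5)

print(f"urea after mixing: {urea_after_mixing_molar:.1f} M")
print(f"ammonium bicarbonate to add: {bicarbonate_ul:.0f} µl")
```

Under these assumptions the sample is at 4 M urea during reduction and alkylation, and roughly 167 µl of ammonium bicarbonate brings it to 1.5 M for digestion.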
A fully automated 9-cycle two-dimensional chromatography sequence was set up as previously described 6. Peptides were loaded on a 7 cm pre-column (150 µm i.d.) containing a Kasil frit packed with 3.5 cm of 5 µ Magic C18 100 Å reversed phase material (Michrom Bioresources) followed by 3.5 cm of Luna® 5 µ SCX 100 Å strong cation-exchange resin (Phenomenex, Torrance, CA). Samples were automatically loaded from a 96-well microplate autosampler using the EASY-nLC system (Proxeon Biosystems, Odense, Denmark) at 3 µl/minute. The pre-column was connected to an 8 cm fused silica analytical column (75 µm i.d.; 5 µ Magic C18 100 Å reversed phase material (Michrom Bioresources)) via a microsplitter tee (Proxeon Biosystems) to which a distal 2.3 kV spray voltage was applied. The analytical column was pulled to a fine electrospray emitter using a laser puller. For peptide separation on the analytical column, a water/acetonitrile gradient was applied at an effective flow rate of 400 nl/minute, controlled by the EASY-nLC system. Ammonium acetate salt bumps (8 µl) were applied at the same concentrations as in the previously described sequence 6, using the 96-well microplate autosampler at a flow rate of 3 µl/minute in a vented-column set-up.
All samples were analyzed on an LTQ-Orbitrap XL. The instrument method consisted of one MS full scan (400–1800 m/z) in the Orbitrap mass analyzer, with an AGC target of 500,000, a maximum ion injection time of 500 ms, 1 µscan, a resolution of 60,000 and the preview scan option enabled. Five data-dependent MS/MS scans were performed in the linear ion trap using the five most intense ions at 35% normalized collision energy. The MS and MS/MS scans were obtained in parallel. AGC targets were 10,000 with a maximum ion injection time of 100 ms. A minimum ion intensity of 1,000 was required to trigger an MS/MS spectrum. Dynamic exclusion was applied using a maximum exclusion list of 500 with one repeat count, a repeat duration of 30 seconds and an exclusion duration of 45 seconds.
Raw data were converted to mzXML using ReAdW and searched by X!Tandem against a locally installed version of the UniProt complete human proteome (www.uniprot.org) protein sequence database. The search was performed with a fragment ion mass tolerance of 0.4 Da and a parent ion mass tolerance of ±10 ppm. Complete tryptic digest was assumed. Carbamidomethylation of cysteine was specified as a fixed modification, and oxidation of methionine as a variable modification. A target/decoy search was performed to experimentally estimate the false positive rate, and only proteins identified with two unique high quality peptide identifications were considered, as previously reported 7–9 (1 decoy protein identified; FDR <0.5%). An in-house protein grouping algorithm was applied to satisfy the principles of parsimony 8.
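The two-peptide rule and the target/decoy estimate can be illustrated with a minimal sketch. This is not the in-house pipeline used in the study: the input layout, the function names, and the simple estimator (accepted decoy proteins divided by accepted target proteins) are assumptions made for illustration, and "unique" is taken here to mean distinct peptide sequences per protein.

```python
from collections import defaultdict


def filter_proteins(peptide_hits, min_unique_peptides=2):
    """Split proteins passing the peptide-count rule into targets and decoys.

    peptide_hits: iterable of (protein_id, peptide_sequence, is_decoy) tuples
    (a hypothetical layout, not the X!Tandem output format).
    """
    peptides_by_protein = defaultdict(set)
    is_decoy_protein = {}
    for protein_id, peptide, is_decoy in peptide_hits:
        peptides_by_protein[protein_id].add(peptide)  # repeated sequences count once
        is_decoy_protein[protein_id] = is_decoy
    accepted = {
        protein
        for protein, peptides in peptides_by_protein.items()
        if len(peptides) >= min_unique_peptides
    }
    targets = {protein for protein in accepted if not is_decoy_protein[protein]}
    decoys = accepted - targets
    return targets, decoys


def protein_fdr(n_targets, n_decoys):
    """Simple estimate: accepted decoy proteins / accepted target proteins."""
    return n_decoys / n_targets if n_targets else 0.0


# Toy input: P2 is dropped because both of its hits are the same sequence.
hits = [
    ("P1", "AAK", False), ("P1", "CCK", False),
    ("P2", "DDK", False), ("P2", "DDK", False),
    ("REV_P3", "EEK", True), ("REV_P3", "FFK", True),
]
targets, decoys = filter_proteins(hits)
print(sorted(targets), sorted(decoys))

# Counts reported in the text: 916 proteins and 1 decoy protein.
print(f"estimated protein FDR: {protein_fdr(916, 1):.2%}")
```

With the counts reported here (916 target proteins, 1 decoy), this estimator gives a value well below the stated 0.5% bound.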
Annotation mappings (i.e. GO, predicted transmembrane domains, signal peptides etc.) were performed using the ProteinCenter bioinformatics software (Proxeon Biosystems, Odense, Denmark). Comparison of EPS-detected proteins against other proteomic datasets (prostatic secretions 10, seminal plasma 11, urine 12, conditioned media 13) was accomplished with ProteinCenter. Proteins were sequence aligned (i.e. BLASTed) against each other and only proteins with at least 95% sequence identity were considered to match (i.e. protein clusters). Comparison to the Human Gene Atlas (mRNA microarray) 14 and the Human Protein Atlas (www.proteinatlas.org) 15 was accomplished via gene mapping using an in-house relational database. For the comparison to data in the Human Protein Atlas, only proteins with available IHC staining in normal or prostate cancer tissues were included. The staining criteria we used for our antibody selection were directly adapted from the Human Protein Atlas website and are based on rigorous multi-step antibody quality control criteria established by this consortium. We only included gene products with a medium to high IHC scoring intensity in either normal prostate or prostate cancer tissue in our comparisons. Detailed information regarding the quality assurance can be found at http://www.proteinatlas.org/qc.php. Cluster analyses (Euclidean) were performed using the open-source tool Cluster 3.0 (http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster/software.htm) and heat maps were displayed using TreeView (http://jtreeview.sourceforge.net).
Nine individual direct EPS samples from men with prostate cancer (Gleason 6–7, stages pT2a–pT3b) were selected for MudPIT-based proteomic analysis. To minimize variability introduced by sample handling, EPS samples were directly digested in-solution and analyzed by a 9-step MudPIT on an LTQ-Orbitrap XL mass spectrometer (see Materials and Methods). Rigorous protein identification criteria resulted in the identification of 916 unique proteins with high confidence (Fig. 1A/B), with 230 to 550 unique proteins identified per individual cancer patient sample (Fig. 1B). The list of 916 proteins (SI Tab. 1) obtained from the direct EPS cancer samples was compared to the proteins reported in a recent publication by Li and colleagues 10, who collected several EPS fluids using a more vigorous prostate massage procedure (Fig. 1C). Only twenty proteins, primarily blood-derived, were unique to the study of Li et al. 10, while the vast majority of proteins were uniquely identified in the current study (816 unique protein clusters and 94 shared protein clusters; for mapping details see Materials and Methods). Based on the very low levels of hemoglobin and semenogelin proteins detected in our analysis, we conclude that the direct EPS samples have minimal contamination with blood and seminal vesicle derived proteins. Using cumulative spectral counts, albumin and immunoglobulins account for approximately 25% and 14%, respectively, of protein concentration totals in direct EPS. Among known prostate-derived proteins, the next most abundant, accounting for an additional 25% of total abundance, were lactoferrin, prostate specific antigen, prostatic acid phosphatase, zinc-α2 glycoprotein and aminopeptidase N. Functional analyses of the EPS proteome using Gene Ontology, predicted transmembrane domains (TMD; TMAP algorithm) and signal peptides (SP; PrediSi algorithm) revealed a wide variety of functional categories (SI Fig. 1). 
A significant number of detected proteins were membrane proteins and members of the extracellular region, including 34 CD molecules (SI Tab. 1). This was further supported by the fact that ~ 40% of the detected proteins had at least one predicted TMD and contained a predicted SP, indicative of targeting to the secretory pathway. A smaller number of proteins could be assigned to other intracellular organelles.
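The abundance estimates based on cumulative spectral counts reduce to each protein's share of all assigned spectra. The sketch below shows that calculation; the counts are made-up round numbers chosen only to reproduce the reported proportions for albumin and immunoglobulins, and the function name is illustrative. Spectral counting is a semi-quantitative proxy, so such fractions are approximate.

```python
def abundance_fractions(spectral_counts):
    """Return each entry's share of the summed spectral counts."""
    total = sum(spectral_counts.values())
    if total <= 0:
        raise ValueError("no spectra to normalize")
    return {name: count / total for name, count in spectral_counts.items()}


# Hypothetical cumulative counts across all samples (not the study's data).
counts = {
    "albumin": 2500,
    "immunoglobulins": 1400,
    "all other proteins": 6100,
}
fractions = abundance_fractions(counts)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%}")
```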
One goal of the current study was to provide the research community with a high quality, well-annotated resource of proteins expressed in EPS. This could improve our understanding of general prostate biology and guide the discovery of novel prostate cancer biomarkers. To accomplish this goal, we mapped our data to two publicly available resources: the Human Gene Atlas 14, which is based on the mRNA microarray expression profile of >50 human tissues, and the Human Protein Atlas (www.proteinatlas.org) 15, 16, a large resource of immunohistochemistry images and available human antibodies (Fig. 2A). The rationale for these comparisons was to identify proteins detected in EPS fluids that are also enriched in prostate tissue relative to other human tissues included in the microarray database. In addition, integrating EPS proteins with the Human Protein Atlas database catalogues the availability of validated antibodies, which could provide researchers with an important tool to further study the biological function of selected proteins.
Briefly, using only mRNAs with ≥3-fold above median expression in prostate tissue compared to 20 other human tissue types in the Human Gene Atlas, 279 EPS gene products were identified, implying a potential function specifically related to prostate biology. In the Human Protein Atlas, a total of 513 proteins detected in the EPS proteome currently have antibodies with varying staining intensities in human prostate tissue (for details see Materials and Methods). The identities of the EPS proteins mapped to the two public databases are available in SI Tables 2 and 3.
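The fold-above-median criterion can be sketched as follows. The text does not specify whether the median is taken over all tissues or only over the non-prostate tissues; this sketch assumes the latter. The gene names, tissue names and signal values are invented for illustration.

```python
from statistics import median


def prostate_enriched(expression, tissue="prostate", min_fold=3.0):
    """Genes whose signal in `tissue` is >= min_fold times the median of the other tissues.

    expression: {gene: {tissue_name: signal}} (a hypothetical layout).
    """
    enriched = []
    for gene, by_tissue in expression.items():
        others = [signal for name, signal in by_tissue.items() if name != tissue]
        reference = median(others)
        if reference > 0 and by_tissue[tissue] / reference >= min_fold:
            enriched.append(gene)
    return sorted(enriched)


# Toy microarray signals for two made-up genes in four tissues.
expression = {
    "GENE_A": {"prostate": 900, "liver": 100, "kidney": 120, "lung": 80},
    "GENE_B": {"prostate": 150, "liver": 100, "kidney": 120, "lung": 80},
}
print(prostate_enriched(expression))               # 3-fold cut-off
print(prostate_enriched(expression, min_fold=10))  # stricter 10-fold cut-off
```

In this toy example GENE_A is 9-fold above the median of the other tissues, so it passes the 3-fold cut-off but not a 10-fold one, while GENE_B passes neither.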
Next, we integrated the EPS proteome with three other published proteomic datasets 11–13. This included a recent publication of proteins secreted into the cell culture media by three established prostate cancer cell lines (PC3, LNCaP, and 22Rv1) 13 and two human body fluid datasets (urine and seminal plasma) directly related to EPS 11, 12. The rationale for the comparison to the cell secretions was to distinguish proteins readily secreted by prostate cancer cells from proteins secreted by other cell types of the prostate. This is not an ideal comparison, however, as the cell lines are more reflective of metastatic prostate cancers and the direct EPS samples were from lower grade Gleason 6 and 7 cancers. An additional caveat is that these cells have numerous genomic alterations, have been grown in culture for many passages since the primary isolates, and might not truly reflect an in vivo setting. Approximately 300 proteins (Fig. 3) were shared between the EPS proteome and each of the three prostate cancer cell lines (see SI Tables 4–6 for detailed mappings). Likewise, approximately 500 proteins were shared between the EPS proteome and normal human urine and seminal plasma (Fig. 3) (see SI Tables 7 and 8). Several well known prostate markers (e.g. PSA, ACPP) were shared among the EPS proteome and the described prostate cancer cell lines. The data provided in this study could therefore be used to mechanistically investigate proteins identified in vivo (EPS) using well established prostate cancer cell lines. Interestingly, several well known prostate markers (e.g. MSMB, TMPRSS2) were only identified in the EPS proteome and the other two available body fluids (see below), strengthening the rationale of using in vivo obtainable fluids to study prostate/prostate cancer biology. 
It is likely that proteins exclusively detected in the EPS proteome, as compared to proteins detected in healthy human urine and seminal plasma, are specifically enriched in secretions of the prostate under pathological conditions, and were therefore absent or at a very low concentration, in the other body fluids.
To fully utilize the power of multi-dataset comparison we integrated all publicly available data resources with our current EPS proteome (Fig. 4), as an example of a potential data mining strategy. Briefly, we focused on gene products strongly enriched in normal prostate tissue using the described microarray dataset (10-fold above median expression) that were also detected in our EPS proteome and had antibodies available through the Human Protein Atlas database. This integration resulted in 54 unique gene products satisfying all described integration criteria (Fig. 4 and SI Tab. 9). A heat map display of this integration is presented in Figure 4, including available mappings to other public proteome datasets. Although all gene products are strongly expressed in prostate tissue, most showed additional expression in several other tissues. Known prostate markers (vertical line), such as PSA, ACPP and MSMB, were readily identified. Several identified gene products have commercially available ELISAs, indicated by arrows, that we have confirmed for use in EPS samples (data not shown). Comparison to other published proteomics datasets is also indicated in the presented heat map, and this data could be used to further guide potential follow-up analyses. As all 54 proteins displayed in the current heat map have available immunohistochemistry images at the Human Protein Atlas, we systematically screened the available images for both normal prostate tissue and prostate cancer tissue. Several proteins with intriguing staining patterns are discussed below, demonstrating an example of how the current data could be used by others.
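The integration described above amounts to intersecting three gene lists: genes detected in EPS, genes passing the 10-fold microarray cut-off, and genes with Human Protein Atlas antibodies. A minimal sketch follows; it is not the in-house relational database used in the study, and the set contents are toy examples. KLK3 (PSA), ACPP and MSMB are placed in all three sets because the text reports them among the integrated gene products, while GENE_X and GENE_Y are hypothetical placeholders.

```python
def integrate(eps_genes, enriched_genes, hpa_genes):
    """Gene symbols present in all three lists, sorted for stable output."""
    return sorted(set(eps_genes) & set(enriched_genes) & set(hpa_genes))


# Toy inputs (assumed for illustration, not the study's gene lists).
eps_genes = {"KLK3", "ACPP", "MSMB", "ALB"}            # detected in the EPS proteome
enriched_genes = {"KLK3", "ACPP", "MSMB", "GENE_X"}    # >=10-fold above median in prostate
hpa_genes = {"KLK3", "ACPP", "MSMB", "GENE_Y"}         # antibody and IHC data available

candidates = integrate(eps_genes, enriched_genes, hpa_genes)
print(candidates)
```

Mapping all datasets to a common gene identifier before intersecting is what makes this step simple; the 95% sequence identity clustering used for the proteome-to-proteome comparisons is a separate matter not covered by this sketch.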
Briefly, several proteins were identified which displayed a trend towards differential expression in normal prostate compared to prostate cancer tissue using available IHC images, although patient variability and image quality differences were also observed (Fig. 5). Neprilysin (MME, CD10) showed a trend towards decreased expression in prostate cancer tissue. Using a commercially available ELISA kit, we have recently seen a similar trend in a limited number of EPS samples from aggressive prostate cancer as compared to indolent disease (data not shown). In contrast, GDF15 and FASN showed a trend towards increased expression in prostate cancer tissues (Fig. 5). As we have now linked our entire data to available mRNA microarray expression, Human Protein Atlas antibody staining and other published, related proteomics datasets, we believe this data will provide an important resource for the prostate biology/cancer community to guide the investigation of other proteins in hypothesis driven experiments.
The cumulative list of identified proteins in the direct EPS samples currently represents the largest compilation resource of proteins present in prostatic fluid secretions from men with confirmed prostate cancers. In comparison to other proteomics studies aimed at discovering prostate function or prostate cancer biomarkers using cell secretions 13, plasma 17, 18, or tissue 19, 20, the current study used a relevant organ-proximal body fluid for the detection of secreted proteins in vivo. For this comprehensive analysis we used direct EPS fluids, because these fluids most likely contain secreted, prostate-specific proteins in a significantly higher concentration as compared to urine or plasma samples. The relatively high expression of many of these proteins like PSA and ACPP makes the fluid amenable to all current forms of mass spectrometry analysis with reduced front-end processing or purification.
Current early screening for prostate cancer relies on PSA detection in serum 21. Over time, this screening has dramatically reduced the presentation of high-grade disease, yet other studies have indicated that the majority of early-detected prostate cancers are not lethal and may be considered clinically insignificant 22–24. Thus, current methods to establish the risk of progression and prognosis of prostate cancers remain suboptimal, and a large number of patients are over-treated with a significant negative financial impact on health care 22, 23, 25. Thus, new diagnostics are needed that could predict disease course and discriminate indolent tumors from aggressive tumors, allowing for more appropriate treatment options to be offered to newly diagnosed prostate cancer patients. Targeting identification of potential new biomarkers in proximal fluids of the prostate like direct EPS and post-DRE EPS urines could address these clinical needs.
With the availability of modern mass spectrometers and high performance peptide separations like MudPIT 6, 26 we are now able to routinely identify hundreds to thousands of proteins in complex biological samples. This can result in a significant challenge for biomarker validation strategies, as the development of ELISA or MRM-MS assays for all identified proteins is not feasible. Equally challenging is the selection of the most likely candidate biomarkers from the large list of identified proteins. To guide the selection criteria of promising candidate proteins involved in prostate function and prostate cancer biomarker discovery, we have integrated our EPS proteome with a large variety of publicly available datasets. To further streamline MRM-MS assay development, we have deposited the entire MS data to the Tranche database for others to mine, as we have previously reported 7, 8, 27. Cumulatively, this EPS proteome dataset should facilitate the selection and confirmation of individual biomarker candidates in follow up studies by our group and other investigators.
T.K. is supported through the Canadian Research Chairs Program. This work was supported in part by grants from Prostate Cancer Canada (2009-454) and the Canadian Institute of Health Research (MOP-93772) to T.K. and J.A.M., and in part by grants from the National Institutes of Health to R.R.D. (R01 CA135087, R21 CA137704) and O.J.S. (U01 CA085067). This research was funded in part by the Ontario Ministry of Health and Long Term Care. The views expressed do not necessarily reflect those of the OMOHLTC. Y.K. is a recipient of a Paul Starita Graduate Student Fellowship.