Recently, it was demonstrated that proteins can be translated from alternative open reading frames (altORFs), increasing the size of the actual proteome. Top-down mass spectrometry-based proteomics allows the identification of intact proteins containing post-translational modifications (PTMs) as well as truncated forms translated from reference ORFs or altORFs.
Top-down tissue microproteomics was applied on benign, tumor and necrotic-fibrotic regions of serous ovarian cancer biopsies, identifying proteins exhibiting region-specific cellular localization and PTMs. The regions of interest (ROIs) were determined by MALDI mass spectrometry imaging and spatial segmentation.
Analysis with a customized protein sequence database containing reference and alternative proteins (altprots) identified 15 altprots, including alternative G protein nucleolar 1 (AltGNL1) found in the tumor, and translated from an altORF nested within the GNL1 canonical coding sequence. Co-expression of GNL1 and altGNL1 was validated by transfection in HEK293 and HeLa cells with an expression plasmid containing a GNL1-FLAG(V5) construct. Western blot and immunofluorescence experiments confirmed constitutive co-expression of altGNL1-V5 with GNL1-FLAG.
Taken together, our approach provides means to evaluate protein changes in the case of serous ovarian cancer, allowing the detection of potential markers that have never been considered.
With recent advances in mass spectrometry (MS)-based proteomics, top-down strategies now allow the analysis of complex protein mixtures in their intact state without the need for enzymatic digestion (Tran et al., 2011). In a study by Ye et al., top-down MS-based proteomics coupled to Matrix Assisted Laser Desorption Ionization (MALDI) MS imaging (MALDI-MSI) of a rat brain post-treated with the NMDA receptor antagonist MK801 revealed 34 proteins with their specific post-translational modifications (PTMs) (Ye et al., 2014). Recently, we performed MALDI-MSI coupled to top-down tissue microproteomics on 3 rat brain regions and demonstrated that specific proteoforms linked to the physiology of the tissue region can be identified; several unique markers were identified showing different proteoforms of brain-specific proteins (data not shown). In this work, we investigated the pathological heterogeneity of the ovarian serous cancer tumor microenvironment using a top-down microproteomics approach. Specifically, we investigated alterations of the proteome microenvironment in order to delineate and characterize specific protein profiles in benign, tumor and necrotic/fibrotic tumor regions by taking into account their PTMs and assessing their cleaved forms.
Our assessment also takes into account the identification of alternative proteins (AltProts) (Vanderperre et al., 2013). AltProts are translated from alternative open reading frames (AltORFs). AltORFs can have different localizations: they can overlap annotated protein-coding sequences in a different reading frame, or can be present within untranslated regions (UTRs) of mature mRNAs (Mouilleron et al., 2016, Vanderperre et al., 2013). Thus, alternative proteins are completely different from annotated or reference proteins (Mouilleron et al., 2016, Vanderperre et al., 2013). AltORFs may also be present in transcripts annotated as non-coding RNAs (ncRNAs). Indeed, proteins translated from non-annotated AltORFs were detected in our previous studies by MS. Some of these alternative translation products have also been validated biologically and assessed for their biological activity. For example, we have shown that AltMRVI1 is translated from an AltORF overlapping the MRVI1 coding sequence in a different reading frame and interacts with BRCA1 (Vanderperre et al., 2013). Translation of AltORFs in addition to annotated coding sequences opens the door to proteins that cannot be detected using conventional protein databases. Thus, given their intriguing role, we aimed to investigate the profiles of the “hidden proteome” and to assess its contribution in serous ovarian cancer. Additionally, these AltProts are mainly small proteins, and the top-down proteomics strategy appears better suited to their detection than shotgun proteomics. Even though the shotgun approach remains the most efficient strategy for high-throughput proteomics, the identification of small proteins with this approach can be hampered by the low number of tryptic peptides generated and the generally smaller number of enzyme cleavage sites.
Therefore, top-down proteomics offers a good alternative to identify small proteins or truncated forms as well as some PTMs from the reference or the hidden proteome. Overall, our aim is to identify and characterize reference and altprots as potential markers for serous ovarian cancer pathology.
The ovarian biopsies were obtained from patients of the Centre Oscar Lambret (Lille, France) and from the CHRU de Lille Pathology Department. All experiments were approved by the local Ethics Committee (CPP Nord Ouest IV 12/10) in accordance with the French and European legislation on this topic. Methods of collection for human ovaries were performed in accordance with procedures that were approved by the Ethics Committee of the CHRU Lille. The study adhered to the principles of the Declaration of Helsinki and the Guidelines for Good Clinical Practice. All patients gave written informed consent before enrolment. The flash-frozen biopsies were stored at −80 °C until use.
We first performed MS imaging of lipids in order to perform spatial segmentation analysis to identify regions of interest (ROIs), which were then subjected to liquid micro-junction (LMJ) (Quanico et al., 2013, Wisztorski et al., 2016) or parafilm-assisted manual microdissection (PAM) (Franck et al., 2013, Quanico et al., 2014) methods of microextraction. LMJ and PAM were followed by top-down proteomics for protein identification from necrotic/fibrotic tumor, tumor and benign (B) regions (technical triplicate). Reference and alternative proteins were then identified and localized in the 3 tissue regions of the ovarian serous cancer biopsies.
MS grade water (H2O), acetonitrile (ACN), methanol (MeOH), ethanol (EtOH), and chloroform were purchased from Biosolve (Valkenswaard, Netherlands). The cleavable detergent ProteaseMAX was purchased from Promega (Charbonnieres, France). Parafilm M was purchased from Pechiney Plastic Packaging (Chicago, Illinois). 2,5-dihydroxybenzoic acid (DHB), sodium dodecyl sulphate (SDS), dl-dithiothreitol (DTT), trifluoroacetic acid (TFA) and formic acid (FA) were purchased from Sigma (Saint-Quentin-Fallavier, France).
For MALDI-MSI experiments, tissues were cut in 10-μm slices using a cryostat (Leica Microsystems, Nanterre, France) and mounted on Indium Tin Oxide (ITO)-coated glass slides (LaserBio Labs, Sophia-Antipolis, France) by finger-thawing. For LMJ and PAM, consecutive tissue slices were also obtained but with a 30-μm thickness. For LMJ, the tissues were mounted on a polylysine glass slide. For PAM, on the other hand, the tissue sections were mounted on a parafilm M-covered glass slide (Franck et al., 2013, Quanico et al., 2014). After tissue section preparation, the sections were immediately dehydrated under vacuum at room temperature for 20 min. The slides were then scanned using a Nikon scanner and stored at −80 °C until use.
DHB matrix (50 mg/mL) dissolved in 6:4 (v/v) MeOH/0.1% TFA in water was manually sprayed at a flow rate of 300 μL/h using a syringe pump connected to an electrospray nebulizer. The nebulizer was connected to a nitrogen line operated at 1 bar. The nebulizer was moved uniformly throughout the tissue until crystallization was sufficient to ensure optimal lipid detection. The tissue was then analyzed using an UltraFlex II MALDI-TOF/TOF (Time Of Flight) mass spectrometer equipped with a Smartbeam Nd-YAG laser (355 nm) and controlled by FlexControl software (Bruker Daltonics, Bremen, Germany). Lipid image acquisition was performed in positive reflector mode within an m/z range of 50 to 900 at a 300 μm resolution, and the obtained spectra were averaged from 300 laser shots per pixel. Peak detection and spatial segmentation analysis were then performed on the acquired images using SCiLS Software 2015b (SCiLS Lab GmbH, Bremen, Germany). For spatial segmentation, the Bisecting k-Means approach with Correlation as the distance metric was used. Spectra were subjected to median normalization and medium denoising prior to peak picking. After analysis, the ROIs were determined by selecting segments where the correlation distance metric is significantly distant from the other.
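To illustrate the segmentation principle described above, the following minimal Python sketch repeatedly bisects the largest segment, assigning each spectrum to the nearer of two centroids by correlation distance (1 minus the Pearson correlation). It is not the SCiLS implementation, whose internals are not described here. The function names, the deterministic initialization and the choice to always split the largest segment are assumptions made for illustration only.

```python
from math import sqrt

def correlation_distance(a, b):
    """Return 1 - Pearson correlation between two equal-length spectra."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    denom = sqrt(sum(x * x for x in da)) * sqrt(sum(y * y for y in db))
    if denom == 0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(da, db)) / denom

def mean_spectrum(spectra):
    """Average a list of spectra channel by channel."""
    return [sum(col) / len(spectra) for col in zip(*spectra)]

def two_means(spectra, n_iter=20):
    """Split spectra into two groups (labels 0/1) by correlation distance."""
    # Assumed deterministic start: first spectrum and the one farthest from it.
    first = spectra[0]
    far = max(spectra, key=lambda s: correlation_distance(s, first))
    centers = [first, far]
    labels = [0] * len(spectra)
    for _ in range(n_iter):
        labels = [
            int(correlation_distance(s, centers[1]) < correlation_distance(s, centers[0]))
            for s in spectra
        ]
        if len(set(labels)) < 2:
            break
        centers = [
            mean_spectrum([s for s, l in zip(spectra, labels) if l == k])
            for k in (0, 1)
        ]
    return labels

def bisecting_kmeans(spectra, n_segments):
    """Repeatedly bisect the largest segment; return lists of pixel indices."""
    segments = [list(range(len(spectra)))]
    while len(segments) < n_segments:
        segments.sort(key=len)
        largest = segments.pop()
        labels = two_means([spectra[i] for i in largest])
        if len(set(labels)) < 2:  # segment cannot be split further
            segments.append(largest)
            break
        segments.append([i for i, l in zip(largest, labels) if l == 0])
        segments.append([i for i, l in zip(largest, labels) if l == 1])
    return segments
```

Given six toy spectra dominated by three different m/z channels, `bisecting_kmeans(spectra, 3)` returns three lists of pixel indices, one per dominant channel. In a real workflow the spectra would first be normalized, denoised and peak-picked as described in the text.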
To ensure minimal protein hydrolysis by endogenous proteases, every step from buffer preparation to nanoflow Liquid Chromatography (nanoLC)-MS/MS analysis was carried out within the same day with on-ice conservation in between sample processing steps. A 0.1% (v/v) aliquot of temperature- and acid-cleavable commercial detergent (ProteaseMAX) was prepared from a 1% (v/v) stock solution prepared in 50 μM DTT and immediately stored at −20 °C until use according to manufacturer's recommendations. Aliquots were processed within the day of sample extraction to ensure minimal degradation of the detergent over time, and remaining solutions were discarded.
To ensure optimal protein extraction, lipids were removed from the tissue section by immersing the glass slide consecutively for 1 min each in 70% EtOH and in 95% EtOH then 30 s in chloroform with complete solvent evaporation under reduced pressure at room temperature between each washing step. The slide was then scanned again as washing steps improve structure visibility. The slide used for LMJ microextraction was placed inside a TriVersa NanoMate (Advion, Ithaca, NY, USA) instrument. Proteins were then extracted from every ROI by completing six cycles of extraction composed of the following steps: 1) aspirate 1.5 μL of detergent solution, 2) dispense 0.8 μL of the extraction buffer on the surface of the selected ROI with 10 iterations of up-and-down pipetting, 3) aspirate 2.5 μL volume, and 4) expel 4 μL from the pipette tip into a clean tube to ensure complete retrieval of the initial 1.5 μL volume. Per ROI, the final collected volume was 9 μL. Each extract was immediately stored on-ice until further processing.
ROIs generated from spatial segmentation of MS images were cut using a scalpel. The pieces of parafilm M containing the tissue were then placed in a tube containing 10 μL of the extraction buffer and stored on-ice until further processing.
The extracts obtained with the LMJ or PAM strategy were sonicated for 5 min and the proteins were denatured at 55 °C for 15 min. The tubes were then quickly centrifuged to collect the extracts at the bottom of the tube. For extracts obtained using the PAM strategy, the parafilm pieces were carefully removed from the tubes using a pipette tip and the extracts were incubated at 95 °C for 10 min to ensure complete detergent dissociation. The tubes were then quickly centrifuged and stored on ice. 11 μL of 10% ACN in 0.4% FA in water were added to each tube so that the final ACN concentration is equal to the concentration of ACN at the beginning of the LC gradient. Samples were subjected to nanoLC-MS/MS analysis on the same day of sample preparation and were kept in the autosampler with the thermostat set at 4 °C.
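The dilution step can be checked with a short calculation. The sketch below assumes that the extract itself contains no ACN and that its volume is unchanged by the heating steps (9 μL for LMJ, from six cycles of 1.5 μL, and 10 μL for PAM); the function name is hypothetical.

```python
def final_acn_percent(extract_ul, diluent_ul=11.0, diluent_acn_percent=10.0):
    """ACN percentage (v/v) after adding the diluent to an ACN-free extract."""
    return diluent_ul * diluent_acn_percent / (extract_ul + diluent_ul)

lmj_extract_ul = 6 * 1.5   # six LMJ cycles, 1.5 uL retrieved per cycle
pam_extract_ul = 10.0      # PAM pieces were placed in 10 uL of buffer

print(final_acn_percent(lmj_extract_ul))            # 5.5
print(round(final_acn_percent(pam_extract_ul), 2))  # 5.24
```

Under these assumptions both values are close to the 5% ACN at which the LC gradient starts.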
5 μL of the sample was loaded onto a 2 cm × 150 μm internal diameter IntegraFrit sample trap-column (New Objective, Woburn, Massachusetts, USA) at a maximum pressure of 280 bar using a Proxeon EASY nLC-II (Proxeon, Thermo Scientific, Bremen, Germany). Proteins were separated on a 15 cm × 100 μm internal diameter PLRP-S column (Varian, Palo Alto, California, USA) with a linear gradient of ACN from 5% to 55% over 110 min and from 55% to 90% over 25 min at a flow rate of 300 nL/min.
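For readers who wish to relate a retention time to the solvent composition, the two-segment gradient can be written as a piecewise linear function. This is a sketch of the programmed gradient only: it assumes that the gradient starts at t = 0, that the two segments run back to back, and it ignores any system delay volume.

```python
def acn_percent(t_min):
    """Programmed %ACN at time t (min) for the two-segment linear gradient."""
    # (start time, end time, start %ACN, end %ACN)
    segments = [(0.0, 110.0, 5.0, 55.0), (110.0, 135.0, 55.0, 90.0)]
    for t0, t1, c0, c1 in segments:
        if t0 <= t_min <= t1:
            return c0 + (c1 - c0) * (t_min - t0) / (t1 - t0)
    raise ValueError("time outside the 0-135 min gradient")

print(acn_percent(55))     # 30.0
print(acn_percent(122.5))  # 72.5
```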
Data were acquired on a Q-Exactive mass spectrometer (Thermo Scientific, Bremen, Germany) equipped with a nanoESI (Electrospray Ionization) source (Proxeon, Thermo Scientific, Bremen, Germany) and a PicoTip nanospray emitter (New Objective, Woburn, Massachusetts, USA). Data were acquired in data-dependent mode using a top 3 strategy. Full scans were acquired by averaging 4 microscans at 70,000 resolution within an m/z range of 800–2000 with an Automatic Gain Control (AGC) target of 1 × 10^6 and a maximum accumulation time of 200 ms. The three most abundant ions with a charge state greater than 3 or with an unassigned charge state were selected for fragmentation. Precursors were selected within a 15 m/z selection window by the quadrupole and fragmented with a Normalized Collision Energy (NCE) of 25; the AGC target was set to 1 × 10^6 with a maximum accumulation time of 500 ms. For each MS/MS spectrum, two microscans at a resolution of 70,000 at m/z 400 were acquired and averaged. Dynamic exclusion was set to 20 s.
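The precursor selection logic described above (top 3, charge state greater than 3 or unassigned, 20 s dynamic exclusion) can be sketched as follows. This is an illustration of the selection rule, not the instrument firmware; the function name, the peak tuple layout and the matching of excluded precursors by m/z rounded to 0.1 are assumptions.

```python
def select_precursors(peaks, excluded, now_s, top_n=3, min_charge=4, exclusion_s=20.0):
    """Pick up to top_n precursors from one full scan.

    peaks: list of (mz, intensity, charge); charge is None when unassigned.
    excluded: dict mapping rounded m/z -> time (s) at which exclusion ends.
    """
    candidates = [
        p for p in peaks
        if (p[2] is None or p[2] >= min_charge)           # charge > 3 or unassigned
        and excluded.get(round(p[0], 1), 0.0) <= now_s    # not dynamically excluded
    ]
    chosen = sorted(candidates, key=lambda p: p[1], reverse=True)[:top_n]
    for mz, _, _ in chosen:
        excluded[round(mz, 1)] = now_s + exclusion_s
    return chosen
```

Calling the function on consecutive scans with the same `excluded` dictionary shows the effect of dynamic exclusion: ions fragmented in one scan are skipped for the next 20 s, so lower-abundance proteoforms get a chance to be selected.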
RAW files were processed with ProSightPC 3.0 (Thermo Fisher Scientific, Bremen, Germany). Spectral data were deisotoped using the cRAWler algorithm and searched against the complex Homo sapiens ProSightPC database version 2014_07 containing every canonical protein and its known PTMs. Files were searched with the “absolute mass” then “biomarker” search modes (Kellie et al., 2010) in ProSightPC, considering every PTM available in the complex database. A second search was performed with the same search strategy to detect altORF products, using a concatenated database composed of the H. sapiens UniProt Reference proteome (canonical and isoforms) dated 01.16.2015 and an in silico translated database of H. sapiens transcripts from GenBank containing every ORF whose potential protein product had at least 29 amino acids. Identification was considered positive when one of the two strategies gave an expectation value (E-value) lower than 10^-4.
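The in silico translation that underlies such an altORF database can be sketched as below. This is not the pipeline used to build the database in this study. It assumes the standard genetic code, ORFs that run from an ATG to the first in-frame stop codon, the three forward frames of each transcript only, and it discards ORFs without a stop codon; `find_orfs` is a hypothetical name.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def find_orfs(transcript, min_aa=29):
    """Yield (start, frame, protein) for ATG-to-stop ORFs in the 3 forward frames."""
    seq = transcript.upper().replace("U", "T")
    for frame in range(3):
        protein, start = None, None
        for i in range(frame, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X for ambiguous codons
            if protein is None:
                if aa == "M":                 # open a new ORF at the first ATG
                    protein, start = ["M"], i
            elif aa == "*":                   # close the ORF at the stop codon
                if len(protein) >= min_aa:
                    yield start, frame, "".join(protein)
                protein = None
            else:
                protein.append(aa)
```

Because every frame is scanned, an ORF nested within a canonical coding sequence but in a different reading frame, as described for AltGNL1, is reported alongside the reference ORF.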
Raw files were also processed with ProSightPC 3.0 (Thermo Scientific) and Proteome Discoverer 2.1 (Thermo Scientific) utilizing the ProSightPD 1.0 node. Spectra were then searched using a three-tiered search tree. The first search was an Absolute Mass search with MS1 tolerance of 100 Da and MS2 tolerance of 10 ppm, against the complex Homo sapiens ProSightPC database version 2014_07 containing every canonical protein and its known PTMs. The second search was a ProSight Biomarker search with MS1 tolerance of 10 ppm, MS2 tolerance of 10 ppm, against the same database. Lastly, a second Absolute Mass search was performed with MS1 tolerance of 1000 Da, MS2 tolerance of 10 ppm, using Delta M mode, against the same database.
False discovery rates (FDR) were estimated as described previously (Kellie et al., 2012). Briefly, data were searched against scrambled protein sequences used as decoys, with the same strategies as above (absolute mass and biomarker modes). Logarithmic P-score distributions of decoy protein hits were analyzed for each search mode (absolute mass and biomarker) separately. The area under each score distribution was integrated starting from the best score (highest –log P) until it reached 5% of the total distribution, thus giving P-score cutoffs at 5% FDR for each search strategy. Proteins with P-scores greater than these cutoffs were removed from the identification files.
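One reading of this cutoff procedure is sketched below: the decoy hits are ranked by –log10(P), and the cutoff is the score left once the best 5% of decoy hits have been passed. The function names are hypothetical and the empirical ranking stands in for the density-based area calculation used in the study.

```python
from math import log10

def score_cutoff(decoy_p_scores, fdr=0.05):
    """Return the -log10(P) value exceeded by at most `fdr` of decoy hits."""
    neg_log = sorted((-log10(p) for p in decoy_p_scores), reverse=True)
    n_allowed = int(fdr * len(neg_log))  # decoy hits tolerated above the cutoff
    return neg_log[n_allowed]

def filter_hits(hits, cutoff):
    """Keep (accession, P-score) hits whose -log10(P) is strictly above the cutoff."""
    return [(acc, p) for acc, p in hits if -log10(p) > cutoff]
```

A separate cutoff would be computed for each search mode, since the decoy score distributions of the absolute mass and biomarker searches differ.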
UniProt accession numbers from each ovarian tissue technical replicate were combined and exported to UniProt “Retrieve/ID mapping” tool to recover files with accession numbers, Gene names and protein names (Supplementary Data 1). Venn diagrams were then generated by entering the UniProt combined accession number of each region into the University of Gent Venn diagram Webtool. The mass spectrometry top-down proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (Vizcaíno et al., 2016) partner repository with the dataset identifier PXD005420.
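The region-specific counts shown in a Venn diagram follow from simple set operations on the combined accession lists. A minimal sketch, with placeholder accession strings rather than real identifiers and a hypothetical function name:

```python
def region_specific(regions):
    """Map each region name to the accessions found in that region only."""
    specific = {}
    for name, accessions in regions.items():
        others = set().union(*(a for n, a in regions.items() if n != name))
        specific[name] = set(accessions) - others
    return specific

# Placeholder accessions, for illustration only.
regions = {
    "tumor": {"ACC1", "ACC2", "ACC3"},
    "necrotic/fibrotic": {"ACC2", "ACC4"},
    "benign": {"ACC3", "ACC5"},
}
print(region_specific(regions)["tumor"])  # {'ACC1'}
```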
The gene names of identified proteins were used as input to retrieve a network from STRING (Szklarczyk et al., 2015). Elsevier's Pathway Studio version 10.0 (Ariadne Genomics, Elsevier) was used to deduce relationships among differentially expressed protein candidates using the Ariadne ResNet database (Bonnet et al., 2009, Yuryev et al., 2009). The “Subnetwork Enrichment Analysis” (SNEA) algorithm was selected to extract statistically significant altered biological and functional pathways pertaining to each identified set of protein hits among the different groups. SNEA uses Fisher's statistical test to determine whether there are nonrandom associations between two categorical variables organized by a specific relationship. Integrated Venn diagram analysis was performed using InteractiVenn, a web-based tool for the analysis of complex data sets (Heberle et al., 2015).
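The kind of test involved can be illustrated with a one-sided Fisher exact test for over-representation, computed from the hypergeometric distribution. This sketch is not the Pathway Studio implementation; the one-sided form, the function name and the argument names are assumptions.

```python
from math import comb

def fisher_enrichment_p(hits_in_set, hits_total, set_size, background):
    """One-sided Fisher exact P-value for over-representation.

    Probability of observing at least `hits_in_set` members of a subnetwork
    of size `set_size` among `hits_total` proteins drawn from `background`.
    """
    upper = min(hits_total, set_size)
    total = comb(background, hits_total)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, hits_total - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / total
```

A small P-value indicates that the identified proteins fall into the subnetwork more often than expected by chance. In practice such P-values are computed for many subnetworks and should be corrected for multiple testing.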
To validate the altprot product identified by top-down proteomics, one of the identified altprots was selected and cloned in the context of its reference protein. The plasmid contained the canonical G protein nucleolar 1 (GNL1) coding sequence with a C-terminal FLAG tag and the AltGNL1 coding sequence nested within the GNL1 coding sequence, but in a frameshifted ORF with a C-terminal V5 tag. The DNA sequence was built from gBlocks, which were assembled by the Gibson assembly protocol (Gibson et al., 2009) using the NEBuilder HiFi DNA Assembly Cloning Kit (New England BioLabs, Ipswich, Massachusetts, USA) according to manufacturer's recommendation into a pcDNA 3.1-expression vector (Invitrogen, Carlsbad, California, USA). GNL1 and AltGNL1 co-expression was validated by western blot in HEK 293 cells transfected with polyethylenimine (PEI, Sigma) as transfection reagent (Hsu and Uludağ, 2012). Briefly, cells were grown in complete Dulbecco's Modified Eagle's Medium (DMEM, Wisent, St-Bruno, Québec, Canada) in a 6-well plate until 70–80% confluent. 1.6 μg of plasmid DNA was mixed into 80 μL of serum-free DMEM and 8 μL of 0.1% (w/v) PEI. The mixture was then gently mixed and left to stand at room temperature for 10 min. 480 μL of complete medium was then added to the mixture, which was immediately applied dropwise onto the cells without medium renewal. After 24 h of transfection, the cells were washed twice with phosphate-buffered saline (PBS) and lysed using 4% SDS. The lysate was sonicated and centrifuged and the protein content was estimated using Bicinchoninic Acid (BCA) assay (Thermo Scientific). 100 μg of protein was denatured and loaded onto a 15% SDS-PAGE gel. After migration, the proteins were transferred onto a polyvinylidene difluoride membrane. The membrane was then blocked using Tris-buffered saline with Tween 20 (TBST, 1/1000 v/v) supplemented with 2.5% milk. The membrane was rinsed with PBS and probed with anti-FLAG M2 mouse antibody (F1804, Sigma) and anti-V5 mouse antibody (V8012, Sigma) overnight. The membrane was then washed three times with TBST, rinsed with PBS, probed with anti-mouse horseradish peroxidase antibody (7076S, Cell Signaling Technology) and visualized.
GNL1 and AltGNL1 co-expression was also validated at the cellular level by immunofluorescence. HeLa cells were transfected with GeneCellIn transfection reagent (BioCellChallenge, Toulon, France). Briefly, 25,000 HeLa cells per well were seeded into a 24-well plate and left to grow in complete DMEM medium for 24 h. The cells were then transfected by adding 250 ng of plasmid DNA into 100 μL of serum-free DMEM medium and 1.5 μL of GeneCellIn transfection reagent. The mixture was left at room temperature for 15 min and then added dropwise to the cells without medium renewal. After 24 h of transfection, the cells were rinsed twice with PBS, fixed with 4% paraformaldehyde for 20 min and rinsed twice again. The cell membranes were permeabilized using 0.15% Triton X-100 for 5 min, then rinsed twice with PBS and then twice with Normal Goat Serum (NGS) blocking buffer for 20 min. Anti-FLAG rabbit antibody (F7425, Sigma) and anti-V5 mouse antibody were then added and cells were incubated overnight at 4 °C. The cells were then rinsed twice with NGS and probed with anti-Mouse 488 antibody (A11017, Invitrogen) and anti-rabbit 568 antibody (A21069, Invitrogen) for an hour. The cells were rinsed twice with PBS and incubated with 4′,6′-diamidino-2-phenylindole (DAPI) for 30 min. The cells were rinsed twice with PBS, mounted on a microscope slide with SlowFade (Thermo Scientific) and sealed. The slides were stored in the dark at 4 °C until observation via confocal microscopy.
MALDI-MSI was performed on ovarian high grade serous carcinoma sections in order to perform unsupervised spatial segmentation analysis and identification of ROIs (Fig. 1a). The tissue was thus successfully classified by MALDI-MSI with three main clusters (Fig. 1b). One was associated with the benign region (blue-cyan), whereas the two others matched the tumor and necrotic/fibrotic tumor regions (brown-red and orange-yellow, respectively). The presence of the three regions was confirmed by a pathologist (Fig. 1b). A total of 18 samples from the three clustered regions were extracted and analyzed in triplicate employing the two extraction strategies (Fig. 1c). This resulted in the identification of 150 proteins in LMJ and 149 in PAM (Fig. 1c) at an estimated FDR of 5% for reference and “reference plus AltProts” concatenated protein databases (Supplementary Fig. 1a and b, respectively). The distribution using LMJ or PAM is as follows: 41 vs. 47 specific proteins in tumor regions, 24 vs. 27 in necrotic/fibrotic tumor regions, and 37 vs. 19 in benign regions (Fig. 1c). Overall, 61 proteins are specific to the tumor region, 44 to the necrotic/fibrotic tumor region and 48 to the benign region. Thus, 237 proteins were identified by combining the data (see Supplementary Data 1) which is, to our knowledge, the highest number of identified proteins for tissue top-down microproteomics. From the list of identified proteins, some were already known to be involved in ovarian cancer (Table 1) but the ones we identified are mainly fragments of proteins e.g. a fragment of 58 amino acid residues derived from KRT8 (Fig. 1d). Among the identified proteins some are particularly interesting due to the fact that they are found in both tumor and necrotic-fibrotic tumor regions e.g. gamma-synuclein, Lupus La protein (SSB), Nucleophosmin (NPM1), Nuclease-sensitive element-binding protein 1 (YBX1), Probable ATP-dependent RNA helicase DDX17 (DDX17), and Hematological and neurological expressed 1-like protein (JPT2). Others are found specifically in benign and necrotic/fibrotic tumor regions, such as salivary acidic proline-rich phosphoprotein 1/2 (PRH1). G antigen 7 (GAGE7), High mobility group protein B1 (HMGB1), Glycogen synthase (GYS1), G antigen 2B/2C (GAGE2B), and Cilia- and flagella-associated protein 44 (CFAP44) are only found in the necrotic/fibrotic tumor region.
STRING protein analysis of the tumor region associated with GO term analyses led to the identification of two major pathways: RNA binding (GO: 0003727) and poly(A) RNA binding (GO: 0044822). Cellular component GO overrepresentation analysis revealed that the proteins identified in the tumor are mainly found in exosomes - extracellular vesicles (42.3%) and in the nucleus (57.6%). In the necrotic/fibrotic tumor region, 50% of the proteins are found in the extracellular exosomes and in vesicles, 16% are in the nucleus and 34% in various organelles. In benign regions, 50% of the proteins are involved in cellular traffic, 12.3% are cytoplasmic and 27.7% are membrane-bound proteins. Global subnetwork analyses in tumor, necrotic/fibrotic tumor and benign regions clearly showed differences in protein pathway involvement (Fig. 2). In the benign region, protein pathways are mainly implicated in cell survival, growth, motion, adhesion, differentiation and vascularization (Fig. 2a), whereas in the necrotic/fibrotic tumor region the proteins are mainly implicated in apoptosis, inflammation, neoplasm, acute phase reaction and oxidative stress (Fig. 2b). Tumor subnetwork global analysis showed pathways in neoplasm, autophagy, apoptosis, cell proliferation and tumor immunity (Fig. 2c).
Subnetwork enrichment analysis confirmed the global analysis (Fig. 3). In the benign region, the subnetworks revealed implication in muscle contraction and cell differentiation (Fig. 3a). For necrotic/fibrotic tumor, the subnetworks are involved in Rho Kinase, RNA processing and Rhabdomyocyte disease pathways (Fig. 3b). Tumor subnetworks revealed implication in necrosis, interferon regulatory factor signaling pathway, myonecrosis, and cancer and T cell hypo responsiveness (Fig. 3c). Global network analysis between benign and necrotic/fibrotic tumor regions (Fig. 3d) revealed proteins involved in apoptosis, contraction and actin organization pathways. The same analysis between tumor and necrotic/fibrotic tumor revealed proteins involved in neoplasm and Smooth Muscle Cell (SMC) proliferation pathways. Comparison of proteins from tumor and benign regions showed proteins involved in cell death, cell growth, keloid and muscle cell differentiation.
We previously identified 6 AltProts using the shotgun proteomic approach, namely AltADCY1, AltCCDC152, AltKART34, AltMOBKL2B, AltPALLD and AltSMCHD1 (Vanderperre et al., 2013). With the top-down microproteomics approach, 15 unknown proteins were identified in patient biopsies: AltApol6, AltCMBL, AltTLR5, AltPKHD1L1, AltLARS2-AS1, AltSERPINE1, AltCSNK1A1L, AltGPC5, AltLTB4R, AltTMP1, AltGRAMD4, AltMTHFR, AltAGAP1, AltGNL1 and AltRP11-576E20.1 (Table 2). Six altprots were identified in the benign region (AltTLR5, AltPKHD1L1, AltSERPINE1, AltGPC5, AltGRAMD4, AltAGAP1), 5 in the necrotic/fibrotic tumor region (AltApol6, AltLARS2-AS1, AltLTB4R, AltTMP1, AltMTHFR) and 4 in the tumor (AltCMBL, AltGNL1, AltRP11-576E20.1, AltCSNK1A1L). The function of these proteins remains unknown. AltGNL1 was selected for further analysis (Fig. 4) based on immunofluorescence data provided by the Human Protein Atlas confirming the presence of its reference protein GNL1 in ovarian cancer tissue. In order to validate the co-expression of GNL1 and the non-annotated AltGNL1 protein from the same gene, we transfected HEK 293 cells with an expression plasmid containing a GNL1-FLAG(V5) construct (Fig. 5). In this construct, the FLAG and V5 tags are in-frame with GNL1 and AltGNL1, respectively. Both GNL1-FLAG and AltGNL1-V5 are expressed and detected with anti-FLAG and anti-V5 antibodies, respectively (Fig. 5a & b). Co-expression at single cell level was confirmed by immunofluorescence (Fig. 5c). AltGNL1 displays a nuclear localization whereas GNL1 is present in the cytosol.
This work involves the use of tissue microproteomics to characterize the local proteome in three regions (necrotic/fibrotic tumor, tumor and benign region) of human ovarian cancer. These regions were analyzed by MALDI-MSI and discerned by spatial segmentation analysis (Alexandrov et al., 2011, Bonnel et al., 2011, Bruand et al., 2011), and the proteins were microextracted utilizing LMJ and PAM approaches (Franck et al., 2013, Quanico et al., 2014, Quanico et al., 2013, Wisztorski et al., 2016). A total of 237 gene products within the three regions were identified. 61 proteins were specific to the tumor region, 44 to the necrotic/fibrotic tumor region, and 48 to the benign region. The extracted protein profiles from the 3 regions are clearly different and subnetwork analysis revealed a possible progression in the nature of the protein pathways involved in the 3 regions. These results suggest a mechanism in cancer progression from benign to tumor and necrotic/fibrotic tumor regions by a progressive switch in the cell phenotype because we detected proteins common to these regions e.g. SSB, NPM1, YBX1, DDX17, JPT2 (HN1L) or PRH1, HMGB1, GYS1, GAGE2B, CFAP44. Utilizing a systems biology approach, pathways implicated in muscle proliferation, cell differentiation, actin, cytoskeleton disorganization, apoptosis, neoplasia, and necrosis with Rho kinase activation are enriched and are likely to be involved in the switch in cell phenotype. In addition, T cell response is observed to be inhibited, leading to a tolerant immune response towards the tumor. These results are consistent with spatial segmentation analysis showing that the tumor and necrotic-fibrotic tumor regions had a close histological molecular profile distinct from that of benign regions (see cluster tree, Fig. 1a).
Tissue top-down microproteomics gives insight on the tumor microenvironment with the identification of proteins involved in cancer processes, diagnosis and/or progression (Table 1). For example, the C-terminal fragment (aa425–483) of Cytokeratin-8 (KRT8) has been detected in our experiments in the necrotic/fibrotic tumor and tumor regions (Fig. 1d). KRT8 was previously referenced as a potential biomarker for ovarian cancer (Wang et al., 2012). We demonstrate here that in cancer regions, it is not the complete protein that is present but a C-terminal fragment of 58 amino acid residues. We previously obtained similar results for the C-terminal fragment of the immunoproteasome 11S, PA28 or Reg alpha, a marker for Grade III-IV serous ovarian cancer (Lemaire et al., 2007), as well as for Grade I and tumor relapse (Longuespée et al., 2012). Similarly, a fragment (aa55–72) of Cytokeratin-7 (KRT7) was detected in the tumor region. KRT7 is already a marker for ovarian adenocarcinoma (Chu et al., 2000, Waldemarson et al., 2012), but here we demonstrate that, in fact, the fragment composed of 17 amino acid residues is potentially the actual marker in ovarian tumor. KRT8 and 7 were also reported to be highly expressed in ovarian cancer cell lines (Chu et al., 2000). Protein S100-A11 was detected in the tumor and necrotic/fibrotic tumor regions and was also observed as being particularly highly expressed in ovarian cancer (Liu et al., 2015). The pro-inflammatory cytokine Macrophage migration inhibitory factor (MIF) was detected in the necrotic/fibrotic tumor and tumor regions (Fig. 3d). MIF is already a potential biomarker for ovarian cancer and is associated with tumor growth, metastasis and poor prognosis (Simpson et al., 2012). 
This protein is also a serum biomarker that distinguishes benign from malignant ovarian tumors in combination with other biomarkers (Agarwal et al., 2007, Krockenberger et al., 2008), and is associated with loss of p53 suppressor activity (Hudson et al., 1999), inhibiting apoptosis and DNA damage repair. Several other proteins already linked to cancer were also identified, including nitrilase-1(Nit1), melanoma antigen family D 2 (MAGED2), Zyxin (ZYX), and ATX1 antioxidant protein 1 homolog (ATOX1). Nit1 is a negative regulator in primary T cells and is classified as a tumor suppressor in association with the fragile histidine-triad protein Fhit (Semba et al., 2006) over-produced in non-small cell lung cancer (NSCLC) and may be a therapeutic target in ovarian cancer (Croce et al., 1999). MAGED2 is also over-expressed in NSCLC (Sienel et al., 2004). Zyxin, a Smad3-mediated TGF-β1 signaling target, regulates cancer cell motility and epithelial-mesenchymal transition during lung cancer development and progression (Beaino et al., 2014, Mise et al., 2012). Interestingly, some proteins identified in the present work have not yet been identified by the Cancer Network Galaxy (TCNG) e.g. Dolichyl-diphosphooligosaccharide-protein glycosyltransferase subunit 4 (OST4), Signal recognition particle receptor subunit alpha (SRPRA), and U6 snRNA-associated Sm-like protein LSm8 (LSM8).
In addition to reference proteins, we identified altprots by top-down microproteomics. 6 altprots were detected in the benign region, 5 in the necrotic/fibrotic tumor region, and 4 in the tumor region. None of these 15 altprots were previously identified. Genes coding for these altprots are annotated as genes coding for receptors (TLR5, LTB4R, AGAP1), enzymes (CMBL, SERPINE1, MTHFR, CSNK1A1L, TMP1), or cytoplasmic or nuclear proteins (Apol6, GRAMD4, GNL1, PKHD1L1). AltLARS2-AS1 and AltRP11-576E20.1 are expressed from genes annotated as non-coding genes, and thus should be re-annotated. We focused our interest on AltProts detected in the cancer region, specifically AltGNL1. Indeed, the reference GNL1 protein was previously detected in ovarian cancer according to the Protein Atlas. None of the other 13 reference proteins were previously identified in proteomic or genomic large-scale studies on ovarian cancer. We validated the co-expression of the reference GNL1 protein with its AltProt AltGNL1 (Fig. 5b–c). Our results clearly demonstrate that both proteins are co-expressed from a single mRNA expressed from a cDNA construct. Immunofluorescence experiments showed that AltGNL1 displays nuclear localization whereas GNL1 is present in the cytosol (Fig. 5c). Our results confirm the presence of a hidden proteome which can constitute a reservoir of potential biomarkers and therapeutic targets.
Taken together, our results show that top-down microproteomics coupled with MALDI MSI can be used to detect proteins expressed from altORFs. These proteins can be used as putative diagnostic biomarkers that may have been missed in conventional proteomics approaches utilizing reference protein databases only. Our approach will be useful to determine the function of altprots in health and disease.
The following are the supplementary data related to this article.
FDR estimation via interrogation of scrambled databases. Area of densities of highest negative logarithm of P-score distributions gave P-score cutoffs for absolute and biomarker search modes at 5% FDR. Results were generated using “reference” (a) and concatenated “reference and AltORFs” protein databases (b). P-score cutoffs were reported on density plots.
List of identified proteins by top-down microproteomics using LMJ and PAM approaches.
Supported by grants from Région Nord Pas-de-Calais (CM/YB N°2015.2097/12) and PROTEO (FRQNT-RS-188158) (V. Delcourt), University Lille 1 (BQR to Dr. Julien Franck, 2015), Canadian Institutes for Health Research (MOP-136962) and Canada Research Chairs in Functional Proteomics and Discovery of New Protein (Prof. X. Roucou), PRISM (Prof. M. Salzet), Ministère de l'Enseignement Supérieur et de la Recherche via Institut Universitaire de France (ESRS 0900500E) (Prof. I. Fournier), SIRIC ONCOLille (Prof. I. Fournier), and Grant INCa-DGOS-Inserm 6041.
The authors declare no competing financial interests in this work.
MS, IF and JF designed the study.
VD, JF, MW and JQ performed all experiments.
JPG and JFJ provided technical support.
FK performed the subnetwork analyses.
XR provided the human altORF database.
EL, FN and YMR are the clinicians who provided the biopsies and performed the pathological analyses and the correlation with clinical data.
IF, MS, VD and JF wrote the manuscript.
IF, MS, XR provided funding for this work.
All authors have read and corrected the manuscript.