Proteomic techniques allow researchers to perform detailed analyses of cellular states and many studies are published each year, which highlight large numbers of proteins quantified in different samples. However, currently few data sets make it into public databases with sufficient metadata to allow other groups to verify findings, perform data mining or integrate different data sets. The Proteomics Standards Initiative has released a series of "Minimum Information About a Proteomics Experiment" guideline documents (MIAPE modules) and accompanying data exchange formats. This article focuses on proteomic studies based on gel electrophoresis and demonstrates how the corresponding MIAPE modules can be fulfilled and data deposited in public databases, using a new experimental data set as an example.
We have performed a study of the effects of an anabolic agent (salbutamol) at two different time points on the protein complement of rat skeletal muscle cells, quantified by difference gel electrophoresis (DIGE). In the DIGE study, a total of 31 non-redundant proteins were identified as being potentially modulated at 24 h post-treatment and 110 non-redundant proteins at 96 h post-treatment. Several categories of function have been highlighted as strongly enriched, providing candidate proteins for further study. We also use the study as an example of best practice for data deposition.
We have deposited all data sets from this study in public databases for further analysis by the community. We also describe more generally how gel-based protein identification data sets can now be deposited in the PRoteomics IDEntifications database (PRIDE), using a new software tool, the PRIDESpotMapper, which we developed to work in conjunction with the PRIDE Converter application. We also demonstrate how the ProteoRed MIAPE generator tool can be used to create and share a complete and compliant set of MIAPE reports for this experiment and others.
The global analysis of proteins is now feasible due to improvements in techniques such as two-dimensional gel electrophoresis (2-DE), mass spectrometry, yeast two-hybrid systems and the development of bioinformatics applications. The experiments form the basis of proteomics, and present significant challenges in data analysis, storage and querying. We argue that a standard format for proteome data is required to enable the storage, exchange and subsequent re-analysis of large datasets. We describe the criteria that must be met for the development of a standard for proteomics. We have developed a model to represent data from 2-DE experiments, including difference gel electrophoresis along with image analysis and statistical analysis across multiple gels. This part of proteomics analysis is not represented in current proposals for proteomics standards. We are working with the Proteomics Standards Initiative to develop a model encompassing biological sample origin, experimental protocols, a number of separation techniques and mass spectrometry. The standard format will facilitate the development of central repositories of data, enabling results to be verified or re-analysed, and the correlation of results produced by different research groups using a variety of laboratory techniques.
Many proteomics initiatives require integration of all information under uniform criteria, from sample collection and data display to publication of experimental results. Integrating and exchanging data of different formats and structures poses a great challenge. XML technology shows promise for this task because of its simplicity and flexibility. Nasopharyngeal carcinoma (NPC) is one of the most common cancers in southern China and Southeast Asia, with marked geographic and racial differences in incidence. Although some cancer proteome databases now exist, there is still no NPC proteome database.
The raw NPC proteome experiment data were captured into one XML document with Human Proteome Markup Language (HUP-ML) editor and imported into native XML database Xindice. The 2D/MS repository of NPC proteome was constructed with Apache, PHP and Xindice to provide access to the database via Internet. On our website, two methods, keyword query and click query, were provided at the same time to access the entries of the NPC proteome database.
Our 2D/MS repository can be used to share the raw NPC proteomics data that are generated from gel-based proteomics experiments. The database, as well as the PHP source code for constructing users' own proteome repositories, can be accessed at .
Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to them very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but it focused purely on protein-protein interactions.
The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format MITAB2.5 has been developed for the benefit of users who require only minimal information in an easy to access configuration.
The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sector, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.
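As an illustration of why a tab-delimited layout is easy to parse, the following minimal sketch reads one MITAB2.5 row in Python. It assumes the 15-column MITAB2.5 layout, with "|" separating multiple values in a column and "-" marking an empty column. The column names in the code are our own shorthand, and the example row is made up for illustration, not taken from any database.

```python
# Shorthand names (our own) for the 15 tab-separated MITAB2.5 columns.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_ids_a", "alt_ids_b", "aliases_a", "aliases_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab25_line(line):
    """Split one MITAB2.5 row into a dict of column name -> list of values."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(MITAB25_COLUMNS):
        raise ValueError(
            f"expected {len(MITAB25_COLUMNS)} columns, got {len(fields)}"
        )
    record = {}
    for name, raw in zip(MITAB25_COLUMNS, fields):
        # "-" marks an empty column; "|" separates multiple values.
        record[name] = [] if raw == "-" else raw.split("|")
    return record


# A made-up row, for illustration only (identifiers are placeholders).
example_row = "\t".join([
    "uniprotkb:P12345", "uniprotkb:Q67890", "-", "-",
    "uniprotkb:GENE_A|uniprotkb:ALIAS_A", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2011)", "pubmed:00000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0471"(mint)',
    "mint:MINT-0000000", "-",
])

record = parse_mitab25_line(example_row)
print(record["id_a"], record["id_b"], record["detection_method"])
```

Each value keeps its "database:identifier(description)" form; splitting that further is left out to keep the sketch short.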
Proteomics is rapidly evolving into a high-throughput technology, in which substantial and systematic studies are conducted on samples from a wide range of physiological, developmental, or pathological conditions. Reference maps from 2D gels are widely circulated. However, there is, as yet, no formally accepted standard representation to support the sharing of proteomics data, and little systematic dissemination of comprehensive proteomic data sets.
This paper describes the design, implementation and use of a Proteome Experimental Data Repository (PEDRo), which makes comprehensive proteomics data sets available for browsing, searching and downloading. It also serves to extend the debate on the level of detail at which proteomics data should be captured, the sorts of facilities that should be provided by proteome data management systems, and the techniques by which such facilities can be made available.
The PEDRo database provides access to a collection of comprehensive descriptions of experimental data sets in proteomics. Not only are these data sets interesting in and of themselves, they also provide a useful early validation of the PEDRo data model, which has served as a starting point for the ongoing standardisation activity through the Proteomics Standards Initiative of the Human Proteome Organisation.
The ProteoRed multicentric experiment was designed to test each laboratory's ability to perform quantitative proteomic analysis, to compare methodologies and inter-lab reproducibility for relative quantitative analysis of proteomes, and to evaluate data reporting and data sharing tools (MIAPE documents, standard formats, public repositories). The experiment consists of analyzing two different samples (A and B), which contain an identical matrix of E. coli proteins plus four standard proteins (CYC_HORSE, MYG_HORSE, ALDOA_RABIT, BSA_BOVIN) spiked in different amounts. The samples are designed primarily to be analyzed by LC-MS, although DIGE analysis is also possible. Each lab may choose its preferred method for quantitative comparison of the two samples. However, to achieve as much standardization and reproducibility as possible in data analysis, data sharing, protocol information, results and reporting, we propose the OmicsHub Proteomics server as the single platform to integrate the protein identification steps of the MS multicentric experiment and to serve as a web repository. After the “in-lab” analysis is performed with the laboratory's own tools, every lab can load its experiment (protocols, parameters, instruments, etc.) and import its spectrum data via the web into the OmicsHub Proteomics analysis and management server. Every experiment in OmicsHub is automatically stored following the PRIDE standard format. The OmicsHub Proteomics software tool performs the workflow tasks of protein identification (using the search engines Mascot and Phenyx), protein annotation, protein grouping, FDR filtering (allowing the use of a local decoy database, designed ad hoc for this experiment) and reporting of the protein identification results in a systematic and centralized manner.
OmicsHub Proteomics allows the researchers of the ProteoRed consortium to perform their multicentric study with full reproducibility, standardization and experiment comparison, reducing time and data management complexity prior to the final quantification analysis.
The Molecular INTeraction Database (MINT, http://mint.bio.uniroma2.it/mint/) is a public repository for protein–protein interactions (PPI) reported in peer-reviewed journals. The database has grown steadily over the years and, as of September 2011, contains approximately 235 000 binary interactions captured from over 4750 publications. The web interface allows users to search, visualize and download interaction data. MINT is a member of the International Molecular Exchange consortium (IMEx) and adopts the Molecular Interaction Ontology of the Proteomics Standards Initiative (PSI-MI) standards for curation and data exchange. MINT data are freely accessible and downloadable at http://mint.bio.uniroma2.it/mint/download.do. We report here the growth of the database, the major changes in curation policy and a new algorithm to assign a confidence score to each interaction.
In the post-genome era, most research scientists working in the field of proteomics are confronted with difficulties in management of large volumes of data, which they are required to keep in formats suitable for subsequent data mining. Therefore, a well-developed open source laboratory information management system (LIMS) should be available for their proteomics research studies.
We developed an open source LIMS appropriately customized for 2-D gel electrophoresis-based proteomics workflow. The main features of its design are compactness, flexibility and connectivity to public databases. It supports the handling of data imported from mass spectrometry software and 2-D gel image analysis software. The LIMS is equipped with the same input interface for 2-D gel information as a clickable map on public 2DPAGE databases. The LIMS allows researchers to follow their own experimental procedures by reviewing the illustrations of 2-D gel maps and well layouts on the digestion plates and MS sample plates.
Our new open source LIMS is now available as a basic model for proteome informatics, and is accessible for further improvement. We hope that many research scientists working in the field of proteomics will evaluate our LIMS and suggest ways in which it can be improved.
The Proteomics Standards Initiative (PSI) aims to define community standards for data representation in proteomics and to facilitate data comparison, exchange and verification. Rapid progress has been made in the development of common standards for data exchange in the fields of both mass spectrometry and protein–protein interactions since the first PSI meeting. Both hardware and software manufacturers have agreed to work to ensure that a proteomics-specific extension is created for the emerging ASTM mass spectrometry standard and the data model for a proteomics experiment has advanced significantly. The Protein–Protein Interactions (PPI) group expects to publish the Level 1 PSI data exchange format for protein–protein interactions by early summer this year, and discussion as to the additional content of Level 2 has been initiated.
Controlled vocabularies (CVs), i.e. a collection of predefined terms describing a modeling domain, used for the semantic annotation of data, and ontologies are used in structured data formats and databases to avoid inconsistencies in annotation, to have a unique (and preferably short) accession number and to give researchers and computer algorithms the possibility for more expressive semantic annotation of data. The Human Proteome Organization (HUPO)–Proteomics Standards Initiative (PSI) makes extensive use of ontologies/CVs in their data formats. The PSI-Mass Spectrometry (MS) CV contains all the terms used in the PSI MS–related data standards. The CV contains a logical hierarchical structure to ensure ease of maintenance and the development of software that makes use of complex semantics. The CV contains terms required for a complete description of an MS analysis pipeline used in proteomics, including sample labeling, digestion enzymes, instrumentation parts and parameters, software used for identification and quantification of peptides/proteins and the parameters and scores used to determine their significance. Owing to the range of topics covered by the CV, collaborative development across several PSI working groups, including proteomics research groups, instrument manufacturers and software vendors, was necessary. In this article, we describe the overall structure of the CV, the process by which it has been developed and is maintained and the dependencies on other ontologies.
Database URL: http://psidev.cvs.sourceforge.net/viewvc/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo
The Human Proteome Organization (HUPO) Proteomics Standards Initiative has been tasked with developing file formats for storing raw data (mzML) and the results of spectral processing (protein identification and quantification) from proteomics experiments (mzIdentML). In order to fully characterize complex experiments, special data types have been designed. Standardized file formats will promote visualization, validation and dissemination of data independent of the vendor-specific binary data storage files. Innovative programmatic solutions for robust and efficient data access to standardized file formats will contribute to more rapid wide-scale acceptance of these file formats by the proteomics community.
In this work, we compare algorithms for accessing spectral data in the mzML file format. Because mzML is an XML format, its data structures can be parsed efficiently using XML-specific class types. These classes provide only sequential access to files. However, random access to spectral data is needed in many algorithmic applications for processing proteomics datasets. Here, we demonstrate the implementation of memory streams to convert sequential access into random access. Our application preserves the elegant XML parsing capabilities. Benchmarking file access times in sequential and random access modes shows that random access is more time efficient for a small number of spectra, whereas sequential access becomes more efficient when retrieving a large number of spectra. We also provide comparisons to other file access methods from academia and industry.
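The general idea of turning sequential XML access into random access through a memory stream can be sketched as follows. This is not the authors' implementation, which the abstract does not detail. It is a minimal Python illustration that assumes a simplified, namespace-free toy document, and the function names build_offset_index and read_spectrum are hypothetical. One pass records the byte offset of every spectrum element; afterwards any spectrum can be reached with a seek and parsed on its own.

```python
import io
import re
import xml.etree.ElementTree as ET

# Matches the opening tag of a <spectrum ...> element and captures its id.
SPECTRUM_OPEN = re.compile(rb'<spectrum\s[^>]*\bid="([^"]*)"')
SPECTRUM_CLOSE = b"</spectrum>"


def build_offset_index(data):
    """One sequential pass: map each spectrum id to its byte offset."""
    return {m.group(1).decode(): m.start() for m in SPECTRUM_OPEN.finditer(data)}


def read_spectrum(stream, index, spectrum_id):
    """Random access: seek to the stored offset and parse one element."""
    stream.seek(index[spectrum_id])
    buffer = b""
    while SPECTRUM_CLOSE not in buffer:
        chunk = stream.read(4096)
        if not chunk:
            raise ValueError("unterminated <spectrum> element")
        buffer += chunk
    end = buffer.index(SPECTRUM_CLOSE) + len(SPECTRUM_CLOSE)
    # Parsing the fragment alone works here because the toy document uses
    # no namespace prefixes declared on an enclosing element.
    return ET.fromstring(buffer[:end])


# Toy stand-in for an mzML file (heavily simplified, for illustration only).
TOY_DOC = b"""<?xml version="1.0"?>
<run>
  <spectrumList count="3">
    <spectrum index="0" id="scan=1"><cvParam name="ms level" value="1"/></spectrum>
    <spectrum index="1" id="scan=2"><cvParam name="ms level" value="2"/></spectrum>
    <spectrum index="2" id="scan=3"><cvParam name="ms level" value="2"/></spectrum>
  </spectrumList>
</run>"""

offsets = build_offset_index(TOY_DOC)
memory_stream = io.BytesIO(TOY_DOC)  # the whole file held in memory
third = read_spectrum(memory_stream, offsets, "scan=3")
first = read_spectrum(memory_stream, offsets, "scan=1")
print(third.get("index"), first.get("index"))
```

The trade-off reported in the abstract shows up in this design: building the index costs one full pass, which pays off when only a few spectra are fetched, while reading every spectrum in order gains nothing from the seeks.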
mzML; XML; Sequential file access; Random file access; Proteomics datasets
The advancements of proteomics technologies have led to a rapid increase in the number, size and rate at which datasets are generated. Managing and extracting valuable information from such datasets requires the use of data management platforms and computational approaches.
We have developed the MAss SPECTRometry Analysis System (MASPECTRAS), a platform for management and analysis of proteomics LC-MS/MS data. MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo) relational database schema and follows the guidelines of the Proteomics Standards Initiative (PSI). Analysis modules include: 1) import and parsing of the results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA; 2) peptide validation; 3) clustering of proteins based on Markov Clustering and multiple alignments; and 4) quantification using the Automated Statistical Analysis of Protein Abundance Ratios algorithm (ASAPRatio). The system provides customizable data retrieval and visualization tools, as well as export to the PRoteomics IDEntifications public repository (PRIDE). MASPECTRAS is freely available at
Given its unique features and the flexibility due to the use of standard software technology, our platform represents a significant advance and could be of great interest to the proteomics community.
We report the release of mzIdentML, an exchange standard for peptide and protein identification data, designed by the Proteomics Standards Initiative. The format was developed by the Proteomics Standards Initiative in collaboration with instrument and software vendors, and the developers of the major open-source projects in proteomics. Software implementations have been developed to enable conversion from most popular proprietary and open-source formats, and mzIdentML will soon be supported by the major public repositories. These developments enable proteomics scientists to start working with the standard for exchanging and publishing data sets in support of publications and they provide a stable platform for bioinformatics groups and commercial software vendors to work with a single file format for identification data.
We present 2DDB, a bioinformatics solution for storage, integration and analysis of quantitative proteomics data. As data complexity and the rate at which data are produced increase in the proteomics field, the need for flexible analysis software increases.
2DDB is based on a core data model describing fundamentals such as experiment description and identified proteins. The extended data models are built on top of the core data model to capture more specific aspects of the data. A number of public databases and bioinformatical tools have been integrated giving the user access to large amounts of relevant data. A statistical and graphical package, R, is used for statistical and graphical analysis. The current implementation handles quantitative data from 2D gel electrophoresis and multidimensional liquid chromatography/mass spectrometry experiments.
The software has successfully been employed in a number of projects ranging from quantitative liquid-chromatography-mass spectrometry based analysis of transforming growth factor-beta stimulated fibroblasts to 2D gel electrophoresis/mass spectrometry analysis of biopsies from human cervix. The software is available for download at SourceForge.
Many proteomics initiatives require a seamless bioinformatics integration of a range of analytical steps between sample collection and systems modeling that is immediately accessible to the participants involved in the process. Proteomics profiling, from 2D gel electrophoresis to the putative identification of differentially expressed proteins by comparison of mass spectrometry results with reference databases, involves many components, of sample processing and not just of analysis and interpretation, that are regularly revisited and updated. To support such updates and the dissemination of data, a suitable data structure is needed. However, no data structure is currently available for storing data for multiple gels generated through a single proteomics experiment in a single XML file. This paper proposes a data structure based on XML standards to fill the void that exists between the data generated by proteomics experiments and the storage of those data.
In order to address the resulting procedural fluidity we have adopted and implemented a data model centered on the concept of annotated gel (AG) as the format for delivery and management of 2D Gel electrophoresis results. An eXtensible Markup Language (XML) schema is proposed to manage, analyze and disseminate annotated 2D Gel electrophoresis results. The structure of AG objects is formally represented using XML, resulting in the definition of the AGML syntax presented here.
The proposed schema accommodates data on the electrophoresis results as well as the mass-spectrometry analysis of selected gel spots. A web-based software library is being developed to handle data storage, analysis and graphic representation. Computational tools described will be made available at . Our development of AGML provides a simple data structure for storing 2D gel electrophoresis data.
Although two-dimensional gel electrophoresis (2-DE) is an effective and widely used method to screen the proteome, its data standardization has still not matured to the level of microarray genomics data or mass spectrometry approaches. The trend toward identifying encompassing data standards has been expanding from genomics to transcriptomics, and more recently to proteomics. The relative success of genomic and transcriptomic data standardization has enabled the development of central repositories such as GenBank and Gene Expression Omnibus. An equivalent 2-DE-centric data structure would similarly have to strike a balance among raw data, basic feature detection results, sufficiency in the description of the experimental context and methods, and an overall structure that facilitates a diversity of usages, from central repositories to local data representation in LIMS.
Results & Conclusion
Achieving such a balance can only be accomplished through several iterations involving bioinformaticians, bench molecular biologists, and the manufacturers of the equipment and commercial software from which the data is primarily generated. Such an encompassing data structure is described here, developed as the mature successor to the well established and broadly used earlier version. A public repository, AGML Central, is configured with a suite of tools for the conversion from a variety of popular formats, web-based visualization, and interoperation with other tools and repositories, and is particularly mass-spectrometry oriented with I/O for annotation and data analysis.
Differential proteome studies are a powerful tool for the analysis of differences between two sample states. A challenge encountered in any proteome study is the reproducibility of the sample preparation and data analysis. The significance analysis of the results and the extent to which changes can reliably be detected are affected by this.
We studied the changes of the proteome during cell differentiation using a combination of large format 2D gel electrophoresis, image analysis, and mass spectrometry.
The basis for any analysis is the reproducibility of the results and the study design. Firstly, the reproducibility of large-format 2D gel electrophoresis was shown. Two samples from the same patient were analyzed using three replicate gels each. The spot quantitation of the two samples was found to be in good agreement. The mean relative standard deviation (coefficient of variation) of the spot intensities within the replicate gels was 20%. This allows us to analyze changes in the protein spot intensity that are smaller than a factor of two. The study design was optimized in order to account for technical and biological variation.
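The coefficient of variation quoted above is the sample standard deviation of a spot's intensity across replicate gels divided by its mean. The sketch below computes it per spot and averages over spots. The spot names and intensities are invented for illustration, and whether the original study averaged per-spot values in exactly this way is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of the replicates."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return stdev(values) / m


def mean_cv(spot_table):
    """Average the per-spot CV over all spots in the table."""
    return mean(coefficient_of_variation(v) for v in spot_table.values())


# Invented intensities for two spots on three replicate gels.
replicates = {
    "spot_1": [90.0, 100.0, 110.0],
    "spot_2": [180.0, 200.0, 220.0],
}

for spot, intensities in replicates.items():
    print(spot, f"{coefficient_of_variation(intensities):.1%}")
print("mean CV", f"{mean_cv(replicates):.1%}")
```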
In the main study, 1800–2000 spots were quantified per gel. The large patient heterogeneity did not allow us to use a strict fold-change criterion for the selection of significantly changed spots between the two sample states. The variation of the spot intensity in one patient group was very much dependent on the nature of each individual protein. Therefore, Student’s t-test was employed to calculate the statistical significance for each spot. A total of 31 protein spots were found to be changed upon differentiation. Of these, 17 spots were unique for one of the samples, and another 14 spots were found to be highly significant (P = 99.9%). The effect of the Bonferroni correction and the false discovery rate is evaluated.
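To make the difference between the two multiple-testing approaches concrete, here is a minimal sketch that applies the Bonferroni correction and the Benjamini-Hochberg false discovery rate procedure to a list of per-spot p-values. The abstract does not say which FDR procedure was used, so Benjamini-Hochberg is an assumption, and the p-values and the significance level of 0.05 are made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected


# Made-up p-values for five spots.
spot_p_values = [0.001, 0.012, 0.025, 0.039, 0.6]
print("Bonferroni:", bonferroni(spot_p_values))
print("Benjamini-Hochberg:", benjamini_hochberg(spot_p_values))
```

With these made-up values, Bonferroni keeps only the first spot while Benjamini-Hochberg keeps the first four, which illustrates why the choice of correction matters when roughly 2000 spots are tested per gel.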
We have developed the Yale Protein Expression Database (YPED) to address the storage, retrieval, and integrated analysis of proteomics data generated by Yale's Keck Protein Chemistry and Mass Spectrometry Facility. YPED is Web-accessible and currently handles sample requisition, result reporting and sample comparison for ICAT, DIGE and MUDPIT samples. Sample descriptions are compatible with the evolving MIAPE standards. Peptides and proteins identified using Sequest or Mascot are validated with the Trans-Proteomic Pipeline developed at the Institute for Systems Biology and data from the resulting XML file are stored in the database. Researchers can view, subset and download their data through a secure Web interface.
Technological advances in mass spectrometry and other detection methods are leading to larger and larger proteomics datasets. However, when papers describing such information are published the enormous volume of data can typically only be provided as supplementary data in a tabular form through the journal website. Several journals in the proteomics field, together with the Human Proteome Organization's (HUPO) Proteomics Standards Initiative and institutions such as the Institute for Systems Biology are working towards standardizing the reporting of proteomics data, but just defining standards is only a means towards an end for sharing data. Data repositories such as ProteomeCommons.org and the Open Proteomics Database allow for public access to proteomics data but provide little, if any, interpretation.
Results & conclusion
Here we describe PrestOMIC, an open source application for storing mass spectrometry-based proteomic data in a relational database and for providing a user-friendly, searchable and customizable browser interface to share one's data with the scientific community. The underlying database and all associated applications are built on other existing open source tools, allowing PrestOMIC to be modified as the data standards evolve. We then use PrestOMIC to present a recently published dataset from our group through our website.
Two-dimensional electrophoresis is an established method used to study differences in protein expression caused by, for example, a disease state or drug treatment. Conventional methods require the separation of one sample on each individual gel. This approach exposes the data to a high level of system variation, such as gel-to-gel variation caused by experimental factors. Quantification of protein differences can thus be uncertain and lead to false biological conclusions. Two alternatives to reduce this variation are (1) increase the number of replicates or (2) use the Ettan difference gel electrophoresis (DIGE) system. The Ettan DIGE system is a well-established technology in proteomics, which uses an internal standard for between-gel normalization. By pre-labeling samples prior to 2D electrophoresis with three spectrally resolvable CyDye DIGE Fluor dyes, electrophoretic co-migration of three protein samples on the same 2D gel is possible. This approach significantly reduces the number of replicate gels needed to ensure reproducibility and reliability of the differential expression analysis.
Here, we present results from a differential expression analysis experiment performed with Escherichia coli samples grown under different conditions. We demonstrate that by using DIGE and the DeCyder 2D co-detection algorithm, the numbers of replicates are significantly reduced and the system variability is minimized compared to conventional electrophoresis with post-stained gels.
Despite the growing volumes of proteomic data, integration of the underlying results remains problematic owing to differences in formats, data captured, protein accessions and services available from the individual repositories. To address this, we present the ISPIDER Central Proteomic Database search (http://www.ispider.manchester.ac.uk/cgi-bin/ProteomicSearch.pl), an integration service offering novel search capabilities over leading, mature, proteomic repositories including PRoteomics IDEntifications database (PRIDE), PepSeeker, PeptideAtlas and the Global Proteome Machine. It enables users to search for proteins and peptides that have been characterised in mass spectrometry-based proteomics experiments from different groups, stored in different databases, and view the collated results with specialist viewers/clients. In order to overcome limitations imposed by the great variability in protein accessions used by individual laboratories, the European Bioinformatics Institute's Protein Identifier Cross-Reference (PICR) service is used to resolve accessions from different sequence repositories. Custom-built clients allow users to view peptide/protein identifications in different contexts from multiple experiments and repositories, as well as integration with the Dasty2 client supporting any annotations available from Distributed Annotation System servers. Further information on the protein hits may also be added via external web services able to take a protein as input. This web server offers the first truly integrated access to proteomics repositories and provides a unique service to biologists interested in mass spectrometry-based proteomics.
The number of microarray and other high-throughput experiments on primary repositories keeps increasing as do the size and complexity of the results in response to biomedical investigations. Initiatives have been started on standardization of content, object model, exchange format and ontology. However, there are backlogs and inability to exchange data between microarray repositories, which indicate that there is a great need for a standard format and data management.
We have introduced a metadata framework that includes a metadata card and semantic nets that make experimental results visible, understandable and usable. These are encoded in syntax encoding schemes and represented in RDF (Resource Description Framework), can be integrated with other metadata cards and semantic nets, and can be exchanged, shared and queried. We demonstrated the performance and potential benefits through a case study on a selected microarray repository. We concluded that the backlogs can be reduced and that exchange of information and asking of knowledge discovery questions can become possible with the use of this metadata framework.
Knowledge discovery; Metadata card; Metadata registry; Microarray; Semantic net
PRIDE, the ‘PRoteomics IDEntifications database’, is a database of protein and peptide identifications that have been described in the scientific literature. These identifications will typically be from specific species, tissues and sub-cellular locations, perhaps under specific disease conditions. Any post-translational modifications that have been identified on individual peptides can be described. These identifications may be annotated with supporting mass spectra. At the time of writing, PRIDE includes the full set of identifications as submitted by individual laboratories participating in the HUPO Plasma Proteome Project and a profile of the human platelet proteome submitted by the University of Ghent in Belgium. By late 2005 PRIDE is expected to contain the identifications and spectra generated by the HUPO Brain Proteome Project. Proteomics laboratories are encouraged to submit their identifications and spectra to PRIDE to support their manuscript submissions to proteomics journals. Data can be submitted in PRIDE XML format if identifications are included or mzData format if the submitter is depositing mass spectra without identifications. PRIDE is a web application, so submission, searching and data retrieval can all be performed using an internet browser. PRIDE can be searched by experiment accession number, protein accession number, literature reference and sample parameters including species, tissue, sub-cellular location and disease state. Data can be retrieved as machine-readable PRIDE or mzData XML (the latter for mass spectra without identifications), or as human-readable HTML.
We here present the jmzReader library: a collection of Java application programming interfaces (APIs) to parse the most commonly used peak list and XML-based mass spectrometry (MS) data formats: DTA, MS2, MGF, PKL, mzXML, mzData, and mzML (based on the already existing API jmzML). The library is optimized to be used in conjunction with mzIdentML, the recently released standard data format for reporting protein and peptide identifications, developed by the HUPO proteomics standards initiative (PSI). mzIdentML files do not contain spectra data but contain references to different kinds of external MS data files. As a key functionality, all parsers implement a common interface that supports the various methods used by mzIdentML to reference external spectra. Thus, when developing software for mzIdentML, programmers no longer have to support multiple MS data file formats but only this one interface. The library (which includes a viewer) is open source and, together with detailed documentation, can be downloaded from http://code.google.com/p/jmzreader/.
Bioinformatics; Data standard; Java; MS data processing; Proteomics standards initiative
The field of organellar proteomics has emerged as an attempt to minimize the complexity of the proteomics data obtained from whole cell and tissue extracts while maximizing the resolution on the protein composition of a single subcellular compartment. Standard methods involve lengthy density-based gradient and/or immunoaffinity purification steps followed by extraction, one-dimensional or two-dimensional gel electrophoresis, gel staining, in-gel tryptic digestion and protein identification by mass spectrometry. In this paper, we present an alternate approach to purify subcellular organelles containing a fluorescent reporter molecule. The gel-free procedure involves fluorescence-assisted sorting of the secretory granules followed by gentle extraction in a buffer compatible with tryptic digestion and mass spectrometry. Once the subcellular organelle is labeled, this procedure can be done in a single day, requires no major modification to any instrumentation and can be readily adapted to the study of other organelles. When applied to corticotrope secretory granules, it led to a much enriched granular fraction from which numerous proteins could be identified through mass spectrometry.
Corticotropes; FACS; FAOS; Protein secretion; Secretory granules