Complete reference maps or datasets, like the genomic map of an organism, are highly beneficial tools for biological and biomedical research. Attempts to generate such reference datasets for a proteome have so far failed to reach complete proteome coverage, with saturation apparent at approximately two thirds of the proteomes tested, even for the most thoroughly characterized proteomes. Here, we used a strategy based on high-throughput peptide synthesis and mass spectrometry to generate a close to complete reference map (97% of the genome-predicted proteins) of the S. cerevisiae proteome. We generated two versions of this mass spectrometric map, one supporting discovery-driven (shotgun) and the other hypothesis-driven (targeted) proteomic measurements. The two versions of the map, therefore, constitute a complete set of proteomic assays to support most studies performed with contemporary proteomic technologies. The reference libraries can be browsed via a web-based repository and associated navigation tools. To demonstrate the utility of the reference libraries, we applied them to a protein quantitative trait locus (pQTL) analysis, which requires measurement of the same peptides over a large number of samples with high precision. Protein measurements over a set of 78 S. cerevisiae strains revealed a complex relationship between independent genetic loci, impacting on the levels of related proteins. Our results suggest that selective pressure favors the acquisition of sets of polymorphisms that maintain the stoichiometry of protein complexes and pathways.
S. cerevisiae; selected reaction monitoring; SRM; MRM; spectral library; peptide library; mass spectrometric map; protein QTL
Bulk degradation of cytoplasmic material is mediated by a highly conserved intracellular trafficking pathway termed autophagy. This pathway is characterized by the formation of double-membrane vesicles termed autophagosomes engulfing the substrate and transporting it to the vacuole/lysosome for breakdown and recycling. The Atg1/ULK1 kinase is essential for this process; however, little is known about its targets and the means by which it controls autophagy. Here we have screened for Atg1 kinase substrates using consensus peptide arrays and identified three components of the autophagy machinery. The multimembrane-spanning protein Atg9 is a direct target of this kinase essential for autophagy. Phosphorylated Atg9 is then required for the efficient recruitment of Atg8 and Atg18 to the site of autophagosome formation and subsequent expansion of the isolation membrane, a prerequisite for a functioning autophagy pathway. These findings show that the Atg1 kinase acts early in autophagy by regulating the outgrowth of autophagosomal membranes.
•The Atg1 kinase phosphorylation consensus was identified on peptide arrays
•Atg9 is a direct target of the Atg1/ULK1 kinase in vitro and in vivo
•Atg9 phosphorylation recruits Atg18 and Atg8 to the PAS
•Atg9 phosphorylation is required for isolation membrane expansion/autophagy function
Autophagy function is pivotal to cell health. Papinski et al. identify the phosphorylation consensus of the central kinase in this pathway, Atg1. The autophagy-related protein Atg9 is a direct target of Atg1. Atg9 phosphorylation by Atg1 is required for autophagosome formation. This finding sheds light on how Atg1 controls autophagy.
Pyrococcus furiosus (Pfu) is an excellent organism to generate reference samples for proteomics labs because of its moderately sized genome and very little sequence duplication within the genome. We demonstrated a stable and consistent method to prepare proteins in bulk that eliminates growth and preparation as a source of uncertainty in the standard. We performed several proteomic studies in different laboratories using each laboratory's specific workflow as well as separate and integrated data analysis. This study demonstrated that a Pfu whole cell lysate provides suitable protein sample complexity not only to validate proteomic methods and workflows and to benchmark new instruments, but also to facilitate comparison of experimental data generated over time and across instruments or labs.
Pyrococcus furiosus (Pfu); Proteomics; Protein Complex Standard; MudPIT; OFFGEL electrophoresis; Directed LC-MS/MS
Affinity purification coupled with mass spectrometry (AP-MS) is now a widely used approach for the identification of protein-protein interactions. However, for any given protein of interest, determining which of the identified polypeptides represent bona fide interactors versus those that are background contaminants (e.g. proteins that interact with the solid-phase support, affinity reagent or epitope tag) is a challenging task. While the standard approach is to identify nonspecific interactions using one or more negative controls, most small-scale AP-MS studies do not capture a complete, accurate background protein set. Fortunately, negative controls are largely bait-independent. Hence, aggregating negative controls from multiple AP-MS studies can increase coverage and improve the characterization of background associated with a given experimental protocol. Here we present the Contaminant Repository for Affinity Purification (the CRAPome) and describe the use of this resource to score protein-protein interactions. The repository (currently available for Homo sapiens and Saccharomyces cerevisiae) and computational tools are freely available online at www.crapome.org.
Topoisomerase I (TOP1) inhibitors are an important class of anticancer drugs. The cytotoxicity of TOP1 inhibitors can be modulated by replication fork reversal, in a process that requires PARP activity. Whether regressed forks can efficiently restart and the factors required to restart fork progression after fork reversal are still unknown. Here we combined biochemical and electron microscopy approaches with single-molecule DNA fiber analysis to identify a key role for human RECQ1 helicase in replication fork restart after TOP1 inhibition, not shared by other human RecQ proteins. We show that the poly(ADP-ribosyl)ation activity of PARP1 stabilizes forks in their regressed state by limiting their restart by RECQ1. These studies provide new mechanistic insights into the roles of RECQ1 and PARP in DNA replication and offer molecular perspectives to potentiate chemotherapeutic regimens based on TOP1 inhibition.
Adoption of targeted mass spectrometry (MS) approaches such as multiple reaction monitoring (MRM) to study biological and biomedical questions is well underway in the proteomics community. Successful application depends on the ability to generate reliable assays that uniquely and confidently identify target peptides in a sample. Unfortunately, a wide range of criteria is being applied to declare that an assay has been successfully developed. There is no consensus on what criteria are acceptable and little understanding of the impact of variable criteria on the quality of the results generated. Publications describing targeted MS assays for peptides frequently do not contain sufficient information for readers to establish confidence that the tests work as intended or to be able to apply the tests described in their own labs. Guidance must be developed so that targeted MS assays with established performance can be widely distributed and applied by many labs worldwide. To begin to address the problems and their solutions, a workshop was held at the National Institutes of Health with representatives from the multiple communities developing and employing targeted MS assays. Participants discussed the analytical goals of their experiments and the experimental evidence needed to establish that the assays they develop work as intended and are achieving the required levels of performance. Using this “fit-for-purpose” approach, the group defined three tiers of assays distinguished by their performance and extent of analytical characterization. Computational and statistical tools useful for the analysis of targeted MS results were described. Participants also detailed the information that authors need to provide in their manuscripts to enable reviewers and readers to clearly understand what procedures were performed and to evaluate the reliability of the peptide or protein quantification measurements reported.
This paper presents a summary of the meeting and recommendations.
Application of a kinetic model of miRNA-mediated gene regulation to experimental data sets shows that the timescale of regulation is slower than previously assumed, due to bottlenecks imposed by miRNA turnover in the RNA-induced silencing complex and by slow protein decay.
A mathematical model links the dynamics of miRNA expression and loading into the Argonaute protein to the dynamics of miRNA targets.
Loading of miRNAs into Argonaute and the slow decay of proteins impose two bottlenecks on the speed of miRNA-mediated regulation.
Accelerated miRNA turnover is necessary for regulating target expression on the timescale of a day.
MiRNAs are post-transcriptional regulators that contribute to the establishment and maintenance of gene expression patterns. Although their biogenesis and decay appear to be under complex control, the implications of miRNA expression dynamics for the processes that they regulate are not well understood. We derived a mathematical model of miRNA-mediated gene regulation, inferred its parameters from experimental data sets, and found that the model describes well time-dependent changes in mRNA, protein and ribosome density levels measured upon miRNA transfection and induction. The inferred parameters indicate that the timescale of miRNA-dependent regulation is slower than initially thought. Delays in miRNA loading into Argonaute proteins and the slow decay of proteins relative to mRNAs can explain the typically small changes in protein levels observed upon miRNA transfection. For miRNAs to regulate protein expression on the timescale of a day, as miRNAs involved in cell-cycle regulation do, accelerated miRNA turnover is necessary.
gene expression regulation; kinetics; miRNAs; modeling; protein turnover
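The protein-decay bottleneck described in the abstract above can be illustrated with a deliberately simplified two-stage sketch. This is not the authors' model: it omits miRNA loading into Argonaute entirely, treats the miRNA effect as an instantaneous increase in mRNA decay, and uses assumed half-lives chosen only for illustration. The function name and all parameters are hypothetical.

```python
import math


def simulate_mirna_response(hours, dt=0.01, mrna_half_life=2.0,
                            protein_half_life=24.0, fold_extra_decay=1.0):
    """Return (mRNA, protein) levels relative to the pre-miRNA steady state.

    Assumption: the miRNA acts from t = 0 by adding extra mRNA decay.
    All parameter values are illustrative, not fitted to any data set.
    """
    d_m = math.log(2) / mrna_half_life      # basal mRNA decay rate (1/h)
    d_p = math.log(2) / protein_half_life   # protein decay rate (1/h)
    extra = fold_extra_decay * d_m          # assumed miRNA-induced extra decay
    m, p = 1.0, 1.0                         # normalized steady state at t = 0
    for _ in range(int(round(hours / dt))):
        dm = d_m - (d_m + extra) * m        # constant synthesis, faster decay
        dp = d_p * (m - p)                  # protein tracks mRNA, decays slowly
        m += dm * dt                        # forward Euler step
        p += dp * dt
    return m, p


if __name__ == "__main__":
    for t in (2, 6, 24, 72):
        m, p = simulate_mirna_response(t)
        print(f"t = {t:3d} h  mRNA = {m:.2f}  protein = {p:.2f}")
```

With these assumed half-lives (2 h for the mRNA, 24 h for the protein), the mRNA settles near half its initial level within a few hours, while the protein is still at roughly three quarters of its initial level after 24 h.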
Deciphering physiological changes that mediate transition of Mycobacterium tuberculosis between replicating and nonreplicating states is essential to understanding how the pathogen can persist in an individual host for decades. We have combined RNA sequencing (RNA-seq) of 5′ triphosphate-enriched libraries with regular RNA-seq to characterize the architecture and expression of M. tuberculosis promoters. We identified over 4,000 transcriptional start sites (TSSs). Strikingly, for 26% of the genes with a primary TSS, the site of transcriptional initiation overlapped with the annotated start codon, generating leaderless transcripts lacking a 5′ UTR and, hence, the Shine-Dalgarno sequence commonly used to initiate ribosomal engagement in eubacteria. Genes encoding proteins with active growth functions were markedly depleted from the leaderless transcriptome, and there was a significant increase in the overall representation of leaderless mRNAs in a starvation model of growth arrest. The high percentage of leaderless genes may have particular importance in the physiology of nonreplicating M. tuberculosis.
•A resource for the identification of in vitro active promoters in M. tuberculosis
•A quarter of all genes in M. tuberculosis are expressed as leaderless mRNAs
•Leaderless mRNAs are differentially associated with toxin-antitoxin modules
•Abundance of leaderless mRNAs increases during starvation-induced growth arrest
In this study, Cortes, Young, and colleagues report genome-wide mapping of transcriptional start sites combined with RNA sequencing and shotgun proteomics in the human pathogen Mycobacterium tuberculosis. A striking finding is a high proportion of genes expressed in the form of leaderless transcripts lacking the Shine-Dalgarno sequence conventionally required for translation. The distribution of functional gene classes between leaderless and Shine-Dalgarno transcriptomes suggests that changes in the specificity of translation may play a role in bacterial adaptation during infection.
Public repositories for proteomics data have accelerated proteomics research by enabling more efficient cross-analyses of datasets, supporting the creation of protein and peptide compendia of experimental results, supporting the development and testing of new software tools, and facilitating the manuscript review process. The repositories available to date have been designed to accommodate either shotgun experiments or generic proteomic data files. Here, we describe a new kind of proteomic data repository for the collection and representation of data from selected reaction monitoring (SRM) measurements. The PeptideAtlas SRM Experiment Library (PASSEL) allows researchers to easily submit proteomic data sets generated by SRM. The raw data are automatically processed in a uniform manner and the results are stored in a database, where they may be downloaded or browsed via a web interface that includes a chromatogram viewer. PASSEL enables cross-analysis of SRM data, supports optimization of SRM data collection, and facilitates the review process of SRM data. Further, PASSEL will help in the assessment of proteotypic peptide performance in a wide array of samples containing the same peptide, as well as across multiple experimental protocols.
data repository; MRM; software; SRM; targeted proteomics
Serum biomarkers can improve diagnosis and treatment of malignant pleural mesothelioma (MPM). However, the evaluation of potential new serum biomarker candidates is hampered by a lack of assay technologies for their clinical evaluation. Here we followed a hypothesis-driven targeted proteomics strategy for the identification and clinical evaluation of MPM candidate biomarkers in serum of patient cohorts.
Based on the hypothesis that cell surface exposed glycoproteins are prone to be released from tumor cells to the circulatory system, we screened the surfaceome of model cell lines for potential MPM candidate biomarkers. Selected Reaction Monitoring (SRM) assay technology allowed for the direct evaluation of the newly identified candidates in serum. Our evaluation of 51 candidate biomarkers in the context of a training and an independent validation set revealed a reproducible glycopeptide signature of MPM in serum which complemented the MPM biomarker mesothelin.
Our study shows that SRM assay technology enables the direct clinical evaluation of protein-derived candidate biomarker panels for which clinically reliable ELISAs currently do not exist.
Malignant pleural mesothelioma; Selected reaction monitoring; Surfaceome; Targeted proteomics; Serum biomarkers
Chemical cross-links identified by mass spectrometry generate distance restraints that reveal low-resolution structural information on proteins and protein complexes. The technology to reliably generate such data has become mature and robust enough to shift the focus to the question of how these distance restraints can be best integrated into molecular modeling calculations. Here, we introduce three workflows for incorporating distance restraints generated by chemical cross-linking and mass spectrometry into ROSETTA protocols for comparative and de novo modeling and protein-protein docking. We demonstrate that the cross-link validation and visualization software Xwalk facilitates successful cross-link data integration. Besides the protocols we introduce XLdb, a database of chemical cross-links from 14 different publications with 506 intra-protein and 62 inter-protein cross-links, where each cross-link can be mapped on an experimental structure from the Protein Data Bank. Finally, we demonstrate on a protein-protein docking reference data set the impact of virtual cross-links on protein docking calculations and show that an inter-protein cross-link can reduce on average the RMSD of a docking prediction by 5.0 Å. The methods and results presented here provide guidelines for the effective integration of chemical cross-link data in molecular modeling calculations and should advance the structural analysis of particularly large and transient protein complexes via hybrid structural biology methods.
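The idea of treating cross-links as distance restraints can be sketched in a few lines. The example below is a minimal illustration, not part of the ROSETTA workflows or Xwalk: it uses straight-line distances between residue coordinates, and the 30 Å cutoff, residue labels and coordinates are assumptions made up for the example.

```python
import math


def check_crosslinks(coords, crosslinks, max_distance=30.0):
    """Evaluate cross-links against a structure as simple distance restraints.

    coords:      {residue_id: (x, y, z)} in Angstrom (e.g. C-alpha positions)
    crosslinks:  iterable of (residue_id_a, residue_id_b) pairs
    max_distance: assumed upper bound for a satisfied restraint (illustrative)
    Returns a list of (a, b, distance, satisfied) tuples.
    """
    results = []
    for a, b in crosslinks:
        distance = math.dist(coords[a], coords[b])  # Euclidean distance
        results.append((a, b, distance, distance <= max_distance))
    return results


if __name__ == "__main__":
    # Hypothetical coordinates for three lysine residues.
    coords = {"K10": (0.0, 0.0, 0.0), "K50": (3.0, 4.0, 0.0), "K90": (0.0, 0.0, 40.0)}
    for a, b, d, ok in check_crosslinks(coords, [("K10", "K50"), ("K10", "K90")]):
        print(f"{a}-{b}: {d:.1f} A, {'satisfied' if ok else 'violated'}")
```

A straight-line check like this ignores the protein surface, so it is best read as a first filter for candidate models rather than a validation of the cross-links themselves.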
The ATP-dependent chromatin-remodeling complex SWR1 exchanges a variant histone H2A.Z/H2B dimer for a canonical H2A/H2B dimer at nucleosomes flanking histone-depleted regions, such as promoters. This localization of H2A.Z is conserved throughout eukaryotes. SWR1 is a 1 megadalton complex containing 14 different polypeptides, including the AAA+ ATPases Rvb1 and Rvb2. Using electron microscopy, we obtained the three-dimensional structure of SWR1 and mapped its major functional components. Our data show that SWR1 contains a single heterohexameric Rvb1/Rvb2 ring that, together with the catalytic subunit Swr1, brackets two independently assembled multisubunit modules. We also show that SWR1 undergoes a large conformational change upon engaging a limited region of the nucleosome core particle. Our work suggests an important structural role for the Rvbs and a distinct substrate-handling mode by SWR1, thereby providing a structural framework for understanding the complex dimer-exchange reaction.
•SWR1 consists of four structurally discrete functional modules
•Rvb1 and Rvb2 assemble into a single, heterohexameric ring in SWR1
•SWR1 undergoes a large conformational change upon nucleosome binding
•SWR1 forms limited contact with the nucleosome core particle
Structural analysis of the 14 subunit SWR1 chromatin remodeler identifies the orientation of its functional modules and reveals conformational changes induced by nucleosome binding.
The rigorous testing of hypotheses on suitable sample cohorts is a major limitation in translational research. This is particularly the case for the validation of protein biomarkers where the lack of accurate, reproducible and sensitive assays for most proteins has precluded the systematic assessment of hundreds of potential marker proteins described in the literature.
Here, we describe a high throughput method for the development and refinement of selected reaction monitoring (SRM) assays for human proteins. The method was applied to generate such assays for more than 1000 cancer-associated proteins, which are functionally related to candidate cancer driver mutations. We used the assays to determine the detectability of the target proteins in two clinically relevant samples, plasma and urine. 182 proteins were detected in depleted plasma, spanning five orders of magnitude in abundance and reaching below a concentration of 10 ng/mL. The narrower concentration range of proteins in urine allowed the detection of 408 proteins. Moreover, we demonstrate that these SRM assays allow the reproducible quantification of 34 biomarker candidates across 84 patient plasma samples. Through public access to the entire assay library, which will also be expandable in the future, researchers will be able to target their cancer-associated proteins of interest in any sample type using the detectability information in plasma and urine as a guide. The generated reference map of SRM assays for cancer-associated proteins is a valuable resource for accelerating and planning biomarker verification studies.
Quantitative measurement of proteins involved in insulin signaling and central metabolism in C57BL/6J and 129Sv mice subjected to a sustained high-fat diet reveals that the two strains diverge early in their response to the feeding regimen.
Quantitative targeted protein measurements were designed to quantify murine proteins covering the insulin-signaling pathway and lipid and carbohydrate metabolism, and were used to compare the differential effect of a persistent high-fat diet in C57BL/6J and 129Sv mouse strains.
Differential effects of a persistent high-fat diet were compared in C57BL/6J and 129Sv mouse strains.
Differences in protein abundances suggest that peroxisomal β-oxidation is actively promoted in fatty C57BL/6J mice whereas lipogenesis activation dominates the response of 129Sv mice.
Most strain-specific changes were apparent early in the regimen, when phenotypic changes were already set but not yet very pronounced, and they allow a clear discrimination of the mouse strains at an early stage during the long-term high-fat diet.
Persistent high-fat diet also alters the transient changes that normally occur in C57BL/6J and 129Sv mice in response to fasting or food intake.
The metabolic syndrome is a collection of risk factors including obesity, insulin resistance and hepatic steatosis, which occur together and increase the risk of diseases such as diabetes, cardiovascular disease and cancer. In spite of intense research, the complex etiology of insulin resistance and its association with the accumulation of triacylglycerides in the liver and with hepatic steatosis remain incompletely understood. Here, we performed quantitative measurements of 144 proteins involved in the insulin-signaling pathway and central metabolism in liver homogenates of two genetically well-defined mouse strains, C57BL/6J and 129Sv, that were subjected to a sustained high-fat diet. We used targeted mass spectrometry by selected reaction monitoring (SRM) to generate accurate and reproducible quantitation of the targeted proteins across 36 different samples (12 conditions and 3 biological replicates), generating one of the largest quantitative targeted proteomics data sets in mammalian tissues. Our results revealed a rapid response to high-fat diet that diverged early in the feeding regimen, and evidenced a response to high-fat diet dominated by the activation of peroxisomal β-oxidation in C57BL/6J and by lipogenesis in 129Sv mice.
liver; metabolic syndrome; NAFLD; proteomics; SRM
Affinity purification coupled with mass spectrometry (AP-MS) is now a widely used approach for the identification of protein-protein interactions. However, for any given protein of interest, determining which of the identified polypeptides represent bona fide interactors versus those that are background contaminants (e.g. proteins that interact with the solid-phase support, affinity reagent or epitope tag) is a challenging task. While the standard approach is to identify nonspecific interactions using one or more negative controls, most small-scale AP-MS studies do not capture a complete, accurate background protein set. Fortunately, since negative controls are largely bait-independent, we reasoned that the negative controls generated by the proteomics research community could be developed as a resource for scoring AP-MS data.
Here we present the Contaminant Repository for Affinity Purification (The CRAPome), currently containing AP-MS data from 343 control purifications conducted by 11 different research groups (www.crapome.org). Users employ an intuitive graphical user interface to explore the database, by either querying one protein at a time, downloading background contaminant lists for selected experimental conditions, or uploading their own data (alongside their own negative controls when available) and performing data analysis. The CRAPome database scores contaminants vs. true interactors based on semi-quantitative mass spectrometry data (normalized spectral counts) embedded in most mass spectrometry experiments. The Significance Analysis of INTeractome (SAINT) scoring scheme and a simpler Fold Change calculation (FC score) are used to score user-supplied data and return a ranked list of putative interactors. We also describe database structure and composition, provide examples of the use of this resource to filter contaminants with properly chosen controls, and demonstrate the utility of the scoring scheme for identifying bona fide interaction partners. The CRAPome accommodates a variety of purification schemes and, while currently focused on human data, will be expanded to other species.
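The idea behind a fold-change score on spectral counts can be sketched as follows. The abstract does not give the exact FC formula, so this is an assumed, illustrative formulation: the mean spectral count of a prey in the bait purifications divided by its mean count in the negative controls, with a small pseudocount so that preys absent from all controls do not cause a division by zero. The function names, the pseudocount value and the protein labels are hypothetical.

```python
def fold_change(bait_counts, control_counts, pseudocount=0.1):
    """Assumed FC-style score: mean bait count over mean control count."""
    bait_mean = sum(bait_counts) / len(bait_counts)
    control_mean = sum(control_counts) / len(control_counts)
    return (bait_mean + pseudocount) / (control_mean + pseudocount)


def rank_preys(bait, controls, pseudocount=0.1):
    """Rank preys by descending score.

    bait:     {prey: [normalized spectral counts in bait replicates]}
    controls: {prey: [normalized spectral counts in negative controls]}
    Preys missing from the controls are treated as having zero counts.
    """
    scores = {
        prey: fold_change(counts, controls.get(prey, [0.0]), pseudocount)
        for prey, counts in bait.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up counts: one prey absent from controls, one abundant in them.
    bait = {"PREY_A": [20, 24], "STICKY_B": [30, 34]}
    controls = {"STICKY_B": [28, 36]}
    for prey, score in rank_preys(bait, controls):
        print(f"{prey}: {score:.1f}")
```

In this toy input the prey with the higher raw count ranks lower, because it is just as abundant in the negative controls; this is the behavior that makes aggregated controls useful for separating background from candidate interactors.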
Metabolome, proteome and physiology measurements were combined with mathematical modeling to unravel the temporal regulation of the metabolic fluxes during the diauxic shift in Saccharomyces cerevisiae.
The diauxic shift involves three main events: a reduction in the glycolytic flux and the production of storage compounds before glucose depletion; the reversion of carbon flow through glycolysis and onset of the glyoxylate cycle operation upon glucose exhaustion; and the shutting down of the pentose phosphate (PP) pathway with a change in the source of NADPH regeneration.
The redistribution of fluxes toward the production of storage compounds prior to glucose depletion drives glycolytic reactions closer to equilibrium, which is essential for the reversion of fluxes upon glucose exhaustion.
The onset of the glyoxylate cycle is quantitatively more important than the activation of the tricarboxylic acid cycle for growth on ethanol.
Flux through the PP pathway is halted in the later stages of the adaptation and NADPH regeneration is taken over by NADP-dependent enzymes in the glyoxylate cycle and ethanol metabolism.
The diauxic shift in Saccharomyces cerevisiae is an ideal model to study how eukaryotic cells readjust their metabolism from glycolytic to gluconeogenic operation. In this work, we generated time-resolved physiological data, quantitative metabolome (69 intracellular metabolites) and proteome (72 enzymes) profiles. We found that the diauxic shift is accomplished by three key events that are temporally organized: (i) a reduction in the glycolytic flux and the production of storage compounds before glucose depletion, mediated by downregulation of phosphofructokinase and pyruvate kinase reactions; (ii) upon glucose exhaustion, the reversion of carbon flow through glycolysis and onset of the glyoxylate cycle operation triggered by an increased expression of the enzymes that catalyze the malate synthase and cytosolic citrate synthase reactions; and (iii) in the later stages of the adaptation, the shutting down of the pentose phosphate pathway with a change in NADPH regeneration. Moreover, we identified the transcription factors associated with the observed changes in protein abundances. Taken together, our results represent an important contribution toward a systems-level understanding of how this adaptation is realized.
diauxic shift; fluxome; metabolome; proteome; Saccharomyces cerevisiae
We report a high-quality and system-wide proteome catalogue covering 71% (3,542 proteins) of the predicted genes of fission yeast, Schizosaccharomyces pombe, presenting the largest protein dataset to date for this important model organism. We obtained this high proteome and peptide (11.4 peptides/protein) coverage by a combination of extensive sample fractionation, high resolution Orbitrap mass spectrometry, and combined database searching using the iProphet software as part of the Trans-Proteomics Pipeline. All raw and processed data are made accessible in the S. pombe PeptideAtlas. The identified proteins showed no biases in functional properties and allowed global estimation of protein abundances. The high coverage of the PeptideAtlas allowed correlation with transcriptomic data in a system-wide manner, indicating that post-transcriptional processes control the levels of at least half of all identified proteins. Interestingly, the correlation was not equally tight for all functional categories, ranging from rs >0.80 for proteins involved in translation to rs <0.45 for signal transduction proteins. Moreover, many proteins involved in DNA damage repair could not be detected in the PeptideAtlas despite their high mRNA levels, strengthening the translation-on-demand hypothesis for members of this protein class. In summary, the extensive and publicly available S. pombe PeptideAtlas together with the generated proteotypic peptide spectral library will be a useful resource for future targeted, in-depth, and quantitative proteomic studies on this microorganism.
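The per-category correlations quoted above can be computed with a rank correlation. The sketch below assumes that rs denotes Spearman's rank correlation coefficient (the abstract does not spell this out) and implements it with the standard library only, as the Pearson correlation of average ranks. The abundance values in the usage example are made up.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical mRNA and protein abundance estimates for five genes.
    mrna = [120.0, 35.0, 8.0, 260.0, 15.0]
    protein = [9000.0, 1500.0, 2100.0, 20000.0, 400.0]
    print(f"rs = {spearman(mrna, protein):.2f}")
```

Applied separately to the genes of each functional category, a function like this would give one rs value per category, which is the kind of comparison the abstract reports.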
Mammalian host response to pathogens is associated with fluctuations in highly abundant proteins in body fluids as well as with regulation of proteins expressed in relatively low copy numbers, like cytokines secreted from immune cells and endothelium. Hence, efficient monitoring of proteins associated with host response to pathogens remains a challenging task. In this paper we present a targeted proteome analysis of a panel of 20 proteins that are widely believed to be key players and indicators of bovine host response to mastitis pathogens. Stable isotope labeled variants of two concordant proteotypic peptides from each of these 20 proteins were obtained through the QconCAT method. We present the quantotypic properties of these 40 proteotypic peptides, and discuss their application to research in host-pathogen interactions. Our results clearly demonstrate a robust monitoring of 17 targeted host-response proteins. Twelve of these were readily quantified in a simple extraction of mammary gland tissues, while the expression levels of the remaining proteins were too low for direct and stable quantification; hence their accurate quantification requires further fractionation of mammary gland tissues.
SRM; QconCAT assay; quantification; proteomics; quantotypic peptides; mastitis
A strategy is presented that combines metabolic fluxes with targeted phosphoproteomics measurements to drive testable hypotheses for the functionality of post-translational regulation in S. cerevisiae central metabolism.
Discovery-driven mass spectrometry phosphoproteomics identified 35 differentially phosphorylated enzymes of yeast central metabolism.
Phosphoenzymes are predominant in upper glycolysis, around the pyruvate node and in carbohydrate storage pathways.
A targeted phosphoproteomics method was developed to quantify total, phospho and non-phosphoprotein directly from crude cell extracts.
Correlation of phosphoprotein levels with metabolic fluxes across conditions provided functional evidence for five novel phosphoregulated enzymes.
Functional follow-ups demonstrated the inhibitory role of phosphorylation in controlling metabolic fluxes realised by Gpd1, Pda1 and Pfk2.
As a frequent post-translational modification, protein phosphorylation regulates many cellular processes. Although several hundred phosphorylation sites have been mapped to metabolic enzymes in Saccharomyces cerevisiae, functionality was demonstrated for few of them. Here, we describe a novel approach to identify in vivo functionality of enzyme phosphorylation by combining flux analysis with proteomics and phosphoproteomics. Focusing on the network of 204 enzymes that constitute the yeast central carbon and amino-acid metabolism, we combined protein and phosphoprotein levels to identify 35 enzymes that change their degree of phosphorylation during growth under five conditions. Correlations between previously determined intracellular fluxes and phosphoprotein abundances provided first functional evidence for five novel phosphoregulated enzymes in this network, adding to nine known phosphoenzymes. For the pyruvate dehydrogenase complex E1 α subunit Pda1 and the newly identified phosphoregulated glycerol-3-phosphate dehydrogenase Gpd1 and phosphofructose-1-kinase complex β subunit Pfk2, we then validated functionality of specific phosphosites through absolute peptide quantification by targeted mass spectrometry, metabolomics and physiological flux analysis in mutants with genetically removed phosphosites. These results demonstrate the role of phosphorylation in controlling the metabolic flux realised by these three enzymes.
metabolic flux; metabolism; phosphoproteome; post-translational regulation; selected reaction monitoring
Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) is widely used for quantitative proteomic investigations. The typical output of such studies is a list of identified and quantified peptides. The biological and clinical interest is, however, usually focused on quantitative conclusions at the protein level. Furthermore, many investigations ask complex biological questions by studying multiple interrelated experimental conditions. Therefore, there is a need in the field for generic statistical models to quantify protein levels even in complex study designs.
We propose a general statistical modeling approach for protein quantification in arbitrary complex experimental designs, such as time course studies, or those involving multiple experimental factors. The approach summarizes the quantitative experimental information from all the features and all the conditions that pertain to a protein. It enables both protein significance analysis between conditions, and protein quantification in individual samples or conditions. We implement the approach in an open-source R-based software package MSstats suitable for researchers with a limited statistics and programming background.
We demonstrate, using as examples two experimental investigations with complex designs, that a simultaneous statistical modeling of all the relevant features and conditions yields a higher sensitivity of protein significance analysis and a higher accuracy of protein quantification as compared to commonly employed alternatives. The software is available at http://www.stat.purdue.edu/~ovitek/Software.html.
Label-free LC-MS/MS; linear mixed effects models; protein quantification; quantitative proteomics; statistical design of experiments
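The idea of summarizing all features and conditions of a protein in one model can be illustrated with a minimal sketch. It is not the MSstats implementation, which is an R package built on linear mixed effects models; as a simplifying assumption, the sketch fits a fixed-effects additive model (log2 intensity = feature effect + condition effect) by backfitting. The function names, the tuple layout, and the example values are all hypothetical.

```python
from collections import defaultdict


def fit_additive_model(observations, n_iter=200):
    """Fit log2 intensity = feature effect + condition effect by backfitting.

    observations: iterable of (feature, condition, log2_intensity) tuples
    for a single protein. Missing feature/condition cells are simply absent.
    Returns (feature_effects, condition_effects) as dicts.
    """
    observations = list(observations)
    feature_effects = {f: 0.0 for f, _, _ in observations}
    condition_effects = {c: 0.0 for _, c, _ in observations}

    def group_means(key_index, other_index, other_effects):
        # Average the residuals left after removing the other factor's effect.
        sums, counts = defaultdict(float), defaultdict(int)
        for obs in observations:
            key = obs[key_index]
            sums[key] += obs[2] - other_effects[obs[other_index]]
            counts[key] += 1
        return {k: sums[k] / counts[k] for k in sums}

    for _ in range(n_iter):
        feature_effects = group_means(0, 1, condition_effects)
        condition_effects = group_means(1, 0, feature_effects)
        # Centre the condition effects so the model stays identifiable;
        # the shift is absorbed by the feature effects.
        shift = sum(condition_effects.values()) / len(condition_effects)
        condition_effects = {c: v - shift for c, v in condition_effects.items()}
        feature_effects = {f: v + shift for f, v in feature_effects.items()}
    return feature_effects, condition_effects


def log2_fold_change(observations, condition_a, condition_b):
    """Protein-level log2 fold change of condition_b relative to condition_a."""
    _, condition_effects = fit_additive_model(observations)
    return condition_effects[condition_b] - condition_effects[condition_a]


if __name__ == "__main__":
    # Hypothetical log2 intensities: two peptides, two conditions,
    # with one peptide missing in the treated condition.
    example = [
        ("pep1", "control", 20.1), ("pep1", "treated", 21.0),
        ("pep2", "control", 22.3), ("pep2", "treated", 23.1),
        ("pep3", "control", 18.2),
    ]
    print(log2_fold_change(example, "control", "treated"))
```

Because every feature contributes to the same condition estimates, a peptide that is missing in one condition still informs the fit without biasing the comparison, which a simple average of intensities per condition would not guarantee.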
Polycomb Repressive Complex 2 (PRC2) is essential for gene silencing, establishing transcriptional repression of specific genes by tri-methylating Lysine 27 of histone H3, a process mediated by cofactors such as AEBP2. In spite of its biological importance, little is known about PRC2 architecture and subunit organization. Here, we present the first three-dimensional electron microscopy structure of the human PRC2 complex bound to its cofactor AEBP2. Using a novel internal protein-tagging method, in combination with isotopic chemical cross-linking and mass spectrometry, we have localized all the PRC2 subunits and their functional domains and generated a detailed map of interactions. The position and stabilization effect of AEBP2 suggest an allosteric role of this cofactor in regulating gene silencing. Regions in PRC2 that interact with modified histone tails are localized near the methyltransferase site, suggesting a molecular mechanism for the chromatin-based regulation of PRC2 activity.
Protein complexes—stable structures that contain two or more proteins—have an important role in the biochemical processes that are associated with the expression of genes. Some help to silence genes, whereas others are involved in the activation of genes. The importance of such complexes is emphasized by the fact that mice die as embryos, or are born with serious defects, if they do not possess the protein complex known as Polycomb Repressive Complex 2, or PRC2 for short.
It is known that the core of this complex, which is found in species that range from Drosophila to humans, is composed of four different proteins, and that the structures of two of these have been determined with atomic precision. It is also known that PRC2 requires a particular protein cofactor (called AEBP2) to perform its function. Moreover, it has been established that PRC2 silences genes by adding two or three methyl (CH3) groups to a particular amino acid (Lysine 27) in one of the proteins (histone H3) that DNA strands wrap around in the nucleus of cells. However, despite its biological importance, little is known about the detailed architecture of PRC2.
Ciferri et al. shed new light on the structure of this complex by using electron microscopy to produce the first three-dimensional image of the human PRC2 complex bound to its cofactor. By incorporating various protein tags into the cofactor and the four subunits of PRC2, and by employing mass spectrometry and other techniques, Ciferri et al. were able to identify 60 or so interaction sites within the PRC2-cofactor system, and to determine their locations within the overall structure.
The results show that the cofactor stabilizes the architecture of the complex by binding to it at a central hinge point. In particular, the protein domains within PRC2 that interact with the histone markers are close to the site that transfers the methyl groups, which helps to explain how the gene silencing activity of the PRC2 complex is regulated. The results should pave the way to a more complete understanding of how PRC2 and its cofactor are able to silence genes.
cryo-EM; Gene silencing; labeling; chemical cross-linking; chromatin; Human
Data on absolute molecule numbers will empower the modeling, understanding, and comparison of cellular functions and biological systems. We quantified transcriptomes and proteomes in fission yeast during cellular proliferation and quiescence. This rich resource provides the first comprehensive reference for all RNA and most protein concentrations in a eukaryote under two key physiological conditions. The integrated data set supports quantitative biology and affords unique insights into cell regulation. Although mRNAs are typically expressed in a narrow range above 1 copy/cell, most long, noncoding RNAs, except for a distinct subset, are tightly repressed below 1 copy/cell. Cell-cycle-regulated transcription tunes mRNA numbers to phase-specific requirements but can also bring about more switch-like expression. Proteins greatly exceed mRNAs in abundance and dynamic range, and concentrations are regulated to functional demands. Upon transition to quiescence, the proteome changes substantially, but, in stark contrast to mRNAs, proteins do not uniformly decrease but scale with cell volume.
► Cellular numbers for all RNAs and most proteins during proliferation and quiescence
► Cells contain 1–10 copies of most mRNAs and ∼100–1 million copies of most proteins
► Distinct subset of long noncoding RNAs is expressed above 1 copy/cell
► Quiescent cells show ∼4-fold lower RNA concentrations and highly remodeled proteome
Quantitative RNA-seq and mass spectrometry in two cellular states are used to show that proteins greatly exceed mRNAs in abundance and dynamic range in yeast, and concentrations are regulated to functional demands. Upon transition to quiescence, the proteome changes substantially, but in contrast to mRNAs, proteins do not uniformly decrease but scale with cell volume.
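The relation between copy numbers and concentrations used above can be made concrete with a small worked calculation: a molar concentration is the copy number divided by the Avogadro constant times the cell volume. The sketch below assumes a hypothetical cell volume of 100 fL, which is not a value taken from the study; the function name is likewise illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def copies_to_molar(copies_per_cell, cell_volume_litres):
    """Convert an absolute copy number per cell to a molar concentration."""
    return copies_per_cell / (AVOGADRO * cell_volume_litres)


if __name__ == "__main__":
    # ASSUMPTION: 100 fL (1e-13 L) is a hypothetical cell volume chosen
    # only for illustration; substitute the measured volume in practice.
    assumed_volume = 1e-13
    for copies in (1, 10, 100, 1_000_000):
        molar = copies_to_molar(copies, assumed_volume)
        print(f"{copies} copies/cell -> {molar:.3e} M")
```

Under this assumed volume, a single copy per cell corresponds to roughly 17 pM. The same formula shows why copy number and concentration can diverge: if copy numbers scale with cell volume, the concentration stays constant even though the number of molecules per cell changes.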
Computational prediction methods for the identification of microRNA (miRNA) target genes benefit from efficient experimental validation strategies. Here we present a large-scale targeted proteomics approach to validate such predicted miRNA targets in Caenorhabditis elegans. Using selected reaction monitoring (SRM), we quantified 161 proteins of interest in extracts from wild-type and let-7 mutant worms. We demonstrate by independent experimental downstream analyses, such as genetic interaction as well as polysomal profiling and luciferase assays performed on selected examples, that validation by targeted proteomics significantly enriches for biologically relevant let-7 interactors. For example, we show that the zinc finger protein ZTF-7 is a bona fide let-7 miRNA target. We also validated a set of predicted miR-58 targets, demonstrating that this approach is adaptable to multiple miRNAs of interest. We propose that targeted mass spectrometry can be applied generally to validate candidate lists generated by computational methods or by large-scale experiments, and that the described strategy can easily be adapted to other organisms.
targeted proteomics; microRNA targets; let-7; C. elegans; selected / multiple reaction monitoring; SRM / MRM
The melanoma growth stimulatory activity/growth-regulated protein, CXCL1, is constitutively expressed at high levels during inflammation and progression of melanocytes into malignant melanoma. It has been shown previously that CXCL1 overexpression in melanoma cells is due to increased transcription as well as stability of the CXCL1 message. The transcription of CXCL1 is regulated through several cis-acting elements including Sp1, nuclear factor κB (NF-κB), HMGI(Y), and the immediate upstream region (IUR) element (nucleotides −94 to −78), which lies immediately upstream of the NF-κB element. Previously, it has been shown that the IUR is necessary for basal and cytokine-induced transcription of the CXCL1 gene. UV cross-linking and Southwestern blot analyses indicate that the IUR oligonucleotide probe selectively binds a 115-kDa protein. In this study, the IUR element has been further characterized. We show here that proximity of the IUR element to the adjacent NF-κB element is critical to its function as a positive regulatory element. Using binding site oligonucleotide affinity chromatography, we have selectively purified the 115-kDa IUR-F. Mass spectrometry/mass spectrometry/matrix-assisted laser desorption ionization/time of flight spectroscopy and amino acid analysis as well as microcapillary reverse phase chromatography electrospray ionization tandem mass spectrometry identified this protein as the 114-kDa poly(ADP-ribose) polymerase (PARP1). Furthermore, 3-aminobenzamide, an inhibitor of PARP-specific ADP-ribosylation, inhibits CXCL1 promoter activity and reduces levels of CXCL1 mRNA. The data point to the possibility that PARP may be a coactivator of CXCL1 transcription.