Related Articles

1.  Quantification of mRNA and protein and integration with protein turnover in a bacterium 
Determination of the average cellular copy number of 400 proteins under different growth conditions and integration with protein turnover and absolute mRNA levels reveals the dynamics of protein expression in the genome-reduced bacterium Mycoplasma pneumoniae.
Our study provides a fine-grained, quantitative picture in unprecedented detail of an established model organism for systems-wide studies. Our integrative approach reveals a novel, dynamic view on the processes, interactions and regulations underlying the central dogma pathway and the composition of protein complexes. Simulations using our quantitative data on mRNA, protein and turnover show how an organism copes with stochastic noise in gene expression in vivo. Our data serve as an important resource for colleagues both within our field of research and in related disciplines.
A hallmark of Systems Biology is the integration of diverse, large quantitative data sets with the aim to gain novel insights into how biological processes work. We measured individual mRNA and protein abundances as well as protein turnover in the bacterium Mycoplasma pneumoniae. This human pathogen is an ideal model organism for organism-wide studies. It can be readily cultured under laboratory conditions and it has a very small genome with only 690 protein-coding genes. This comparatively low complexity allows for the exhaustive analysis of major cellular biomolecules, avoiding constraints introduced by limitations of available analysis techniques.
Using a recently developed mass spectrometry-based approach, we determined the average cellular copy number for over 400 individual proteins under different growth and stress conditions. The 20 most abundant proteins, including Elongation factor Tu, cellular chaperones, and proteins involved in metabolizing glucose, the major energy source of M. pneumoniae, account for nearly 44% of the total cellular protein mass. We observed abundance changes of many expected and several unexpected proteins in response to cellular stress, such as heat shock, DNA damage and osmotic stress, as well as along batch culture growth over 4 days.
Integration of the protein abundance data with quantitative mRNA measurements revealed a modest correlation between these two classes of biomolecules. However, for several classical stress-induced proteins, we observed a correlated induction of mRNA and protein in response to heat shock. A focused analysis of mRNA–protein abundance dynamics during batch culture growth suggested that the regulation of gene expression is largely decoupled from protein dynamics in M. pneumoniae, indicating extensive post-transcriptional and post-translational regulation influencing the cellular mRNA–protein ratios.
To investigate the factors influencing the cellular protein abundance, we measured individual protein turnover rates by mass spectrometry using a label-chase approach involving stable isotope-labelled amino acids. The average half-life of a protein in M. pneumoniae is 23 h. Based on the measured quantitative mRNA data, the protein abundances and their half-lives, we established an ordinary differential equations model for the estimation of individual in vivo protein degradation and translation efficiency rates. We found that translation efficiency rather than protein turnover is the dominating factor influencing protein abundance. Using our abundance and turnover data, we additionally performed stochastic simulations of gene expression. We observed that long protein half-life and low translational efficiency buffer gene expression noise propagating from low cellular mRNA levels in vivo.
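The relationship between mRNA level, translation efficiency and half-life described above can be illustrated with a minimal sketch. It assumes a single first-order equation, dP/dt = k_tl * mRNA - k_deg * P, with k_deg = ln 2 / half-life, and it ignores dilution by cell growth. The function names and all parameter values other than the 23 h half-life are hypothetical, and this is not the authors' published model.

```python
import math


def degradation_rate(half_life_h):
    """First-order degradation rate constant (per hour) from a half-life in hours."""
    return math.log(2) / half_life_h


def steady_state_protein(mrna_copies, k_tl, half_life_h):
    """Steady-state protein copies for dP/dt = k_tl * mRNA - k_deg * P.

    k_tl is a hypothetical translation efficiency
    (proteins per mRNA per hour).
    """
    return k_tl * mrna_copies / degradation_rate(half_life_h)


def simulate_protein(mrna_copies, k_tl, half_life_h, p0=0.0, t_end_h=500.0, dt=0.01):
    """Integrate the same equation with explicit Euler steps and return P(t_end_h)."""
    k_deg = degradation_rate(half_life_h)
    protein = p0
    for _ in range(int(round(t_end_h / dt))):
        protein += dt * (k_tl * mrna_copies - k_deg * protein)
    return protein


if __name__ == "__main__":
    # 23 h is the average half-life reported in the abstract;
    # the mRNA copy number and k_tl below are made-up illustration values.
    print(steady_state_protein(mrna_copies=1.0, k_tl=1.0, half_life_h=23.0))
    print(simulate_protein(mrna_copies=1.0, k_tl=1.0, half_life_h=23.0))
```

Under these assumptions the steady-state level is k_tl * mRNA / k_deg, so doubling the translation efficiency or doubling the half-life each doubles the protein level; the simulated trajectory approaches that value over several half-lives.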
We compared the abundance ratios of proteins associating into complexes in vivo with their expected functional stoichiometries. We observed that for stable protein complexes, such as the GroEL/ES chaperonin or DNA gyrase, our measured abundance ratios reflected the expected subunit stoichiometries. More dynamic protein complexes, such as the DnaK/J/GrpE chaperone system or RNA polymerase, showed several unusual subunit ratios, pointing towards transient interaction of sub-stoichiometric subunits for function. A detailed, quantitative analysis of the ribosome, the largest cellular protein complex, revealed large abundance differences of the 51 subunits. This observation indicates a multi-functionality for several abundant ribosomal proteins.
Finally, a comparison of the determined average cellular protein abundances with a different pathogenic bacterium, Leptospira interrogans, revealed that cellular protein abundances closely reflect their respective lifestyles.
Our study represents an organism-wide, quantitative analysis of cellular protein abundances. Integrating our proteomics data with determined mRNA levels and protein turnover rates reveals insights into the dynamic interplay and regulation of mRNA and proteins, the central biomolecules of a cell.
Biological function and cellular responses to environmental perturbations are regulated by a complex interplay of DNA, RNA, proteins and metabolites inside cells. To understand these central processes in living systems at the molecular level, we integrated experimentally determined abundance data for mRNA, proteins, as well as individual protein half-lives from the genome-reduced bacterium Mycoplasma pneumoniae. We provide a fine-grained, quantitative analysis of basic intracellular processes under various external conditions. Proteome composition changes in response to cellular perturbations reveal specific stress response strategies. The regulation of gene expression is largely decoupled from protein dynamics and translation efficiency has a higher regulatory impact on protein abundance than protein turnover. Stochastic simulations using in vivo data show how low translation efficiency and long protein half-lives effectively reduce biological noise in gene expression. Protein abundances are regulated in functional units, such as complexes or pathways, and reflect cellular lifestyles. Our study provides a detailed integrative analysis of average cellular protein abundances and the dynamic interplay of mRNA and proteins, the central biomolecules of a cell.
PMCID: PMC3159969  PMID: 21772259
mRNA–protein; Mycoplasma pneumoniae; protein homeostasis; protein turnover; quantitative proteomics
2.  The quantitative proteomes of human-induced pluripotent stem cells and embryonic stem cells 
An in-depth proteomic comparison of human-induced pluripotent stem cells, and their parent fibroblast cells, with embryonic stem cells shows that the reprogramming process comprehensively remodels protein expression levels, creating cells that closely resemble natural stem cells.
We present here a large proteomic characterization of human embryonic stem cells, human-induced pluripotent stem cells and their parental fibroblast cell lines. Overall, 97.8% of the 2683 quantified proteins in four experiments showed no significant differences in abundance between hESC and hiPSC, highlighting the high similarity of these pluripotent cell lines. In total, 58 proteins were found significantly differentially expressed between hiPSCs and hESCs. The observed low overlap of these proteins with previous transcriptomic studies suggests that those differences do not reflect a recurrent molecular signature.
Human embryonic stem cells (hESCs) are capable of self-renewal and multi-lineage differentiation. However, the use of hESCs for clinical treatment entails ethical issues as they are derived from human embryos. Recently, reprogramming of somatic cells to an embryonic stem cell-like state, named induced pluripotent stem cells (iPSCs), was achieved through ectopic expression of defined factors. In addition to their clinical potential, hiPSCs represent a unique tool to develop cellular models for human diseases as well. Although current functional assays (e.g., tetraploid complementation) have confirmed the pluripotency of hiPSCs, there might still be significant differences (e.g., differentiation potential) when compared with their natural hESC counterparts. Consequently, an extensive molecular characterization to address differences and similarities between these two pluripotent cell lines seems to be a prerequisite before any clinical application is conducted. Although great efforts, mainly at the genomic level, have been made to address how similar hESCs and hiPSCs are, the definite answer to this fundamental question is currently still debated. Direct assessment of protein levels has yet to be incorporated into these integrative systems-level analyses. Protein levels are tuned by intricate mechanisms of gene expression regulation and it has recently been documented that mRNA and protein levels poorly correlate in mouse ESCs. Here, we use in-depth quantitative proteomics to gain insights into the differences and similarities in the protein content of two hiPS cell lines, their precursor IMR90 and 4Skin fibroblast cell lines and one hES cell line, providing novel molecular signatures that may assist in filling a gap in the understanding of pluripotency.
To study the degree of similarity, at the protein level, between hiPSCs and hESCs, four MS-based proteomic experiments were designed that use our in-house developed triplex dimethyl labeling chemistry followed by extensive fractionation by strong cation exchange (SCX) chromatography to reduce the sample complexity. High-resolution LC-MS/MS with dedicated fragmentation schemes (i.e., electron transfer dissociation, collision-induced dissociation and higher-energy collision dissociation) was subsequently used to maximize peptide identification rates. A total of 348 LC-MS/MS analyses (including technical and biological replicates) were performed. We confidently identified 1 593 446 peptide spectrum matches (peptide FDR<1%) corresponding to 10 628 unique protein groups (protein FDR∼4%). Using the extracted ion chromatograms, we also estimated the absolute abundance of the proteins within the samples spanning six orders of magnitude. To the best of our knowledge, the coverage obtained in this study represents the largest achieved by any proteomics screen on pluripotent cells.
Most importantly, our results indicate that the reprogramming process remodeled the proteome of both fibroblast cell lines to a profile that closely resembles the pluripotent hESC proteome: 97.8% of the quantified proteins (2638 proteins in all four experiments) showed nonsignificant changes. Nevertheless, a small fraction of 58 proteins, mainly related to metabolism, antigen processing and cell adhesion, was found significantly regulated between hiPSCs and hESCs. A comparison of the regulated proteins to previously published transcriptomic studies showed a low overlap, highlighting the emerging notion that differences between both pluripotent cell lines reflect experimental conditions rather than a recurrent molecular signature. On the other hand, the inclusion of the two parental fibroblast cell lines in our analysis allowed us to study changes in the proteome at both the starting and end points of the reprogramming process. As expected, the vast majority of the proteins (73.4%) showed differential expression between the parental fibroblasts and the reprogrammed pluripotent cells.
To find out if the differences observed in our study were a consequence of transcriptional or translational regulation, we performed paired genome-wide gene expression analyses on the same six samples that were used for the proteomic profiling. Overall, we observed a good correlation between mRNA and protein levels (r∼0.7). These results further authenticated the proteomic measurements and implied a high degree of control at the transcriptional level. Nevertheless, numerous genes were found uncorrelated highlighting the necessity of complementing transcriptomic-based approaches with proteomics.
Assessing relevant molecular differences between human-induced pluripotent stem cells (hiPSCs) and human embryonic stem cells (hESCs) is important, given that such differences may impact their potential therapeutic use. Controversy surrounds recent gene expression studies comparing hiPSCs and hESCs. Here, we present an in-depth quantitative mass spectrometry-based analysis of hESCs, two different hiPSCs and their precursor fibroblast cell lines. Our comparisons confirmed the high similarity of hESCs and hiPSCs at the proteome level as 97.8% of the proteins were found unchanged. Nevertheless, a small group of 58 proteins, mainly related to metabolism, antigen processing and cell adhesion, was found significantly differentially expressed between hiPSCs and hESCs. A comparison of the regulated proteins with previously published transcriptomic studies showed a low overlap, highlighting the emerging notion that differences between both pluripotent cell lines reflect experimental conditions rather than a recurrent molecular signature.
PMCID: PMC3261715  PMID: 22108792
human embryonic stem cells; human-induced pluripotent stem cells; proteomics; quantitation
3.  A dynamic model of proteome changes reveals new roles for transcript alteration in yeast 
By characterizing dynamic changes in yeast protein abundance following osmotic shock, this study shows that the correlation between protein and mRNA differs for transcripts that increase versus decrease in abundance, and reveals physiological reasons for these differences.
The correlation between protein and mRNA change is very high at transcripts that increase in abundance, but negligible at reduced transcripts following NaCl shock. Modeling and experimental data suggest that reducing levels of high-abundance transcripts helps to direct translational machinery to newly made transcripts. The transient burst of transcript increase serves to accelerate changes in protein abundance. Post-transcriptional regulation of protein abundance is pervasive, although most of the variance in protein change is explained by changes in mRNA abundance.
Natural microenvironments change rapidly, and living creatures must respond quickly and efficiently to thrive within this flux. At all cellular levels—signaling, transcription, translation, metabolism, cell growth, and division—the response is dynamic and coordinated. Some aspects of this response, such as dynamic changes of the transcriptome, are well understood. But other aspects, like the response of the proteome, have remained obscured primarily because of previous limitations in technology. Without coordinated time-course data, it has remained impossible to correctly characterize the correlations and dependencies between these two essential levels of cell biology.
This work presents an extended picture of the coordinated response of the transcriptome and proteome as cells respond to an abrupt environmental change. To assay proteomic dynamics, we developed a strategy for large-scale, multiplexed quantitation using isobaric tags and high mass accuracy mass spectrometry. This sensitive yet efficient platform allows for the expedient collection of quantitative time-course proteomic data at six time points, sufficiently reproducible to permit meaningful interpretation of variation across biological replicates. Time-course transcriptome data were generated from paired biological samples, allowing us to examine the relationships between changes in mRNA and protein for each gene in terms of direction and intensity, as well as the characteristics of the temporal profiles for each gene.
It was immediately obvious that a single measure of correlation across the entire data set was a meaningless metric. We therefore analyzed relationships between mRNA and protein for different subsets of data. In response to osmotic shock, hundreds of transcripts are highly induced, and their temporal pattern reveals a transient peak of maximal induction, which resolves into a new elevated level as cells acclimate (Figure 2). For this group of genes, there is extremely high correlation between peak mRNA change and protein change (R2∼0.8). But the dynamics of the molecules differ: while mRNA levels transiently overshoot their final levels, proteins gradually rise in abundance toward their new, elevated state. We observed, however, that a measure of efficiency connects the two profiles. The time it takes for a protein to acclimate to its new state correlates with the magnitude of the excess mRNA induction. Thus, the cell imparts an urgency to protein induction by transiently producing excess transcript.
The most surprising result, however, involves transcripts that decrease in abundance. In response to osmotic shock, the cell transiently reduces over 600 transcripts, many of which are among the most highly expressed in unstressed cells. But protein levels for these genes remain, for the most part, almost completely unchanged. The stark absence of protein repression is independent of basal protein abundance, independent of reported protein half-lives, reproducible across biological replicates, and validated by quantitative western blots. Furthermore, since we do detect a handful of proteins whose abundance is significantly reduced, our technology is capable of identifying protein loss. Thus, we conclude that transcript reduction serves another purpose besides reducing protein levels.
To explore alternate interpretations of the consequence of transcriptional repression, we devised a mass-action kinetic model, which describes protein changes based on mRNA dynamics in the context of transient changes in the rates of cell division. The model successfully recapitulated the observed data, allowing us to alter modeling parameters to test various hypotheses.
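A model of this general shape can be sketched as follows. The sketch assumes first-order synthesis from mRNA, first-order degradation, and dilution by growth, i.e. dP/dt = k_syn * mRNA(t) - (k_deg + mu(t)) * P. The function name, the step functions and every rate value are hypothetical illustrations; this is a toy version and not the authors' model, and it is not expected to reproduce all of their findings.

```python
def protein_timecourse(mrna_fn, growth_fn, k_syn, k_deg, p0, t_end, dt=0.001):
    """Euler integration of dP/dt = k_syn * mRNA(t) - (k_deg + mu(t)) * P.

    mrna_fn(t)   -> relative mRNA level at time t
    growth_fn(t) -> specific growth (dilution) rate mu at time t
    Returns the protein concentration at t_end.
    """
    protein = p0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        synthesis = k_syn * mrna_fn(t)
        loss = (k_deg + growth_fn(t)) * protein
        protein += dt * (synthesis - loss)
    return protein


if __name__ == "__main__":
    # Hypothetical scenario: mRNA halves at t = 0 for a stable protein (k_deg = 0).
    halved_mrna = lambda t: 0.5
    slowed_growth = lambda t: 0.2      # growth rate also halves
    unchanged_growth = lambda t: 0.4   # growth rate stays at its pre-stress value

    # Pre-stress steady state is P = k_syn * 1.0 / 0.4 = 1.0 with k_syn = 0.4.
    print(protein_timecourse(halved_mrna, slowed_growth, 0.4, 0.0, 1.0, 50.0))
    print(protein_timecourse(halved_mrna, unchanged_growth, 0.4, 0.0, 1.0, 50.0))
```

In this toy setting, a halved synthesis rate is exactly offset by a halved dilution rate, so the protein concentration stays at its starting value, whereas with unchanged growth it decays toward half of it.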
In response to osmotic shock, overall rates of translation temporarily decrease and cell growth transiently arrests before resuming at a slower rate. We reasoned that mRNA reduction might lower the rate of new protein synthesis, but that retarded production is balanced by reduced cell division. We explored both aspects of this logic with our model.
As expected, removing cell division from our model led to a calculated decrease of protein levels, indicating that reduced growth is necessary for maintaining protein levels. However, when we computationally held mRNA levels stable and calculated protein levels in the absence of mRNA repression, we did not find the expected increase in protein abundance.
We then considered the possibility that one function of the regulated repression of these highly abundant transcripts was to liberate proteins essential for translation, such as ribosomes or translation initiation factors. To explore this, we examined a mutant lacking the Dot6p/Tod6p transcriptional repressors, which fails to properly repress ∼250 genes in response to osmotic shock. In the wild type, the mRNA for a Dot6p/Tod6p target (ARX1) decreased seven-fold, and the remaining transcript was generally unassociated with poly-ribosomes. In the mutant, however, the mRNA levels were reduced only two-fold, while the remaining transcript continued to bind ribosomes. Therefore, failure to reduce transcript levels led to a persistent association with poly-ribosomes, thereby consuming translational machinery.
Our hypothesis is, therefore, that widespread changes in the transcriptome promote efficient translation of new proteins. Transcript increase serves to increase abundance of the encoded proteins, while reduction of some of the most abundant and highly translated mRNAs supports this project by liberating translational capacity. While it is not clear what factors are the limiting elements, it is clear that a full picture of cellular biology requires exploring the dynamics of the cellular response.
The transcriptome and proteome change dynamically as cells respond to environmental stress; however, prior proteomic studies reported poor correlation between mRNA and protein, rendering their relationships unclear. To address this, we combined high mass accuracy mass spectrometry with isobaric tagging to quantify dynamic changes in ∼2500 Saccharomyces cerevisiae proteins, in biological triplicate and with paired mRNA samples, as cells acclimated to high osmolarity. Surprisingly, while transcript induction correlated extremely well with protein increase, transcript reduction produced little to no change in the corresponding proteins. We constructed a mathematical model of dynamic protein changes and propose that the lack of protein reduction is explained by cell-division arrest, while transcript reduction supports redistribution of translational machinery. Furthermore, the transient 'burst' of mRNA induction after stress serves to accelerate change in the corresponding protein levels. We identified several classes of post-transcriptional regulation, but show that most of the variance in protein changes is explained by mRNA. Our results present a picture of the coordinated physiological responses at the levels of mRNA, protein, protein-synthetic capacity, and cellular growth.
PMCID: PMC3159980  PMID: 21772262
dynamics; modeling; proteomics; stress; transcriptomics
4.  Proteome-wide cellular protein concentrations of the human pathogen Leptospira interrogans 
Nature  2009;460(7256):762-765.
Mass spectrometry based methods for relative proteome quantification have broadly impacted life science research. However, important research directions, particularly those involving mathematical modeling and simulation of biological processes, also critically depend on absolutely quantitative data, i.e. knowledge of the concentration of the expressed proteins as a function of cellular state. Until now, absolute protein concentration measurements of a significant fraction of the proteome (73%) have only been derived from genetically altered S. cerevisiae cells 1, a technique that is not directly portable from yeast to other species. In this study we developed and applied a mass spectrometry based strategy to determine the absolute quantity i.e. the average number of protein copies per cell in a cell population, for a significant fraction of the proteome in genetically unperturbed cells. Applying the technology to the human pathogen Leptospira interrogans, a spirochete responsible for Leptospirosis 4, we generated an absolute protein abundance scale for 83% of the mass spectrometry detectable proteome, from cells at different states. Taking advantage of the unique cellular dimensions of L. interrogans, we used cryo electron tomography (cryoET) morphological measurements to verify at the single cell level the average absolute abundance values of selected proteins determined by mass spectrometry on a population of cells. As the strategy is relatively fast and applicable to any cell type we expect that it will become a cornerstone of quantitative biology and systems biology.
PMCID: PMC2723184  PMID: 19606093
5.  Estimation of Absolute Protein Quantities of Unlabeled Samples by Selected Reaction Monitoring Mass Spectrometry* 
Molecular & Cellular Proteomics : MCP  2011;11(3):M111.013987.
For many research questions in modern molecular and systems biology, information about absolute protein quantities is imperative. This information is needed, for example, for kinetic modeling of processes, protein turnover determinations, stoichiometric investigations of protein complexes, or quantitative comparisons of different proteins within one sample or across samples. To date, the vast majority of proteomic studies are limited to providing relative quantitative comparisons of protein levels between limited numbers of samples. Here we describe and demonstrate the utility of a targeted MS technique for the estimation of absolute protein abundance in unlabeled and nonfractionated cell lysates. The method is based on selected reaction monitoring (SRM) mass spectrometry and the “best flyer” hypothesis, which assumes that the specific MS signal intensity of the most intense tryptic peptides per protein is approximately constant throughout a whole proteome. SRM-targeted best flyer peptides were selected for each protein from the peptide precursor ion signal intensities from directed MS data. The most intense transitions per peptide were selected from full MS/MS scans of crude synthetic analogs. We used Monte Carlo cross-validation to systematically investigate the accuracy of the technique as a function of the number of measured best flyer peptides and the number of SRM transitions per peptide. We found that a linear model based on the two most intense transitions of the three best flying peptides per protein (TopPep3/TopTra2) generated optimal results with a cross-correlated mean fold error of 1.8 and a squared Pearson coefficient R2 of 0.88. Applying the optimized model to lysates of the microbe Leptospira interrogans, we detected significant protein abundance changes of 39 target proteins upon antibiotic treatment, which correlate well with literature values.
The described method is generally applicable and exploits the inherent performance advantages of SRM, such as high sensitivity, selectivity, reproducibility, and dynamic range, and estimates absolute protein concentrations of selected proteins at minimized costs.
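The TopPep3/TopTra2 idea described above can be sketched as follows. The sketch makes several assumptions that the abstract does not state: that transition intensities are summed per peptide and peptide signals are summed per protein, that the linear model is fitted on log10-transformed values against proteins of known abundance, and that fold error is the larger of predicted/actual and actual/predicted. All function names and example numbers are hypothetical.

```python
import math


def protein_signal(peptide_transitions, top_pep=3, top_tra=2):
    """Combine SRM intensities into one signal per protein.

    peptide_transitions maps a peptide name to its list of transition intensities.
    Assumption: sum the top_tra most intense transitions per peptide, then sum
    the top_pep most intense peptides.
    """
    peptide_sums = [
        sum(sorted(intensities, reverse=True)[:top_tra])
        for intensities in peptide_transitions.values()
    ]
    return sum(sorted(peptide_sums, reverse=True)[:top_pep])


def fit_log_linear(signals, abundances):
    """Ordinary least squares fit of log10(abundance) = slope * log10(signal) + intercept."""
    xs = [math.log10(s) for s in signals]
    ys = [math.log10(a) for a in abundances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_abundance(signal, slope, intercept):
    """Estimate absolute abundance from a protein signal using the fitted line."""
    return 10 ** (slope * math.log10(signal) + intercept)


def mean_fold_error(predicted, actual):
    """Mean of max(p/a, a/p) over all proteins (assumed definition of fold error)."""
    return sum(max(p / a, a / p) for p, a in zip(predicted, actual)) / len(actual)
```

A calibration set of proteins with known copy numbers would be passed to `fit_log_linear`, and the resulting slope and intercept applied to the remaining proteins with `predict_abundance`; repeating the fit on random splits of the calibration set would correspond to the Monte Carlo cross-validation mentioned in the abstract.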
PMCID: PMC3316728  PMID: 22101334
6.  Proteome-wide systems analysis of a cellulosic biofuel-producing microbe 
We apply mass spectrometry-based ReDi proteomics to quantify the Clostridium phytofermentans proteome during fermentation of cellulosic substrates. ReDi proteomics gives accurate, low-cost quantification of an extra- and intracellular microbial proteome. When combined with physiological measurements, these methods form a general systems biology strategy to evaluate the efficiency of cellulosic bioconversion and to identify enzyme targets to engineer for improving this process. C. phytofermentans expressed more than 100 carbohydrate-active enzymes, of which distinct subsets were upregulated on cellulose and hemicellulose. Numerous extracellular enzymes cleave insoluble plant polysaccharides into oligosaccharides, which are transported into the cell to be further degraded by intracellular carbohydratases. Sugars are catabolized by EMP glycolysis incorporating alternative glycolytic enzymes to maximize the ATP yield of anaerobic metabolism. During cellulosic fermentation, cells adhered to the substrate and altered metabolic processes such as upregulation of tryptophan and nicotinamide synthesis proteins and repression of proteins for fatty acid metabolism and cell motility. These diverse metabolic changes highlight how a systems approach can identify novel ways to optimize cellulosic fermentation.
Cellulose is the world's most abundant renewable, biological energy source (Leschine, 1995). Microbial fermentation of cellulosic biomass could sustainably provide enough ethanol for 65% of US ground transportation fuel at current levels (Somerville, 2006). However, cellulose in plant biomass is packaged into a crystalline matrix, making biomass deconstruction a key roadblock to using it as a feedstock (Houghton et al, 2006). A promising strategy to overcome biomass recalcitrance is consolidated bioprocessing (Lynd et al, 2002), which uses microbes such as Clostridium phytofermentans to both secrete enzymes to depolymerize biomass and then ferment the resulting hexose and pentose sugars to a biofuel such as ethanol. The C. phytofermentans genome encodes 161 carbohydrate-active enzymes (CAZy) including 108 glycoside hydrolases spread across 39 families (Cantarel et al, 2009), highlighting the elaborate set of enzymes needed to break down different cellulosic polysaccharides. Faced with the complexity of metabolizing biomass, systems biology strategies are needed to comprehensively identify which cellulolytic and metabolic enzymes are used to ferment different cellulosic substrates.
This study presents a systems-level analysis of how C. phytofermentans ferments different cellulosic substrates that incorporates quantitative mass spectrometry-based proteomics of over 2500 proteins. Protein concentrations within each carbon source treatment were calculated by machine learning-supported spectral counting (Absolute Protein EXpression, APEX) (Lu et al, 2007). Protein levels on hemicellulose and cellulose relative to glucose were determined using reductive methylation (Hsu et al, 2003; Boersema et al, 2009), here called ReDi labeling, to chemically incorporate hydrogen or deuterium isotopes at lysines and N-terminal amines of tryptic peptides. We show that ReDi proteomics gives accurate, low-cost quantification of a microbial proteome and can be used to discern extracellular proteins. Further, we combine these quantitative proteomics with detailed measurements of growth, biomass consumption, fermentation product analyses, and electron microscopy. Together, these methods form a general strategy to evaluate the efficiency of cellulosic bioconversion and to identify enzyme targets to engineer for improving this process (Figure 1).
We found that fermentation of cellulosic substrates by C. phytofermentans involves secretion of numerous CAZy as well as proteins for binding of extracellular solutes, proteolysis, and motility. The most highly expressed protein in the proteome is a secreted protein that appears to compose a surface layer to support the cell and anchor cell surface proteins, including some enzymes for plant degradation. Once the secreted CAZy cleave insoluble plant polysaccharides into oligosaccharides, they are taken into the cell to be further degraded by intracellular CAZy, enabling more efficient sugar transport, conserving energy by phosphorolytic cleavage, and ensuring the sugar monomers are not available to competing microbes. Sugars are catabolized by EMP glycolysis incorporating reversible, PPi-dependent glycolytic enzymes, and pyruvate ferredoxin oxidoreductase. The genome encodes seven alcohol dehydrogenases, among which two iron-dependent enzymes are highly expressed and likely facilitate the high ethanol yields. Growth on cellulose also resulted in indirect changes such as increased tryptophan and nicotinamide synthesis and repression of fatty acid synthesis. We distilled the data into a model showing the highly expressed enzymes enabling efficient cellulosic fermentation by C. phytofermentans (Figure 7). Collectively, these data help us understand how bacteria recycle plant biomass and work towards enabling the use of plant biomass as a low-cost chemical feedstock.
Fermentation of plant biomass by microbes like Clostridium phytofermentans recycles carbon globally and can make biofuels from inedible feedstocks. We analyzed C. phytofermentans fermenting cellulosic substrates by integrating quantitative mass spectrometry of more than 2500 proteins with measurements of growth, enzyme activities, fermentation products, and electron microscopy. Absolute protein concentrations were estimated using Absolute Protein EXpression (APEX); relative changes between treatments were quantified with chemical stable isotope labeling by reductive dimethylation (ReDi). We identified the different combinations of carbohydratases used to degrade cellulose and hemicellulose, many of which were secreted based on quantification of supernatant proteins, as well as the repertoires of glycolytic enzymes and alcohol dehydrogenases (ADHs) enabling ethanol production at near maximal yields. Growth on cellulose also resulted in diverse changes such as increased expression of tryptophan synthesis proteins and repression of proteins for fatty acid metabolism and cell motility. This study gives a systems-level understanding of how this microbe ferments biomass and provides a rational, empirical basis to identify engineering targets for industrial cellulosic fermentation.
PMCID: PMC3049413  PMID: 21245846
bioenergy; clostridium; proteomics
7.  LC–MS Based Detection of Differential Protein Expression 
While several techniques are available in proteomics, LC-MS based analysis of complex protein/peptide mixtures has become a mainstream analytical technique for quantitative proteomics. Significant technical advances at both the sample preparation/separation and mass spectrometry levels have revolutionized comprehensive proteome analysis. Moreover, automation and robotics for sample handling permit multiple sampling with high throughput.
For LC-MS based quantitative proteomics, sample preparation is a critical step, as it can significantly influence the sensitivity of downstream analysis. Several sample preparation strategies exist, including depletion of high-abundance proteins or enrichment steps that facilitate protein quantification, but at the cost of focusing on a smaller subset of the proteome. While several experimental strategies have emerged, certain limitations, such as the physicochemical properties of a peptide/protein, protein turnover in a sample, and the analytical platform used for sample analysis and data processing, still pose challenges to quantitative proteomics. Other aspects that make analysis of a proteome challenging include the dynamic nature of a proteome, the need for efficient and fast analysis of proteins because they are constantly modified inside a cell, a concentration range of proteins that exceeds the dynamic range of any single analytical method, and the absence of appropriate bioinformatics tools for analysis of large-volume, high-dimensional data.
This paper gives an overview of various LC-MS methods currently used in quantitative proteomics and their potential for detecting differential protein expression. Fundamental steps such as sample preparation, LC separation, mass spectrometry, quantitative assessment and protein identification are discussed.
For quantitative assessment of protein expression, both label-based and label-free approaches are evaluated for their merits and demerits. While most of these methods provide only “relative abundance” information, absolute quantification is possible but limited to fewer proteins. Isotope labeling is extensively used for quantifying differentially expressed proteins, but it depends critically on successful incorporation of the heavy label. Lengthy labeling protocols restrict the number of samples that can be labeled and processed. Alternatively, the label-free approach appears promising, as it can process many samples with any number of comparisons, but it requires reproducible experimental data.
PMCID: PMC2867618  PMID: 20473349
Liquid chromatography-mass spectrometry (LC-MS); Quantitative proteomics; Labeling; Label-free; Tandem mass spectrometry (MS/MS)
8.  Comparative Shotgun Proteomics Using Spectral Count Data and Quasi-Likelihood Modeling 
Journal of Proteome Research  2010;9(8):4295-4305.
Shotgun proteomics provides the most powerful analytical platform for global inventory of complex proteomes using liquid chromatography−tandem mass spectrometry (LC−MS/MS) and allows a global analysis of protein changes. Nevertheless, sampling of complex proteomes by current shotgun proteomics platforms is incomplete, and this contributes to variability in assessment of peptide and protein inventories by spectral counting approaches. Thus, shotgun proteomics data pose challenges in comparing proteomes from different biological states. We developed an analysis strategy using quasi-likelihood Generalized Linear Modeling (GLM), included in a graphical interface software package (QuasiTel) that reads standard output from protein assemblies created by IDPicker, an HTML-based user interface to query shotgun proteomic data sets. This approach was compared to four other statistical analysis strategies: Student t test, Wilcoxon rank test, Fisher’s Exact test, and Poisson-based GLM. We analyzed the performance of these tests to identify differences in protein levels based on spectral counts in a shotgun data set in which equimolar amounts of 48 human proteins were spiked at different levels into whole yeast lysates. Both GLM approaches and the Fisher Exact test performed adequately, each with their unique limitations. We subsequently compared the proteomes of normal tonsil epithelium and HNSCC using this approach and identified 86 proteins with differential spectral counts between normal tonsil epithelium and HNSCC. We selected 18 proteins from this comparison for verification of protein levels between the individual normal and tumor tissues using liquid chromatography−multiple reaction monitoring mass spectrometry (LC−MRM-MS). This analysis confirmed the magnitude and direction of the protein expression differences in all 6 proteins for which reliable data could be obtained. 
Our analysis demonstrates that shotgun proteomic data sets from different tissue phenotypes are sufficiently rich in quantitative information and that statistically significant differences in protein spectral counts reflect the underlying biology of the samples.
Shotgun proteomics provides the most powerful analytical platform for global inventory of complex proteomes but incomplete sampling poses challenges in comparing protein inventories by spectral counting approaches. We developed a statistical method based on quasi-likelihood modeling and demonstrate that it compares favorably to other statistical tests. Statistically significant spectral count differences were confirmed by MRM demonstrating that the observed protein level differences reflect the underlying biology of the samples.
PMCID: PMC2920032  PMID: 20586475
LC−MS/MS; shotgun proteomics; multiple reaction monitoring (MRM); head and neck carcinoma; Generalized Linear Model; spectral counting
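As a rough illustration of one of the comparison tests named in the abstract above, the following minimal sketch applies a two-sided Fisher's Exact test to the spectral counts of a single protein in two samples. It is not the QuasiTel implementation and does not cover the quasi-likelihood GLM; the function names and the 2x2 table layout (spectra for the protein versus all other spectra, per sample) are assumptions made for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def spectral_count_test(count_a, total_a, count_b, total_b):
    """Compare one protein's spectral counts between samples A and B.

    count_*: spectra assigned to the protein (hypothetical inputs)
    total_*: all identified spectra in that sample
    """
    return fisher_exact_two_sided(
        count_a, total_a - count_a, count_b, total_b - count_b
    )


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(spectral_count_test(30, 10000, 10, 10000))
```

Because this test treats every spectrum as an independent draw, it ignores variability between replicates, which is the kind of limitation the abstract alludes to.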
9.  Comprehensive quantitative analysis of central carbon and amino-acid metabolism in Saccharomyces cerevisiae under multiple conditions by targeted proteomics 
With a targeted proteomic approach, we could quantify 90% of the enzymes involved in carbon and amino-acid metabolism in yeast, including complete isoenzyme families, throughout a set of different metabolic states. The data, interpreted through flux balance modeling, indicate that S. cerevisiae expresses enzymes, not necessarily used in a particular metabolic condition. For many isoenzymes our data suggest functional diversification, which might explain their parallel presence in the S. cerevisiae genome.
The metabolic network in the yeast Saccharomyces cerevisiae has been very well characterized in terms of components and topology. The adaptation of metabolism to changing nutritional conditions, in contrast, is much less well understood.
In this study, we exploited quantitative proteomic assays based on selected reaction monitoring (SRM) mass spectrometry to comprehensively analyze the set of enzymes involved in carbon and amino-acid metabolism in yeast (Figure 1), throughout a set of different metabolic states. To elucidate how this metabolic network of proteins adapts to environmental challenges, we chose five nutritional conditions resulting in maximal difference in magnitude and direction of metabolic fluxes. We could reproducibly detect and quantify across the different conditions, 90% of the targeted metabolic proteome, including complete families of isoenzymes, sharing up to 99.5% sequence identity and multi-subunit enzyme complexes. This yielded an information-rich proteomic data set that represents a nutritionally perturbed biological system with high coverage of its components.
Interpreted through flux balance modeling, the data indicate that S. cerevisiae expresses—at least at a basal level—more proteins than are actually necessary for sustaining a given metabolic condition. One potential explanation for the presence of non-necessary proteins is that these enzymes could realize immediate basal metabolic fluxes upon a change to new environmental conditions.
Next, we asked whether our data set could contribute to unraveling the function of isoenzymes in the metabolic set. Previously proposed roles for isoenzymes include redundancy to buffer against mutations, a means of adjusting gene dosage, or facilitation of evolutionary innovation and functional diversification. To address the role of isoenzymes in central metabolism, we used hierarchical clustering to analyze the abundance patterns of the metabolic proteins and their relationship to different functional classes and metabolic pathways. Interestingly, while subunits of the same protein complex preferably clustered in proximate branches, members of the same isoenzyme family often clustered in distant branches (Figure 5). The data therefore suggested functional diversification within most isoenzyme families and allowed us to propose different functions for divergent isoenzymes.
We expect that the comprehensive data set presented in this study will be an ideal blueprint for further developing models of yeast metabolism and for follow-up studies on the function of target metabolic proteins.
Decades of biochemical research have identified most of the enzymes that catalyze metabolic reactions in the yeast Saccharomyces cerevisiae. The adaptation of metabolism to changing nutritional conditions, in contrast, is much less well understood. As an important stepping stone toward such understanding, we exploit the power of proteomics assays based on selected reaction monitoring (SRM) mass spectrometry to quantify abundance changes of the 228 proteins that constitute the central carbon and amino-acid metabolic network in the yeast Saccharomyces cerevisiae, at five different metabolic steady states. Overall, 90% of the targeted proteins, including families of isoenzymes, were consistently detected and quantified in each sample, generating a proteomic data set that represents a nutritionally perturbed biological system at high reproducibility. The data set is near comprehensive because we detect 95–99% of all proteins that are required under a given condition. Interpreted through flux balance modeling, the data indicate that S. cerevisiae retains proteins not necessarily used in a particular environment. Further, the data suggest differential functionality for several metabolic isoenzymes.
PMCID: PMC3063691  PMID: 21283140
metabolism; S. cerevisiae; SRM; targeted proteomics
10.  EP6 Quantitative Proteomics 
There are numerous approaches to study the proteome in a quantitative manner. All rely heavily on optimized sample preparation and appropriate statistical analysis of resulting datasets. This session will cover the following aspects of quantitative proteomics approaches:
Quantitative profiling of the membrane proteome requires special considerations not addressed in typical mass-spectrometry analyses. Optimized sample preparation and separation strategies will be discussed in the context of enriched membrane fractions and a quantitative proteomics platform using stable isotopes. In shotgun proteomics, a complex protein mixture is first digested to peptides, which are then analyzed by a combination of nanoflow chromatography and tandem mass spectrometry. The effects of subtle changes in sample preparation and chromatographic conditions in the characterization of complex mixtures will be presented. A discovery-based mass spectrometry approach using a bench-top LTQ linear ion trap and in-house written software for label-free differential protein profiling will be presented. This approach is quite comprehensive and is compatible with even the most inexpensive mass spectrometers. For proteins not detected routinely using our discovery-based approaches, we have applied selected reaction monitoring using a TSQ Quantum Ultra. This approach has been used to identify and quantify proteins at the low ng/mL level in plasma without any prior fractionation. A software pipeline has been developed to go from hypothesized proteins of interest derived from the literature to predicted hSRM transitions, collision offsets, and predicted chromatographic retention times. The combination of both discovery- and hypothesis-driven proteomics using nanoflow separations and tandem mass spectrometry provides us with unparalleled sensitivity and dynamic range in characterizing complex mixtures. Spectrum counting is an appealing and relatively straightforward approach for quantitative proteomics. Since the spectrum count of a protein in a proteomic analysis is the total number of peptides, not just unique peptides, detected and identified for a given protein, searching criteria and false-positive minimization are important.
There are several different versions of spectral counting currently in use, but each approach has shared core characteristics. An additional important consideration for quantitative proteomic analysis is the use of replicates for statistical analysis and determining the proper statistical test to use based on the overall structure of the datasets. This presentation will describe the foundation of spectral counting and the modifications to this approach used by different researchers. In addition, selected examples of the biological implementation of these approaches will be described.
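The distinction drawn above between a protein's spectrum count (every identified peptide spectrum) and its number of unique peptides can be sketched as follows. This is only an illustrative example: the input format, a list of (protein, peptide) pairs with one pair per identified spectrum, and the function name are assumptions, not part of any particular software described in the session.

```python
from collections import Counter, defaultdict


def spectral_counts(psms):
    """Return {protein: (spectrum_count, unique_peptide_count)}.

    psms: iterable of (protein_id, peptide_sequence) pairs, one pair per
    identified spectrum (hypothetical input format).
    """
    total = Counter()            # every identified spectrum counts
    unique = defaultdict(set)    # distinct peptide sequences per protein
    for protein, peptide in psms:
        total[protein] += 1
        unique[protein].add(peptide)
    return {p: (total[p], len(unique[p])) for p in total}


if __name__ == "__main__":
    # Made-up identifications, for illustration only.
    example = [("P1", "AAK"), ("P1", "AAK"), ("P1", "GGR"), ("P2", "LLK")]
    print(spectral_counts(example))
```

In this sketch a peptide observed twice contributes two spectra but only one unique peptide, which is why false-positive identifications inflate spectrum counts directly.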
PMCID: PMC2292016
11.  A Mouse to Human Search for Plasma Proteome Changes Associated with Pancreatic Tumor Development 
PLoS Medicine  2008;5(6):e123.
The complexity and heterogeneity of the human plasma proteome have presented significant challenges in the identification of protein changes associated with tumor development. Refined genetically engineered mouse (GEM) models of human cancer have been shown to faithfully recapitulate the molecular, biological, and clinical features of human disease. Here, we sought to exploit the merits of a well-characterized GEM model of pancreatic cancer to determine whether proteomics technologies allow identification of protein changes associated with tumor development and whether such changes are relevant to human pancreatic cancer.
Methods and Findings
Plasma was sampled from mice at early and advanced stages of tumor development and from matched controls. Using a proteomic approach based on extensive protein fractionation, we confidently identified 1,442 proteins that were distributed across seven orders of magnitude of abundance in plasma. Analysis of proteins chosen on the basis of increased levels in plasma from tumor-bearing mice and corroborating protein or RNA expression in tissue documented concordance in the blood from 30 newly diagnosed patients with pancreatic cancer relative to 30 control specimens. A panel of five proteins selected on the basis of their increased level at an early stage of tumor development in the mouse was tested in a blinded study in 26 humans from the CARET (Carotene and Retinol Efficacy Trial) cohort. The panel discriminated pancreatic cancer cases from matched controls in blood specimens obtained between 7 and 13 mo prior to the development of symptoms and clinical diagnosis of pancreatic cancer.
Our findings indicate that GEM models of cancer, in combination with in-depth proteomic analysis, provide a useful strategy to identify candidate markers applicable to human cancer with potential utility for early detection.
Samir Hanash and colleagues identify proteins that are increased at an early stage of pancreatic tumor development in a mouse model and may be a useful tool in detecting early tumors in humans.
Editors' Summary
Cancers are life-threatening, disorganized masses of cells that can occur anywhere in the human body. They develop when cells acquire genetic changes that allow them to grow uncontrollably and to spread around the body (metastasize). If a cancer is detected when it is still small and has not metastasized, surgery can often provide a cure. Unfortunately, many cancers are detected only when they are large enough to press against surrounding tissues and cause pain or other symptoms. By this time, surgical removal of the original (primary) tumor may be impossible and there may be secondary cancers scattered around the body. In such cases, radiotherapy and chemotherapy can sometimes help, but the outlook for patients whose cancers are detected late is often poor. One cancer type for which late detection is a particular problem is pancreatic adenocarcinoma. This cancer rarely causes any symptoms in its early stages. Furthermore, the symptoms it eventually causes—jaundice, abdominal and back pain, and weight loss—are seen in many other illnesses. Consequently, pancreatic cancer has usually spread before it is diagnosed, and most patients die within a year of their diagnosis.
Why Was This Study Done?
If a test could be developed to detect pancreatic cancer in its early stages, the lives of many patients might be extended. Tumors often release specific proteins—“cancer biomarkers”—into the blood, a bodily fluid that can be easily sampled. If a protein released into the blood by pancreatic cancer cells could be identified, it might be possible to develop a noninvasive screening test for this deadly cancer. In this study, the researchers use a “proteomic” approach to identify potential biomarkers for early pancreatic cancer. Proteomics is the study of the patterns of proteins made by an organism, tissue, or cell and of the changes in these patterns that are associated with various diseases.
What Did the Researchers Do and Find?
The researchers started their search for pancreatic cancer biomarkers by studying the plasma proteome (the proteins in the fluid portion of blood) of mice genetically engineered to develop cancers that closely resemble human pancreatic tumors. Through the use of two techniques called high-resolution mass spectrometry and acrylamide isotopic labeling, the researchers identified 165 proteins that were present in larger amounts in plasma collected from mice with early and/or advanced pancreatic cancer than in plasma from control mice. Then, to test whether any of these protein changes were relevant to human pancreatic cancer, the researchers analyzed blood samples collected from patients with pancreatic cancer. These samples, they report, contained larger amounts of some of these proteins than blood collected from patients with chronic pancreatitis, a condition that has similar symptoms to pancreatic cancer. Finally, using blood samples collected during a clinical trial, the Carotene and Retinol Efficacy Trial (a cancer-prevention study), the researchers showed that the measurement of five of the proteins present in increased amounts at an early stage of tumor development in the mouse model discriminated between people with pancreatic cancer and matched controls up to 13 months before cancer diagnosis.
What Do These Findings Mean?
These findings suggest that in-depth proteomic analysis of genetically engineered mouse models of human cancer might be an effective way to identify biomarkers suitable for the early detection of human cancers. Previous attempts to identify such biomarkers using human samples have been hampered by the many noncancer-related differences in plasma proteins that exist between individuals and by problems in obtaining samples from patients with early cancer. The use of a mouse model of human cancer, these findings indicate, can circumvent both of these problems. More specifically, these findings identify a panel of proteins that might allow earlier detection of pancreatic cancer and that might, therefore, extend the life of some patients who develop this cancer. However, before a routine screening test becomes available, additional markers will need to be identified and extensive validation studies in larger groups of patients will have to be completed.
Additional Information.
Please access these Web sites via the online version of this summary at
The MedlinePlus Encyclopedia has a page on pancreatic cancer (in English and Spanish). Links to further information are provided by MedlinePlus
The US National Cancer Institute has information about pancreatic cancer for patients and health professionals (in English and Spanish)
The UK charity Cancerbackup also provides information for patients about pancreatic cancer
The Clinical Proteomic Technologies for Cancer Initiative (a US National Cancer Institute initiative) provides a tutorial about proteomics and cancer and information on the Mouse Proteomic Technologies Initiative
PMCID: PMC2504036  PMID: 18547137
12.  Cardiovascular Proteomics – Implications for Clinical Applications 
Clinics in laboratory medicine  2009;29(1):87-99.
Proteomics is fulfilling its potential and beginning to impact the diagnosis and therapy of cardiovascular disease. The field continues to develop, taking on new roles in both de novo discovery and targeted approaches. As de novo discovery – using mass spectrometry alone, or in combination with peptide or protein separation techniques – becomes a reality, more and more attention is being directed toward the field of cardiovascular serum/plasma biomarker discovery. With the advent of quantitative mass spectrometry, this focus is shifting from the basic accumulation of protein identifications within a sample to the elucidation of complex protein interactions. Despite technical advances, however, the absolute number of biomarkers thus far discovered by proteomics’ systems biology approach is small. Although several factors contribute to this lack, one step we must take is to build “translation teams” involving a close collaboration between researchers and clinicians.
Proteomics provides a snapshot of the proteome of a sample (or a subfraction/subproteome) at any given point in time. Any change in function is preceded by a change on the protein level. As this can be induced by the slightest alteration in the microenvironment of the protein (e.g. fluctuation in pH), the strength of proteomics to detect these changes is at the same time the weakness of the method in the unstable context of a clinical setting. In order to take cardiovascular proteomics from bench to bedside, great care must be taken to achieve reproducible results.
PMCID: PMC4013284  PMID: 19389553
13.  Structural and functional protein network analyses predict novel signaling functions for rhodopsin 
Proteomic analyses, literature mining, and structural data were combined to generate an extensive signaling network linked to the visual G protein-coupled receptor rhodopsin. Network analysis suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking.
Using a shotgun proteomic approach, we identified the protein inventory of the light sensing outer segment of the mammalian photoreceptor. These data, combined with literature mining, structural modeling, and computational analysis, offer a comprehensive view of signal transduction downstream of the visual G protein-coupled receptor rhodopsin. The network suggests novel signaling branches downstream of rhodopsin to cytoskeleton dynamics and vesicular trafficking. The network serves as a basis for elucidating physiological principles of photoreceptor function and suggests potential disease-associated proteins.
Photoreceptor cells are neurons capable of converting light into electrical signals. The rod outer segment (ROS) region of the photoreceptor cells is a cellular structure made of a stack of around 800 closed membrane disks loaded with rhodopsin (Liang et al, 2003; Nickell et al, 2007). In disc membranes, rhodopsin arranges itself into paracrystalline dimer arrays, enabling optimal association with the heterotrimeric G protein transducin as well as additional regulatory components (Ciarkowski et al, 2005). Disruption of these highly regulated structures and processes by germline mutations is the cause of severe blinding diseases such as retinitis pigmentosa, macular degeneration, or congenital stationary night blindness (Berger et al, 2010).
Traditionally, signal transduction networks have been studied by combining biochemical and genetic experiments addressing the relations among a small number of components. More recently, high-throughput experiments using techniques such as two-hybrid screening or co-immunoprecipitation coupled to mass spectrometry have added a new level of complexity (Ito et al, 2001; Gavin et al, 2002, 2006; Ho et al, 2002; Rual et al, 2005; Stelzl et al, 2005). However, these studies do not take into account space, time, or the fact that many of the interactions detected for a particular protein are mutually incompatible. Structural information can help discriminate between direct and indirect interactions and, more importantly, it can determine whether two or more predicted partners of any given protein or complex can bind a target simultaneously or instead compete for the same interaction surface (Kim et al, 2006).
In this work, we build a functional and dynamic interaction network centered on rhodopsin on a systems level, using six steps: In step 1, we experimentally identified the proteomic inventory of the porcine ROS, and we compared our data set with a recent proteomic study from bovine ROS (Kwok et al, 2008). The union of the two data sets was defined as the ‘initial experimental ROS proteome'. After removal of contaminants and applying filtering methods, a ‘core ROS proteome', consisting of 355 proteins, was defined.
In step 2, proteins of the core ROS proteome were assigned to six functional modules: (1) vision, signaling, transporters, and channels; (2) outer segment structure and morphogenesis; (3) housekeeping; (4) cytoskeleton and polarity; (5) vesicles formation and trafficking, and (6) metabolism.
In step 3, a protein-protein interaction network was constructed based on the literature mining. Since for most of the interactions experimental evidence was co-immunoprecipitation, or pull-down experiments, and in addition many of the edges in the network are supported by single experimental evidence, often derived from high-throughput approaches, we refer to this network, as ‘fuzzy ROS interactome'. Structural information was used to predict binary interactions, based on the finding that similar domain pairs are likely to interact in a similar way (‘nature repeats itself') (Aloy and Russell, 2002). To increase the confidence in the resulting network, edges supported by a single evidence not coming from yeast two-hybrid experiments were removed, exception being interactions where the evidence was the existence of a three-dimensional structure of the complex itself, or of a highly homologous complex. This curated static network (‘high-confidence ROS interactome') comprises 660 edges linking the majority of the nodes. By considering only edges supported by at least one evidence of direct binary interaction, we end up with a ‘high-confidence binary ROS interactome'. We next extended the published core pathway (Dell'Orco et al, 2009) using evidence from our high-confidence network. We find several new direct binary links to different cellular functional processes (Figure 4): the active rhodopsin interacts with Rac1 and the GTP form of Rho. There is also a connection between active rhodopsin and Arf4, as well as PDEδ with Rab13 and the GTP-bound form of Arl3 that links the vision cycle to vesicle trafficking and structure. We see a connection between PDEδ with prenyl-modified proteins, such as several small GTPases, as well as with rhodopsin kinase. Further, our network reveals several direct binary connections between Ca2+-regulated proteins and cytoskeleton proteins; these are CaMK2A with actinin, calmodulin with GAP43 and S1008, and PKC with 14-3-3 family members.
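The edge-curation rule described in step 3 can be sketched roughly as below. This is one reading of the text, not the authors' pipeline: an edge is kept if it has two or more pieces of evidence, or if its single piece of evidence is a yeast two-hybrid result or a three-dimensional structure of the complex (or of a highly homologous complex). The evidence labels, the choice of which evidence types count as direct binary, and the function names are all assumptions made for illustration.

```python
# Hypothetical evidence labels; the source does not define a vocabulary.
SINGLE_EVIDENCE_OK = {"yeast_two_hybrid", "structure", "homologous_structure"}
# Assumption: these evidence types are treated as direct binary evidence.
BINARY_TYPES = {"yeast_two_hybrid", "structure", "homologous_structure"}


def keep_edge(evidence):
    """Decide whether an edge stays in the high-confidence network.

    evidence: list of evidence-type strings supporting one interaction.
    """
    if len(evidence) >= 2:
        return True
    if len(evidence) == 1:
        return evidence[0] in SINGLE_EVIDENCE_OK
    return False


def high_confidence(edges):
    """Filter {(protein_a, protein_b): [evidence, ...]} with keep_edge."""
    return {pair: ev for pair, ev in edges.items() if keep_edge(ev)}


def binary_subnetwork(edges):
    """Keep edges with at least one piece of direct binary evidence."""
    return {
        pair: ev
        for pair, ev in edges.items()
        if any(e in BINARY_TYPES for e in ev)
    }


if __name__ == "__main__":
    # Placeholder proteins A-D and made-up evidence, for illustration only.
    fuzzy = {
        ("A", "B"): ["co_ip"],
        ("A", "C"): ["yeast_two_hybrid"],
        ("B", "C"): ["co_ip", "pull_down"],
        ("C", "D"): ["structure"],
    }
    curated = high_confidence(fuzzy)
    print(sorted(curated))
    print(sorted(binary_subnetwork(curated)))
```

Keeping the two filters separate mirrors the text, where the high-confidence interactome is defined first and the binary interactome is derived from it.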
In step 4, part of the network was experimentally validated using three different approaches to identify physical protein associations that would occur under physiological conditions: (i) co-segregation/co-sedimentation experiments, (ii) immunoprecipitations combined with mass spectrometry and/or subsequent immunoblotting, and (iii) utilizing the glycosylated N-terminus of rhodopsin to isolate its associated protein partners by Concanavalin A affinity purification. In total, 60 co-purification and co-elution experiments supported interactions that were already in our literature network, and new evidence from 175 co-IP experiments in this work was added. Next, we aimed to provide additional independent experimental confirmation for two of the novel networks and functional links proposed based on the network analysis: (i) the proposed complex between Rac1, RhoA, CRMP-2, tubulin, and ROCK II in ROS was investigated by culturing retinal explants in the presence of a ROCK II-specific inhibitor (Figure 6). While the morphology of the retinas treated with ROCK II inhibitor appeared normal, immunohistochemistry analyses revealed several alterations at the protein level. (ii) We supported the hypothesis that PDEδ could function as a GDI for Rac1 in ROS by demonstrating that PDEδ and Rac1 co-localize in ROS and that PDEδ could dissociate Rac1 from ROS membranes in vitro.
In step 5, we use structural information to distinguish between mutually compatible (‘AND') or excluded (‘XOR') interactions. This enables breaking a network of nodes and edges into functional machines or sub-networks/modules. In the vision branch, both ‘AND' and ‘XOR' gates synergize. This may allow dynamic tuning of light and dark states. However, all connections from the vision module to other modules are ‘XOR' connections suggesting that competition, in connection with local protein concentration changes, could be important for transmitting signals from the core vision module.
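The ‘AND'/‘XOR' distinction in step 5 can be illustrated with a small sketch. It assumes that each partner of a hub protein is annotated with the set of hub residues it contacts, and that two partners are mutually exclusive when those sets overlap. The residue-overlap criterion, the data layout, and the function name are assumptions chosen for illustration; the source states only that structural information was used to decide whether partners compete for the same interaction surface.

```python
from itertools import combinations


def classify_pairs(interfaces):
    """Label each pair of partners of one hub protein as 'AND' or 'XOR'.

    interfaces: {partner: set of hub residue numbers contacted}
    (hypothetical annotation). Overlapping interfaces mean the partners
    compete for the same surface ('XOR'); disjoint interfaces mean they
    could bind simultaneously ('AND').
    """
    labels = {}
    for p, q in combinations(sorted(interfaces), 2):
        labels[(p, q)] = "XOR" if interfaces[p] & interfaces[q] else "AND"
    return labels


if __name__ == "__main__":
    # Placeholder partners and residue numbers, for illustration only.
    hub_partners = {"X": {1, 2, 3}, "Y": {3, 4}, "Z": {10, 11}}
    for pair, label in classify_pairs(hub_partners).items():
        print(pair, label)
```

A real analysis would also need to consider steric clashes between partners that bind distinct surfaces, which this sketch does not model.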
In the last step, we map and functionally characterize the known mutations that produce blindness.
In summary, this represents the first comprehensive, dynamic, and integrative rhodopsin signaling network, which can be the basis for integrating and mapping newly discovered disease mutants, to guide protein or signaling branch-specific therapies.
Orchestration of signaling, photoreceptor structural integrity, and maintenance needed for mammalian vision remain enigmatic. By integrating three proteomic data sets, literature mining, computational analyses, and structural information, we have generated a multiscale signal transduction network linked to the visual G protein-coupled receptor (GPCR) rhodopsin, the major protein component of rod outer segments. This network was complemented by domain decomposition of protein–protein interactions and then qualified for mutually exclusive or mutually compatible interactions and ternary complex formation using structural data. The resulting information not only offers a comprehensive view of signal transduction induced by this GPCR but also suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking, predicting an important level of regulation through small GTPases. Further, it demonstrates a specific disease susceptibility of the core visual pathway due to the uniqueness of its components present mainly in the eye. As a comprehensive multiscale network, it can serve as a basis to elucidate the physiological principles of photoreceptor function, identify potential disease-associated genes and proteins, and guide the development of therapies that target specific branches of the signaling pathway.
PMCID: PMC3261702  PMID: 22108793
protein interaction network; rhodopsin signaling; structural modeling
14.  Quantitative proteomics comparison of arachnoid cyst fluid and cerebrospinal fluid collected perioperatively from arachnoid cyst patients 
There is little knowledge concerning the content and the mechanisms of filling of arachnoid cysts. The aim of this study was to compare the protein content of arachnoid cysts and cerebrospinal fluid by quantitative proteomics to increase the understanding of arachnoid cysts.
Arachnoid cyst fluid and cerebrospinal fluid from five patients were analyzed by quantitative proteomics in two separate experiments.
In the first, label-free experiment, arachnoid cyst fluid and cerebrospinal fluid samples from individual patients were trypsin digested and analyzed by Orbitrap mass spectrometry, followed by data analysis using the Progenesis software.
In the second proteomics experiment, a patient sample pooling strategy was followed by MARS-14 immunodepletion of high-abundance proteins, trypsin digestion, iTRAQ labelling, and peptide separation by mix-phase chromatography followed by Orbitrap mass spectrometry analysis. The results from these analyses were compared to previously published mRNA microarray data obtained from arachnoid membranes.
We quantified 348 proteins by the label-free individual patient approach and 1425 proteins in the iTRAQ experiment, which used arachnoid cyst fluid and cerebrospinal fluid pooled from five patients. This is by far the largest number of arachnoid cyst fluid proteins ever identified, and the first large-scale quantitative comparison between the protein content of arachnoid cyst fluid and cerebrospinal fluid from the same patients at the same time. Consistently across both experiments, we found 22 proteins with significantly increased abundance in arachnoid cysts compared to cerebrospinal fluid and 24 proteins with significantly decreased abundance. We did not observe any molecular weight gradient over the arachnoid cyst membrane. Of the 46 proteins we identified as differentially abundant in our study, 45 were also detected in the mRNA expression study. None of them were previously reported as differentially expressed. We did not quantify any of the proteins corresponding to gene products from the ten genes previously reported as differentially abundant between arachnoid cysts and control arachnoid membranes.
From our experiments, the protein content of arachnoid cyst fluid and cerebrospinal fluid appears to be similar. There were, however, proteins that were significantly differentially abundant between arachnoid cyst fluid and cerebrospinal fluid. This could reflect the possibility that these proteins are affected by the filling mechanism of arachnoid cysts or are shed from the membranes into arachnoid cyst fluid. Our results do not support the proposed filling mechanisms of oncotic pressure or valves.
PMCID: PMC3641952  PMID: 23628075
15.  Comprehensive analysis of the mouse renal cortex using two-dimensional HPLC – tandem mass spectrometry 
Proteome Science  2008;6:15.
Proteomic methodologies increasingly have been applied to the kidney to map the renal cortical proteome and to identify global changes in renal proteins induced by diseases such as diabetes. While progress has been made in establishing a renal cortical proteome using 1-D or 2-DE and mass spectrometry, the number of proteins definitively identified by mass spectrometry has remained surprisingly small. Low coverage of the renal cortical proteome as well as our interest in diabetes-induced changes in proteins found in the renal cortex prompted us to perform an in-depth proteomic analysis of mouse renal cortical tissue.
We report a large-scale analysis of the mouse renal cortical proteome using an SCX prefractionation strategy combined with HPLC – tandem mass spectrometry. High-confidence identification of ~2,000 proteins, including cytoplasmic, nuclear, plasma membrane, extracellular and unknown/unclassified proteins, was obtained by separating tryptic peptides of renal cortical proteins into 60 fractions by SCX prior to LC-MS/MS. The identified proteins represented the renal cortical proteome with no discernible bias due to protein physicochemical properties, subcellular distribution, biological processes, or molecular function. The highest ranked molecular functions were characteristic of tubular epithelium, and included binding, catalytic activity, transporter activity, structural molecule activity, and carrier activity. Comparison of this renal cortical proteome with published human urinary proteomes demonstrated enrichment of renal extracellular, plasma membrane, and lysosomal proteins in the urine, with a lack of intracellular proteins. Comparison of the most abundant proteins based on normalized spectral abundance factor (NSAF) in this dataset versus a published glomerular proteome indicated enrichment of mitochondrial proteins in the former and cytoskeletal proteins in the latter.
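The normalized spectral abundance factor used for this ranking is commonly defined as each protein's spectral count divided by its length, normalized by the sum of that quantity over all proteins in the run. A minimal Python sketch of that definition follows; the protein names, counts, and lengths are hypothetical illustration values, not data from the study.

```python
def nsaf(spectral_counts, lengths):
    """Return NSAF per protein: (SpC / L) divided by the sum of SpC / L over all proteins."""
    # Spectral abundance factor: spectral count normalized by protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


# Hypothetical illustration data (assumed, not taken from the abstract).
counts = {"protA": 120, "protB": 30, "protC": 30}
lengths = {"protA": 600, "protB": 150, "protC": 300}
values = nsaf(counts, lengths)
```

Because NSAF values sum to one within a run, they can be compared between datasets of different depth, which is what makes a ranking against a separately published proteome possible.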
A whole tissue extract of the mouse kidney cortex was analyzed by an unbiased proteomic approach, yielding a dataset of ~2,000 unique proteins identified with strict criteria to ensure a high level of confidence in protein identification. As a result of extracting all proteins from the renal cortex, we identified an exceptionally wide range of renal proteins in terms of pI, MW, hydrophobicity, abundance, and subcellular location. Many of these proteins, such as low-abundance proteins, membrane proteins and proteins with extreme values in pI or MW are traditionally under-represented in 2-DE-based proteomic analysis.
PMCID: PMC2412861  PMID: 18501002
16.  A study protocol for quantitative targeted absolute proteomics (QTAP) by LC-MS/MS: application for inter-strain differences in protein expression levels of transporters, receptors, claudin-5, and marker proteins at the blood–brain barrier in ddY, FVB, and C57BL/6J mice 
Proteomics has opened a new horizon in biological sciences. Global proteomic analysis is a promising technology for the discovery of thousands of proteins, post-translational modifications, polymorphisms, and molecular interactions in a variety of biological systems. The activities and roles of the identified proteins must also be elucidated, but this is complicated by the inability of conventional proteomic methods to yield quantitative information for protein expression. Thus, a variety of biological systems remain “black boxes”. Quantitative targeted absolute proteomics (QTAP) enables the determination of absolute expression levels (mol) of any target protein, including low-abundance functional proteins, such as transporters and receptors. Therefore, QTAP will be useful for understanding the activities and roles of individual proteins and their differences, including normal/disease, human/animal, or in vitro/in vivo. Here, we describe the study protocols and precautions for QTAP experiments including in silico target peptide selection, determination of peptide concentration by amino acid analysis, setup of selected/multiple reaction monitoring (SRM/MRM) analysis in liquid chromatography–tandem mass spectrometry, preparation of protein samples (brain capillaries and plasma membrane fractions) followed by the preparation of peptide samples, simultaneous absolute quantification of target proteins by SRM/MRM analysis, data analysis, and troubleshooting. An application of QTAP in biological sciences was introduced that utilizes data from inter-strain differences in the protein expression levels of transporters, receptors, tight junction proteins and marker proteins at the blood–brain barrier in ddY, FVB, and C57BL/6J mice. 
Among 18 molecules, 13 (abcb1a/mdr1a/P-gp, abcc4/mrp4, abcg2/bcrp, slc2a1/glut1, slc7a5/lat1, slc16a1/mct1, slc22a8/oat3, insr, lrp1, tfr1, claudin-5, Na+/K+-ATPase, and γ-gtp) were detected in the isolated brain capillaries, and their protein expression levels were within a range of 0.637-101 fmol/μg protein. The largest difference in the levels between the three strains was 2.2-fold for 13 molecules, although bcrp and mct1 displayed statistically significant differences between C57BL/6J and the other strain(s). Highly sensitive simultaneous absolute quantification achieved by QTAP will increase the usefulness of proteomics in biological sciences and is expected to advance the new research field of pharmacoproteomics (PPx).
PMCID: PMC3691662  PMID: 23758935
Quantitative targeted absolute proteomics (QTAP); Pharmacoproteomics (PPx); Absolute expression level; In silico peptide selection criteria; LC-MS/MS; Blood–brain barrier; Strain difference; Transporter; Receptor; Tight junction protein
17.  P208-S Top-Down Quantitative Proteomic Analysis Using a Highly Multiplexed Isobaric Mass Tagging Strategy 
Proteomic analysis has proved key to determining drug mechanisms and assessing toxicological potential during preclinical screening studies. A major goal in proteomics is to accurately measure changes in the relative abundance of large sets of proteins in complex biological systems as a function of experimental parameters, such as drug dose or exposure time. Until recently, top-down quantitative proteomics has been restricted to 2D gel analyses or two-plex mass tagging. A new top-down approach based upon isobaric mass tagging for highly multiplexed protein quantification is presented, involving chemically tagging cysteine or lysine residues of intact proteins isolated from cells, tissues, or biological fluids. As many as ten labeled samples are then combined, fractionated, proteolytically digested, and analyzed by gel electrophoresis or LC-MS/MS. Proteins are identified using public domain search engines, such as Mascot (Matrix Science Ltd., London, UK) and quantified using an in-house developed software package. During the fragmentation, the tag-labeled peptides generate a set of low-mass reporters that are unique to each sample. Measurement of the intensity of these reporters allows the relative quantification of the peptides, and consequently the proteins from which they originated. The capabilities of the approach are demonstrated by analysis of the HeLa cell nucleolar proteome after treatment with the metabolic inhibitor actinomycin D for various time periods. A total of 542 proteins were qualitatively identified, and 232 of these proteins were then unambiguously quantified. The quantification data demonstrate that the nucleolar proteome changes significantly over time in response to differences in growth conditions, which is consistent with previous observations from several groups.
The highly multiplexed and quantitative nature of the new technology should herald new opportunities to provide diagnostic and functional insights into the proteomics discovery process.
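The reporter-based quantification described in this abstract can be illustrated with a short sketch: each labeled sample contributes one reporter-ion intensity per peptide spectrum, and a protein's relative abundance per sample is summarized from its peptides. The in-house software mentioned in the abstract is not described, so the sketch below assumes a median-of-ratios summary against a reference channel; the function name, channel layout, and intensities are all hypothetical.

```python
from statistics import median


def protein_ratios(peptide_reporters, reference_channel=0):
    """Summarize peptide-level reporter intensities into per-channel protein ratios.

    peptide_reporters: list of per-peptide intensity lists, one value per channel.
    Returns, per channel, the median across peptides of intensity / reference intensity.
    """
    n_channels = len(peptide_reporters[0])
    per_channel = [[] for _ in range(n_channels)]
    for intensities in peptide_reporters:
        ref = intensities[reference_channel]
        if ref <= 0:
            continue  # skip peptides with no usable reference signal
        for channel, value in enumerate(intensities):
            per_channel[channel].append(value / ref)
    return [median(r) if r else None for r in per_channel]


# Hypothetical three-channel reporter intensities for three peptides of one protein.
peptides = [
    [1000.0, 2000.0, 500.0],
    [800.0, 1600.0, 400.0],
    [1200.0, 2400.0, 720.0],
]
ratios = protein_ratios(peptides)
```

The median is used here only because it tolerates a single outlying peptide; a sum or mean of intensities would be an equally reasonable assumption.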
PMCID: PMC2292037
18.  Quantitative Proteomics Targeting Classes of Motif-containing Peptides Using Immunoaffinity-based Mass Spectrometry* 
The development of high-performance technology platforms for generating detailed protein expression profiles, or protein atlases, is essential. Recently, we presented a novel platform that we termed global proteome survey, where we combined the best features of affinity proteomics and mass spectrometry, to probe any proteome in a species independent manner while still using a limited set of antibodies. We used so called context-independent-motif-specific antibodies, directed against short amino acid motifs. This enabled enrichment of motif-containing peptides from a digested proteome, which then were detected and identified by mass spectrometry. In this study, we have demonstrated the quantitative capability, reproducibility, sensitivity, and coverage of the global proteome survey technology by targeting stable isotope labeling with amino acids in cell culture-labeled yeast cultures cultivated in glucose or ethanol. The data showed that a wide range of motif-containing peptides (proteins) could be detected, identified, and quantified in a highly reproducible manner. On average, each of six different motif-specific antibodies was found to target about 75 different motif-containing proteins. Furthermore, peptides originating from proteins spanning in abundance from over a million down to less than 50 copies per cell, could be targeted. It is worth noting that a significant set of peptides previously not reported in the PeptideAtlas database was among the profiled targets. The quantitative data corroborated well with the corresponding data generated after conventional strong cation exchange fractionation of the same samples. Finally, several differentially expressed proteins, with both known and unknown functions, many relevant for the central carbon metabolism, could be detected in the glucose- versus ethanol-cultivated yeast. 
Taken together, the study demonstrated the potential of our immunoaffinity-based mass spectrometry platform for reproducible quantitative proteomics targeting classes of motif-containing peptides.
PMCID: PMC3412966  PMID: 22543061
19.  Proteomics and Mass Spectrometry Applications in Biomedical Research 
Proteomics and mass spectrometry have provided unprecedented tools for fast, accurate, high throughput biomolecular separation and characterization, which are indispensable for understanding biological and medical systems. Studying at the protein level allows researchers to investigate how proteins, their dynamics and modifications affect cellular processes and how cellular processes and the environment affect proteins. The mission of our facility is to provide excellent service and training in proteomics and mass spectrometry to UF scientists and students. Here we present our capabilities in proteomics and other analytical services. The tools include a gel-based 2D-DIGE (Two-Dimensional Difference Gel Electrophoresis) and gel-free iTRAQ (Isobaric Tags for Relative and Absolute Quantitation). Along with our capacity of separating thousands of proteins and characterizing differential protein expression, we have a suite of state-of-the-art mass spectrometers available for biomedical sciences and advanced technology research, including a tandem time-of-flight (4700 Proteomics Analyzer, AB), quadrupole/time-of-flight (QSTAR XL, AB), and hybrid quadrupole-linear ion-trap (4000 QTRAP, AB). These instruments are mainly used for protein identification, posttranslational modification characterization and protein expression analysis (e.g., Mass Western). Our facility is also set up to provide Edman de novo N-terminal protein sequence analysis and Biacore biomolecule interaction analysis. We are fully set up to synthesize and purify peptides and have a good track record with this service as well. Proteomics and mass spectrometry are useful in large-scale surveys of the proteome for hypothesis generation as well as in detailed analysis of target proteins for hypothesis testing. Our services also include accurate molecular weight analysis, MRM-based protein screening and targeted metabolite profiling.
To ensure success and maximize productivity, the facility offers education, consultation, data processing and reporting, and support of grant application.
PMCID: PMC2918203
20.  Proteomics and Mass Spectrometry Applications in Biomedical Research 
Proteomics and mass spectrometry have provided unprecedented tools for fast, accurate, high throughput biomolecular separation and characterization, which are indispensable for understanding biological and medical systems. Studying at the protein level allows researchers to investigate how proteins, their dynamics and modifications affect cellular processes and how cellular processes and the environment affect proteins. The mission of our facility is to provide excellent service and training in proteomics and mass spectrometry to UF scientists and students. Here we present our capabilities in proteomics and other analytical services. The tools include a gel-based 2D-DIGE (Two-Dimensional Difference Gel Electrophoresis) and gel-free iTRAQ (Isobaric Tags for Relative and Absolute Quantitation). Along with our capacity of separating thousands of proteins and characterizing differential protein expression, we have a suite of state-of-the-art mass spectrometers available for biomedical sciences and advanced technology research, including a tandem time-of-flight (4700 Proteomics Analyzer, AB), quadrupole/time-of-flight (QSTAR XL, AB), and hybrid quadrupole-linear ion-trap (4000 QTRAP, AB). These instruments are mainly used for protein identification, posttranslational modification characterization and protein expression analysis (e.g., Mass Western). Our facility is also set up to provide Edman de novo N-terminal protein sequence analysis and Biacore biomolecule interaction analysis. We are fully set up to synthesize and purify peptides and have a good track record with this service as well. Proteomics and mass spectrometry are useful in large-scale surveys of the proteome for hypothesis generation as well as in detailed analysis of target proteins for hypothesis testing. Our services also include accurate molecular weight analysis, MRM-based protein screening and targeted metabolite profiling.
To ensure success and maximize productivity, the facility offers education, consultation, data processing and reporting, and support of grant application.
PMCID: PMC3186580
21.  A complete mass spectrometric map for the analysis of the yeast proteome and its application to quantitative trait analysis 
Nature  2013;494(7436):266-270.
Complete reference maps or datasets, like the genomic map of an organism, are highly beneficial tools for biological and biomedical research. Attempts to generate such reference datasets for a proteome have so far failed to reach complete proteome coverage, with saturation apparent at approximately two thirds of the proteomes tested, even for the most thoroughly characterized proteomes. Here, we used a strategy based on high-throughput peptide synthesis and mass spectrometry to generate a close to complete reference map (97% of the genome-predicted proteins) of the S. cerevisiae proteome. We generated two versions of this mass spectrometric map, one supporting discovery-driven (shotgun) and the other hypothesis-driven (targeted) proteomic measurements. The two versions of the map, therefore, constitute a complete set of proteomic assays to support most studies performed with contemporary proteomic technologies. The reference libraries can be browsed via a web-based repository and associated navigation tools. To demonstrate the utility of the reference libraries we applied them to a protein quantitative trait locus (pQTL) analysis, which requires measurement of the same peptides over a large number of samples with high precision. Protein measurements over a set of 78 S. cerevisiae strains revealed a complex relationship between independent genetic loci, impacting on the levels of related proteins. Our results suggest that selective pressure favors the acquisition of sets of polymorphisms that maintain the stoichiometry of protein complexes and pathways.
PMCID: PMC3951219  PMID: 23334424
S. cerevisiae; selected reaction monitoring; SRM; MRM; spectral library; peptide library; mass spectrometric map; protein QTL
22.  Quantitative Shotgun Proteomics Using a Uniform 15N-Labeled Standard to Monitor Proteome Dynamics in Time Course Experiments Reveals New Insights into the Heat Stress Response of Chlamydomonas reinhardtii* 
Molecular & Cellular Proteomics : MCP  2011;10(9):M110.004739.
Crop-plant-yield safety is jeopardized by temperature stress caused by the global climate change. To take countermeasures by breeding and/or transgenic approaches it is essential to understand the mechanisms underlying plant acclimation to heat stress. To this end proteomics approaches are most promising, as acclimation is largely mediated by proteins. Accordingly, several proteomics studies, mainly based on two-dimensional gel-tandem MS approaches, were conducted in the past. However, results often were inconsistent, presumably attributable to artifacts inherent to the display of complex proteomes via two-dimensional-gels. We describe here a new approach to monitor proteome dynamics in time course experiments. This approach involves full 15N metabolic labeling and mass spectrometry based quantitative shotgun proteomics using a uniform 15N standard over all time points. It comprises a software framework, IOMIQS, that features batch job mediated automated peptide identification by four parallelized search engines, peptide quantification and data assembly for the processing of large numbers of samples. We have applied this approach to monitor proteome dynamics in a heat stress time course using the unicellular green alga Chlamydomonas reinhardtii as model system. We were able to identify 3433 Chlamydomonas proteins, of which 1116 were quantified in at least three of five time points of the time course. Statistical analyses revealed that levels of 38 proteins significantly increased, whereas levels of 206 proteins significantly decreased during heat stress. The increasing proteins comprise 25 (co-)chaperones and 13 proteins involved in chromatin remodeling, signal transduction, apoptosis, photosynthetic light reactions, and yet unknown functions. 
Proteins decreasing during heat stress were significantly enriched in functional categories that mediate carbon flux from CO2 and external acetate into protein biosynthesis, which also correlated with a rapid, but fully reversible cell cycle arrest after onset of stress. Our approach opens up new perspectives for plant systems biology and provides novel insights into plant stress acclimation.
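The filtering step described in this abstract (keeping only proteins quantified in at least three of five time points, with every sample measured against the same 15N-labeled standard) can be sketched as below. This is not the IOMIQS implementation; the data layout, function name, protein identifiers, and ratio values are hypothetical.

```python
def filter_time_course(ratios_by_protein, min_points=3):
    """Keep proteins whose 14N/15N ratio was measured at >= min_points time points.

    ratios_by_protein maps a protein ID to a list of ratios, one per time point,
    with None where the protein was not quantified.
    """
    kept = {}
    for protein, ratios in ratios_by_protein.items():
        measured = sum(r is not None for r in ratios)
        if measured >= min_points:
            kept[protein] = ratios
    return kept


# Hypothetical ratios over five time points; None marks a missing measurement.
data = {
    "HSP_example": [1.0, 1.8, 2.5, None, 3.1],
    "sparse_example": [0.9, None, None, None, 1.1],
}
kept = filter_time_course(data)
```

Because every time point is expressed relative to one uniform standard, the retained ratios are directly comparable along the time course without pairwise renormalization.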
PMCID: PMC3186191  PMID: 21610104
23.  The APEX Quantitative Proteomics Tool: Generating protein quantitation estimates from LC-MS/MS proteomics results 
BMC Bioinformatics  2008;9:529.
Mass spectrometry (MS) based label-free protein quantitation has mainly focused on analysis of ion peak heights and peptide spectral counts. Most analyses of tandem mass spectrometry (MS/MS) data begin with an enzymatic digestion of a complex protein mixture to generate smaller peptides that can be separated and identified by an MS/MS instrument. Peptide spectral counting techniques attempt to quantify protein abundance by counting the number of detected tryptic peptides and their corresponding MS spectra. However, spectral counting is confounded by the fact that peptide physicochemical properties severely affect MS detection resulting in each peptide having a different detection probability. Lu et al. (2007) described a modified spectral counting technique, Absolute Protein Expression (APEX), which improves on basic spectral counting methods by including a correction factor for each protein (called Oi value) that accounts for variable peptide detection by MS techniques. The technique uses machine learning classification to derive peptide detection probabilities that are used to predict the number of tryptic peptides expected to be detected for one molecule of a particular protein (Oi). This predicted spectral count is compared to the protein's observed MS total spectral count during APEX computation of protein abundances.
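The comparison of observed and predicted spectral counts can be written compactly. The sketch below follows the APEX formula as it is usually stated for Lu et al. (2007): the observed count n_i is weighted by the protein identification probability p_i, divided by the expected count O_i, normalized over all proteins, and scaled by a total-abundance constant C. The variable names, the example values, and the choice of C are assumptions for illustration, and the exact form should be checked against the original publication.

```python
def apex_abundances(observed, expected, id_prob, total_abundance=1.0):
    """Estimate abundances as (n_i * p_i / O_i) / sum_k(n_k * p_k / O_k) * C.

    observed: spectral counts n_i; expected: O_i values;
    id_prob: identification probabilities p_i; total_abundance: scaling constant C.
    """
    scores = {
        protein: observed[protein] * id_prob[protein] / expected[protein]
        for protein in observed
    }
    total = sum(scores.values())
    return {protein: total_abundance * s / total for protein, s in scores.items()}


# Hypothetical inputs (assumed values, not from the paper).
observed = {"protA": 50, "protB": 30, "protC": 28}
expected = {"protA": 5.0, "protB": 10.0, "protC": 2.0}
id_prob = {"protA": 1.0, "protB": 1.0, "protC": 0.5}
abundances = apex_abundances(observed, expected, id_prob, total_abundance=1000.0)
```

Note that protB has more spectra than protC yet receives a lower estimate, because its peptides are expected to be detected more readily; this is the correction that plain spectral counting lacks.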
The APEX Quantitative Proteomics Tool, introduced here, is a free open source Java application that supports the APEX protein quantitation technique. The APEX tool uses data from standard tandem mass spectrometry proteomics experiments and provides computational support for APEX protein abundance quantitation through a set of graphical user interfaces that partition the parameter controls for the various processing tasks. The tool also provides a Z-score analysis for identification of significant differential protein expression, a utility to assess APEX classifier performance via cross validation, and a utility to merge multiple APEX results into a standardized format in preparation for further statistical analysis.
The APEX Quantitative Proteomics Tool provides a simple means to quickly derive hundreds to thousands of protein abundance values from standard liquid chromatography-tandem mass spectrometry proteomics datasets. The APEX tool provides a straightforward intuitive interface design overlaying a highly customizable computational workflow to produce protein abundance values from LC-MS/MS datasets.
PMCID: PMC2639435  PMID: 19068132
24.  iTRAQ-based quantitative proteome and phosphoprotein characterization reveals the central metabolism changes involved in wheat grain development 
BMC Genomics  2014;15(1):1029.
Wheat (Triticum aestivum L.) is an economically important grain crop. Two-dimensional gel-based approaches are limited by the low identification rate of proteins and lack of accurate protein quantitation. The recently developed isobaric tag for relative and absolute quantitation (iTRAQ) method allows sensitive and accurate protein quantification. Here, we performed the first iTRAQ-based quantitative proteome and phosphorylated proteins analyses during wheat grain development.
The proteome profiles and phosphoprotein characterization of the metabolic proteins during grain development of the elite Chinese bread wheat cultivar Yanyou 361 were studied using the iTRAQ-based quantitative proteome approach, TiO2 microcolumns, and liquid chromatography-tandem mass spectrometry (LC-MS/MS). Among 1,146 non-redundant proteins identified, 421 showed at least 2-fold differences in abundance, and they were identified as differentially expressed proteins (DEPs), including 256 upregulated and 165 downregulated proteins. Of the 421 DEPs, six protein expression patterns were identified, most of which were up, down, and up-down expression patterns. The 421 DEPs were classified into nine functional categories mainly involved in different metabolic processes and located in the membrane and cytoplasm. Hierarchical clustering analysis indicated that the DEPs involved in starch biosynthesis, storage proteins, and defense/stress-related proteins significantly accumulated at the late grain development stages, while those related to protein synthesis/assembly/degradation and photosynthesis showed an opposite expression model during grain development. Quantitative real-time polymerase chain reaction (qRT-PCR) analysis of 12 representative genes encoding different metabolic proteins showed certain transcriptional and translational expression differences during grain development. Phosphorylated proteins analyses demonstrated that 23 DEPs such as AGPase, sucrose synthase, Hsp90, and serpins were phosphorylated in the developing grains and were mainly involved in starch biosynthesis and stress/defense.
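The at-least-2-fold criterion used here to call differentially expressed proteins can be sketched as a simple classification on abundance ratios. The protein names and ratios are hypothetical, and the sketch deliberately leaves out the statistical testing and replicate handling that a real iTRAQ analysis would include.

```python
def classify_deps(ratios, threshold=2.0):
    """Label proteins by fold change: 'up' if ratio >= threshold,
    'down' if ratio <= 1 / threshold, otherwise 'unchanged'."""
    labels = {}
    for protein, ratio in ratios.items():
        if ratio >= threshold:
            labels[protein] = "up"
        elif ratio <= 1.0 / threshold:
            labels[protein] = "down"
        else:
            labels[protein] = "unchanged"
    return labels


# Hypothetical late-stage / early-stage abundance ratios.
example = {
    "starch_synthase_like": 3.2,
    "ribosomal_like": 0.4,
    "housekeeping_like": 1.1,
}
labels = classify_deps(example)
```

Using the reciprocal of the threshold for down-regulation keeps the criterion symmetric on a log scale, so a 2-fold decrease is treated the same way as a 2-fold increase.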
Our results revealed a complex quantitative proteome and phosphorylation profile during wheat grain development. Numerous DEPs are involved in grain starch and protein syntheses as well as adverse defense, which set an important basis for wheat yield and quality. Particularly, some key DEPs involved in starch biosynthesis and stress/defense were phosphorylated, suggesting their roles in wheat grain development.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-1029) contains supplementary material, which is available to authorized users.
PMCID: PMC4301063  PMID: 25427527
Wheat; Grain proteome; iTRAQ; Phosphoproteins; qRT-PCR
25.  Novel aspects of grapevine response to phytoplasma infection investigated by a proteomic and phospho-proteomic approach with data integration into functional networks 
BMC Genomics  2013;14:38.
Translational and post-translational protein modifications play a key role in the response of plants to pathogen infection. Among the latter, phosphorylation is critical in modulating protein structure, localization and interaction with other partners. In this work, we used a multiplex staining approach with 2D gels to study quantitative changes in the proteome and phosphoproteome of Flavescence dorée-affected and recovered ‘Barbera’ grapevines, compared to healthy plants.
We identified 48 proteins that differentially changed in abundance, phosphorylation, or both in response to Flavescence dorée phytoplasma infection. Most of them did not show any significant difference in recovered plants, which, by contrast, were characterized by changes in abundance, phosphorylation, or both for 17 proteins not detected in infected plants. Some enzymes involved in the antioxidant response that were up-regulated in infected plants, such as isocitrate dehydrogenase and glutathione S-transferase, returned to healthy-state levels in recovered plants. Others belonging to the same functional category were even down-regulated in recovered plants (oxidoreductase GLYR1 and ascorbate peroxidase). Our proteomic approach thus agreed with previously published biochemical and RT-qPCR data which reported down-regulation of scavenging enzymes and accumulation of H2O2 in recovered plants, possibly suggesting a role for this molecule in remission from infection. Fifteen differentially phosphorylated proteins (| ratio | > 2, p < 0.05) were identified in infected compared to healthy plants, including proteins involved in photosynthesis, response to stress and the antioxidant system. Many were not differentially phosphorylated in recovered compared to healthy plants, pointing to their specific role in responding to infection, followed by a return to a steady-state phosphorylation level after remission of symptoms. Gene ontology (GO) enrichment and statistical analysis showed that the general main category “response to stimulus” was over-represented in both infected and recovered plants but, in the latter, the specific child category “response to biotic stimulus” was no longer found, suggesting a return to steady-state levels for those proteins specifically required for defence against pathogens.
Proteomic data were integrated into biological networks and their interactions were represented through a hypothetical model, showing the effects of protein modulation on primary metabolic pathways and related secondary pathways. By following a multiplex-staining approach, we obtained new data on grapevine proteome pathways that specifically change at the phosphorylation level during phytoplasma infection and following recovery, focusing for the first time on phosphoproteome changes during pathogen infection in this host.
PMCID: PMC3564869  PMID: 23327683
Vitis vinifera; Flavescence dorée; Recovery; 2-DE
