Motivation: The size and complex nature of mass spectrometry-based proteomics datasets motivate development of specialized software for statistical data analysis and exploration. We present DanteR, a graphical R package that features extensive statistical and diagnostic functions for quantitative proteomics data analysis, including normalization, imputation, hypothesis testing, interactive visualization and peptide-to-protein rollup. More importantly, users can easily extend the existing functionality by including their own algorithms under the Add-On tab.
Availability: DanteR and its associated user guide are available for download free of charge at http://omics.pnl.gov/software/. An updated binary of the DanteR package is available on our website, together with a vignette document. For Windows, a single click automatically installs DanteR along with the R programming environment. For Linux and Mac OS X, users must install R and then follow instructions on the DanteR website for package installation.
The cause of multiple sclerosis (MS), its driving pathogenesis at the earliest stages, and what factors allow the first clinical attack to manifest remain unknown. Some imaging studies suggest gray rather than white matter may be involved early, and some postulate this may be predictive of developing MS. Other imaging studies are in conflict. To determine if there was objective molecular evidence of gray matter involvement in early MS we used high-resolution mass spectrometry to identify proteins in the cerebrospinal fluid (CSF) of first-attack MS patients (two independent groups) compared to established relapsing remitting (RR) MS and controls. We found that the CSF proteins in first-attack patients were differentially enriched for gray matter components (axon, neuron, synapse). Myelin components did not distinguish these groups. The results support that gray matter dysfunction is involved early in MS, and also may be integral for the initial clinical presentation.
Selected reaction monitoring (SRM), also known as multiple reaction monitoring (MRM), has emerged as a promising high-throughput targeted protein quantification technology for candidate biomarker verification and systems biology applications. A major bottleneck for current SRM technology, however, is insufficient sensitivity, for example for detecting low-abundance biomarkers likely present at the low ng/mL to pg/mL range in human blood plasma or serum, or extremely low-abundance signaling proteins in cells or tissues. Herein we review recent advances in methods and technologies, including front-end immunoaffinity depletion, fractionation, selective enrichment of target proteins/peptides including posttranslational modifications (PTMs), as well as advances in MS instrumentation, which have significantly enhanced the overall sensitivity of SRM assays and enabled the detection of low-abundance proteins at low to sub-ng/mL levels in human blood plasma or serum. General perspectives on the potential of achieving sufficient sensitivity for detection of pg/mL level proteins in plasma are also discussed.
SRM; sensitivity; fractionation; ion funnel; enrichment
Differential ion mobility spectrometry (FAIMS) integrated with mass spectrometry (MS) is a powerful new tool for biological and environmental analyses. Large proteins occupy regions of FAIMS spectra distinct from peptides, lipids, or other medium-size biomolecules, likely because strong electric fields align huge dipoles common to macroions. Here we confirm this phenomenon in separations of proteins at extreme fields using FAIMS chips coupled to MS and demonstrate their use to detect even minor amounts of large proteins in complex matrices of smaller proteins and peptides.
Herbivores gain access to nutrients stored in plant biomass largely by harnessing the metabolic activities of microbes. Leaf-cutter ants of the genus Atta are a hallmark example; these dominant neotropical herbivores cultivate symbiotic fungus gardens on large quantities of fresh plant forage. As the external digestive system of the ants, fungus gardens facilitate the production and sustenance of millions of workers. Using metagenomic and metaproteomic techniques, we characterize the bacterial diversity and physiological potential of fungus gardens from two species of Atta. Our analysis of over 1.2 Gbp of community metagenomic sequence and three 16S pyrotag libraries reveals that in addition to harboring the dominant fungal crop, these ecosystems contain abundant populations of Enterobacteriaceae, including the genera Enterobacter, Pantoea, Klebsiella, Citrobacter and Escherichia. We show that these bacterial communities possess genes associated with lignocellulose degradation and diverse biosynthetic pathways, suggesting that they play a role in nutrient cycling by converting the nitrogen-poor forage of the ants into B-vitamins, amino acids and other cellular components. Our metaproteomic analysis confirms that bacterial glycosyl hydrolases and proteins with putative biosynthetic functions are produced in both field-collected and laboratory-reared colonies. These results are consistent with the hypothesis that fungus gardens are specialized fungus–bacteria communities that convert plant material into energy for their ant hosts. Together with recent investigations into the microbial symbionts of vertebrates, our work underscores the importance of microbial communities in the ecology and evolution of herbivorous metazoans.
leaf-cutter ants; symbiosis; Leucoagaricus gongylophorus; microbial consortia; Atta
The utility of mass spectrometry (MS)-based proteomic analyses and their clinical applications have been increasingly recognized over the past decade due to their high sensitivity, specificity and throughput. MS-based proteomic measurements have been used in a wide range of biological and biomedical investigations, including analysis of cellular responses and disease-specific post-translational modifications. These studies greatly enhance our understanding of the complex and dynamic nature of the proteome in biology and disease. Some MS techniques, such as those for targeted analysis, are being successfully applied for biomarker verification, whereas others, including global quantitative analysis (for example, for biomarker discovery), are more challenging and require further development. However, recent technological improvements in sample processing, instrumental platforms, data acquisition approaches and informatics capabilities continue to advance MS-based applications. Improving the detection of significant changes in proteins through these advances shows great promise for the discovery of improved biomarker candidates that can be verified pre-clinically using targeted measurements, and ultimately used in clinical studies - for example, for early disease diagnosis or as targets for drug development and therapeutic intervention. Here, we review the current state of MS-based proteomics with regard to its advantages and current limitations, and we highlight its translational applications in studies of protein biomarkers.
biomarker; clinical proteomics; ion mobility separations; mass spectrometry; multiple reaction monitoring; selected reaction monitoring; shotgun proteomics; targeted proteomics; translational proteomics
In classic work, Kuntz et al. (Proc. Nat. Acad. Sci. USA 1999, 96, 9997–10002) introduced the concept of ligand efficiency. Though that study focused primarily on drug-like molecules, it also showed that metal binding led to the greatest ligand efficiencies. Here, the physical limits of binding are examined across the wide variety of small molecules in the Binding MOAD database. The complexes with the greatest ligand efficiencies share the trait of being small, charged ligands bound in highly charged, well buried binding sites. The limit of ligand efficiency is −1.75 kcal/mol-atom for the protein-ligand complexes within Binding MOAD, and 95% of the set have efficiencies below a “soft limit” of −0.83 kcal/mol-atom. Based on buried molecular surface area, the hard limit of ligand efficiency is −117 cal/mol-Å², which is in surprising agreement with the limit of macromolecule-protein binding. Close examination of the most efficient systems reveals their incredibly high efficiency is dictated by tight contacts between the charged groups of the ligand and the pocket. In fact, a misfit of 0.24 Å in the average contacts inherently decreases the maximum possible efficiency by at least 0.1 kcal/mol-atom.
Ligand Efficiency; Maximum Binding Affinity; Protein-Ligand Binding; Electrostatics
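As an illustration of the quantity discussed above, the following is a minimal sketch that computes ligand efficiency as binding free energy divided by the number of non-hydrogen atoms and compares it with the two limits quoted in the abstract (−1.75 and −0.83 kcal/mol-atom). The function names, the conversion from a dissociation constant via ΔG = RT ln Kd (1 M standard state, 298.15 K), and the example ligands are assumptions made for illustration; they are not taken from the study.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # kcal/(mol*K)
HARD_LIMIT = -1.75  # kcal/mol-atom, value quoted in the abstract
SOFT_LIMIT = -0.83  # kcal/mol-atom, value quoted in the abstract


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


def ligand_efficiency(delta_g_kcal_per_mol, heavy_atoms):
    """Binding free energy per non-hydrogen atom (kcal/mol-atom)."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be a positive integer")
    return delta_g_kcal_per_mol / heavy_atoms


def compare_to_limits(efficiency):
    """Place an efficiency relative to the limits reported in the abstract."""
    if efficiency < HARD_LIMIT:
        return "beyond the reported hard limit"
    if efficiency < SOFT_LIMIT:
        return "beyond the soft limit"
    return "within the soft limit"


if __name__ == "__main__":
    # Hypothetical ligands: (Kd in mol/L, number of heavy atoms).
    for kd, atoms in [(1e-9, 30), (1e-6, 6)]:
        le = ligand_efficiency(delta_g_from_kd(kd), atoms)
        print(f"Kd={kd:g} M, {atoms} atoms: LE={le:.2f} ({compare_to_limits(le)})")
```

More negative values mean more efficient binding, so "below the soft limit" in the abstract corresponds to values that are less negative than −0.83 kcal/mol-atom.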
MS dissociation methods, including CID, HCD, and ETD, can each contribute distinct peptidome identifications using conventional peptide identification methods (Shen et al. J. Proteome Res. 2011), but such samples still pose significant informatics challenges. In this work, we explored utilization of high accuracy fragment ion mass measurements, in this case provided by FT MS/MS, to improve peptidome peptide dataset size and consistency relative to conventional descriptive and probabilistic scoring methods. For example, we identified 20–40% more peptides than SEQUEST, Mascot, and MS-GF scoring methods using high accuracy fragment ion information and the same FDR (e.g., <10 mass errors) from CID, HCD, and ETD spectra. Identified species covered >90% of the collective identifications obtained using various conventional peptide identification methods, which resolves the issue of different data analysis methods generating different peptide datasets. Choice of peptide dissociation and high-precision measurement-based identification methods presently available for degradomic-peptidomic analyses needs to be based on the coverage and confidence (or specificity) afforded by the method, as well as practical issues (e.g., throughput). By using accurate fragment information, >1000 peptidome peptides can be identified from a single human blood plasma sample with low peptide-level FDRs (e.g., 0.6%), providing an improved basis for investigating potential disease-related peptidome components.
FT MS/MS; CID; HCD; ETD; peptides; non-tryptic peptides; peptidome; degradome; merging of spectra; scoring of spectra
A multi-functional liquid chromatography system that performs 1-dimensional, 2-dimensional (strong cation exchange/reverse phase liquid chromatography, or SCX/RPLC) separations and online phosphopeptide enrichment using a single binary nano-flow pump has been developed. With a simple operation of a function selection valve equipped with an SCX column and a TiO2 (titanium dioxide) column, a fully automated selection of three different experiment modes was achieved. Because the current system uses essentially the same solvent flow paths, the same trap column, and the same separation column for reverse-phase separation of 1D, 2D, and online phosphopeptide enrichment experiments, the elution time information obtained from these experiments is in excellent agreement, which facilitates correlating peptide information from different experiments. The final reverse-phase separation of the three experiments is completely decoupled from all of the function selection processes; thereby, salts or acids from the SCX or TiO2 column do not affect the efficiency of the reverse-phase separation.
Cultures of the cyanobacterial genus Cyanothece have been shown to produce high levels of biohydrogen. These strains are diazotrophic and undergo pronounced diurnal cycles when grown under N2-fixing conditions in light-dark cycles. We seek to better understand the way in which proteins respond to these diurnal changes, and we performed quantitative proteome analysis of Cyanothece sp. strains ATCC 51142 and PCC 7822 grown under 8 different nutritional conditions. Nitrogenase expression was limited to N2-fixing conditions, and in the absence of glycerol, nitrogenase gene expression was linked to the dark period. However, glycerol induced expression of nitrogenase during part of the light period, together with cytochrome c oxidase (Cox), glycogen phosphorylase (Glp), and glycolytic and pentose phosphate pathway (PPP) enzymes. This indicated that nitrogenase expression in the light was facilitated via higher levels of respiration and glycogen breakdown. Key enzymes of the Calvin cycle were inhibited in Cyanothece ATCC 51142 in the presence of glycerol under H2-producing conditions, suggesting a competition between these sources of carbon. However, in Cyanothece PCC 7822, the Calvin cycle still played a role in cofactor recycling during H2 production. Our data comprise the first comprehensive profiling of proteome changes in Cyanothece PCC 7822 and allow an in-depth comparative analysis of major physiological and biochemical processes that influence H2 production in both strains. Our results revealed many previously uncharacterized proteins that may play a role in nitrogenase activity and in other metabolic pathways and may provide suitable targets for genetic manipulation that would lead to improvement of large-scale H2 production.
Background: In many parts of the world, livestock production is undergoing a process of rapid intensification. The health implications of this development are uncertain. Intensification creates cheaper products, allowing more people to access animal-based foods. However, some practices associated with intensification may contribute to zoonotic disease emergence and spread: for example, the sustained use of antibiotics, concentration of animals in confined units, and long distances and frequent movement of livestock.
Objectives: Here we present the diverse range of ecological, biological, and socioeconomic factors likely to enhance or reduce zoonotic risk, and identify ways in which a comprehensive risk analysis may be conducted by using an interdisciplinary approach. We also offer a conceptual framework to guide systematic research on this problem.
Discussion: We recommend that interdisciplinary work on zoonotic risk should take into account the complexity of risk environments, rather than limiting studies to simple linear causal relations between risk drivers and disease emergence and/or spread. In addition, interdisciplinary integration is needed at different levels of analysis, from the study of risk environments to the identification of policy options for risk management.
Conclusion: Given rapid changes in livestock production systems and their potential health implications at the local and global level, the problem we analyze here is of great importance for environmental health and development. Although we offer a systematic interdisciplinary approach to understand and address these implications, we recognize that further research is needed to clarify methodological and practical questions arising from the integration of the natural and social sciences.
emerging diseases; integrated ecology and human health; livestock production; risk characterization; risk management; zoonoses
Proteomics analysis identifies human serum proteins involved with innate immune responses, complement activation, and blood coagulation that are diagnostic for type 1 diabetes.
Using global liquid chromatography-mass spectrometry (LC-MS)–based proteomics analyses, we identified 24 serum proteins that were significantly variant between those with type 1 diabetes (T1D) and healthy controls. Functionally, these proteins represent innate immune responses, the activation cascade of complement, inflammatory responses, and blood coagulation. Targeted verification analyses were performed on 52 surrogate peptides representing these proteins, with serum samples from an antibody standardization program cohort of 100 healthy control and 50 type 1 diabetic subjects. 16 peptides were verified as having very good discriminating power, with areas under the receiver operating characteristic curve ≥0.8. Further validation with blinded serum samples from an independent cohort (10 healthy control and 10 type 1 diabetics) demonstrated that peptides from platelet basic protein and C1 inhibitor achieved both 100% sensitivity and 100% specificity for classification of samples. The disease specificity of these proteins was assessed using sera from 50 age-matched type 2 diabetic individuals, and a subset of proteins, C1 inhibitor in particular, were exceptionally good discriminators between these two forms of diabetes. The panel of biomarkers distinguishing those with T1D from healthy controls and those with type 2 diabetes suggests that dysregulated innate immune responses may be associated with the development of this disorder.
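The discriminating power reported above is expressed as the area under the receiver operating characteristic curve (AUC). A minimal sketch of that calculation follows, using the rank-based (Mann-Whitney) definition: the AUC is the fraction of case/control pairs in which the case has the higher value, with ties counted as one half. The function names and the abundance values are hypothetical and serve only to illustrate the metric; they do not reproduce the study's data or software.

```python
def roc_auc(case_values, control_values):
    """AUC as the probability that a random case exceeds a random control."""
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_values) * len(control_values))


def discriminating_power(case_values, control_values):
    """Direction-independent AUC: a peptide may be lower in cases."""
    auc = roc_auc(case_values, control_values)
    return max(auc, 1.0 - auc)


if __name__ == "__main__":
    # Hypothetical peptide abundances (arbitrary units).
    t1d = [3.0, 4.0, 5.0]
    healthy = [1.0, 2.0, 3.0]
    print(f"AUC = {roc_auc(t1d, healthy):.3f}")
```

An AUC of 1.0 corresponds to perfect separation of the two groups, which is the situation in which 100% sensitivity and 100% specificity can be achieved at a single threshold.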
The prevalence of diabetes mellitus is increasing dramatically throughout the world, and the disease has become a major public health issue. The most common form of the disease, type 2 diabetes, is characterized by insulin resistance and insufficient insulin production from the pancreatic beta-cell. Since glucose is the most potent regulator of beta-cell function under physiological conditions, identification of the insulin secretory defect underlying type 2 diabetes requires a better understanding of glucose regulation of human beta-cell function. To this aim, a bottom-up LC-MS/MS-based proteomics approach was used to profile pooled islets from multiple donors under basal (5 mM) or high (15 mM) glucose conditions. Our analysis discovered 256 differentially abundant proteins (~p<0.05) after 24 h of high glucose exposure from more than 4500 identified in total. Several novel glucose-regulated proteins were elevated under high glucose conditions, including regulators of mRNA splicing (Pleiotropic regulator 1), processing (Retinoblastoma binding protein 6), and function (Nuclear RNA export factor 1), in addition to Neuron navigator 1 and Plasminogen activator inhibitor 1. Proteins whose abundances markedly decreased during incubation at 15 mM glucose included Bax inhibitor 1 and Synaptotagmin-17. Up-regulation of Dicer 1 and SLC27A2 and down-regulation of Phospholipase Cβ4 were confirmed by Western blots. Many proteins found to be differentially abundant after high glucose stimulation are annotated as uncharacterized or hypothetical. These findings expand our knowledge of glucose regulation of the human islet proteome and suggest many hitherto unknown responses to glucose that require additional studies to explore novel functional roles.
human; pancreatic islet; glucose; type 2 diabetes; proteomics; mass spectrometry; LC-MS/MS
Francisella tularensis causes the zoonosis tularemia in humans and is one of the most virulent bacterial pathogens. We utilized a global proteomic approach to characterize protein changes in bronchoalveolar lavage fluid (BALF) from mice exposed to one of three organisms: F. tularensis ssp. novicida; an avirulent mutant of F. tularensis ssp. novicida (F.t. novicida-ΔmglA); and Pseudomonas aeruginosa. The composition of BALF proteins was altered following infection, including proteins involved in neutrophil activation, oxidative stress and inflammatory responses. Components of the innate immune response were induced, including the acute phase response and the complement system; however, the timing of their induction varied. Francisella tularensis ssp. novicida-infected mice do not appear to have an effective innate immune response in the first hours of infection; however, within 24 hours they show an upregulation of innate immune response proteins. This delayed response is in contrast to P. aeruginosa-infected animals, which show an early innate immune response. Likewise, F.t. novicida-ΔmglA infection initiates an early innate immune response; however, this response is diminished by 24 hours. Finally, this study identifies several candidate biomarkers, including Chitinase 3-like-1 (CHI3L1 or YKL-40) and peroxiredoxin 1, that are associated with F. tularensis ssp. novicida but not P. aeruginosa infection.
innate immunity; Francisella tularensis; proteomics; bronchoalveolar lavage fluid
To design a robust quantitative proteomics study, an understanding of both the inherent heterogeneity of the biological samples being studied and the technical variability of the proteomics methods and platform is needed. Additionally, accurately identifying the technical steps associated with the largest variability would provide valuable information for the improvement and design of future processing pipelines. We present an experimental strategy that allows for a detailed examination of the variability of the quantitative LC-MS proteomics measurements. By replicating analyses at different stages of processing, various technical components can be estimated and their individual contribution to technical variability can be dissected. This design can be easily adapted to other quantitative proteomics pipelines. Herein, we applied this methodology to our label-free workflow for the processing of human brain tissue. For this application, the pipeline was divided into four critical components: tissue dissection and homogenization (extraction), protein denaturation followed by trypsin digestion and SPE clean-up (digestion), short-term run-to-run instrumental response fluctuation (instrumental variance), and long-term drift of the quantitative response of the LC-MS/MS platform over the 2 week period of continuous analysis (instrumental stability). From this analysis, we found the following contributions to variability: extraction (72%) >> instrumental variance (16%) > instrumental stability (8.4%) > digestion (3.1%). Furthermore, the stability of the platform and its suitability for discovery proteomics studies are demonstrated.
Label-free quantification; technical variation; sample preparation; reproducibility; study design; tissue analysis
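The idea of replicating analyses at different processing stages to separate variance components can be sketched for the simplest case: a balanced two-level design in which each extraction replicate is injected several times. The method-of-moments estimator below is a standard one-way random-effects calculation and is an assumption for illustration, not the study's actual four-component model; the function names and the intensity values are hypothetical.

```python
from statistics import mean, variance


def nested_variance_components(groups):
    """Estimate between-group and within-group variance for a balanced design.

    `groups` is a list of lists: one inner list of repeated measurements
    (e.g. injections) per upstream replicate (e.g. extraction).
    """
    n = len(groups[0])
    if n < 2 or len(groups) < 2 or any(len(g) != n for g in groups):
        raise ValueError("need a balanced design with >=2 groups of >=2 values")
    within = mean([variance(g) for g in groups])
    group_means = [mean(g) for g in groups]
    # Variance of group means contains within/n; subtract it and clip at zero.
    between = max(variance(group_means) - within / n, 0.0)
    return between, within


def percent_contributions(components):
    """Convert a dict of variance components to percentages of the total."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: 100.0 * value / total for name, value in components.items()}


if __name__ == "__main__":
    # Hypothetical log-intensities: 3 extractions, 2 injections each.
    data = [[10.0, 10.2], [11.0, 11.2], [12.0, 12.2]]
    between, within = nested_variance_components(data)
    print(percent_contributions({"extraction": between, "instrument": within}))
```

Extending this to more stages requires replication at each stage, with each level's component obtained after removing the contribution of the levels below it.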
Motivation: Quantitative mass spectrometry-based proteomics involves statistical inference on protein abundance, based on the intensities of each protein's associated spectral peaks. However, typical MS-based proteomics datasets have substantial proportions of missing observations, due at least in part to censoring of low intensities. This complicates intensity-based differential expression analysis.
Results: We outline a statistical method for protein differential expression, based on a simple Binomial likelihood. By modeling peak intensities as binary, in terms of ‘presence/absence,’ we enable the selection of proteins not typically amenable to quantitative analysis; e.g. ‘one-state’ proteins that are present in one condition but absent in another. In addition, we present an analysis protocol that combines quantitative and presence/absence analysis of a given dataset in a principled way, resulting in a single list of selected proteins with a single associated false discovery rate.
Availability: All R code available here: http://www.stat.tamu.edu/~adabney/share/xuan_code.zip.
Supplementary data are available at Bioinformatics online.
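To make the presence/absence idea concrete, the sketch below compares how often a protein is observed in two conditions using a binomial likelihood-ratio test. This is a generic textbook test chosen for illustration and is not claimed to be the model in the authors' R code; the function names and counts are hypothetical, and the chi-square approximation used for the p-value is rough for the small sample sizes typical of proteomics experiments.

```python
import math


def _binomial_loglik(successes, trials, p):
    """Binomial log-likelihood without the constant combinatorial term."""
    loglik = 0.0
    if successes > 0:
        loglik += successes * math.log(p)
    if trials - successes > 0:
        loglik += (trials - successes) * math.log(1.0 - p)
    return loglik


def presence_absence_lrt(present_a, n_a, present_b, n_b):
    """Likelihood-ratio test for equal presence probability in two groups.

    Returns (statistic, p_value); the p-value uses the asymptotic
    chi-square distribution with one degree of freedom.
    """
    if not (0 <= present_a <= n_a and 0 <= present_b <= n_b) or min(n_a, n_b) < 1:
        raise ValueError("counts must satisfy 0 <= present <= n and n >= 1")
    pooled = (present_a + present_b) / (n_a + n_b)
    null = (_binomial_loglik(present_a, n_a, pooled)
            + _binomial_loglik(present_b, n_b, pooled))
    alternative = (_binomial_loglik(present_a, n_a, present_a / n_a)
                   + _binomial_loglik(present_b, n_b, present_b / n_b))
    statistic = max(2.0 * (alternative - null), 0.0)
    # Survival function of chi-square(1): P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical 'one-state' protein: seen in 5 of 5 runs vs 0 of 5 runs.
    print(presence_absence_lrt(5, 5, 0, 5))
```

In a full analysis, the resulting p-values would still need to be combined with the intensity-based results and corrected for multiple testing, which this sketch does not attempt.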
Biological networks are important for elucidating disease etiology due to their ability to model complex high dimensional data and biological systems. Proteomics provides a critical data source for such models, but currently lacks robust de novo methods for network construction, which could bring important insights in systems biology.
We have evaluated the construction of network models using methods derived from weighted gene co-expression network analysis (WGCNA). We show that approximately scale-free peptide networks, composed of statistically significant modules, are feasible and biologically meaningful using two mouse lung experiments and one human plasma experiment. Within each network, peptides derived from the same protein are shown to have a statistically higher topological overlap and concordance in abundance, which is potentially important for inferring protein abundance. The module representatives, called eigenpeptides, correlate significantly with biological phenotypes. Furthermore, within modules, we find significant enrichment for biological function and known interactions (gene ontology and protein-protein interactions).
Biological networks are important tools in the analysis of complex systems. In this paper we evaluate the application of weighted co-expression network analysis to quantitative proteomics data. Protein co-expression networks allow novel approaches for biological interpretation, quality control, inference of protein abundance, and biomarker signature discovery, and provide a framework for potentially resolving degenerate peptide-protein mappings.
Biomarkers; Biological networks; Networks; Systems biology; Virology; Sarcopenia; LC-MS; Proteomics
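The two core quantities of a weighted co-expression network, the soft-thresholded adjacency and the topological overlap, can be sketched as follows. The formulas are the standard unsigned WGCNA definitions (adjacency = |correlation|^β; overlap = (shared neighbors + direct link) / (min connectivity + 1 − direct link)). The pure-Python implementation, the default β = 6, and the toy peptide profiles are assumptions for illustration; module detection and eigenpeptide calculation are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("profiles must not be constant")
    return sxy / math.sqrt(sxx * syy)


def adjacency(profiles, beta=6):
    """Unsigned soft-threshold adjacency: |cor|**beta, zero on the diagonal."""
    n = len(profiles)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = abs(pearson(profiles[i], profiles[j])) ** beta
    return adj


def topological_overlap(adj):
    """Topological overlap matrix computed from an adjacency matrix."""
    n = len(adj)
    connectivity = [sum(row) for row in adj]  # diagonal entries are zero
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            denom = min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


if __name__ == "__main__":
    # Hypothetical peptide abundance profiles across four samples.
    peptides = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 2, 4]]
    for row in topological_overlap(adjacency(peptides)):
        print([round(value, 3) for value in row])
```

Raising correlations to a power suppresses weak correlations relative to strong ones, which is what allows the resulting network to approach scale-free topology for a suitable β.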
Differential ion mobility spectrometry (FAIMS) can baseline-resolve multiple variants of post-translationally modified peptides extending to the 3–4 kDa range, which differ in the localization of a PTM as small as acetylation. Essentially orthogonal separations for different charge states expand the total peak capacity in proportion to the number of observed states that increases for longer polypeptides. This might enable resolving localization variants for yet larger peptides and even intact proteins.
Our objective here was to perform a quantitative phosphoproteomic study on a reconstituted human skin tissue to identify low and high dose ionizing radiation dependent signaling in a complex 3-dimensional setting. Application of an isobaric labeling strategy using sham and 3 radiation doses (3, 10, 200 cGy) resulted in the identification of 1052 unique phosphopeptides. Statistical analyses identified 176 phosphopeptides showing significant changes in response to radiation and radiation dose. Proteins responsible for maintaining skin structural integrity including keratins and desmosomal proteins (desmoglein, desmoplakin, plakophilin 1, 2 and 3) had altered phosphorylation levels following exposure to both low and high doses of radiation. Altered phosphorylation of multiple sites in profilaggrin linker domains coincided with altered profilaggrin processing suggesting a role for linker phosphorylation in human profilaggrin regulation. These studies demonstrate that the reconstituted human skin system undergoes a coordinated response to both low and high doses of ionizing radiation involving multiple layers of the stratified epithelium that serve to maintain tissue integrity and mitigate effects of radiation exposure.
Ionizing Radiation; Skin; Phosphorylation
The generation of genome-scale data is becoming more routine, yet the subsequent analysis of omics data remains a significant challenge. Here, an approach that integrates multiple omics datasets with bioinformatics tools was developed that produces a detailed annotation of several microbial genomic features. This methodology was used to characterize the genome of Thermotoga maritima, a phylogenetically deep-branching, hyperthermophilic bacterium. Experimental data were generated for whole-genome resequencing, transcription start site (TSS) determination, transcriptome profiling, and proteome profiling. These datasets, analyzed in combination with bioinformatics tools, served as a basis for the improvement of gene annotation, the elucidation of transcription units (TUs), the identification of putative non-coding RNAs (ncRNAs), and the determination of promoters and ribosome binding sites. This revealed many distinctive properties of the T. maritima genome organization relative to other bacteria. This genome has a high number of genes per TU (3.3), a paucity of putative ncRNAs (12), and few TUs with multiple TSSs (3.7%). Quantitative analysis of promoters and ribosome binding sites showed increased sequence conservation relative to other bacteria. The 5′UTRs follow an atypical bimodal length distribution composed of “Short” 5′UTRs (11–17 nt) and “Common” 5′UTRs (26–32 nt). Transcriptional regulation is limited by a lack of intergenic space for the majority of TUs. Lastly, a high fraction of annotated genes are expressed independent of growth state and a linear correlation of mRNA/protein is observed (Pearson r = 0.63, p < 2.2×10⁻¹⁶, t-test). These distinctive properties are hypothesized to be a reflection of this organism's hyperthermophilic lifestyle and could yield novel insights into the evolutionary trajectory of microbial life on earth.
Genomic studies have greatly benefited from the advent of high-throughput technologies and bioinformatics tools. Here, a methodology integrating genome-scale data and bioinformatics tools is developed to characterize the genome organization of the hyperthermophilic, phylogenetically deep-branching bacterium Thermotoga maritima. This approach elucidates several features of the genome organization and enables comparative analysis of these features across diverse taxa. Our results suggest that the genome of T. maritima is reflective of its hyperthermophilic lifestyle. Ultimately, constraints imposed on the genome have negative impacts on regulatory complexity and phenotypic diversity. Investigating the genome organization of Thermotogae species will help resolve various causal factors contributing to the genome organization such as phylogeny and environment. Applying a similar analysis of the genome organization to numerous taxa will likely provide insights into microbial evolution.
A novel hydrodynamic injector that is directly controlled by a pneumatic valve has been developed for reproducible microchip capillary electrophoresis (CE) separations. The poly(dimethylsiloxane) (PDMS) devices used for evaluation comprise a separation channel, a side channel for sample introduction, and a pneumatic valve aligned at the intersection of the channels. A low pressure (≤ 3 psi) applied to the sample reservoir is sufficient to drive sample into the separation channel. The rapidly actuated pneumatic valve enables injection of discrete sample plugs as small as ~100 pL for CE separation. The injection volume can be easily controlled by adjusting the intersection geometry, the solution back pressure and the valve actuation time. Sample injection could be reliably operated at different frequencies (< 0.1 Hz to > 2 Hz) with good reproducibility (peak height relative standard deviation ≤ 3.6%) and no sampling biases associated with the conventional electrokinetic injections. The separation channel was dynamically coated with a cationic polymer, and FITC-labeled amino acids were employed to evaluate the CE separation. Highly efficient (≥ 7.0 × 10³ theoretical plates for the ~2.4 cm long channel) and reproducible CE separations were obtained. The demonstrated method has numerous advantages compared with the conventional techniques, including repeatable and unbiased injections, little sample waste, high duty cycle, controllable injected sample volume, and fewer electrodes with no need for voltage switching. The prospects of implementing this injection method for coupling multidimensional separations, for multiplexing CE separations and for sample-limited bioanalyses are discussed.
Hydrodynamic injection; Microchip electrophoresis; Microfluidics; Pneumatic valve; Repeatable injection
Orthogonal high-resolution separations are critical for attaining improved analytical dynamic range and protein coverage in proteomic measurements. High pH reversed-phase liquid chromatography (RPLC) followed by fraction concatenation affords better peptide analysis than conventional strong-cation exchange (SCX) chromatography applied for the two-dimensional proteomic analysis. For example, concatenated high pH reversed-phase liquid chromatography increased identification for peptides (1.8-fold) and proteins (1.6-fold) in shotgun proteomics analyses of a digested human protein sample. Additional advantages of high pH RPLC with fraction concatenation include improved protein sequence coverage, simplified sample processing, and reduced sample losses, making this an attractive alternative to SCX chromatography in conjunction with the second dimension low pH RPLC for two-dimensional proteomics analyses.
Two dimensional chromatographic separation; shotgun proteomics analysis; SCX; Fraction concatenation; High pH RP
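Fraction concatenation, as commonly practiced, pools first-dimension fractions taken at regular intervals across the gradient (early, middle and late eluting fractions go into the same pool) so that each pool spreads evenly over the second-dimension separation. The sketch below shows one such pooling scheme. The function name and the fraction and pool counts are hypothetical, and the interleaving rule is an assumption about the general technique rather than the exact protocol of this study.

```python
def concatenate_fractions(num_fractions, num_pools):
    """Assign 1-based fraction numbers to pools at regular intervals.

    Pool k receives fractions k, k + num_pools, k + 2 * num_pools, ...
    """
    if num_pools < 1 or num_fractions < num_pools:
        raise ValueError("need at least one pool and one fraction per pool")
    pools = [[] for _ in range(num_pools)]
    for index in range(num_fractions):
        pools[index % num_pools].append(index + 1)
    return pools


if __name__ == "__main__":
    # Hypothetical example: 12 collected fractions combined into 4 pools.
    for pool_number, members in enumerate(concatenate_fractions(12, 4), start=1):
        print(f"pool {pool_number}: fractions {members}")
```

Because the pooled fractions come from well-separated regions of the first-dimension gradient, peptides within a pool differ widely in retention and use the full second-dimension elution window.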
Sodium dodecyl sulfate (SDS) is one of the most popular laboratory reagents used for biological sample extraction; however, the presence of this reagent in samples challenges LC-MS-based proteomics analyses because it can interfere with reversed-phase LC separations and electrospray ionization. This study reports a simple SDS-assisted proteomics sample preparation method facilitated by a novel peptide-level SDS removal step. In an initial demonstration, SDS was effectively (>99.9%) removed from peptide samples through ion substitution-mediated DS⁻ precipitation using potassium chloride (KCl), and excellent peptide recovery (>95%) was observed for <20 μg peptides. Further experiments demonstrated the compatibility of this protocol with LC-MS/MS analyses. The resulting proteome coverage obtained for both mammalian tissues and bacterial samples was comparable to or better than that obtained for the same sample types prepared using standard proteomics preparation methods and analyzed using LC-MS/MS. These results suggest the SDS-assisted protocol is a practical, simple, and broadly applicable proteomics sample processing method, which can be particularly useful when dealing with samples difficult to solubilize by other methods.
SDS removal; KDS precipitation; proteomics; sample preparation; LC-MS
Cell signaling systems transmit information by post-translationally modifying signaling proteins, often via phosphorylation. While thousands of sites of phosphorylation have been identified in proteomic studies, the vast majority of sites have no known function. Assigning functional roles to the catalog of uncharacterized phosphorylation sites is a key research challenge. Here we present a general approach to address this challenge and apply it to a prototypical signaling pathway, the pheromone response pathway in Saccharomyces cerevisiae. The pheromone pathway includes a mitogen activated protein kinase (MAPK) cascade activated by a G-protein coupled receptor (GPCR). We used published mass spectrometry-based proteomics data to identify putative sites of phosphorylation on pheromone pathway components, and we used evolutionary conservation to assign priority to a list of candidate MAPK regulatory sites. We made targeted alterations in those sites, and measured the effects of the mutations on pheromone pathway output in single cells. Our work identified six new sites that quantitatively tuned system output. We developed simple computational models to find system architectures that recapitulated the quantitative phenotypes of the mutants. Our results identify a number of putative phosphorylation events that contribute to adjusting the input-output relationship of this model eukaryotic signaling system. We believe this combined approach constitutes a general means to reveal not only modification sites required to turn a pathway on and off, but also those required for more subtle quantitative effects that tune pathway output. Our results suggest that relatively small quantitative influences from individual phosphorylation events endow signaling systems with plasticity that evolution may exploit to quantitatively tailor signaling outcomes.