Peptide and protein identification via tandem mass spectrometry (MS/MS) lies at the heart of proteomic characterization of biological samples. Several algorithms are able to search, score, and assign peptides to large MS/MS datasets. Most popular methods, however, underutilize the intensity information available in the tandem mass spectrum due to the complex nature of the peptide fragmentation process, thus contributing to loss of potential identifications. We present a novel probabilistic scoring algorithm called Context-Sensitive Peptide Identification (CSPI) based on highly flexible Input-Output Hidden Markov Models (IO-HMM) that capture the influence of peptide physicochemical properties on their observed MS/MS spectra. We use several local and global properties of peptides and their fragment ions drawn from the literature. Comparison with two popular algorithms, Crux (a re-implementation of SEQUEST) and X!Tandem, on multiple datasets of varying complexity shows that peptide identification scores from our models achieve greater discrimination between true and false peptides, identifying up to ∼25% more peptides at a False Discovery Rate (FDR) of 1%. We evaluated two alternative normalization schemes for fragment ion intensities, one global and rank-based, the other local and window-based. Our results indicate the importance of appropriate normalization methods for learning superior models. Further, by combining our scores with Crux using a state-of-the-art procedure, Percolator, we demonstrate the utility of scoring features from intensity-based models, obtaining ∼4-8% additional identifications over Percolator at 1% FDR. IO-HMMs offer a scalable and flexible framework with several modeling choices to learn complex patterns embedded in MS/MS data.
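The difference between a global rank-based and a local window-based normalization of fragment ion intensities can be illustrated with a minimal sketch. The function names, the fixed 100 m/z window width, and the exact definitions below are assumptions made for illustration; the abstract does not specify how CSPI implements either scheme.

```python
def global_rank_normalize(intensities):
    """Replace each intensity by its rank over the whole spectrum, scaled to (0, 1].

    The most intense peak maps to 1.0. Assumed definition, for illustration only.
    """
    n = len(intensities)
    if n == 0:
        return []
    # Indices sorted from least to most intense peak.
    order = sorted(range(n), key=lambda i: intensities[i])
    ranks = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank / n
    return ranks


def local_window_normalize(mzs, intensities, window=100.0):
    """Divide each intensity by the maximum intensity in its fixed-width m/z bin.

    The bin width `window` is a hypothetical parameter.
    """
    max_per_bin = {}
    for mz, inten in zip(mzs, intensities):
        b = int(mz // window)
        max_per_bin[b] = max(max_per_bin.get(b, 0.0), inten)
    normalized = []
    for mz, inten in zip(mzs, intensities):
        top = max_per_bin[int(mz // window)]
        normalized.append(inten / top if top > 0 else 0.0)
    return normalized
```

In this sketch the global scheme compares every peak against the whole spectrum, whereas the local scheme lets a weak peak in a sparse m/z region score as highly as a strong peak in a crowded one.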
In mass spectrometry-based proteomics, data-independent acquisition (DIA) strategies can acquire a single dataset useful for both identification and quantification of detectable peptides in a complex mixture. Despite this, DIA is often overlooked because its data are noisier, the result of a typical five- to ten-fold reduction in precursor selectivity compared to data-dependent acquisition or selected reaction monitoring. We demonstrate a multiplexing technique that improves precursor selectivity five-fold.
Data Independent Acquisition; Q-Exactive; Multiplexing; Targeted Proteomics; Shotgun Proteomics
Seminal fluid plays an important role in successful fertilization, but knowledge of the full suite of proteins transferred from males to females during copulation is incomplete. The list of ejaculated proteins remains particularly scant in one of the best-studied mammalian systems, the house mouse (Mus domesticus), where artificial ejaculation techniques have proven inadequate. Here we investigate an alternative method for identifying ejaculated proteins, by isotopically labeling females with 15N and then mating them to unlabeled, vasectomized males. Proteins were then isolated from mated females and identified using mass spectrometry. In addition to gaining insights into possible functions and fates of ejaculated proteins, our study serves as proof of concept that isotopic labeling is a powerful means to study reproductive proteins.
We identified 69 male-derived proteins from the female reproductive tract following copulation. More than a third of all spectra detected mapped to just seven genes known to be structurally important in the formation of the copulatory plug, a hard coagulum that forms shortly after mating. Seminal fluid is significantly enriched for proteins that function in protection from oxidative stress and endopeptidase inhibition. Females, on the other hand, produce endopeptidases in response to mating. The 69 ejaculated proteins evolve significantly more rapidly than other proteins that we previously identified directly from dissection of the male reproductive tract.
Our study attempts to comprehensively identify the proteins transferred from males to females during mating, expanding the application of isotopic labeling to mammalian reproductive genomics. This technique opens the way to the targeted monitoring of the fate of ejaculated proteins as they incubate in the female reproductive tract.
seminal fluid; ejaculate; evolution
Elevated chromatographic temperatures are well recognized to provide beneficial analytical effects. Previously, we demonstrated that elevated chromatographic temperature enhances the identification of hydrophobic peptides prepared from enriched membrane samples. Here, we quantitatively assess and compare the recovery of peptide analytes from both simple and complex tryptic peptide matrices using selected reaction monitoring (SRM) mass spectrometry. Our study demonstrates that elevated chromatographic temperature results in significant improvements in the magnitude of peptide recovery for both hydrophilic and hydrophobic peptides from both simple and complex peptide matrices. Importantly, the analytical benefits for quantitative measurements in whole mouse brain matrix are demonstrated, suggesting broad utility in the proteomic analyses of complex mammalian tissues. Any improvement in peptide recovery from chromatographic separations translates directly to the apparent sensitivity of downstream mass analysis in μLC-MS/MS based proteomic applications. Therefore, the incorporation of elevated chromatographic temperatures should result in significant improvements in peptide quantification as well as detection and identification.
The ultimate goal of most shotgun proteomic pipelines is the discovery of novel biomarkers to direct the development of quantitative diagnostics for the detection and treatment of disease. Differential comparisons of biological samples identify candidate peptides that can serve as proxies for candidate proteins. While these discovery approaches are robust and fairly comprehensive, they have relatively low throughput. When merged with targeted mass spectrometry, this pipeline can fuel hypothesis-driven studies and the development of novel diagnostics and therapeutics.
quantitative shotgun proteomics; biomarker discovery; targeted mass spectrometry; human tissue
Integral membrane proteins perform crucial cellular functions and are the targets for the majority of pharmaceutical agents. However, the hydrophobic nature of their membrane-embedded domains makes them difficult to work with. Here, we describe a shotgun proteomic method for the high-throughput analysis of the membrane-embedded transmembrane domains of integral membrane proteins which extends the depth of coverage of the membrane proteome.
Although two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) has been used as the standard proteomic approach for separating proteins in a complex mixture, this technique has many drawbacks. These include a limited molecular mass range, poor separation of highly acidic or basic proteins, and exclusion of the majority of membrane proteins from analysis. Considering the important functions of many membrane proteins, such as receptors, ion transporters, signal transducers, and cell adhesion proteins, it is increasingly important that these proteins are not excluded during the global proteomic analysis of cellular systems. Multidimensional Protein Identification Technology (MudPIT) offers a gel-free alternative to 2D-PAGE for the analysis of both membrane and soluble proteins.
The goal of this chapter is to provide detailed methods for using MudPIT to profile both membrane and soluble proteins in complex unfractionated samples. Methods discussed will include tissue homogenization, sample preparation, MudPIT, data analysis, and an application for the analysis of unfractionated total tissue homogenate from human heart.
MudPIT; 2D-PAGE; proteomics; membrane proteins; human heart explants
Proteomics research is beginning to expand beyond the more traditional shotgun analysis of protein mixtures to include targeted analyses of specific proteins using mass spectrometry. Integral to the development of a robust assay based on targeted mass spectrometry is prior knowledge of which peptides provide an accurate and sensitive proxy of the originating gene product (i.e., proteotypic peptides). To develop a catalog of “proteotypic peptides” in human heart, TRIzol extracts of left-ventricular tissue from nonfailing and failing human heart explants were optimized for shotgun proteomic analysis using Multidimensional Protein Identification Technology (MudPIT). Ten replicate MudPIT analyses were performed on each tissue sample and resulted in the identification of 30 605 unique peptides with a q-value ≤ 0.01, corresponding to 7138 unique human heart proteins. Experimental observation frequencies were assessed and used to select over 4476 proteotypic peptides for 2558 heart proteins. This human cardiac data set can serve as a public reference to guide the selection of proteotypic peptides for future targeted mass spectrometry experiments monitoring potential protein biomarkers of human heart diseases.
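Selecting peptides by experimental observation frequency can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the data layout (one set of protein/peptide pairs per replicate run), the function names, and the 0.5 threshold are all assumptions.

```python
from collections import defaultdict


def observation_frequencies(replicates):
    """Return {(protein, peptide): fraction of replicate runs in which it was identified}.

    `replicates` is a list with one collection of (protein, peptide) pairs per run.
    """
    n = len(replicates)
    if n == 0:
        return {}
    counts = defaultdict(int)
    for run in replicates:
        # set() guards against a pair being listed twice within one run.
        for key in set(run):
            counts[key] += 1
    return {key: c / n for key, c in counts.items()}


def select_proteotypic(replicates, min_frequency=0.5):
    """Keep, per protein, the peptides seen in at least `min_frequency` of runs.

    The threshold is a hypothetical cutoff chosen for the example.
    """
    selected = defaultdict(list)
    for (protein, peptide), freq in observation_frequencies(replicates).items():
        if freq >= min_frequency:
            selected[protein].append(peptide)
    return {protein: sorted(peps) for protein, peps in selected.items()}
```

A peptide identified in every replicate receives a frequency of 1.0, so ranking by this value favors peptides that are reproducibly detected over those seen only sporadically.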
proteotypic peptides; targeted mass spectrometry; human heart explant; dilated cardiomyopathy; MudPIT
Integral membrane proteins (IMPs) perform crucial cellular functions and are the primary targets for most pharmaceutical agents. However, the hydrophobic nature of their membrane-embedded domains and their intimate association with lipids make them difficult to handle. Multiple proteomics platforms that include LC separations have been reported for the high-throughput profiling of complex protein samples. However, there are still many challenges to overcome for proteomic analyses of IMPs, especially as compared to their soluble counterparts. In particular, the technical challenges associated with chromatographic separations are just beginning to be investigated. Here, we review the benefits of using elevated temperatures during LC for the proteomic analysis of complex membrane protein samples.
Liquid chromatography; Microcapillary; Shotgun; Temperature
Cellular membranes are composed of proteins and glyco- and phospholipids and play an indispensable role in maintaining cellular integrity and homeostasis by physically restricting biochemical processes within cells and providing protection. Membrane proteins perform many essential functions, which include operating as transporters, adhesion-anchors, receptors, and enzymes. Recent advancements in proteomic mass spectrometry have resulted in substantial progress towards the determination of the plasma membrane (PM) proteome, resolution of membrane protein topology, establishment of numerous receptor protein complexes, identification of ligand–receptor pairs, and the elucidation of signaling networks originating at the PM. Here we discuss the recent accelerated success of discovery-based proteomic pipelines for the establishment of a complete membrane proteome.
Despite advances in metabolic and postmetabolic labeling methods for quantitative proteomics, there remains a need for improved label-free approaches. This need is particularly pressing for workflows that incorporate affinity enrichment at the peptide level, where isobaric chemical labels such as isobaric tags for relative and absolute quantitation and tandem mass tags may prove problematic or where stable isotope labeling with amino acids in cell culture labeling cannot be readily applied. Skyline is a freely available, open source software tool for quantitative data processing and proteomic analysis. We expanded the capabilities of Skyline to process ion intensity chromatograms of peptide analytes from full scan mass spectral data (MS1) acquired during HPLC MS/MS proteomic experiments. Moreover, unlike existing programs, Skyline MS1 filtering can be used with mass spectrometers from four major vendors, which allows results to be compared directly across laboratories. The new quantitative and graphical tools now available in Skyline specifically support interrogation of multiple acquisitions for MS1 filtering, including visual inspection of peak picking and both automated and manual integration, key features often lacking in existing software. In addition, Skyline MS1 filtering displays retention time indicators from underlying MS/MS data contained within the spectral library to ensure proper peak selection. The modular structure of Skyline also provides well defined, customizable data reports and thus allows users to directly connect to existing statistical programs for post hoc data analysis. To demonstrate the utility of the MS1 filtering approach, we have carried out experiments on several MS platforms and have specifically examined the performance of this method to quantify two important post-translational modifications: acetylation and phosphorylation, in peptide-centric affinity workflows of increasing complexity using mouse and human models.
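The general idea behind MS1 filtering, extracting an ion intensity chromatogram for a peptide precursor from full-scan data and integrating the peak, can be sketched as below. This is not Skyline's implementation; the scan representation, the function names, and the 10 ppm tolerance are assumptions made for illustration.

```python
def extract_ion_chromatogram(scans, target_mz, ppm_tolerance=10.0):
    """Sum the intensity near `target_mz` in each MS1 scan.

    `scans` is a list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity) points.
    """
    # Convert the relative ppm tolerance into an absolute m/z half-width.
    tol = target_mz * ppm_tolerance / 1e6
    chromatogram = []
    for rt, peaks in scans:
        total = sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol)
        chromatogram.append((rt, total))
    return chromatogram


def trapezoid_area(chromatogram):
    """Integrate a chromatogram with the trapezoidal rule."""
    return sum(
        (t1 - t0) * (y0 + y1) / 2.0
        for (t0, y0), (t1, y1) in zip(chromatogram, chromatogram[1:])
    )
```

In practice the integration would be restricted to the boundaries of a picked peak, which is where visual inspection and manual adjustment of the kind described above become relevant.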
Early detection of breast cancer is associated with improved patient survival. While early disease is commonly identified by patient self-examination and breast mammography, interpretation of these findings is highly subjective and often requires significant disease burden to achieve sensitivity. Cancer screening utilizing blood-based assays, such as measurement of prostate-specific antigen (PSA) abundance for prostate cancer, has proven to be a minimally invasive method that aids in detecting early disease. The generation of a blood-based assay for the detection of early disease in breast cancer would enable more facile disease diagnosis and thus expedite patient care.
The discovery of proteins actively shed or secreted by tumor cells into blood plasma by global proteomic analyses has proven analytically challenging, due mainly to the large dynamic range of protein abundances in blood. Common methods to enrich for tumor-specific proteins include depletion of abundant proteins from plasma samples, such as albumin and immunoglobulins. Furthermore, strategies are needed to detect blood-based candidates derived specifically from tumor cell populations to provide high-confidence candidates for further validation efforts.
To this end, we have developed a method combining global proteomic analyses of plasma collected from a mouse xenograft model of primary human breast cancer with post-data acquisition filtering of species-specific peptide search results. Primary xenograft models enable analyses of human tumor tissue in non-native biological backgrounds. Therefore, species-specific protein and gene sequences can be exploited in discovery efforts to selectively identify tumor cell-specific characteristics. Preliminary studies of plasma analyzed from xenograft-bearing mice have resulted in the identification of human-specific peptides corresponding to proteins previously described as being secreted from breast tissue and associated with breast cancer pathogenesis. Application of this strategy to proteomic analyses from a cohort of xenograft mice bearing HER2+ and triple negative breast cancer tissues will be presented.
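Post-acquisition filtering for species-specific peptides can be sketched as a set comparison between in-silico digests of the two proteomes. The sketch below is an illustration under stated assumptions, not the authors' pipeline: it uses a simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), a hypothetical minimum peptide length of 6, and invented function names. Isoleucine is mapped to leucine before comparison because the two residues are isobaric and cannot be distinguished by mass.

```python
def tryptic_peptides(sequence, min_length=6):
    """Digest a protein sequence after K/R (not before P), with no missed cleavages."""
    peptides = set()
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            if i + 1 - start >= min_length:
                peptides.add(sequence[start:i + 1])
            start = i + 1
    # Keep the C-terminal peptide if it is long enough.
    if len(sequence) - start >= min_length:
        peptides.add(sequence[start:])
    return peptides


def mass_equivalent(peptide):
    """Collapse isobaric I and L so that indistinguishable peptides compare equal."""
    return peptide.replace("I", "L")


def human_specific(identified, human_proteins, mouse_proteins, min_length=6):
    """Return identified peptides found in the human digest but not the mouse digest."""
    human = {mass_equivalent(p) for seq in human_proteins
             for p in tryptic_peptides(seq, min_length)}
    mouse = {mass_equivalent(p) for seq in mouse_proteins
             for p in tryptic_peptides(seq, min_length)}
    return [p for p in identified
            if mass_equivalent(p) in human and mass_equivalent(p) not in mouse]
```

With toy sequences that differ by a single residue, only the peptide spanning that difference is retained; a peptide that differs between species only by an I/L substitution is discarded as ambiguous.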
This article summarizes the proceedings of a symposium presented at the 2005 annual meeting of the Research Society on Alcoholism in Santa Barbara, California. The organizer was James M. Sikela, and he and Michael F. Miles were chairs. The presentations were (1) Genomewide Surveys of Gene Copy Number Variation in Human and Mouse: Implications for the Genetics of Alcohol Action, by James M. Sikela; (2) Regional Differences in the Regulation of Brain Gene Expression: Relevance to the Detection of Genes Associated with Alcohol-Related Traits, by Robert Hitzemann; (3) Identification of Ethanol Quantitative Trait Loci Candidate Genes by Expression Profiling in Inbred Long Sleep/Inbred Short Sleep Congenic Mice, by Robnet T. Kerns; and (4) Quantitative Proteomic Analysis of AC7-Modified Mice, by Kathleen J. Grant.
Array-Based Comparative Genomic Hybridization; Gene Copy Number; Microarrays; Gene Expression Profiling; Alcohol-Related QTL; Proteomics; Adenylyl Cyclase
We previously reported the metabolic 15N labeling of a rat where enrichment ranged from 94% to 74%. We report here an improved labeling strategy which generates 94% 15N enrichment throughout all tissues of the rat. A high 15N enrichment of the internal standard is necessary for accurate quantitation, and thus, this approach will allow quantitative mass spectrometry analysis of animal models of disease targeting any tissue.
GRASP55 is a Golgi-associated protein, but its function at the Golgi remains unclear. Addition of full-length GRASP55, GRASP55-specific peptides, or an anti-GRASP55 antibody inhibited Golgi fragmentation by mitotic extracts in vitro, and entry of cells into mitosis. Phospho-peptide mapping of full-length GRASP55 revealed that threonines 225 and 249 were mitotically phosphorylated. Wild-type peptides containing T225 and T249 inhibited Golgi fragmentation and entry of cells into mitosis. Mutant peptides containing T225E and T249E, in contrast, did not affect Golgi fragmentation and entry into mitosis. These findings reveal a role of GRASP55 in events leading to Golgi fragmentation and the subsequent entry of cells into mitosis. Surprisingly, however, under our experimental conditions, >85% knockdown of GRASP55 did not affect the overall organization of the Golgi in terms of cisternal stacking and lateral connections between stacks. Based on our findings we suggest that phosphorylation of GRASP55 at T225/T249 releases a bound component, which is phosphorylated and necessary for Golgi fragmentation. Thus, GRASP55 has no role in the organization of Golgi membranes per se, but it controls their fragmentation by regulating the release of a partner, which requires a G2-specific phosphorylation at T225/T249.
Alcohol dependence; genetic theory of alcohol and other drug use; genetic trait; brain; animal models; proteins; protein analysis; proteomics; mass spectrometry; peptides
In the pathogenic bacterium Chlamydia trachomatis, a transcriptional repressor, HrcA, regulates the major heat shock operons, dnaK and groE. Cellular stress causes a transient increase in transcription of these heat shock operons through relief of HrcA-mediated repression, but the pathway leading to derepression is unclear. Elevated temperature alone is not sufficient, and it is hypothesized that additional chlamydial factors play a role. We used DNA affinity chromatography to purify proteins that interact with HrcA in vivo and identified a higher-order complex consisting of HrcA, GroEL, and GroES. This endogenous HrcA complex migrated differently than recombinant HrcA, but the complex could be disrupted, releasing native HrcA that resembled recombinant HrcA. In in vitro assays, GroEL increased the ability of HrcA to bind to the CIRCE operator and to repress transcription. Other chlamydial heat shock proteins, including the two additional GroEL paralogs present in all chlamydial species, did not modulate HrcA activity.
The Golgi complex functions to posttranslationally modify newly synthesized proteins and lipids and to sort them to their sites of function. In this study, a stacked Golgi fraction was isolated by classical cell fractionation, and the protein complement (the Golgi proteome) was characterized using multidimensional protein identification technology. Many of the proteins identified are known residents of the Golgi, and 64% of these are predicted transmembrane proteins. Proteins localized to other organelles also were identified, strengthening reports of functional interfacing between the Golgi and the endoplasmic reticulum and cytoskeleton. Importantly, 41 proteins of unknown function were identified. Two were selected for further analysis, and Golgi localization was confirmed. One of these, a putative methyltransferase, was shown to be arginine dimethylated, and upon further proteomic analysis, arginine dimethylation was identified on 18 total proteins in the Golgi proteome. This survey illustrates the utility of proteomics in the discovery of novel organellar functions and resulted in 1) a protein profile of an enriched Golgi fraction; 2) identification of 41 previously uncharacterized proteins, two with confirmed Golgi localization; 3) the identification of arginine dimethylated residues in Golgi proteins; and 4) a confirmation of methyltransferase activity within the Golgi fraction.
Incubating cells at 20°C blocks transport out of the Golgi complex and amplifies the exit compartments. We have used the 20°C block, followed by EM tomography and serial section reconstruction, to study the structure of Golgi exit sites in NRK cells. The dominant feature of Golgi structure in temperature-blocked cells is the presence of large bulging domains on the three trans-most cisternae. These domains extend laterally from the stack and are continuous with “cisternal” domains that maintain normal thickness and alignment with the other stacked Golgi cisternae. The bulging domains do not resemble the perpendicularly extending tubules associated with the trans-cisternae of control cells. Such tubules are completely absent in temperature-blocked cells. The three cisternae with bulging domains can be identified as trans by their association with specialized ER and the presence of clathrin-coated buds on the trans-most cisterna only. Immunogold labeling and immunoblots show a significant degradation of a medial- and a trans-Golgi marker with no evidence for their redistribution within the Golgi or to other organelles. These data suggest that exit from the Golgi occurs directly from three trans-cisternae and that specialized ER plays a significant role in trans-Golgi function.