Results 1-9 (9)

1.  Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics 
Journal of proteome research  2016;15(11):4091-4100.
The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances – a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted genes and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ~20,000 primary isoforms plus contaminants to a very large database that includes almost all non-redundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study.
We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the discovered peptides against a more complex database. We have set up an automated system that downloads all the source databases on the first of each month and automatically generates a new set of search databases and makes them available for download at
PMCID: PMC5096980  PMID: 27577934
shotgun mass spectrometry; search databases; human
2.  Using PeptideAtlas, SRMAtlas and PASSEL – Comprehensive Resources for discovery and targeted proteomics 
PeptideAtlas, SRMAtlas and PASSEL are web-accessible resources to support discovery and targeted proteomics research. PeptideAtlas is a multi-species compendium of shotgun proteomic data provided by the scientific community, SRMAtlas is a resource of high-quality, complete proteome SRM assays generated in a consistent manner for the targeted identification and quantification of proteins, and PASSEL is a repository that compiles and represents selected reaction monitoring data, all in an easy-to-use interface. The databases are generated from native mass spectrometry data files that are analyzed in a standardized manner including statistical validation of the results. Each resource offers search functionalities and can be queried by user-defined constraints; the query results are provided in tables or are graphically displayed. PeptideAtlas, SRMAtlas and PASSEL are publicly available freely via the website. In this protocol, we describe the use of these resources and highlight how to submit, search, collate and download data.
PMCID: PMC4331073  PMID: 24939129
discovery proteomics; targeted proteomics; selected reaction monitoring (SRM); data repository; data resource; complete proteome library
3.  A repository of assays to quantify 10,000 human proteins by SWATH-MS 
Scientific Data  2014;1:140031.
Mass spectrometry is the method of choice for deep and reliable exploration of the (human) proteome. Targeted mass spectrometry reliably detects and quantifies pre-determined sets of proteins in a complex biological matrix and is used in studies that rely on the quantitatively accurate and reproducible measurement of proteins across multiple samples. It requires the one-time, a priori generation of a specific measurement assay for each targeted protein. SWATH-MS is a mass spectrometric method that combines data-independent acquisition (DIA) and targeted data analysis and vastly extends the throughput of proteins that can be targeted in a sample compared to selected reaction monitoring (SRM). Here we present a compendium of highly specific assays covering more than 10,000 human proteins and enabling their targeted analysis in SWATH-MS datasets acquired from research or clinical specimens. This resource supports the confident detection and quantification of 50.9% of all human proteins annotated by UniProtKB/Swiss-Prot and is therefore expected to find wide application in basic and clinical research. Data are available via ProteomeXchange (PXD000953-954) and SWATHAtlas (SAL00016-35).
PMCID: PMC4322573  PMID: 25977788
4.  The Mtb Proteome Library: A Resource of Assays to Quantify the Complete Proteome of Mycobacterium tuberculosis 
Cell host & microbe  2013;13(5):602-612.
Research advancing our understanding of Mycobacterium tuberculosis (Mtb) biology and complex host-Mtb interactions requires consistent and precise quantitative measurements of Mtb proteins. We describe the generation and validation of a compendium of assays to quantify 97% of the 4,012 annotated Mtb proteins by the targeted mass spectrometric method selected reaction monitoring (SRM). Furthermore, we estimate the absolute abundance for 55% of all Mtb proteins, revealing a dynamic range within the Mtb proteome of over four orders of magnitude, and identify previously un-annotated proteins. As an example of the assay library utility, we monitored the entire Mtb dormancy survival regulon (DosR), which is linked to anaerobic survival and Mtb persistence, and show its dynamic protein-level regulation during hypoxia. In conclusion, we present a publicly available research resource that supports the sensitive, precise, and reproducible quantification of virtually any Mtb protein by a robust and widely accessible mass spectrometric method.
PMCID: PMC3766585  PMID: 23684311
5.  A complete mass spectrometric map for the analysis of the yeast proteome and its application to quantitative trait analysis 
Nature  2013;494(7436):266-270.
Complete reference maps or datasets, like the genomic map of an organism, are highly beneficial tools for biological and biomedical research. Attempts to generate such reference datasets for a proteome so far failed to reach complete proteome coverage, with saturation apparent at approximately two thirds of the proteomes tested, even for the most thoroughly characterized proteomes. Here, we used a strategy based on high-throughput peptide synthesis and mass spectrometry to generate a close to complete reference map (97% of the genome-predicted proteins) of the S. cerevisiae proteome. We generated two versions of this mass spectrometric map, one supporting discovery- (shotgun) and the other hypothesis-driven (targeted) proteomic measurements. The two versions of the map, therefore, constitute a complete set of proteomic assays to support most studies performed with contemporary proteomic technologies. The reference libraries can be browsed via a web-based repository and associated navigation tools. To demonstrate the utility of the reference libraries we applied them to a protein quantitative trait locus (pQTL) analysis, which requires measurement of the same peptides over a large number of samples with high precision. Protein measurements over a set of 78 S. cerevisiae strains revealed a complex relationship between independent genetic loci, impacting on the levels of related proteins. Our results suggest that selective pressure favors the acquisition of sets of polymorphisms that maintain the stoichiometry of protein complexes and pathways.
PMCID: PMC3951219  PMID: 23334424
S. cerevisiae; selected reaction monitoring; SRM; MRM; spectral library; peptide library; mass spectrometric map; protein QTL
6.  PASSEL: The PeptideAtlas SRM Experiment Library 
Proteomics  2012;12(8):10.1002/pmic.201100515.
Public repositories for proteomics data have accelerated proteomics research by enabling more efficient cross-analyses of datasets, supporting the creation of protein and peptide compendia of experimental results, supporting the development and testing of new software tools, and facilitating the manuscript review process. The repositories available to date have been designed to accommodate either shotgun experiments or generic proteomic data files. Here, we describe a new kind of proteomic data repository for the collection and representation of data from selected reaction monitoring (SRM) measurements. The PeptideAtlas SRM Experiment Library (PASSEL) allows researchers to easily submit proteomic data sets generated by SRM. The raw data are automatically processed in a uniform manner and the results are stored in a database, where they may be downloaded or browsed via a web interface that includes a chromatogram viewer. PASSEL enables cross-analysis of SRM data, supports optimization of SRM data collection, and facilitates the review process of SRM data. Further, PASSEL will help in the assessment of proteotypic peptide performance in a wide array of samples containing the same peptide, as well as across multiple experimental protocols.
PMCID: PMC3832291  PMID: 22318887
data repository; MRM; software; SRM; targeted proteomics
7.  TraML—A Standard Format for Exchange of Selected Reaction Monitoring Transition Lists
Molecular & Cellular Proteomics : MCP  2011;11(4):R111.015040.
Targeted proteomics via selected reaction monitoring is a powerful mass spectrometric technique affording higher dynamic range, increased specificity and lower limits of detection than other shotgun mass spectrometry methods when applied to proteome analyses. However, it involves selective measurement of predetermined analytes, which requires more preparation in the form of selecting appropriate signatures for the proteins and peptides that are to be targeted. There is a growing number of software programs and resources for selecting optimal transitions and the instrument settings used for the detection and quantification of the targeted peptides, but the exchange of this information is hindered by a lack of a standard format. We have developed a new standardized format, called TraML, for encoding transition lists and associated metadata. In addition to introducing the TraML format, we demonstrate several implementations across the community, and provide semantic validators, extensive documentation, and multiple example instances to demonstrate correctly written documents. Widespread use of TraML will facilitate the exchange of transitions, reduce time spent handling incompatible list formats, increase the reusability of previously optimized transitions, and thus accelerate the widespread adoption of targeted proteomics via selected reaction monitoring.
PMCID: PMC3322582  PMID: 22159873
8.  Circulating 25-Hydroxyvitamin D and Risk of Kidney Cancer 
American Journal of Epidemiology  2010;172(1):47-57.
Although the kidney is a major organ for vitamin D metabolism, activity, and calcium-related homeostasis, little is known about whether this nutrient plays a role in the development or the inhibition of kidney cancer. To address this gap in knowledge, the authors examined the association between circulating 25-hydroxyvitamin D (25(OH)D) and kidney cancer within a large, nested case-control study developed as part of the Cohort Consortium Vitamin D Pooling Project of Rarer Cancers. Concentrations of 25(OH)D were measured from 775 kidney cancer cases and 775 age-, sex-, race-, and season-matched controls from 8 prospective cohort studies. Overall, neither low nor high concentrations of circulating 25(OH)D were significantly associated with kidney cancer risk. Although the data showed a statistically significant decreased risk for females (odds ratio = 0.31, 95% confidence interval: 0.12, 0.85) with 25(OH)D concentrations of ≥75 nmol/L, the linear trend was not statistically significant and the number of cases in this category was small (n = 14). The findings from this consortium-based study do not support the hypothesis that vitamin D is inversely associated with the risk of kidney cancer overall or with renal cell carcinoma specifically.
PMCID: PMC2892538  PMID: 20562187
case-control studies; cohort studies; kidney neoplasms; prospective studies; vitamin D
9.  Circulating 25-Hydroxyvitamin D and Risk of Esophageal and Gastric Cancer 
American Journal of Epidemiology  2010;172(1):94-106.
Upper gastrointestinal (GI) cancers of the stomach and esophagus have high incidence and mortality worldwide, but they are uncommon in Western countries. Little information exists on the association between vitamin D and risk of upper GI cancers. This study examined the association between circulating 25-hydroxyvitamin D (25(OH)D) and upper GI cancer risk in the Cohort Consortium Vitamin D Pooling Project of Rarer Cancers. Concentrations of 25(OH)D were measured from 1,065 upper GI cancer cases and 1,066 age-, sex-, race-, and season-of blood draw–matched controls from 8 prospective cohort studies. In multivariate-adjusted models, circulating 25(OH)D concentration was not significantly associated with upper GI cancer risk. Subgroup analysis by race showed that among Asians, but not Caucasians, lower concentrations of 25(OH)D (<25 nmol/L) were associated with a statistically significant decreased risk of upper GI cancer (reference: 50–<75 nmol/L) (odds ratio = 0.53, 95% confidence interval: 0.31, 0.91; P trend = 0.003). Never smokers with concentrations of <25 nmol/L showed a lower risk of upper GI cancers (odds ratio = 0.55, 95% confidence interval: 0.31, 0.96). Subgroup analyses by alcohol consumption produced opposing trends. Results do not support the hypothesis that interventions aimed at increasing vitamin D status would lead to a lower risk of these highly fatal cancers.
PMCID: PMC2892544  PMID: 20562192
case-control studies; cohort studies; esophageal neoplasms; prospective studies; stomach neoplasms; vitamin D
