Proteomics aims at the comprehensive identification and quantification of the proteins present in a biological sample. Typical samples include cell lysates, tissue extracts, body fluids such as serum or plasma, and fractions of complete proteomes such as organelles or subcellular fractions. These samples usually contain thousands to tens of thousands of different proteins, and their complete analysis has been technically challenging in spite of significant recent progress. Most proteomic measurements have been carried out by mass spectrometry. Several strategies have been developed for mass spectrometry-based experiments [1, 2]; they all involve the generation of a protein sample, the digestion of the proteins, typically with trypsin, and the separation, ionization and mass spectrometric analysis of the resulting complex peptide samples. In the most commonly used strategy, referred to as data-dependent analysis (DDA), shotgun proteomics or discovery proteomics, the instrument selects specific precursor ions (molecular ions of intact peptides) from all the precursor ions detected in a survey scan using simple heuristics. Even though the sampling rate of modern mass spectrometers has increased considerably over the last few years, for complex proteome samples the number of precursor ions detected in a survey scan typically exceeds the number of selection and fragmentation cycles available in the instrument. Consequently, repeat analyses of identical or very similar samples identify different subsets of peptides, resulting in irreproducible data sets.
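The sampling limitation described above can be illustrated with a small simulation. The text does not specify which heuristic the instrument uses; the sketch below assumes the common choice of selecting the N most intense precursors per survey scan, and it ignores refinements such as dynamic exclusion. The precursor list, the noise level and the function names are all hypothetical.

```python
import random


def select_top_n(survey_scan, n):
    """Return the m/z values of the n most intense peaks in one survey scan.

    survey_scan is a list of (mz, intensity) tuples.
    """
    ranked = sorted(survey_scan, key=lambda peak: peak[1], reverse=True)
    return {mz for mz, _ in ranked[:n]}


def simulate_run(true_intensities, noise, n, rng):
    """Apply multiplicative intensity noise, then top-N selection (assumed heuristic)."""
    scan = [(mz, intensity * rng.uniform(1 - noise, 1 + noise))
            for mz, intensity in true_intensities]
    return select_top_n(scan, n)


# 50 hypothetical precursors with closely spaced intensities,
# but only 10 fragmentation slots per cycle.
precursors = [(400.0 + i, 1000.0 + 5 * i) for i in range(50)]
rng = random.Random(0)
run_a = simulate_run(precursors, noise=0.2, n=10, rng=rng)
run_b = simulate_run(precursors, noise=0.2, n=10, rng=rng)
overlap = len(run_a & run_b)
print(f"precursors selected in both runs: {overlap} of 10")
```

When more precursors are detected than can be fragmented, small run-to-run intensity fluctuations are enough to change which ones are selected, so two analyses of the same sample need not yield the same peptide set.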
Recently, a complementary proteomic workflow has emerged that is based on the targeted analysis of a set of predetermined proteins and peptides. This workflow relies on a mass spectrometric method referred to as selected reaction monitoring (SRM). It involves the selection of proteotypic peptides [3, 4] from the predetermined protein set, the targeted selection of precursor ions based on their mass-to-charge ratio, the fragmentation of the precursor ions in the collision cell of a triple quadrupole (QQQ) mass spectrometer, and the selective detection of peptide-specific fragment ions. The detected fragment ions derived from a specific precursor ion are referred to as transitions [5]. The precursor ion mass and the corresponding optimized set of transitions, along with additional information such as the preferred charge state of a peptide ion and the chromatographic elution time of the peptide, constitute a specific and highly sensitive assay for the detection of a particular peptide in a sample. SRM-based mass spectrometry produces consistent, reproducible and highly sensitive data sets that are particularly important for the comparison of protein profiles across multiple samples, as is the case in biomarker discovery and validation studies and in systems biology, where a biological system is analyzed in differentially perturbed states [6, 7].
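To make the components of such an assay concrete, the following minimal sketch computes the precursor m/z and a few singly charged y-ion fragment m/z values for an unmodified peptide and bundles them with the charge state and elution time. It is an illustration under stated assumptions, not the data model of any particular tool: the class `SrmAssay`, the helper names, the peptide sequence and the retention time are all hypothetical, and only standard monoisotopic residue masses are used.

```python
from dataclasses import dataclass, field

# Monoisotopic residue masses in Da for the 20 unmodified amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def precursor_mz(peptide, charge):
    """m/z of the intact peptide ion carrying `charge` protons."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide, n, charge=1):
    """m/z of the y_n fragment: the last n residues (n >= 1) plus water."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide[-n:]) + WATER
    return (neutral + charge * PROTON) / charge


@dataclass
class SrmAssay:
    """Hypothetical container for the assay components named in the text."""
    peptide: str
    charge: int
    retention_time_min: float
    q1_mz: float                                      # precursor m/z
    transitions: list = field(default_factory=list)   # (label, fragment m/z)


def build_assay(peptide, charge, retention_time_min, y_numbers):
    assay = SrmAssay(peptide, charge, retention_time_min,
                     round(precursor_mz(peptide, charge), 4))
    for n in y_numbers:
        assay.transitions.append((f"y{n}", round(y_ion_mz(peptide, n), 4)))
    return assay


# Hypothetical peptide and elution time, for illustration only.
assay = build_assay("SAMPLER", charge=2, retention_time_min=21.5,
                    y_numbers=[3, 4, 5])
print(assay.q1_mz, assay.transitions)
```

In practice the set of transitions is optimized experimentally or taken from spectral libraries; the sketch only shows how the precursor and fragment m/z values that define each transition follow from the peptide sequence.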
Over the last decade, a rich environment of open source and proprietary software tools has emerged to support all aspects of shotgun proteomics. In contrast, software tools to support the targeted, SRM-based workflow are still sparse. The previously published TIQAM [8] software suite (TIQAM-digestor, TIQAM-peptideatlas, TIQAM-viewer) was designed to support the SRM workflow by generating transitions from in-silico digestion, by connecting to PeptideAtlas for transition selection, and by allowing manual examination of SRM-triggered MS2 spectra. However, the TIQAM connection to PeptideAtlas is limited when a user wishes to prioritize peptides based on weighted amino acid composition. TIQAM-viewer supports manual validation with the assistance of MS2 spectra acquired by SRM-triggered MS2, but it offers no systematic method to distinguish "validated" from "not validated". Moreover, TIQAM is a single-user desktop application that cannot be deployed to share data in a collaborative working environment. Similarly, Bertsch et al. recently published an algorithm to predict proteotypic peptides, their fragmentation and their retention time using sequence information alone [9]. The recently published Skyline [10], a Windows client application, provides a way to build SRM methods based on the BiblioSpec, NIST and GPM formats of spectral libraries. Skyline also provides a quantification value by calculating the area under the curve using the CRAWDAD software [11]. However, it uses a single score (the hydrophobicity value from SSRCalc) to provide confidence in identification. Similar to TIQAM, it is a desktop application that relies on manual inspection for validation. As a complementary tool to Skyline, AuDIT [12], a web server application running at the GenePattern website, provides further statistical validation of quantification using user-provided quantification values of light and heavy transitions. However, there is no open or commercial software that supports the entire SRM workflow for multiple users and that can interface with web browsers on a personal computer as well as connect to institution-wide computing resources for high-throughput data analysis.
Here we introduce a new open source software pipeline, ATAQS (Automated and Targeted Analysis with Quantitative SRM), which provides modules with algorithms that collectively support all of the steps of the SRM assay development and deployment workflow for targeted proteomic experiments (Figure ). ATAQS is designed to support multiple users at an institution. It can be easily extended and customized with user-implemented algorithms at any of the workflow steps. It also provides an API for connecting to existing web service tools, allowing easy data export from and import to the user's institutional web-based ATAQS application.
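As a point of reference for the in-silico digestion step mentioned above, the following minimal sketch enumerates tryptic peptides from a protein sequence. It is not the implementation used by TIQAM or ATAQS. It assumes the commonly used rule that trypsin cleaves C-terminal to lysine (K) or arginine (R) except when the next residue is proline (P); the function name, the default length limits and the example sequence are hypothetical.

```python
def tryptic_digest(protein, missed_cleavages=0, min_len=6, max_len=25):
    """Return tryptic peptides of `protein`, ordered by start position.

    Assumed rule: cleave after K or R unless the next residue is P.
    Peptides spanning up to `missed_cleavages` uncut sites are included.
    """
    # Collect cleavage positions, including both sequence termini.
    sites = [0]
    for i in range(len(protein) - 1):
        if protein[i] in "KR" and protein[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(protein))

    peptides = []
    for i in range(len(sites) - 1):
        last = min(i + 1 + missed_cleavages, len(sites) - 1)
        for j in range(i + 1, last + 1):
            peptide = protein[sites[i]:sites[j]]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


# Hypothetical sequence; min_len=1 keeps the short fragments visible.
protein = "MAGKLLERPSTKVV"
print(tryptic_digest(protein, min_len=1))
# ['MAGK', 'LLERPSTK', 'VV']
print(tryptic_digest(protein, missed_cleavages=1, min_len=1))
```

Note that the arginine in `LLERPSTK` is not cleaved because it is followed by proline. The resulting peptides would then be filtered for proteotypic candidates before transitions are generated for them.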
In this manuscript, we describe the following workflow steps of ATAQS that collectively support the routine application of SRM-based targeted proteomics studies:
1. ATAQS system overview
2. Workflow overview
3. Target protein set selection
4. Peptide transition selection
5. Addition of isotopic pair and decoy transitions
6. Identification of confirmed transitions
7. Publication of verified transitions
In the section 'ATAQS application', we also illustrate the ATAQS workflow with the SRM-based analysis of synthesized yeast peptides and of signaling kinase proteins in a human cancer cell line.
The ATAQS software is essential for the implementation and wide dissemination of a targeted proteomic workflow. We therefore expect it to find wide application in all fields of life science research that require the high-throughput generation of hypothesis-based, reproducible and highly sensitive proteomic datasets.