BMC Bioinformatics. 2012; 13: 74.
Published online May 4, 2012. doi:  10.1186/1471-2105-13-74
PMCID: PMC3436780
Metaprotein expression modeling for label-free quantitative proteomics
Joseph E Lucas (corresponding author),#1 J Will Thompson,#1 Laura G Dubois,1 Jeanette McCarthy,1 Hans Tillmann,2 Alexander Thompson,2 Norah Shire,3 Ron Hendrickson,3 Francisco Dieguez,3 Phyllis Goldman,3 Kathleen Schwarz,4 Keyur Patel,2 John McHutchison,2 and M Arthur Moseley1
1Institute for Genome Sciences and Policy, Duke University, Durham, NC, USA
2Gastroenterology, Duke University School of Medicine, Durham, NC, USA
3Merck Sharp & Dohme Corp, Whitehouse Station, NJ, USA
4Johns Hopkins Children’s Center, Baltimore, MD, USA
Corresponding author: Joseph E Lucas.
#Contributed equally.
Joseph E Lucas: joe/at/stat.duke.edu; J Will Thompson: will.thompson/at/duke.edu; Laura G Dubois: laura.dubois/at/duke.edu; Jeanette McCarthy: jeanette.mccarthy/at/duke.edu; Hans Tillmann: hans.tillmann/at/duke.edu; Alexander Thompson: alexander.thompson/at/duke.edu; Norah Shire: norah_shire/at/merck.com; Ron Hendrickson: ronaldhendrickson/at/gmail.com; Francisco Dieguez: francisco_dieguez/at/merck.com; Phyllis Goldman: phyllis_goldman/at/merck.com; Kathleen Schwarz: kschwarz/at/jhmi.edu; Keyur Patel: keyur.patel/at/duke.edu; John McHutchison: john.mchutchison/at/gilead.com; M Arthur Moseley: arthur.moseley/at/duke.edu
Received December 12, 2011; Accepted May 4, 2012.
Abstract
Background
Label-free quantitative proteomics holds a great deal of promise for the future study of both medicine and biology. However, the data generated is extremely intricate in its correlation structure, and its proper analysis is complex. There are issues with missing identifications. There are high levels of correlation between many, but not all, of the peptides derived from the same protein. Additionally, there may be systematic shifts in the sensitivity of the machine between experiments or even through time within the duration of a single experiment.
Results
We describe a hierarchical model for analyzing unbiased, label-free proteomics data that uses the covariance of peptide expression across samples, together with MS/MS-based identifications, to group peptides. We call this strategy metaprotein expression modeling. Our metaprotein model acknowledges the possibility of misidentifications, post-translational modifications and systematic differences between samples due to changes in instrument sensitivity or differences in total protein concentration. In addition, our approach allows us to validate findings from unbiased, label-free proteomics experiments with further unbiased, label-free proteomics experiments. Finally, we demonstrate the clinical/translational utility of the model for building predictors capable of differentiating biological phenotypes, as well as for validating those findings, in the context of three novel cohorts of patients with Hepatitis C.
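The grouping idea can be illustrated with a small sketch. This is not the hierarchical model described in the paper: it substitutes a simple greedy rule that places two peptides in the same group only if they share an MS/MS protein identification and their expression profiles correlate strongly across samples. The function names, the `min_corr` threshold and the toy intensities are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def group_peptides(intensities, protein_of, min_corr=0.8):
    """Greedily split each protein's peptides into co-varying groups.

    intensities: dict peptide -> list of log intensities, one per sample
    protein_of:  dict peptide -> protein assigned by MS/MS identification
    min_corr:    hypothetical threshold, not a value taken from the paper
    """
    groups = []  # each entry is (protein, [member peptides])
    for pep in sorted(intensities):
        for protein, members in groups:
            same_protein = protein == protein_of[pep]
            if same_protein and all(
                pearson(intensities[pep], intensities[m]) >= min_corr
                for m in members
            ):
                members.append(pep)
                break
        else:
            # No compatible group exists, so this peptide starts a new one.
            groups.append((protein_of[pep], [pep]))
    return groups


def group_profile(members, intensities):
    """Average the mean-centered profiles of a group, sample by sample."""
    centered = []
    for pep in members:
        m = mean(intensities[pep])
        centered.append([v - m for v in intensities[pep]])
    return [mean(col) for col in zip(*centered)]


# Toy data: three peptides identified as the same protein, four samples.
data = {
    "pepA": [1.0, 2.0, 3.0, 4.0],
    "pepB": [2.1, 3.0, 4.2, 5.0],
    "pepC": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with pepA and pepB
}
ids = {"pepA": "P1", "pepB": "P1", "pepC": "P1"}
groups = group_peptides(data, ids)
```

In this toy input, `pepC` carries the same identification as the other two peptides but does not co-vary with them, so it ends up in its own group. That mirrors the abstract's observation that many, but not all, peptides from one protein are correlated, for example because of a misidentification or a post-translational modification.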
Conclusions
Mass-spectrometry proteomics is quickly becoming a powerful tool for studying biological and translational questions. Making use of all of the information contained in a particular set of data will be critical to the success of those endeavors. Our proposed model represents an advance in the ability of statistical models of proteomic data to identify and utilize correlation between features. This allows validation of predictors without translation to targeted assays in addition to informing the choice of targets when it is appropriate to generate those assays.
Keywords: Proteomics, Factor, Hepatitis, Open platform, Statistics, Statistical model, SRM, MRM
Articles from BMC Bioinformatics are provided here courtesy of BioMed Central