
Results 1-10 (10)

1.  Proteogenomic database construction driven from large scale RNA-seq data 
Journal of proteome research  2013;13(1):21-28.
The advent of inexpensive RNA-seq and other deep sequencing technologies for RNA promises to radically improve genomic annotation by providing information on transcribed regions and splicing events in a variety of cellular conditions. Using MS-based proteogenomics, many of these events can be confirmed directly at the protein level. However, integrating large amounts of redundant RNA-seq data with mass spectrometry data poses a challenging problem. Our manuscript addresses this by constructing a compact database that contains all useful information expressed in RNA-seq reads. Applying our method to cumulative C. elegans data reduced 496.2 GB of aligned RNA-seq SAM files to a 410 MB splice graph database written in FASTA format. This corresponds to a more than 1000× compression of data size, without loss of sensitivity. We performed a proteogenomics study with the custom dataset, using a completely automated pipeline, and identified a total of 4044 novel events, including 215 novel genes, 808 novel exons, 12 alternative splicing events, 618 gene-boundary corrections, 245 exon-boundary changes, 938 frame-shifts, 1166 reverse-strand events, and 42 translated UTRs. Our results highlight the usefulness of integrating transcriptomic and proteomic data for improved genome annotations.
PMCID: PMC4034692  PMID: 23802565
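The figures quoted in the abstract above can be checked with a few lines of arithmetic. This is only a sanity-check sketch: it assumes decimal units (1 GB = 1000 MB), which the abstract does not state, and the dictionary keys are illustrative labels rather than terms from the paper's software.

```python
# Sizes quoted in the abstract. Decimal units (1 GB = 1000 MB) are an assumption.
sam_size_mb = 496.2 * 1000   # aligned RNA-seq SAM files
db_size_mb = 410.0           # splice graph database in FASTA format
compression_ratio = sam_size_mb / db_size_mb

# Novel event counts quoted in the abstract (keys are illustrative labels).
novel_events = {
    "novel genes": 215,
    "novel exons": 808,
    "alternative splicing": 12,
    "gene-boundary corrections": 618,
    "exon-boundary changes": 245,
    "frame-shifts": 938,
    "reverse-strand": 1166,
    "translated UTR": 42,
}
total_events = sum(novel_events.values())

print(f"compression ratio: about {compression_ratio:.0f}x")
print(f"total novel events: {total_events}")
```

Under that unit assumption the ratio comes out somewhat above the rounded 1000× figure, and the eight categories add up to the stated total of 4044.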
2.  Multiplexed MS/MS for Improved Data Independent Acquisition 
Nature methods  2013;10(8):10.1038/nmeth.2528.
In mass spectrometry-based proteomics, data-independent acquisition (DIA) strategies can acquire a single dataset useful for both identification and quantification of detectable peptides in a complex mixture. Despite this, DIA is often overlooked because its data are noisier, a result of the typical five- to ten-fold reduction in precursor selectivity compared with data-dependent acquisition or selected reaction monitoring. We demonstrate a multiplexing technique that improves precursor selectivity five-fold.
PMCID: PMC3881977  PMID: 23793237
Data Independent Acquisition; Q-Exactive; Multiplexing; Targeted Proteomics; Shotgun Proteomics
3.  De novo Correction of Mass Measurement Error in Low Resolution Tandem MS Spectra for Shotgun Proteomics 
We report an algorithm designed for the calibration of low resolution peptide mass spectra. Our algorithm is implemented in a program called FineTune, which corrects systematic mass measurement error in one minute, with no input required besides the mass spectra themselves. The mass measurement accuracy for a set of spectra collected on an LTQ-Velos improved 20-fold, from −0.1776 ± 0.0010 m/z to 0.0078 ± 0.0006 m/z after calibration (mean ± 95% confidence interval). The precision in mass measurement was also improved, owing to the correction of non-linear variation in mass measurement accuracy across the m/z range.
PMCID: PMC3515694  PMID: 23007965
Mass measurement accuracy; shotgun proteomics; linear ion trap
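To illustrate what correcting a systematic, m/z-dependent mass error can look like, here is a minimal sketch. It is not FineTune's algorithm: FineTune needs no input beyond the spectra, whereas this sketch assumes a set of reference m/z values is available. The function names, the bin width, and the toy numbers are all hypothetical.

```python
from statistics import median

def fit_binned_offsets(observed, reference, bin_width=200.0):
    """Return the median mass error (observed - reference) for each m/z bin."""
    errors_by_bin = {}
    for obs, ref in zip(observed, reference):
        errors_by_bin.setdefault(int(obs // bin_width), []).append(obs - ref)
    return {b: median(errs) for b, errs in errors_by_bin.items()}

def calibrate(mz, offsets, bin_width=200.0):
    """Subtract the offset fitted for the bin containing mz (0 if the bin is unseen)."""
    return mz - offsets.get(int(mz // bin_width), 0.0)

# Hypothetical toy data: the error is larger at low m/z than at high m/z.
observed = [300.20, 350.18, 700.10, 750.12]
reference = [300.00, 350.00, 700.00, 750.00]

offsets = fit_binned_offsets(observed, reference)
corrected = calibrate(320.19, offsets)
print(offsets)
print(corrected)
```

Fitting a separate offset per bin is one simple way to capture error that varies non-linearly across the m/z range; a smooth curve fit would be another.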
4.  (P1) Technology Development in a Multidisciplinary Center 
Variation in RNA, protein, and metabolite levels among individuals is an important source of physiological and phenotypic differences within and between species. However, relatively little is known about the magnitude and genetic basis of these high-dimensional molecular phenotypes. Yeast provide an ideal model system for the genetic dissection of complex and quantitative traits, and whole-genome sequences are accumulating for dozens of Saccharomyces cerevisiae strains isolated from natural, industrial, and lab environments. We grew a diverse selection of sequenced strains in continuous culture and used a randomized and replicated study design. We exploited all the technologies in the Yeast Resource Center to obtain high quality and high coverage measurements of RNA, protein, metabolite, and morphological phenotypes. The resulting data sets provide a unique and powerful opportunity to combine comparative functional genomics data with comparative sequence analyses and delineate the genetic architecture of complex and quantitative phenotypes in yeast. Our initial analyses indicate that a high degree of strain-to-strain variation exists at all systems levels, and that this variation largely correlates with strain relatedness as measured by sequence comparison. Variation in RNA levels correlates with the corresponding peptides and related metabolites in complex ways. These experiments have resulted in an important large-scale data set of thousands of quantitative traits collected in a carefully designed randomized study, which will provide novel insights into the magnitude and patterns of natural variation of molecular and morphological phenotypes, as well as preliminary insights into their genetic basis.
PMCID: PMC3630640
5.  Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project 
Gerstein, Mark B. | Lu, Zhi John | Van Nostrand, Eric L. | Cheng, Chao | Arshinoff, Bradley I. | Liu, Tao | Yip, Kevin Y. | Robilotto, Rebecca | Rechtsteiner, Andreas | Ikegami, Kohta | Alves, Pedro | Chateigner, Aurelien | Perry, Marc | Morris, Mitzi | Auerbach, Raymond K. | Feng, Xin | Leng, Jing | Vielle, Anne | Niu, Wei | Rhrissorrakrai, Kahn | Agarwal, Ashish | Alexander, Roger P. | Barber, Galt | Brdlik, Cathleen M. | Brennan, Jennifer | Brouillet, Jeremy Jean | Carr, Adrian | Cheung, Ming-Sin | Clawson, Hiram | Contrino, Sergio | Dannenberg, Luke O. | Dernburg, Abby F. | Desai, Arshad | Dick, Lindsay | Dosé, Andréa C. | Du, Jiang | Egelhofer, Thea | Ercan, Sevinc | Euskirchen, Ghia | Ewing, Brent | Feingold, Elise A. | Gassmann, Reto | Good, Peter J. | Green, Phil | Gullier, Francois | Gutwein, Michelle | Guyer, Mark S. | Habegger, Lukas | Han, Ting | Henikoff, Jorja G. | Henz, Stefan R. | Hinrichs, Angie | Holster, Heather | Hyman, Tony | Iniguez, A. Leo | Janette, Judith | Jensen, Morten | Kato, Masaomi | Kent, W. James | Kephart, Ellen | Khivansara, Vishal | Khurana, Ekta | Kim, John K. | Kolasinska-Zwierz, Paulina | Lai, Eric C. | Latorre, Isabel | Leahey, Amber | Lewis, Suzanna | Lloyd, Paul | Lochovsky, Lucas | Lowdon, Rebecca F. | Lubling, Yaniv | Lyne, Rachel | MacCoss, Michael | Mackowiak, Sebastian D. | Mangone, Marco | McKay, Sheldon | Mecenas, Desirea | Merrihew, Gennifer | Miller, David M. | Muroyama, Andrew | Murray, John I. | Ooi, Siew-Loon | Pham, Hoang | Phippen, Taryn | Preston, Elicia A. | Rajewsky, Nikolaus | Rätsch, Gunnar | Rosenbaum, Heidi | Rozowsky, Joel | Rutherford, Kim | Ruzanov, Peter | Sarov, Mihail | Sasidharan, Rajkumar | Sboner, Andrea | Scheid, Paul | Segal, Eran | Shin, Hyunjin | Shou, Chong | Slack, Frank J. | Slightam, Cindie | Smith, Richard | Spencer, William C. | Stinson, E. O. | Taing, Scott | Takasaki, Teruaki | Vafeados, Dionne | Voronina, Ksenia | Wang, Guilin | Washington, Nicole L. | Whittle, Christina M. 
| Wu, Beijing | Yan, Koon-Kiu | Zeller, Georg | Zha, Zheng | Zhong, Mei | Zhou, Xingliang | Ahringer, Julie | Strome, Susan | Gunsalus, Kristin C. | Micklem, Gos | Liu, X. Shirley | Reinke, Valerie | Kim, Stuart K. | Hillier, LaDeana W. | Henikoff, Steven | Piano, Fabio | Snyder, Michael | Stein, Lincoln | Lieb, Jason D. | Waterston, Robert H.
Science (New York, N.Y.)  2010;330(6012):1775-1787.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
PMCID: PMC3142569  PMID: 21177976
6.  Mitochondrial dysfunction in NnaD mutant flies and Purkinje cell degeneration (pcd) mice reveals a role for Nna proteins in neuronal bioenergetics 
Neuron  2010;66(6):835-847.
The Purkinje cell degeneration (pcd) mouse is a recessive model of neurodegeneration, involving cerebellum and retina. Purkinje cell death in pcd is dramatic, as >99% of Purkinje neurons are lost in three weeks. Loss-of-function of Nna1 causes pcd, and Nna1 is a highly conserved zinc carboxypeptidase. To determine the basis of pcd, we implemented a two-pronged approach, combining characterization of loss-of-function phenotypes of the Drosophila Nna1 orthologue (NnaD) with proteomics analysis of pcd mice. Reduced NnaD function yielded larval lethality, with survivors displaying phenotypes that mirror disease in pcd. Quantitative proteomics revealed expression alterations for glycolytic and oxidative phosphorylation enzymes. Nna proteins localize to mitochondria, loss of NnaD / Nna1 produces mitochondrial abnormalities, and pcd mice display altered proteolytic processing of Nna1 interacting proteins. Our studies indicate that Nna1 loss-of-function results in altered bioenergetics and mitochondrial dysfunction, and suggest that pcd shares pathogenic features with neurodegenerative disorders such as Parkinson's disease.
PMCID: PMC3101252  PMID: 20620870
7.  Deconvolution of Mixture Spectra from Ion-Trap Data-Independent-Acquisition Tandem Mass Spectrometry 
Analytical chemistry  2010;82(3):833.
Data-independent tandem mass spectrometry isolates and fragments all of the molecular species within a given mass-to-charge window, regardless of whether a precursor ion was detected within the window. For shotgun proteomics on complex protein mixtures, data-independent MS/MS offers certain advantages over the traditional data-dependent MS/MS: identification of low-abundance peptides with insignificant precursor peaks; more direct relative quantification, free of biases caused by competing precursors and dynamic exclusion; and faster throughput due to simultaneous fragmentation of multiple peptides. However, data-independent MS/MS, especially on low-resolution ion-trap instruments, strains standard peptide identification programs, because of less precise knowledge of the peptide precursor mass and large numbers of spectra composed of two or more peptides. Here we describe a computer program called DeMux that deconvolves mixture spectra and improves the peptide identification rate by ~25%. We compare the number of identifications made by data-independent and data-dependent MS/MS at the peptide and protein levels: conventional data-dependent MS/MS makes a greater number of identifications but is less reproducible from run to run.
PMCID: PMC2813958  PMID: 20039681
8.  Expediting the Development of Targeted SRM Assays: Using Data from Shotgun Proteomics to Automate Method Development 
Journal of proteome research  2009;8(6):2733-2739.
Selected reaction monitoring (SRM) is a powerful tandem mass spectrometry method that can be used to monitor target peptides within a complex protein digest. The specificity and sensitivity of the approach, as well as its capability to multiplex the measurement of many analytes in parallel, have made it a technology of particular promise for hypothesis-driven proteomics. An underappreciated aspect of developing an assay to measure many peptides in parallel is the time and effort necessary to establish a usable assay. Here we report the use of shotgun proteomics data to expedite the selection of SRM transitions for target peptides of interest. Tandem mass spectrometry data acquired on an LTQ ion trap mass spectrometer can accurately predict which fragment ions will produce the greatest signal in an SRM assay using a triple quadrupole mass spectrometer. Furthermore, we present a scoring routine that can compare the targeted SRM chromatogram data with an MS/MS spectrum acquired by data-dependent acquisition and stored in a library. This scoring routine is invaluable in determining which signal in the chromatogram from a complex mixture best represents the target peptide. These algorithmic developments have been implemented in a software package that is available from the authors upon request.
PMCID: PMC2743471  PMID: 19326923
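The two ideas in the abstract above, picking transitions from a library spectrum and scoring chromatogram peaks against that spectrum, can be sketched as follows. The abstract does not describe the actual scoring routine, so cosine similarity is used here purely as a stand-in assumption; the function names, ion labels, and intensities are hypothetical.

```python
import math

def top_transitions(library_spectrum, n=3):
    """Return the n fragment ions with the highest library intensity."""
    ranked = sorted(library_spectrum.items(), key=lambda kv: kv[1], reverse=True)
    return [ion for ion, _ in ranked[:n]]

def cosine_similarity(library_spectrum, srm_areas, ions):
    """Compare library intensities with SRM peak areas over the chosen ions."""
    lib = [library_spectrum[ion] for ion in ions]
    obs = [srm_areas.get(ion, 0.0) for ion in ions]
    dot = sum(a * b for a, b in zip(lib, obs))
    norm = math.sqrt(sum(a * a for a in lib)) * math.sqrt(sum(b * b for b in obs))
    return dot / norm if norm else 0.0

# Hypothetical library MS/MS spectrum: fragment ion label -> relative intensity.
library = {"y7": 100.0, "y6": 60.0, "b4": 20.0, "y5": 80.0, "b3": 5.0}
transitions = top_transitions(library)

# Hypothetical peak areas for two candidate chromatogram peaks.
peak_a = {"y7": 5000.0, "y5": 4000.0, "y6": 3000.0}
peak_b = {"y7": 500.0, "y5": 4000.0, "y6": 100.0}
score_a = cosine_similarity(library, peak_a, transitions)
score_b = cosine_similarity(library, peak_b, transitions)
print(transitions, score_a, score_b)
```

In this toy case the first peak has the same relative fragment intensities as the library spectrum and so scores higher, which is the kind of evidence a scoring routine would use to decide which chromatogram signal belongs to the target peptide.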
9.  Post analysis data acquisition for the iterative MS/MS sampling of proteomics mixtures 
Journal of proteome research  2009;8(4):1870-1875.
The identification of peptides by microcapillary liquid chromatography-tandem mass spectrometry (µLC-MS/MS) has become routine because of the development of fast scanning mass spectrometers, data-dependent acquisition, and database searching algorithms. However, many peptides within the detection limit of the mass spectrometer remain unidentified because of limitations in MS/MS sampling speed, despite the dynamic range and peak capacity of the instrument. We have developed an automated approach that uses the mass spectra from high resolution µLC-MS data to define the molecular species present in the mixture and directs the acquisition of MS/MS spectra to precursors that were missed in prior analyses. This approach increases the coverage of the molecular species sampled by MS/MS and consequently the number of peptides and proteins identified during the acquisition of technical or biological replicates using a simple one-dimensional chromatographic separation. The combination of a unique workflow and custom software contributes to the improved identification of molecular features detected in proteomics experiments of complex protein mixtures.
PMCID: PMC2671646  PMID: 19256536
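The core bookkeeping step described above, finding detected species that earlier runs never sampled by MS/MS, might look like the sketch below. It is an assumption-laden illustration, not the authors' software: the matching rule (a ppm tolerance on m/z plus a retention-time window), the tolerance values, and all names and numbers are hypothetical.

```python
def ppm_difference(mz_a, mz_b):
    """Absolute difference between two m/z values in parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6

def unsampled_features(features, sampled, ppm_tol=10.0, rt_tol=0.5):
    """Return MS1 features with no previously sampled precursor nearby.

    features and sampled are lists of (m/z, retention time in minutes).
    """
    missed = []
    for mz, rt in features:
        already_sampled = any(
            ppm_difference(mz, s_mz) <= ppm_tol and abs(rt - s_rt) <= rt_tol
            for s_mz, s_rt in sampled
        )
        if not already_sampled:
            missed.append((mz, rt))
    return missed

# Hypothetical MS1 features and precursors sampled in a prior run.
features = [(500.2500, 20.1), (612.3300, 35.4), (745.4100, 50.0)]
sampled = [(500.2510, 20.3)]

inclusion_list = unsampled_features(features, sampled)
print(inclusion_list)
```

The resulting list would then direct MS/MS acquisition in the next replicate run, so that each iteration samples species the previous ones missed.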
10.  A Novel Dual-Pressure Linear Ion Trap Mass Spectrometer Improves the Analysis of Complex Protein Mixtures 
Analytical chemistry  2009;81(18):7757-7765.
The considerable progress in high throughput proteomics analysis via liquid chromatography-electrospray ionization-tandem mass spectrometry over the last decade has been fueled to a large degree by continuous improvements in instrumentation. High throughput identification experiments are based on peptide sequencing and are largely accomplished through the use of tandem mass spectrometry, with ion trap and trap-based instruments having become broadly adopted analytical platforms. To satisfy increasingly demanding requirements for depth of characterization and throughput, we present a newly developed dual-pressure linear ion trap mass spectrometer (LTQ Velos) that features increased sensitivity, afforded by a new source design, and demonstrates practical cycle times two times shorter than those of an LTQ XL, while improving or maintaining spectral quality for MS/MS fragmentation spectra. These improvements resulted in a substantial increase in the detection and identification of both proteins and unique peptides from the complex proteome of Caenorhabditis elegans, as compared to existing platforms. The greatly increased ion flux into the mass spectrometer, in combination with improved isolation of low-abundance precursor ions, resulted in increased detection of low-abundance peptides. These improvements cumulatively resulted in a substantially greater penetration into the baker’s yeast (Saccharomyces cerevisiae) proteome compared to the LTQ XL. Alternatively, faster cycle times on the new instrument allowed for higher throughput for a given depth of proteome analysis, with more peptides and proteins identified in 60 min using an LTQ Velos than in 180 min using an LTQ XL. When mass analysis was carried out with resolution in excess of 25,000 FWHM, it became possible to isotopically resolve a small intact protein and its fragments, opening possibilities for top-down experiments.
PMCID: PMC2810160  PMID: 19689114
