Results 1-7 (7)
1.  Sparse Regression Based Structure Learning of Stochastic Reaction Networks from Single Cell Snapshot Time Series 
PLoS Computational Biology  2016;12(12):e1005234.
Stochastic chemical reaction networks constitute a model class to quantitatively describe dynamics and cell-to-cell variability in biological systems. The topology of these networks typically is only partially characterized due to experimental limitations. Current approaches for refining network topology are based on the explicit enumeration of alternative topologies and are therefore restricted to small problem instances with almost complete knowledge. We propose the reactionet lasso, a computational procedure that derives a stepwise sparse regression approach on the basis of the Chemical Master Equation, enabling large-scale structure learning for reaction networks by implicitly accounting for billions of topology variants. We have assessed the structure learning capabilities of the reactionet lasso on synthetic data for the complete TRAIL induced apoptosis signaling cascade comprising 70 reactions. We find that the reactionet lasso is able to efficiently recover the structure of these reaction systems, ab initio, with high sensitivity and specificity. With only < 1% false discoveries, the reactionet lasso is able to recover 45% of all true reactions ab initio among > 6000 possible reactions and over 10^2000 network topologies. In conjunction with information rich single cell technologies such as single cell RNA sequencing or mass cytometry, the reactionet lasso will enable large-scale structure learning, particularly in areas with partial network structure knowledge, such as cancer biology, and thereby enable the detection of pathological alterations of reaction networks. We provide software to allow for wide applicability of the reactionet lasso.
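As a rough check on the scale quoted in this abstract, the sketch below counts candidate topologies under a simplifying assumption that is not stated in the abstract: each candidate reaction is either included or excluded independently, so N candidates give 2^N topologies. The function names are illustrative only. Reading the quoted topology count as a power of ten (10^2000), it shows how many candidate reactions that figure would correspond to.

```python
import math


def log10_topology_count(n_candidate_reactions: int) -> float:
    """Return log10(2**n), assuming every subset of candidates is a topology."""
    return n_candidate_reactions * math.log10(2)


def reactions_needed(log10_target: float) -> int:
    """Smallest n such that 2**n is at least 10**log10_target."""
    return math.ceil(log10_target / math.log10(2))


if __name__ == "__main__":
    # 6000 candidate reactions already give roughly 10^1806 subsets.
    print(round(log10_topology_count(6000), 1))
    # Number of candidates needed to reach 10^2000 subsets.
    print(reactions_needed(2000))
```

Under this assumption, 6000 candidates yield about 10^1806 subsets and roughly 6,644 candidates are needed to exceed 10^2000, which is consistent with the abstract's "> 6000 possible reactions".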
Author Summary
Virtually all biological processes are driven by biochemical reactions. However, their quantitative description in terms of stochastic chemical reaction networks is often precluded by the computational difficulty of structure learning, i.e. the identification of biologically active reaction networks among the combinatorially many possible topologies. This work describes the reactionet lasso, a structure learning approach that takes advantage of novel, information-rich single cell data and a tractable problem formulation to achieve structure learning for problem instances hundreds of orders of magnitude larger than previously reported. This approach opens the prospect of obtaining quantitative and predictive reaction models in many areas of biology and medicine, and in particular areas such as cancer biology, which are characterized by significant system alterations and many unknown reactions.
PMCID: PMC5140059  PMID: 27923064
2.  Epitope-Tagged Autotransporters as Single-Cell Reporters for Gene Expression by a Salmonella Typhimurium wbaP Mutant 
PLoS ONE  2016;11(5):e0154828.
Phenotypic diversity is an important trait of bacterial populations and can enhance fitness of the existing genotype in a given environment. To characterize different subpopulations, several studies have analyzed differential gene expression using fluorescent reporters. These studies visualized either single or multiple genes within single cells using different fluorescent proteins. However, variable maturation and folding kinetics of different fluorophores complicate the study of dynamics of gene expression. Here, we present a proof-of-principle study for an alternative gene expression system in a wbaP mutant of Salmonella Typhimurium (S. Tm) lacking the O-sidechain of the lipopolysaccharide. We employed the hemagglutinin (HA)-tagged inverse autotransporter invasin (invAHA) as a transcriptional reporter for the expression of the type three secretion system 1 (T1) in S. Tm. Using a two-reporter approach with GFP and the InvAHA in single cells, we verify that this reporter system can be used for T1 gene expression analysis, at least in strains lacking the O-antigen (wbaP), which are permissive for detection of the surface-exposed HA-epitope. When we placed the two reporters gfp and invAHA under the control of either one or two different promoters of the T1 regulon, we were able to show correlative expression of both reporters. We conclude that the invAHA reporter system is a suitable tool to analyze T1 gene expression in S. Tm and propose its applicability as a molecular tool for gene expression studies within single cells.
PMCID: PMC4858243  PMID: 27149272
3.  The Mtb Proteome Library: A Resource of Assays to Quantify the Complete Proteome of Mycobacterium tuberculosis 
Cell host & microbe  2013;13(5):602-612.
Research advancing our understanding of Mycobacterium tuberculosis (Mtb) biology and complex host-Mtb interactions requires consistent and precise quantitative measurements of Mtb proteins. We describe the generation and validation of a compendium of assays to quantify 97% of the 4,012 annotated Mtb proteins by the targeted mass spectrometric method selected reaction monitoring (SRM). Furthermore, we estimate the absolute abundance for 55% of all Mtb proteins, revealing a dynamic range within the Mtb proteome of over four orders of magnitude, and identify previously un-annotated proteins. As an example of the assay library utility, we monitored the entire Mtb dormancy survival regulon (DosR), which is linked to anaerobic survival and Mtb persistence, and show its dynamic protein-level regulation during hypoxia. In conclusion, we present a publicly available research resource that supports the sensitive, precise, and reproducible quantification of virtually any Mtb protein by a robust and widely accessible mass spectrometric method.
PMCID: PMC3766585  PMID: 23684311
4.  Estimation of Absolute Protein Quantities of Unlabeled Samples by Selected Reaction Monitoring Mass Spectrometry
Molecular & Cellular Proteomics : MCP  2011;11(3):M111.013987.
For many research questions in modern molecular and systems biology, information about absolute protein quantities is imperative. This information includes, for example, kinetic modeling of processes, protein turnover determinations, stoichiometric investigations of protein complexes, or quantitative comparisons of different proteins within one sample or across samples. To date, the vast majority of proteomic studies are limited to providing relative quantitative comparisons of protein levels between limited numbers of samples. Here we describe and demonstrate the utility of a targeting MS technique for the estimation of absolute protein abundance in unlabeled and nonfractionated cell lysates. The method is based on selected reaction monitoring (SRM) mass spectrometry and the “best flyer” hypothesis, which assumes that the specific MS signal intensity of the most intense tryptic peptides per protein is approximately constant throughout a whole proteome. SRM-targeted best flyer peptides were selected for each protein from the peptide precursor ion signal intensities from directed MS data. The most intense transitions per peptide were selected from full MS/MS scans of crude synthetic analogs. We used Monte Carlo cross-validation to systematically investigate the accuracy of the technique as a function of the number of measured best flyer peptides and the number of SRM transitions per peptide. We found that a linear model based on the two most intense transitions of the three best flying peptides per protein (TopPep3/TopTra2) generated optimal results with a cross-correlated mean fold error of 1.8 and a squared Pearson coefficient R^2 of 0.88. Applying the optimized model to lysates of the microbe Leptospira interrogans, we detected significant protein abundance changes of 39 target proteins upon antibiotic treatment, which correlate well with literature values.
The described method is generally applicable and exploits the inherent performance advantages of SRM, such as high sensitivity, selectivity, reproducibility, and dynamic range, and estimates absolute protein concentrations of selected proteins at minimized costs.
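The TopPep3/TopTra2 idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the aggregation rule (sum the two most intense transitions per peptide, then average the three most intense peptides), the log10-log10 calibration line, and the definition of mean fold error used here are all assumptions, and every function name is hypothetical.

```python
def protein_intensity(transitions_by_peptide, n_pep=3, n_tra=2):
    """Aggregate SRM transition intensities into one value per protein.

    Assumed rule: sum the n_tra most intense transitions of each peptide,
    then average the n_pep most intense peptides.
    """
    per_peptide = [
        sum(sorted(intensities, reverse=True)[:n_tra])
        for intensities in transitions_by_peptide.values()
    ]
    top = sorted(per_peptide, reverse=True)[:n_pep]
    return sum(top) / len(top)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_fold_error(pred_log10, true_log10):
    """Assumed definition: 10 to the power of the mean absolute log10 error."""
    diffs = [abs(p - t) for p, t in zip(pred_log10, true_log10)]
    return 10 ** (sum(diffs) / len(diffs))
```

In use, `fit_line` would be calibrated on the log10 intensities and log10 known abundances of a set of anchor proteins, and the fitted line would then be applied to the remaining proteins; a mean fold error of 1.0 would indicate perfect agreement on held-out anchors.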
PMCID: PMC3316728  PMID: 22101334
5.  The quantitative proteome of a human cell line 
The majority of all proteins expressed in the human osteosarcoma cell line U2OS were absolutely quantified by mass spectrometry. The quantified proteins span a concentration range of seven orders of magnitude up to 20 000 000 copies per cell.
The generation of mathematical models of biological processes, the simulation of these processes under different conditions, and the comparison and integration of multiple data sets are explicit goals of systems biology that require the knowledge of the absolute quantity of the system's components. To date, systematic estimates of cellular protein concentrations have been exceptionally scarce. Here, we provide a quantitative description of the proteome of a commonly used human cell line in two functional states, interphase and mitosis. We show that these human cultured cells express at least ∼10 000 proteins and that the quantified proteins span a concentration range of seven orders of magnitude up to 20 000 000 copies per cell. We discuss how protein abundance is linked to function and evolution.
PMCID: PMC3261713  PMID: 22068332
mass spectrometry; protein abundance; proteomics
6.  Absolute quantification of microbial proteomes at different states by directed mass spectrometry 
The developed, directed mass spectrometry workflow allows the generation of consistent and system-wide quantitative maps of microbial proteomes in a single analysis. Application to the human pathogen L. interrogans revealed mechanistic proteome changes over time involved in pathogenic progression and antibiotic defense, and new insights about the regulation of absolute protein abundances within operons.
The developed, directed proteomic approach allowed consistent detection and absolute quantification of 1680 proteins of the human pathogen L. interrogans in a single LC–MS/MS experiment. The comparison of 25 extensive, consistent and quantitative proteome maps revealed new insights about the proteome changes involved in pathogenic progression and antibiotic defense of L. interrogans, and about the regulation of protein abundances within operons. The generated time-resolved data sets are compatible with pattern analysis algorithms developed for transcriptomics, including hierarchical clustering and functional enrichment analysis of the detected profile clusters. This is the first study that describes the absolute quantitative behavior of any proteome over multiple states and represents the most comprehensive proteome abundance pattern comparison for any organism to date.
Over the last decade, mass spectrometry (MS)-based proteomics has evolved as the method of choice for system-wide proteome studies and now allows for the characterization of several thousands of proteins in a single sample. Despite these great advances, redundant monitoring of protein levels over large sample numbers in a high-throughput manner remains a challenging task. New directed MS strategies have been shown to overcome some of the current limitations, thereby enabling the acquisition of consistent and system-wide data sets of proteomes with low-to-moderate complexity at high throughput.
In this study, we applied this integrated, two-stage MS strategy to investigate global proteome changes in the human pathogen L. interrogans. In the initial discovery phase, 1680 proteins (out of around 3600 gene products) could be identified (Schmidt et al, 2008) and, by focusing precious MS-sequencing time on the most dominant, specific peptides per protein, all proteins could be accurately and consistently monitored over 25 different samples within a few days of instrument time in the following scoring phase (Figure 1). Additionally, the co-analysis of heavy reference peptides enabled us to obtain absolute protein concentration estimates for all identified proteins in each perturbation (Malmström et al, 2009). The detected proteins did not show any biases against functional groups or protein classes, including membrane proteins, and span an abundance range of more than three orders of magnitude, a range that is expected to cover most of the L. interrogans proteome (Malmström et al, 2009).
To elucidate mechanistic proteome changes over time involved in pathogenic progression and antibiotic defense of L. interrogans, we generated time-resolved proteome maps of cells perturbed with serum and three different antibiotics at sublethal concentrations that are currently used to treat leptospirosis. This yielded an information-rich proteomic data set that describes, for the first time, the absolute quantitative behavior of any proteome over multiple states, and represents the most comprehensive proteome abundance pattern comparison for any organism to date. Using this unique property of the data set, we could quantify protein components of entire pathways across several time points and subject the data sets to cluster analysis, a tool that was previously limited to the transcript level due to incomplete sampling on protein level (Figure 4). Based on these analyses, we could demonstrate that Leptospira cells adjust the cellular abundance of a certain subset of proteins and pathways as a general response to stress while other parts of the proteome respond in a highly specific manner. The cells furthermore react to individual treatments by ‘fine tuning’ the abundance of certain proteins and pathways in order to cope with the specific cause of stress. Intriguingly, the most specific and significant expression changes were observed for proteins involved in motility, tissue penetration and virulence after serum treatment where we tried to simulate the host environment. While many of the detected protein changes demonstrate good agreement with available transcriptomics data, most proteins showed a poor correlation. This includes potential virulence factors, like Loa22 or OmpL1, with confirmed expression in vivo that were significantly up-regulated on the protein level, but not on the mRNA level, strengthening the importance of proteomic studies.
The high resolution and coverage of the proteome data set enabled us to further investigate protein abundance changes of co-regulated genes within operons. This suggests that although most proteins within an operon respond to regulation synchronously, bacterial cells seem to have subtle means to adjust the levels of individual proteins or protein groups outside of the general trend, a phenomenon that was recently also observed on the transcript level of other bacteria (Güell et al, 2009).
The method can be implemented with standard high-resolution mass spectrometers and software tools that are readily available in the majority of proteomics laboratories. It is scalable to any proteome of low-to-medium complexity and can be extended to post-translational modifications or peptide-labeling strategies for quantification. We therefore expect the approach outlined here to become a cornerstone for microbial systems biology.
Over the past decade, liquid chromatography coupled with tandem mass spectrometry (LC–MS/MS) has evolved into the main proteome discovery technology. Up to several thousand proteins can now be reliably identified from a sample and the relative abundance of the identified proteins can be determined across samples. However, the remeasurement of substantially similar proteomes, for example those generated by perturbation experiments in systems biology, at high reproducibility and throughput remains challenging. Here, we apply a directed MS strategy to detect and quantify sets of pre-determined peptides in tryptic digests of cells of the human pathogen Leptospira interrogans at 25 different states. We show that in a single LC–MS/MS experiment around 5000 peptides, covering 1680 L. interrogans proteins, can be consistently detected and their absolute expression levels estimated, revealing new insights about the proteome changes involved in pathogenic progression and antibiotic defense of L. interrogans. This is the first study that describes the absolute quantitative behavior of any proteome over multiple states, and represents the most comprehensive proteome abundance pattern comparison for any organism to date.
PMCID: PMC3159967  PMID: 21772258
absolute quantification; directed mass spectrometry; Leptospira interrogans; microbiology; proteomics
7.  Proteome coverage prediction with infinite Markov models 
Bioinformatics  2009;25(12):i154-i160.
Motivation: Liquid chromatography tandem mass spectrometry (LC-MS/MS) is the predominant method to comprehensively characterize complex protein mixtures such as samples from prefractionated or complete proteomes. In order to maximize proteome coverage for the studied sample, i.e. identify as many traceable proteins as possible, LC-MS/MS experiments are typically repeated extensively and the results combined. Proteome coverage prediction is the task of estimating the number of peptide discoveries of future LC-MS/MS experiments. Proteome coverage prediction is important to enhance the design of efficient proteomics studies. To date, there does not exist any method to reliably estimate the increase of proteome coverage at an early stage.
Results: We propose an extended infinite Markov model DiriSim to extrapolate the progression of proteome coverage based on a small number of already performed LC-MS/MS experiments. The method explicitly accounts for the uncertainty of peptide identifications. We tested DiriSim on a set of 37 LC-MS/MS experiments of a complete proteome sample and demonstrated that DiriSim correctly predicts the coverage progression already from a small subset of experiments. The predicted progression enabled us to specify maximal coverage for the test sample. We demonstrated that quality requirements on the final proteome map impose an upper bound on the number of useful experiment repetitions and limit the achievable proteome coverage.
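The quantity that a coverage-prediction method extrapolates is the cumulative number of distinct peptide identifications over repeated experiments. The sketch below only computes that observed progression from already performed runs; it is not DiriSim and does not model identification uncertainty. The function name and the input layout (one set of peptide identifiers per experiment) are assumptions made for illustration.

```python
def coverage_progression(experiments):
    """Return the cumulative count of distinct peptides after each experiment.

    experiments: iterable of sets of peptide identifiers, in run order.
    """
    seen = set()
    curve = []
    for identified in experiments:
        seen.update(identified)
        curve.append(len(seen))
    return curve


def new_discoveries(curve):
    """Per-experiment gain in coverage, derived from the cumulative curve."""
    return [curve[0]] + [b - a for a, b in zip(curve, curve[1:])] if curve else []
```

A flattening curve, meaning per-experiment gains that approach zero, is the saturation behavior that motivates an upper bound on the number of useful experiment repetitions.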
PMCID: PMC2687987  PMID: 19477982