Results 1-14 (14)

1.  Panorama of ancient metazoan macromolecular complexes 
Nature  2015;525(7569):339-344.
Macromolecular complexes are essential to conserved biological processes, but their prevalence across animals is unclear. By combining extensive biochemical fractionation with quantitative mass spectrometry, we directly examined the composition of soluble multiprotein complexes among diverse metazoan models. Using an integrative approach, we then generated a draft conservation map consisting of >1 million putative high-confidence co-complex interactions for species with fully sequenced genomes that encompasses functional modules present broadly across all extant animals. Clustering revealed a spectrum of conservation, ranging from ancient Eukaryal assemblies likely serving cellular housekeeping roles for at least 1 billion years, ancestral complexes that have accrued contemporary components, and rarer metazoan innovations linked to multicellularity. We validated these projections by independent co-fractionation experiments in evolutionarily distant species, by affinity-purification and by functional analyses. The comprehensiveness, centrality and modularity of these reconstructed interactomes reflect their fundamental mechanistic significance and adaptive value to animal cell systems.
PMCID: PMC5036527  PMID: 26344197
2.  Proteome-wide dataset supporting the study of ancient metazoan macromolecular complexes 
Data in Brief  2015;6:715-721.
Our analysis examines the conservation of multiprotein complexes among metazoa through use of high-resolution biochemical fractionation and precision mass spectrometry applied to soluble cell extracts from 5 representative model organisms: Caenorhabditis elegans, Drosophila melanogaster, Mus musculus, Strongylocentrotus purpuratus, and Homo sapiens. The interaction network obtained from the data was validated globally in 4 distant species (Xenopus laevis, Nematostella vectensis, Dictyostelium discoideum, Saccharomyces cerevisiae) and locally by targeted affinity-purification experiments. Here we provide details of our massive set of supporting biochemical fractionation data available via ProteomeXchange (PXD002319-PXD002328), PPIs via BioGRID (185267), and interaction network projections via ( made fully accessible to allow further exploration. The datasets here are related to the research article on metazoan macromolecular complexes in Nature [1].
PMCID: PMC4738005  PMID: 26870755
Proteomics; Metazoa; Protein complexes; Biochemical; Fractionation
3.  Controlled Measurement and Comparative Analysis of Cellular Components in E. coli Reveals Broad Regulatory Changes in Response to Glucose Starvation 
PLoS Computational Biology  2015;11(8):e1004400.
How do bacteria regulate their cellular physiology in response to starvation? Here, we present a detailed characterization of Escherichia coli growth and starvation over a time-course lasting two weeks. We have measured multiple cellular components, including RNA and proteins at deep genomic coverage, as well as lipid modifications and flux through central metabolism. Our study focuses on the physiological response of E. coli in stationary phase as a result of being starved for glucose, not on the genetic adaptation of E. coli to utilize alternative nutrients. In our analysis, we have taken advantage of the temporal correlations within and among RNA and protein abundances to identify systematic trends in gene regulation. Specifically, we have developed a general computational strategy for classifying expression-profile time courses into distinct categories in an unbiased manner. We have also developed, from dynamic models of gene expression, a framework to characterize protein degradation patterns based on the observed temporal relationships between mRNA and protein abundances. By comparing and contrasting our transcriptomic and proteomic data, we have identified several broad physiological trends in the E. coli starvation response. Strikingly, mRNAs are widely down-regulated in response to glucose starvation, presumably as a strategy for reducing new protein synthesis. By contrast, protein abundances display more varied responses. The abundances of many proteins involved in energy-intensive processes mirror the corresponding mRNA profiles while proteins involved in nutrient metabolism remain abundant even though their corresponding mRNAs are down-regulated.
Author Summary
Bacteria frequently experience starvation conditions in their natural environments. Yet how they modify their physiology in response to these conditions remains poorly understood. Here, we performed a detailed, two-week starvation experiment in E. coli. We exhaustively monitored changes in cellular components, such as RNA and protein abundances, over time. We subsequently compared and contrasted these measurements using novel computational approaches we developed specifically for analyzing gene-expression time-course data. Using these approaches, we could identify systematic trends in the E. coli starvation response. In particular, we found that cells systematically limit mRNA and protein production, degrade proteins involved in energy-intensive processes, and maintain or increase the amount of proteins involved in energy production. Thus, the bacteria assume a cellular state in which their ongoing energy use is limited while they are poised to take advantage of any nutrients that may become available.
PMCID: PMC4537216  PMID: 26275208
4.  A proteomic survey of widespread protein aggregation in yeast 
Molecular bioSystems  2014;10(4):851-861.
Many normally cytosolic yeast proteins form insoluble intracellular bodies in response to nutrient depletion, suggesting the potential for widespread protein aggregation in stressed cells. Nearly 200 such bodies have been found in yeast by screening libraries of fluorescently tagged proteins. To more broadly characterize the formation of these bodies in response to stress, we employed a proteome-wide shotgun mass spectrometry assay to measure shifts in the intracellular solubilities of endogenous proteins following heat stress. As quantified by mass spectrometry, heat stress tended to shift the same proteins into insoluble form as did nutrient depletion; many of these proteins were also known to form foci in response to arsenic stress. Affinity purification of several foci-forming proteins showed enrichment for co-purifying chaperones, including Hsp90 chaperones. Tests of induction conditions and co-localization of metabolic enzymes participating in the same metabolic pathways suggested those foci did not correspond to multi-enzyme organizing centers. Thus, in yeast, the formation of stress bodies appears common across diverse, normally diffuse cytoplasmic proteins and is induced by multiple types of cell stress, including thermal, chemical, and nutrient stress.
PMCID: PMC4142438  PMID: 24488121
5.  Proteomic Identification of Monoclonal Antibodies from Serum 
Analytical Chemistry  2014;86(10):4758-4766.
Characterizing the in vivo dynamics of the polyclonal antibody repertoire in serum, such as that which might arise in response to stimulation with an antigen, is difficult due to the presence of many highly similar immunoglobulin proteins, each specified by distinct B lymphocytes. These challenges have precluded the use of conventional mass spectrometry for antibody identification based on peptide mass spectral matches to a genomic reference database. Recently, progress has been made using bottom-up analysis of serum antibodies by nanoflow liquid chromatography/high-resolution tandem mass spectrometry combined with a sample-specific antibody sequence database generated by high-throughput sequencing of individual B cell immunoglobulin variable domains (V genes). Here, we describe how intrinsic features of antibody primary structure, most notably the interspersed segments of variable and conserved amino acid sequences, generate recurring patterns in the corresponding peptide mass spectra of V gene peptides, greatly complicating the assignment of correct sequences to mass spectral data. We show that the standard method of decoy-based error modeling fails to account for the error introduced by these highly similar sequences, leading to a significant underestimation of the false discovery rate. Because of these effects, antibody-derived peptide mass spectra require increased stringency in their interpretation. The use of filters based on the mean precursor ion mass accuracy of peptide-spectrum matches is shown to be particularly effective in distinguishing between “true” and “false” identifications. These findings highlight important caveats associated with the use of standard database search and error-modeling methods with nonstandard data sets and custom sequence databases.
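The decoy-based error model and the mass-accuracy filter mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common target-decoy convention in which the false discovery rate is estimated as the number of decoy matches divided by the number of target matches above a score threshold. The function names, the score scale, the use of the mean absolute error in ppm, and the tolerance value are all hypothetical choices for illustration and are not taken from the article.

```python
def decoy_fdr(psms, threshold):
    """Estimate the FDR among target matches scoring at or above `threshold`.

    `psms` is an iterable of (score, is_decoy) pairs, one per
    peptide-spectrum match. The estimate is decoys / targets, which is
    one common convention (an assumption here, not the article's method).
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so no estimated false discoveries
    return decoys / targets


def mean_abs_ppm_error(ppm_errors):
    """Mean absolute precursor mass error (in ppm) over a peptide's matches."""
    return sum(abs(e) for e in ppm_errors) / len(ppm_errors)


def filter_by_mass_accuracy(ppm_by_peptide, max_mean_ppm):
    """Keep peptides whose mean absolute precursor error is within tolerance.

    `ppm_by_peptide` maps a peptide sequence to the list of precursor
    mass errors (ppm) of its matches. `max_mean_ppm` is a hypothetical
    tolerance chosen by the user.
    """
    return {peptide for peptide, errors in ppm_by_peptide.items()
            if errors and mean_abs_ppm_error(errors) <= max_mean_ppm}


# Illustrative data only: (score, is_decoy)
example_psms = [(9.1, False), (8.7, False), (8.2, True), (7.9, False),
                (7.5, False), (6.0, True), (5.5, False)]
print(decoy_fdr(example_psms, 7.0))  # 1 decoy / 4 targets

example_errors = {"PEPTIDEA": [0.5, -0.7, 0.3], "PEPTIDEB": [4.0, -5.0]}
print(filter_by_mass_accuracy(example_errors, 1.5))
```

A decoy database built from shuffled or reversed sequences does not reproduce the near-identical sequences found among V gene peptides, which is why the abstract reports that this kind of estimate can understate the true error rate for antibody data.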
PMCID: PMC4033631  PMID: 24684310
6.  Bacteriophages use an expanded genetic code on evolutionary paths to higher fitness 
Nature chemical biology  2014;10(3):178-180.
Bioengineering advances have made it possible to fundamentally alter the genetic codes of organisms. However, the evolutionary consequences of expanding an organism's genetic code with a non-canonical amino acid are poorly understood. Here we show that bacteriophages evolved on a host that incorporates 3-iodotyrosine at the amber stop codon acquired neutral and beneficial mutations to this new amino acid in their proteins, demonstrating that an expanded genetic code increases evolvability.
PMCID: PMC3932624  PMID: 24487692
7.  A Census of Human Soluble Protein Complexes 
Cell  2012;150(5):1068-1081.
Cellular processes often depend on stable physical associations between proteins. Despite recent progress, knowledge of the composition of human protein complexes remains limited. To close this gap, we applied an integrative global proteomic profiling approach, based on chromatographic separation of cultured human cell extracts into more than one thousand biochemical fractions which were subsequently analyzed by quantitative tandem mass spectrometry, to systematically identify a network of 13,993 high-confidence physical interactions among 3,006 stably-associated soluble human proteins. Most of the 622 putative protein complexes we report are linked to core biological processes, and encompass both candidate disease genes and unannotated proteins to inform on mechanism. Strikingly, whereas larger multi-protein assemblies tend to be more extensively annotated and evolutionarily conserved, human protein complexes with 5 or fewer subunits are far more likely to be functionally un-annotated or restricted to vertebrates, suggesting more recent functional innovations.
PMCID: PMC3477804  PMID: 22939629
8.  Proteomic and protein interaction network analysis of human T lymphocytes during cell-cycle entry 
Proteomic analysis of T cells emerging from quiescence identifies dynamic network-level changes in key cellular processes. Disruption of two such processes, ribosome biogenesis and RNA splicing, reveals that the programs controlling cell growth and cell-cycle entry are separable.
The authors conduct a proteomic and protein interaction network analysis of human T lymphocytes during entry into the first cell cycle. Inhibiting the induction of eIF6 (60S ribosome biogenesis) causes T cells to enter the cell cycle without growing in size. Inhibiting the induction of SF3B2/SF3B4 (U2/U12-dependent RNA splicing) allows an increase in cell size without entering the cell cycle. These results provide proof of principle that blastogenesis and proliferation programs are separable in primary human T cells.
Regulating the transition of cells such as T lymphocytes from quiescence (G0) into an activated, proliferating state involves initiation of cellular programs resulting in entry into the cell cycle (proliferation), the growth cycle (blastogenesis, cell size) and effector (functional) activation. We show the first proteomic analysis of protein interaction networks activated during entry into the first cell cycle from G0. We also provide proof of principle that blastogenesis and proliferation programs are separable in primary human T cells. We employed a proteomic profiling method to identify large-scale changes in chromatin/nuclear matrix-bound and unbound proteins in human T lymphocytes during the transition from G0 into the first cell cycle and mapped them to form functionally annotated, dynamic protein interaction networks. Inhibiting the induction of two proteins involved in two of the most significantly upregulated cellular processes, ribosome biogenesis (eIF6) and hnRNA splicing (SF3B2/SF3B4), showed, respectively, that human T cells can enter the cell cycle without growing in size, or increase in size without entering the cell cycle.
PMCID: PMC3321526  PMID: 22415777
cell cycle; cell size; mass spectrometry; proteomics; T cells
9.  Protein abundances are more conserved than mRNA abundances across diverse taxa 
Proteomics  2010;10(23):4209-4212.
Proteins play major roles in most biological processes; as a consequence, protein expression levels are highly regulated. While extensive post-transcriptional, translational and protein degradation control clearly influence protein concentration and functionality, it is often thought that protein abundances are primarily determined by the abundances of the corresponding mRNAs. Hence surprisingly, a recent study showed that abundances of orthologous nematode and fly proteins correlate better than their corresponding mRNA abundances. We tested if this phenomenon is general by collecting and testing matching large-scale protein and mRNA expression datasets from seven different species: two bacteria, yeast, nematode, fly, human, and plant. We find that steady-state abundances of proteins show significantly higher correlation across these diverse phylogenetic taxa than the abundances of their corresponding mRNAs (p=0.0008, paired Wilcoxon). These data support the presence of strong selective pressure to maintain protein abundances during evolution, even when mRNA abundances diverge.
PMCID: PMC3113407  PMID: 21089048
10.  Ultrafast Ultraviolet Photodissociation at 193 nm and its Applicability to Proteomic Workflows 
Journal of proteome research  2010;9(8):4205-4214.
Ultraviolet photodissociation (UVPD) at 193 nm was implemented on a linear ion trap mass spectrometer for high-throughput proteomic workflows. Upon irradiation by a single 5 ns laser pulse, efficient photodissociation of tryptic peptides was achieved with production of a, b, c, x, y, and z sequence ions, in addition to immonium ions and v and w side-chain loss ions. The factors that influence the UVPD mass spectra and subsequent in silico database searching via SEQUEST were evaluated. Peptide sequence aromaticity and the precursor charge state were found to influence photodissociation efficiency more so than the number of amide chromophores, and the ion trap q-value and number of laser pulses significantly affected the number and abundances of diagnostic product ions (e.g., sequence and immonium ions). Also, photoionization background subtraction was shown to dramatically improve SEQUEST results, especially when peptide signals were low. A liquid chromatography – mass spectrometry (LC-MS) – UVPD strategy was implemented and yielded comparable or better results relative to LC-MS – collision induced dissociation (CID) for analysis of proteolyzed bovine serum albumin and lysed human HT-1080 cytosolic fibrosarcoma cells.
PMCID: PMC2917496  PMID: 20578723
11.  Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line 
We provide a large-scale dataset on absolute protein and matching mRNA concentrations from the human medulloblastoma cell line Daoy. The correlation between mRNA and protein concentrations is significant and positive (Rs=0.46, R2=0.29, P-value<2e-16), although non-linear. Out of ∼200 tested sequence features, sequence length, frequency and properties of amino acids, as well as translation initiation-related features are the strongest individual correlates of protein abundance when accounting for variation in mRNA concentration. When integrating mRNA expression data and all sequence features into a non-parametric regression model (Multivariate Adaptive Regression Splines), we were able to explain up to 67% of the variation in protein concentrations. Half of the contributions were attributed to mRNA concentrations, the other half to sequence features relating to regulation of translation and protein degradation. The sequence features are primarily linked to the coding and 3′ untranslated region. To our knowledge, this is the most comprehensive predictive model of human protein concentrations achieved so far.
mRNA decay, translation regulation and protein degradation are essential parts of eukaryotic gene expression regulation (Hieronymus and Silver, 2004; Mata et al, 2005), which enable the dynamics of cellular systems and their responses to external and internal stimuli without having to rely exclusively on transcription regulation. The importance of these processes is emphasized by the generally low correlation between mRNA and protein concentrations. For many prokaryotic and eukaryotic organisms, <50% of the variation in protein abundance is explained by variation in mRNA concentrations (de Sousa Abreu et al, 2009).
Given the plethora of regulatory mechanisms involved, most studies have focused so far on individual regulators and specific targets. Particularly in human, we currently lack system-wide, quantitative analyses that evaluate the relative contribution of regulatory elements encoded in the mRNA and protein sequence. Existing studies have been carried out only in bacteria and yeast (Nie et al, 2006; Brockmann et al, 2007; Tuller et al, 2007; Wu et al, 2008). Here, we present the first comprehensive analysis on the impact of translation and protein degradation on protein abundance variation in a human cell line. For this purpose, we experimentally measured absolute protein and mRNA concentrations in the Daoy medulloblastoma cell line, using shotgun proteomics and microarrays, respectively (Figure 1). These data comprise one of the largest such sets available today for human. We focused on sequence features that likely impact protein translation and protein degradation, including length, nucleotide composition, structure of the untranslated regions (UTRs), coding sequence, composition of the translation initiation site, presence of upstream open reading frames, putative target sites of miRNAs, codon usage, amino-acid composition and protein degradation signals.
Three types of tests have been conducted: (a) we examined partial Spearman's rank correlation of numerical features (e.g. length) with protein concentration, accounting for variation in mRNA concentrations; (b) for numerical and categorical features (e.g. function), we compared two extreme populations with Welch's t-test and (c) using a Multivariate Adaptive Regression Splines model, we analyzed the combined contributions of mRNA expression and sequence features to protein abundance variation (Figure 1). To account for the non-linearity of many relationships, we use non-parametric approaches throughout the analysis.
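Test (a) above, the partial Spearman's rank correlation of a numerical feature with protein concentration while accounting for mRNA concentration, can be sketched as follows. This is a minimal illustration using the standard first-order partial correlation formula applied to rank correlations; it is not the authors' code, and the function names and example values are hypothetical.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(feature, protein, mrna):
    """Rank correlation of `feature` with `protein`, controlling for `mrna`.

    Undefined (division by zero) if `mrna` is perfectly rank-correlated
    with either of the other two variables.
    """
    r_fp = spearman(feature, protein)
    r_fm = spearman(feature, mrna)
    r_pm = spearman(protein, mrna)
    return (r_fp - r_fm * r_pm) / math.sqrt((1 - r_fm ** 2) * (1 - r_pm ** 2))


# Hypothetical values for five genes (illustration only).
feature = [1, 2, 3, 4, 5]
protein = [1, 3, 2, 5, 4]
mrna = [2, 1, 4, 3, 5]
print(round(partial_spearman(feature, protein, mrna), 4))
```

Working on ranks rather than raw values keeps the measure insensitive to monotone non-linear relationships, which matches the stated preference for non-parametric approaches.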
We observed a significant positive correlation between mRNA and protein concentrations, larger than many previous measurements (de Sousa Abreu et al, 2009). We also show that the contribution of translation and protein degradation is at least as important as the contribution of mRNA transcription and stability to the abundance variation of the final protein products. Although variation in mRNA expression explains ∼25–30% of the variation in protein abundance, another 30–40% can be accounted for by characteristics of the sequences, which we identified in a comparative assessment of global correlates. Among these characteristics, sequence length, amino-acid frequencies and also nucleotide frequencies in the coding region are of strong influence (Figure 3A). Characteristics of the 3′UTR and of the 5′UTR, that is length, nucleotide composition and secondary structures, describe another part of the variation, leaving 33% of expression variation unexplained. The unexplained fraction may be accounted for by mechanisms not considered in this analysis (e.g. regulation by RNA-binding proteins or gene-specific structural motifs), as well as expression and measurement noise.
Our combined model including mRNA concentration and sequence features can explain 67% of the variation of protein abundance in this system—and thus has the highest predictive power for human protein abundance achieved so far (Figure 3B).
Transcription, mRNA decay, translation and protein degradation are essential processes during eukaryotic gene expression, but their relative global contributions to steady-state protein concentrations in multi-cellular eukaryotes are largely unknown. Using measurements of absolute protein and mRNA abundances in cellular lysate from the human Daoy medulloblastoma cell line, we quantitatively evaluate the impact of mRNA concentration and sequence features implicated in translation and protein degradation on protein expression. Sequence features related to translation and protein degradation have an impact similar to that of mRNA abundance, and their combined contribution explains two-thirds of protein abundance variation. mRNA sequence lengths, amino-acid properties, upstream open reading frames and secondary structures in the 5′ untranslated region (UTR) were the strongest individual correlates of protein concentrations. In a combined model, characteristics of the coding region and the 3′UTR explained a larger proportion of protein abundance variation than characteristics of the 5′UTR. The absolute protein and mRNA concentration measurements for >1000 human genes described here represent one of the largest datasets currently available, and reveal both general trends and specific examples of post-transcriptional regulation.
PMCID: PMC2947365  PMID: 20739923
gene expression regulation; protein degradation; protein stability; translation
12.  A map of human protein interactions derived from co-expression of human mRNAs and their orthologs 
The human protein interaction network will offer global insights into the molecular organization of cells and provide a framework for modeling human disease, but the network's large scale demands new approaches. We report a set of 7000 physical associations among human proteins inferred from indirect evidence: the comparison of human mRNA co-expression patterns with those of orthologous genes in five other eukaryotes, which we demonstrate identifies proteins in the same physical complexes. To evaluate the accuracy of the predicted physical associations, we apply quantitative mass spectrometry shotgun proteomics to measure elution profiles of 3013 human proteins during native biochemical fractionation, demonstrating systematically that putative interaction partners tend to co-sediment. We further validate uncharacterized proteins implicated by the associations in ribosome biogenesis, including WBSCR20C, associated with Williams–Beuren syndrome. This meta-analysis therefore exploits non-protein-based data, but successfully predicts associations, including 5589 novel human physical protein associations, with measured accuracies of 54±10%, comparable to direct large-scale interaction assays. The new associations' derivation from conserved in vivo phenomena argues strongly for their biological relevance.
PMCID: PMC2387231  PMID: 18414481
interactions; mass spectrometry; networks; proteomics; systems biology
13.  Discovery of a thermophilic protein complex stabilized by topologically interlinked chains 
Journal of molecular biology  2007;368(5):1332-1344.
A growing number of organisms have been discovered inhabiting extreme environments, including temperatures in excess of 100 °C. How cellular proteins from such organisms retain their native folds under extreme conditions is still not fully understood. Recent computational and structural studies have identified disulfide bonding as an important mechanism for stabilizing intracellular proteins in certain thermophilic microbes. Here, we present the first proteomic analysis of intracellular disulfide bonding in the hyperthermophilic archaeon Pyrobaculum aerophilum. Our study reveals that the utilization of disulfide bonds extends beyond individual proteins to include many protein-protein complexes. We report the 1.6Å crystal structure of one such complex, a citrate synthase homodimer. The structure contains two intramolecular disulfide bonds, one per subunit, which result in the cyclization of each protein chain in such a way that the two chains are topologically interlinked, rendering them inseparable. This unusual feature emphasizes the variety and sophistication of the molecular mechanisms that can be achieved by evolution.
PMCID: PMC1955483  PMID: 17395198
disulfide bond; protein stability; catenane; citrate synthase; thermophile
14.  The Genomics of Disulfide Bonding and Protein Stabilization in Thermophiles 
PLoS Biology  2005;3(9):e309.
Thermophilic organisms flourish in varied high-temperature environmental niches that are deadly to other organisms. Recently, genomic evidence has implicated a critical role for disulfide bonds in the structural stabilization of intracellular proteins from certain of these organisms, contrary to the conventional view that structural disulfide bonds are exclusively extracellular. Here both computational and structural data are presented to explore the occurrence of disulfide bonds as a protein-stabilization method across many thermophilic prokaryotes. Based on computational studies, disulfide-bond richness is found to be widespread, with thermophiles containing the highest levels. Interestingly, only a distinct subset of thermophiles exhibit this property. A computational search for proteins matching this target phylogenetic profile singles out a specific protein, known as protein disulfide oxidoreductase, as a potential key player in thermophilic intracellular disulfide-bond formation. Finally, biochemical support in the form of a new crystal structure of a thermophilic protein with three disulfide bonds is presented together with a survey of known structures from the literature. Together, the results provide insight into biochemical specialization and the diversity of methods employed by organisms to stabilize their proteins in exotic environments. The findings also motivate continued efforts to sequence genomes from divergent organisms.
Certain thermophiles are found to stabilize their proteins in extreme environments with additional disulfide bonds. A phylogenetic profile identifies a protein disulfide oxidoreductase critical to the stabilization process.
PMCID: PMC1188242  PMID: 16111437
