Related Articles

1.  Contribution of oncoproteomics to cancer biomarker discovery 
Molecular Cancer  2007;6:25.
Oncoproteomics is the study of proteins and their interactions in a cancer cell by proteomic technologies. Proteomic research first came to the fore with the introduction of two-dimensional gel electrophoresis. Since the turn of the century, proteomics has been increasingly applied to cancer research with the widespread introduction of mass spectrometry and protein chips. There is intense interest in applying proteomics to foster an improved understanding of cancer pathogenesis and to develop new tumor biomarkers for diagnosis and early detection using proteomic portraits of samples. Oncoproteomics has the potential to revolutionize clinical practice, including cancer diagnosis and screening based on proteomic platforms as a complement to histopathology, individualized selection of therapeutic combinations that target the entire cancer-specific protein network, real-time assessment of therapeutic efficacy and toxicity, and rational modulation of therapy based on changes in the cancer protein network associated with prognosis and drug resistance. Oncoproteomics is also applied to the discovery of new therapeutic targets and to the study of drug effects. Following the successful completion of the Human Genome Project, the wave of proteomics has raised the curtain on the postgenome era. The study of oncoproteomics provides a better understanding of neoplasia. In this article, the discovery of cancer biomarkers in recent years is reviewed. The challenges ahead and the perspectives of oncoproteomics for biomarker development are also addressed. With a wealth of information that can be applied to a broad spectrum of biomarker research projects, this review serves as a reference for biomarker researchers, scientists working in proteomics and bioinformatics, oncologists, pharmaceutical scientists, biochemists, biologists, and chemists.
doi:10.1186/1476-4598-6-25
PMCID: PMC1852117  PMID: 17407558
2.  Bioinformatics in microbial biotechnology – a mini review 
The revolutionary growth in computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes, including a cleaner draft of the human genome, have been sequenced, raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, the development of new enhanced bacterial strains for bioremediation and pollution control, the development of better, easy-to-administer vaccines, the development of protein biomarkers for various bacterial diseases, and a better understanding of host-bacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i) genomics – sequencing and comparative study of genomes to identify gene and genome functionality, (ii) proteomics – identification and characterization of protein related properties and reconstruction of metabolic and regulatory pathways, (iii) cell visualization and simulation to study and model cell behavior, and (iv) application to the development of drugs and anti-microbial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1) analysis based upon the available experimental wet-lab data, (2) the use of mathematical modeling to derive new information, and (3) an integrated approach that combines search techniques with mathematical modeling.
The major impact of bioinformatics research has been the automation of genome sequencing, the automated development of integrated genomics and proteomics databases, automated genome comparisons to identify genome function, automated derivation of metabolic pathways, gene expression analysis to derive regulatory pathways, the development of statistical, clustering, and data mining techniques to derive protein-protein and protein-DNA interactions, modeling of the 3D structure of proteins and 3D docking between proteins and biochemicals for rational drug design, difference analysis between pathogenic and non-pathogenic strains to identify candidate genes for vaccines and anti-microbial agents, and whole-genome comparison to understand microbial evolution. The development of bioinformatics techniques has enhanced the pace of biological discovery through automated analysis of large numbers of microbial genomes. We are on the verge of using all this knowledge to understand cellular mechanisms at the systemic level. The developed bioinformatics techniques have the potential to facilitate (i) the discovery of causes of diseases, (ii) vaccine and rational drug design, and (iii) improved cost-effective agents for bioremediation by pruning out the dead ends. Despite the fast-paced global effort, the current analysis is limited by the lack of available gene-functionality from the wet-lab data, the lack of computer algorithms to explore vast amounts of data with unknown functionality, limited availability of protein-protein and protein-DNA interactions, and the lack of knowledge of temporal and transient behavior of genes and pathways.
doi:10.1186/1475-2859-4-19
PMCID: PMC1182391  PMID: 15985162
3.  A Mouse to Human Search for Plasma Proteome Changes Associated with Pancreatic Tumor Development 
PLoS Medicine  2008;5(6):e123.
Background
The complexity and heterogeneity of the human plasma proteome have presented significant challenges in the identification of protein changes associated with tumor development. Refined genetically engineered mouse (GEM) models of human cancer have been shown to faithfully recapitulate the molecular, biological, and clinical features of human disease. Here, we sought to exploit the merits of a well-characterized GEM model of pancreatic cancer to determine whether proteomics technologies allow identification of protein changes associated with tumor development and whether such changes are relevant to human pancreatic cancer.
Methods and Findings
Plasma was sampled from mice at early and advanced stages of tumor development and from matched controls. Using a proteomic approach based on extensive protein fractionation, we confidently identified 1,442 proteins that were distributed across seven orders of magnitude of abundance in plasma. Analysis of proteins chosen on the basis of increased levels in plasma from tumor-bearing mice and corroborating protein or RNA expression in tissue documented concordance in the blood from 30 newly diagnosed patients with pancreatic cancer relative to 30 control specimens. A panel of five proteins selected on the basis of their increased level at an early stage of tumor development in the mouse was tested in a blinded study in 26 humans from the CARET (Carotene and Retinol Efficacy Trial) cohort. The panel discriminated pancreatic cancer cases from matched controls in blood specimens obtained between 7 and 13 mo prior to the development of symptoms and clinical diagnosis of pancreatic cancer.
Conclusions
Our findings indicate that GEM models of cancer, in combination with in-depth proteomic analysis, provide a useful strategy to identify candidate markers applicable to human cancer with potential utility for early detection.
Samir Hanash and colleagues identify proteins that are increased at an early stage of pancreatic tumor development in a mouse model and may be a useful tool in detecting early tumors in humans.
Editors' Summary
Background.
Cancers are life-threatening, disorganized masses of cells that can occur anywhere in the human body. They develop when cells acquire genetic changes that allow them to grow uncontrollably and to spread around the body (metastasize). If a cancer is detected when it is still small and has not metastasized, surgery can often provide a cure. Unfortunately, many cancers are detected only when they are large enough to press against surrounding tissues and cause pain or other symptoms. By this time, surgical removal of the original (primary) tumor may be impossible and there may be secondary cancers scattered around the body. In such cases, radiotherapy and chemotherapy can sometimes help, but the outlook for patients whose cancers are detected late is often poor. One cancer type for which late detection is a particular problem is pancreatic adenocarcinoma. This cancer rarely causes any symptoms in its early stages. Furthermore, the symptoms it eventually causes—jaundice, abdominal and back pain, and weight loss—are seen in many other illnesses. Consequently, pancreatic cancer has usually spread before it is diagnosed, and most patients die within a year of their diagnosis.
Why Was This Study Done?
If a test could be developed to detect pancreatic cancer in its early stages, the lives of many patients might be extended. Tumors often release specific proteins—“cancer biomarkers”—into the blood, a bodily fluid that can be easily sampled. If a protein released into the blood by pancreatic cancer cells could be identified, it might be possible to develop a noninvasive screening test for this deadly cancer. In this study, the researchers use a “proteomic” approach to identify potential biomarkers for early pancreatic cancer. Proteomics is the study of the patterns of proteins made by an organism, tissue, or cell and of the changes in these patterns that are associated with various diseases.
What Did the Researchers Do and Find?
The researchers started their search for pancreatic cancer biomarkers by studying the plasma proteome (the proteins in the fluid portion of blood) of mice genetically engineered to develop cancers that closely resemble human pancreatic tumors. Through the use of two techniques called high-resolution mass spectrometry and acrylamide isotopic labeling, the researchers identified 165 proteins that were present in larger amounts in plasma collected from mice with early and/or advanced pancreatic cancer than in plasma from control mice. Then, to test whether any of these protein changes were relevant to human pancreatic cancer, the researchers analyzed blood samples collected from patients with pancreatic cancer. These samples, they report, contained larger amounts of some of these proteins than blood collected from patients with chronic pancreatitis, a condition that has similar symptoms to pancreatic cancer. Finally, using blood samples collected during a clinical trial, the Carotene and Retinol Efficacy Trial (a cancer-prevention study), the researchers showed that the measurement of five of the proteins present in increased amounts at an early stage of tumor development in the mouse model discriminated between people with pancreatic cancer and matched controls up to 13 months before cancer diagnosis.
What Do These Findings Mean?
These findings suggest that in-depth proteomic analysis of genetically engineered mouse models of human cancer might be an effective way to identify biomarkers suitable for the early detection of human cancers. Previous attempts to identify such biomarkers using human samples have been hampered by the many noncancer-related differences in plasma proteins that exist between individuals and by problems in obtaining samples from patients with early cancer. The use of a mouse model of human cancer, these findings indicate, can circumvent both of these problems. More specifically, these findings identify a panel of proteins that might allow earlier detection of pancreatic cancer and that might, therefore, extend the life of some patients who develop this cancer. However, before a routine screening test becomes available, additional markers will need to be identified and extensive validation studies in larger groups of patients will have to be completed.
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0050123.
The MedlinePlus Encyclopedia has a page on pancreatic cancer (in English and Spanish). Links to further information are provided by MedlinePlus
The US National Cancer Institute has information about pancreatic cancer for patients and health professionals (in English and Spanish)
The UK charity Cancerbackup also provides information for patients about pancreatic cancer
The Clinical Proteomic Technologies for Cancer Initiative (a US National Cancer Institute initiative) provides a tutorial about proteomics and cancer and information on the Mouse Proteomic Technologies Initiative
doi:10.1371/journal.pmed.0050123
PMCID: PMC2504036  PMID: 18547137
4.  Clinical Proteomics: Present and Future Prospects 
Clinical Biochemist Reviews  2006;27(2):99-116.
Advances in proteomics technology offer great promise in the understanding and treatment of the molecular basis of disease. The past decade of proteomics research, the study of dynamic protein expression, post-translational modifications, cellular and sub-cellular protein distribution, and protein-protein interactions, has culminated in the identification of many disease-related biomarkers and potential new drug targets. While proteomics remains the tool of choice for discovery research, new innovations in proteomic technology now offer the potential for proteomic profiling to become standard practice in the clinical laboratory. Indeed, protein profiles can serve as powerful diagnostic markers, and can predict treatment outcome in many diseases, in particular cancer. A number of technical obstacles remain before routine proteomic analysis can be achieved in the clinic; however, the standardisation of methodologies and dissemination of proteomic data into publicly available databases are starting to overcome these hurdles. At present the most promising application for proteomics is in the screening of specific subsets of protein biomarkers for certain diseases, rather than large scale full protein profiling. Armed with these technologies, the era of individualised patient-tailored therapy is imminent. This review summarises the advances in proteomics that have propelled us to this exciting age of clinical proteomics, and highlights the future work that is required for this to become a reality.
PMCID: PMC1579414  PMID: 17077880
5.  SP3 In-depth Bioinformatics Analysis of Proteomics Data: Problems and Solutions 
The principal goal in proteomics is to extract biologically or clinically meaningful information from large-scale studies in order to provide new insights into fundamental biological processes or find new means to diagnose or treat disease. Many labs now have methods and machinery in place that make possible the robust generation of many thousands of protein identifications each day. This, however, has exposed a new major bottleneck in the proteomics workflow—the problem of analyzing the wealth of protein identifications to find the relatively few proteins that actually render support for biological hypotheses or that have potential medical relevance.
This presentation shows the application of a new bioinformatics tool that almost entirely removes this bottleneck. The new technology was specifically developed to help researchers gain a fast overview of biologically relevant features in vast protein datasets and rapidly zoom in on single proteins or subsets of proteins of particular interest. In a matter of minutes, the output from the MS database search software was turned into information that was biologically more meaningful. This was done by means of sequential steps that: (1) collapsed all protein redundancies into non-redundant lists; (2) filtered/sorted these lists based on experimental observations and biological sequence annotation that was automatically added; and (3) compared/combined lists of annotated proteins to elucidate differences and overlaps between multiple experimental datasets.
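The three sequential steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool the presentation describes: the `Hit` record, its fields, the `min_peptides` threshold, and the accession codes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    accession: str   # protein accession code reported by the MS search software
    peptides: int    # hypothetical support measure: number of matched peptides
    annotation: str  # hypothetical annotation label, e.g. a cellular location


def collapse(hits):
    """Step 1: collapse redundant hits, keeping the best-supported one per accession."""
    best = {}
    for hit in hits:
        current = best.get(hit.accession)
        if current is None or hit.peptides > current.peptides:
            best[hit.accession] = hit
    return best


def filter_and_sort(proteins, min_peptides=2):
    """Step 2: drop weakly supported proteins, then sort by support (descending)."""
    kept = [h for h in proteins.values() if h.peptides >= min_peptides]
    return sorted(kept, key=lambda h: (-h.peptides, h.accession))


def compare(list_a, list_b):
    """Step 3: report accessions shared by, and unique to, two datasets."""
    a = {h.accession for h in list_a}
    b = {h.accession for h in list_b}
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Made-up example data for two experiments.
run_a = [Hit("P1", 3, "membrane"), Hit("P1", 5, "membrane"),
         Hit("P2", 1, "nucleus"), Hit("P3", 2, "cytosol")]
run_b = [Hit("P3", 4, "cytosol"), Hit("P4", 2, "membrane")]

list_a = filter_and_sort(collapse(run_a))
list_b = filter_and_sort(collapse(run_b))
print(compare(list_a, list_b))
```

Keeping each step as a separate function mirrors the sequential description: the output of one step is the input of the next, so a step can be swapped without touching the others.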
The presentation will show through examples how the new technology can be used in proteomics experiments in order to accelerate the otherwise tedious process of making biological sense of lists of protein accession codes. We present the efficient data mining and categorizing of several large datasets of proteins, including data uploaded from the PRIDE database and HUPO projects.
PMCID: PMC2291907
6.  Methods for visual mining of genomic and proteomic data atlases 
BMC Bioinformatics  2012;13:58.
Background
As the volume, complexity and diversity of the information that scientists work with on a daily basis continue to rise, so too does the requirement for new analytic software. The analytic software must resolve the tension that exists between the need to allow for a high level of scientific reasoning and the requirement to have an intuitive and easy-to-use tool that does not require specialist, and often arduous, training to use. Information visualization provides a solution to this problem, as it allows for direct manipulation and interaction with diverse and complex data. The challenge facing bioinformatics researchers is how to apply this knowledge to data sets that are continually growing in a field that is rapidly changing.
Results
This paper discusses an approach to the development of visual mining tools capable of supporting the mining of massive data collections used in systems biology research, and also discusses lessons that have been learned providing tools for both local researchers and the wider community. Example tools were developed which are designed to enable the exploration and analyses of both proteomics and genomics based atlases. These atlases represent large repositories of raw and processed experiment data generated to support the identification of biomarkers through mass spectrometry (the PeptideAtlas) and the genomic characterization of cancer (The Cancer Genome Atlas). Specifically the tools are designed to allow for: the visual mining of thousands of mass spectrometry experiments, to assist in designing informed targeted protein assays; and the interactive analysis of hundreds of genomes, to explore the variations across different cancer genomes and cancer types.
Conclusions
The mining of massive repositories of biological data requires the development of new tools and techniques. Visual exploration of the large-scale atlas data sets allows researchers to mine data to find new meaning and make sense at scales from single samples to entire populations. Providing linked task-specific views that allow a user to start from points of interest (from diseases to single genes) enables targeted exploration of thousands of spectra and genomes. As the composition of the atlases changes, and our understanding of the biology increases, new tasks will continually arise. It is therefore important to provide the means to make the data available in a suitable manner in as short a time as possible. We have done this through the use of common visualization workflows, into which we rapidly deploy visual tools. These visualizations follow common metaphors where possible to assist users in understanding the displayed data. Rapid development of tools and task-specific views allows researchers to mine large-scale data almost as quickly as it is produced. Ultimately these visual tools enable new inferences, new analyses and further refinement of the large-scale data being provided in atlases such as PeptideAtlas and The Cancer Genome Atlas.
doi:10.1186/1471-2105-13-58
PMCID: PMC3352268  PMID: 22524279
7.  System-wide Perturbation Analysis with Nearly Complete Coverage of the Yeast Proteome by Single-shot Ultra HPLC Runs on a Bench Top Orbitrap* 
Molecular & Cellular Proteomics : MCP  2011;11(3):M111.013722.
Yeast remains an important model for systems biology and for evaluating proteomics strategies. In-depth shotgun proteomics studies have reached nearly comprehensive coverage, and rapid, targeted approaches have been developed for this organism. Recently, we demonstrated that single LC-MS/MS analysis using long columns and gradients coupled to a linear ion trap Orbitrap instrument had an unexpectedly large dynamic range of protein identification (Thakur, S. S., Geiger, T., Chatterjee, B., Bandilla, P., Frohlich, F., Cox, J., and Mann, M. (2011) Deep and highly sensitive proteome coverage by LC-MS/MS without prefractionation. Mol. Cell Proteomics 10, 10.1074/mcp.M110.003699). Here we couple an ultra high pressure liquid chromatography system to a novel bench top Orbitrap mass spectrometer (Q Exactive) with the goal of nearly complete, rapid, and robust analysis of the yeast proteome. Single runs of filter-aided sample preparation (FASP)-prepared and LysC-digested yeast cell lysates identified an average of 3923 proteins. Combined analysis of six single runs improved these values to more than 4000 identified proteins/run, close to the total number of proteins expressed under standard conditions, with median sequence coverage of 23%. Because of the absence of fractionation steps, only minuscule amounts of sample are required. Thus the yeast model proteome can now largely be covered within a few hours of measurement time and at high sensitivity. Median coverage of proteins in Kyoto Encyclopedia of Genes and Genomes pathways with at least 10 members was 88%, and pathways not covered were not expected to be active under the conditions used. To study perturbations of the yeast proteome, we developed an external, heavy lysine-labeled SILAC yeast standard representing different proteome states. This spike-in standard was employed to measure the heat shock response of the yeast proteome. 
Bioinformatic analysis of the heat shock response revealed that translation-related functions were down-regulated prominently, including nucleolar processes. Conversely, stress-related pathways were up-regulated. The proteomic technology described here is straightforward, rapid, and robust, potentially enabling widespread use in the yeast and other biological research communities.
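A spike-in standard is typically read out as a ratio of ratios: each sample is measured against the same heavy standard, so the standard cancels when two samples are compared. The sketch below illustrates that arithmetic only. It is an assumption about how such data are commonly processed, not the authors' pipeline, and the protein names and ratio values are invented.

```python
import math


def log2_fold_change(light_heavy_treated, light_heavy_control):
    """Return log2(treated / control) from two light/heavy ratios.

    Both ratios are measured against the same heavy-labeled standard,
    so the standard's abundance cancels in the quotient.
    """
    if light_heavy_treated <= 0 or light_heavy_control <= 0:
        raise ValueError("light/heavy ratios must be positive")
    return math.log2(light_heavy_treated / light_heavy_control)


# Invented light/heavy ratios for two hypothetical proteins:
# (heat-shocked sample, untreated sample).
ratios = {
    "hypothetical_chaperone": (2.0, 0.5),
    "hypothetical_ribosomal_protein": (0.25, 1.0),
}

for protein, (treated, control) in ratios.items():
    print(protein, log2_fold_change(treated, control))
```

Working in log2 space makes up- and down-regulation symmetric around zero, which is convenient when comparing groups such as stress-related and translation-related proteins.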
doi:10.1074/mcp.M111.013722
PMCID: PMC3316726  PMID: 22021278
8.  Proteome Dataset of Human Gingival Crevicular Fluid from Healthy Periodontium Sites by Multi-Dimensional Protein Separation and Mass Spectrometry 
Journal of Periodontal Research  2011;47(2):248-262.
Background and Objective
Gingival crevicular fluid (GCF) has been of major interest for many decades as a valuable body fluid that may serve as a source of biomarkers for both periodontal and systemic diseases. Because of its very small sample size (sub-μl level), identification of its protein composition by classical biochemical methods has been limited. The advent of highly sensitive mass spectrometric technology has permitted large-scale identification of protein components of many biological samples. This technology has been employed to identify the protein composition of GCF from inflamed and periodontal sites. In this report we present a proteome dataset of GCF from healthy periodontium sites.
Methods
A combination of periopaper collection method with application of multidimensional protein separation and mass spectrometric (MS) technology led to a large-scale documentation of the proteome of GCF from healthy periodontium sites.
Results
The approaches utilized have culminated in the identification of 199 proteins in GCF of periodontally healthy sites. The current GCF proteome from healthy sites was compared and contrasted with the proteomes of GCF from inflamed and periodontal sites as well as serum. The cross-correlation of the GCF and plasma proteomes permitted dissociation of the 199 identified GCF proteins into 105 proteins (57%) that can be identified in plasma and 94 proteins (43%) that are distinct and unique to the GCF microenvironment. Such analysis also revealed distinctions in protein functional categories between serum proteins and those specific to the GCF microenvironment.
Conclusion
Firstly, the data presented herein provide the proteome of GCF from periodontally healthy sites through the establishment of innovative analytical approaches for effective analysis of GCF from periopapers, both at the level of complete elution and at the level of removal of abundant albumin, which restricts identification of low-abundance proteins. Secondly, it adds significantly to the knowledge of GCF composition and highlights new groups of proteins specific to the GCF microenvironment.
doi:10.1111/j.1600-0765.2011.01429.x
PMCID: PMC3272151  PMID: 22029670
Gingival crevicular fluid; mass spectrometry; proteomics; saliva; oral; biomarkers; diagnostics
9.  A Novel Cross-Disciplinary Multi-Institute Approach to Translational Cancer Research: Lessons Learned from Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC) 
Cancer Informatics  2007;3:255-274.
Background:
The Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC, http://www.pcabc.upmc.edu) is one of the first major project-based initiatives stemming from the Pennsylvania Cancer Alliance that was funded for four years by the Department of Health of the Commonwealth of Pennsylvania. The objective of this initiative was to establish a prototype biorepository and bioinformatics infrastructure with a robust data warehouse by developing (1) a statewide data model for bioinformatics and a repository of serum and tissue samples; (2) a data model for biomarker data storage; and (3) a public access website for disseminating research results and bioinformatics tools. The members of the Consortium cooperate closely, exploring the opportunity for sharing clinical, genomic and other bioinformatics data on patient samples in oncology, for the purpose of developing collaborative research programs across cancer research institutions in Pennsylvania. The Consortium’s intention was to establish a virtual repository of many clinical specimens residing in various centers across the state, in order to make them available for research. One of our primary goals was to facilitate the identification of cancer-specific biomarkers and encourage collaborative research efforts among the participating centers.
Methods:
The PCABC has developed unique partnerships so that every region of the state can effectively contribute and participate. It includes over 80 individuals from 14 organizations, and plans to expand to partners outside the State. This has created a network of researchers, clinicians, bioinformaticians, cancer registrars, program directors, and executives from academic and community health systems, as well as external corporate partners - all working together to accomplish a common mission.
The various sub-committees have developed a common IRB protocol template, common data elements for standardizing data collections for three organ sites, intellectual property/tech transfer agreements, and material transfer agreements that have been approved by each of the member institutions. This was the foundational work that has led to the development of a centralized data warehouse that has met each of the institutions’ IRB/HIPAA standards.
Results:
Currently, this “virtual biorepository” has over 58,000 annotated samples from 11,467 cancer patients available for research purposes. The clinical annotation of tissue samples is done either manually over the internet or in semi-automated batch mode by mapping local data elements to PCABC common data elements. The database currently holds information on 7188 cases (associated with 9278 specimens and 46,666 annotated blocks and blood samples) of prostate cancer, 2736 cases (associated with 3796 specimens and 9336 annotated blocks and blood samples) of breast cancer and 1543 cases (including 1334 specimens and 2671 annotated blocks and blood samples) of melanoma. These numbers continue to grow, and plans to integrate new tumor sites are in progress. Furthermore, the group has also developed a central web-based tool that allows investigators to share their translational (genomics/proteomics) experiment data on research evaluating potential biomarkers via a central location on the Consortium’s web site.
Conclusions:
The technological achievements and the statewide informatics infrastructure that have been established by the Consortium will enable robust and efficient studies of biomarkers and their relevance to the clinical course of cancer. Studies resulting from the creation of the Consortium may allow for better classification of cancer types, more accurate assessment of disease prognosis, a better ability to identify the most appropriate individuals for clinical trial participation, and better surrogate markers of disease progression and/or response to therapy.
PMCID: PMC2675833  PMID: 19455246
10.  Challenges and Solutions in Proteomics 
Current Genomics  2007;8(1):21-28.
The accelerated growth of proteomics data presents both opportunities and challenges. Large-scale proteomic profiling of biological samples such as cells, organelles or biological fluids has led to the discovery of numerous key and novel proteins involved in many biological/disease processes including cancers, as well as to the identification of novel disease biomarkers and potential therapeutic targets. While proteomic data analysis has been greatly assisted by the many bioinformatics tools developed in recent years, a careful analysis of the major steps and flow of data in a typical high-throughput analysis reveals a few gaps that still need to be filled to fully realize the value of the data. To facilitate functional and pathway discovery for large-scale proteomic data, we have developed an integrated proteomic expression analysis system, iProXpress, which facilitates protein identification using a comprehensive sequence library and functional interpretation using integrated data. With its modular design, iProXpress complements and can be integrated with other software in a proteomic data analysis pipeline. This novel approach to complex biological questions involves the interrogation of multiple data sources, thereby facilitating hypothesis generation and knowledge discovery from genomic-scale studies and fostering disease diagnosis and drug development.
PMCID: PMC2474689  PMID: 18645629
Proteomic profiling; high-throughput analysis; biomarkers; bioinformatic tools; iProXpress; sequence library; pathway discovery; stage specific proteins
11.  Cancer Screening: A Mathematical Model Relating Secreted Blood Biomarker Levels to Tumor Sizes  
PLoS Medicine  2008;5(8):e170.
Background
Increasing efforts and financial resources are being invested in early cancer detection research. Blood assays detecting tumor biomarkers promise noninvasive and financially reasonable screening for early cancer with high potential of positive impact on patients' survival and quality of life. For novel tumor biomarkers, the actual tumor detection limits are usually unknown and there have been no studies exploring the tumor burden detection limits of blood tumor biomarkers using mathematical models. Therefore, the purpose of this study was to develop a mathematical model relating blood biomarker levels to tumor burden.
Methods and Findings
Using a linear one-compartment model, the steady state between tumor biomarker secretion into and removal out of the intravascular space was calculated. Two conditions were assumed: (1) the compartment (plasma) is well-mixed and kinetically homogenous; (2) the tumor biomarker consists of a protein that is secreted by tumor cells into the extracellular fluid compartment, and a certain percentage of the secreted protein enters the intravascular space at a continuous rate. The model was applied to two pathophysiologic conditions: tumor biomarker is secreted (1) exclusively by the tumor cells or (2) by both tumor cells and healthy normal cells. To test the model, a sensitivity analysis was performed assuming variable conditions of the model parameters. The model parameters were primed on the basis of literature data for two established and well-studied tumor biomarkers (CA125 and prostate-specific antigen [PSA]). Assuming biomarker secretion by tumor cells only and 10% of the secreted tumor biomarker reaching the plasma, the calculated minimally detectable tumor sizes ranged between 0.11 mm³ and 3,610.14 mm³ for CA125 and between 0.21 mm³ and 131.51 mm³ for PSA. When biomarker secretion by healthy cells and tumor cells was assumed, the calculated tumor sizes leading to positive test results ranged between 116.7 mm³ and 1.52 × 10⁶ mm³ for CA125 and between 27 mm³ and 3.45 × 10⁵ mm³ for PSA. One of the limitations of the study is the absence of quantitative data available in the literature on the secreted tumor biomarker amount per cancer cell in intact whole body animal tumor models or in cancer patients. Additionally, the fraction of secreted tumor biomarkers actually reaching the plasma is unknown. Therefore, we used data from published cell culture experiments to estimate tumor cell biomarker secretion rates and assumed a wide range of secretion rates to account for their potential changes due to field effects of the tumor environment.
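The steady-state idea can be made concrete with a short Python sketch. It assumes the generic form of a linear one-compartment model, in which the plasma amount changes as influx minus first-order elimination, so the steady-state concentration is influx / (k_el × plasma volume). The function names, the half-life parameterization, and every numeric value below are illustrative assumptions, not the parameters used in the study.

```python
import math


def steady_state_concentration(n_cells, secretion_per_cell, fraction_to_plasma,
                               half_life, plasma_volume, baseline_influx=0.0):
    """Plasma biomarker concentration at steady state (influx equals elimination).

    baseline_influx is an assumed term for secretion by healthy cells;
    set it to 0 for the tumor-only scenario.
    """
    k_el = math.log(2) / half_life  # first-order elimination rate constant
    influx = fraction_to_plasma * secretion_per_cell * n_cells + baseline_influx
    return influx / (k_el * plasma_volume)


def minimal_detectable_cells(cutoff, secretion_per_cell, fraction_to_plasma,
                             half_life, plasma_volume, baseline_influx=0.0):
    """Smallest tumor cell count whose steady-state level reaches the assay cutoff."""
    k_el = math.log(2) / half_life
    required_influx = cutoff * k_el * plasma_volume - baseline_influx
    if required_influx <= 0:
        return 0.0  # the baseline alone already reaches the cutoff
    return required_influx / (fraction_to_plasma * secretion_per_cell)


def cells_to_volume_mm3(n_cells, cells_per_mm3=1e6):
    """Convert a cell count to tumor volume; the cell density is an assumption."""
    return n_cells / cells_per_mm3


# Placeholder parameters in arbitrary but mutually consistent units.
params = dict(secretion_per_cell=1e-4, fraction_to_plasma=0.10,
              half_life=2.0, plasma_volume=3000.0)
n_min = minimal_detectable_cells(cutoff=35.0, **params)
print(n_min, cells_to_volume_mm3(n_min))
```

Because the model is linear, the minimal detectable size scales inversely with both the secretion rate and the fraction reaching plasma, which helps explain why a sensitivity analysis over those uncertain parameters produces ranges spanning several orders of magnitude.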
Conclusions
This study introduced a linear one-compartment mathematical model that allows estimation of minimal detectable tumor sizes based on blood tumor biomarker assays. Assuming physiological data on CA125 and PSA from the literature, the model predicted detection limits of tumors that were in qualitative agreement with the actual clinical performance of both biomarkers. The model may be helpful in future estimation of minimal detectable tumor sizes for novel proteomic biomarker assays if sufficient physiologic data for the biomarker are available. The model may address the potential and limitations of tumor biomarkers, help prioritize biomarkers, and guide investments into early cancer detection research efforts.
Sanjiv Gambhir and colleagues describe a linear one-compartment mathematical model that allows estimation of minimal detectable tumor sizes based on blood tumor biomarker assays.
Editors' Summary
Background.
Cancers—disorganized masses of cells that can occur in any tissue—develop when cells acquire genetic changes that allow them to grow uncontrollably and to spread around the body (metastasize). If a cancer (tumor) is detected when it is small, surgery can often provide a cure. Unfortunately, many cancers (particularly those deep inside the body) are not detected until they are large enough to cause pain or other symptoms by pressing against surrounding tissue. By this time, it may be impossible to remove the original tumor surgically and there may be metastases scattered around the body. In such cases, radiotherapy and chemotherapy can sometimes help, but the outlook for patients whose cancers are detected late is often poor. Consequently, researchers are trying to develop early detection tests for different types of cancer. Many tumors release specific proteins—“cancer biomarkers”—into the blood and the hope is that it might be possible to find sets of blood biomarkers that detect cancers when they are still small and thus save many lives.
Why Was This Study Done?
For most biomarkers, it is not known how the amount of protein detected in the blood relates to tumor size or how sensitive the assays for biomarkers must be to improve patient survival. In this study, the researchers develop a “linear one-compartment” mathematical model to predict how large tumors need to be before blood biomarkers can be used to detect them and test this model using published data on two established cancer biomarkers—CA125 and prostate-specific antigen (PSA). CA125 is used to monitor the progress of patients with ovarian cancer after treatment; ovarian cancer is rarely diagnosed in its early stages and only one-fourth of women with advanced disease survive for 5 y after diagnosis. PSA is used to screen for prostate cancer and has increased the detection of this cancer in its early stages when it is curable.
What Did the Researchers Do and Find?
To develop a model that relates secreted blood biomarker levels to tumor sizes, the researchers assumed that biomarkers mix evenly throughout the patient's blood, that cancer cells secrete biomarkers into the fluid that surrounds them, that 0.1%–20% of these secreted proteins enter the blood at a continuous rate, and that biomarkers are continuously removed from the blood. The researchers then used their model to calculate the smallest tumor sizes that might be detectable with these biomarkers by feeding in existing data on CA125 and on PSA, including assay detection limits and the biomarker secretion rates of cancer cells growing in dishes. When only tumor cells secreted the biomarker and 10% of the secreted biomarker reached the blood, the model predicted that ovarian tumors between 0.11 mm³ (smaller than a grain of salt) and nearly 4,000 mm³ (about the size of a cherry) would be detectable by measuring CA125 blood levels (the range was determined by varying the amount of biomarker secreted by the tumor cells and the assay sensitivity); for prostate cancer, the detectable tumor sizes ranged from a similar lower size to about 130 mm³ (pea-sized). However, healthy cells often also secrete small quantities of cancer biomarkers. With this condition incorporated into the model, the estimated detectable tumor sizes (or total tumor burden including metastases) ranged between grape-sized and melon-sized for ovarian cancers and between pea-sized and about grapefruit-sized for prostate cancers.
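The calculation summarized above can be sketched numerically. The following is a minimal illustration of a generic linear one-compartment model, not the researchers' code. The function names, the way the healthy-cell background is handled (a baseline level that the tumor must push past a cutoff), the cell density of 10^6 cells per mm³, and all numbers in the demo are hypothetical assumptions chosen only to show the mechanics.

```python
import math


def elimination_constant(half_life_h):
    """First-order elimination constant (1/h) from a plasma half-life in hours."""
    return math.log(2) / half_life_h


def steady_state_rise(n_cells, secretion_per_cell_h, fraction_to_plasma,
                      half_life_h, plasma_volume_ml):
    """Steady-state plasma concentration contributed by the tumor (units/mL).

    At steady state, influx into plasma equals first-order removal:
    f * R * N = k_el * V_p * C.
    """
    influx = n_cells * secretion_per_cell_h * fraction_to_plasma  # units/h
    return influx / (elimination_constant(half_life_h) * plasma_volume_ml)


def min_detectable_cells(cutoff, baseline, secretion_per_cell_h,
                         fraction_to_plasma, half_life_h, plasma_volume_ml):
    """Smallest tumor cell number that lifts the plasma level from `baseline`
    (healthy-cell contribution; 0 if only tumor cells secrete) to `cutoff`."""
    required_rise = cutoff - baseline
    if required_rise <= 0:
        raise ValueError("cutoff must lie above the healthy baseline")
    k_el = elimination_constant(half_life_h)
    return required_rise * k_el * plasma_volume_ml / (
        secretion_per_cell_h * fraction_to_plasma)


def cells_to_volume_mm3(n_cells, cells_per_mm3=1e6):
    """Convert a cell count to tumor volume; the density is an assumed value."""
    return n_cells / cells_per_mm3


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only (not taken from the study).
    params = dict(secretion_per_cell_h=1e-6, fraction_to_plasma=0.10,
                  half_life_h=48.0, plasma_volume_ml=3000.0)
    tumor_only = min_detectable_cells(cutoff=1.0, baseline=0.0, **params)
    with_healthy = min_detectable_cells(cutoff=35.0, baseline=10.0, **params)
    print("tumor-only secretion: ", cells_to_volume_mm3(tumor_only), "mm^3")
    print("with healthy baseline:", cells_to_volume_mm3(with_healthy), "mm^3")
```

In this sketch the healthy-cell case needs a larger tumor only because the usable cutoff has to sit above the healthy baseline rather than at the assay detection limit. That is one plausible way to read the summary's second condition.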
What Do These Findings Mean?
The accuracy of the calculated tumor sizes provided by the researchers' mathematical model is limited by the lack of data on how tumors behave in the human body and by the many assumptions incorporated into the model. Nevertheless, the model predicts detection limits for ovarian and prostate cancer that broadly mirror the clinical performance of both biomarkers. Somewhat worryingly, the model also indicates that a tumor may have to be very large for blood biomarkers to reveal its presence, a result that could limit the clinical usefulness of biomarkers, especially if they are secreted not only by tumor cells but also by healthy cells. Given this finding, as more information about how biomarkers behave in the human body becomes available, this model (and more complex versions of it) should help researchers decide which biomarkers are likely to improve early cancer detection and patient outcomes.
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0050170.
The US National Cancer Institute provides a brief description of what cancer is and how it develops and a fact sheet on tumor markers; it also provides information on all aspects of ovarian and prostate cancer for patients and professionals, including information on screening and testing (in English and Spanish)
The UK charity Cancerbackup also provides general information about cancer and more specific information about ovarian and prostate cancer, including the use of CA125 and PSA for screening and follow-up
The American Society of Clinical Oncology offers a wide range of information on various cancer types, including online published articles on the current status of cancer diagnosis and management from the educational book developed by the annual meeting faculty and presenters. Registration is mandatory, but information is free
doi:10.1371/journal.pmed.0050170
PMCID: PMC2517618  PMID: 18715113
12.  Corra: Computational framework and tools for LC-MS discovery and targeted mass spectrometry-based proteomics 
BMC Bioinformatics  2008;9:542.
Background
Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, they are generally not comparable to each other in terms of functionality, user interfaces, information input/output, and do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists, and other researchers not trained in bioinformatics, who wish to use LC-MS-based quantitative proteomics.
Results
We have developed Corra, a computational framework and tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, as well as statistical algorithms originally developed for microarray data analyses, so that they are appropriate for LC-MS data analysis. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Corra also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling.
Conclusion
The Corra computational framework leverages computational innovation to enable biologists or other researchers to process, analyze and visualize LC-MS data with what would otherwise be a complex and not user-friendly suite of tools. Corra enables appropriate statistical analyses, with controlled false-discovery rates, ultimately to inform subsequent targeted identification of differentially abundant peptides by MS/MS. For the user not trained in bioinformatics, Corra represents a complete, customizable, free and open source computational platform enabling LC-MS-based proteomic workflows, and as such, addresses an unmet need in the LC-MS proteomics field.
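To make "controlled false-discovery rates" concrete, here is a minimal sketch of one common way to select differentially abundant features at a chosen FDR. The abstract does not say which procedure Corra uses. The Benjamini-Hochberg step-up adjustment below is an assumption swapped in for illustration, and the function names and feature identifiers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_features(feature_ids, p_values, fdr=0.05):
    """Keep the features whose adjusted p-value is at or below the FDR level."""
    adjusted = benjamini_hochberg(p_values)
    return [fid for fid, q in zip(feature_ids, adjusted) if q <= fdr]


if __name__ == "__main__":
    # Hypothetical peptide features and per-feature test p-values.
    ids = ["feat_a", "feat_b", "feat_c", "feat_d"]
    pvals = [0.01, 0.04, 0.03, 0.005]
    print(significant_features(ids, pvals, fdr=0.03))
```

A list filtered this way could then be handed to a targeted MS/MS step, which is the kind of workflow the abstract describes.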
doi:10.1186/1471-2105-9-542
PMCID: PMC2651178  PMID: 19087345
13.  The expanding proteome of the molecular chaperone HSP90 
Cell Cycle  2012;11(7):1301-1308.
The molecular chaperone HSP90 maintains the activity and stability of a diverse set of “client” proteins that play key roles in normal and disease biology. Around 20 HSP90 inhibitors that deplete the oncogenic clientele have entered clinical trials for cancer. However, the full extent of the HSP90-dependent proteome, which encompasses not only clients but also proteins modulated by downstream transcriptional responses, is still incompletely characterized and poorly understood. Earlier large-scale efforts to define the HSP90 proteome have been valuable but are incomplete because of limited technical sensitivity. Here, we discuss previous large-scale surveys of proteome perturbations induced by HSP90 inhibitors in light of a significant new study using state-of-the-art stable isotope labeling by amino acids (SILAC) technology combined with more sensitive high-resolution mass spectrometry (MS) that extends the catalog of proteomic changes in inhibitor-treated cancer cells. Among wide-ranging changes, major functional responses include downregulation of protein kinase activity and the DNA damage response alongside upregulation of the protein degradation machinery. Despite this improved proteomic coverage, there was surprisingly little overlap with previous studies. This may be due in part to technical issues but is likely also due to the variability of the HSP90 proteome with the inhibitor conditions used, the cancer cell type and the genetic status of client proteins. We suggest future proteomic studies to address these factors, to help distinguish client protein components from indirect transcriptional components and to address other key questions in fundamental and translational HSP90 research. Such studies should also reveal new biomarkers for patient selection and novel targets for therapeutic intervention.
doi:10.4161/cc.19722
PMCID: PMC3350876  PMID: 22421145
HSP90; HSP90 proteome; HSP90 inhibitors; HSP90 biomarkers; cancer
14.  The Proteogenomic Path towards Biomarker Discovery 
Pediatric transplantation  2008;12(7):737-747.
The desire for biomarkers for diagnosis and prognosis of diseases has never been greater. With the availability of genome data and an increased availability of proteome data, the discovery of biomarkers has become increasingly feasible. However, the task is daunting and requires collaborations among researchers working in the fields of transplantation, immunology, genetics, molecular biology, biostatistics, and bioinformatics. With the advancement of high-throughput omic techniques such as genomics and proteomics (collectively known as proteogenomics), efforts have been made to develop diagnostic tools from new and to-be-discovered biomarkers. Yet biomarker validation, particularly in organ transplantation, remains challenging because of the lack of a true gold standard for diagnostic categories and analytical bottlenecks that face high-throughput data deconvolution. Even though the microarray technique is relatively mature, proteomics is still growing with regard to data normalization and analysis methods. Study design, sample selection, and rigorous data analysis are the critical issues for biomarker discovery using high-throughput proteogenomic technologies that combine the use and strengths of both genomics and proteomics. In this review, we look into the current status and latest developments in the field of biomarker discovery using genomics and proteomics related to organ transplantation, with an emphasis on the evolution of proteomic technologies.
doi:10.1111/j.1399-3046.2008.01018.x
PMCID: PMC2574627  PMID: 18764911
Biomarker discovery; proteogenomics; genomics; proteomics; microarray; transplantation; acute rejection; peptidomics
15.  Proteomics: a subcellular look at spermatozoa 
Background
Male-factor infertility presents a vexing problem for many reproductively active couples. Many studies have focused on abnormal sperm parameters. Recent advances in proteomic techniques, especially in mass spectrometry, have aided in the study of sperm and more specifically, sperm proteins. The aim of this study was to review the current literature on the various proteomic techniques, and their usefulness in diagnosing sperm dysfunction and potential applications in the clinical setting.
Methods
Review of PubMed database. Key words: spermatozoa, proteomics, protein, proteome, 2D-PAGE, mass spectrometry.
Results
Recently employed proteomic methods, such as two-dimensional polyacrylamide gel electrophoresis, mass spectrometry, and differential in gel electrophoresis, have identified numerous sperm-specific proteins. They also have provided a further understanding of protein function involved in sperm processes and for the differentiation between normal and abnormal states. In addition, studies on the sperm proteome have demonstrated the importance of post-translational modifications, and their ability to bring about physiological changes in sperm function. No longer do researchers believe that in order for them to elucidate the biochemical functions of genes, mere knowledge of the human genome sequence is sufficient. Moreover, a greater understanding of the physiological function of every protein in the tissue-specific proteome is essential in order to unravel the biological display of the human genome.
Conclusion
Recent advances in proteomic techniques have provided insight into sperm function and dysfunction. Several multidimensional separation techniques can be utilized to identify and characterize spermatozoa. Future developments in bioinformatics can further assist researchers in understanding the vast amount of data collected in proteomic studies. Moreover, such advances in proteomics may help to decipher metabolites which can act as biomarkers in the detection of sperm impairments and to potentially develop treatment for infertile couples.
Further comprehensive studies on sperm-specific proteome, mechanisms of protein function and its proteolytic regulation, biomarkers and functional pathways, such as oxidative-stress induced mechanisms, will provide better insight into physiological functions of the spermatozoa. Large-scale proteomic studies using purified protein assays will eventually lead to the development of novel biomarkers that may allow for detection of disease states, genetic abnormalities, and risk factors for male infertility. Ultimately, these biomarkers will allow for a better diagnosis of sperm dysfunction and aid in drug development.
doi:10.1186/1477-7827-9-36
PMCID: PMC3071316  PMID: 21426553
16.  Microbial proteomics: a mass spectrometry primer for biologists 
It is now more than 10 years since the publication of the first microbial genome sequence, and science is now moving towards a post-genomic era with transcriptomics and proteomics offering insights into cellular processes and function. The ability to assess the entire protein network of a cell at a given spatial or temporal point will have a profound effect upon microbial science as the function of proteins is inextricably linked to phenotype. Whilst such a situation is still beyond current technologies, rapid advances in mass spectrometry, bioinformatics and protein separation technologies have produced a step change in our current proteomic capabilities. Consequently, a small but steadily growing number of groups are taking advantage of this cutting-edge technology to discover more about the physiology and metabolism of microorganisms. From this research it will be possible to move towards a systems biology understanding of a microorganism, whereupon researchers can build a comprehensive cellular map for each microorganism that links an accurately annotated genome sequence to gene expression data at the transcriptomic and proteomic levels.
In order for microbiologists to embrace the potential that proteomics offers, an understanding of a variety of analytical tools is required. The aim of this review is to provide a basic overview of mass spectrometry (MS) and its application to protein identification. In addition we will describe how the protein complexity of microbial samples can be reduced by gel-based and gel-free methodologies prior to analysis by MS. Finally in order to illustrate the power of microbial proteomics a case study of its current application within the Bacilliaceae is given together with a description of the emerging discipline of metaproteomics.
doi:10.1186/1475-2859-6-26
PMCID: PMC1971468  PMID: 17697372
17.  Laboratory markers in ulcerative colitis: Current insights and future advances 
Ulcerative colitis (UC) and Crohn’s disease (CD) are the major forms of inflammatory bowel diseases (IBD) in man. Despite some common features, these forms can be distinguished by different genetic predisposition, risk factors and clinical, endoscopic and histological characteristics. The aetiology of both CD and UC remains unknown, but several lines of evidence suggest that CD and perhaps UC are due to an excessive immune response directed against normal constituents of the intestinal bacterial flora. Tests, some of them invasive, are routine for the diagnosis and care of patients with IBD. Diagnosis of UC is based on clinical symptoms combined with radiological and endoscopic investigations. The employment of non-invasive biomarkers is needed. These biomarkers have the potential to avoid invasive diagnostic tests that may result in discomfort and potential complications. The ability to determine the type, severity, prognosis and response to therapy of UC using biomarkers has long been a goal of clinical researchers. We describe the biomarkers assessed in UC, with special reference to acute-phase proteins and serologic markers, and thereafter we describe the new biological markers and the biological markers that could be developed in the future: (1) serum markers of acute phase response: The laboratory tests most used to measure the acute-phase proteins in clinical practice are the serum concentration of C-reactive protein and the erythrocyte sedimentation rate. Other biomarkers of inflammation in UC include platelet count, leukocyte count, and serum albumin and serum orosomucoid concentrations; (2) serologic markers/antibodies: In the last decades serological and immunologic biomarkers have been studied extensively in immunology and have been used in clinical practice to detect specific pathologies.
In UC, the presence of these antibodies can serve as a surrogate marker for the aberrant host immune response; and (3) future biomarkers: The development of biomarkers in UC will be very important in the future. Progress in molecular biology tools (microarrays, proteomics and nanotechnology) has revolutionised the field of biomarker discovery. The advances in bioinformatics coupled with cross-disciplinary collaborations have greatly enhanced our ability to retrieve, characterize and analyse large amounts of data generated by the technological advances. The techniques available for biomarker development are genomics (single nucleotide polymorphism genotyping, pharmacogenetics and gene expression analyses) and proteomics. In the future, the addition of new serological markers will add significant benefit. Correlating serologic markers with genotypes and clinical phenotypes should enhance our understanding of the pathophysiology of UC.
doi:10.4291/wjgp.v6.i1.13
PMCID: PMC4325297
Inflammatory bowel diseases; Ulcerative colitis; Crohn’s disease; Serologic markers; Acute phase response
18.  A dynamic model of proteome changes reveals new roles for transcript alteration in yeast 
By characterizing dynamic changes in yeast protein abundance following osmotic shock, this study shows that the correlation between protein and mRNA differs for transcripts that increase versus decrease in abundance, and reveals physiological reasons for these differences.
The correlation between protein and mRNA change is very high at transcripts that increase in abundance, but negligible at reduced transcripts following NaCl shock. Modeling and experimental data suggest that reducing levels of high-abundance transcripts helps to direct translational machinery to newly made transcripts. The transient burst of transcript increase serves to accelerate changes in protein abundance. Post-transcriptional regulation of protein abundance is pervasive, although most of the variance in protein change is explained by changes in mRNA abundance.
Natural microenvironments change rapidly, and living creatures must respond quickly and efficiently to thrive within this flux. At all cellular levels—signaling, transcription, translation, metabolism, cell growth, and division—the response is dynamic and coordinated. Some aspects of this response, such as dynamic changes of the transcriptome, are well understood. But other aspects, like the response of the proteome, have remained obscured primarily because of previous limitations in technology. Without coordinated time-course data, it has remained impossible to correctly characterize the correlations and dependencies between these two essential levels of cell biology.
This work presents an extended picture of the coordinated response of the transcriptome and proteome as cells respond to an abrupt environmental change. To assay proteomic dynamics, we developed a strategy for large-scale, multiplexed quantitation using isobaric tags and high mass accuracy mass spectrometry. This sensitive yet efficient platform allows for the expedient collection of quantitative time-course proteomic data at six time points, sufficiently reproducible to permit meaningful interpretation of variation across biological replicates. Time-course transcriptome data were generated from paired biological samples, allowing us to examine the relationships between changes in mRNA and protein for each gene in terms of direction and intensity, as well as the characteristics of the temporal profiles for each gene.
It was immediately obvious that a single measure of correlation across the entire data set was a meaningless metric. We therefore analyzed relationships between mRNA and protein for different subsets of data. In response to osmotic shock, hundreds of transcripts are highly induced, and their temporal pattern reveals a transient peak of maximal induction, which resolves into a new elevated level as cells acclimate (Figure 2). For this group of genes, there is extremely high correlation between peak mRNA change and protein change (R² ≈ 0.8). But the dynamics of the molecules differ: while mRNA levels transiently overshoot their final levels, proteins gradually rise in abundance toward their new, elevated state. We observed, however, that a measure of efficiency connects the two profiles. The time it takes for a protein to acclimate to its new state correlates with the magnitude of the excess mRNA induction. Thus, the cell imparts an urgency to protein induction by transiently producing excess transcript.
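The link between excess mRNA induction and faster protein acclimation can be illustrated with a toy simulation. This is a minimal sketch assuming first-order kinetics, dP/dt = k_syn·m(t) − k_loss·P, integrated with a simple Euler step. It is not the study's model, and the rate constants, the shape of the mRNA pulse, and the function names are hypothetical.

```python
def mrna_level(t, final_level, pulse_extra, pulse_end):
    """Relative mRNA after stress at t = 0 (pre-stress level is 1.0).

    The transcript sits at `final_level`, plus a transient excess of
    `pulse_extra` until `pulse_end` (minutes).
    """
    return final_level + (pulse_extra if t < pulse_end else 0.0)


def time_to_reach(fraction, pulse_extra, final_level=2.0, pulse_end=10.0,
                  k_syn=1.0, k_loss=0.05, dt=0.1, t_max=500.0):
    """Time (min) for protein to cover `fraction` of the way from its old
    steady state to its new one; returns None if not reached by t_max."""
    protein = k_syn * 1.0 / k_loss             # pre-stress steady state
    new_steady = k_syn * final_level / k_loss  # post-stress steady state
    target = protein + fraction * (new_steady - protein)
    t = 0.0
    while t < t_max:
        if protein >= target:
            return t
        m = mrna_level(t, final_level, pulse_extra, pulse_end)
        protein += dt * (k_syn * m - k_loss * protein)  # forward Euler step
        t += dt
    return None


if __name__ == "__main__":
    print("no overshoot:  ", time_to_reach(0.9, pulse_extra=0.0), "min")
    print("with overshoot:", time_to_reach(0.9, pulse_extra=1.0), "min")
```

With these illustrative numbers the protein reaches 90% of its change sooner when the transcript transiently overshoots, while the final protein level is the same in both runs.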
The most surprising result, however, involves transcripts that decrease in abundance. In response to osmotic shock, the cell transiently reduces over 600 transcripts, many of which are among the most highly expressed in unstressed cells. But protein levels for these genes remain, for the most part, almost completely unchanged. The stark absence of protein repression is independent of basal protein abundance, independent of reported protein half-lives, reproducible across biological replicates, and validated by quantitative western blots. Furthermore, since we do detect a handful of proteins whose abundance is significantly reduced, our technology is capable of identifying protein loss. Thus, we conclude that transcript reduction serves another purpose besides reducing protein levels.
To explore alternate interpretations of the consequence of transcriptional repression, we devised a mass-action kinetic model, which describes protein changes based on mRNA dynamics in the context of transient changes in the rates of cell division. The model successfully recapitulated the observed data, allowing us to alter modeling parameters to test various hypotheses.
In response to osmotic shock, overall rates of translation temporarily decrease and cell growth transiently arrests before resuming at a slower rate. We reasoned that mRNA reduction might lower the rate of new protein synthesis, but that retarded production is balanced by reduced cell division. We explored both aspects of this logic with our model.
As expected, removing cell division from our model led to a calculated decrease of protein levels, indicating that reduced growth is necessary for maintaining protein levels. However, when we computationally held mRNA levels stable and calculated protein levels in the absence of mRNA repression, we did not find the expected increase in protein abundance.
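The interplay between transcript reduction and growth arrest can be sketched with a toy mass-action model in which protein is lost through degradation and through dilution by cell division: dP/dt = k_syn·m − (k_deg + μ)·P. This is one reading of the reasoning above, not the authors' model. The parameter values (a stable protein, a 90-minute doubling time, mRNA reduced to 20% for 60 minutes) and the function name are hypothetical.

```python
import math


def simulate_protein(growth_arrest, mrna_level=0.2, k_syn=1.0, k_deg=0.001,
                     doubling_time_min=90.0, duration_min=60.0, dt=0.1):
    """Return protein level after stress, relative to the pre-stress level.

    growth_arrest=True sets the dilution rate to zero during the stress
    window; False keeps cells dividing at the pre-stress rate.
    """
    mu0 = math.log(2) / doubling_time_min   # pre-stress dilution rate (1/min)
    p0 = k_syn * 1.0 / (k_deg + mu0)        # pre-stress steady state (mRNA = 1)
    mu = 0.0 if growth_arrest else mu0
    protein = p0
    for _ in range(int(round(duration_min / dt))):
        # Forward Euler step: synthesis from reduced mRNA minus total loss.
        protein += dt * (k_syn * mrna_level - (k_deg + mu) * protein)
    return protein / p0


if __name__ == "__main__":
    print("with growth arrest:   ", round(simulate_protein(True), 3))
    print("without growth arrest:", round(simulate_protein(False), 3))
```

In this sketch the protein level stays near its starting value when division pauses, but drops when dilution continues at the pre-stress rate. That matches the qualitative point that reduced growth helps maintain protein levels despite lower transcript abundance.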
We then considered the possibility that one function of the regulated repression of these highly abundant transcripts was to liberate proteins essential for translation, such as ribosomes or translation initiation factors. To explore this, we examined a mutant lacking the Dot6p/Tod6p transcriptional repressors, which fails to properly repress ∼250 genes in response to osmotic shock. In the wild type, the mRNA for a Dot6p/Tod6p target (ARX1) decreased seven-fold, and the remaining transcript was generally unassociated with poly-ribosomes. In the mutant, however, the mRNA levels were reduced only two-fold, while the remaining transcript continued to bind ribosomes. Therefore, failure to reduce transcript levels led to a persistent association with poly-ribosomes, thereby consuming translational machinery.
Our hypothesis is, therefore, that widespread changes in the transcriptome promote efficient translation of new proteins. Transcript increase serves to increase abundance of the encoded proteins, while reduction of some of the most abundant and highly translated mRNAs supports this project by liberating translational capacity. While it is not clear what factors are the limiting elements, it is clear that a full picture of cellular biology requires exploring the dynamics of the cellular response.
The transcriptome and proteome change dynamically as cells respond to environmental stress; however, prior proteomic studies reported poor correlation between mRNA and protein, rendering their relationships unclear. To address this, we combined high mass accuracy mass spectrometry with isobaric tagging to quantify dynamic changes in ∼2500 Saccharomyces cerevisiae proteins, in biological triplicate and with paired mRNA samples, as cells acclimated to high osmolarity. Surprisingly, while transcript induction correlated extremely well with protein increase, transcript reduction produced little to no change in the corresponding proteins. We constructed a mathematical model of dynamic protein changes and propose that the lack of protein reduction is explained by cell-division arrest, while transcript reduction supports redistribution of translational machinery. Furthermore, the transient ‘burst' of mRNA induction after stress serves to accelerate change in the corresponding protein levels. We identified several classes of post-transcriptional regulation, but show that most of the variance in protein changes is explained by mRNA. Our results present a picture of the coordinated physiological responses at the levels of mRNA, protein, protein-synthetic capacity, and cellular growth.
doi:10.1038/msb.2011.48
PMCID: PMC3159980  PMID: 21772262
dynamics; modeling; proteomics; stress; transcriptomics
19.  Absolute quantification of microbial proteomes at different states by directed mass spectrometry 
The developed, directed mass spectrometry workflow allows the generation of consistent and system-wide quantitative maps of microbial proteomes in a single analysis. Application to the human pathogen L. interrogans revealed mechanistic proteome changes over time involved in pathogenic progression and antibiotic defense, and new insights about the regulation of absolute protein abundances within operons.
The developed, directed proteomic approach allowed consistent detection and absolute quantification of 1680 proteins of the human pathogen L. interrogans in a single LC–MS/MS experiment. The comparison of 25 extensive, consistent and quantitative proteome maps revealed new insights about the proteome changes involved in pathogenic progression and antibiotic defense of L. interrogans, and about the regulation of protein abundances within operons. The generated time-resolved data sets are compatible with pattern analysis algorithms developed for transcriptomics, including hierarchical clustering and functional enrichment analysis of the detected profile clusters. This is the first study that describes the absolute quantitative behavior of any proteome over multiple states and represents the most comprehensive proteome abundance pattern comparison for any organism to date.
Over the last decade, mass spectrometry (MS)-based proteomics has evolved as the method of choice for system-wide proteome studies and now allows for the characterization of several thousands of proteins in a single sample. Despite these great advances, redundant monitoring of protein levels over large sample numbers in a high-throughput manner remains a challenging task. New directed MS strategies have been shown to overcome some of the current limitations, thereby enabling the acquisition of consistent and system-wide data sets of proteomes with low-to-moderate complexity at high throughput.
In this study, we applied this integrated, two-stage MS strategy to investigate global proteome changes in the human pathogen L. interrogans. In the initial discovery phase, 1680 proteins (out of around 3600 gene products) could be identified (Schmidt et al, 2008) and, by focusing precious MS-sequencing time on the most dominant, specific peptides per protein, all proteins could be accurately and consistently monitored over 25 different samples within a few days of instrument time in the following scoring phase (Figure 1). Additionally, the co-analysis of heavy reference peptides enabled us to obtain absolute protein concentration estimates for all identified proteins in each perturbation (Malmström et al, 2009). The detected proteins did not show any biases against functional groups or protein classes, including membrane proteins, and span an abundance range of more than three orders of magnitude, a range that is expected to cover most of the L. interrogans proteome (Malmström et al, 2009).
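The role of heavy reference peptides in absolute quantification can be illustrated with a single-point isotope-dilution calculation. This is a minimal sketch under simplifying assumptions, not the procedure of the cited work, whose details are not given here. It assumes that the light (endogenous) and heavy (spiked) peptide respond identically in the instrument, and the function name and numbers are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_cell(light_intensity, heavy_intensity, heavy_spiked_fmol, n_cells):
    """Estimate protein copies per cell from a light/heavy peptide pair.

    The endogenous amount is the spiked heavy amount scaled by the
    light-to-heavy intensity ratio, then converted to molecules and
    divided by the number of cells in the analyzed sample.
    """
    if heavy_intensity <= 0 or n_cells <= 0:
        raise ValueError("heavy_intensity and n_cells must be positive")
    light_fmol = heavy_spiked_fmol * light_intensity / heavy_intensity
    molecules = light_fmol * 1e-15 * AVOGADRO  # fmol -> mol -> molecules
    return molecules / n_cells


if __name__ == "__main__":
    # Hypothetical measurement: light peak twice the heavy peak, 10 fmol
    # of heavy peptide spiked into a digest from 1e8 cells.
    print(copies_per_cell(2.0e6, 1.0e6, 10.0, 1e8))
```

In practice only a subset of proteins would usually have a matching reference peptide, so estimates for the remaining proteins would need an additional calibration step that this sketch does not cover.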
To elucidate mechanistic proteome changes over time involved in pathogenic progression and antibiotic defense of L. interrogans, we generated time-resolved proteome maps of cells perturbed with serum and three different antibiotics at sublethal concentrations that are currently used to treat Leptospirosis. This yielded an information-rich proteomic data set that describes, for the first time, the absolute quantitative behavior of any proteome over multiple states, and represents the most comprehensive proteome abundance pattern comparison for any organism to date. Using this unique property of the data set, we could quantify protein components of entire pathways across several time points and subject the data sets to cluster analysis, a tool that was previously limited to the transcript level due to incomplete sampling on protein level (Figure 4). Based on these analyses, we could demonstrate that Leptospira cells adjust the cellular abundance of a certain subset of proteins and pathways as a general response to stress while other parts of the proteome respond in a highly specific manner. The cells furthermore react to individual treatments by ‘fine tuning' the abundance of certain proteins and pathways in order to cope with the specific cause of stress. Intriguingly, the most specific and significant expression changes were observed for proteins involved in motility, tissue penetration and virulence after serum treatment where we tried to simulate the host environment. While many of the detected protein changes demonstrate good agreement with available transcriptomics data, most proteins showed a poor correlation. This includes potential virulence factors, like Loa22 or OmpL1, with confirmed expression in vivo that were significantly up-regulated on the protein level, but not on the mRNA level, strengthening the importance of proteomic studies.
The high resolution and coverage of the proteome data set enabled us to further investigate protein abundance changes of co-regulated genes within operons. This suggests that although most proteins within an operon respond to regulation synchronously, bacterial cells seem to have subtle means to adjust the levels of individual proteins or protein groups outside of the general trend, a phenomena that was recently also observed on the transcript level of other bacteria (Güell et al, 2009).
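One way to look for proteins that depart from the general trend of their operon is to compare each protein's log2 fold change with the operon median. The sketch below is an illustration under stated assumptions, not the analysis used in the study: the function name, the threshold of 1.0 log2 units, and the protein and operon identifiers are all hypothetical.

```python
from statistics import median


def operon_outliers(log2_fc, operons, threshold=1.0):
    """Return {protein: deviation} for proteins whose log2 fold change
    differs from their operon's median by more than `threshold`."""
    outliers = {}
    for members in operons.values():
        values = [log2_fc[p] for p in members if p in log2_fc]
        if len(values) < 2:
            continue  # no operon-level trend can be defined from one protein
        center = median(values)
        for p in members:
            if p in log2_fc and abs(log2_fc[p] - center) > threshold:
                outliers[p] = log2_fc[p] - center
    return outliers


# Hypothetical fold changes (treated vs. control) and operon membership.
log2_fc = {"protA": 1.1, "protB": 0.9, "protC": -1.2, "protD": 0.2, "protE": 0.3}
operons = {"operon1": ["protA", "protB", "protC"], "operon2": ["protD", "protE"]}

result = operon_outliers(log2_fc, operons)
print(result)
```

The median is used rather than the mean so that a single divergent protein does not shift the reference value it is compared against.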
The method can be implemented with standard high-resolution mass spectrometers and software tools that are readily available in the majority of proteomics laboratories. It is scalable to any proteome of low-to-medium complexity and can be extended to post-translational modifications or peptide-labeling strategies for quantification. We therefore expect the approach outlined here to become a cornerstone for microbial systems biology.
Over the past decade, liquid chromatography coupled with tandem mass spectrometry (LC–MS/MS) has evolved into the main proteome discovery technology. Up to several thousand proteins can now be reliably identified from a sample and the relative abundance of the identified proteins can be determined across samples. However, the remeasurement of substantially similar proteomes, for example those generated by perturbation experiments in systems biology, at high reproducibility and throughput remains challenging. Here, we apply a directed MS strategy to detect and quantify sets of pre-determined peptides in tryptic digests of cells of the human pathogen Leptospira interrogans at 25 different states. We show that in a single LC–MS/MS experiment around 5000 peptides, covering 1680 L. interrogans proteins, can be consistently detected and their absolute expression levels estimated, revealing new insights about the proteome changes involved in pathogenic progression and antibiotic defense of L. interrogans. This is the first study that describes the absolute quantitative behavior of any proteome over multiple states, and represents the most comprehensive proteome abundance pattern comparison for any organism to date.
doi:10.1038/msb.2011.37
PMCID: PMC3159967  PMID: 21772258
absolute quantification; directed mass spectrometry; Leptospira interrogans; microbiology; proteomics
20.  Integration of Proteomics, Bioinformatics, and Systems Biology in Traumatic Brain Injury Biomarker Discovery 
Traumatic brain injury (TBI) is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS) have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase, which is only the first step in the long process of biomarker development. The high-throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data, presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics, and systems biology. This review gives a brief overview of how bioinformatics and systems biology tools analyze, transform, and interpret complex MS datasets into biologically relevant results. In addition, challenges and limitations of proteomics, bioinformatics, and systems biology in TBI biomarker discovery are presented. A brief survey of studies that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed.
doi:10.3389/fneur.2013.00061
PMCID: PMC3668328  PMID: 23750150
proteomics; biomarkers; traumatic brain injury; bioinformatics; systems biology
21.  Computational Biomarker Pipeline from Discovery to Clinical Implementation: Plasma Proteomic Biomarkers for Cardiac Transplantation 
PLoS Computational Biology  2013;9(4):e1002963.
Recent technical advances in the field of quantitative proteomics have stimulated a large number of biomarker discovery studies of various diseases, providing avenues for new treatments and diagnostics. However, inherent challenges have limited the successful translation of candidate biomarkers into clinical use, thus highlighting the need for a robust analytical methodology to transition from biomarker discovery to clinical implementation. We have developed an end-to-end computational proteomic pipeline for biomarker studies. At the discovery stage, the pipeline emphasizes different aspects of experimental design, appropriate statistical methodologies, and quality assessment of results. At the validation stage, the pipeline focuses on the migration of the results to a platform appropriate for external validation, and the development of a classifier score based on corroborated protein biomarkers. At the last stage towards clinical implementation, the main aims are to develop and validate an assay suitable for clinical deployment, and to calibrate the biomarker classifier using the developed assay. The proposed pipeline was applied to a biomarker study in cardiac transplantation aimed at developing a minimally invasive clinical test to monitor acute rejection. Starting with an untargeted screening of the human plasma proteome, five candidate biomarker proteins were identified. Rejection-regulated proteins reflect cellular and humoral immune responses, acute phase inflammatory pathways, and lipid metabolism biological processes. A multiplex multiple reaction monitoring mass-spectrometry (MRM-MS) assay was developed for the five candidate biomarkers and validated by enzyme-linked immunosorbent assays (ELISA) and immunonephelometric assays (INA). A classifier score based on corroborated proteins demonstrated that the developed MRM-MS assay provides an appropriate methodology for an external validation, which is still in progress.
Plasma proteomic biomarkers of acute cardiac rejection may offer a relevant post-transplant monitoring tool to effectively guide clinical care. The proposed computational pipeline is highly applicable to a wide range of biomarker proteomic studies.
Author Summary
Novel proteomic technology has led to the generation of vast amounts of biological data and the identification of numerous potential biomarkers. However, computational approaches to translate this information into knowledge capable of impacting clinical care have been lagging. We propose a computational proteomic pipeline for biomarker studies that is founded on the combination of advanced statistical methodologies. We demonstrate our approach through the analysis of data obtained from heart transplant patients. Heart transplantation is the gold standard treatment for patients with end-stage heart failure, but is complicated by episodes of immune rejection that can adversely impact patient outcomes. Current rejection monitoring approaches are highly invasive, requiring a biopsy of the heart. This work aims to reduce the need for biopsies, and demonstrate the power and utility of computational approaches in proteomic biomarker discovery. Our work utilizes novel high-throughput proteomic technology combined with advanced statistical techniques to identify blood markers that guide the decision as to whether a biopsy is warranted, reduce the number of unnecessary biopsies, and ultimately diagnose the presence of rejection in heart transplant patients. Additionally, the proposed computational methodologies can be applied to a range of proteomic biomarker studies of various diseases and conditions.
doi:10.1371/journal.pcbi.1002963
PMCID: PMC3617196  PMID: 23592955
22.  Improvements in proteomic metrics of low abundance proteins through proteome equalization using ProteoMiner prior to MudPIT 
Journal of proteome research  2011;10(8):3690-3700.
Ideally, shotgun proteomics would facilitate the identification of an entire proteome with 100% protein sequence coverage. In reality, the large dynamic range and complexity of cellular proteomes result in oversampling of abundant proteins, while peptides from low abundance proteins are undersampled or remain undetected. We tested the proteome equalization technology, ProteoMiner, in conjunction with Multidimensional Protein Identification Technology (MudPIT) to determine how the equalization of protein dynamic range could improve shotgun proteomics methods for the analysis of cellular proteomes. Our results suggest low abundance protein identifications were improved by two mechanisms: (1) depletion of high abundance proteins freed ion trap sampling space usually occupied by high abundance peptides and (2) enrichment of low abundance proteins increased the probability of sampling their corresponding more abundant peptides. Both mechanisms also contributed to dramatic increases in the quantity of peptides identified and the quality of MS/MS spectra acquired due to increases in precursor intensity of peptides from low abundance proteins. From our large data set of identified proteins, we categorized the dominant physicochemical factors that facilitate proteome equalization with a hexapeptide library. These results illustrate that equalization of the dynamic range of the cellular proteome is a promising methodology to improve low abundance protein identification confidence, reproducibility, and sequence coverage in shotgun proteomics experiments, opening a new avenue of research for improving proteome coverage.
doi:10.1021/pr200304u
PMCID: PMC3161494  PMID: 21702434
shotgun proteomics; peptide identification; proteome coverage; protein abundance dynamic range
23.  A novel strategy for the comprehensive analysis of the biomolecular composition of isolated plasma membranes 
A methodology for rapid, high-purity isolation of plasma membranes using superparamagnetic nanoparticles is described. The method is illustrated with high-resolution proteomic, glycomic and lipidomic analyses of presenilin-deficient cells.
We present a novel strategy based on cationic phospholipid-coated superparamagnetic nanoparticles (SPMNPs) to isolate plasma membranes with very high purity and at a preparative scale. The SPMNP-based isolation method is compatible with subsequent analysis of the biomolecular composition of plasma membranes, including proteomics, lipidomics and N-glycomics analysis. A comparative 'omics' analysis of plasma membranes from wild-type, presenilin-deficient and human presenilin-1-rescued fibroblasts revealed convergent changes in proteins and lipids, suggesting an underlying endosomal transport defect. Our methodology allows for the systematic set-up of comprehensive plasma membrane inventories: alterations in the composition of the cell surface may potentially identify novel biomarkers or drug targets.
One of the major goals of this paper was to establish a robust method for plasma membrane (PM) isolation in order to perform a full analysis of their biomolecular composition. Using thermal decomposition, we manufactured superparamagnetic nanoparticles (SPMNPs) that we rendered water soluble and monodisperse by subsequent coupling of NH2 phospholipids. When incubated with cell monolayers, these cationic phospholipid-SPMNPs remain predominantly localized at the cell surface. We applied this unexpected feature of phospholipid-SPMNPs to establish a novel protocol to isolate high yields of highly pure PMs. Due to the superb quality and quantity of isolated PM fractions, we could perform a comprehensive and comparative biomolecular profiling on this subcellular compartment that included proteomics, N-glycoproteomics, lipidomics and N-glycan profiling. This method was subsequently applied to compare the biomolecular composition of PMs isolated from wild-type, presenilin-deficient and human presenilin-1-rescued mouse embryonic fibroblasts (MEFs). For the first time, we succeeded in identifying convergent changes in the biomolecular composition of the cell surface caused by presenilin gene deficiency. Furthermore, the observed proteomic/cholesterol changes were restored in presenilin-deficient MEFs rescued with the human presenilin-1 ortholog. These (subtle) changes in protein and lipid composition suggest an underlying endosomal transport defect in presenilin-deficient cell lines that is the subject of ongoing research.
Moreover, and extending the versatility of our method, we could show that our PM isolations are compatible with fluorophore-assisted carbohydrate electrophoresis (FACE), allowing for N-glycan profiling of the cell surface. Hitherto, the N-glycan composition was measured on total cell extracts, where the sensitivity to detect mature N-glycan chains is significantly hampered by the higher abundances of intracellular immature N-glycan intermediates. Finally, using the γ-secretase protein complex as a model, we confirm that we can study the activity and composition of active protein complexes in their native membrane environment. Our strategy incorporates, for the first time, most available omics analyses on a single isolated membrane compartment. As such, it allows for the monitoring and identification of systematic changes at the cell surface, for instance during differentiation and polarization or as a consequence of environmental insults, ER-stress, apoptotic stimuli and altered lipogenesis. The identification of alterations in the PM protein and lipid composition, as they occur as a consequence of disease, is therefore of paramount importance in several fields of experimental medicine, including immunology, cancer and stem cell research. Our methodology also provides a foundation to systematically start collecting PM 'fingerprints' of an increasing number of cells. The resulting integrated and comprehensive databases may become the starting point of a 'subcellular systems biology' approach of the cell's limiting membrane. Since PM proteins provide many pathologically relevant biomarkers representing two-thirds of the currently used drug targets, this novel technology has great potential for biomedical and pharmaceutical applications.
We manufactured a novel type of lipid-coated superparamagnetic nanoparticles that allow for a rapid isolation of plasma membranes (PMs), enabling high-resolution proteomic, glycomic and lipidomic analyses of the cell surface. We used this technology to characterize the effects of presenilin knockout on the PM composition of mouse embryonic fibroblasts. We found that many proteins are selectively downregulated at the cell surface of presenilin knockout cells concomitant with lowered surface levels of cholesterol and certain sphingomyelin species, indicating defects in specific endosomal transport routes to and/or from the cell surface. Snapshots of N-glycoproteomics and cell surface glycan profiling further underscored the power and versatility of this novel methodology. Since PM proteins provide many pathologically relevant biomarkers representing two-thirds of the currently used drug targets, this novel technology has great potential for biomedical and pharmaceutical applications.
doi:10.1038/msb.2011.74
PMCID: PMC3261717  PMID: 22027552
eukaryotic cell systems; glycomics; lipidomics; presenilin; proteomics
24.  Systems Integration of Biodefense Omics Data for Analysis of Pathogen-Host Interactions and Identification of Potential Targets 
PLoS ONE  2009;4(9):e7162.
The NIAID (National Institute for Allergy and Infectious Diseases) Biodefense Proteomics program aims to identify targets for potential vaccines, therapeutics, and diagnostics for agents of concern in bioterrorism, including bacterial, parasitic, and viral pathogens. The program includes seven Proteomics Research Centers, generating diverse types of pathogen-host data, including mass spectrometry, microarray transcriptional profiles, protein interactions, protein structures and biological reagents. The Biodefense Resource Center (www.proteomicsresource.org) has developed a bioinformatics framework, employing a protein-centric approach to integrate and support mining and analysis of the large and heterogeneous data. Underlying this approach is a data warehouse with comprehensive protein + gene identifier and name mappings and annotations extracted from over 100 molecular databases. Value-added annotations are provided for key proteins from experimental findings using controlled vocabulary. The availability of pathogen and host omics data in an integrated framework allows global analysis of the data and comparisons across different experiments and organisms, as illustrated in several case studies presented here. (1) The identification of a hypothetical protein with differential gene and protein expressions in two host systems (mouse macrophage and human HeLa cells) infected by different bacterial (Bacillus anthracis and Salmonella typhimurium) and viral (orthopox) pathogens suggesting that this protein can be prioritized for additional analysis and functional characterization. (2) The analysis of a vaccinia-human protein interaction network supplemented with protein accumulation levels led to the identification of human Keratin, type II cytoskeletal 4 protein as a potential therapeutic target. 
(3) Comparison of complete genomes from pathogenic variants coupled with experimental information on complete proteomes allowed the identification and prioritization of ten potential diagnostic targets from Bacillus anthracis. The integrative analysis across data sets from multiple centers can reveal potential functional significance and hidden relationships between pathogen and host proteins, thereby providing a systems approach to basic understanding of pathogenicity and target identification.
doi:10.1371/journal.pone.0007162
PMCID: PMC2745575  PMID: 19779614
25.  Enzymes and Related Proteins as Cancer Biomarkers: a Proteomic Approach 
Background
The discovery of cancer biomarkers has become a major focus of cancer research. It holds a promising future for early detection, diagnosis, and monitoring of disease recurrence and therapeutic treatment efficacy, with the aim of improving long-term survival of cancer patients. Most of the functional information of the cancer-associated genes resides in the proteome. Since cancer is a complex disease, it might require a panel of multiple biomarkers in order to achieve sufficient clinical efficacy.
Methods
Serum/plasma is the most accessible biological specimen collected from patients. Therefore, serum proteomic diagnostics would be the most promising new test for cancer. With the advent of new and improved proteomic technologies, such as protein chips and mass spectrometry coupled with advanced bioinformatic tools, it is possible to develop potential cancer biomarkers. However, specimen collection, handling, study design and data analysis are essential components for successful biomarker discovery and validation. Multi-center case-control studies should be conducted with extensive clinical validation to minimize the impact of possible non-biological confounding variables.
Conclusions
Enzymes and related proteins, such as inhibitors, are promising candidates for cancer diagnostics.
doi:10.1016/j.cca.2007.02.017
PMCID: PMC4104743  PMID: 17382922
clinical proteomics; cancer biomarkers; mass spectrometry; prostate specific antigen
