Reverse phase protein arrays (RPPA) have been demonstrated to be a useful experimental platform for quantitative protein profiling in a high-throughput format. Target protein detection relies on the readout obtained from a single detection antibody. For this reason, antibody specificity is a key factor for RPPA. RNAi allows the specific knockdown of a target protein in complex samples and was therefore examined for its utility to assess antibody performance for RPPA applications.
To prove the feasibility of our strategy, two different anti-EGFR antibodies were compared by RPPA. Both detected the knockdown of EGFR, but to different extents. Western blot data were used to identify the most reliable antibody. The RNAi approach was also used to characterize commercial anti-STAT3 antibodies. Out of ten tested anti-STAT3 antibodies, four antibodies detected the STAT3-knockdown at 80-85%, and the most sensitive anti-STAT3 antibody was identified by comparing detection limits. Thus, the use of RNAi for RPPA antibody validation was demonstrated to be a stringent approach to identify highly specific and highly sensitive antibodies. Furthermore, the RNAi/RPPA strategy is also useful for the validation of isoform-specific antibodies as shown for the identification of AKT1/AKT2 and CCND1/CCND3-specific antibodies.
RNAi is a valuable tool for the identification of very specific and highly sensitive antibodies, and is therefore especially useful for the validation of RPPA-suitable detection antibodies. On the other hand, when a set of well-characterized RPPA-antibodies is available, large-scale RNAi experiments analyzed by RPPA might deliver useful information for network reconstruction.
Reverse phase protein arrays (RPPA) emerged as a useful experimental platform to analyze biological samples in a high-throughput format. Different signal detection methods have been described to generate a quantitative readout on RPPA including the use of fluorescently labeled antibodies. Increasing the sensitivity of RPPA approaches is important since many signaling proteins or posttranslational modifications are present at a low level.
A new antibody-mediated signal amplification (AMSA) strategy relying on sequential incubation steps with fluorescently labeled secondary antibodies reactive against each other is introduced here. The signal quantification is performed in the near-infrared (NIR) range. The RPPA-based analysis of 14 endogenous proteins in seven different cell lines demonstrated a strong correlation (r = 0.89) between AMSA and standard NIR detection. Probing serial dilutions of human cancer cell lines with different primary antibodies demonstrated that the new amplification approach improved the limit of detection, especially for low-abundance target proteins.
Antibody-mediated signal amplification is a convenient and cost-effective approach for the robust and specific quantification of low-abundance proteins on RPPAs. In contrast to other amplification approaches, it allows target protein detection over a large linear range.
The current study analyzed reverse phase protein arrays (RPPA) as a means to experimentally validate biomarkers in blood samples. One-µl samples of sera (n=71) and plasma (n=78) were serially diluted and printed on nitrocellulose-coated slides. CA19-9 levels measured by RPPA were compared with those measured by ELISA in the same patient samples. There was a strong correlation between RPPA and ELISA (r=0.87) as determined by scatter plots. Sample reproducibility of CA19-9 levels was excellent (interslide correlation r=0.88; intraslide correlation r=0.83). The ability of RPPA to accurately distinguish CA19-9 levels between cancer and non-cancer samples was determined using receiver operating characteristic curves and compared with ELISA. The AUCs for RPPA and ELISA were comparable (0.87 and 0.86, respectively). When the mean CA19-9 level of normal samples was used as a cutoff for RPPA and compared with the standard clinical ELISA cutoff, comparable specificities (71% for both) were observed. Notably, RPPA samples normalized to albumin showed increased sensitivity compared to ELISA (90% vs 75%). As RPPA is a high-throughput method that shows results comparable to those of ELISA, we propose that RPPA is a viable technique for rapid experimental screening and validation of candidate biomarkers in blood samples.
biomarker; CA19-9; pancreatic cancer; reverse phase protein array
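The cutoff-based comparison described in the abstract above can be illustrated with a minimal sketch. It assumes that values above the cutoff are called positive and that the cutoff is the mean of the normal samples, as stated for RPPA. The function names and the numbers are hypothetical and are not data from the study.

```python
def sensitivity_specificity(cancer, normal, cutoff):
    """Return (sensitivity, specificity) when values above `cutoff` are called positive."""
    true_pos = sum(1 for v in cancer if v > cutoff)
    true_neg = sum(1 for v in normal if v <= cutoff)
    return true_pos / len(cancer), true_neg / len(normal)


def auc(cancer, normal):
    """Area under the ROC curve, computed as the fraction of (cancer, normal)
    pairs in which the cancer value is higher; ties count one half."""
    wins = 0.0
    for c in cancer:
        for n in normal:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer) * len(normal))


# Made-up illustrative marker levels (arbitrary units), not study data.
normal = [1.0, 2.0, 3.0, 6.0]
cancer = [2.5, 5.0, 7.0, 9.0]
cutoff = sum(normal) / len(normal)  # mean of the normal samples

sens, spec = sensitivity_specificity(cancer, normal, cutoff)
print(f"cutoff={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
print(f"AUC={auc(cancer, normal):.4f}")
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for cohorts of the size reported here; a rank-based computation would be preferable for much larger data sets.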
Reverse phase protein arrays (RPPAs) are a powerful high-throughput tool for measuring protein concentrations in a large number of samples. In RPPA technology, the original samples are often diluted successively multiple times, forming dilution series to extend the dynamic range of the measurements and to increase confidence in quantitation. An RPPA experiment is equivalent to running multiple ELISA assays concurrently except that there is usually no known protein concentration from which one can construct a standard response curve. Here, we describe a new method called ‘serial dilution curve for RPPA data analysis’. Compared with the existing methods, the new method has the advantage of using fewer parameters and offering a simple way of visualizing the raw data. We showed how the method can be used to examine data quality and to obtain robust quantification of protein concentrations.
Availability: A computer program in R for using serial dilution curve for RPPA data analysis is freely available at http://odin.mdacc.tmc.edu/~zhangli/RPPA.
Reverse-phase protein arrays (RPPAs) have become an important tool for the sensitive and high-throughput detection of proteins from minute amounts of lysates from cell lines and cryopreserved tissue. The current standard method for tissue preservation in almost all hospitals worldwide is formalin fixation and paraffin embedding, and it would be highly desirable if RPPA could also be applied to formalin-fixed and paraffin embedded (FFPE) tissue. We investigated whether the analysis of FFPE tissue lysates with RPPA would result in biologically meaningful data in two independent studies. In the first study on breast cancer samples, we assessed whether a human epidermal growth factor receptor (HER) 2 score based on immunohistochemistry (IHC) could be reproduced with RPPA. The results showed very good concordance between the IHC and RPPA classifications of HER2 expression. In the second study, we profiled FFPE tumor specimens from patients with adenocarcinoma and squamous cell carcinoma in order to find new markers for differentiating these two subtypes of non-small cell lung cancer. p21-activated kinase 2 could be identified as a new differentiation marker for squamous cell carcinoma. Overall, the results demonstrate the technical feasibility and the merits of RPPA for protein expression profiling in FFPE tissue lysates.
The ability to predict the developmental and implantation ability of embryos remains a major goal in human assisted-reproductive technology (ART) and most ART laboratories use morphological criteria to evaluate the oocyte competence despite the poor predictive value of this analysis. Transcriptomic and proteomic approaches on somatic cells surrounding the oocyte (granulosa cells, cumulus cells [CCs]) have been proposed for the identification of biomarkers of oocyte competence. We propose to use a Reverse Phase Protein Array (RPPA) approach to investigate new potential biomarkers of oocyte competence in human CCs at the protein level, an approach that is already used in cancer research to identify biomarkers in clinical diagnostics.
Antibodies targeting proteins of interest were validated for their utilisation in RPPA by measuring siRNA-mediated knockdown efficiency in HEK293 cells in parallel with Western blotting (WB) and RPPA from the same lysates. The proteins of interest were measured by RPPA across 13 individual human CCs from four patients undergoing an intracytoplasmic sperm injection procedure.
The knockdown efficiencies of VCL, RGS2 and SRC were measured in HEK293 cells by WB and by RPPA and were acceptable for the VCL and SRC proteins. The antibodies targeting these proteins were used for their detection in human CCs by RPPA. The detection of the proteins VCL, SRC and ERK2 (the latter using an antibody already validated for RPPA) was then carried out on individual CCs, and signals were detected for each individual sample. After normalisation by VCL, we showed that the level of expression of ERK2 was almost the same across the 13 individual CCs, while the level of expression of SRC differed between the 13 individual CCs of the four patients and between the CCs from one individual patient.
The exquisite sensitivity of RPPA allowed detection of specific proteins in individual CCs. Although the validation of antibodies for RPPA is labour intensive, RPPA is a sensitive and quantitative technique allowing the detection of specific proteins from very small quantities of biological samples. RPPA may be of great interest in clinical diagnostics to predict oocyte competence prior to transfer of the embryo using robust protein biomarkers expressed by CCs.
Biomarkers; Cumulus cells; Oocyte developmental competence; Reverse phase protein array
Reverse phase protein array (RPPA) is a powerful dot-blot technology that allows studying protein expression levels as well as post-translational modifications in a large number of samples simultaneously. Yet, correct interpretation of RPPA data has remained a major challenge for its broad-scale application and its translation into clinical research. Satisfactory quantification tools are available to assess a relative protein expression level from a serial dilution curve. However, appropriate tools allowing the normalization of the data for external sources of variation are currently missing.
Here we propose a new method, called NormaCurve, that allows simultaneous quantification and normalization of RPPA data. For this, we modified the quantification method SuperCurve in order to include normalization for (i) background fluorescence, (ii) variation in the total amount of spotted protein and (iii) spatial bias on the arrays. Using a spike-in design with a purified protein, we test the capacity of different models to properly estimate normalized relative expression levels. The best performing model, NormaCurve, takes into account a negative control array without primary antibody, an array stained with a total protein stain and spatial covariates. We show that this normalization is reproducible and we discuss the number of serial dilutions and the number of replicates that are required to obtain robust data. We thus provide a ready-to-use method for reliable and reproducible normalization of RPPA data, which should facilitate the interpretation and the development of this promising technology.
The raw data, the scripts and the NormaCurve package are available at the following web site: http://microarrays.curie.fr.
The goal of personalized medicine is to provide patients optimal drug screening and treatment based on individual genomic or proteomic profiles. Reverse-Phase Protein Array (RPPA) technology offers proteomic information of cancer patients which may be directly related to drug sensitivity. For cancer patients with different drug sensitivity, the proteomic profiling reveals important pathophysiologic information which can be used to predict chemotherapy responses.
The goal of this paper is to present a framework for personalized medicine using both RPPA and drug sensitivity (drug resistance or intolerance). In the proposed personalized medicine system, the prediction of drug sensitivity is obtained by a proposed augmented naive Bayesian classifier (ANBC) whose edges between attributes are augmented in the network structure of naive Bayesian classifier. For discriminative structure learning of ANBC, local classification rate (LCR) is used to score augmented edges, and greedy search algorithm is used to find the discriminative structure that maximizes classification rate (CR). Once a classifier is trained by RPPA and drug sensitivity using cancer patient samples, the classifier is able to predict the drug sensitivity given RPPA information from a patient.
In this paper we proposed a framework for personalized medicine in which a patient is profiled by RPPA and drug sensitivity is predicted by ANBC and LCR. Experimental results with lung cancer data demonstrate that RPPA can be used to profile patients for drug sensitivity prediction by a Bayesian network classifier, and that the proposed ANBC for personalized cancer medicine achieves, on average, better prediction accuracy than the naive Bayes classifier on small-sample data and outperforms other state-of-the-art classifiers in terms of classification accuracy.
Aberrations in oncogenes and tumor suppressors frequently affect the activity of critical signal transduction pathways. To analyze systematically the relationship between the activation status of protein networks and other characteristics of cancer cells, we performed reverse phase protein array (RPPA) profiling of the NCI60 cell lines for total protein expression and activation-specific markers of critical signaling pathways. To extend the scope of the study, we merged those data with previously published RPPA results for the NCI60. Integrative analysis of the expanded RPPA data set revealed 5 major clusters of cell lines and 5 principal proteomic signatures. Comparison of mutations in the NCI60 cell lines with patterns of protein expression demonstrated significant associations for PTEN, PIK3CA, BRAF and APC mutations with proteomic clusters. PIK3CA and PTEN mutation enrichment were not cell lineage-specific but were associated with dominant yet distinct groups of proteins. The five RPPA-defined clusters were strongly associated with sensitivity to standard anti-cancer agents. RPPA analysis identified 27 protein features significantly associated with sensitivity to paclitaxel. The functional status of those proteins was interrogated in a paclitaxel whole genome siRNA library synthetic lethality screen, and confirmed the predicted associations with drug sensitivity. These studies expand our understanding of the activation status of protein networks in the NCI60 cancer cell lines, demonstrate the importance of the direct study of protein expression and activation, and provide a basis for further studies integrating the information with other molecular and pharmacological characteristics of cancer.
NCI60; reverse phase protein arrays; signal transduction
Reverse phase protein arrays (RPPA) are an efficient, high-throughput, cost-effective method for the quantification of specific proteins in complex biological samples. The quality of RPPA data may be affected by various sources of error. One of these, spatial variation, is caused by uneven exposure of different parts of an RPPA slide to the reagents used in protein detection. We present a method for the determination and correction of systematic spatial variation in RPPA slides using positive control spots printed on each slide. The method uses a simple bi-linear interpolation technique to obtain a surface representing the spatial variation occurring across the dimensions of a slide. This surface is used to calculate correction factors that can normalize the relative protein concentrations of the samples on each slide. The adoption of the method results in increased agreement between technical and biological replicates of various tumor and cell-line derived samples. Further, in data from a study of the melanoma cell line SKMEL-133, several slides that had previously been rejected because they had a coefficient of variation (CV) greater than 15% are rescued by reduction of the CV below this threshold in each case. The method is implemented in the R statistical programming language. It is compatible with MicroVigene and SuperCurve, packages commonly used in RPPA data analysis. The method is made available, along with suggestions for implementation, at http://bitbucket.org/rppa_preprocess/rppa_preprocess/src.
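The idea of a bilinearly interpolated correction surface can be sketched as follows. This is a simplified illustration, not the published R implementation. It assumes one rectangular cell with a positive control spot at each corner, and it assumes that the correction factor is the mean control intensity divided by the interpolated surface value. All names and numbers are hypothetical.

```python
def bilinear(x, y, x0, x1, y0, y1, q00, q10, q01, q11):
    """Bilinearly interpolate at (x, y) inside the rectangle [x0, x1] x [y0, y1].

    q00, q10, q01 and q11 are the control intensities at the corners
    (x0, y0), (x1, y0), (x0, y1) and (x1, y1), respectively.
    """
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    bottom = q00 * (1 - tx) + q10 * tx  # interpolate along x at y0
    top = q01 * (1 - tx) + q11 * tx     # interpolate along x at y1
    return bottom * (1 - ty) + top * ty  # interpolate along y


def correct_spot(raw, x, y, cell):
    """Scale a raw spot intensity by (mean control) / (surface value at the spot).

    The form of the correction factor is an assumption made for illustration.
    """
    surface = bilinear(x, y, **cell)
    mean_ctrl = (cell["q00"] + cell["q10"] + cell["q01"] + cell["q11"]) / 4
    return raw * mean_ctrl / surface


# Hypothetical cell in which the right-hand side of the slide stains 20% brighter.
cell = dict(x0=0.0, x1=10.0, y0=0.0, y1=10.0,
            q00=100.0, q10=120.0, q01=100.0, q11=120.0)

left = correct_spot(50.0, 0.0, 0.0, cell)    # spot on the dimmer side is scaled up
right = correct_spot(60.0, 10.0, 0.0, cell)  # spot on the brighter side is scaled down
print(left, right)
```

In this toy example, two spots that differ only because of their position end up with the same corrected intensity. A full slide would be tiled into many such cells, one per neighbouring group of control spots.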
Motivation: Reverse phase protein arrays (RPPA) measure the relative expression levels of a protein in many samples simultaneously. A set of identically spotted arrays can be used to measure the levels of more than one protein. Protein expression within each sample on an array is estimated by borrowing strength across all the samples, but using only within array information. When comparing across slides, it is essential to account for sample loading, the total amount of protein printed per sample. Currently, total protein is estimated using either a housekeeping protein or the sample median across all slides. When the variability in sample loading is large, these methods are suboptimal because they do not account for the fact that the protein expression for each slide is estimated separately.
Results: We propose a new normalization method for RPPA data, called variable slope (VS) normalization, that takes into account that quantification of RPPA slides is performed separately. This method is better able to remove loading bias and recover true correlation structures between proteins.
Availability: Code to implement the method in the statistical package R and anonymized data are available at http://bioinformatics.mdanderson.org/supplements.html.
Supplementary data are available at Bioinformatics online.
Protein extraction from formalin-fixed paraffin-embedded (FFPE) tissues is challenging due to the extensive molecular crosslinking that occurs upon formalin fixation. Reverse-phase protein array (RPPA) is a high-throughput technology that can detect changes in protein levels and protein functionality in numerous tissue and cell sources. It has been used to evaluate protein expression mainly in frozen preparations or in FFPE-based studies of limited scope. The reproducibility and reliability of the technique in FFPE samples have not yet been demonstrated extensively. We developed and optimized an efficient and reproducible procedure for extraction of proteins from FFPE cells and xenografts, and then applied the method to FFPE patient tissues and evaluated its performance on RPPA.
Fresh frozen and FFPE preparations from cell lines, xenografts and breast cancer and renal tissues were included in the study. Serial FFPE cell or xenograft sections were deparaffinized and extracted by six different protein extraction protocols. The yield and level of protein degradation were evaluated by SDS-PAGE and Western Blots. The most efficient protocol was used to prepare protein lysates from breast cancer and renal tissues, which were subsequently subjected to RPPA. Reproducibility was evaluated and Spearman correlation was calculated between matching fresh frozen and FFPE samples.
The most effective approach from six protein extraction protocols tested enabled efficient extraction of immunoreactive protein from cell line, breast cancer and renal tissue sample sets. 85% of the total of 169 markers tested on RPPA demonstrated significant correlation between FFPE and frozen preparations (p < 0.05) in at least one cell or tissue type, with only 23 markers common in all three sample sets. In addition, FFPE preparations yielded biologically meaningful observations related to pathway signaling status in cell lines, and classification of renal tissues.
With optimized protein extraction methods, FFPE tissues can be a valuable source in generating reproducible and biologically relevant proteomic profiles using RPPA, with specific marker performance varying according to tissue type.
Formalin-fixed; Paraffin-embedded tissue; Protein extraction; Reverse phase protein array; Breast cancer; Renal cancer
Loading control (LC) and variance stabilization of reverse-phase protein array (RPPA) data have been challenging, mainly due to the small number of proteins in an experiment and the lack of reliable inherent control markers. In this study, we compare eight different normalization methods for LC and variance stabilization. The invariant marker set concept was first applied to the normalization of high-throughput gene expression data. A set of “invariant” markers is selected to create a virtual reference sample. Then all the samples are normalized to the virtual reference. We propose a variant of this method in the context of RPPA data normalization and compare it with seven other normalization methods previously reported in the literature. The invariant marker set method performs well with respect to LC, variance stabilization and association with the immunohistochemistry/fluorescence in situ hybridization data for three key markers in breast tumor samples, while the other methods have inferior performance. The proposed method is a promising approach for improving the quality of RPPA data.
reverse-phase protein array; RPPA; normalization; proteomics
Reporting and sharing experimental metadata, such as the experimental design, characteristics of the samples, and procedures applied, along with the analysis results, in a standardised manner ensures that datasets are comprehensible and, in principle, reproducible, comparable and reusable. Furthermore, sharing datasets in formats designed for consumption by humans and machines will also maximize their use. The Investigation/Study/Assay (ISA) open source metadata tracking framework facilitates standards-compliant collection, curation, visualization, storage and sharing of datasets, leveraging other platforms to enable analysis and publication. The ISA software suite includes several components used in an increasingly diverse set of life science and biomedical domains; it is underpinned by a general-purpose format, ISA-Tab, and conversions exist into formats required by public repositories. While ISA-Tab works well mainly as a human-readable format, we have also implemented a linked data approach to semantically define the ISA-Tab syntax.
We present a semantic web representation of the ISA-Tab syntax that complements ISA-Tab's syntactic interoperability with semantic interoperability. We introduce the linkedISA conversion tool from ISA-Tab to the Resource Description Framework (RDF), supporting mappings from the ISA syntax to multiple community-defined, open ontologies and capitalising on user-provided ontology annotations in the experimental metadata. We describe insights from the implementation and how annotations can be expanded driven by the metadata. We applied the conversion tool as part of Bio-GraphIIn, a web-based application supporting integration of the semantically rich experimental descriptions. Designed in a user-friendly manner, the Bio-GraphIIn interface hides most of the complexities from the users, exposing a familiar tabular view of the experimental description to allow seamless interaction with the RDF representation, and visualising descriptors to drive the query over the semantic representation of the experimental design. In addition, we defined queries over the linkedISA RDF representation and demonstrated its use over the linkedISA conversion of datasets from Nature's Scientific Data online publication.
Our linked data approach has allowed us to: 1) make the ISA-Tab semantics explicit and machine-processable, 2) exploit the existing ontology-based annotations in the ISA-Tab experimental descriptions, 3) augment the ISA-Tab syntax with new descriptive elements, 4) visualise and query elements related to the experimental design. Reasoning over ISA-Tab metadata and associated data will facilitate data integration and knowledge discovery.
The lack of large panels of validated antibodies, tissue handling variability, and intratumoral heterogeneity potentially hamper comprehensive study of the functional proteome in non-microdissected solid tumors. The purpose of this study was to address these concerns and to demonstrate clinical utility for the functional analysis of proteins in non-microdissected breast tumors using reverse phase protein arrays (RPPA).
Herein, 82 antibodies that recognize kinase and steroid signaling proteins and effectors were validated for RPPA. Intraslide and interslide coefficients of variability were <15%. Multiple sites in non-microdissected breast tumors were analyzed using RPPA after intervals of up to 24 h on the benchtop at room temperature following surgical resection.
Twenty-one of 82 total and phosphoproteins demonstrated time-dependent instability at room temperature with most variability occurring at later time points between 6 and 24 h. However, the 82-protein functional proteomic “fingerprint” was robust in most tumors even when maintained at room temperature for 24 h before freezing. In repeat samples from each tumor, intratumoral protein levels were markedly less variable than intertumoral levels. Indeed, an independent analysis of prognostic biomarkers in tissue from multiple tumor sites accurately and reproducibly predicted patient outcomes. Significant correlations were observed between RPPA and immunohistochemistry. However, RPPA demonstrated a superior dynamic range. Classification of 128 breast cancers using RPPA identified six subgroups with markedly different patient outcomes that demonstrated a significant correlation with breast cancer subtypes identified by transcriptional profiling.
Thus, the robustness of RPPA and stability of the functional proteomic “fingerprint” facilitate the study of the functional proteome in non-microdissected breast tumors.
Functional proteome; RPPA; Breast cancer; Kinase signaling; Steroid signaling
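The coefficient of variability used as the quality criterion above (intraslide and interslide values below 15%) can be computed as in the following minimal sketch. It assumes the usual definition, the sample standard deviation divided by the mean, expressed as a percentage. The replicate values are made up for illustration.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: sample standard deviation / mean * 100."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical replicate intensities for one antibody on one slide.
replicates = [95.0, 100.0, 105.0]
cv = cv_percent(replicates)
print(f"CV = {cv:.1f}% -> {'pass' if cv < 15.0 else 'fail'}")
```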
Vascular endothelial growth factor (VEGF) is a critical pro-angiogenic factor, found in a number of cancers, and a target of therapy. It is typically assessed by immunohistochemistry (IHC) in clinical research. However, IHC is not a quantitative assay and is rarely reproducible. We compared VEGF levels in colon cancer by IHC and a quantitative immunoassay on proteins isolated from formalin fixed, paraffin embedded tissues.
VEGF expression was studied by means of a well-based reverse phase protein array (RPPA) and immunohistochemistry in 69 colon cancer cases, and compared with various clinicopathologic factors. Protein lysates derived from formalin fixed, paraffin embedded tissue contained measurable immunoreactive VEGF molecules. The VEGF expression level of well differentiated colon cancer was significantly higher than that of moderately and poorly differentiated carcinomas by immunohistochemistry (P = 0.04) and well-based RPPA (P = 0.04). VEGF quantification by well-based RPPA also demonstrated an association with nodal metastasis status (P = 0.05). In addition, the normalized VEGF value by well-based RPPA correlated with the IHC result (r = 0.283, P = 0.018). Furthermore, subgroup analysis by histologic type revealed that adenocarcinoma cases showed significant correlation (r = 0.315, P = 0.031) between well-based RPPA and IHC.
The well-based RPPA method is a high-throughput and sensitive approach and an excellent tool for quantification of marker proteins. Notably, this method may be helpful for more objective evaluation of protein expression in cancer patients.
Vascular endothelial growth factor; Formalin-fixed paraffin-embedded; Colon cancer; Immunohistochemistry; Reverse-phase protein array
Motivation: Reverse-phase protein arrays (RPPAs) allow sensitive quantification of relative protein abundance in thousands of samples in parallel. Typical challenges involved in this technology are antibody selection, sample preparation and optimization of staining conditions. The issue of combining effective sample management and data analysis, however, has been widely neglected.
Results: This motivated us to develop MIRACLE, a comprehensive and user-friendly web application bridging the gap between spotting and array analysis by conveniently keeping track of sample information. Data processing includes correction of staining bias, estimation of protein concentration from response curves, normalization for total protein amount per sample and statistical evaluation. Established analysis methods have been integrated with MIRACLE, offering experimental scientists an end-to-end solution for sample management and for carrying out data analysis. In addition, experienced users have the possibility to export data to R for more complex analyses. MIRACLE thus has the potential to further spread utilization of RPPAs as an emerging technology for high-throughput protein analysis.
Availability: Project URL: http://www.nanocan.org/miracle/
Supplementary data are available at Bioinformatics online.
Recent advancements in technology and methodology have led to growing amounts of increasingly complex neuroscience data recorded from various species, modalities, and levels of study. The rapid data growth has made efficient data access and flexible, machine-readable data annotation a crucial requisite for neuroscientists. Clear and consistent annotation and organization of data is not only an important ingredient for reproducibility of results and re-use of data, but also essential for collaborative research and data sharing. In particular, efficient data management and interoperability requires a unified approach that integrates data and metadata and provides a common way of accessing this information. In this paper we describe GNData, a data management platform for neurophysiological data. GNData provides a storage system based on a data representation that is suitable to organize data and metadata from any electrophysiological experiment, with a functionality exposed via a common application programming interface (API). Data representation and API structure are compatible with existing approaches for data and metadata representation in neurophysiology. The API implementation is based on the Representational State Transfer (REST) pattern, which enables data access integration in software applications and facilitates the development of tools that communicate with the service. Client libraries that interact with the API provide direct data access from computing environments like Matlab or Python, enabling integration of data management into the scientist's experimental or analysis routines.
electrophysiology; data management; neuroinformatics; web service; collaboration; neo; odml
Using a new type of array technology, the reverse phase protein array (RPPA), we measure time-course protein expression for a set of selected markers that are known to co-regulate biological functions in a pathway structure. To accommodate the complex dependent nature of the data, including temporal correlation and pathway dependence for the protein markers, we propose a mixed effects model with temporal and protein-specific components. We develop a sequence of random probability measures (RPM) to account for the dependence in time of the protein expression measurements. Marginally, for each RPM we assume a Dirichlet process (DP) model. The dependence is introduced by defining multivariate beta distributions for the unnormalized weights of the stick breaking representation. We also acknowledge the pathway dependence among proteins via a conditionally autoregressive (CAR) model. Applying our model to the RPPA data, we reveal a pathway-dependent functional profile for the set of proteins as well as marginal expression profiles over time for individual markers.
Bayesian nonparametrics; dependent random measures; Markov beta process; mixed effects model; stick breaking processes; time series analysis
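The stick-breaking representation mentioned above can be illustrated for a single Dirichlet process. This is a minimal sketch of the standard construction only: it draws independent Beta(1, alpha) stick fractions and truncates after a fixed number of atoms. It does not implement the multivariate beta dependence across time or the CAR component proposed in the abstract. The function name, the truncation level and the value of alpha are assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for one Dirichlet process.

    Each fraction v_k ~ Beta(1, alpha) breaks off a share of the remaining
    stick: w_k = v_k * prod_{j<k} (1 - v_j). The last atom receives whatever
    is left so that the truncated weights sum to one.
    """
    weights = []
    remaining = 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = stick_breaking_weights(alpha=2.0, n_atoms=10, rng=rng)
print([round(w, 3) for w in weights])
```

Smaller values of alpha concentrate the mass on the first few atoms, whereas larger values spread it over many atoms.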
maxdLoad2 is a relational database schema and Java® application for microarray experimental annotation and storage. It is compliant with all standards for microarray meta-data capture, including the specification of what data should be recorded, extensive use of standard ontologies and support for data exchange formats. The output from maxdLoad2 is of a form acceptable for submission to the ArrayExpress microarray repository at the European Bioinformatics Institute. maxdBrowse is a PHP web-application that makes the contents of maxdLoad2 databases accessible via web-browser, the command-line and web-service environments. It thus acts as both a dissemination and data-mining tool.
maxdLoad2 presents an easy-to-use interface to an underlying relational database and provides a full complement of facilities for browsing, searching and editing. There is a tree-based visualization of data connectivity and the ability to explore the links between any pair of data elements, irrespective of how many intermediate links lie between them. Its principal novel features are:
• the flexibility of the meta-data that can be captured,
• the tools provided for importing data from spreadsheets and other tabular representations,
• the tools provided for the automatic creation of structured documents,
• the ability to browse and access the data via web and web-services interfaces.
Within maxdLoad2 it is very straightforward to customise the meta-data that is being captured or change the definitions of the meta-data. These meta-data definitions are stored within the database itself allowing client software to connect properly to a modified database without having to be specially configured. The meta-data definitions (configuration file) can also be centralized allowing changes made in response to revisions of standards or terminologies to be propagated to clients without user intervention.
maxdBrowse is hosted on a web-server and presents multiple interfaces to the contents of maxd databases. maxdBrowse emulates many of the browse and search features available in the maxdLoad2 application via a web-browser. This allows users who are not familiar with maxdLoad2 to browse and export microarray data from the database for their own analysis. The same browse and search features are also available via command-line and SOAP server interfaces. This enables scripted data export for use within data repositories and analysis environments, and allows access to the maxd databases via web-service architectures.
maxdLoad2 and maxdBrowse are portable and compatible with all common operating systems and major database servers. They provide a powerful, flexible package for annotation of microarray experiments and a convenient dissemination environment. They are available for download and open sourced under the Artistic License.
Measuring the states of cell signaling pathways in tumor samples promises to advance understanding of oncogenesis and identify response biomarkers. Here, we describe the use of Reverse Phase Protein Arrays (RPPAs or RPLAs) to profile signaling proteins in 56 breast cancers and matched normal tissue. In RPPAs, hundreds to thousands of lysates are arrayed in dense regular grids and each grid is probed with a different antibody (100 in the current work, of which 71 yielded strong signals with breast tissue). Although RPPA technology is quite widely used, measuring changes in phosphorylation reflective of protein activation remains challenging. Using repeat deposition and well-validated antibodies we show that diverse patterns of phosphorylation can be monitored in tumor samples and changes mapped onto signaling networks in a coherent fashion. The patterns are consistent with biomarker-based classification of breast cancers and known mechanisms of oncogenesis. We explore in detail one tumor-associated pattern that involves changes in the abundance of the Axl receptor tyrosine kinase (RTK) and phosphorylation of the cMet RTK. Both cMet and Axl have been implicated in breast cancer, or in resistance to anti-cancer drugs, but the two RTKs are not known to be linked functionally. Protein depletion and over-expression studies in a “triple-negative” breast cell line reveal crosstalk between Axl and cMet involving Axl-mediated modification of cMet, a requirement for cMet in efficient and timely signal transduction by the Axl ligand Gas6 and the potential for the two receptors to interact physically. These findings have potential therapeutic implications since they imply that bi-specific receptor inhibitors (e.g. ATP-competitive small kinase inhibitors such as GSK1363089, BMS-777607 or MP470) may be more efficacious than the monospecific therapeutic antibodies currently in development (e.g. MetMAb).
reverse phase protein arrays; breast cancer; tumor lysate; cell signaling; MET/AXL
The identification of key pathways dysregulated in non-small cell lung cancer (NSCLC) is an important step toward understanding lung pathogenesis and developing new therapeutic approaches.
Toward this goal, reverse-phase protein lysate arrays (RPPA) were used to compare signaling pathways between NSCLC tumors and paired normal lung tissue from 46 patients and assess their association with clinical outcome.
After RPPA quantification of 63 proteins and phosphoproteins, tissue pairs were randomized to a training set (n = 25 pairs) and test set (n = 21 pairs). In the training set, 15 protein markers were differentially expressed between tumors and normal lung (p ≤ 0.01), including markers in the PI3K/AKT and p38 MAPK signaling pathways (e.g., p70S6K, S6, p38, and phospho-p38), as well as caveolin-1 and β-catenin. A four-protein signature (p70S6K, cyclin B1, pSrc(Y527), and caveolin-1) independent of histology classified specimens as tumor versus normal with a predicted accuracy of 83%, sensitivity of 67%, and specificity of 100%. The signature was validated in the test set, correctly classifying all normal tissues and 14 of 21 tumor tissues. RPPA results were confirmed by immunohistochemistry for caveolin-1 and p70S6K. In tumors from patients with resected NSCLC, expression of proteins in the energy-sensing AMPK pathway (pLKB1, AMPK, p-Acetyl-CoA, pTSC2), adhesion, EGFR, and Rb signaling pathways was inversely associated with NSCLC recurrence.
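The test-set performance reported above can be restated as sensitivity, specificity and accuracy. The short sketch below does this arithmetic from the counts given in the text (21 test pairs, all 21 normal tissues and 14 of 21 tumors classified correctly). The function name is an illustrative choice, and treating tumor as the positive class is an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # fraction of tumors called tumor
    specificity = tn / (tn + fp)                # fraction of normals called normal
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all calls that are correct
    return sensitivity, specificity, accuracy


# Test-set counts from the text: 14 of 21 tumors and all 21 normals correct.
sens, spec, acc = classification_metrics(tp=14, fn=7, tn=21, fp=0)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
# prints: sensitivity=66.7% specificity=100.0% accuracy=83.3%
```

After rounding, these test-set values coincide with the accuracy (83%), sensitivity (67%) and specificity (100%) predicted from the training set.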
These data provide evidence for dysregulation of several pathways including those involving energy sensing and adhesion that are potentially associated with NSCLC pathogenesis and disease recurrence.
NSCLC; Proteomics; Recurrence; AMPK; Adhesion
Peroxiredoxin-1 (PRDX1) is a multifunctional protein, acting as a hydrogen peroxide (H2O2) scavenger, molecular chaperone and immune modulator. Although differential PRDX1 expression has been described in many tumors, the potential role of PRDX1 in breast cancer remains highly ambiguous. Using a comprehensive antibody-based proteomics approach, we interrogated PRDX1 protein as a putative biomarker in estrogen receptor (ER)-positive breast cancer.
An anti-PRDX1 antibody was validated in breast cancer cell lines using immunoblotting, immunohistochemistry and reverse phase protein array (RPPA) technology. PRDX1 protein expression was evaluated in two independent breast cancer cohorts, represented on a screening RPPA (n = 712) and a validation tissue microarray (n = 498). In vitro assays were performed exploring the functional contribution of PRDX1, with oxidative stress conditions mimicked via treatment with H2O2, peroxynitrite, or adenanthin, a PRDX1/2 inhibitor.
In ER-positive cases, high PRDX1 protein expression is a biomarker of improved prognosis across both cohorts. In the validation cohort, high PRDX1 expression was an independent predictor of improved relapse-free survival (hazard ratio (HR) = 0.62, 95% confidence interval (CI) = 0.40 to 0.96, P = 0.032), breast cancer-specific survival (HR = 0.44, 95% CI = 0.24 to 0.79, P = 0.006) and overall survival (HR = 0.61, 95% CI = 0.44 to 0.85, P = 0.004). RPPA screening of cancer signaling proteins showed that ERα protein was upregulated in PRDX1 high tumors. Exogenous H2O2 treatment decreased ERα protein levels in ER-positive cells. PRDX1 knockdown further sensitized cells to H2O2- and peroxynitrite-mediated effects, whilst PRDX1 overexpression protected against this response. Inhibition of PRDX1/2 antioxidant activity with adenanthin dramatically reduced ERα levels in breast cancer cells.
PRDX1 is shown to be an independent predictor of improved outcomes in ER-positive breast cancer. Through its antioxidant function, PRDX1 may prevent oxidative stress-mediated ERα loss, thereby potentially contributing to maintenance of an ER-positive phenotype in mammary tumors. These results for the first time imply a close connection between biological activity of PRDX1 and regulation of estrogen-mediated signaling in breast cancer.
Modern gene perturbation techniques, like RNA interference (RNAi), enable us to study effects of targeted interventions in cells efficiently. In combination with mRNA or protein expression data, this allows us to gain insights into the behavior of complex biological systems.
In this paper, we propose Deterministic Effects Propagation Networks (DEPNs) as a special Bayesian Network approach to reverse engineer signaling networks from a combination of protein expression and perturbation data. DEPNs allow the reconstruction of protein networks based on combinatorial intervention effects, which are monitored via changes of the protein expression or activation over one or a few time points. Our implementation of DEPNs allows for latent network nodes (i.e. proteins without measurements) and has a built-in mechanism to impute missing data. The robustness of our approach was tested on simulated data. We applied DEPNs to reconstruct the ERBB signaling network in de novo trastuzumab-resistant human breast cancer cells, where protein expression was monitored on Reverse Phase Protein Arrays (RPPAs) after knockdown of network proteins using RNAi.
DEPNs offer a robust, efficient and simple approach to infer protein signaling networks from multiple interventions. The method as well as the data have been made part of the latest version of the R package "nem" available as a supplement to this paper and via the Bioconductor repository.
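One plausible reading of "deterministic effects propagation" is that a knockdown affects the targeted protein and, deterministically, every protein downstream of it in the network. The sketch below illustrates only that propagation step on a toy directed graph. It is an assumption made for illustration, not the implementation in the "nem" package: it involves no Bayesian scoring, latent nodes or imputation, and the node names and function name are hypothetical.

```python
from collections import deque


def propagate_knockdown(edges, knocked_down):
    """Return the set of nodes expected to be affected by a set of knockdowns.

    `edges` is a list of (parent, child) pairs. A node counts as affected if it
    is targeted directly or is reachable from a targeted node.
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    affected = set(knocked_down)
    queue = deque(knocked_down)
    while queue:  # breadth-first traversal of downstream nodes
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


# Hypothetical toy network, not taken from the paper.
edges = [("A", "B"), ("B", "C"), ("D", "C")]
affected = propagate_knockdown(edges, ["B"])
print(sorted(affected))
```

Comparing such predicted effect sets with the observed expression changes for each single or combinatorial knockdown is the kind of signal that a network-inference method can score candidate topologies against.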
Contemporary informatics and genomics research require efficient, flexible and robust management of large heterogeneous data, advanced computational tools, powerful visualization, reliable hardware infrastructure, interoperability of computational resources, and detailed data and analysis-protocol provenance. The Pipeline is a client-server distributed computational environment that facilitates the visual graphical construction, execution, monitoring, validation and dissemination of advanced data analysis protocols.
This paper reports on the applications of the LONI Pipeline environment to address two informatics challenges: graphical management of diverse genomics tools, and the interoperability of informatics software. Specifically, this manuscript presents the concrete details of deploying general informatics suites and individual software tools to new hardware infrastructures, the design, validation and execution of new visual analysis protocols via the Pipeline graphical interface, and integration of diverse informatics tools via the Pipeline eXtensible Markup Language syntax. We demonstrate each of these processes using several established informatics packages (e.g., miBLAST, EMBOSS, mrFAST, GWASS, MAQ, SAMtools, Bowtie) for basic local sequence alignment and search, molecular biology data analysis, and genome-wide association studies. These examples demonstrate the power of the Pipeline graphical workflow environment to enable integration of bioinformatics resources which provide a well-defined syntax for dynamic specification of the input/output parameters and the run-time execution controls.
The LONI Pipeline environment (http://pipeline.loni.ucla.edu) provides a flexible graphical infrastructure for efficient biomedical computing and distributed informatics research. The interactive Pipeline resource manager enables the utilization and interoperability of diverse types of informatics resources. The Pipeline client-server model provides computational power to a broad spectrum of informatics investigators: experienced developers and novice users, users with or without access to advanced computational resources (e.g., Grid, data), as well as basic and translational scientists. The open development, validation and dissemination of computational networks (pipeline workflows) facilitate the sharing of knowledge, tools, protocols and best practices, and enable the unbiased validation and replication of scientific findings by the entire community.