The current gold standard for diagnosis of hepatic fibrosis and cirrhosis is the traditional invasive liver biopsy. It is desirable to assess hepatic fibrosis with noninvasive means. Targeted proteomic techniques allow an unbiased assessment of proteins and might be useful to identify proteins related to hepatic fibrosis. We utilized Selected Reaction Monitoring (SRM) targeted proteomics combined with an organ-specific blood protein strategy to identify and quantify 38 liver-specific proteins. A combination of protein C and retinol binding protein 4 in serum gave promising preliminary results as candidate biomarkers to distinguish patients at different stages of hepatic fibrosis due to chronic infection with hepatitis C virus (HCV). Also, alpha-1-B glycoprotein, complement factor H and insulin-like growth factor binding protein acid labile subunit performed well in distinguishing patients from healthy controls.
hepatitis C; fibrosis; liver-specific blood biomarkers; quantitation; selected reaction monitoring
We utilized abundant transcriptomic data for the primary classes of brain cancers to study the feasibility of separating all of these diseases simultaneously based on molecular data alone. These signatures were based on a new method reported herein – Identification of Structured Signatures and Classifiers (ISSAC) – that resulted in a brain cancer marker panel of 44 unique genes. Many of these genes have established relevance to the brain cancers examined herein, with others having known roles in cancer biology. Analyses on large-scale data from multiple sources must deal with significant challenges associated with heterogeneity between different published studies: as is typical, the variation among individual studies often had a larger effect on the transcriptome than did phenotype differences. For this reason, we restricted ourselves to studying only cases where we had at least two independent studies performed for each phenotype, and also reprocessed all the raw data from the studies using a unified pre-processing pipeline. We found that learning signatures across multiple datasets greatly enhanced reproducibility and accuracy in predictive performance on truly independent validation sets, even when keeping the size of the training set the same. This was most likely due to the meta-signature encompassing more of the heterogeneity across different sources and conditions, while amplifying signal from the repeated global characteristics of the phenotype. When molecular signatures of brain cancers were constructed from all currently available microarray data, we obtained 90% phenotype prediction accuracy, defined as the accuracy of identifying a particular brain cancer from the background of all phenotypes. Looking forward, we discuss our approach in the context of the eventual development of organ-specific molecular signatures from peripheral fluids such as the blood.
From a multi-study, integrated transcriptomic dataset, we identified a marker panel for differentiating major human brain cancers at the gene-expression level. The ISSAC molecular signatures for brain cancers, composed of 44 unique genes, are based on comparing expression levels of pairs of genes, and phenotype prediction follows a diagnostic hierarchy. We found that sufficient dataset integration across multiple studies greatly enhanced diagnostic performance on truly independent validation sets, whereas signatures learned from only one dataset typically led to high error rate. Molecular signatures of brain cancers, when obtained using all currently available gene-expression data, achieved 90% phenotype prediction accuracy. Thus, our integrative approach holds significant promise for developing organ-level, comprehensive, molecular signatures of disease.
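The two ideas named in this summary, comparing expression levels within pairs of genes and predicting along a diagnostic hierarchy, can be illustrated with a minimal sketch. This is not the ISSAC implementation: the majority-vote rule, the nested-dictionary hierarchy, the gene names (G1 to G4) and the phenotype labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple, Union

# A pair (a, b) casts a "yes" vote when gene a is expressed above gene b.
Pair = Tuple[str, str]
# A node is either a leaf label (str) or a decision node (dict).
Node = Union[str, dict]


def pair_vote(expr: Dict[str, float], pairs: List[Pair]) -> bool:
    """Return True when a strict majority of pairs satisfy expr[a] > expr[b]."""
    votes = sum(1 for a, b in pairs if expr[a] > expr[b])
    return 2 * votes > len(pairs)


def classify(expr: Dict[str, float], node: Node) -> str:
    """Walk the hierarchy from the root until a leaf label is reached."""
    while not isinstance(node, str):
        node = node["yes"] if pair_vote(expr, node["pairs"]) else node["no"]
    return node


# Hypothetical two-level hierarchy (assumption, not taken from the source).
toy_hierarchy = {
    "pairs": [("G1", "G2")],
    "yes": "phenotype A",
    "no": {
        "pairs": [("G3", "G4")],
        "yes": "phenotype B",
        "no": "phenotype C",
    },
}

sample = {"G1": 2.0, "G2": 5.0, "G3": 7.0, "G4": 1.0}
print(classify(sample, toy_hierarchy))
```

Because each decision depends only on which gene of a pair is higher, a classifier of this kind is unaffected by any monotonic rescaling of a sample's expression values, which is one reason pair-based rules are attractive when data come from multiple studies.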
Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape.
Since microRNAs (miRNAs) were discovered, their impact on regulating various biological activities has been a surprising and exciting field. Knowing the entire repertoire of these small molecules is the first step to gain a better understanding of their function. High throughput discovery tools such as next-generation sequencing significantly increased the number of known miRNAs in different organisms in recent years. However, the process of being able to accurately identify miRNAs is still a complex and difficult task, requiring the integration of experimental approaches with computational methods. A number of prediction algorithms based on characteristics of miRNA molecules have been developed to identify new miRNA species. Different approaches have certain strengths and weaknesses and in this review, we aim to summarize several commonly used tools in metazoan miRNA discovery.
isomer; machine learning; miRNA conservation; RNA secondary structure; sequence homology
Studying complex biological systems holistically, rather than one gene or one protein at a time, requires the concerted effort of scientists from a wide variety of disciplines. The Institute for Systems Biology (ISB) has seamlessly integrated these disparate fields to create a cross-disciplinary platform and culture in which “biology drives technology drives computation.” To achieve this platform/culture, it has been necessary for cross-disciplinary ISB scientists to learn one another’s languages and work together effectively in teams. The focus of this “systems” approach on disease has led to a discipline denoted systems medicine. The advent of technological breakthroughs in the fields of genomics, proteomics, and, indeed, the other “omics” is catalyzing striking advances in systems medicine that have transformed, and continue to transform, diagnostic and therapeutic strategies. Systems medicine has united genomics and genetics through family genomics to more readily identify disease genes. It has made blood a window into health and disease. It is leading to the stratification of diseases (division into discrete subtypes) for proper impedance match against drugs and the stratification of patients into subgroups that respond to environmental challenges in a similar manner (e.g. response to drugs, response to toxins, etc.). The convergence of patient-activated social networks, big data and their analytics, and systems medicine has led to a P4 medicine that is predictive, preventive, personalized, and participatory. Medicine will focus on each individual. It will become proactive in nature. It will increasingly focus on wellness rather than disease. For example, in 10 years each patient will be surrounded by a virtual cloud of billions of data points, and we will have the tools to reduce this enormous data dimensionality into simple hypotheses about how to optimize wellness and avoid disease for each individual.
P4 medicine will be able to detect and treat perturbations in healthy individuals long before disease symptoms appear, thus optimizing the wellness of individuals and avoiding disease. P4 medicine will 1) improve health care, 2) reduce the cost of health care, and 3) stimulate innovation and new company creation. Health care is not the only subject that can benefit from such integrative, cross-disciplinary, and systems-driven platforms and cultures. Many other challenges plaguing our planet, such as energy, environment, nutrition, and agriculture can be transformed by using such an integrated and systems-driven approach.
P4 medicine; systems medicine; systems biology; personalized medicine; disease stratification; patient stratification; systems-driven diagnostics
E14.Tg2a mouse embryonic stem (mES) cells are a widely used host in gene trap and gene targeting techniques. Molecular characterization of host cells will provide background information for a better understanding of functions of the knockout genes. Using a highly selective glycopeptide-capture approach but ordinary liquid chromatography coupled mass spectrometry (LC-MS), we characterized the N-glycoproteins of E14.Tg2a cells and analyzed the close relationship between the obtained N-glycoproteome and cell-surface proteomes. Our results provide a global view of cell surface protein molecular properties, in which receptors seem to be much more diverse but lower in abundance than transporters on average. In addition, our results provide a systematic view of the E14.Tg2a N-glycosylation, from which we discovered some striking patterns, including an evolutionarily preserved and maybe functionally selected complementarity between N-glycosylation and the transmembrane structure in protein sequences. We also observed an environmentally influenced N-glycosylation pattern among glycoenzymes and extracellular matrix proteins. We hope that the acquired information enhances our molecular understanding of mES E14.Tg2a as well as the biological roles played by N-glycosylation in cell biology in general.
Rheumatoid arthritis (RA) is a chronic autoimmune disease that primarily attacks synovial joints. Despite the advances in diagnosis and treatment of RA, novel molecular targets are still needed to improve the accuracy of diagnosis and the therapeutic outcomes. Here, we present a systems approach that can effectively 1) identify core RA-associated genes (RAGs), 2) reconstruct RA-perturbed networks, and 3) select potential targets for diagnosis and treatment of RA. By integrating multiple previously reported gene expression datasets, we first identified 983 core RAGs that show RA-dominant differential expression, compared to osteoarthritis (OA), across the datasets. Using the core RAGs, we then reconstructed RA-perturbed networks that delineate key RA-associated cellular processes and transcriptional regulation. The networks revealed that synovial fibroblasts play major roles in defining RA-perturbed processes, that anti-TNF-α therapy restored many RA-perturbed processes, and that 19 transcription factors (TFs) contribute substantially to deregulation of the core RAGs in the RA-perturbed networks. Finally, we selected a list of potential molecular targets that can act as metrics or modulators of the RA-perturbed networks. These network models thus identify a panel of potential targets that will serve as an important resource for the discovery of therapeutic targets and diagnostic markers, as well as providing novel insights into RA pathogenesis.
Summary: With the rapidly expanding availability of data from personal genomes, exomes and transcriptomes, medical researchers will frequently need to test whether observed genomic variants are novel or known. This task requires downloading and handling large and diverse datasets from a variety of sources, and processing them with bioinformatics tools and pipelines. Alternatively, researchers can upload data to online tools, which may conflict with privacy requirements. We present here Kaviar, a tool that greatly simplifies the assessment of novel variants. Kaviar includes: (i) an integrated and growing database of genomic variation from diverse sources, including over 55 million variants from personal genomes, family genomes, transcriptomes, SNV databases and population surveys; and (ii) software for querying the database efficiently.
Availability: Kaviar is programmed in Perl and offered free of charge as Open Source Software. Kaviar may be used online as a programmatic web service or downloaded for local use from http://db.systemsbiology.net/kaviar. The database is also provided.
Supplementary Information: Supplementary data are available at Bioinformatics online.
SOX2 is an important stem cell marker and plays important roles in development and carcinogenesis. However, the role of SOX2 in the Epithelial-Mesenchymal Transition (EMT) has not been investigated. We demonstrated, for the first time, that SOX2 is involved in the EMT process: knockdown of SOX2 in colorectal cancer (CRC) SW620 cells induced a Mesenchymal-Epithelial Transition (MET) process with recognized changes in the expression of key genes involved in the EMT process, including E-cadherin and vimentin. In addition, we provided a link between SOX2 activity and the WNT pathway by showing that knockdown of SOX2 reduced WNT pathway activity in CRC cells. We further demonstrated that SOX2 is involved in cell migration and invasion in vitro and in metastasis in vivo for CRC cells, and that the process might be mediated through MMP2 activity. Finally, an IHC analysis of 44 colorectal cancer cases suggested that SOX2 is a prognostic marker for metastasis of colorectal cancers.
Complexity is the grand challenge for science and engineering in the 21st century. We suggest that biology is a discipline that is uniquely situated to tackle complexity, through a diverse array of technologies for characterizing molecular structure, interactions and function. A major difficulty in the analysis of complex biological systems is dealing with the low signal-to-noise inherent to nearly all large-scale biological data sets. We discuss powerful bioinformatic concepts for boosting signal-to-noise through external knowledge incorporated in processing units we call Filters and Integrators. These concepts are illustrated in four landmark studies that have provided model implementations of Filters, Integrators, or both.
Motivation: Systems biology attempts to describe complex systems behaviors in terms of dynamic operations of biological networks. However, there is a lack of tools that can effectively decode complex network dynamics over multiple conditions.
Results: We present principal network analysis (PNA) that can automatically capture major dynamic activation patterns over multiple conditions and then generate protein and metabolic subnetworks for the captured patterns. We first demonstrated the utility of this method by applying it to a synthetic dataset. The results showed that PNA correctly captured the subnetworks representing dynamics in the data. We further applied PNA to two time-course gene expression profiles collected from (i) MCF7 cells after treatments of HRG at multiple doses and (ii) brain samples of four strains of mice infected with two prion strains. The resulting subnetworks and their interactions revealed network dynamics associated with HRG dose-dependent regulation of cell proliferation and differentiation and early PrPSc accumulation during prion infection.
Availability: The web-based software is available at: http://sbm.postech.ac.kr/pna.
Supplementary information: Supplementary data are available at Bioinformatics online.
RNA-seq, a transcriptomic profiling method based on next-generation sequencing (NGS) technologies, has been widely used to study global gene expression, alternative exon usage, new exon discovery, novel transcriptional isoforms and genomic sequence variations. However, this technique also poses many biological and informatics challenges to extracting meaningful biological information. RNA-seq data analysis is built on the foundation of high-quality initial genome localization and alignment information for RNA-seq sequences. Toward this goal, we have developed RNASEQR to accurately and effectively map millions of RNA-seq sequences. We have systematically compared RNASEQR with four of the most widely used tools using a simulated data set created from the Consensus CDS project and two experimental RNA-seq data sets generated from a human glioblastoma patient. Our results showed that RNASEQR yields more accurate estimates for gene expression, complete gene structures and new transcript isoforms, as well as more accurate detection of single nucleotide variants (SNVs). RNASEQR analyzes raw data from RNA-seq experiments effectively and outputs results in a manner that is compatible with a wide variety of specialized downstream analyses on desktop computers.
MicroRNAs (miRNAs) are a recently discovered class of small, non-coding RNAs that regulate protein levels post-transcriptionally. miRNAs play important regulatory roles in many cellular processes, including differentiation, neoplastic transformation, and cell replication and regeneration. Because of these regulatory roles, it is not surprising that aberrant miRNA expression has been implicated in several diseases. Recent studies have reported significant levels of miRNAs in serum and other body fluids, raising the possibility that circulating miRNAs could serve as useful clinical biomarkers. Here, we provide a brief overview of miRNA biogenesis and function, the identification and potential roles of circulating extracellular miRNAs, and the prospective uses of miRNAs as clinical biomarkers. Finally, we address several issues associated with the accurate measurement of miRNAs from biological samples.
During prion infections of the central nervous system (CNS) the cellular prion protein, PrPC, is templated to a conformationally distinct form, PrPSc. Recent studies have demonstrated that the Sprn gene encodes a GPI-linked glycoprotein Shadoo (Sho), which localizes to a similar membrane environment as PrPC and is reduced in the brains of rodents with terminal prion disease. Here, analyses of prion-infected mice revealed that down-regulation of Sho protein was not related to Sprn mRNA abundance at any stage in prion infection. Down-regulation was robust upon propagation of a variety of prion strains in Prnpa and Prnpb mice, with the exception of the mouse-adapted BSE strain 301 V. In addition, Sho encoded by a TgSprn transgene was down-regulated to the same extent as endogenous Sho. Reduced Sho levels were not seen in a tauopathy, in chemically induced spongiform degeneration or in transgenic mice expressing the extracellular ADan amyloid peptide of familial Danish dementia. Insofar as prion-infected Prnp hemizygous mice exhibited accumulation of PrPSc and down-regulation of Sho hundreds of days prior to onset of neurologic symptoms, Sho depletion can be excluded as an important trigger for clinical disease or as a simple consequence of neuronal damage. These studies instead define a disease-specific effect, and we hypothesize that membrane-associated Sho comprises a bystander substrate for processes degrading PrPSc. Thus, while protease-resistant PrP detected by in vitro digestion allows post mortem diagnosis, decreased levels of endogenous Sho may trace an early response to PrPSc accumulation that operates in the CNS in vivo. This cellular response may offer new insights into the homeostatic mechanisms involved in detection and clearance of the misfolded proteins that drive prion disease pathogenesis.
In prion infections of the nervous system the cellular prion protein, PrPC, changes to a distinct form, PrPSc. Recent studies have demonstrated that another glycoprotein, Shadoo (Sho), which occupies a membrane environment similar to that of PrPC, is reduced in the brains of rodents with terminal prion disease. Our analyses of prion-infected mice revealed that reduction of Sho protein was not due to reductions in the corresponding messenger RNA. Reduction in Sho was clearly evident upon propagation of a variety of prion strains, but was not seen in mice with other types of neurodegenerative disease. Also, as prion-infected mice with only one copy of the PrP gene exhibited both accumulation of PrPSc and a reduction of Sho protein hundreds of days prior to onset of neurologic symptoms, the drop in Sho protein level can be excluded as an important trigger for clinical disease, or as a non-specific consequence of brain cell damage. Instead, our studies define an effect restricted to prion disease, and we hypothesize that Sho protein is a “bystander” for degradative processes aimed at destroying PrPSc.
An endogenous molecular-cellular network for both normal and abnormal functions is assumed to exist. This endogenous network forms a nonlinear stochastic dynamical system, with many stable attractors in its functional landscape. Normal or abnormal robust states can be decided by this network in a manner similar to the neural network. In this context cancer is hypothesized as one of its robust intrinsic states.
This hypothesis implies that a nonlinear stochastic mathematical cancer model is constructible based on available experimental data and its quantitative prediction is directly testable. Within such model the genesis and progression of cancer may be viewed as stochastic transitions between different attractors. Thus it further suggests that progressions are not arbitrary. Other important issues on cancer, such as genetic vs epigenetics, double-edge effect, dormancy, are discussed in the light of present hypothesis. A different set of strategies for cancer prevention, cure, and care, is therefore suggested.
The pioneering work of Jean Dausset on the HLA system established several principles that were later reflected in the Human Genome Project and contributed to the foundations of predictive, preventive, personalized and participatory (P4) medicine. To effectively develop systems medicine, we should take advantage of the lessons of the HLA saga, emphasizing the importance of exploring a fascinating but mysterious biology, now using systems principles, pioneering new technology developments and creating shared biological and information resources.
To understand the chemotherapy response program in ovarian cancer cells at deep transcript sequencing levels.
Two next-generation sequencing technologies, MPSS (massively parallel signature sequencing) and SBS (sequencing by synthesis), were used to sequence the transcripts of IGROV1 and IGROV1-CP cells, and to sequence the transcripts of a highly chemotherapy responsive and a highly chemotherapy resistant ovarian cancer tissue.
We identified 3,422 signatures (2,957 genes) that are significantly different between IGROV1 and IGROV1-CP cells (P < 0.001). Gene Ontology (GO) terms GO:0001837 (epithelial to mesenchymal transition) and GO:0034330 (cell junction assembly and maintenance) are enriched in genes that are overexpressed in IGROV1-CP cells, while apoptosis-related GO terms are enriched in genes overexpressed in IGROV1 cells. We identified 1,187 tags (corresponding to 1,040 genes) that are differentially expressed between the chemotherapy responsive and the persistently chemotherapy resistant ovarian cancer tissues. GO terms GO:0050673 (epithelial cell proliferation) and GO:0050678 (regulation of epithelial cell proliferation) are enriched in the genes overexpressed in the chemotherapy resistant tissue, while GO:0007229 (integrin-mediated signaling pathway) is enriched in the genes overexpressed in the chemotherapy sensitive tissue. An integrative analysis identified 111 common differentially expressed genes including two bone morphogenetic proteins (BMP4 and BMP7), six solute carrier proteins (SLC10A3, SLC16A3, SLC25A1, SLC35B3, SLC7A5 and SLC7A7), transcription factor POU5F1 (POU class 5 homeobox 1), and KLK10 (kallikrein-related peptidase 10). A network analysis revealed a subnetwork of three genes (BMP7, NR2F2 and AP2B1) that were consistently overexpressed in the chemoresistant tissue or cells compared to the chemosensitive tissue or cells.
Our database offers the first comprehensive view of the digital transcriptomes of ovarian cancer cell lines and tissues with different chemotherapy response phenotypes.
We analyzed the whole genome sequences of a family of four, consisting of two siblings and their parents. Family-based sequencing allowed us to delineate recombination sites precisely, identify 70% of the sequencing errors, and identify very rare SNVs. We also directly estimated a human intergeneration mutation rate of ∼1.1×10⁻⁸ per position per haploid genome. Both offspring in this family have two recessive disorders: Miller syndrome, for which the gene was concurrently identified, and primary ciliary dyskinesia, for which causative genes have been previously identified. Family-based genome analysis enabled us to narrow the candidate genes for both of these Mendelian disorders to only four. Our results demonstrate the unique value of complete genome sequencing in families.
whole genome sequencing; rare genetic disease; inheritance analysis; recessive models; de novo mutations; recombination hotspot; crossover; haploidentity; haploidentical block; inheritance state; inheritance vector; HMM; haplotype; Miller syndrome; POADS; DHODH; DNAH5; KIAA0556; CES1
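One way family data can expose sequencing errors is a per-site Mendelian consistency check: each child genotype must be obtainable by taking one allele from each parent. The sketch below illustrates only that basic check. It is not the procedure used in the study, which the abstract does not describe in detail; the function names, the genotype encoding and the example positions are assumptions made for illustration. An inconsistent site may also reflect a true de novo mutation, so such sites are flagged rather than discarded.

```python
from itertools import product
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]  # unphased pair of alleles, e.g. ("A", "G")


def mendelian_consistent(father: Genotype, mother: Genotype, child: Genotype) -> bool:
    """True if the child could have received one allele from each parent."""
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((f, m))) == child_sorted
        for f, m in product(father, mother)
    )


def flag_inconsistent_sites(sites: Dict[int, dict]) -> List[int]:
    """Return positions where at least one child violates Mendelian inheritance."""
    return [
        pos
        for pos, g in sorted(sites.items())
        if not all(
            mendelian_consistent(g["father"], g["mother"], c)
            for c in g["children"]
        )
    ]


# Hypothetical genotype calls for a family of four (two children per site).
toy_sites = {
    100: {"father": ("A", "A"), "mother": ("A", "G"),
          "children": [("A", "G"), ("A", "A")]},
    200: {"father": ("C", "C"), "mother": ("C", "C"),
          "children": [("C", "T"), ("C", "C")]},
}

print(flag_inconsistent_sites(toy_sites))
```

A per-site check of this kind cannot detect errors that happen to remain Mendelian-consistent; analyses that track inheritance across neighboring positions can in principle catch more.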
Cancer stem cells (CSC), also called tumor initiating cells (TIC), are considered to be the origin of replicating malignant tumor cells in a variety of human cancers. Their presence in the tumor may herald malignancy potential, mediate resistance to conventional chemotherapy or radiotherapy, and confer poor survival outcomes. Thus, CSC may serve as critical cellular targets for treatment. The ability to therapeutically target CSC hinges upon identifying their unique cell surface markers and the underlying survival signaling pathways. While accumulating evidence suggests cell-surface antigens (such as CD44, CD133) as CSC markers for several tumor tissues, emerging clinical needs exist for the identification of new markers to completely separate CSC from normal stem cells. Recent studies have demonstrated the critical role of the tumor suppressor PTEN/PI3 kinase pathway in regulating TIC in leukemia, brain, and intestinal tissues. The successful eradication of tumors by therapies targeting CSC will require an in-depth understanding of the molecular mechanisms governing CSC self renewal, differentiation, and escape from conventional therapy. Here we review recent progress from brain tumor and intestinal stem cell research with a focus on the PTEN-Akt-Wnt pathway, and how the components of CSC pathways may serve as biomarkers for diagnosis, prognosis, and therapeutics.
cancer stem cells; cell surface marker; CD133; PTEN; Akt; Wnt
Systems biology is an approach to science that views biology as an information science and studies biological systems as a whole, together with their interactions with the environment. This approach, for the reasons described here, has particular power in the search for informative diagnostic biomarkers of diseases because it focuses on the fundamental causes and keys on the identification and understanding of disease-perturbed molecular networks. In this review, we describe some recent developments that have used systems biology to address complex diseases (prion disease and drug-induced liver injury) and use these as examples to illustrate the importance of understanding network structure and dynamics. The knowledge of network dynamics through in vitro experimental perturbation and modeling allows us to determine the state of the networks, to identify molecular correlates, and to derive new disease treatment approaches to reverse the pathology or prevent its progress into a more severe state through the manipulation of network states. This general approach, including diagnostics and therapeutics, is becoming known as systems medicine.
Systems biology; biomarkers; systems medicine; prion disease; drug induced liver injury; microRNA; organ-specific proteins
SOX2 is a key gene implicated in maintaining the stemness of embryonic and adult stem cells. SOX2 appears to re-activate in several human cancers including glioblastoma multiforme (GBM), however, the detailed response program of SOX2 in GBM has not yet been defined.
We show that knockdown of the SOX2 gene in LN229 GBM cells reduces cell proliferation and colony formation. We then comprehensively characterize the SOX2 response program by an integrated analysis using several advanced genomic technologies including ChIP-seq, microarray profiling, and microRNA sequencing. Using ChIP-seq technology, we identified 4883 SOX2 binding regions in the GBM cancer genome. The consensus sequence wwTGnwTw occurred 3931 times in 2312 of these SOX2 binding regions. Microarray analysis identified 489 genes whose expression was altered in response to SOX2 knockdown. Notable findings include that SOX2 regulates the expression of SOX family proteins SOX1 and SOX18, and that SOX2 down-regulates BEX1 (brain expressed X-linked 1) and BEX2 (brain expressed X-linked 2), two genes with tumor suppressor activity in GBM. Using next-generation sequencing, we identified 105 precursor microRNAs (corresponding to 95 mature miRNAs) regulated by SOX2, including down-regulation of miR-143, -145, -253-5p and miR-452. We also show that miR-145 and SOX2 form a double negative feedback loop in GBM cells, potentially creating a bistable system in GBM cells.
We present an integrated dataset of ChIP-seq, expression microarrays and microRNA sequencing representing the SOX2 response program in LN229 GBM cells. The insights gained from our integrated analysis further our understanding of the potential actions of SOX2 in carcinogenesis and serve as a useful resource for the research community.
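The consensus sequence wwTGnwTw reported for the SOX2 binding regions is written in IUPAC nucleotide codes, where w stands for A or T and n for any base. A minimal sketch of how such a motif can be located in a DNA sequence follows. It is an illustration only, not the motif-counting method of the study: the function names and the test sequence are assumptions, only the forward strand is scanned, and overlapping occurrences are counted separately.

```python
import re
from typing import List

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[ch] for ch in motif.upper())
    # A zero-width lookahead lets finditer report every start position.
    return re.compile("(?=(" + body + "))")


def find_motif(seq: str, motif: str) -> List[int]:
    """Return 0-based start positions of the motif on the forward strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]


# Hypothetical sequence containing one instance of the motif (ATTGCATA).
toy_seq = "CCATTGCATACC"
print(find_motif(toy_seq, "wwTGnwTw"))
```

Scanning the reverse complement as well would be needed to count sites on both strands.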
Expressed prostatic secretions (EPS) contain proteins of prostate origin that may reflect the health status of the prostate and be used as diagnostic markers for prostate diseases including prostatitis, benign prostatic hyperplasia, and prostate cancer. Despite their importance and potential applications, a complete catalog of EPS proteins is not yet available. We, therefore, undertook a comprehensive analysis of the EPS proteome using 2-D micro-LC combined with MS/MS. Using stringent filtering criteria, we identified a list of 114 proteins with at least two unique-peptide hits and an additional 75 proteins with only a single unique-peptide hit. The proteins identified include kallikrein 2 (KLK2), KLK3 (prostate-specific antigen), KLK11, and nine cluster of differentiation (CD) molecules including CD10, CD13, CD14, CD26, CD66a, CD66c, CD143, CD177, and CD224. To our knowledge, this list represents the first comprehensive characterization of the EPS proteome, and it provides a candidate biomarker list for targeted quantitative proteomics analysis using a multiple reaction monitoring (MRM) approach. To help prioritize candidate biomarkers, we constructed a protein–protein interaction network of the EPS proteins using Cytoscape (www.cytoscape.org), and overlaid the expression level changes from the Oncomine database onto the network.
Biomarker; Expressed prostatic secretions; Mass spectrometry; Prostate cancer
Epithelial ovarian cancer (EOC) ranks fifth as a cause of cancer deaths in women. Current diagnostic and monitoring markers have limited reliability for the detection of disease. We have tested the possibility of identifying candidate biomarkers present at low nanogram to picogram levels after removing both the 12 most abundant and 77 moderately abundant proteins from serum samples of EOC patients using antibody affinity columns. We showed that this approach allows the identification of proteins that are expressed at nanogram per liter levels in the serum. Using ICAT/MS/MS analysis, we identified 51 proteins that are differentially expressed by at least twofold. These proteins include leucine-rich α-2-glycoprotein, matrix metalloproteinase-9 (MMP-9), inter-α-trypsin inhibitor heavy chain H1, insulin-like growth factor-binding protein 6, insulin-like growth factor-binding protein 3, isoform 1 of epidermal growth factor receptor, angiopoietin-like protein 3 (ANGPTL3) and phosphatidylcholine-sterol acyltransferase. We confirmed the differential expression of MMP9 and ANGPTL3 in normal and ovarian cancer sera by ELISA assays. Further robust clinical evaluation of the candidate markers identified is necessary.
Angiopoietin-like protein 3; Matrix metalloproteinase-9; MS; Ovarian cancer; Serum depletion
A powerful way to separate signal from noise in biology is to convert the molecular data from individual genes or proteins into an analysis of comparative biological network behaviors. One of the limitations of previous network analyses is that they do not take into account the combinatorial nature of gene interactions within the network. We report here a new technique, Differential Rank Conservation (DIRAC), which permits one to assess these combinatorial interactions to quantify various biological pathways or networks in a comparative sense, and to determine how they change in different individuals experiencing the same disease process. This approach is based on the relative expression values of participating genes—i.e., the ordering of expression within network profiles. DIRAC provides quantitative measures of how network rankings differ either among networks for a selected phenotype or among phenotypes for a selected network. We examined disease phenotypes including cancer subtypes and neurological disorders and identified networks that are tightly regulated, as defined by high conservation of transcript ordering. Interestingly, we observed a strong trend to looser network regulation in more malignant phenotypes and later stages of disease. At a sample level, DIRAC can detect a change in ranking between phenotypes for any selected network. Variably expressed networks represent statistically robust differences between disease states and serve as signatures for accurate molecular classification, validating the information about expression patterns captured by DIRAC. Importantly, DIRAC can be applied not only to transcriptomic data, but to any ordinal data type.
The systems approach to medicine derives from the idea that diseased cells arise from one or more perturbed biological networks due to the net effect of interactions among multiple molecular agents; by measuring differences in the abundance of biomolecules (e.g., mRNA, proteins, metabolites) we can identify reporters of network states and uncover molecular signatures of disease. However, a major limitation of previously published network analyses is the focus on small numbers of individual, differentially-expressed genes, hence the failure to take into account combinatorial interactions. We report a new technique, Differential Rank Conservation, for identifying and measuring network-level perturbations. Our rank conservation index is based entirely on the relative levels of expression for participating genes and allows us to detect differences in network orderings between networks for a given phenotype and between phenotypes for a given network. In examining cancer subtypes and neurological disorders, we identified networks that are tightly and loosely regulated, as defined by the level of conservation of transcript ordering, and observed a strong trend to looser network regulation in more malignant phenotypes and later stages of disease. We also demonstrate that variably expressed networks represent robust differences between disease states.
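The idea of measuring how well the ordering of transcripts within a network is conserved across samples can be made concrete with a small sketch. It assumes a simplified reading of the description above: every gene pair in a network is reduced to a binary ordering, the majority ordering across a phenotype's samples serves as a template, and conservation is the average fraction of pairs on which each sample agrees with that template. The function names and toy expression values are hypothetical, and the published method may differ in its details.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def pair_orderings(profile: Sequence[float]) -> Tuple[bool, ...]:
    """For every gene pair i < j, record whether gene i is expressed below gene j."""
    return tuple(
        profile[i] < profile[j]
        for i, j in combinations(range(len(profile)), 2)
    )


def rank_template(profiles: List[Sequence[float]]) -> Tuple[bool, ...]:
    """Majority ordering of each gene pair across the samples of one phenotype."""
    orderings = [pair_orderings(p) for p in profiles]
    n = len(orderings)
    return tuple(2 * sum(col) > n for col in zip(*orderings))


def rank_matching_score(profile: Sequence[float], template: Tuple[bool, ...]) -> float:
    """Fraction of gene pairs whose ordering in this sample matches the template."""
    observed = pair_orderings(profile)
    return sum(o == t for o, t in zip(observed, template)) / len(template)


def rank_conservation_index(profiles: List[Sequence[float]]) -> float:
    """Mean matching score of a phenotype's samples against their own template."""
    template = rank_template(profiles)
    return sum(rank_matching_score(p, template) for p in profiles) / len(profiles)


# Hypothetical three-gene network measured in three samples per phenotype.
tight = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [1.0, 5.0, 9.0]]  # same ordering everywhere
loose = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 3.0, 1.0]]  # orderings disagree

print(rank_conservation_index(tight))
print(rank_conservation_index(loose))
```

In this toy case the first phenotype scores higher than the second, matching the intuition that a tightly regulated network keeps its transcript ordering from sample to sample. A network needs at least two genes for the scores to be defined.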