Serous ovarian cancer (SeOvCa) is an aggressive disease with differential and often inadequate therapeutic outcome after standard treatment. The Cancer Genome Atlas (TCGA) has provided rich molecular and genetic profiles from hundreds of primary surgical samples. These profiles confirm mutations of TP53 in ∼100% of patients and an extraordinarily complex profile of DNA copy number changes with considerable patient-to-patient diversity. This raises the joint challenge of exploiting all new available datasets and reducing their confounding complexity for the purpose of predicting clinical outcomes and identifying disease relevant pathway alterations. We therefore set out to use multi-data type genomic profiles (mRNA, DNA methylation, DNA copy-number alteration and microRNA) available from TCGA to identify prognostic signatures for the prediction of progression-free survival (PFS) and overall survival (OS).
We implemented a multivariate Cox Lasso model and median time-to-event prediction algorithm and applied it to two datasets integrated from the four genomic data types. We (1) selected features through cross-validation; (2) generated a prognostic index for patient risk stratification; and (3) directly predicted continuous clinical outcome measures, that is, the time to recurrence and survival time. We used Kaplan-Meier p-values, hazard ratios (HR), and concordance probability estimates (CPE) to assess prediction performance, comparing separate and integrated datasets. Data integration resulted in the best PFS signature (withheld data: p-value = 0.008; HR = 2.83; CPE = 0.72).
We provide a prediction tool that inputs genomic profiles of primary surgical samples and generates patient-specific predictions for the time to recurrence and survival, along with outcome risk predictions. Using integrated genomic profiles resulted in information gain for prediction of outcomes. Pathway analysis provided potential insights into functional changes affecting disease progression. The prognostic signatures, if prospectively validated, may be useful for interpreting therapeutic outcomes for clinical trials that aim to improve the therapy for SeOvCa patients.
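As a rough illustration of the risk-stratification step described above, the sketch below computes a prognostic index as the linear predictor of a fitted Cox model and splits patients at the median index. The feature names, coefficient values, and the median cutoff are hypothetical placeholders; the abstract does not state the selected features or the cutoff actually used.

```python
from statistics import median


def prognostic_index(profile, coefficients):
    """Linear predictor of a Cox model: sum of coefficient * feature value."""
    return sum(beta * profile.get(feature, 0.0)
               for feature, beta in coefficients.items())


def stratify_by_median(indices):
    """Label each patient 'high' or 'low' risk relative to the median index."""
    cutoff = median(indices.values())
    return {pid: ("high" if value > cutoff else "low")
            for pid, value in indices.items()}


# Hypothetical Lasso-selected features and coefficients (illustration only).
coefficients = {"feature_a": 0.8, "feature_b": -0.5}
patients = {
    "P1": {"feature_a": 1.0, "feature_b": 0.0},
    "P2": {"feature_a": 0.0, "feature_b": 1.0},
    "P3": {"feature_a": 0.5, "feature_b": 0.5},
    "P4": {"feature_a": -1.0, "feature_b": 1.0},
}
indices = {pid: prognostic_index(p, coefficients) for pid, p in patients.items()}
groups = stratify_by_median(indices)
```

A higher index corresponds to a higher predicted hazard, so the "high" group is the one expected to recur or die earlier; the two groups can then be compared with a Kaplan-Meier analysis.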
This study aims to explore gene expression signatures and serum biomarkers to predict intrinsic chemoresistance in epithelial ovarian cancer (EOC).
Patients and Methods
Gene expression profiling data of 322 high-grade EOC cases between 2009 and 2010 in The Cancer Genome Atlas project (TCGA) were used to develop and validate gene expression signatures that could discriminate different responses to first-line platinum/paclitaxel-based treatments. A gene regulation network was then built to further identify hub genes responsible for differential gene expression between the complete response (CR) group and the progressive disease (PD) group. Further, to find more robust serum biomarkers for clinical application, we integrated our gene signatures with previously reported gene signatures and identified secretory protein-encoding genes by searching the DAVID database. Finally, a gene-drug interaction network was constructed by searching the Comparative Toxicogenomics Database (CTD) and the literature.
A 349-gene predictive model and an 18-gene model independent of key clinical features with high accuracy were developed for prediction of chemoresistance in EOC. Among them, ten important hub genes and six critical signaling pathways were identified to have important implications in chemotherapeutic response. Further, ten potential serum biomarkers were identified for predicting chemoresistance in EOC. Finally, we suggested some drugs for individualized treatment.
We have developed predictive models and serum biomarkers for platinum/paclitaxel response and established a new approach to discovering potential serum biomarkers from gene expression profiles. Potential drugs that target the hub genes are also suggested.
Small sample sizes used in previous studies result in a lack of overlap between the reported gene signatures for prediction of chemotherapy response. Although morphologic features, especially tumor nuclear morphology, are important for cancer grading, little research has been reported on quantitatively correlating cellular morphology with chemotherapy response, especially in a large data set. In this study, we have used a large population of patients to identify molecular and morphologic signatures associated with chemotherapy response in serous ovarian carcinoma.
A gene expression model that predicts response to chemotherapy is developed and validated using a large-scale data set consisting of 493 samples from The Cancer Genome Atlas (TCGA) and 244 samples from an Australian report. An identified 227-gene signature achieves an overall predictive accuracy of greater than 85% with a sensitivity of approximately 95% and specificity of approximately 70%. The gene signature significantly distinguishes between patients with unfavorable versus favorable prognosis, when applied to either an independent data set (P = 0.04) or an external validation set (P<0.0001). In parallel, we present the production of a tumor nuclear image profile generated from 253 sample slides by characterizing patients with nuclear features (such as size, elongation, and roundness) in incremental bins, and we identify a morphologic signature that demonstrates a strong association with chemotherapy response in serous ovarian carcinoma.
A gene signature discovered on a large data set provides robustness in accurately predicting chemotherapy response in serous ovarian carcinoma. The combination of the molecular and morphologic signatures yields a new understanding of potential mechanisms involved in drug resistance.
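To make the reported performance measures concrete, the following sketch computes sensitivity, specificity, and overall accuracy from paired true and predicted labels. The label names, the choice of "responder" as the positive class, and the toy counts are assumptions made for illustration; they are not taken from the study data.

```python
def classification_metrics(actual, predicted, positive="responder"):
    """Return sensitivity, specificity and accuracy for binary labels.

    Assumes both classes occur in `actual`; otherwise a ratio is undefined.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy example: 20 responders (19 called correctly), 10 non-responders (7 correct).
actual = ["responder"] * 20 + ["non-responder"] * 10
predicted = (["responder"] * 19 + ["non-responder"] * 1
             + ["non-responder"] * 7 + ["responder"] * 3)
metrics = classification_metrics(actual, predicted)
```

With these toy counts the sensitivity is 0.95 and the specificity 0.70, which shows how a classifier can reach high overall accuracy while still misclassifying a sizeable share of the smaller class.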
Ovarian cancer remains a significant public health burden, with the highest mortality rate of all the gynecological cancers. This is attributable to the late stage at which the majority of ovarian cancers are diagnosed, coupled with the low and variable response of advanced tumors to standard chemotherapies. To date, clinically useful predictors of treatment response remain lacking. Identifying the genetic determinants of ovarian cancer survival and treatment response is crucial to the development of prognostic biomarkers and personalized therapies that may improve outcomes for the late-stage patients who comprise the majority of cases.
To identify constitutional genetic variations contributing to ovarian cancer mortality, we systematically investigated associations between germline polymorphisms and ovarian cancer survival using data from The Cancer Genome Atlas Project (TCGA). Using stage-stratified Cox proportional hazards regression, we examined 650,000 SNP loci for association with survival. We additionally examined whether the association of significant SNPs with survival was modified by somatic alterations.
Germline polymorphisms at rs4934282 (AGAP11/C10orf116) and rs1857623 (DNAH14) were associated with stage-adjusted survival (p = 1.12e-07 and 1.80e-07, FDR = 1.2e-04 and 2.4e-04, respectively). A third SNP, rs4869 (C10orf116), was additionally identified as significant in the exome sequencing data; it is in near-perfect LD with rs4934282. The associations with survival remained significant when somatic alterations were taken into account.
Discovery analysis of TCGA data reveals germline genetic variations that may play a role in ovarian cancer survival even among late-stage cases. The significant loci are located near genes previously reported as having a possible relationship to platinum and taxol response. Because the variant alleles at the significant loci are common (frequencies for rs4934282 A/C alleles = 0.54/0.46, respectively; rs1857623 A/G alleles = 0.55/0.45, respectively) and germline variants can be assayed noninvasively, our findings provide potential targets for further exploration as prognostic biomarkers and individualized therapies.
Ovarian cancer causes more deaths than any other gynecological cancer. Identifying the molecular mechanisms that drive disease progression in ovarian cancer is a critical step in providing therapeutics, improving diagnostics, and linking clinical behavior with disease etiology. Identification of molecular interactions that stratify prognosis is key in facilitating a clinical-molecular perspective.
The Cancer Genome Atlas has recently made available the molecular characteristics of more than 500 patients. We used the TCGA multi-analysis study, two additional datasets, and a set of computational algorithms that we developed. The computational algorithms are based on methods that identify network alterations and quantify network behavior through gene expression.
We identify a network biomarker that significantly stratifies survival rates in ovarian cancer patients. Interestingly, expression levels of single genes or sets of genes do not explain the prognostic stratification. The discovered biomarker is composed of the network around the PDGF pathway. The biomarker enables prognosis stratification.
The work presented here demonstrates, through the power of gene-expression networks, the criticality of the PDGF network in driving disease course. In uncovering the specific interactions within the network that drive the phenotype, we catalyze targeted treatment, facilitate prognosis, and offer a novel perspective on hidden disease heterogeneity.
The Cancer Genome Atlas (TCGA) Network recently comprehensively catalogued the molecular aberrations in 487 high-grade serous ovarian cancers, with much remaining to be elucidated regarding the microRNAs (miRNAs). Here, using TCGA ovarian data, we surveyed the miRNAs, in the context of their predicted gene targets.
Methods and Results
Integration of miRNA and gene patterns yielded evidence that proximal pairs of miRNAs are processed from polycistronic primary transcripts, and that intronic miRNAs and their host gene mRNAs derive from common transcripts. Patterns of miRNA expression revealed multiple tumor subtypes and a set of 34 miRNAs predictive of overall patient survival. In a global analysis, miRNA:mRNA pairs anti-correlated in expression across tumors showed a higher frequency of in silico predicted target sites in the mRNA 3′-untranslated region (with less frequency observed for coding sequence and 5′-untranslated regions). The miR-29 family and predicted target genes were among the most strongly anti-correlated miRNA:mRNA pairs; over-expression of miR-29a in vitro repressed several anti-correlated genes (including DNMT3A and DNMT3B) and substantially decreased ovarian cancer cell viability.
This study establishes miRNAs as having a widespread impact on gene expression programs in ovarian cancer, further strengthening our understanding of miRNA biology as it applies to human cancer. As with gene transcripts, miRNAs exhibit high diversity reflecting the genomic heterogeneity within a clinically homogeneous disease population. Putative miRNA:mRNA interactions, as identified using integrative analysis, can be validated. TCGA data are a valuable resource for the identification of novel tumor suppressive miRNAs in ovarian as well as other cancers.
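The anti-correlation screen described above can be sketched as a pairwise correlation scan of miRNA and mRNA expression across tumors. This is a minimal illustration that assumes Pearson correlation and an arbitrary cutoff of -0.5; the abstract does not specify the statistic or threshold used, and the names and expression values below are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def anticorrelated_pairs(mirna_expr, mrna_expr, threshold=-0.5):
    """List (miRNA, gene, r) with r <= threshold, most negative first."""
    hits = []
    for mirna, m_values in mirna_expr.items():
        for gene, g_values in mrna_expr.items():
            r = pearson(m_values, g_values)
            if r <= threshold:
                hits.append((mirna, gene, r))
    return sorted(hits, key=lambda hit: hit[2])


# Invented expression values across four tumors (illustration only).
mirna_expr = {"miR-X": [1.0, 2.0, 3.0, 4.0]}
mrna_expr = {"GENE_A": [8.0, 6.0, 4.0, 2.0], "GENE_B": [1.0, 3.0, 2.0, 4.0]}
hits = anticorrelated_pairs(mirna_expr, mrna_expr)
```

Pairs returned by such a scan are only candidates; as the abstract notes, they still need to be checked against predicted target sites and validated experimentally.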
New tools are needed to predict outcomes of ovarian cancer patients treated with platinum-based chemotherapy. We hypothesized that a molecular score based on expression of genes that are involved in platinum-induced DNA damage repair could provide such prognostic information.
Gene expression data was extracted from The Cancer Genome Atlas (TCGA) database for 151 DNA repair genes from tumors of serous ovarian cystadenocarcinoma patients (n = 511). A molecular score was generated based on the expression of 23 genes involved in platinum-induced DNA damage repair pathways. Patients were divided into low (scores 0–10) and high (scores 11–20) score groups, and overall survival (OS) was analyzed by Kaplan–Meier method. Results were validated in two gene expression microarray datasets. Association of the score with OS was compared with known clinical factors (age, stage, grade, and extent of surgical debulking) using univariate and multivariable Cox proportional hazards models. Score performance was evaluated by receiver operating characteristic (ROC) curve analysis. Correlations between the score and likelihood of complete response, recurrence-free survival, and progression-free survival were assessed. Statistical tests were two-sided.
Improved survival was associated with being in the high-scoring group (high vs low scores: 5-year OS, 40% vs 17%, P < .001), and results were reproduced in the validation datasets (P < .05). The score was the only pretreatment factor that showed a statistically significant association with OS (high vs low scores, hazard ratio of death = 0.40, 95% confidence interval = 0.32 to 0.66, P < .001). ROC curves indicated that the score outperformed the known clinical factors (score in a validation dataset vs clinical factors, area under the curve = 0.65 vs 0.52). The score positively correlated with complete response rate, recurrence-free survival, and progression-free survival (Pearson correlation coefficient [r2] = 0.60, 0.84, and 0.80, respectively; P < .001 for all).
The DNA repair pathway–focused score can be used to predict outcomes and response to platinum therapy in ovarian cancer patients.
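The Kaplan-Meier method used above to compare the score groups can be sketched in a few lines. The function below is a generic product-limit estimator, not the authors' code, and the follow-up times and event flags in the example are invented.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # both deaths and censored patients leave the risk set
    return curve


# Invented follow-up data in months: event flag 1 = death, 0 = censored.
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
```

Running the estimator separately on the high-score and low-score groups gives the two curves whose survival at a given time point (for example five years) is compared in the results.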
Ovarian cancer is often called the ‘silent killer’ because early detection and prognosis are difficult. Understanding the biological mechanisms underlying ovarian cancer is therefore extremely important for treatment. We propose an integrative framework to identify pathway related networks based on large-scale TCGA copy number data and gene expression profiles. The integrative approach first detects highly conserved copy number altered genes and regards them as seed genes, and then applies a network-based method to identify subnetworks that can differentiate gene expression patterns between different phenotypes of ovarian cancer patients. The identified subnetworks are further validated on an independent gene expression data set using a network-based classification method. The experimental results show that our approach not only achieves good prediction performance across different data sets, but also identifies biologically meaningful subnetworks involved in many signaling pathways related to ovarian cancer.
The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional effort to characterize several types of cancer. Datasets from biomedical domains such as TCGA present a particularly challenging task for those interested in dynamically aggregating its results because the data sources are typically both heterogeneous and distributed. The Linked Data best practices offer a solution to integrate and discover data with those characteristics, namely through exposure of data as Web services supporting SPARQL, the Resource Description Framework query language. Most SPARQL endpoints, however, cannot easily be queried by data experts. Furthermore, exposing experimental data as SPARQL endpoints remains a challenging task because, in most cases, data must first be converted to Resource Description Framework triples. In line with those requirements, we have developed an infrastructure to expose clinical, demographic and molecular data elements generated by TCGA as a SPARQL endpoint by assigning elements to entities of the Simple Sloppy Semantic Database (S3DB) management model. All components of the infrastructure are available as independent Representational State Transfer (REST) Web services to encourage reusability, and a simple interface was developed to automatically assemble SPARQL queries by navigating a representation of the TCGA domain. A key feature of the proposed solution that greatly facilitates assembly of SPARQL queries is the distinction between the TCGA domain descriptors and data elements. Furthermore, the use of the S3DB management model as a mediator enables queries to both public and protected data without the need for prior submission to a single data source.
TCGA; SPARQL; RDF; Linked Data; Data integration
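The idea of assembling SPARQL queries automatically from a domain representation can be illustrated with a small string builder. This is only a sketch: it does not use or reproduce the S3DB services, and the `ex:` prefix, its URI, and the predicate names are hypothetical stand-ins for whatever vocabulary the real endpoint exposes.

```python
def build_select_query(variables, patterns, prefixes=None, limit=None):
    """Assemble a SPARQL SELECT query from (subject, predicate, object) triples."""
    lines = [f"PREFIX {name}: <{uri}>" for name, uri in (prefixes or {}).items()]
    lines.append("SELECT " + " ".join(f"?{v}" for v in variables))
    lines.append("WHERE {")
    lines.extend(f"  {s} {p} {o} ." for s, p, o in patterns)
    lines.append("}")
    if limit is not None:
        lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Hypothetical vocabulary: none of these terms come from the TCGA endpoint.
query = build_select_query(
    variables=["patient", "age"],
    patterns=[
        ("?patient", "ex:hasDisease", "ex:GBM"),
        ("?patient", "ex:ageAtDiagnosis", "?age"),
    ],
    prefixes={"ex": "http://example.org/tcga#"},
    limit=10,
)
```

Keeping the domain descriptors (here the predicates) separate from the data elements (the values bound to `?patient` and `?age`) is what lets an interface offer the descriptors as menu choices and generate the triple patterns behind the scenes.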
Despite advances in radical surgery and chemotherapy delivery, ovarian cancer is the most lethal gynecologic malignancy. Standard therapy includes treatment with platinum-based combination chemotherapies, yet there is no biomarker model to predict patients' responses to these agents. Here we have developed and independently tested multi-gene molecular predictors for forecasting patients' responses to individual drugs on a cohort of 55 ovarian cancer patients. To independently validate these molecular predictors, we performed microarray profiling on FFPE tumor samples of 55 ovarian cancer patients (UVA-55) treated with platinum-based adjuvant chemotherapy. Genome-wide chemosensitivity biomarkers were initially discovered from the in vitro drug activities and genomic expression data for carboplatin and paclitaxel, respectively. Multivariate predictors were trained with the cell line data and then evaluated with a historical patient cohort. For the UVA-55 cohort, the carboplatin, taxol, and combination predictors significantly stratified responder patients and non-responder patients (p = 0.019, 0.04, 0.014) with sensitivity = 91%, 96%, 93% and NPV = 57%, 67%, 67% in pathologic clinical response. The combination predictor also demonstrated a significant survival difference between predicted responders and non-responders with a median survival of 55.4 months vs. 32.1 months. Thus, COXEN single- and combination-drug predictors successfully stratified platinum resistance and taxane response in an independent cohort of ovarian cancer patients based on their FFPE tumor samples.
Epithelial ovarian cancer (EOC) has an innate susceptibility to become chemoresistant. Up to 30% of patients do not respond to conventional chemotherapy [paclitaxel (Taxol®) in combination with carboplatin] and, of those who have an initial response, many patients relapse. Therefore, an understanding of the molecular mechanisms that regulate cellular chemotherapeutic responses in EOC cells has the potential to impact significantly on patient outcome. The mitotic arrest deficiency protein 2 (MAD2) is a centrally important mediator of the cellular response to paclitaxel. MAD2 immunohistochemical analysis was performed on 82 high-grade serous EOC samples. A multivariate Cox regression analysis of nuclear MAD2 IHC intensity adjusting for stage, tumour grade and optimum surgical debulking revealed that low MAD2 IHC staining intensity was significantly associated with reduced progression-free survival (PFS) (p = 0.0003), with a hazard ratio of 4.689. The in vitro analyses of five ovarian cancer cell lines demonstrated that cells with low MAD2 expression were less sensitive to paclitaxel. Furthermore, paclitaxel-induced activation of the spindle assembly checkpoint (SAC) and apoptotic cell death was abrogated in cells transfected with MAD2 siRNA. In silico analysis identified a miR-433 binding domain in the MAD2 3′ UTR, which was verified in a series of experiments. Firstly, MAD2 protein expression levels were down-regulated in pre-miR-433 transfected A2780 cells. Secondly, pre-miR-433 suppressed the activity of a reporter construct containing the 3′-UTR of MAD2. Thirdly, blocking miR-433 binding to the MAD2 3′ UTR protected MAD2 from miR-433 induced protein down-regulation. Importantly, reduced MAD2 protein expression in pre-miR-433-transfected A2780 cells rendered these cells less sensitive to paclitaxel. In conclusion, loss of MAD2 protein expression results in increased resistance to paclitaxel in EOC cells.
Measuring MAD2 IHC staining intensity may predict paclitaxel responses in women presenting with high-grade serous EOC. Copyright © 2012 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
miR-433; MAD2; chemoresistance; epithelial ovarian cancer; paclitaxel
Despite recent large-scale profiling efforts, the best prognostic predictor of glioblastoma multiforme (GBM) remains the patient's age at diagnosis. We describe a global pattern of tumor-exclusive co-occurring copy-number alterations (CNAs) that is correlated, and possibly coordinated, with GBM patients' survival and response to chemotherapy. The pattern is revealed by GSVD comparison of patient-matched but probe-independent GBM and normal aCGH datasets from The Cancer Genome Atlas (TCGA). We find that, first, the GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern copy-number variations (CNVs) that occur in the normal human genome (e.g., female-specific X chromosome amplification) and experimental variations (e.g., in tissue batch, genomic center, hybridization date and scanner), without a-priori knowledge of these variations. Second, the pattern includes most known GBM-associated changes in chromosome numbers and focal CNAs, as well as several previously unreported CNAs in 3% of the patients. These include the biochemically putative drug target, cell cycle-regulated serine/threonine kinase-encoding TLK2, the cyclin E1-encoding CCNE1, and the Rb-binding histone demethylase-encoding KDM5A. Third, the pattern provides a better prognostic predictor than the chromosome numbers or any one focal CNA that it identifies, suggesting that the GBM survival phenotype is an outcome of its global genotype. The pattern is independent of age, and combined with age, makes a better predictor than age alone. GSVD comparison of matched profiles of a larger set of TCGA patients, inclusive of the initial set, confirms the global pattern. GSVD classification of the GBM profiles of an independent set of patients validates the prognostic contribution of the pattern.
PBX1 is a TALE homeodomain transcription factor involved in organogenesis and tumorigenesis. Although it has been shown that ovarian, breast, and melanoma cancer cells depend on PBX1 for cell growth and survival, the molecular mechanism of how PBX1 promotes tumorigenesis remains unclear. Here, we applied an integrated approach by overlapping PBX1 ChIP-chip targets with the PBX1-regulated transcriptome in ovarian cancer cells to identify genes whose transcription was directly regulated by PBX1. We further determined if PBX1 target genes identified in ovarian cancer cells were co-overexpressed with PBX1 in carcinoma tissues. By analyzing TCGA gene expression microarray datasets from ovarian serous carcinomas, we found co-upregulation of PBX1 and a significant number of its direct target genes. Among the PBX1 target genes, a homeodomain protein MEOX1 whose DNA binding motif was enriched in PBX1-immunoprecipitated DNA sequences was selected for functional analysis. We demonstrated that MEOX1 protein interacts with PBX1 protein and inhibition of MEOX1 yields a similar growth inhibitory phenotype as PBX1 suppression. Furthermore, ectopically expressed MEOX1 functionally rescued the PBX1-withdrawn effect, suggesting MEOX1 mediates the cellular growth signal of PBX1. These results demonstrate that MEOX1 is a critical target gene and cofactor of PBX1 in ovarian cancers.
Despite improved outcomes in the past 30 years, less than half of all women diagnosed with epithelial ovarian cancer live five years beyond their diagnosis. Although typically treated as a single disease, epithelial ovarian cancer includes several distinct histological subtypes, such as papillary serous and endometrioid carcinomas. To address whether the morphological differences seen in these carcinomas represent distinct characteristics at the molecular level we analyzed DNA methylation patterns in 11 papillary serous tumors, 9 endometrioid ovarian tumors, 4 normal fallopian tube samples and 6 normal endometrial tissues, plus 8 normal fallopian tube and 4 serous samples from TCGA. For comparison within the endometrioid subtype we added 6 primary uterine endometrioid tumors and 5 endometrioid metastases from uterus to ovary. Data was obtained from 27,578 CpG dinucleotides occurring in or near promoter regions of 14,495 genes. We identified 36 locations with significant increases or decreases in methylation in comparisons of serous tumors and normal fallopian tube samples. Moreover, unsupervised clustering techniques applied to all samples showed three major profiles comprising mostly normal samples, serous tumors, and endometrioid tumors including ovarian, uterine and metastatic origins. The clustering analysis identified 60 differentially methylated sites between the serous group and the normal group. An unrelated set of 25 serous tumors validated the reproducibility of the methylation patterns. In contrast, >1,000 genes were differentially methylated between endometrioid tumors and normal samples. This finding is consistent with a generalized regulatory disruption caused by a methylator phenotype. Through DNA methylation analyses we have identified genes with known roles in ovarian carcinoma etiology, whereas pathway analyses provided biological insight to the role of novel genes. 
Our finding of differences between serous and endometrioid ovarian tumors indicates that intervention strategies could be developed to specifically address subtypes of epithelial ovarian cancer.
Because of the high risk of recurrence in high-grade serous ovarian carcinoma (HGS-OvCa), the development of outcome predictors could be valuable for patient stratification. Using the catalog of The Cancer Genome Atlas (TCGA), we developed subtype and survival gene expression signatures, which, when combined, provide a prognostic model of HGS-OvCa classification, named “Classification of Ovarian Cancer” (CLOVAR). We validated CLOVAR on an independent dataset consisting of 879 HGS-OvCa expression profiles. The worst outcome group, accounting for 23% of all cases, was associated with a median survival of 23 months and a platinum resistance rate of 63%, versus a median survival of 46 months and platinum resistance rate of 23% in other cases. Associating the outcome prediction model with BRCA1/BRCA2 mutation status, residual disease after surgery, and disease stage further optimized outcome classification. Ovarian cancer is a disease in urgent need of more effective therapies. The spectrum of outcomes observed here and their association with CLOVAR signatures suggests variations in underlying tumor biology. Prospective validation of the CLOVAR model in the context of additional prognostic variables may provide a rationale for optimal combination of patient and treatment regimens.
Introduction: Early identification of chemoresistance in patients with ovarian cancer is of utmost importance in order to provide them with the most appropriate therapy. Recently, we described the expression of MyD88 in ovarian cancer cells that were resistant to the cytotoxic agent paclitaxel. In addition to chemoresistance, in MyD88 positive ovarian cancer cells, paclitaxel stimulates growth and production of proinflammatory cytokines. The objective of this study was to determine the correlation of MyD88 expression in primary and recurrent epithelial ovarian cancers with the response to carboplatin and paclitaxel combination chemotherapy. Methods: Tumors are heterogeneous structures that contain different cell populations, thus rendering the identification of specific tumor markers difficult. Using laser capture microdissection, pure cancer cells were isolated from ovarian malignant tumors that were obtained from 20 patients at the time of surgery. The microdissected cells were evaluated for the expression of MyD88, FasL, and XIAP by western blot analysis. Results: Protein expression was observed in samples containing as few as 500 cells. The results were correlated with the clinical course of those patients. It was evident that MyD88 expression in ovarian cancer cells accurately predicts a poor response to paclitaxel chemotherapy as shown by a short progression-free interval and overall survival. Conclusion: We describe for the first time a molecular approach to identify paclitaxel chemoresistance. Toxicity from agents without therapeutic benefit can be avoided by identifying those patients who will not respond to a specific agent. Molecular markers will enable us to design individualized treatments and improve overall survival.
In recent decades, management of epithelial ovarian cancer (EOC) has been based on the staging system of the International Federation of Gynecology and Obstetrics (FIGO), and different classifications have been proposed for EOC that take account of grade of differentiation, histological subtype, and clinical features. However, despite taxonomic efforts, EOC does not appear to be a single disease; its subtypes differ in epidemiological and genetic risk factors, precursor lesions, patterns of spread, response to chemotherapy, and prognosis. Nevertheless, the carboplatin plus paclitaxel combination represents the only standard treatment in adjuvant and advanced settings. This paper summarizes theories about the classification and origin of EOC and classical and new prognostic factors. It presents data about standard treatment and novel agents. We speculate about the possibility of creating tailored therapy based on specific mutations in ovarian cancer and of personalizing prevention.
Cox regression is commonly used to predict outcome from the time to an event of interest and, in addition, to identify relevant features for survival analysis in cancer genomics. Due to the high dimensionality of high-throughput genomic data, existing Cox models trained on any particular dataset usually generalize poorly to other independent datasets. In this paper, we propose a network-based Cox regression model called Net-Cox and apply it to a large-scale survival analysis across multiple ovarian cancer datasets. Net-Cox integrates gene network information into the Cox proportional hazards model to explore the co-expression or functional relation among high-dimensional gene expression features in the gene network. Net-Cox was applied to analyze three independent gene expression datasets including the TCGA ovarian cancer dataset and two other public ovarian cancer datasets. Net-Cox with the network information from gene co-expression or functional relations identified highly consistent signature genes across the three datasets, and because of the better generalization across the datasets, Net-Cox also consistently improved the accuracy of survival prediction over the Cox models regularized by the L2-norm or the L1-norm. This study focused on analyzing the death and recurrence outcomes in the treatment of ovarian carcinoma to identify signature genes that can more reliably predict the events. The signature genes comprise dense protein-protein interaction subnetworks, enriched by extracellular matrix receptors and modulators or by nuclear signaling components downstream of extracellular signal-regulated kinases.
In the laboratory validation of the signature genes, a tumor array experiment by protein staining on an independent patient cohort from Mayo Clinic showed that the protein expression of the signature gene FBN1 is a biomarker significantly associated with the early recurrence after 12 months of the treatment in the ovarian cancer patients who are initially sensitive to chemotherapy. Net-Cox toolbox is available at http://compbio.cs.umn.edu/Net-Cox/.
Network-based computational models are attracting increasing attention in studying cancer genomics because molecular networks provide valuable information on the functional organizations of molecules in cells. Survival analysis, mostly with the Cox proportional hazard model, is widely used to predict or correlate gene expressions with time to an event of interest (outcome) in cancer genomics. Surprisingly, network-based survival analysis has not received enough attention. In this paper, we studied resistance to chemotherapy in ovarian cancer with a network-based Cox model, called Net-Cox. The experiments confirm that networks representing gene co-expression or functional relations can be used to improve the accuracy and the robustness of survival prediction of outcome in ovarian cancer treatment. The study also revealed subnetwork signatures that are enriched by extracellular matrix receptors and modulators and the downstream nuclear signaling components of extracellular signal-regulators, respectively. In particular, FBN1, which was detected as a signature gene of high confidence by Net-Cox with network information, was validated in the laboratory as a biomarker for predicting early recurrence in platinum-sensitive ovarian cancer patients.
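One common way to bring a gene network into a regression model is a graph-Laplacian smoothness penalty, which discourages connected genes from receiving very different coefficients. The sketch below shows that generic penalty only; it is an assumption about the general technique, not the exact Net-Cox objective, and the gene names, edge weights, and coefficients are invented.

```python
def network_penalty(beta, edges):
    """Sum over edges of weight * (beta_i - beta_j)^2.

    This equals beta^T L beta for the unnormalized graph Laplacian L of the
    weighted, undirected network described by `edges`.
    """
    return sum(w * (beta[i] - beta[j]) ** 2 for (i, j), w in edges.items())


# Invented three-gene network: a-b with weight 1.0, b-c with weight 2.0.
edges = {("gene_a", "gene_b"): 1.0, ("gene_b", "gene_c"): 2.0}

# Coefficients that vary smoothly over the network incur a small penalty ...
smooth = {"gene_a": 1.0, "gene_b": 1.0, "gene_c": 0.5}
# ... while coefficients that flip sign between neighbors incur a large one.
rough = {"gene_a": 1.0, "gene_b": -1.0, "gene_c": 0.5}

smooth_penalty = network_penalty(smooth, edges)
rough_penalty = network_penalty(rough, edges)
```

In a penalized Cox fit, a term of this kind would be added (scaled by a tuning parameter) to the negative log partial likelihood, so that the selected signature genes tend to form connected subnetworks rather than isolated features.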
Gene expression patterns characterizing clinically-relevant molecular subgroups of glioblastoma are difficult to reproduce. We suspect a combination of biological and analytic factors confounds interpretation of glioblastoma expression data. We seek to clarify the nature and relative contributions of these factors, to focus additional investigations, and to improve the accuracy and consistency of translational glioblastoma analyses.
We analyzed gene expression and clinical data for 340 glioblastomas in The Cancer Genome Atlas (TCGA). We developed a logic model to analyze potential sources of biological, technical, and analytic variability and used standard linear classifiers and linear dimensional reduction algorithms to investigate the nature and relative contributions of each factor.
Commonly described sources of classification error, including individual sample characteristics, batch effects, and analytic and technical noise, make measurable but proportionally minor contributions to inconsistent molecular classification. Our analysis suggests that three previously underappreciated factors may account for a larger fraction of classification errors: inherent non-linear/non-orthogonal relationships among the genes used in conjunction with classification algorithms that assume linearity; skewed data distributions assumed to be Gaussian; and biologic variability (noise) among tumors, of which we propose three types.
Our analysis of the TCGA data demonstrates a contributory role for technical factors in molecular classification inconsistencies in glioblastoma but also suggests that biological variability, abnormal data distribution, and non-linear relationships among genes may be responsible for a proportionally larger component of classification error. These findings may have important implications for both glioblastoma research and for translational application of other large-volume biological databases.
The Cancer Genome Atlas Project (TCGA) has produced an extensive collection of ‘-omic’ data on glioblastoma (GBM), resulting in several key insights on expression signatures. Despite the richness of TCGA GBM data, the absence of lower grade gliomas in this data set prevents analysis of genes related to progression and the uncovering of predictive signatures. A complementary dataset exists in the form of the NCI Repository for Molecular Brain Neoplasia Data (Rembrandt), which contains molecular and clinical data for diffuse gliomas across the full spectrum of histologic class and grade. Here we present an investigation of the significance of the TCGA consortium's expression classification when applied to Rembrandt gliomas. We demonstrate that the proneural signature predicts improved clinical outcome among 176 Rembrandt gliomas that include all histologies and grades, including GBMs (log rank test p = 1.16e-6), but also among 75 grade II and grade III samples (p = 2.65e-4). This gene expression signature was enriched in tumors with oligodendroglioma histology and also predicted improved survival in this tumor type (n = 43, p = 1.25e-4). Thus, expression signatures identified in the TCGA analysis of GBMs also have intrinsic prognostic value for lower grade oligodendrogliomas, and likely represent important differences in tumor biology with implications for treatment and therapy. Integrated DNA and RNA analysis of low-grade and high-grade proneural gliomas identified increased expression and gene amplification of several genes including GLIS3, TGFB2, TNC, AURKA, and VEGFA in proneural GBMs, with corresponding loss of DLL3 and HEY2. Pathway analysis highlights the importance of the Notch and Hedgehog pathways in the proneural subtype.
This demonstrates that the expression signatures identified in the TCGA analysis of GBMs also have intrinsic prognostic value for low-grade oligodendrogliomas, and likely represent important differences in tumor biology with implications for treatment and therapy.
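The survival comparisons above are reported as log rank test p-values. As an illustration of how such a p-value is obtained for two groups (for example, signature-positive versus signature-negative tumors), here is a minimal sketch of the standard two-sample log-rank test. The function name and the toy inputs are assumptions made for illustration; this is not the analysis code used in the study.

```python
import math
import numpy as np

def logrank_test(time, event, group):
    """Two-sample log-rank test.

    time  : follow-up time per patient
    event : 1 if the event (e.g. death) was observed, 0 if censored
    group : 0 or 1 group label per patient
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    time, event, group = map(np.asarray, (time, event, group))
    obs_minus_exp = 0.0
    var = 0.0
    for t in np.unique(time[event == 1]):       # each distinct event time
        at_risk = time >= t
        n = at_risk.sum()                        # patients at risk
        n1 = (at_risk & (group == 1)).sum()      # at risk in group 1
        died = (time == t) & (event == 1)
        d = died.sum()                           # events at time t
        d1 = (died & (group == 1)).sum()         # events in group 1
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:                                # hypergeometric variance
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / var
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return float(chi2), p_value
```

A small p-value indicates that the observed number of events in one group differs from what would be expected if both groups shared the same survival curve.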
We have profiled promoter DNA methylation alterations in 272 glioblastoma tumors in the context of The Cancer Genome Atlas (TCGA). We found that a distinct subset of samples displays concerted hypermethylation at a large number of loci, indicating the existence of a glioma-CpG Island Methylator Phenotype (G-CIMP). We validated G-CIMP in a set of non-TCGA glioblastomas and low-grade gliomas. G-CIMP tumors belong to the Proneural subgroup, are more prevalent among low-grade gliomas, display distinct copy-number alterations and are tightly associated with IDH1 somatic mutations. Patients with G-CIMP tumors are younger at the time of diagnosis and experience significantly improved outcome. These findings identify G-CIMP as a distinct subset of human gliomas on molecular and clinical grounds.
DNA methylation; glioma; CIMP; IDH1; TCGA
A stage-associated gene expression signature of coordinately expressed genes, including the transcription factor Slug (SNAI2) and other epithelial-mesenchymal transition (EMT) markers has been found present in samples from publicly available gene expression datasets in multiple cancer types, including nonepithelial cancers. The expression levels of the co-expressed genes vary in a continuous and coordinate manner across the samples, ranging from absence of expression to strong co-expression of all genes. These data suggest that tumor cells may pass through an EMT-like process of mesenchymal transition to varying degrees. Here we show that, in glioblastoma multiforme (GBM), this signature is associated with time to recurrence following initial treatment. By analyzing data from The Cancer Genome Atlas (TCGA), we found that GBM patients who responded to therapy and had long time to recurrence had low levels of the signature in their tumor samples (P = 3×10⁻⁷). We also found that the signature is strongly correlated in gliomas with the putative stem cell marker CD44, and is highly enriched among the differentially expressed genes in glioblastomas vs. lower grade gliomas. Our results suggest that long delay before tumor recurrence is associated with absence of the mesenchymal transition signature, raising the possibility that inhibiting this transition might improve the durability of therapy in glioma patients.
Ovarian cancer is one of the most chemotherapy-sensitive solid tumors, with objective response
rates ranging from 60 to 80% even in patients with advanced-stage disease. However, most patients
ultimately relapse and develop resistance to chemotherapy. As a result, the survival rate
for patients with ovarian cancer has not improved over the past 20 years. Resistance to
chemotherapy is a major obstacle to improving the prognosis of patients with ovarian
cancer, and a new strategy is needed.
The mechanisms of chemoresistance were reviewed with a view to overcoming it. Additionally,
the biological characteristics of ovarian cancer and molecular-targeted agents, including
signal-transduction inhibitors and anti-angiogenic agents, were discussed.
Genetic diagnosis of chemosensitivity using drug-resistance genes may be a useful
predictor. Unfortunately, molecular-targeted therapy alone has been insufficient to improve
the prognosis of patients with advanced ovarian cancer. Molecular-targeted therapy
should be combined with conventional cytotoxic agents. When molecular-targeted agents
are used, attention should be paid to the appearance of unexpected adverse effects.
Future research in this field should enable the development of an effective strategy for
overcoming chemoresistance in ovarian cancer.
chemotherapy; molecular targeted agent; ovarian cancer; resistance
Previous reports have implicated an induction of genes in IFN/STAT1 (Interferon/STAT1) signaling in radiation resistant and prosurvival tumor phenotypes in a number of cancer cell lines, and we have hypothesized that upregulation of these genes may be predictive of poor survival outcome and/or treatment response in Glioblastoma Multiforme (GBM) patients. We have developed a list of 8 genes related to IFN/STAT1 that we hypothesize to be predictive of poor survival in GBM patients. Our working hypothesis that over-expression of this gene signature predicts poor survival outcome in GBM patients was confirmed, and in addition, it was demonstrated that the survival model was highly subtype-dependent, with strong dependence in the Proneural subtype and no detected dependence in the Classical and Mesenchymal subtypes. We developed a specific multi-gene survival model for the Proneural subtype in the TCGA (the Cancer Genome Atlas) discovery set which we have validated in the TCGA validation set. In addition, we have performed network analysis in the form of Bayesian Network discovery and Ingenuity Pathway Analysis to further dissect the underlying biology of this gene signature in the etiology of GBM. We theorize that the strong predictive value of the IFN/STAT1 gene signature in the Proneural subtype may be due to chemotherapy and/or radiation resistance induced through prolonged constitutive signaling of these genes during the course of the illness. The results of this study have implications both for better prediction models for survival outcome in GBM and for improved understanding of the underlying subtype-specific molecular mechanisms for GBM tumor progression and treatment response.
High-dimensional datasets can be confounded by variation from technical sources, such as batches. Undetected batch effects can have severe consequences for the validity of a study’s conclusion(s). We evaluate high-throughput RNAseq and miRNAseq as well as DNA methylation and gene expression microarray datasets, mainly from the Cancer Genome Atlas (TCGA) project, with respect to technical and biological annotations. We observe technical bias in these datasets and discuss corrective interventions. We then suggest a general procedure to control study design, detect technical bias using linear regression of principal components, correct for batch effects, and re-evaluate principal components. This procedure is implemented in the R package swamp and as graphical user interface software. In conclusion, high-throughput platforms that generate continuous measurements are sensitive to various forms of technical bias. For such data, monitoring of technical variation is an important analysis step.
data adjustment; batch effect; bias; sample annotation; RNAseq; high-throughput analysis
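The detection step described above, linear regression of principal components on technical annotations, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the swamp package or its API: the function name `pc_batch_association`, the samples-by-features layout, and the use of R² as the association measure are choices made here for illustration.

```python
import numpy as np

def pc_batch_association(data, batch, n_pcs=3):
    """Regress the leading principal components on a batch annotation.

    data  : array of shape (samples, features)
    batch : one batch label per sample
    Returns the R^2 of each principal component explained by batch.
    """
    data = np.asarray(data, dtype=float)
    batch = np.asarray(batch)
    centered = data - data.mean(axis=0)
    # Principal component scores via singular value decomposition
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_pcs] * S[:n_pcs]
    # One-hot design matrix; its columns sum to one, so it spans the intercept
    labels = np.unique(batch)
    design = (batch[:, None] == labels[None, :]).astype(float)
    r2 = []
    for k in range(scores.shape[1]):
        y = scores[:, k]
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        ss_tot = np.sum((y - y.mean()) ** 2)
        r2.append(1.0 - (resid @ resid) / ss_tot)
    return np.array(r2)
```

A component whose variance is largely explained by batch (R² close to 1) points to technical bias. After a correction step, rerunning the same function on the adjusted data corresponds to the "re-evaluate principal components" step of the procedure.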