1.  Time to Recurrence and Survival in Serous Ovarian Tumors Predicted from Integrated Genomic Profiles 
PLoS ONE  2011;6(11):e24709.
Serous ovarian cancer (SeOvCa) is an aggressive disease with differential and often inadequate therapeutic outcome after standard treatment. The Cancer Genome Atlas (TCGA) has provided rich molecular and genetic profiles from hundreds of primary surgical samples. These profiles confirm mutations of TP53 in ∼100% of patients and an extraordinarily complex profile of DNA copy number changes with considerable patient-to-patient diversity. This raises the joint challenge of exploiting all new available datasets and reducing their confounding complexity for the purpose of predicting clinical outcomes and identifying disease relevant pathway alterations. We therefore set out to use multi-data type genomic profiles (mRNA, DNA methylation, DNA copy-number alteration and microRNA) available from TCGA to identify prognostic signatures for the prediction of progression-free survival (PFS) and overall survival (OS).
Methodology/Principal Findings
We implemented a multivariate Cox Lasso model and median time-to-event prediction algorithm and applied it to two datasets integrated from the four genomic data types. We (1) selected features through cross-validation; (2) generated a prognostic index for patient risk stratification; and (3) directly predicted continuous clinical outcome measures, that is, the time to recurrence and survival time. We used Kaplan-Meier p-values, hazard ratios (HR), and concordance probability estimates (CPE) to assess prediction performance, comparing separate and integrated datasets. Data integration resulted in the best PFS signature (withheld data: p-value = 0.008; HR = 2.83; CPE = 0.72).
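As an illustration of step (2), a prognostic index in a Cox-type model is commonly computed as the linear combination of the selected features weighted by their fitted coefficients, and patients are then split into risk groups at a cutoff. The sketch below assumes a median cutoff; the feature names, coefficients, and patient values are hypothetical placeholders and are not taken from the study.

```python
from statistics import median


def prognostic_index(features, coefficients):
    """Linear predictor of a Cox model: sum of coefficient * feature value."""
    return sum(coefficients[name] * features.get(name, 0.0) for name in coefficients)


def stratify_by_median(indices):
    """Label each patient 'high' or 'low' risk relative to the cohort median."""
    cutoff = median(indices.values())
    return {pid: ("high" if value > cutoff else "low") for pid, value in indices.items()}


# Hypothetical non-zero Lasso coefficients (feature names are placeholders).
coefficients = {"mRNA_A": 0.8, "miRNA_B": -0.5, "methylation_C": 0.3}

# Hypothetical standardized feature values for four patients.
patients = {
    "P1": {"mRNA_A": 1.0, "miRNA_B": 0.0, "methylation_C": 1.0},
    "P2": {"mRNA_A": -1.0, "miRNA_B": 1.0, "methylation_C": 0.0},
    "P3": {"mRNA_A": 0.5, "miRNA_B": 0.5, "methylation_C": -1.0},
    "P4": {"mRNA_A": 2.0, "miRNA_B": -1.0, "methylation_C": 0.0},
}

indices = {pid: prognostic_index(values, coefficients) for pid, values in patients.items()}
risk_groups = stratify_by_median(indices)
print(risk_groups)
```

In practice the coefficients would come from the cross-validated Lasso fit, and the cutoff would be fixed on training data before being applied to withheld samples.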
We provide a prediction tool that inputs genomic profiles of primary surgical samples and generates patient-specific predictions for the time to recurrence and survival, along with outcome risk predictions. Using integrated genomic profiles resulted in information gain for prediction of outcomes. Pathway analysis provided potential insights into functional changes affecting disease progression. The prognostic signatures, if prospectively validated, may be useful for interpreting therapeutic outcomes for clinical trials that aim to improve the therapy for SeOvCa patients.
PMCID: PMC3207809  PMID: 22073136
2.  Integrated Analysis of Gene Expression Profiles Associated with Response of Platinum/Paclitaxel-Based Treatment in Epithelial Ovarian Cancer 
PLoS ONE  2012;7(12):e52745.
This study aims to explore gene expression signatures and serum biomarkers to predict intrinsic chemoresistance in epithelial ovarian cancer (EOC).
Patients and Methods
Gene expression profiling data of 322 high-grade EOC cases between 2009 and 2010 in The Cancer Genome Atlas project (TCGA) were used to develop and validate gene expression signatures that could discriminate different responses to first-line platinum/paclitaxel-based treatments. A gene regulation network was then built to further identify hub genes responsible for differential gene expression between the complete response (CR) group and the progressive disease (PD) group. Further, to find more robust serum biomarkers for clinical application, we integrated our gene signatures and gene signatures reported previously to identify secretory protein-encoding genes by searching the DAVID database. Finally, a gene-drug interaction network was constructed by searching the Comparative Toxicogenomics Database (CTD) and the literature.
A 349-gene predictive model and an 18-gene model independent of key clinical features with high accuracy were developed for prediction of chemoresistance in EOC. Among them, ten important hub genes and six critical signaling pathways were identified to have important implications in chemotherapeutic response. Further, ten potential serum biomarkers were identified for predicting chemoresistance in EOC. Finally, we suggested some drugs for individualized treatment.
We have developed the predictive models and serum biomarkers for platinum/paclitaxel response and established the new approach to discover potential serum biomarkers from gene expression profiles. The potential drugs that target hub genes are also suggested.
PMCID: PMC3531383  PMID: 23300757
3.  Integrated Analysis of Gene Expression and Tumor Nuclear Image Profiles Associated with Chemotherapy Response in Serous Ovarian Carcinoma 
PLoS ONE  2012;7(5):e36383.
Small sample sizes used in previous studies result in a lack of overlap between the reported gene signatures for prediction of chemotherapy response. Although morphologic features, especially tumor nuclear morphology, are important for cancer grading, little research has been reported on quantitatively correlating cellular morphology with chemotherapy response, especially in a large data set. In this study, we have used a large population of patients to identify molecular and morphologic signatures associated with chemotherapy response in serous ovarian carcinoma.
Methodology/Principal Findings
A gene expression model that predicts response to chemotherapy is developed and validated using a large-scale data set consisting of 493 samples from The Cancer Genome Atlas (TCGA) and 244 samples from an Australian report. An identified 227-gene signature achieves an overall predictive accuracy of greater than 85% with a sensitivity of approximately 95% and specificity of approximately 70%. The gene signature significantly distinguishes between patients with unfavorable versus favorable prognosis, when applied to either an independent data set (P = 0.04) or an external validation set (P<0.0001). In parallel, we present the production of a tumor nuclear image profile generated from 253 sample slides by characterizing patients with nuclear features (such as size, elongation, and roundness) in incremental bins, and we identify a morphologic signature that demonstrates a strong association with chemotherapy response in serous ovarian carcinoma.
A gene signature discovered on a large data set provides robustness in accurately predicting chemotherapy response in serous ovarian carcinoma. The combination of the molecular and morphologic signatures yields a new understanding of potential mechanisms involved in drug resistance.
PMCID: PMC3348145  PMID: 22590536
4.  Discovery Analysis of TCGA Data Reveals Association between Germline Genotype and Survival in Ovarian Cancer Patients 
PLoS ONE  2013;8(3):e55037.
Ovarian cancer remains a significant public health burden, with the highest mortality rate of all the gynecological cancers. This is attributable to the late stage at which the majority of ovarian cancers are diagnosed, coupled with the low and variable response of advanced tumors to standard chemotherapies. To date, clinically useful predictors of treatment response remain lacking. Identifying the genetic determinants of ovarian cancer survival and treatment response is crucial to the development of prognostic biomarkers and personalized therapies that may improve outcomes for the late-stage patients who comprise the majority of cases.
To identify constitutional genetic variations contributing to ovarian cancer mortality, we systematically investigated associations between germline polymorphisms and ovarian cancer survival using data from The Cancer Genome Atlas Project (TCGA). Using stage-stratified Cox proportional hazards regression, we examined 650,000 SNP loci for association with survival. We additionally examined whether the association of significant SNPs with survival was modified by somatic alterations.
Germline polymorphisms at rs4934282 (AGAP11/C10orf116) and rs1857623 (DNAH14) were associated with stage-adjusted survival (p = 1.12e-07 and 1.80e-07, FDR = 1.2e-04 and 2.4e-04, respectively). A third SNP, rs4869 (C10orf116), was additionally identified as significant in the exome sequencing data; it is in near-perfect LD with rs4934282. The associations with survival remained significant when somatic alterations were taken into account.
Discovery analysis of TCGA data reveals germline genetic variations that may play a role in ovarian cancer survival even among late-stage cases. The significant loci are located near genes previously reported as having a possible relationship to platinum and taxol response. Because the variant alleles at the significant loci are common (frequencies for rs4934282 A/C alleles = 0.54/0.46, respectively; rs1857623 A/G alleles = 0.55/0.45, respectively) and germline variants can be assayed noninvasively, our findings provide potential targets for further exploration as prognostic biomarkers and individualized therapies.
PMCID: PMC3605427  PMID: 23555554
5.  Integrated Analyses of microRNAs Demonstrate Their Widespread Influence on Gene Expression in High-Grade Serous Ovarian Carcinoma 
PLoS ONE  2012;7(3):e34546.
The Cancer Genome Atlas (TCGA) Network recently comprehensively catalogued the molecular aberrations in 487 high-grade serous ovarian cancers, with much remaining to be elucidated regarding the microRNAs (miRNAs). Here, using TCGA ovarian data, we surveyed the miRNAs, in the context of their predicted gene targets.
Methods and Results
Integration of miRNA and gene patterns yielded evidence that proximal pairs of miRNAs are processed from polycistronic primary transcripts, and that intronic miRNAs and their host gene mRNAs derive from common transcripts. Patterns of miRNA expression revealed multiple tumor subtypes and a set of 34 miRNAs predictive of overall patient survival. In a global analysis, miRNA:mRNA pairs anti-correlated in expression across tumors showed a higher frequency of in silico predicted target sites in the mRNA 3′-untranslated region (with less frequency observed for coding sequence and 5′-untranslated regions). The miR-29 family and predicted target genes were among the most strongly anti-correlated miRNA:mRNA pairs; over-expression of miR-29a in vitro repressed several anti-correlated genes (including DNMT3A and DNMT3B) and substantially decreased ovarian cancer cell viability.
This study establishes miRNAs as having a widespread impact on gene expression programs in ovarian cancer, further strengthening our understanding of miRNA biology as it applies to human cancer. As with gene transcripts, miRNAs exhibit high diversity reflecting the genomic heterogeneity within a clinically homogeneous disease population. Putative miRNA:mRNA interactions, as identified using integrative analysis, can be validated. TCGA data are a valuable resource for the identification of novel tumor suppressive miRNAs in ovarian as well as other cancers.
PMCID: PMC3315571  PMID: 22479643
6.  A DNA Repair Pathway–Focused Score for Prediction of Outcomes in Ovarian Cancer Treated With Platinum-Based Chemotherapy 
New tools are needed to predict outcomes of ovarian cancer patients treated with platinum-based chemotherapy. We hypothesized that a molecular score based on expression of genes that are involved in platinum-induced DNA damage repair could provide such prognostic information.
Gene expression data was extracted from The Cancer Genome Atlas (TCGA) database for 151 DNA repair genes from tumors of serous ovarian cystadenocarcinoma patients (n = 511). A molecular score was generated based on the expression of 23 genes involved in platinum-induced DNA damage repair pathways. Patients were divided into low (scores 0–10) and high (scores 11–20) score groups, and overall survival (OS) was analyzed by Kaplan–Meier method. Results were validated in two gene expression microarray datasets. Association of the score with OS was compared with known clinical factors (age, stage, grade, and extent of surgical debulking) using univariate and multivariable Cox proportional hazards models. Score performance was evaluated by receiver operating characteristic (ROC) curve analysis. Correlations between the score and likelihood of complete response, recurrence-free survival, and progression-free survival were assessed. Statistical tests were two-sided.
Improved survival was associated with being in the high-scoring group (high vs low scores: 5-year OS, 40% vs 17%, P < .001), and results were reproduced in the validation datasets (P < .05). The score was the only pretreatment factor that showed a statistically significant association with OS (high vs low scores, hazard ratio of death = 0.46, 95% confidence interval = 0.32 to 0.66, P < .001). ROC curves indicated that the score outperformed the known clinical factors (score in a validation dataset vs clinical factors, area under the curve = 0.65 vs 0.52). The score positively correlated with complete response rate, recurrence-free survival, and progression-free survival (Pearson correlation coefficient [r²] = 0.60, 0.84, and 0.80, respectively; P < .001 for all).
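The Kaplan–Meier method used for the overall survival comparison can be sketched in a few lines. This is a minimal product-limit estimator under simple assumptions (one follow-up time and one event flag per patient, no confidence bands); the follow-up data are hypothetical and unrelated to the TCGA cohort.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # patients leaving the risk set (death or censoring)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up in months for five patients.
times = [6, 12, 12, 20, 30]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2} months, S(t) = {s:.2f}")
```

Running the estimator separately on the high-score and low-score groups would give the two curves whose 5-year values are compared in the abstract.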
The DNA repair pathway–focused score can be used to predict outcomes and response to platinum therapy in ovarian cancer patients.
PMCID: PMC3341307  PMID: 22505474
7.  Exposing the cancer genome atlas as a SPARQL endpoint 
Journal of biomedical informatics  2010;43(6):998-1008.
The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional effort to characterize several types of cancer. Datasets from biomedical domains such as TCGA present a particularly challenging task for those interested in dynamically aggregating its results because the data sources are typically both heterogeneous and distributed. The Linked Data best practices offer a solution to integrate and discover data with those characteristics, namely through exposure of data as Web services supporting SPARQL, the Resource Description Framework query language. Most SPARQL endpoints, however, cannot easily be queried by data experts. Furthermore, exposing experimental data as SPARQL endpoints remains a challenging task because, in most cases, data must first be converted to Resource Description Framework triples. In line with those requirements, we have developed an infrastructure to expose clinical, demographic and molecular data elements generated by TCGA as a SPARQL endpoint by assigning elements to entities of the Simple Sloppy Semantic Database (S3DB) management model. All components of the infrastructure are available as independent Representational State Transfer (REST) Web services to encourage reusability, and a simple interface was developed to automatically assemble SPARQL queries by navigating a representation of the TCGA domain. A key feature of the proposed solution that greatly facilitates assembly of SPARQL queries is the distinction between the TCGA domain descriptors and data elements. Furthermore, the use of the S3DB management model as a mediator enables queries to both public and protected data without the need for prior submission to a single data source.
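To make the idea of automatically assembling SPARQL queries from domain descriptors concrete, the sketch below builds a SELECT query string from an entity class and a list of properties. It is only an illustration: the `ex:` namespace, the `Patient` class, and the property names are hypothetical and do not reflect the actual S3DB vocabulary or the TCGA endpoint described in the paper.

```python
def build_select_query(entity_class, properties, prefix_uri, limit=10):
    """Assemble a SPARQL SELECT query for one entity class and its properties."""
    variables = " ".join(f"?{name}" for name in properties)
    # One triple pattern for the class, then one per requested property.
    patterns = [f"?entity a ex:{entity_class} ."]
    patterns += [f"?entity ex:{name} ?{name} ." for name in properties]
    body = "\n  ".join(patterns)
    return (
        f"PREFIX ex: <{prefix_uri}>\n"
        f"SELECT ?entity {variables}\n"
        f"WHERE {{\n  {body}\n}}\n"
        f"LIMIT {limit}"
    )


# Hypothetical descriptors; a real interface would read these from the domain model.
query = build_select_query(
    entity_class="Patient",
    properties=["tumorStage", "vitalStatus"],
    prefix_uri="http://example.org/tcga#",
)
print(query)
```

Keeping the domain descriptors (class and property names) separate from the data elements is what lets an interface generate such queries by navigation rather than by hand.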
PMCID: PMC3071752  PMID: 20851208
TCGA; SPARQL; RDF; Linked Data; Data integration
Ovarian cancer is often called the ‘silent killer’ because early detection and prognosis are difficult. Understanding the biological mechanisms underlying ovarian cancer is therefore extremely important for treatment. We propose an integrative framework to identify pathway-related networks based on large-scale TCGA copy number data and gene expression profiles. The integrative approach first detects highly conserved copy number altered genes and regards them as seed genes, and then applies a network-based method to identify subnetworks that can differentiate gene expression patterns between different phenotypes of ovarian cancer patients. The identified subnetworks are further validated on an independent gene expression data set using a network-based classification method. The experimental results show that our approach can not only achieve good prediction performance across different data sets, but also identify biologically meaningful subnetworks involved in many signaling pathways related to ovarian cancer.
PMCID: PMC3608394  PMID: 22174260
9.  Corosolic acid enhances the antitumor effects of chemotherapy on epithelial ovarian cancer by inhibiting signal transducer and activator of transcription 3 signaling 
Oncology Letters  2013;6(6):1619-1623.
Resistance to chemotherapy poses a serious problem for the treatment of advanced epithelial ovarian cancer patients. The mechanisms of chemoresistance are complex and studies have implicated signal transducer and activator of transcription 3 (STAT3) signaling in the chemoresistance of cancer cells. The present study investigated whether corosolic acid (CA), which has been previously reported to be a STAT3 inhibitor, was able to increase the sensitivity to chemotherapeutic drugs in epithelial ovarian cancer cells. CA markedly enhanced the anticancer effect of paclitaxel, cisplatin and doxorubicin. In addition, CA abrogated the cell-cell interactions between macrophages and epithelial ovarian cancer cells and inhibited the macrophage-induced activation of epithelial ovarian cancer cells. These data indicated that CA was able to reverse the chemoresistance of epithelial ovarian cancer cells and suppress the cell-cell interaction with tumorigenic macrophages. Thus, CA may be useful as an adjuvant treatment for patients with advanced ovarian and other types of cancer owing to its multiple anticancer effects.
PMCID: PMC3834045  PMID: 24260055
macrophage; signal transducer and activator of transcription 3; ovarian cancer; corosolic acid
10.  Multi-Gene Expression Predictors of Single Drug Responses to Adjuvant Chemotherapy in Ovarian Carcinoma: Predicting Platinum Resistance 
PLoS ONE  2012;7(2):e30550.
Despite advances in radical surgery and chemotherapy delivery, ovarian cancer is the most lethal gynecologic malignancy. Standard therapy includes treatment with platinum-based combination chemotherapies, yet there is no biomarker model to predict patients' responses to these agents. We have developed multi-gene molecular predictors for forecasting patients' responses to individual drugs and independently tested them on a cohort of 55 ovarian cancer patients. To independently validate these molecular predictors, we performed microarray profiling on FFPE tumor samples of 55 ovarian cancer patients (UVA-55) treated with platinum-based adjuvant chemotherapy. Genome-wide chemosensitivity biomarkers were initially discovered from the in vitro drug activities and genomic expression data for carboplatin and paclitaxel, respectively. Multivariate predictors were trained with the cell line data and then evaluated with a historical patient cohort. For the UVA-55 cohort, the carboplatin, taxol, and combination predictors significantly stratified responder patients and non-responder patients (p = 0.019, 0.04, 0.014) with sensitivity = 91%, 96%, 93% and NPV = 57%, 67%, 67% in pathologic clinical response. The combination predictor also demonstrated a significant survival difference between predicted responders and non-responders with a median survival of 55.4 months vs. 32.1 months. Thus, COXEN single- and combination-drug predictors successfully stratified platinum resistance and taxane response in an independent cohort of ovarian cancer patients based on their FFPE tumor samples.
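For readers less familiar with the two metrics reported above, sensitivity and negative predictive value (NPV) follow directly from the counts of a 2×2 table of predicted versus observed response. The sketch below uses hypothetical counts chosen only to illustrate the arithmetic; they are not the UVA-55 results.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual responders that the predictor labels as responders."""
    return true_pos / (true_pos + false_neg)


def negative_predictive_value(true_neg, false_neg):
    """Fraction of predicted non-responders that truly did not respond."""
    return true_neg / (true_neg + false_neg)


# Hypothetical confusion-matrix counts (not taken from the study).
true_pos, false_neg = 18, 2   # responders: predicted responder / predicted non-responder
true_neg, false_pos = 6, 4    # non-responders: predicted non-responder / predicted responder

sens = sensitivity(true_pos, false_neg)
npv = negative_predictive_value(true_neg, false_neg)
print(f"sensitivity = {sens:.0%}, NPV = {npv:.0%}")
```

A high sensitivity combined with a moderate NPV, as in the abstract, means that few responders are missed but a predicted non-responder still has a sizeable chance of responding.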
PMCID: PMC3277593  PMID: 22348014
11.  The Cancer Genome Atlas Pan-Cancer Analysis Project 
Nature genetics  2013;45(10):1113-1120.
Cancer can take hundreds of different forms depending on the location, cell of origin and spectrum of genomic alterations that promote oncogenesis and affect therapeutic response. Although many genomic events with direct phenotypic impact have been identified, much of the complex molecular landscape remains incompletely charted for most cancer lineages. For that reason, The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumours to discover molecular aberrations at the DNA, RNA, protein, and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences, and emergent themes across tumour lineages. The Pan-Cancer initiative compares the first twelve tumour types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumour types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.
PMCID: PMC3919969  PMID: 24071849
12.  Biomarker robustness reveals the PDGF network as driving disease outcome in ovarian cancer patients in multiple studies 
Ovarian cancer causes more deaths than any other gynecological cancer. Identifying the molecular mechanisms that drive disease progress in ovarian cancer is a critical step in providing therapeutics, improving diagnostics, and affiliating clinical behavior with disease etiology. Identification of molecular interactions that stratify prognosis is key in facilitating a clinical-molecular perspective.
The Cancer Genome Atlas has recently made available the molecular characteristics of more than 500 patients. We used the TCGA multi-analysis study, two additional datasets, and a set of computational algorithms that we developed. The computational algorithms are based on methods that identify network alterations and quantify network behavior through gene expression.
We identify a network biomarker that significantly stratifies survival rates in ovarian cancer patients. Interestingly, expression levels of single or sets of genes do not explain the prognostic stratification. The discovered biomarker is composed of the network around the PDGF pathway. The biomarker enables prognosis stratification.
The work presented here demonstrates, through the power of gene-expression networks, the criticality of the PDGF network in driving disease course. In uncovering the specific interactions within the network that drive the phenotype, we catalyze targeted treatment, facilitate prognosis and offer a novel perspective into hidden disease heterogeneity.
PMCID: PMC3298526  PMID: 22236809
13.  Tumor Mutation Burden Forecasts Outcome in Ovarian Cancer with BRCA1 or BRCA2 Mutations 
PLoS ONE  2013;8(11):e80023.
An increased number of single nucleotide substitutions is seen in breast and ovarian cancer genomes carrying disease-associated mutations in BRCA1 or BRCA2. The significance of these genome-wide mutations is unknown. We hypothesize that genome-wide mutation burden mirrors deficiencies in DNA repair and is associated with treatment outcome in ovarian cancer.
Methods and Results
The total number of synonymous and non-synonymous exome mutations (Nmut), and the presence of germline or somatic mutation in BRCA1 or BRCA2 (mBRCA) were extracted from whole-exome sequences of high-grade serous ovarian cancers from The Cancer Genome Atlas (TCGA). Cox regression and Kaplan-Meier methods were used to correlate Nmut with chemotherapy response and outcome. Higher Nmut correlated with a better response to chemotherapy after surgery. In patients with mBRCA-associated cancer, low Nmut was associated with shorter progression-free survival (PFS) and overall survival (OS), independent of other prognostic factors in multivariate analysis. Patients with mBRCA-associated cancers and a high Nmut had remarkably favorable PFS and OS. The association with survival was similar in cancers with either BRCA1 or BRCA2 mutations. In cancers with wild-type BRCA, tumor Nmut was associated with treatment response in patients with no residual disease after surgery.
Tumor Nmut was associated with treatment response and with both PFS and OS in patients with high-grade serous ovarian cancer carrying BRCA1 or BRCA2 mutations. In the TCGA cohort, low Nmut predicted resistance to chemotherapy and shorter PFS and OS, while high Nmut forecast a remarkably favorable outcome in mBRCA-associated ovarian cancer. Our observations suggest that the total mutation burden coupled with BRCA1 or BRCA2 mutations in ovarian cancer is a genomic marker of prognosis and predictor of treatment response. This marker may reflect the degree of deficiency in BRCA-mediated pathways, or the extent of compensation for the deficiency by alternative mechanisms.
PMCID: PMC3827141  PMID: 24265793
14.  Validation of ovarian cancer gene expression signatures for survival and subtype in formalin fixed paraffin embedded tissues 
Gynecologic oncology  2012;129(1):159-164.
Gene expression signatures have been identified for epithelial ovarian cancer survival (TCGA) and intrinsic subtypes (Tothill et al.). One obstacle to clinical translation is that these signatures were developed using frozen tissue, whereas usually only formalin-fixed, paraffin-embedded (FFPE) tissue is available. The aim of this study was to determine if gene expression signatures can be translated to fixed archival tissues.
RNA extracted from FFPE sections from 240 primary ovarian cancers was analyzed by DASL on Illumina BeadChip arrays. Concordance of expression at the individual gene level was assessed by comparing array data from the same cancers (30 frozen samples analyzed on Affymetrix arrays versus FFPE DASL).
The correlation between FFPE and frozen survival signature estimates was 0.774. The TCGA signature using DASL was predictive of survival in 106 advanced stage high grade serous ovarian cancers (median survival 33 versus 60 months, estimated hazard ratio for death 2.30, p=0.0007). Similar to Tothill, we found using DASL that most high grade serous ovarian cancers (102/110, 93%) were assigned to subtypes 1, 2, 4 and 5, whereas most endometrioid, clear cell, mucinous and low grade serous cases (39/57, 68%) were assigned to subtypes 3 and 6 (p<10e-15).
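The agreement between FFPE and frozen signature estimates is summarized above as a single correlation value. The abstract does not say which coefficient was used; the sketch below assumes a Pearson correlation and computes it on hypothetical paired signature scores, purely to show how such a concordance figure is obtained.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical signature scores for the same five tumors on two platforms.
frozen_scores = [0.2, 1.1, -0.4, 0.9, 1.5]
ffpe_scores = [0.1, 0.8, 0.0, 1.2, 1.0]

r = pearson_r(frozen_scores, ffpe_scores)
print(f"r = {r:.3f}")
```

Because a signature score aggregates many probes, its correlation across platforms can be considerably higher than the correlation of any single probe, which is the point made in the conclusion that follows.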
Although individual probe estimates of microarrays may be weakly correlated between FFPE and frozen samples, combinations of probes have robust ability to predict survival and subtype. This suggests that it may be possible to use these signatures for prognostic and predictive purposes as we seek to individualize the treatment of ovarian cancer.
PMCID: PMC3733243  PMID: 23274563
ovarian cancer; gene expression; microarray; survival; histological subtype
15.  Differential Analysis of Ovarian and Endometrial Cancers Identifies a Methylator Phenotype 
PLoS ONE  2012;7(3):e32941.
Despite improved outcomes in the past 30 years, less than half of all women diagnosed with epithelial ovarian cancer live five years beyond their diagnosis. Although typically treated as a single disease, epithelial ovarian cancer includes several distinct histological subtypes, such as papillary serous and endometrioid carcinomas. To address whether the morphological differences seen in these carcinomas represent distinct characteristics at the molecular level we analyzed DNA methylation patterns in 11 papillary serous tumors, 9 endometrioid ovarian tumors, 4 normal fallopian tube samples and 6 normal endometrial tissues, plus 8 normal fallopian tube and 4 serous samples from TCGA. For comparison within the endometrioid subtype we added 6 primary uterine endometrioid tumors and 5 endometrioid metastases from uterus to ovary. Data was obtained from 27,578 CpG dinucleotides occurring in or near promoter regions of 14,495 genes. We identified 36 locations with significant increases or decreases in methylation in comparisons of serous tumors and normal fallopian tube samples. Moreover, unsupervised clustering techniques applied to all samples showed three major profiles comprising mostly normal samples, serous tumors, and endometrioid tumors including ovarian, uterine and metastatic origins. The clustering analysis identified 60 differentially methylated sites between the serous group and the normal group. An unrelated set of 25 serous tumors validated the reproducibility of the methylation patterns. In contrast, >1,000 genes were differentially methylated between endometrioid tumors and normal samples. This finding is consistent with a generalized regulatory disruption caused by a methylator phenotype. Through DNA methylation analyses we have identified genes with known roles in ovarian carcinoma etiology, whereas pathway analyses provided biological insight to the role of novel genes. 
Our finding of differences between serous and endometrioid ovarian tumors indicates that intervention strategies could be developed to specifically address subtypes of epithelial ovarian cancer.
PMCID: PMC3293923  PMID: 22403726
16.  Low MAD2 expression levels associate with reduced progression-free survival in patients with high-grade serous epithelial ovarian cancer 
The Journal of Pathology  2012;226(5):746-755.
Epithelial ovarian cancer (EOC) has an innate susceptibility to become chemoresistant. Up to 30% of patients do not respond to conventional chemotherapy [paclitaxel (Taxol®) in combination with carboplatin] and, of those who have an initial response, many patients relapse. Therefore, an understanding of the molecular mechanisms that regulate cellular chemotherapeutic responses in EOC cells has the potential to impact significantly on patient outcome. The mitotic arrest deficiency protein 2 (MAD2) is a centrally important mediator of the cellular response to paclitaxel. MAD2 immunohistochemical analysis was performed on 82 high-grade serous EOC samples. A multivariate Cox regression analysis of nuclear MAD2 IHC intensity adjusting for stage, tumour grade and optimum surgical debulking revealed that low MAD2 IHC staining intensity was significantly associated with reduced progression-free survival (PFS) (p = 0.0003), with a hazard ratio of 4.689. The in vitro analyses of five ovarian cancer cell lines demonstrated that cells with low MAD2 expression were less sensitive to paclitaxel. Furthermore, paclitaxel-induced activation of the spindle assembly checkpoint (SAC) and apoptotic cell death was abrogated in cells transfected with MAD2 siRNA. In silico analysis identified a miR-433 binding domain in the MAD2 3′ UTR, which was verified in a series of experiments. Firstly, MAD2 protein expression levels were down-regulated in pre-miR-433 transfected A2780 cells. Secondly, pre-miR-433 suppressed the activity of a reporter construct containing the 3′-UTR of MAD2. Thirdly, blocking miR-433 binding to the MAD2 3′ UTR protected MAD2 from miR-433 induced protein down-regulation. Importantly, reduced MAD2 protein expression in pre-miR-433-transfected A2780 cells rendered these cells less sensitive to paclitaxel. In conclusion, loss of MAD2 protein expression results in increased resistance to paclitaxel in EOC cells.
Measuring MAD2 IHC staining intensity may predict paclitaxel responses in women presenting with high-grade serous EOC. Copyright © 2012 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
PMCID: PMC3593171  PMID: 22069160
miR-433; MAD2; chemoresistance; epithelial ovarian cancer; paclitaxel
17.  Identifying survival associated morphological features of triple negative breast cancer using multiple datasets 
Background and objective
Biomarkers for subtyping triple negative breast cancer (TNBC) are needed given the absence of responsive therapy and relatively poor prediction of survival. Morphology of cancer tissues is widely used in clinical practice for stratifying cancer patients, while genomic data are highly effective to classify cancer patients into subgroups. Thus integration of both morphological and genomic data is a promising approach in discovering new biomarkers for cancer outcome prediction. Here we propose a workflow for analyzing histopathological images and integrating them with genomic data for discovering biomarkers for TNBC.
Materials and methods
We developed an image analysis workflow for extracting a large collection of morphological features and applied it to histological images from The Cancer Genome Atlas (TCGA) TNBC samples during the discovery phase (n=44). Strong correlations between salient morphological features and gene expression profiles from the same patients were identified. We then evaluated the same morphological features in predicting survival using a local TNBC cohort (n=143). We further tested the predictive power on patient prognosis of correlated gene clusters using two other public gene expression datasets.
Results and conclusion
Using TCGA data, we identified 48 pairs of significantly correlated morphological features and gene clusters; four morphological features were able to separate the local cohort with significantly different survival outcomes. Gene clusters correlated with these four morphological features further proved to be effective in predicting patient survival using multiple public gene expression datasets. These results suggest the efficacy of our workflow and demonstrate that integrative analysis holds promise for discovering biomarkers of complex diseases.
PMCID: PMC3721170  PMID: 23585272
Triple Negative Breast Cancer; Computational Biology; Image Analysis; Cancer Survival; Biomarker Identification; The Cancer Genome Atlas
18.  The Proneural Molecular Signature Is Enriched in Oligodendrogliomas and Predicts Improved Survival among Diffuse Gliomas 
PLoS ONE  2010;5(9):e12548.
The Cancer Genome Atlas Project (TCGA) has produced an extensive collection of ‘-omic’ data on glioblastoma (GBM), resulting in several key insights on expression signatures. Despite the richness of TCGA GBM data, the absence of lower grade gliomas in this data set prevents analysis of genes related to progression and the uncovering of predictive signatures. A complementary dataset exists in the form of the NCI Repository for Molecular Brain Neoplasia Data (Rembrandt), which contains molecular and clinical data for diffuse gliomas across the full spectrum of histologic class and grade. Here we present an investigation of the significance of the TCGA consortium's expression classification when applied to Rembrandt gliomas. We demonstrate that the proneural signature predicts improved clinical outcome among 176 Rembrandt gliomas that include all histologies and grades, including GBMs (log rank test p = 1.16e-6), but also among 75 grade II and grade III samples (p = 2.65e-4). This gene expression signature was enriched in tumors with oligodendroglioma histology and also predicted improved survival in this tumor type (n = 43, p = 1.25e-4). Thus, expression signatures identified in the TCGA analysis of GBMs also have intrinsic prognostic value for lower grade oligodendrogliomas, and likely represent important differences in tumor biology with implications for treatment and therapy. Integrated DNA and RNA analysis of low-grade and high-grade proneural gliomas identified increased expression and gene amplification of several genes including GLIS3, TGFB2, TNC, AURKA, and VEGFA in proneural GBMs, with corresponding loss of DLL3 and HEY2. Pathway analysis highlights the importance of the Notch and Hedgehog pathways in the proneural subtype. 
This demonstrates that the expression signatures identified in the TCGA analysis of GBMs also have intrinsic prognostic value for low-grade oligodendrogliomas, and likely represent important differences in tumor biology with implications for treatment and therapy.
PMCID: PMC2933229  PMID: 20838435
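The log rank p-values quoted in the abstract above compare survival between two groups of patients. As a minimal sketch of how such a two-group test can be computed (this is not the authors' code; the function name `logrank_test` and the toy inputs are illustrative assumptions), the statistic sums observed minus expected events in one group over all event times and refers it to a chi-square distribution with one degree of freedom:

```python
import math
import numpy as np

def logrank_test(time_a, event_a, time_b, event_b):
    """Two-group log rank test; returns (chi-square statistic, p-value).

    time_*  : follow-up times; event_* : 1 = event observed, 0 = censored.
    Hypothetical helper for illustration only.
    """
    time = np.concatenate([time_a, time_b]).astype(float)
    event = np.concatenate([event_a, event_b]).astype(int)
    in_a = np.concatenate([np.ones(len(time_a), bool), np.zeros(len(time_b), bool)])

    obs_minus_exp, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):       # loop over distinct event times
        at_risk = time >= t
        n = at_risk.sum()                       # subjects at risk in both groups
        n_a = (at_risk & in_a).sum()            # subjects at risk in group A
        died = (time == t) & (event == 1)
        d = died.sum()                          # events at time t, both groups
        d_a = (died & in_a).sum()               # events at time t, group A
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:                               # hypergeometric variance term
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    chi2 = obs_minus_exp ** 2 / var             # assumes var > 0
    p_value = math.erfc(math.sqrt(chi2 / 2))    # upper tail of chi-square, 1 df
    return chi2, p_value

# Toy data: group A fails early, group B fails late.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

The sketch does not guard against the degenerate case in which the variance is zero (for example, a single subject), and in practice an established survival analysis package would normally be used instead.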
19.  Network-based Survival Analysis Reveals Subnetwork Signatures for Predicting Outcomes of Ovarian Cancer Treatment 
PLoS Computational Biology  2013;9(3):e1002975.
Cox regression is commonly used to predict outcome as the time to an event of interest and, in addition, to identify relevant features for survival analysis in cancer genomics. Due to the high dimensionality of high-throughput genomic data, existing Cox models trained on any particular dataset usually generalize poorly to other independent datasets. In this paper, we propose a network-based Cox regression model called Net-Cox and apply it in a large-scale survival analysis across multiple ovarian cancer datasets. Net-Cox integrates gene network information into the Cox proportional hazards model to explore the co-expression or functional relation among high-dimensional gene expression features in the gene network. Net-Cox was applied to analyze three independent gene expression datasets including the TCGA ovarian cancer dataset and two other public ovarian cancer datasets. Net-Cox with the network information from gene co-expression or functional relations identified highly consistent signature genes across the three datasets, and because of the better generalization across the datasets, Net-Cox also consistently improved the accuracy of survival prediction over the Cox models regularized by the L2-norm or the L1-norm. This study focused on analyzing the death and recurrence outcomes in the treatment of ovarian carcinoma to identify signature genes that can more reliably predict the events. The signature genes comprise dense protein-protein interaction subnetworks, enriched by extracellular matrix receptors and modulators or by nuclear signaling components downstream of extracellular signal-regulated kinases. 
In the laboratory validation of the signature genes, a tumor array experiment by protein staining on an independent patient cohort from Mayo Clinic showed that the protein expression of the signature gene FBN1 is a biomarker significantly associated with the early recurrence after 12 months of the treatment in the ovarian cancer patients who are initially sensitive to chemotherapy. Net-Cox toolbox is available at
Author Summary
Network-based computational models are attracting increasing attention in cancer genomics because molecular networks provide valuable information on the functional organization of molecules in cells. Survival analysis, mostly with the Cox proportional hazards model, is widely used to predict or correlate gene expression with time to an event of interest (outcome) in cancer genomics. Surprisingly, network-based survival analysis has not received enough attention. In this paper, we studied resistance to chemotherapy in ovarian cancer with a network-based Cox model, called Net-Cox. The experiments confirm that networks representing gene co-expression or functional relations can be used to improve the accuracy and the robustness of survival prediction of outcome in ovarian cancer treatment. The study also revealed subnetwork signatures that are enriched by extracellular matrix receptors and modulators and the downstream nuclear signaling components of extracellular signal-regulators, respectively. In particular, FBN1, which was detected as a signature gene of high confidence by Net-Cox with network information, was validated in the laboratory as a biomarker for predicting early recurrence in platinum-sensitive ovarian cancer patients.
PMCID: PMC3605061  PMID: 23555212
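The idea of integrating a gene network into a Cox model, as described in the abstract above, can be sketched as a Cox partial likelihood plus a penalty that discourages connected genes from receiving very different coefficients. The following is only an illustrative sketch under stated assumptions, not the Net-Cox toolbox: the graph-Laplacian form of the penalty, the mixing weight `alpha`, the strength `lam`, and all function names are assumptions introduced here, and tied event times are not handled.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2} of a gene network."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return np.eye(len(deg)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def neg_log_partial_likelihood(beta, X, time, event):
    """Cox negative log partial likelihood (assumes no tied event times)."""
    X, time, event = np.asarray(X, float), np.asarray(time, float), np.asarray(event)
    eta = X @ beta                                # linear predictor per patient
    order = np.argsort(-time)                     # longest follow-up first
    eta_s, event_s = eta[order], event[order]
    # Running log-sum-exp gives log of the risk-set sum for each patient.
    log_risk = np.logaddexp.accumulate(eta_s)
    return -np.sum((eta_s - log_risk)[event_s == 1])

def network_cox_objective(beta, X, time, event, L, lam=1.0, alpha=0.5):
    """Partial likelihood loss plus an assumed network/ridge penalty.

    alpha = 1 reduces the penalty to plain ridge (L2); alpha = 0 uses only the network.
    """
    gamma = (1.0 - alpha) * L + alpha * np.eye(len(beta))
    return neg_log_partial_likelihood(beta, X, time, event) + 0.5 * lam * beta @ gamma @ beta

# Toy example: 4 patients, 3 genes, with genes 0 and 1 linked in the network.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
time = np.array([5.0, 8.0, 2.0, 11.0])
event = np.array([1, 0, 1, 1])
L = normalized_laplacian([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
print(network_cox_objective(np.array([0.3, 0.3, -0.1]), X, time, event, L))
```

The objective can be handed to a general-purpose optimizer to estimate the coefficients; cross-validation over `lam` and `alpha` would be the usual way to choose the penalty settings.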
20.  Unique genome-wide map of TCF4 and STAT3 targets using ChIP-seq reveals their association with new molecular subtypes of glioblastoma 
Neuro-Oncology  2013;15(3):279-289.
Aberrant activation of beta-catenin/TCF4 and STAT3 signaling in glioblastoma multiforme (GBM) has been reported. However, the molecular mechanisms related to this process are still poorly understood.
Genome-wide screening of the binding characteristics of the transcription factors TCF4 and STAT3 in GBM cells was performed by chromatin immunoprecipitation sequencing (ChIP-seq) assay. Hierarchical clustering was used to analyze the association of TCF4 and STAT3 coregulated genes with The Cancer Genome Atlas (TCGA) GBM subtypes (classical, mesenchymal, neural, and proneural). New molecular classification of GBM was proposed and validated in Western and Asian populations.
We identified 1250 overlapping putative target genes that were coregulated by TCF4 and STAT3. Further, the coregulated genes had the potential to guide TCGA GBM subtypes. Finally, we proposed a new molecular classification of GBM into 2 subtypes (proneural-like and mesenchymal-like) and showed that the new classification could be applied to both Western and Asian populations. In addition, the GBM response to temozolomide therapy differed depending on its subtype; mesenchymal-like GBM benefited, while there was no benefit for proneural-like GBM.
This is the first comprehensive study to combine a ChIP-seq assay of TCF4 and STAT3 and data mining of patient cohorts to derive molecular subtypes of GBM.
PMCID: PMC3578485  PMID: 23295773
ChIP-seq; glioblastoma; molecular subtype; STAT3; TCF4
21.  New Hypothesis on Pathogenesis of Ovarian Cancer Lead to Future Tailored Approaches 
BioMed Research International  2013;2013:852839.
In recent decades, management of epithelial ovarian cancer (EOC) has been based on the staging system of the International Federation of Gynecology and Obstetrics (FIGO), and different classifications have been proposed for EOC that take account of grade of differentiation, histological subtype, and clinical features. However, despite taxonomic efforts, EOC does not appear to be a single disease; its subtypes differ in epidemiological and genetic risk factors, precursor lesions, patterns of spread, response to chemotherapy, and prognosis. Nevertheless, the carboplatin plus paclitaxel combination represents the only standard treatment in adjuvant and advanced settings. This paper summarizes theories about the classification and origin of EOC and classical and new prognostic factors. It presents data about standard treatment and novel agents. We speculate about the possibility of creating tailored therapy based on specific mutations in ovarian cancer and of personalizing prevention.
PMCID: PMC3766984  PMID: 24063014
22.  GSVD Comparison of Patient-Matched Normal and Tumor aCGH Profiles Reveals Global Copy-Number Alterations Predicting Glioblastoma Multiforme Survival 
PLoS ONE  2012;7(1):e30098.
Despite recent large-scale profiling efforts, the best prognostic predictor of glioblastoma multiforme (GBM) remains the patient's age at diagnosis. We describe a global pattern of tumor-exclusive co-occurring copy-number alterations (CNAs) that is correlated, possibly coordinated with GBM patients' survival and response to chemotherapy. The pattern is revealed by GSVD comparison of patient-matched but probe-independent GBM and normal aCGH datasets from The Cancer Genome Atlas (TCGA). We find that, first, the GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern copy-number variations (CNVs) that occur in the normal human genome (e.g., female-specific X chromosome amplification) and experimental variations (e.g., in tissue batch, genomic center, hybridization date and scanner), without a-priori knowledge of these variations. Second, the pattern includes most known GBM-associated changes in chromosome numbers and focal CNAs, as well as several previously unreported CNAs in 3% of the patients. These include the biochemically putative drug target, cell cycle-regulated serine/threonine kinase-encoding TLK2, the cyclin E1-encoding CCNE1, and the Rb-binding histone demethylase-encoding KDM5A. Third, the pattern provides a better prognostic predictor than the chromosome numbers or any one focal CNA that it identifies, suggesting that the GBM survival phenotype is an outcome of its global genotype. The pattern is independent of age, and combined with age, makes a better predictor than age alone. GSVD comparison of matched profiles of a larger set of TCGA patients, inclusive of the initial set, confirms the global pattern. GSVD classification of the GBM profiles of an independent set of patients validates the prognostic contribution of the pattern.
PMCID: PMC3264559  PMID: 22291905
23.  Identification of PBX1 Target Genes in Cancer Cells by Global Mapping of PBX1 Binding Sites 
PLoS ONE  2012;7(5):e36054.
PBX1 is a TALE homeodomain transcription factor involved in organogenesis and tumorigenesis. Although it has been shown that ovarian, breast, and melanoma cancer cells depend on PBX1 for cell growth and survival, the molecular mechanism of how PBX1 promotes tumorigenesis remains unclear. Here, we applied an integrated approach by overlapping PBX1 ChIP-chip targets with the PBX1-regulated transcriptome in ovarian cancer cells to identify genes whose transcription was directly regulated by PBX1. We further determined whether PBX1 target genes identified in ovarian cancer cells were co-overexpressed with PBX1 in carcinoma tissues. By analyzing TCGA gene expression microarray datasets from ovarian serous carcinomas, we found co-upregulation of PBX1 and a significant number of its direct target genes. Among the PBX1 target genes, the homeodomain protein MEOX1, whose DNA binding motif was enriched in PBX1-immunoprecipitated DNA sequences, was selected for functional analysis. We demonstrated that MEOX1 protein interacts with PBX1 protein and that inhibition of MEOX1 yields a growth inhibitory phenotype similar to that of PBX1 suppression. Furthermore, ectopically expressed MEOX1 functionally rescued the PBX1-withdrawn effect, suggesting MEOX1 mediates the cellular growth signal of PBX1. These results demonstrate that MEOX1 is a critical target gene and cofactor of PBX1 in ovarian cancers.
PMCID: PMC3342315  PMID: 22567123
24.  Targeted treatment of folate receptor-positive platinum-resistant ovarian cancer and companion diagnostics, with specific focus on vintafolide and etarfolatide 
Among the gynecological malignancies, ovarian cancer is the leading cause of mortality in developed countries. Treatment of ovarian cancer is based on surgery integrated with chemotherapy. Platinum-based drugs (cisplatin and carboplatin) comprise the core of first-line chemotherapy for patients with advanced ovarian cancer. Platinum-resistant ovarian cancer can be treated with cytotoxic chemotherapeutics such as paclitaxel, topotecan, PEGylated liposomal doxorubicin, or gemcitabine, but many patients eventually relapse on treatment. Targeted therapies based on agents specifically directed to overexpressed receptors, or to selected molecular targets, may be the future of clinical treatment. In this regard, overexpression of folate receptor-α on the surface of almost all epithelial ovarian cancers makes this receptor an excellent “tumor-associated antigen”. With appropriate use of spacers/linkers, folate-targeted drugs can be distributed within the body, where they preferentially bind to ovarian cancer cells and are released inside their target cells. Here they can exert their desired cytotoxic function. Based on this strategy, 12 years after it was first described, a folate-targeted vinblastine derivative has now reached Phase III clinical trials in ovarian cancer. This review examines the importance of folate targeting, the state of the art of a vinblastine folate-targeted agent (vintafolide) for treating platinum-resistant ovarian cancer, and its diagnostic companion (etarfolatide) as a prognostic agent. Etarfolatide is a valuable noninvasive diagnostic imaging agent with which to select ovarian cancer patient populations that may benefit from this specific targeted therapy.
PMCID: PMC3917542  PMID: 24516337
vintafolide; etarfolatide; platinum-resistant ovarian cancer; targeted therapy; biomarkers; folate receptor
25.  Why Is There a Lack of Consensus on Molecular Subgroups of Glioblastoma? Understanding the Nature of Biological and Statistical Variability in Glioblastoma Expression Data 
PLoS ONE  2011;6(7):e20826.
Gene expression patterns characterizing clinically-relevant molecular subgroups of glioblastoma are difficult to reproduce. We suspect a combination of biological and analytic factors confounds interpretation of glioblastoma expression data. We seek to clarify the nature and relative contributions of these factors, to focus additional investigations, and to improve the accuracy and consistency of translational glioblastoma analyses.
We analyzed gene expression and clinical data for 340 glioblastomas in The Cancer Genome Atlas (TCGA). We developed a logic model to analyze potential sources of biological, technical, and analytic variability and used standard linear classifiers and linear dimensional reduction algorithms to investigate the nature and relative contributions of each factor.
Commonly-described sources of classification error, including individual sample characteristics, batch effects, and analytic and technical noise make measurable but proportionally minor contributions to inconsistent molecular classification. Our analysis suggests that three, previously underappreciated factors may account for a larger fraction of classification errors: inherent non-linear/non-orthogonal relationships among the genes used in conjunction with classification algorithms that assume linearity; skewed data distributions assumed to be Gaussian; and biologic variability (noise) among tumors, of which we propose three types.
Our analysis of the TCGA data demonstrates a contributory role for technical factors in molecular classification inconsistencies in glioblastoma but also suggests that biological variability, abnormal data distribution, and non-linear relationships among genes may be responsible for a proportionally larger component of classification error. These findings may have important implications for both glioblastoma research and for translational application of other large-volume biological databases.
PMCID: PMC3145641  PMID: 21829433

