Related Articles

1.  Survival-Related Profile, Pathways, and Transcription Factors in Ovarian Cancer 
PLoS Medicine  2009;6(2):e1000024.
Ovarian cancer has a poor prognosis due to advanced stage at presentation and either intrinsic or acquired resistance to classic cytotoxic drugs such as platinum and taxoids. Recent large clinical trials with different combinations and sequences of classic cytotoxic drugs indicate that further significant improvement in prognosis by this type of drug is not to be expected. Currently, a large number of drugs targeting dysregulated molecular pathways in cancer cells have been developed and are being introduced in the clinic. A major challenge is to identify those patients who will benefit from drugs targeting these specific dysregulated pathways. The aims of our study were (1) to develop a gene expression profile associated with overall survival in advanced stage serous ovarian cancer, (2) to assess the association of pathways and transcription factors with overall survival, and (3) to validate our identified profile and pathways/transcription factors in an independent set of ovarian cancers.
Methods and Findings
According to a randomized design, profiling of 157 advanced stage serous ovarian cancers was performed in duplicate using ∼35,000 70-mer oligonucleotide microarrays. A continuous predictor of overall survival was built taking into account well-known issues in microarray analysis, such as multiple testing and overfitting. A functional class scoring analysis was utilized to assess pathways/transcription factors for their association with overall survival. The prognostic value of genes that constitute our overall survival profile was validated on a fully independent, publicly available dataset of 118 well-defined primary serous ovarian cancers. Furthermore, functional class scoring analysis was also performed on this independent dataset to assess the similarities with results from our own dataset. An 86-gene overall survival profile discriminated between patients with unfavorable and favorable prognosis (median survival, 19 versus 41 mo, respectively; permutation p-value of log-rank statistic = 0.015) and maintained its independent prognostic value in multivariate analysis. Genes that composed the overall survival profile were also able to discriminate between the two risk groups in the independent dataset. In our dataset 17/167 pathways and 13/111 transcription factors were associated with overall survival, of which 16 and 12, respectively, were confirmed in the independent dataset.
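The permutation p-value of the log-rank statistic mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy survival times, and the number of permutations are all assumptions made for illustration, and the real analysis also involved building the continuous predictor that defines the two risk groups.

```python
import random

def logrank_statistic(times, events, groups):
    """Two-group log-rank chi-square statistic (groups coded 0/1)."""
    observed_minus_expected = 0.0
    variance = 0.0
    # Visit each distinct time at which at least one death occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)          # patients still at risk
        n1 = sum(at_risk)         # of those, patients in group 1
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0

def permutation_p_value(times, events, groups, n_perm=2000, seed=0):
    """Share of label shufflings whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = logrank_statistic(times, events, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if logrank_statistic(times, events, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy cohort: survival in months, 1 = death observed,
# group 1 = unfavorable profile, group 0 = favorable profile.
toy_times = [5, 8, 12, 14, 19, 22, 30, 35, 41, 48, 52, 60]
toy_events = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0]
toy_groups = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_p = permutation_p_value(toy_times, toy_events, toy_groups)
print(f"permutation p-value on toy data: {toy_p:.4f}")
```

Shuffling the group labels while leaving times and censoring indicators fixed gives a reference distribution that does not rely on the chi-square approximation, which is one reason a permutation p-value may be preferred for a data-derived grouping.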
Our study provides new clues to genes, pathways, and transcription factors that contribute to the clinical outcome of serous ovarian cancer and might be exploited in designing new treatment strategies.
Ate van der Zee and colleagues analyze the gene expression profiles of ovarian cancer samples from 157 patients, and identify an 86-gene expression profile that seems to predict overall survival.
Editors' Summary
Ovarian cancer kills more than 100,000 women every year and is one of the most frequent causes of cancer death in women in Western countries. Most ovarian cancers develop when an epithelial cell in one of the ovaries (two small organs in the pelvis that produce eggs) acquires genetic changes that allow it to grow uncontrollably and to spread around the body (metastasize). In its early stages, ovarian cancer is confined to the ovaries and can often be treated successfully by surgery alone. Unfortunately, early ovarian cancer rarely has symptoms so a third of women with ovarian cancer have advanced disease when they first visit their doctor with symptoms that include vague abdominal pains and mild digestive disturbances. That is, cancer cells have spread into their abdominal cavity and metastasized to other parts of the body (so-called stage III and IV disease). The outlook for women diagnosed with stage III and IV disease, which are treated with a combination of surgery and chemotherapy, is very poor. Only 30% of women with stage III, and 5% with stage IV, are still alive five years after their cancer is diagnosed.
Why Was This Study Done?
If the cellular pathways that determine the biological behavior of ovarian cancer could be identified, it might be possible to develop more effective treatments for women with stage III and IV disease. One way to identify these pathways is to use gene expression profiling (a technique that catalogs all the genes expressed by a cell) to compare gene expression patterns in the ovarian cancers of women who survive for different lengths of time. Genes with different expression levels in tumors with different outcomes could be targets for new treatments. For example, it might be worth developing inhibitors of proteins whose expression is greatest in tumors with short survival times. In this study, the researchers develop an expression profile that is associated with overall survival in advanced-stage serous ovarian cancer (more than half of ovarian cancers originate in serous cells, epithelial cells that secrete a watery fluid). The researchers also assess the association of various cellular pathways and transcription factors (proteins that control the expression of other proteins) with survival in this type of ovarian carcinoma.
What Did the Researchers Do and Find?
The researchers analyzed the gene expression profiles of tumor samples taken from 157 patients with advanced stage serous ovarian cancer and used the “supervised principal components” method to build a predictor of overall survival from these profiles and patient survival times. This 86-gene predictor discriminated between patients with favorable and unfavorable outcomes (average survival times of 41 and 19 months, respectively). It also discriminated between groups of patients with these two outcomes in an independent dataset collected from 118 additional serous ovarian cancers. Next, the researchers used “functional class scoring” analysis to assess the association between pathway and transcription factor expression in the tumor samples and overall survival. Seventeen of 167 KEGG pathways (“wiring” diagrams of molecular interactions, reactions and relations involved in cellular processes and human diseases listed in the Kyoto Encyclopedia of Genes and Genomes) were associated with survival, 16 of which were confirmed in the independent dataset. Finally, 13 of 111 analyzed transcription factors were associated with overall survival in the tumor samples, 12 of which were confirmed in the independent dataset.
What Do These Findings Mean?
These findings identify an 86-gene overall survival gene expression profile that seems to predict overall survival for women with advanced serous ovarian cancer. However, before this profile can be used clinically, further validation of the profile and more robust methods for determining gene expression profiles are needed. Importantly, these findings also provide new clues about the genes, pathways and transcription factors that contribute to the clinical outcome of serous ovarian cancer, clues that can now be exploited in the search for new treatment strategies. Finally, these findings suggest that it might eventually be possible to tailor therapies to the needs of individual patients by analyzing which pathways are activated in their tumors and thus improve survival times for women with advanced ovarian cancer.
Additional Information.
Please access these Web sites via the online version of this summary at
This study is further discussed in a PLoS Medicine Perspective by Simon Gayther and Kate Lawrenson
See also a related PLoS Medicine Research Article by Huntsman and colleagues
The US National Cancer Institute provides a brief description of what cancer is and how it develops, and information on all aspects of ovarian cancer for patients and professionals (in English and Spanish)
The UK charity Cancerbackup provides general information about cancer, and more specific information about ovarian cancer
MedlinePlus also provides links to other information about ovarian cancer (in English and Spanish)
The KEGG Pathway database provides pathway maps of known molecular networks involved in a wide range of cellular processes
PMCID: PMC2634794  PMID: 19192944
2.  Regression Analysis of Combined Gene Expression Regulation in Acute Myeloid Leukemia 
PLoS Computational Biology  2014;10(10):e1003908.
Gene expression is a combinatorial function of genetic/epigenetic factors such as copy number variation (CNV), DNA methylation (DM), transcription factor (TF) occupancy, and microRNA (miRNA) post-transcriptional regulation. With the maturing of microarray/sequencing technologies, large amounts of data measuring the genome-wide signals of those factors became available from the Encyclopedia of DNA Elements (ENCODE) and The Cancer Genome Atlas (TCGA). However, there is a lack of an integrative model to take full advantage of these rich yet heterogeneous data. To this end, we developed RACER (Regression Analysis of Combined Expression Regulation), which fits mRNA expression as the response, using as explanatory variables the TF data from ENCODE and the CNV, DM, and miRNA expression signals from TCGA. Briefly, RACER first infers the sample-specific regulatory activities of TFs and miRNAs, which are then used as inputs to infer specific TF/miRNA-gene interactions. Such a two-stage regression framework circumvents a common difficulty in integrating ENCODE data measured in generic cell lines with the sample-specific TCGA measurements. As a case study, we integrated Acute Myeloid Leukemia (AML) data from TCGA and the related TF binding data measured in K562 from ENCODE. As a proof-of-concept, we first verified our model formalism by 10-fold cross-validation on predicting gene expression. We next evaluated RACER on recovering known regulatory interactions, and demonstrated its superior statistical power over existing methods in detecting known miRNA/TF targets. Additionally, we developed a feature selection procedure, which identified 18 regulators whose activities clustered consistently with cytogenetic risk groups. One of the selected regulators is miR-548p, whose inferred targets were significantly enriched for a leukemia-related pathway, implicating its novel role in AML pathogenesis. 
Moreover, survival analysis using the inferred activities identified C-Fos as a potential AML prognostic marker. Together, we provided a novel framework that successfully integrated the TCGA and ENCODE data in revealing AML-specific regulatory program at global level.
Author Summary
Recent studies from The Cancer Genome Atlas (TCGA) showed that most Acute Myeloid Leukemia (AML) patients lack DNA mutations that could explain tumorigenesis, which motivated a systematic approach to elucidate aberrant molecular signatures at the transcriptional and epigenetic levels. Using recently available data from two large consortia, namely the Encyclopedia of DNA Elements and TCGA, we developed a novel computational model to infer the regulatory activities of the expression regulators and their target genes in AML samples. Our analysis revealed 18 regulators whose dysregulation contributed significantly to explaining the global mRNA expression changes. Encouragingly, the inferred activities of these regulatory features followed a pattern consistent with the cytogenetic phenotypes of the AML patients. Among these regulators, we identified microRNA hsa-miR-548p, whose regulatory relationships with leukemia-related genes including YY1 suggest its novel role in AML pathogenesis. Additionally, we discovered that the inferred activities of transcription factor C-Fos can be used as a prognostic marker to characterize the survival rate of AML patients. Together, we demonstrated an effective model that can integrate useful information from a large amount of heterogeneous data to dissect regulatory effects. Furthermore, the novel biological findings from this study may be constructive to future experimental research in AML.
PMCID: PMC4207489  PMID: 25340776
3.  Optimal chemotherapy treatment for women with recurrent ovarian cancer 
Current Oncology  2007;14(5):195-208.
What is the optimal chemotherapy treatment for women with recurrent ovarian cancer who have previously received platinum-based chemotherapy?
Currently, standard primary therapy for advanced disease involves a combination of maximal cytoreductive surgery and chemotherapy with carboplatin plus paclitaxel or with carboplatin alone. Despite initial high response rates, a large proportion of patients relapse, resulting in a therapeutic challenge. Because these patients are not curable, the goal of therapy becomes improvement in both quality and length of life. The search has therefore been to find active agents for women with recurrent disease following platinum-based chemotherapy.
Outcomes of interest included any combination of tumour response rate, progression-free survival, overall survival, adverse events, and quality of life.
The MEDLINE, EMBASE, and Cochrane Library databases were systematically searched for primary articles and practice guidelines. The resulting evidence informed the development of clinical practice recommendations. The systematic review and recommendations were approved by the Report Approval Panel of the Program in Evidence-Based Care, and by the Gynecology Cancer Disease Site Group (DSG). The practice guideline was externally reviewed by a sample of practitioners from Ontario, Canada.
Thirteen randomized trials compared various chemotherapy regimens for patients with recurrent ovarian cancer.
In five of the thirteen trials in which 100% of patients were considered sensitive to platinum-containing chemotherapy, further platinum-based combination chemotherapy significantly improved response rates (two trials), progression-free survival (four trials), and overall survival (three trials) when compared with single-agent chemotherapy involving carboplatin or paclitaxel. Only two of these randomized trials compared the same chemotherapy regimens: carboplatin alone versus the combination of carboplatin and paclitaxel. Both trials were consistent in reporting improved survival outcomes with the combination of carboplatin and paclitaxel. In one trial, the combination of carboplatin and gemcitabine resulted in significantly higher response rates and improved progression-free survival when compared with carboplatin alone. Median survival with carboplatin alone ranged from 17 months to 24 months in four trials.
In eight of the thirteen trials in which 35%–100% of patients had platinum-refractory or -resistant disease, one trial reported a statistically significant 2-month improvement in overall survival with liposomal doxorubicin as compared with topotecan (15 months vs. 13 months, p = 0.038; hazard ratio: 1.23; 95% confidence interval: 1.01 to 1.50). Because of the limited clinical benefit and the unusual finding that a survival difference emerged only after a year of treatment with no corresponding improvement in the rate of response or of progression-free survival, the authors of that trial concluded that further confirmation by results from randomized trials was needed to establish the superiority of one agent over the other. In one trial, topotecan was superior to treosulphan in progression-free survival by a span of approximately 2 months (5.4 months vs. 3.0 months, p < 0.001).
Toxicity was reported in all of the randomized trials, and although data on adverse events varied by treatment regimen, the observed adverse events correlated with known toxicity profiles. As expected, combination chemotherapy was associated with higher rates of adverse events.
Practice Guideline
Target Population
This clinical recommendation applies to women with recurrent epithelial ovarian cancer who have previously received platinum-based chemotherapy. Of specific interest are women who have previously shown sensitivity to platinum therapy and those who previously were refractory or resistant to platinum-based chemotherapy. As a general categorization within what is actually a continuum, “platinum sensitivity” refers to disease recurrence 6 months or more after prior platinum-containing chemotherapy, and “platinum resistance” refers to a response to platinum-based chemotherapy followed by relapse less than 6 months after chemotherapy is stopped. “Platinum-refractory disease” refers to a lack of response or to progression while on platinum-based chemotherapy.
Although the body of evidence that informs the clinical recommendations is based on randomized trial data, those data are incomplete. Based on the available data and expert consensus opinion, the Gynecology Cancer DSG makes these recommendations:
Systemic therapy for recurrent ovarian cancer is not curative. It is therefore recognized that each patient must be individually assessed to determine optimal therapy in terms of recurrence, sensitivity to platinum, toxicity, ease of administration, and patient preference. All suitable patients should be offered the opportunity to participate in randomized trials, if available.
In the absence of contraindications, combination platinum-based chemotherapy should be considered for patients with prior sensitivity to platinum-containing chemotherapy. As compared with carboplatin alone, the combination of carboplatin and paclitaxel significantly improved both progression-free and overall survival.
If combination platinum-based chemotherapy is not indicated, then a single platinum agent should be considered. Carboplatin has demonstrated efficacy across trials and has a manageable toxicity profile.
If a single platinum agent is not being considered, then monotherapy with paclitaxel, topotecan, or pegylated liposomal doxorubicin is seen as a reasonable treatment option.
Some patients may be repeatedly sensitive to treatment and may benefit from multiple lines of chemotherapy.
For patients with platinum-refractory or platinum-resistant disease, the goals of treatment should be to improve quality of life by extending the symptom-free interval, by reducing symptom intensity, and by increasing progression-free interval, and, if possible, to prolong life.
With non-platinum agents, monotherapy should be considered because no advantage appears to accrue to the use of non-platinum-containing combination chemotherapy in this group of patients. Single-agent paclitaxel, topotecan, or pegylated liposomal doxorubicin have demonstrated activity in this patient population and are reasonable treatment options.
No evidence either supports or refutes the use of more than one line of chemotherapy in patients with platinum-refractory or platinum-resistant recurrence. Many treatment options have shown modest response rates, but their benefits over best supportive care have not been studied in clinical trials.
PMCID: PMC2002482  PMID: 17938703
Chemotherapy; drug therapy; ovarian cancer; ovarian neoplasms; practice guideline; systematic review
4.  Network-based Survival Analysis Reveals Subnetwork Signatures for Predicting Outcomes of Ovarian Cancer Treatment 
PLoS Computational Biology  2013;9(3):e1002975.
Cox regression is commonly used to predict the outcome by the time to an event of interest and, in addition, to identify relevant features for survival analysis in cancer genomics. Due to the high dimensionality of high-throughput genomic data, existing Cox models trained on any particular dataset usually generalize poorly to other independent datasets. In this paper, we propose a network-based Cox regression model called Net-Cox and apply it in a large-scale survival analysis across multiple ovarian cancer datasets. Net-Cox integrates gene network information into the Cox proportional hazards model to explore the co-expression or functional relations among high-dimensional gene expression features in the gene network. Net-Cox was applied to analyze three independent gene expression datasets including the TCGA ovarian cancer dataset and two other public ovarian cancer datasets. Net-Cox with the network information from gene co-expression or functional relations identified highly consistent signature genes across the three datasets, and because of the better generalization across the datasets, Net-Cox also consistently improved the accuracy of survival prediction over the Cox models regularized by the L2 or L1 norm. This study focused on analyzing the death and recurrence outcomes in the treatment of ovarian carcinoma to identify signature genes that can more reliably predict the events. The signature genes comprise dense protein-protein interaction subnetworks, enriched by extracellular matrix receptors and modulators or by nuclear signaling components downstream of extracellular signal-regulated kinases. 
In the laboratory validation of the signature genes, a tumor array experiment by protein staining on an independent patient cohort from Mayo Clinic showed that the protein expression of the signature gene FBN1 is a biomarker significantly associated with the early recurrence after 12 months of the treatment in the ovarian cancer patients who are initially sensitive to chemotherapy. Net-Cox toolbox is available at
Author Summary
Network-based computational models are attracting increasing attention in studying cancer genomics because molecular networks provide valuable information on the functional organizations of molecules in cells. Survival analysis mostly with the Cox proportional hazard model is widely used to predict or correlate gene expressions with time to an event of interest (outcome) in cancer genomics. Surprisingly, network-based survival analysis has not received enough attention. In this paper, we studied resistance to chemotherapy in ovarian cancer with a network-based Cox model, called Net-Cox. The experiments confirm that networks representing gene co-expression or functional relations can be used to improve the accuracy and the robustness of survival prediction of outcome in ovarian cancer treatment. The study also revealed subnetwork signatures that are enriched by extracellular matrix receptors and modulators and the downstream nuclear signaling components of extracellular signal-regulators, respectively. In particular, FBN1, which was detected as a signature gene of high confidence by Net-Cox with network information, was validated as a biomarker for predicting early recurrence in platinum-sensitive ovarian cancer patients in laboratory.
PMCID: PMC3605061  PMID: 23555212
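To make the idea of adding network information to a Cox model concrete, here is a minimal sketch of a Cox negative log partial likelihood with a graph-smoothness penalty. The abstract does not give the exact Net-Cox objective, so the penalty form (a sum of weighted squared coefficient differences over network edges, equivalent to a quadratic form in the unnormalized graph Laplacian), the function names, and the toy data are all assumptions for illustration only.

```python
import math

def neg_log_partial_likelihood(beta, X, times, events):
    """Cox negative log partial likelihood (Breslow form for tied times)."""
    scores = [sum(b * x for b, x in zip(beta, row)) for row in X]
    nll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients only enter through risk sets
        risk_sum = sum(math.exp(s) for s, t in zip(scores, times) if t >= t_i)
        nll -= scores[i] - math.log(risk_sum)
    return nll

def network_penalty(beta, edges):
    """Smoothness penalty: connected genes are pushed toward similar weights.

    `edges` holds (gene_i, gene_j, weight) tuples, one per undirected edge.
    """
    return sum(w * (beta[i] - beta[j]) ** 2 for i, j, w in edges)

def network_cox_objective(beta, X, times, events, edges, lam):
    """Penalized objective to be minimized over beta (lam >= 0)."""
    return (neg_log_partial_likelihood(beta, X, times, events)
            + lam * network_penalty(beta, edges))

# Hypothetical toy data: 3 patients, 2 genes, one network edge between genes.
toy_X = [[0.5, 0.4], [1.2, 1.0], [-0.3, -0.1]]
toy_times = [3, 1, 2]
toy_events = [1, 1, 0]
toy_edges = [(0, 1, 1.0)]
print(network_cox_objective([0.2, 0.1], toy_X, toy_times, toy_events,
                            toy_edges, lam=0.5))
```

The design point is that a plain L2 penalty shrinks every coefficient independently, whereas an edge-based penalty shares strength between genes that are co-expressed or functionally related, which is the kind of behavior the abstract credits for better generalization across datasets.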
5.  High level of chromosomal aberration in ovarian cancer genome correlates with poor clinical outcome 
Gynecologic oncology  2012;128(3):500-505.
Structural aberration in chromosomes characterizes almost all human solid cancers and analysis of those alterations may reveal the history of chromosomal instability. However, the clinical significance of massive chromosomal abnormality in ovarian high-grade serous carcinoma (HGSC) remains elusive. In this study, we addressed this issue by analyzing the genomic profiles in 455 ovarian HGSCs available from The Cancer Genome Atlas (TCGA).
DNA copy number, mRNA expression, and clinical information were downloaded from the TCGA data portal. A chromosomal disruption index (CDI) was developed to summarize the extent of copy number aberrations across the entire genome. A Cox regression model was applied to identify factors associated with poor prognosis. Genes whose expression was associated with CDI were identified by a 2-stage multivariate linear regression and were used to find enriched pathways by Ingenuity Pathway Analysis.
Multivariate survival analysis showed that a higher CDI was significantly associated with a worse overall survival in patients. Interestingly, the pattern of DNA copy number alterations across all the chromosomes was similar between tumors with high and low CDI, suggesting they did not arise from different mechanisms. We also observed that expression of several genes was highly correlated with the CDI, even after adjusting for local copy number variation. We found that molecular pathways involving DNA damage response and mitosis were significantly enriched in these CDI-correlated genes.
Our results provide a new insight into the role of chromosomal rearrangement in the development of HGSC and the promise of applying CDI in risk-stratifying HGSC patients, perhaps for different clinical managements. The genes whose expression is correlated with CDI are worthy of further study to elucidate the mechanism of chromosomal instability in HGSC.
PMCID: PMC4364416  PMID: 23200914
Chromosomal instability; Ovarian cancer; High-grade serous carcinoma
6.  TP53 oncomorphic mutations predict resistance to platinum- and taxane-based standard chemotherapy in patients diagnosed with advanced serous ovarian carcinoma 
International Journal of Oncology  2014;46(2):607-618.
Individual mutations in the tumor suppressor TP53 alter p53 protein function. Some mutations create a non-functional protein, whereas others confer oncogenic activity, which we term ‘oncomorphic’. Since mutations in TP53 occur in nearly all ovarian tumors, the objective of this study was to determine the relationship of oncomorphic TP53 mutations with patient outcomes in advanced serous ovarian cancer patients. Clinical and molecular data from 264 high-grade serous ovarian cancer patients uniformly treated with standard platinum- and taxane-based adjuvant chemotherapy were downloaded from The Cancer Genome Atlas (TCGA) portal. Additionally, patient samples were obtained from the University of Iowa and individual mutations were analyzed in ovarian cancer cell lines. Mutations in TP53 were annotated and categorized as oncomorphic, loss of function (LOF), or unclassified. Associations between mutation types, chemoresistance, recurrence, and progression-free survival (PFS) were calculated. Oncomorphic TP53 mutations were present in 21.3% of ovarian cancers in the TCGA dataset. Patients with oncomorphic TP53 mutations demonstrated significantly worse PFS, a 60% higher risk of recurrence (HR=1.60, 95% confidence interval 1.09–2.33, p=0.015), and higher rates of platinum resistance (χ2 test p=0.0024) when compared with single nucleotide mutations not categorized as oncomorphic. Furthermore, tumors containing oncomorphic TP53 mutations displayed unique protein expression profiles, and some mutations conferred increased clonogenic capacity in ovarian cancer cell models. Our study reveals that oncomorphic TP53 mutations are associated with worse patient outcome. These data suggest that future studies should take into consideration the functional consequences of TP53 mutations when determining treatment options.
PMCID: PMC4277253  PMID: 25385265
oncomorphic p53 mutation; TP53; gain-of-function; ovarian cancer; chemoresistance
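As a worked check on how the hazard ratio, its confidence interval, and the p-value reported in the abstract above relate to one another, the sketch below recovers the standard error of the log hazard ratio from the interval bounds and derives a two-sided Wald p-value. It assumes the interval is a standard Wald interval on the log scale; the function name is hypothetical and the abstract does not say which test produced the reported p-value.

```python
import math

def wald_summary_from_ci(hr, lower, upper, z_crit=1.959964):
    """Back out SE(log HR), the z-score, and a two-sided p-value from a 95% CI."""
    log_hr = math.log(hr)
    # A Wald interval is log(HR) +/- z_crit * SE, so its width gives the SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values reported for recurrence risk with oncomorphic TP53 mutations.
se, z, p = wald_summary_from_ci(1.60, 1.09, 2.33)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.4f}")

# On the log scale the interval should be centered on the point estimate,
# so the geometric mean of the bounds should sit close to the reported HR.
print(f"geometric mean of bounds = {math.sqrt(1.09 * 2.33):.2f}")
```

Under these assumptions the derived p-value lands close to the reported p=0.015, which suggests the three reported numbers are mutually consistent.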
7.  Identifying survival associated morphological features of triple negative breast cancer using multiple datasets 
Background and objective
Biomarkers for subtyping triple negative breast cancer (TNBC) are needed given the absence of responsive therapy and relatively poor prediction of survival. Morphology of cancer tissues is widely used in clinical practice for stratifying cancer patients, while genomic data are highly effective to classify cancer patients into subgroups. Thus integration of both morphological and genomic data is a promising approach in discovering new biomarkers for cancer outcome prediction. Here we propose a workflow for analyzing histopathological images and integrate them with genomic data for discovering biomarkers for TNBC.
Materials and methods
We developed an image analysis workflow for extracting a large collection of morphological features and deployed the same on histological images from The Cancer Genome Atlas (TCGA) TNBC samples during the discovery phase (n=44). Strong correlations between salient morphological features and gene expression profiles from the same patients were identified. We then evaluated the same morphological features in predicting survival using a local TNBC cohort (n=143). We further tested the predictive power on patient prognosis of correlated gene clusters using two other public gene expression datasets.
Results and conclusion
Using TCGA data, we identified 48 pairs of significantly correlated morphological features and gene clusters; four morphological features were able to separate the local cohort with significantly different survival outcomes. Gene clusters correlated with these four morphological features further proved to be effective in predicting patient survival using multiple public gene expression datasets. These results suggest the efficacy of our workflow and demonstrate that integrative analysis holds promise for discovering biomarkers of complex diseases.
PMCID: PMC3721170  PMID: 23585272
Triple Negative Breast Cancer; Computational Biology; Image Analysis; Cancer Survival; Biomarker Identification; The Cancer Genome Atlas
8.  Comparing gene expression data from formalin-fixed, paraffin embedded tissues and qPCR with that from snap-frozen tissue and microarrays for modeling outcomes of patients with ovarian carcinoma 
Previously, we used clinical and gene expression data from The Cancer Genome Atlas (TCGA) to model a pathway-based index predicting outcomes in ovarian carcinoma. These data were obtained from snap-frozen tissue measured with the Affymetrix U133 platform. In the current study, we correlate the data used for modeling with data derived by TaqMan qPCR from both snap-frozen and formalin-fixed, paraffin-embedded (FFPE) samples.
To compare the effect of preservation methods on gene expression measured by qPCR, we assessed 18 patient and tumor sample matched snap-frozen and FFPE ovarian carcinoma samples. To compare gene measurement technologies, we correlated qPCR data from 10 patients with tumor sample matched snap-frozen ovarian carcinoma samples with the microarray data from TCGA. We normalized results to the average expression of three housekeeping genes. We scaled and centered the data for comparison to the Affymetrix output.
For the 18 specimens, gene expression data obtained from snap-frozen tissue correlated highly with that from FFPE samples in our TaqMan assay (r > 0.82). For the 10 duplicate TCGA specimens, the reported microarray data correlated well (r = 0.6) with our qPCR data, and ranges of expression along pathways were similar.
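The normalization and comparison steps described above can be sketched as follows. The abstract states only that results were normalized to the average of three housekeeping genes and then scaled and centered; expressing that as a delta-Ct subtraction followed by a z-score, along with every Ct value and function name below, is an assumption for illustration.

```python
from statistics import mean, stdev

def delta_ct(target_ct, housekeeping_cts):
    """Target Ct relative to the mean Ct of the housekeeping genes.

    Lower delta-Ct means higher relative expression.
    """
    return target_ct - mean(housekeeping_cts)

def center_and_scale(values):
    """Z-score a list of values: subtract the mean, divide by the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical Ct values for one target gene in five matched tumor samples.
frozen_target = [24.1, 26.3, 22.8, 25.0, 27.4]
ffpe_target = [26.0, 28.5, 25.1, 26.7, 29.9]
# Hypothetical Ct values for three housekeeping genes in each sample.
frozen_hk = [[18.0, 19.1, 20.2], [18.4, 19.0, 20.5], [17.9, 18.8, 20.0],
             [18.2, 19.3, 20.1], [18.5, 19.2, 20.6]]
ffpe_hk = [[20.1, 21.0, 22.3], [20.6, 21.2, 22.4], [19.8, 20.9, 22.0],
           [20.0, 21.1, 22.2], [20.7, 21.4, 22.8]]

frozen_dct = [delta_ct(t, hk) for t, hk in zip(frozen_target, frozen_hk)]
ffpe_dct = [delta_ct(t, hk) for t, hk in zip(ffpe_target, ffpe_hk)]
frozen_z = center_and_scale(frozen_dct)
ffpe_z = center_and_scale(ffpe_dct)
r = pearson_r(frozen_z, ffpe_z)
print(f"Pearson r between snap-frozen and FFPE (toy data): {r:.2f}")
```

Because Pearson correlation is unchanged by centering and scaling, the z-scoring step matters mainly for putting qPCR values on a scale comparable to the microarray output rather than for the correlation itself.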
Gene expression data obtained by qPCR from FFPE serous ovarian carcinoma samples can be used for assessment in the pathway-based predictive model. The normalization procedures described control for variations in expression, and the range calculated along a specific pathway can be interpreted for a patient’s risk profile.
Electronic supplementary material
The online version of this article (doi:10.1186/s12907-015-0017-1) contains supplementary material, which is available to authorized users.
PMCID: PMC4582729  PMID: 26412982
Neuro-Oncology  2014;16(Suppl 3):iii8.
BACKGROUND: Glioma sphere-forming cells (GSCs) derived from surgical specimens are a fundamental resource to study glioblastoma (GBM) biology. Mesenchymal-expressing GSCs have been proposed as a source of treatment resistance and mesenchymal tumors correlate with poorer survival. Recently, we found that the anti-angiogenesis drug bevacizumab appeared to provide no benefit to patients with mesenchymal tumors, in contradiction to expectations that a mesenchymal microenvironment may benefit from anti-angiogenesis therapy. We have developed a collection of GSCs that have undergone comprehensive genomic characterization, similar to that performed by The Cancer Genome Atlas (TCGA) for whole tumor specimens. We hypothesized that the genomic landscape of GSCs would recapitulate what was observed by TCGA. METHODS: 47 GSCs were obtained from primary culture of fresh tumor specimens obtained at surgery and cultured as 3-dimensional spheres in the absence of serum. All lines were subjected to RNAseq (75bp paired-end, 100X coverage), copy number analysis (Affymetrix Oncoscan 2.0), whole methylome (Illumina Infinium 450k bead array), and targeted resequencing of known cancer-associated genes. Whole exome sequencing was performed for 22 GSCs. Gene expression was determined by reads per kilobase per million (RPKM) using an RNA sequencing data analysis pipeline (PRADA) and somatic mutations identified by a commonly used detection method (MuTect). Consensus clustering based on non-negative matrix factorization (CNMF) was performed on expression data and correlation to TCGA clusters determined by single-sample gene set enrichment analysis (ssGSEA). RESULTS: While global copy number alterations such as gain of chromosome 7 at the EGFR locus or loss of chromosome 10 at the PTEN locus were shared between tumor and matched GSC, the rate of somatic events was significantly higher in GSCs compared to tumors (range 47-570, median 124 vs range 2-255, median 65). 
Optimization of CNMF identified a total of 5 gene-expression clusters. GSCs in only one of these clusters showed enrichment for a unique TCGA class, mesenchymal. GSCs in other clusters were divided among multiple TCGA classes. CONCLUSIONS: Mesenchymal glioblastomas are derived from mesenchymal GSCs, suggesting that the tumor component is the largest contributor to the aggressive biology of this subtype. GSCs from other tumor subtypes correlate to multiple TCGA classes, suggesting that tumor stroma may contribute to the expression phenotype in those cases. Therapeutics targeting the microenvironment, such as anti-angiogenesis drugs, may have a greater role in non-mesenchymal tumors where the stromal contribution is more prominent. SECONDARY CATEGORY: Neuropathology & Tumor Biomarkers.
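The RPKM unit named in the methods above follows a standard definition: read count scaled by gene length in kilobases and by library size in millions of mapped reads. The sketch below only illustrates that arithmetic; it is not the PRADA pipeline, and the function name and the numbers are made up for the example.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Standard definition: reads / (length in kb) / (mapped reads in millions),
    which simplifies to reads * 1e9 / (length in bp * mapped reads).
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000 bp transcript in a 25-million-read library.
example_value = rpkm(500, 2_000, 25_000_000)
print(example_value)
```

Because the value is normalized by both length and depth, it can be compared across genes within a sample, although comparisons across samples usually call for additional normalization.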
PMCID: PMC4144493
10.  Network modeling of the transcriptional effects of copy number aberrations in glioblastoma 
DNA copy number aberrations (CNAs) are a characteristic feature of cancer genomes. In this work, Rebecka Jörnsten, Sven Nelander and colleagues combine network modeling and experimental methods to analyze the systems-level effects of CNAs in glioblastoma.
We introduce a modeling approach termed EPoC (Endogenous Perturbation analysis of Cancer), enabling the construction of global, gene-level models that causally connect gene copy number with expression in glioblastoma. On the basis of the resulting model, we predict genes that are likely to be disease-driving and validate selected predictions experimentally. We also demonstrate that further analysis of the network model by sparse singular value decomposition allows stratification of patients with glioblastoma into short-term and long-term survivors, introducing decomposed network models as a useful principle for biomarker discovery. Finally, in systematic comparisons, we demonstrate that EPoC is computationally efficient and yields more consistent results than mRNA-only methods, standard eQTL methods, and two recent multivariate methods for genotype–mRNA coupling.
Gains and losses of chromosomal material (DNA copy number aberrations; CNAs) are a characteristic feature of cancer genomes. At the level of a single locus, it is well known that increased copy number (gene amplification) typically leads to increased gene expression, whereas decreased copy number (gene deletion) leads to decreased gene expression (Pollack et al, 2002; Lee et al, 2008; Nilsson et al, 2008). However, CNAs also affect the expression of genes located outside the amplified/deleted region itself via indirect mechanisms. To fully understand the action of CNAs, it is therefore necessary to analyze their action in a network context. Toward this goal, improved computational approaches will be important, if not essential.
To determine the global effects on transcription of CNAs in the brain tumor glioblastoma, we develop EPoC (Endogenous Perturbation analysis of Cancer), a computational technique capable of inferring sparse, causal network models by combining genome-wide, paired CNA- and mRNA-level data. EPoC aims to detect disease-driving copy number aberrations and their effect on target mRNA expression, and stratify patients into long-term and short-term survivors. Technically, EPoC relates CNA perturbations to mRNA responses by matrix equations, derived from a steady-state approximation of the transcriptional network. Patient prognostic scores are obtained from singular value decompositions of the network matrix. The models are constructed by solving a large-scale, regularized regression problem.
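The two steps described above, a regularized regression from copy number to expression followed by a singular value decomposition of the fitted network matrix, can be illustrated with a minimal sketch. This is not the EPoC implementation: it swaps in a closed-form ridge penalty in place of the sparse penalty the authors describe, it uses synthetic data, and the names `fit_network` and `patient_scores` as well as the projection used for scoring are assumptions made for illustration.

```python
import numpy as np


def fit_network(cna, mrna, penalty=1.0):
    """Ridge regression of expression on copy number (stand-in for a sparse fit).

    cna, mrna: arrays of shape (patients, genes).
    Returns G of shape (genes, genes) such that mrna is approximately cna @ G.
    """
    n_genes = cna.shape[1]
    gram = cna.T @ cna + penalty * np.eye(n_genes)
    return np.linalg.solve(gram, cna.T @ mrna)


def patient_scores(cna, network, component=0):
    """Project each patient's CNA profile onto one left singular vector of G."""
    u, _, _ = np.linalg.svd(network, full_matrices=False)
    return cna @ u[:, component]


# Synthetic example: gene 0 is a copy-number hub that drives genes 1 to 3.
rng = np.random.default_rng(0)
cna = rng.normal(size=(40, 6))
true_net = np.zeros((6, 6))
true_net[0, 1:4] = 2.0
mrna = cna @ true_net + 0.1 * rng.normal(size=(40, 6))

G = fit_network(cna, mrna)
scores = patient_scores(cna, G)
hub = int(np.argmax(np.abs(G).sum(axis=1)))  # row with the largest outgoing weight
```

In this toy setting the leading singular vector concentrates on the hub gene, so the score tracks each patient's copy number at that locus. Whether such a score separates survival groups would have to be tested against outcome data, which the sketch does not include.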
We apply EPoC to glioblastoma data from The Cancer Genome Atlas (TCGA) consortium (186 patients). The identified CNA-driven network comprises 10 672 genes, and contains a number of copy number-altered genes that control multiple downstream genes. Highly connected hub genes include well-known oncogenes and tumor suppressor genes that are frequently deleted or amplified in glioblastoma, including EGFR, PDGFRA, CDKN2A and CDKN2B, confirming a clear association between these aberrations and transcriptional variability of these brain tumors. In addition, we identify a number of hub genes that have previously not been associated with glioblastoma, including interferon alpha 1 (IFNA1), myeloid/lymphoid or mixed-lineage leukemia translocated to 10 (MLLT10, a well-known leukemia gene), glutamate decarboxylase 2 (GAD2), a postulated glutamate receptor (GPR158) and Necdin (NDN). Furthermore, we demonstrate that the network model contains useful information on downstream target genes (including stem cell regulators), and possible drug targets.
We proceed to explore the validity of a small network region experimentally. Introducing experimental perturbations of NDN and other targets in four glioblastoma cell lines (T98G, U-87MG, U-343MG and U-373MG), we confirm several predicted mechanisms. We also demonstrate that the TCGA glioblastoma patients can be stratified into long-term and short-term survivors, using our proposed prognostic scores derived from a singular value decomposition of the network model. Finally, we compare EPoC to existing methods for mRNA network analysis and expression quantitative trait locus (eQTL) methods, and demonstrate that EPoC produces more consistent models between technically independent glioblastoma data sets, and that the EPoC models exhibit better overlap with known protein–protein interaction networks and pathway maps.
In summary, we conclude that large-scale integrative modeling reveals mechanistically and prognostically informative networks in human glioblastoma. Our approach operates at the gene level and our data support that individual hub genes can be identified in practice. Very large aberrations, however, cannot be fully resolved by the current modeling strategy.
DNA copy number aberrations (CNAs) are a hallmark of cancer genomes. However, little is known about how such changes affect global gene expression. We develop a modeling framework, EPoC (Endogenous Perturbation analysis of Cancer), to (1) detect disease-driving CNAs and their effect on target mRNA expression, and to (2) stratify cancer patients into long- and short-term survivors. Our method constructs causal network models of gene expression by combining genome-wide DNA- and RNA-level data. Prognostic scores are obtained from a singular value decomposition of the networks. By applying EPoC to glioblastoma data from The Cancer Genome Atlas consortium, we demonstrate that the resulting network models contain known disease-relevant hub genes, reveal interesting candidate hubs, and uncover predictors of patient survival. Targeted validations in four glioblastoma cell lines support selected predictions, and implicate the p53-interacting protein Necdin in suppressing glioblastoma cell growth. We conclude that large-scale network modeling of the effects of CNAs on gene expression may provide insights into the biology of human cancer. Free software in MATLAB and R is provided.
PMCID: PMC3101951  PMID: 21525872
cancer biology; cancer genomics; glioblastoma
11.  Time to Recurrence and Survival in Serous Ovarian Tumors Predicted from Integrated Genomic Profiles 
PLoS ONE  2011;6(11):e24709.
Serous ovarian cancer (SeOvCa) is an aggressive disease with differential and often inadequate therapeutic outcome after standard treatment. The Cancer Genome Atlas (TCGA) has provided rich molecular and genetic profiles from hundreds of primary surgical samples. These profiles confirm mutations of TP53 in ∼100% of patients and an extraordinarily complex profile of DNA copy number changes with considerable patient-to-patient diversity. This raises the joint challenge of exploiting all new available datasets and reducing their confounding complexity for the purpose of predicting clinical outcomes and identifying disease relevant pathway alterations. We therefore set out to use multi-data type genomic profiles (mRNA, DNA methylation, DNA copy-number alteration and microRNA) available from TCGA to identify prognostic signatures for the prediction of progression-free survival (PFS) and overall survival (OS).
Methodology/Principal Findings
We implemented a multivariate Cox Lasso model and median time-to-event prediction algorithm and applied it to two datasets integrated from the four genomic data types. We (1) selected features through cross-validation; (2) generated a prognostic index for patient risk stratification; and (3) directly predicted continuous clinical outcome measures, that is, the time to recurrence and survival time. We used Kaplan-Meier p-values, hazard ratios (HR), and concordance probability estimates (CPE) to assess prediction performance, comparing separate and integrated datasets. Data integration resulted in the best PFS signature (withheld data: p-value = 0.008; HR = 2.83; CPE = 0.72).
We provide a prediction tool that inputs genomic profiles of primary surgical samples and generates patient-specific predictions for the time to recurrence and survival, along with outcome risk predictions. Using integrated genomic profiles resulted in information gain for prediction of outcomes. Pathway analysis provided potential insights into functional changes affecting disease progression. The prognostic signatures, if prospectively validated, may be useful for interpreting therapeutic outcomes for clinical trials that aim to improve the therapy for SeOvCa patients.
PMCID: PMC3207809  PMID: 22073136
12.  Identification of genes and pathways involved in kidney renal clear cell carcinoma 
BMC Bioinformatics  2014;15(Suppl 17):S2.
Kidney Renal Clear Cell Carcinoma (KIRC) is one of the fatal genitourinary diseases and accounts for most malignant kidney tumours. KIRC has shown resistance to radiotherapy and chemotherapy. As with many types of cancer, there is no curative treatment for metastatic KIRC. Using advanced sequencing technologies, The Cancer Genome Atlas (TCGA) project of NIH/NCI-NHGRI has produced large-scale sequencing data, which provide unprecedented opportunities to reveal new molecular mechanisms of cancer. We combined differentially expressed genes, pathways and network analyses to gain new insights into the underlying molecular mechanisms of the disease development.
Following an experimental design for obtaining significant genes and pathways, we performed a comprehensive analysis of sequencing data from 537 KIRC patients provided by TCGA. Differentially expressed genes were obtained from the RNA-Seq data. Pathway and network analyses were performed. We identified 186 differentially expressed genes with significant p-values and large fold changes (P < 0.01, |log(FC)| > 5). The study not only confirmed a number of differentially expressed genes identified in literature reports, but also provided new findings. We performed hierarchical clustering analysis using both genome-wide gene expression and the differentially expressed genes identified in this study. We revealed distinct groups of differentially expressed genes that can aid the identification of subtypes of the cancer. The hierarchical clustering analysis based on gene expression profiles and differentially expressed genes suggested four subtypes of the cancer. We found distinct enriched Gene Ontology (GO) terms associated with these groups of genes. Based on these findings, we built a support vector machine based supervised-learning classifier to predict unknown samples, and the classifier achieved high accuracy and robust classification results. In addition, we identified a number of pathways (P < 0.04) that were significantly influenced by the disease. We found that some of the identified pathways have been implicated in cancers in the literature, while others have not been reported in this cancer before. The network analysis led to the identification of significantly disrupted pathways and associated genes involved in the disease development. Furthermore, this study can provide a viable alternative in identifying effective drug targets.
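The selection rule quoted above (P < 0.01 and |log(FC)| > 5) amounts to a two-condition filter. The sketch below shows that filter on invented values; the abstract does not state the base of the logarithm or how the p-values were computed, so base 2 and precomputed p-values are assumptions here, and the gene names are placeholders.

```python
import math


def select_degs(genes, p_cut=0.01, lfc_cut=5.0):
    """Keep genes with a small p-value and a large absolute log fold change.

    genes: iterable of (name, mean_tumor, mean_normal, p_value).
    The log base is assumed to be 2; the source does not specify it.
    """
    selected = []
    for name, tumor, normal, p_value in genes:
        lfc = math.log2(tumor / normal)
        if p_value < p_cut and abs(lfc) > lfc_cut:
            selected.append((name, lfc))
    return selected


# Invented expression summaries, not data from the study.
toy = [
    ("GENE_A", 640.0, 10.0, 0.001),   # log2FC = 6, passes both cuts
    ("GENE_B", 10.0, 640.0, 0.002),   # log2FC = -6, passes both cuts
    ("GENE_C", 40.0, 10.0, 0.0001),   # log2FC = 2, fold change too small
    ("GENE_D", 640.0, 10.0, 0.2),     # large fold change, not significant
]
degs = select_degs(toy)
```

Requiring both conditions keeps genes that are statistically supported and strongly changed, which is why the resulting list is short relative to the number of genes tested.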
Our study identified a set of differentially expressed genes and pathways in kidney renal clear cell carcinoma, and represents a comprehensive computational approach to analyzing large-scale next-generation sequencing data. The pathway and network analyses suggested that information from distinctly expressed genes can be utilized in the identification of aberrant upstream regulators. Identification of distinctly expressed genes and altered pathways is important in effective biomarker identification for early cancer diagnosis and treatment planning. Combining differentially expressed genes with pathway and network analyses using intelligent computational approaches provides an unprecedented opportunity to identify upstream disease causal genes and effective drug targets.
PMCID: PMC4304191  PMID: 25559354
Kidney Renal Clear Cell Carcinoma; TCGA; RNA-Seq; Differentially Expressed Genes; Pathways; Gene Network Analysis; Machine Learning Classifier
13.  SubPatCNV: approximate subspace pattern mining for mapping copy-number variations 
BMC Bioinformatics  2015;16(1):16.
Many DNA copy-number variations (CNVs) are known to lead to phenotypic variations and pathogenesis. While CNVs are often only common in a small number of samples in the studied population or patient cohort, previous work has not focused on customized identification of CNV regions that only exhibit in subsets of samples with advanced data mining techniques to reliably answer questions such as “Which are all the chromosomal fragments showing nearly identical deletions or insertions in more than 30% of the individuals?”.
We introduce a tool for mining CNV subspace patterns, namely SubPatCNV, which is capable of identifying all aberrant CNV regions specific to arbitrary sample subsets larger than a support threshold. By design, SubPatCNV implements a variation of an approximate association pattern mining algorithm under a spatial constraint on the positional CNV probe features. In a benchmark test, SubPatCNV was applied to identify population-specific germline CNVs from four populations of HapMap samples. In experiments on the TCGA ovarian cancer dataset, SubPatCNV discovered many large aberrant CNV events in patient subgroups, and reported regions enriched with cancer-relevant genes. In both HapMap data and TCGA data, it was observed that SubPatCNV employs approximate pattern mining to more effectively identify CNV subspace patterns that are consistent within a subgroup from high-density array data.
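The kind of query this abstract describes, contiguous probe regions that are aberrant in more than a given fraction of samples, can be sketched in a much simplified form. The code below is not SubPatCNV: it requires exact agreement at every probe rather than approximate matching, it reports only maximal probe runs, and the function name `frequent_regions` and the toy calls are hypothetical.

```python
def frequent_regions(calls, min_support):
    """Find maximal contiguous probe runs shared by enough samples.

    calls: list of per-sample lists of booleans (True = aberrant at that probe).
    min_support: fraction of samples that must be exceeded.
    Returns (start, end, sample_indices) tuples, with inclusive probe indices,
    where the listed samples are aberrant at every probe in the run.
    """
    n_samples, n_probes = len(calls), len(calls[0])
    needed = min_support * n_samples
    regions, prev_end = [], -1
    for start in range(n_probes):
        carriers = set(range(n_samples))
        end = start - 1
        for probe in range(start, n_probes):
            shrunk = {s for s in carriers if calls[s][probe]}
            if len(shrunk) <= needed:  # support must be strictly greater
                break
            carriers, end = shrunk, probe
        # A run is maximal only if it reaches past every run found so far.
        if end >= start and end > prev_end:
            regions.append((start, end, sorted(carriers)))
        prev_end = max(prev_end, end)
    return regions


# Invented deletion calls for 5 samples across 6 probes.
toy_calls = [
    [False, True, True, True, False, False],
    [False, True, True, True, False, False],
    [False, False, True, True, True, False],
    [False, False, False, False, True, False],
    [False, False, False, False, False, False],
]
found = frequent_regions(toy_calls, min_support=0.3)
print(found)
```

Restricting patterns to contiguous probes is what the spatial constraint buys: the search is quadratic in the number of probes instead of exponential in the number of probe subsets.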
SubPatCNV is available as a scalable open-source software tool that provides the flexibility of identifying CNV regions specific to sample subgroups of different sizes from high-density CNV array data.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0426-7) contains supplementary material, which is available to authorized users.
PMCID: PMC4305219  PMID: 25591662
DNA copy-number variations; Approximate pattern mining; HapMap; Cancer
14.  Distinguishing between driver and passenger mutations in individual cancer genomes by network enrichment analysis 
BMC Bioinformatics  2014;15(1):308.
In somatic cancer genomes, delineating genuine driver mutations against a background of multiple passenger events is a challenging task. The difficulty of determining function from sequence data and the low frequency of mutations are increasingly hindering the search for novel, less common cancer drivers. The accumulation of extensive amounts of data on somatic point and copy number alterations necessitates the development of systematic methods for driver mutation analysis.
We introduce a framework for detecting driver mutations via functional network analysis, which is applied to individual genomes and does not require pooling multiple samples. It probabilistically evaluates 1) functional network links between different mutations in the same genome and 2) links between individual mutations and known cancer pathways. In addition, it can employ correlations of mutation patterns in pairs of genes. The method was used to analyze genomic alterations in two TCGA datasets, one for glioblastoma multiforme and another for ovarian carcinoma, which were generated using different approaches to mutation profiling. The proportions of drivers among the reported de novo point mutations in these cancers were estimated to be 57.8% and 16.8%, respectively. Both sets also included extended chromosomal regions with synchronous duplications or losses of multiple genes. We identified putative copy number driver events within many such segments. Finally, we summarized seemingly disparate mutations and discovered a functional network of collagen modifications in the glioblastoma. In order to select the most efficient network for use with this method, we used a novel, ROC curve-based procedure for benchmarking different network versions by their ability to recover pathway membership.
The results of our network-based procedure were in good agreement with published gold standard sets of cancer genes and were shown to complement and expand frequency-based driver analyses. On the other hand, three sequence-based methods applied to the same data yielded poor agreement with each other and with our results. We review the difference in driver proportions discovered by different sequencing approaches and discuss the functional roles of novel driver mutations. The software used in this work and the global network of functional couplings are publicly available at
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-308) contains supplementary material, which is available to authorized users.
PMCID: PMC4262241  PMID: 25236784
Driver mutations; Passenger mutations; Somatic mutations; Copy number alterations; Gene networks; Network analysis; Cancer; Glioblastoma; Ovarian carcinoma; Brain cell compaction; Collagen cross-linking
15.  Spatial and Temporal Heterogeneity in High-Grade Serous Ovarian Cancer: A Phylogenetic Analysis 
PLoS Medicine  2015;12(2):e1001789.
The major clinical challenge in the treatment of high-grade serous ovarian cancer (HGSOC) is the development of progressive resistance to platinum-based chemotherapy. The objective of this study was to determine whether intra-tumour genetic heterogeneity resulting from clonal evolution and the emergence of subclonal tumour populations in HGSOC was associated with the development of resistant disease.
Methods and Findings
Evolutionary inference and phylogenetic quantification of heterogeneity was performed using the MEDICC algorithm on high-resolution whole genome copy number profiles and selected genome-wide sequencing of 135 spatially and temporally separated samples from 14 patients with HGSOC who received platinum-based chemotherapy. Samples were obtained from the clinical CTCR-OV03/04 studies, and patients were enrolled between 20 July 2007 and 22 October 2009. Median follow-up of the cohort was 31 mo (interquartile range 22–46 mo), censored after 26 October 2013. Outcome measures were overall survival (OS) and progression-free survival (PFS). There were marked differences in the degree of clonal expansion (CE) between patients (median 0.74, interquartile range 0.66–1.15), and dichotomization by median CE showed worse survival in CE-high cases (PFS 12.7 versus 10.1 mo, p = 0.009; OS 42.6 versus 23.5 mo, p = 0.003). Bootstrap analysis with resampling showed that the 95% confidence intervals for the hazard ratios for PFS and OS in the CE-high group were greater than 1.0. These data support a relationship between heterogeneity and survival but do not precisely determine its effect size. Relapsed tissue was available for two patients in the CE-high group, and phylogenetic analysis showed that the prevalent clonal population at clinical recurrence arose from early divergence events. A subclonal population marked by an NF1 deletion showed a progressive increase in tumour allele fraction during chemotherapy.
This study demonstrates that quantitative measures of intra-tumour heterogeneity may have predictive value for survival after chemotherapy treatment in HGSOC. Subclonal tumour populations are present in pre-treatment biopsies in HGSOC and can undergo expansion during chemotherapy, causing clinical relapse.
In this study, James Brenton and colleagues demonstrate that quantitative measures of intratumoural heterogeneity may have predictive value for survival after chemotherapy treatment in high-grade serous ovarian cancer.
Editors’ Summary
Every year, nearly 250,000 women develop ovarian cancer, and about 150,000 die from the disease. Ovarian cancer occurs when a cell on the surface of the ovaries (two small organs in the pelvis that produce eggs) or in the Fallopian tubes (which connect the ovaries to the womb) acquires genetic changes (mutations) that allow it to grow uncontrollably and to spread around the body (metastasize). For women whose ovarian cancer is diagnosed when it is confined to its site of origin, the outlook is good. About 90% of these women survive for at least five years. However, ovarian cancer is rarely diagnosed this early. Usually, by the time the cancer causes symptoms (often only vague abdominal pains and mild digestive disturbances), it has spread into the peritoneal cavity (the space around the gut, stomach, and liver) or has metastasized to distant organs. Patients with advanced ovarian cancer are treated with a combination of surgery and platinum-based chemotherapy, but only a quarter of such women are still alive five years after diagnosis, and the overall five-year survival rate for ovarian cancer is less than 50%.
Why Was This Study Done?
The major clinical challenge in the treatment of high-grade serous ovarian cancer (HGSOC; the most common type of ovarian cancer) is the development of resistance to platinum-based chemotherapy. If we knew how this resistance develops, it might be possible to improve the treatment of HGSOC. Tumors are thought to arise from a single mutated cell that accumulates additional mutations as it grows and divides. This process results in the formation of subpopulations of tumor cells, each with a different set of mutations. Experts think that this “intra-tumor heterogeneity” gives rise to tumor subclones that possess an evolutionary advantage over other subclones (they might, for example, grow faster or be resistant to chemotherapy) and that eventually dominate the tumor (“clonal expansion”). Here, the researchers investigate whether clonal evolution and the emergence of subclonal tumor populations explains the development of chemotherapy-resistant HGSOC by undertaking evolutionary inference and phylogenetic quantification of the heterogeneity of samples taken from women with HGSOC at different times and from different places in their body. Evolutionary inference and phylogenetic quantification are analytical approaches that can be used to reconstruct the evolutionary history (“family tree”) of a tumor.
What Did the Researchers Do and Find?
The researchers used an algorithm (a step-by-step procedure for data processing) called MEDICC to analyze detailed genetic data obtained from 135 spatially and temporally separated samples taken from 14 patients with HGSOC who had received platinum-based chemotherapy. The researchers report that there were marked differences in the degree of clonal expansion among the patients. When they split the patients into two groups based on the degree of clonal expansion in their tumors (CE-high and CE-low), patients with tumors classified as CE-high had a shorter progression-free survival time than patients with tumors classified as CE-low (10.1 months compared to 12.7 months) and a shorter overall survival time (23.5 months compared to 42.6 months). Moreover, a type of statistical analysis called bootstrap analysis, which tests for the robustness of the result, indicated that having CE-high tumors was likely to increase a patient’s risk of a poor outcome. Finally, phylogenetic analysis of samples taken from two patients before and after relapse and analysis of an NF1 deletion (NF1 encodes neurofibromin 1, a tumor suppressor protein that prevents uncontrolled cell growth; NF1 is frequently mutated in HGSOC) indicated that a resistant subclonal population was already present in the patients’ tumors before treatment began.
What Do These Findings Mean?
These findings show that clonal expansion occurs between diagnosis and relapse in HGSOC, that there are marked differences in the degree of clonal expansion among patients, and that a high degree of clonal expansion may have a negative effect on survival. The accuracy of these findings is limited by the small number of patients included in the study, and it is likely that the analyses reported here overestimate the effect of clonal expansion on patient outcomes. Nevertheless, the researchers suggest that, provided larger patient studies yield similar results, quantitative measures of intra-tumor heterogeneity might be useful as patient-specific prognostic markers in HGSOC. That is, measures of intra-tumor heterogeneity might eventually help clinicians to predict which of their patients with ovarian cancer are likely to have the best outcomes after platinum-based chemotherapy.
Additional Information
Please access these websites via the online version of this summary at
The US National Cancer Institute provides information about cancer and how it develops (in English and Spanish), including detailed information about ovarian cancer
Cancer Research UK, a not-for-profit organization, provides general information about cancer and how it develops, and detailed information about ovarian cancer
The UK National Health Service Choices website has information and personal stories about ovarian cancer
The not-for-profit organization provides personal stories about dealing with ovarian cancer; Eyes on the Prize, an online support group for women who have had cancers of the female reproductive system, also includes personal stories; the not-for-profit organization Ovarian Cancer Action also provides information, support, and personal stories about ovarian cancer
Wikipedia provides information about clonal evolution in cancer, tumor heterogeneity, and phylogenetics (note that Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
More information about the MEDICC algorithm is available
PMCID: PMC4339382  PMID: 25710373
16.  An eight-miRNA signature as a potential biomarker for predicting survival in lung adenocarcinoma 
Lung adenocarcinoma (LUAD) is a heterogeneous disease that creates challenges for classification and management. The purpose of this study is to identify specific miRNA markers closely associated with the survival of LUAD patients from a large dataset of significantly altered miRNAs, and to assess the prognostic value of this miRNA expression profile for overall survival (OS) in patients with LUAD.
We obtained miRNA expression profiles and corresponding clinical information for 372 LUAD patients from The Cancer Genome Atlas (TCGA), and identified the most significantly altered miRNAs between tumor and normal samples. Using survival analysis and supervised principal components method, we identified an eight-miRNA signature for the prediction of overall survival (OS) of LUAD patients. The relationship between OS and the identified miRNA signature was self-validated in the TCGA cohort (randomly classified into two subgroups: n = 186 for the training set and n = 186 for the testing set). Survival receiver operating characteristic (ROC) analysis was used to assess the performance of survival prediction. The biological relevance of putative miRNA targets was also analyzed using bioinformatics.
Sixteen of the 111 most significantly altered miRNAs were associated with OS across different clinical subclasses of the TCGA-derived LUAD cohort. A linear prognostic model of eight miRNAs (miR-31, miR-196b, miR-766, miR-519a-1, miR-375, miR-187, miR-331 and miR-101-1) was constructed and weighted by the importance scores from the supervised principal component method to divide patients into high- and low-risk groups. Patients assigned to the high-risk group exhibited poor OS compared with patients in the low-risk group (hazard ratio [HR] = 1.99, P <0.001). The eight-miRNA signature is an independent prognostic marker of OS of LUAD patients and demonstrates good performance for predicting 5-year OS (Area Under the respective ROC Curves [AUC] = 0.626, P = 0.003), especially for non-smokers (AUC = 0.686, P = 0.023).
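A linear prognostic model of this kind reduces to a weighted sum of expression values followed by a cutoff. The sketch below shows that arithmetic on two of the eight miRNAs only. The weights, their signs, the expression values, and the choice of the median as the cutoff are all invented for illustration; the abstract reports none of them, and the signs here say nothing about the actual direction of effect of any miRNA.

```python
from statistics import median


def risk_scores(expression, weights):
    """Weighted sum of miRNA expression values per patient.

    expression: {patient: {mirna: value}}, weights: {mirna: weight}.
    """
    return {
        patient: sum(weights[m] * profile[m] for m in weights)
        for patient, profile in expression.items()
    }


def split_by_median(scores):
    """Label patients above the median score as high risk (assumed cutoff)."""
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Hypothetical weights and expression values, not taken from the study.
weights = {"miR-31": 0.5, "miR-375": -0.25}
expression = {
    "P1": {"miR-31": 4.0, "miR-375": 2.0},  # score 1.5
    "P2": {"miR-31": 1.0, "miR-375": 6.0},  # score -1.0
    "P3": {"miR-31": 2.0, "miR-375": 2.0},  # score 0.5
    "P4": {"miR-31": 6.0, "miR-375": 0.0},  # score 3.0
}
scores = risk_scores(expression, weights)
groups = split_by_median(scores)
```

The two groups produced this way are what a survival comparison, such as the hazard ratio reported above, would then be computed on.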
We identified an eight-miRNA signature that is prognostic of LUAD. The miRNA signature, if validated in other prospective studies, may have important implications in clinical practice, in particular identifying a subgroup of patients with LUAD who are at high risk of mortality.
PMCID: PMC4062505  PMID: 24893932
Lung adenocarcinoma; MicroRNA; Prognostic markers; Overall survival
17.  Single nucleotide polymorphisms in glutathione S-transferase P1 and M1 genes and overall survival of patients with ovarian serous cystadenocarcinoma treated with chemotherapy 
Oncology Letters  2016;11(4):2525-2531.
The effects of platinum-based drugs are controlled by genes that are involved in DNA detoxification, including glutathione S-transferase (GST)P1 and GSTM1, which have been associated with increased benefits in the chemotherapeutic treatment of patients with ovarian cancer. The present study assessed the effect of single nucleotide polymorphisms in GST genes on the overall survival (OS) of patients with ovarian serous cystadenocarcinoma that were treated with chemotherapy. A total of 95 patients received treatment with a carboplatin-based or alternative chemotherapy. Polymorphisms in the patients were genotyped using the following methods: Pyrosequencing, to identify GSTP1 Ile105Val; a relative quantification method, to identify the copy number variation in GSTM1; and polymerase chain reaction followed by gel electrophoresis, to identify the null vs. non-null genotypes of GSTM1. The association between genotypes and OS of patients was assessed using Kaplan-Meier survival curves and Cox proportional hazards regression analysis. The OS of patients treated with paclitaxel + carboplatin-based chemotherapy was significantly increased, compared with patients treated with alternative forms of chemotherapy (P=0.035). The OS of patients did not differ significantly between different GSTP1 genotypes (log-rank test, P=0.17). Cox proportional hazards regression analysis revealed that, since the start of the treatment, there was not a significant association between the GSTP1 isoleucine allele and the OS for heterozygous carriers of the isoleucine allele [hazards ratio (HR), 1.78; 95% confidence interval (CI), 0.77–4.12; P=0.18] and no homozygous carriers of the valine allele had been detected (HR, 0.00). There was no significant difference between GSTM1 genotypes, according to Kaplan-Meier survival analysis (log-rank test, P=0.83). 
Patients that possessed ≤1 copy of GSTM1 exhibited no decrease in OS (HR, 0.96; 95% CI, 0.37–2.51; P=0.94), compared with patients that possessed two copies of GSTM1 (HR, 0.71; 95% CI, 0.22–2.28; P=0.56). Overall, the present results suggest that there are no associations between polymorphisms in the GSTP1 and GSTM1 genes and the OS of patients with ovarian cancer following administration of adjuvant chemotherapy.
PMCID: PMC4812381  PMID: 27073511
single nucleotide polymorphisms; chemotherapy; pyrosequencing; ovarian serous cystadenocarcinoma; GSTM1; GSTP1; carboplatin; pharmacogenetics
18.  Identification of Druggable Cancer Driver Genes Amplified across TCGA Datasets 
PLoS ONE  2014;9(5):e98293.
The Cancer Genome Atlas (TCGA) projects have advanced our understanding of the driver mutations, genetic backgrounds, and key pathways activated across cancer types. Analyses of TCGA datasets have mostly focused on somatic mutations and translocations, with less emphasis placed on gene amplifications. Here we describe a bioinformatics screening strategy to identify putative cancer driver genes amplified across TCGA datasets. We carried out GISTIC2 analysis of TCGA datasets spanning 14 cancer subtypes and identified 461 genes that were amplified in two or more datasets. The list was narrowed to 73 cancer-associated genes with potential “druggable” properties. The majority of the genes were localized to 14 amplicons spread across the genome. To identify potential cancer driver genes, we analyzed gene copy number and mRNA expression data from individual patient samples and identified 40 putative cancer driver genes linked to diverse oncogenic processes. Oncogenic activity was further validated by siRNA/shRNA knockdown and by referencing the Project Achilles datasets. The amplified genes represented a number of gene families, including epigenetic regulators, cell cycle-associated genes, DNA damage response/repair genes, metabolic regulators, and genes linked to the Wnt, Notch, Hedgehog, JAK/STAT, NF-κB and MAPK signaling pathways. Among the 40 putative driver genes were known driver genes, such as EGFR, ERBB2 and PIK3CA. Wild-type KRAS was amplified in several cancer types, and KRAS-amplified cancer cell lines were most sensitive to KRAS shRNA, suggesting that KRAS amplification was an independent oncogenic event. A number of MAP kinase adapters were co-amplified with their receptor tyrosine kinases, such as the FGFR adapter FRS2 and the EGFR family adapter GRB7. The ubiquitin-like ligase DCUN1D1 and the histone methyltransferase NSD3 were also identified as novel putative cancer driver genes. We discuss the patient tailoring implications for existing cancer drug targets and we further discuss potential novel opportunities for drug discovery efforts.
PMCID: PMC4038530  PMID: 24874471
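The recurrence filter described in this abstract (genes amplified in two or more datasets) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `recurrently_amplified`, the input layout, and the toy calls are hypothetical and are not the authors' pipeline or GISTIC2 output.

```python
from collections import Counter


def recurrently_amplified(amplified_by_dataset, min_datasets=2):
    """Return genes called amplified in at least `min_datasets` datasets.

    `amplified_by_dataset` maps a dataset name to an iterable of gene
    symbols called amplified in that dataset (hypothetical layout).
    """
    counts = Counter()
    for genes in amplified_by_dataset.values():
        counts.update(set(genes))  # count each gene at most once per dataset
    return sorted(g for g, n in counts.items() if n >= min_datasets)


# Toy input for illustration only; these are NOT calls from the study.
calls = {
    "dataset_A": ["EGFR", "ERBB2", "KRAS"],
    "dataset_B": ["EGFR", "PIK3CA"],
    "dataset_C": ["KRAS", "EGFR", "FRS2"],
}
print(recurrently_amplified(calls))
```

Converting each gene list to a set before counting keeps a duplicated call within one dataset from inflating the recurrence count.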
19.  A novel c-Met inhibitor, MK8033, synergizes with carboplatin plus paclitaxel to inhibit ovarian cancer cell growth 
Oncology Reports  2013;29(5):2011-2018.
Elevated serum levels of hepatocyte growth factor (HGF) and high tumor expression of c-Met are both indicators of poor overall survival from ovarian cancer (OVCA). In the present study, we evaluated the role of the HGF signaling pathway in OVCA cell line chemoresistance and OVCA patient overall survival as well as the influence of HGF/c-Met signaling inhibition on the sensitivity of OVCA cells to combinational carboplatin plus paclitaxel therapy. The prevalence of the HGF receptor, c-Met, was determined by immunohistochemistry in primary OVCA samples (n=79) and OVCA cell lines (n=41). The influence of the c-Met-specific inhibitor MK8033 on OVCA cell sensitivity to combinations of carboplatin plus paclitaxel was examined in a subset of OVCA cells (n=8) by CellTiter-Blue cell viability assays. Correlation tests were used to identify genes associated with response to MK8033 and carboplatin plus paclitaxel. Identified genes were evaluated for influence on overall survival from OVCA using principal component analysis (PCA) modeling in an independent clinical OVCA dataset (n=218). Immunohistochemistry analysis indicated that 83% of OVCA cells and 92% of primary OVCA expressed the HGF receptor, c-Met. MK8033 exhibited significant anti-proliferative effects against a panel of human OVCA cell lines. Combination index values determined by the Chou-Talalay isobologram equation indicated synergistic activity in combinations of MK8033 and carboplatin plus paclitaxel. Pearson's correlation identified a 47-gene signature to be associated with MK8033-carboplatin plus paclitaxel response. PCA modeling indicated an association of this 47-gene response signature with overall survival from OVCA (P=0.013). These data indicate that HGF/c-Met pathway signaling may influence OVCA chemosensitivity and overall patient survival. Furthermore, HGF/c-Met inhibition by MK8033 represents a promising new therapeutic avenue to increase OVCA sensitivity to carboplatin plus paclitaxel.
PMCID: PMC4536335  PMID: 23467907
c-Met expression; combination index; ovarian cancer; carboplatin; immunohistochemistry
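For readers unfamiliar with the combination index mentioned in this abstract, the sketch below shows the standard two-agent Chou-Talalay form, CI = d1/Dx1 + d2/Dx2, with each single-agent dose Dx obtained from the median-effect equation. It is a minimal illustration, not the authors' analysis: the function names and the numeric inputs are assumptions, and how the study grouped carboplatin plus paclitaxel as an agent is not stated here.

```python
def dose_for_effect(fa, dm, m):
    """Single-agent dose giving fraction affected `fa`.

    Median-effect equation: fa / (1 - fa) = (D / Dm) ** m,
    so D = Dm * (fa / (1 - fa)) ** (1 / m).
    `dm` is the median-effect dose and `m` the slope parameter.
    """
    if not 0.0 < fa < 1.0:
        raise ValueError("fa must lie strictly between 0 and 1")
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(d1, d2, fa, dm1, m1, dm2, m2):
    """Two-agent combination index at effect level `fa`.

    d1 and d2 are the doses used together to reach `fa`.
    CI < 1 suggests synergy, CI = 1 additivity, CI > 1 antagonism.
    """
    return d1 / dose_for_effect(fa, dm1, m1) + d2 / dose_for_effect(fa, dm2, m2)


# Hypothetical numbers for illustration only (not data from the study).
ci = combination_index(d1=0.5, d2=1.0, fa=0.5, dm1=2.0, m1=1.0, dm2=4.0, m2=1.0)
print(ci)
```

At fa = 0.5 the single-agent dose equals the median-effect dose, so the example reduces to 0.5/2 + 1/4.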
20.  Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data 
BMC Systems Biology  2013;7(Suppl 2):S4.
Understanding the molecular mechanisms underlying cancer is an important step for the effective diagnosis and treatment of cancer patients. With the huge volume of data from the large-scale cancer genomics projects, an open challenge is to distinguish driver mutations, pathways, and gene sets (or core modules) that contribute to cancer formation and progression from random passengers which accumulate in somatic cells but do not contribute to tumorigenesis. Due to mutational heterogeneity, current analyses are often restricted to known pathways and functional modules for enrichment of somatic mutations. Therefore, discovery of new pathways and functional modules is a pressing need.
In this study, we propose a novel method to identify Mutated Core Modules in Cancer (iMCMC) without any prior information other than cancer genomic data from patients with tumors. This is a network-based approach in which three kinds of data are integrated: somatic mutations, copy number variations (CNVs), and gene expression. First, the somatic mutation and CNV datasets are merged to obtain a mutation matrix, based on which a weighted mutation network is constructed where the vertex weight corresponds to gene coverage and the edge weight corresponds to the mutual exclusivity between gene pairs. Similarly, a weighted expression network is generated from the expression matrix where the vertex and edge weights correspond to the influence of a gene mutation on other genes and the Pearson correlation of gene mutation-correlated expressions, respectively. Then an integrative network is obtained by further combining these two networks, and the most coherent subnetworks are identified by using an optimization model. Finally, we obtained the core modules for tumors by filtering with significance and exclusivity tests. We applied iMCMC to the Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and ovarian carcinoma data, and identified several mutated core modules, some of which are involved in known pathways. Most of the implicated genes are oncogenes or tumor suppressors previously reported to be related to carcinogenesis. As a comparison, we also performed iMCMC on two of the three kinds of data, i.e., first on the datasets combining somatic mutations with CNVs and second on the datasets combining somatic mutations with gene expression. The results indicate that gene expression or CNVs indeed provide extra useful information to the original data for the identification of core modules in cancer.
This study demonstrates the utility of our iMCMC by integrating multiple data sources to identify mutated core modules in cancer. In addition to presenting a generally applicable methodology, our findings provide several candidate pathways or core modules recurrently perturbed in GBM or ovarian carcinoma for further studies.
PMCID: PMC3851989  PMID: 24565034
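The weighted mutation network described in this abstract (vertex weight from gene coverage, edge weight from mutual exclusivity) might be sketched as below. The abstract does not give the formulas, so both definitions here are assumptions chosen for illustration: coverage is taken as the fraction of patients carrying a mutation in the gene, and exclusivity as the share of patients mutated in exactly one gene of a pair among those mutated in either. All names and the toy data are hypothetical.

```python
from itertools import combinations


def exclusivity(mutated_a, mutated_b):
    """Assumed exclusivity score for two sets of mutated patients:
    |A xor B| / |A or B|, i.e. 1.0 when the genes never co-occur."""
    union = mutated_a | mutated_b
    if not union:
        return 0.0
    return len(mutated_a ^ mutated_b) / len(union)


def build_mutation_network(mutations, patients):
    """Return (vertex_weights, edge_weights) for a toy mutation network.

    `mutations` maps gene -> set of patient IDs with that gene altered.
    """
    n = len(patients)
    vertices = {g: len(s & patients) / n for g, s in mutations.items()}
    edges = {
        (g1, g2): exclusivity(mutations[g1], mutations[g2])
        for g1, g2 in combinations(sorted(mutations), 2)
    }
    return vertices, edges


# Toy mutation matrix for illustration only (not TCGA data).
patients = {"p1", "p2", "p3", "p4"}
mutations = {
    "GENE_A": {"p1", "p2"},
    "GENE_B": {"p3", "p4"},
    "GENE_C": {"p1", "p3"},
}
vertices, edges = build_mutation_network(mutations, patients)
print(vertices)
print(edges)
```

In this toy case GENE_A and GENE_B together cover every patient without overlap, which is the kind of pattern a coverage-plus-exclusivity weighting is meant to surface.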
21.  Why Is There a Lack of Consensus on Molecular Subgroups of Glioblastoma? Understanding the Nature of Biological and Statistical Variability in Glioblastoma Expression Data 
PLoS ONE  2011;6(7):e20826.
Gene expression patterns characterizing clinically-relevant molecular subgroups of glioblastoma are difficult to reproduce. We suspect a combination of biological and analytic factors confounds interpretation of glioblastoma expression data. We seek to clarify the nature and relative contributions of these factors, to focus additional investigations, and to improve the accuracy and consistency of translational glioblastoma analyses.
We analyzed gene expression and clinical data for 340 glioblastomas in The Cancer Genome Atlas (TCGA). We developed a logic model to analyze potential sources of biological, technical, and analytic variability and used standard linear classifiers and linear dimensional reduction algorithms to investigate the nature and relative contributions of each factor.
Commonly-described sources of classification error, including individual sample characteristics, batch effects, and analytic and technical noise make measurable but proportionally minor contributions to inconsistent molecular classification. Our analysis suggests that three previously underappreciated factors may account for a larger fraction of classification errors: inherent non-linear/non-orthogonal relationships among the genes used in conjunction with classification algorithms that assume linearity; skewed data distributions assumed to be Gaussian; and biologic variability (noise) among tumors, of which we propose three types.
Our analysis of the TCGA data demonstrates a contributory role for technical factors in molecular classification inconsistencies in glioblastoma but also suggests that biological variability, abnormal data distribution, and non-linear relationships among genes may be responsible for a proportionally larger component of classification error. These findings may have important implications for both glioblastoma research and for translational application of other large-volume biological databases.
PMCID: PMC3145641  PMID: 21829433
22.  Genetic alterations in fatty acid transport and metabolism genes are associated with metastatic progression and poor prognosis of human cancers 
Scientific Reports  2016;6:18669.
Reprogramming of cellular metabolism is a hallmark feature of cancer cells. While a distinct set of processes drives metastasis when compared to tumorigenesis, it is as yet unclear whether genetic alterations in metabolic pathways are associated with metastatic progression of human cancers. Here, we analyzed the mutation, copy number variation and gene expression patterns of a literature-derived model of metabolic genes associated with glycolysis (Warburg effect), fatty acid metabolism (lipogenesis, oxidation, lipolysis, esterification) and fatty acid uptake in >9000 primary or metastatic tumor samples from the multi-cancer TCGA datasets. Our association analysis revealed a uniform pattern of Warburg effect mutations influencing prognosis across all tumor types, while copy number alterations in the electron transport chain gene SCO2, fatty acid uptake (CAV1, CD36) and lipogenesis (PPARA, PPARD, MLXIPL) genes were enriched in metastatic tumors. Using gene expression profiles, we established a gene signature (CAV1, CD36, MLXIPL, CPT1C, CYP2E1) that was strongly associated with the epithelial-mesenchymal program across multiple cancers. Moreover, stratification of samples based on the copy number or expression profiles of the genes identified in our analysis revealed a significant effect on patient survival rates, thus confirming prominent roles of fatty acid uptake and metabolism in metastatic progression and poor prognosis of human cancers.
PMCID: PMC4698658  PMID: 26725848
23.  Identification of ovarian cancer driver genes by using module network integration of multi-omics data 
Interface Focus  2013;3(4):20130013.
The increasing availability of multi-omics cancer datasets has created a new opportunity for data integration that promises a more comprehensive understanding of cancer. The challenge is to develop mathematical methods that allow the integration and extraction of knowledge from large datasets such as The Cancer Genome Atlas (TCGA). This has led to the development of a variety of omics profiles that are highly correlated with each other; however, it remains unknown which profile is the most meaningful and how to efficiently integrate different omics profiles. We developed AMARETTO, an algorithm to identify cancer drivers by integrating a variety of omics data from cancer and normal tissue. AMARETTO first models the effects of genomic/epigenomic data on disease-specific gene expression. AMARETTO's second step involves constructing a module network to connect the cancer drivers with their downstream targets. We observed that more gene expression variation can be explained when using disease-specific gene expression data. We applied AMARETTO to the ovarian cancer TCGA data and identified several cancer driver genes of interest, including novel genes in addition to known drivers of cancer. Finally, we showed that certain modules are predictive of good versus poor outcome, and the associated drivers were related to DNA repair pathways.
PMCID: PMC3915833  PMID: 24511378
gene expression; DNA methylation; copy number; ovarian cancer; data integration
24.  GSVD Comparison of Patient-Matched Normal and Tumor aCGH Profiles Reveals Global Copy-Number Alterations Predicting Glioblastoma Multiforme Survival 
PLoS ONE  2012;7(1):e30098.
Despite recent large-scale profiling efforts, the best prognostic predictor of glioblastoma multiforme (GBM) remains the patient's age at diagnosis. We describe a global pattern of tumor-exclusive co-occurring copy-number alterations (CNAs) that is correlated, and possibly coordinated, with GBM patients' survival and response to chemotherapy. The pattern is revealed by GSVD comparison of patient-matched but probe-independent GBM and normal aCGH datasets from The Cancer Genome Atlas (TCGA). We find that, first, the GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern copy-number variations (CNVs) that occur in the normal human genome (e.g., female-specific X chromosome amplification) and experimental variations (e.g., in tissue batch, genomic center, hybridization date and scanner), without a priori knowledge of these variations. Second, the pattern includes most known GBM-associated changes in chromosome numbers and focal CNAs, as well as several previously unreported CNAs in 3% of the patients. These include the biochemically putative drug target, cell cycle-regulated serine/threonine kinase-encoding TLK2, the cyclin E1-encoding CCNE1, and the Rb-binding histone demethylase-encoding KDM5A. Third, the pattern provides a better prognostic predictor than the chromosome numbers or any one focal CNA that it identifies, suggesting that the GBM survival phenotype is an outcome of its global genotype. The pattern is independent of age, and combined with age, makes a better predictor than age alone. GSVD comparison of matched profiles of a larger set of TCGA patients, inclusive of the initial set, confirms the global pattern. GSVD classification of the GBM profiles of an independent set of patients validates the prognostic contribution of the pattern.
PMCID: PMC3264559  PMID: 22291905
25.  Next generation sequencing profiling identifies miR-574-3p and miR-660-5p as potential novel prognostic markers for breast cancer 
BMC Genomics  2015;16:735.
Prognostication of Breast Cancer (BC) relies largely on traditional clinical factors and biomarkers such as hormone or growth factor receptors. Due to their suboptimal specificities, it is challenging to accurately identify the subset of patients who are likely to undergo recurrence and there remains a major need for markers of higher utility to guide therapeutic decisions. MicroRNAs (miRNAs) are small non-coding RNAs that function as post-transcriptional regulators of gene expression and have shown promise as potential prognostic markers in several cancer types including BC.
In our study, we sequenced miRNAs from 104 BC samples and 11 apparently healthy normal (reduction mammoplasty) breast tissues. We used Case–control (CC) and Case-only (CO) statistical paradigms to identify prognostic markers. A Cox proportional hazards regression model was employed and risk score analysis was performed to identify a miRNA signature independent of potential confounders. Representative miRNAs were validated using qRT-PCR. Gene targets for prognostic miRNAs were identified using in silico predictions and an in-house BC transcriptome dataset. Gene ontology terms were identified using DAVID bioinformatics v6.7. A total of 1,423 miRNAs were captured. In the CC approach, 126 miRNAs were retained with predetermined criteria for good read counts, from which 80 miRNAs were differentially expressed. Of these, four and two miRNAs were significant for Overall Survival (OS) and Recurrence Free Survival (RFS), respectively. In the CO approach, from 147 miRNAs retained after filtering, 11 and 4 miRNAs were significant for OS and RFS, respectively. In both approaches, the risk scores were significant after adjusting for potential confounders. The miRNAs associated with OS identified in our cohort were validated using an external dataset from The Cancer Genome Atlas (TCGA) project. Targets for the identified miRNAs were enriched for cell proliferation, invasion and migration.
The study identified twelve non-redundant miRNAs associated with OS and/or RFS. These signatures include those that were reported by others in BC or other cancers. Importantly we report for the first time two new candidate miRNAs (miR-574-3p and miR-660-5p) as promising prognostic markers. Independent validation of signatures (for OS) using an external dataset from TCGA further strengthened the study findings.
PMCID: PMC4587870  PMID: 26416693
microRNA; Next generation sequencing; Breast cancer; Prognostic marker; miR-574-3p; miR-660-5p; Reduction mammoplasty; Overall survival; Recurrence free survival; TCGA
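The risk score analysis mentioned in this abstract is not spelled out there. A common construction, assumed here purely for illustration, weights each marker's expression by its Cox regression coefficient, sums the products, and splits samples at the median score. The marker names, coefficients, and expression values below are placeholders, not values from the study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear risk score: sum of (Cox coefficient * expression) per marker."""
    return sum(coef * expression[name] for name, coef in coefficients.items())


def stratify(samples, coefficients):
    """Label each sample 'high' or 'low' risk using a median cutoff
    (the median split is an assumption, not taken from the abstract)."""
    scores = {sid: risk_score(expr, coefficients) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
    return scores, groups


# Placeholder coefficients and expression values (hypothetical).
coefficients = {"marker_a": 0.5, "marker_b": -0.25}
samples = {
    "s1": {"marker_a": 2.0, "marker_b": 4.0},
    "s2": {"marker_a": 4.0, "marker_b": 0.0},
    "s3": {"marker_a": 0.0, "marker_b": 4.0},
    "s4": {"marker_a": 6.0, "marker_b": 0.0},
}
scores, groups = stratify(samples, coefficients)
print(scores)
print(groups)
```

In practice the two groups would then be compared with a survival test and adjusted for confounders, as the abstract describes; that step is omitted here.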
