Related Articles

1.  Survival-Related Profile, Pathways, and Transcription Factors in Ovarian Cancer 
PLoS Medicine  2009;6(2):e1000024.
Ovarian cancer has a poor prognosis due to advanced stage at presentation and either intrinsic or acquired resistance to classic cytotoxic drugs such as platinum and taxoids. Recent large clinical trials with different combinations and sequences of classic cytotoxic drugs indicate that further significant improvement in prognosis with this type of drug is not to be expected. Currently, a large number of drugs targeting dysregulated molecular pathways in cancer cells have been developed and are being introduced in the clinic. A major challenge is to identify those patients who will benefit from drugs targeting these specific dysregulated pathways. The aims of our study were (1) to develop a gene expression profile associated with overall survival in advanced stage serous ovarian cancer, (2) to assess the association of pathways and transcription factors with overall survival, and (3) to validate our identified profile and pathways/transcription factors in an independent set of ovarian cancers.
Methods and Findings
According to a randomized design, profiling of 157 advanced stage serous ovarian cancers was performed in duplicate using ∼35,000 70-mer oligonucleotide microarrays. A continuous predictor of overall survival was built taking into account well-known issues in microarray analysis, such as multiple testing and overfitting. A functional class scoring analysis was utilized to assess pathways/transcription factors for their association with overall survival. The prognostic value of genes that constitute our overall survival profile was validated on a fully independent, publicly available dataset of 118 well-defined primary serous ovarian cancers. Furthermore, functional class scoring analysis was also performed on this independent dataset to assess the similarities with results from our own dataset. An 86-gene overall survival profile discriminated between patients with unfavorable and favorable prognosis (median survival, 19 versus 41 mo, respectively; permutation p-value of log-rank statistic = 0.015) and maintained its independent prognostic value in multivariate analysis. Genes that composed the overall survival profile were also able to discriminate between the two risk groups in the independent dataset. In our dataset 17/167 pathways and 13/111 transcription factors were associated with overall survival, of which 16 and 12, respectively, were confirmed in the independent dataset.
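As a rough illustration of the statistic quoted above (a permutation p-value for a log-rank statistic), the following Python sketch computes a two-group log-rank chi-square and estimates its permutation p-value by shuffling group labels. The function names, the toy data, and the label-shuffling scheme are assumptions made for illustration; the abstract does not describe how the permutations were carried out, and the authors' procedure may well differ (for example, by rebuilding the predictor within each permutation).

```python
import random


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (groups coded 0/1, events 1 = death)."""
    observed_minus_expected = 0.0
    variance = 0.0
    # Loop over the distinct times at which at least one event occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [(g, ti, e) for ti, e, g in zip(times, events, groups) if ti >= t]
        n = len(at_risk)                                   # patients still at risk
        n1 = sum(1 for g, _, _ in at_risk if g == 1)       # ... of whom in group 1
        d = sum(1 for _, ti, e in at_risk if ti == t and e)            # events at t
        d1 = sum(1 for g, ti, e in at_risk if ti == t and e and g == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def permutation_p_value(times, events, groups, n_perm=2000, seed=0):
    """Share of label shuffles whose statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = logrank_chi2(times, events, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if logrank_chi2(times, events, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy cohort: survival in months, all deaths observed.
toy_times = [1, 2, 3, 4, 5, 6, 20, 21, 22, 23, 24, 25]
toy_events = [1] * 12
toy_groups = [1] * 6 + [0] * 6   # 1 = unfavorable profile, 0 = favorable profile
toy_p = permutation_p_value(toy_times, toy_events, toy_groups)
```

A permutation p-value of this kind avoids relying on the asymptotic chi-square distribution, which can matter when the risk groups were themselves derived from the survival data.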
Our study provides new clues to genes, pathways, and transcription factors that contribute to the clinical outcome of serous ovarian cancer and might be exploited in designing new treatment strategies.
Ate van der Zee and colleagues analyze the gene expression profiles of ovarian cancer samples from 157 patients, and identify an 86-gene expression profile that seems to predict overall survival.
Editors' Summary
Ovarian cancer kills more than 100,000 women every year and is one of the most frequent causes of cancer death in women in Western countries. Most ovarian cancers develop when an epithelial cell in one of the ovaries (two small organs in the pelvis that produce eggs) acquires genetic changes that allow it to grow uncontrollably and to spread around the body (metastasize). In its early stages, ovarian cancer is confined to the ovaries and can often be treated successfully by surgery alone. Unfortunately, early ovarian cancer rarely has symptoms so a third of women with ovarian cancer have advanced disease when they first visit their doctor with symptoms that include vague abdominal pains and mild digestive disturbances. That is, cancer cells have spread into their abdominal cavity and metastasized to other parts of the body (so-called stage III and IV disease). The outlook for women diagnosed with stage III and IV disease, which are treated with a combination of surgery and chemotherapy, is very poor. Only 30% of women with stage III, and 5% with stage IV, are still alive five years after their cancer is diagnosed.
Why Was This Study Done?
If the cellular pathways that determine the biological behavior of ovarian cancer could be identified, it might be possible to develop more effective treatments for women with stage III and IV disease. One way to identify these pathways is to use gene expression profiling (a technique that catalogs all the genes expressed by a cell) to compare gene expression patterns in the ovarian cancers of women who survive for different lengths of time. Genes with different expression levels in tumors with different outcomes could be targets for new treatments. For example, it might be worth developing inhibitors of proteins whose expression is greatest in tumors with short survival times. In this study, the researchers develop an expression profile that is associated with overall survival in advanced-stage serous ovarian cancer (more than half of ovarian cancers originate in serous cells, epithelial cells that secrete a watery fluid). The researchers also assess the association of various cellular pathways and transcription factors (proteins that control the expression of other proteins) with survival in this type of ovarian carcinoma.
What Did the Researchers Do and Find?
The researchers analyzed the gene expression profiles of tumor samples taken from 157 patients with advanced stage serous ovarian cancer and used the “supervised principal components” method to build a predictor of overall survival from these profiles and patient survival times. This 86-gene predictor discriminated between patients with favorable and unfavorable outcomes (median survival times of 41 and 19 months, respectively). It also discriminated between groups of patients with these two outcomes in an independent dataset collected from 118 additional serous ovarian cancers. Next, the researchers used “functional class scoring” analysis to assess the association between pathway and transcription factor expression in the tumor samples and overall survival. Seventeen of 167 KEGG pathways (“wiring” diagrams of molecular interactions, reactions and relations involved in cellular processes and human diseases listed in the Kyoto Encyclopedia of Genes and Genomes) were associated with survival, 16 of which were confirmed in the independent dataset. Finally, 13 of 111 analyzed transcription factors were associated with overall survival in the tumor samples, 12 of which were confirmed in the independent dataset.
What Do These Findings Mean?
These findings identify an 86-gene overall survival gene expression profile that seems to predict overall survival for women with advanced serous ovarian cancer. However, before this profile can be used clinically, further validation of the profile and more robust methods for determining gene expression profiles are needed. Importantly, these findings also provide new clues about the genes, pathways and transcription factors that contribute to the clinical outcome of serous ovarian cancer, clues that can now be exploited in the search for new treatment strategies. Finally, these findings suggest that it might eventually be possible to tailor therapies to the needs of individual patients by analyzing which pathways are activated in their tumors and thus improve survival times for women with advanced ovarian cancer.
Additional Information.
Please access these Web sites via the online version of this summary at
This study is further discussed in a PLoS Medicine Perspective by Simon Gayther and Kate Lawrenson
See also a related PLoS Medicine Research Article by Huntsman and colleagues
The US National Cancer Institute provides a brief description of what cancer is and how it develops, and information on all aspects of ovarian cancer for patients and professionals (in English and Spanish)
The UK charity Cancerbackup provides general information about cancer, and more specific information about ovarian cancer
MedlinePlus also provides links to other information about ovarian cancer (in English and Spanish)
The KEGG Pathway database provides pathway maps of known molecular networks involved in a wide range of cellular processes
PMCID: PMC2634794  PMID: 19192944
2.  Regression Analysis of Combined Gene Expression Regulation in Acute Myeloid Leukemia 
PLoS Computational Biology  2014;10(10):e1003908.
Gene expression is a combinatorial function of genetic/epigenetic factors such as copy number variation (CNV), DNA methylation (DM), transcription factor (TF) occupancy, and microRNA (miRNA) post-transcriptional regulation. With the maturation of microarray/sequencing technologies, large amounts of data measuring the genome-wide signals of those factors became available from the Encyclopedia of DNA Elements (ENCODE) and The Cancer Genome Atlas (TCGA). However, there is a lack of an integrative model to take full advantage of these rich yet heterogeneous data. To this end, we developed RACER (Regression Analysis of Combined Expression Regulation), which fits the mRNA expression as the response, using as explanatory variables the TF data from ENCODE and the CNV, DM, and miRNA expression signals from TCGA. Briefly, RACER first infers the sample-specific regulatory activities of TFs and miRNAs, which are then used as inputs to infer specific TF/miRNA-gene interactions. Such a two-stage regression framework circumvents a common difficulty in integrating ENCODE data measured in a generic cell line with the sample-specific TCGA measurements. As a case study, we integrated Acute Myeloid Leukemia (AML) data from TCGA and the related TF binding data measured in K562 from ENCODE. As a proof-of-concept, we first verified our model formalism by 10-fold cross-validation on predicting gene expression. We next evaluated RACER on recovering known regulatory interactions, and demonstrated its superior statistical power over existing methods in detecting known miRNA/TF targets. Additionally, we developed a feature selection procedure, which identified 18 regulators whose activities clustered consistently with cytogenetic risk groups. One of the selected regulators is miR-548p, whose inferred targets were significantly enriched for leukemia-related pathways, implicating its novel role in AML pathogenesis. Moreover, survival analysis using the inferred activities identified C-Fos as a potential AML prognostic marker. Together, we provided a novel framework that successfully integrated the TCGA and ENCODE data in revealing the AML-specific regulatory program at the global level.
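The two-stage idea described above can be sketched in a few lines of Python. This is a deliberately reduced toy, not the RACER model: it assumes a single regulator, ordinary least squares without intercept terms of interest, and no CNV, methylation, or miRNA covariates. The names `ols_slope`, `two_stage`, `expression`, and `binding` are hypothetical.

```python
def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def two_stage(expression, binding):
    """Toy two-stage regression for one regulator.

    expression[s][g]: expression of gene g in sample s (sample-specific data).
    binding[g]: generic, cell-line-derived binding score of the regulator at gene g.
    """
    # Stage 1: within each sample, regress expression across genes on the
    # generic binding scores; the slope is that sample's regulator activity.
    activity = [ols_slope(binding, sample) for sample in expression]
    # Stage 2: for each gene, regress its expression across samples on the
    # inferred activities; the slope is a gene-specific interaction strength.
    n_genes = len(binding)
    interaction = [
        ols_slope(activity, [sample[g] for sample in expression])
        for g in range(n_genes)
    ]
    return activity, interaction


# Hypothetical data: expression = sample activity x binding score.
toy_binding = [0.0, 1.0, 2.0, 0.0]
toy_expression = [[a * b for b in toy_binding] for a in (1.0, 2.0, 3.0)]
toy_activity, toy_interaction = two_stage(toy_expression, toy_binding)
```

The point of the split is that the generic binding data only enter stage 1, so the gene-level estimates in stage 2 are driven by sample-specific quantities.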
Author Summary
Recent studies from The Cancer Genome Atlas (TCGA) showed that most Acute Myeloid Leukemia (AML) patients lack DNA mutations that could potentially explain tumorigenesis, which motivated a systematic approach to elucidate aberrant molecular signatures at the transcriptional and epigenetic levels. Using recently available data from two large consortia, namely the Encyclopedia of DNA Elements and TCGA, we developed a novel computational model to infer the regulatory activities of the expression regulators and their target genes in AML samples. Our analysis revealed 18 regulators whose dysregulation contributed significantly to explaining the global mRNA expression changes. Encouragingly, the inferred activities of these regulatory features followed a pattern consistent with the cytogenetic phenotypes of the AML patients. Among these regulators, we identified microRNA hsa-miR-548p, whose regulatory relationships with leukemia-related genes including YY1 suggest its novel role in AML pathogenesis. Additionally, we discovered that the inferred activities of transcription factor C-Fos can be used as a prognostic marker to characterize the survival rate of AML patients. Together, we demonstrated an effective model that can integrate useful information from a large amount of heterogeneous data to dissect regulatory effects. Furthermore, the novel biological findings from this study may be constructive to future experimental research in AML.
PMCID: PMC4207489  PMID: 25340776
3.  TP53 oncomorphic mutations predict resistance to platinum- and taxane-based standard chemotherapy in patients diagnosed with advanced serous ovarian carcinoma 
International Journal of Oncology  2014;46(2):607-618.
Individual mutations in the tumor suppressor TP53 alter p53 protein function. Some mutations create a non-functional protein, whereas others confer oncogenic activity, which we term ‘oncomorphic’. Since mutations in TP53 occur in nearly all ovarian tumors, the objective of this study was to determine the relationship of oncomorphic TP53 mutations with patient outcomes in advanced serous ovarian cancer patients. Clinical and molecular data from 264 high-grade serous ovarian cancer patients uniformly treated with standard platinum- and taxane-based adjuvant chemotherapy were downloaded from The Cancer Genome Atlas (TCGA) portal. Additionally, patient samples were obtained from the University of Iowa and individual mutations were analyzed in ovarian cancer cell lines. Mutations in TP53 were annotated and categorized as oncomorphic, loss of function (LOF), or unclassified. Associations between mutation types, chemoresistance, recurrence, and progression-free survival (PFS) were calculated. Oncomorphic TP53 mutations were present in 21.3% of ovarian cancers in the TCGA dataset. Patients with oncomorphic TP53 mutations demonstrated significantly worse PFS, a 60% higher risk of recurrence (HR = 1.60, 95% confidence interval 1.09–2.33, p = 0.015), and higher rates of platinum resistance (χ² test p = 0.0024) when compared with patients with single nucleotide mutations not categorized as oncomorphic. Furthermore, tumors containing oncomorphic TP53 mutations displayed unique protein expression profiles, and some mutations conferred increased clonogenic capacity in ovarian cancer cell models. Our study reveals that oncomorphic TP53 mutations are associated with worse patient outcome. These data suggest that future studies should take into consideration the functional consequences of TP53 mutations when determining treatment options.
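The hazard ratio, confidence interval, and p-value reported above can be checked against one another with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log hazard scale and a two-sided normal test; the abstract does not say which test produced the p-value, so this is a consistency check under those assumptions, not a reproduction of the authors' analysis. The function name `p_from_hr_ci` is hypothetical.

```python
import math


def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Back out the standard error, z statistic, and two-sided p-value
    from a hazard ratio and its 95% confidence interval."""
    # On the log scale the interval is log(hr) +/- z_crit * se.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided p-value under a standard normal: P(|Z| > z) = erfc(z / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the abstract: HR = 1.60, 95% CI 1.09 to 2.33.
se, z, p = p_from_hr_ci(1.60, 1.09, 2.33)
```

Under these assumptions the recovered p-value agrees with the reported 0.015 to within rounding.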
PMCID: PMC4277253  PMID: 25385265
oncomorphic p53 mutation; TP53; gain-of-function; ovarian cancer; chemoresistance
4.  Network-based Survival Analysis Reveals Subnetwork Signatures for Predicting Outcomes of Ovarian Cancer Treatment 
PLoS Computational Biology  2013;9(3):e1002975.
Cox regression is commonly used to predict the outcome by the time to an event of interest and, in addition, to identify relevant features for survival analysis in cancer genomics. Due to the high dimensionality of high-throughput genomic data, existing Cox models trained on any particular dataset usually generalize poorly to other independent datasets. In this paper, we propose a network-based Cox regression model called Net-Cox and apply Net-Cox to a large-scale survival analysis across multiple ovarian cancer datasets. Net-Cox integrates gene network information into the Cox proportional hazards model to explore the co-expression or functional relation among high-dimensional gene expression features in the gene network. Net-Cox was applied to analyze three independent gene expression datasets including the TCGA ovarian cancer dataset and two other public ovarian cancer datasets. Net-Cox with the network information from gene co-expression or functional relations identified highly consistent signature genes across the three datasets, and because of the better generalization across the datasets, Net-Cox also consistently improved the accuracy of survival prediction over the Cox models regularized by the L2 norm or the L1 norm. This study focused on analyzing the death and recurrence outcomes in the treatment of ovarian carcinoma to identify signature genes that can more reliably predict the events. The signature genes comprise dense protein-protein interaction subnetworks, enriched by extracellular matrix receptors and modulators or by nuclear signaling components downstream of extracellular signal-regulated kinases. In the laboratory validation of the signature genes, a tumor array experiment by protein staining on an independent patient cohort from Mayo Clinic showed that the protein expression of the signature gene FBN1 is a biomarker significantly associated with the early recurrence after 12 months of the treatment in the ovarian cancer patients who are initially sensitive to chemotherapy. Net-Cox toolbox is available at
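To make the idea of adding network information to a Cox model more concrete, the sketch below evaluates a Cox negative partial log-likelihood plus a generic graph-smoothness penalty that discourages connected genes from having very different coefficients. This is an assumed, simplified form chosen for illustration; the abstract does not give the Net-Cox objective, and the function names, the edge-list format, and the weighting parameter `lam` are all hypothetical.

```python
import math


def neg_partial_loglik(beta, X, times, events):
    """Cox negative partial log-likelihood (Breslow-style handling of ties)."""
    scores = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        total -= scores[i] - math.log(risk_sum)
    return total


def network_penalty(beta, edges):
    """Sum over edges (i, j, weight) of weight * (beta_i - beta_j)^2."""
    return sum(w * (beta[i] - beta[j]) ** 2 for i, j, w in edges)


def network_cox_objective(beta, X, times, events, edges, lam):
    """Penalized objective: smaller is better; lam trades fit against smoothness."""
    return neg_partial_loglik(beta, X, times, events) + lam * network_penalty(beta, edges)


# Hypothetical toy data: 3 patients, 2 genes linked by one network edge.
toy_X = [[1.0, 0.5], [0.2, 0.1], [0.0, 0.3]]
toy_times = [1.0, 2.0, 3.0]
toy_events = [1, 1, 1]
toy_edges = [(0, 1, 1.0)]
toy_value = network_cox_objective([0.0, 0.0], toy_X, toy_times, toy_events, toy_edges, lam=0.5)
```

Minimizing such an objective with any gradient-based optimizer pulls the coefficients of neighboring genes toward each other, which is one plausible reason a network-constrained model could transfer better across datasets than one fitted gene by gene.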
Author Summary
Network-based computational models are attracting increasing attention in studying cancer genomics because molecular networks provide valuable information on the functional organizations of molecules in cells. Survival analysis mostly with the Cox proportional hazard model is widely used to predict or correlate gene expressions with time to an event of interest (outcome) in cancer genomics. Surprisingly, network-based survival analysis has not received enough attention. In this paper, we studied resistance to chemotherapy in ovarian cancer with a network-based Cox model, called Net-Cox. The experiments confirm that networks representing gene co-expression or functional relations can be used to improve the accuracy and the robustness of survival prediction of outcome in ovarian cancer treatment. The study also revealed subnetwork signatures that are enriched by extracellular matrix receptors and modulators and the downstream nuclear signaling components of extracellular signal-regulators, respectively. In particular, FBN1, which was detected as a signature gene of high confidence by Net-Cox with network information, was validated as a biomarker for predicting early recurrence in platinum-sensitive ovarian cancer patients in laboratory.
PMCID: PMC3605061  PMID: 23555212
5.  Identification of genes and pathways involved in kidney renal clear cell carcinoma 
BMC Bioinformatics  2014;15(Suppl 17):S2.
Kidney Renal Clear Cell Carcinoma (KIRC) is one of the fatal genitourinary diseases and accounts for most malignant kidney tumours. KIRC has shown resistance to radiotherapy and chemotherapy. As with many types of cancer, there is no curative treatment for metastatic KIRC. Using advanced sequencing technologies, The Cancer Genome Atlas (TCGA) project of NIH/NCI-NHGRI has produced large-scale sequencing data, which provide unprecedented opportunities to reveal new molecular mechanisms of cancer. We combined differentially expressed genes, pathways and network analyses to gain new insights into the underlying molecular mechanisms of the disease development.
Following an experimental design for obtaining significant genes and pathways, a comprehensive analysis of sequencing data from 537 KIRC patients provided by TCGA was performed. Differentially expressed genes were obtained from the RNA-Seq data. Pathway and network analyses were performed. We identified 186 differentially expressed genes with significant p-values and large fold changes (P < 0.01, |log(FC)| > 5). The study not only confirmed a number of differentially expressed genes identified in literature reports, but also provided new findings. We performed hierarchical clustering analysis utilizing the genome-wide gene expression data and the differentially expressed genes that were identified in this study. We revealed distinct groups of differentially expressed genes that can aid the identification of subtypes of the cancer. The hierarchical clustering analysis based on gene expression profiles and differentially expressed genes suggested four subtypes of the cancer. We found distinct enriched Gene Ontology (GO) terms associated with these groups of genes. Based on these findings, we built a support vector machine based supervised-learning classifier to predict unknown samples, and the classifier achieved high accuracy and robust classification results. In addition, we identified a number of pathways (P < 0.04) that were significantly influenced by the disease. We found that some of the identified pathways have been implicated in cancers in the literature, while others have not been reported in this cancer before. The network analysis led to the identification of significantly disrupted pathways and associated genes involved in the disease development. Furthermore, this study can provide a viable alternative for identifying effective drug targets.
Our study identified a set of differentially expressed genes and pathways in kidney renal clear cell carcinoma, and represents a comprehensive computational approach to analyzing large-scale next-generation sequencing data. The pathway and network analyses suggested that information from distinctly expressed genes can be utilized in the identification of aberrant upstream regulators. Identification of distinctly expressed genes and altered pathways is important in effective biomarker identification for early cancer diagnosis and treatment planning. Combining differentially expressed genes with pathway and network analyses using intelligent computational approaches provides an unprecedented opportunity to identify upstream disease causal genes and effective drug targets.
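The selection rule quoted above (P < 0.01 and |log(FC)| > 5) amounts to a simple two-condition filter. The sketch below assumes the logarithm is base 2 and that a per-gene p-value has already been computed elsewhere; neither detail is stated in the abstract. The gene names, expression values, and the function `select_de_genes` are hypothetical.

```python
import math


def select_de_genes(results, p_cutoff=0.01, abs_log_fc_cutoff=5.0):
    """Keep genes with p < p_cutoff and |log2 fold change| > abs_log_fc_cutoff.

    results maps gene name -> (mean tumor expression, mean normal expression, p-value).
    """
    selected = []
    for gene, (mean_tumor, mean_normal, p_value) in results.items():
        log_fc = math.log2(mean_tumor / mean_normal)  # base 2 is an assumption
        if p_value < p_cutoff and abs(log_fc) > abs_log_fc_cutoff:
            selected.append((gene, log_fc, p_value))
    # Largest absolute fold changes first.
    return sorted(selected, key=lambda item: abs(item[1]), reverse=True)


toy_results = {
    "GENE_A": (640.0, 10.0, 0.001),   # strongly up in tumor, significant
    "GENE_B": (10.0, 640.0, 0.005),   # strongly down in tumor, significant
    "GENE_C": (40.0, 10.0, 0.0001),   # significant but fold change too small
    "GENE_D": (1280.0, 10.0, 0.2),    # large fold change but not significant
}
toy_selected = select_de_genes(toy_results)
```

Requiring both conditions keeps genes whose change is both statistically supported and large in magnitude, at the cost of discarding modest but consistent changes.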
PMCID: PMC4304191  PMID: 25559354
Kidney Renal Clear Cell Carcinoma; TCGA; RNA-Seq; Differentially Expressed Genes; Pathways; Gene Network Analysis; Machine Learning Classifier
6.  Why Is There a Lack of Consensus on Molecular Subgroups of Glioblastoma? Understanding the Nature of Biological and Statistical Variability in Glioblastoma Expression Data 
PLoS ONE  2011;6(7):e20826.
Gene expression patterns characterizing clinically-relevant molecular subgroups of glioblastoma are difficult to reproduce. We suspect a combination of biological and analytic factors confounds interpretation of glioblastoma expression data. We seek to clarify the nature and relative contributions of these factors, to focus additional investigations, and to improve the accuracy and consistency of translational glioblastoma analyses.
We analyzed gene expression and clinical data for 340 glioblastomas in The Cancer Genome Atlas (TCGA). We developed a logic model to analyze potential sources of biological, technical, and analytic variability and used standard linear classifiers and linear dimensional reduction algorithms to investigate the nature and relative contributions of each factor.
Commonly described sources of classification error, including individual sample characteristics, batch effects, and analytic and technical noise, make measurable but proportionally minor contributions to inconsistent molecular classification. Our analysis suggests that three previously underappreciated factors may account for a larger fraction of classification errors: inherent non-linear/non-orthogonal relationships among the genes used in conjunction with classification algorithms that assume linearity; skewed data distributions assumed to be Gaussian; and biologic variability (noise) among tumors, of which we propose three types.
Our analysis of the TCGA data demonstrates a contributory role for technical factors in molecular classification inconsistencies in glioblastoma but also suggests that biological variability, abnormal data distribution, and non-linear relationships among genes may be responsible for a proportionally larger component of classification error. These findings may have important implications for both glioblastoma research and for translational application of other large-volume biological databases.
PMCID: PMC3145641  PMID: 21829433
7.  Identifying survival associated morphological features of triple negative breast cancer using multiple datasets 
Background and objective
Biomarkers for subtyping triple negative breast cancer (TNBC) are needed given the absence of responsive therapy and relatively poor prediction of survival. Morphology of cancer tissues is widely used in clinical practice for stratifying cancer patients, while genomic data are highly effective to classify cancer patients into subgroups. Thus integration of both morphological and genomic data is a promising approach in discovering new biomarkers for cancer outcome prediction. Here we propose a workflow for analyzing histopathological images and integrate them with genomic data for discovering biomarkers for TNBC.
Materials and methods
We developed an image analysis workflow for extracting a large collection of morphological features and deployed the same on histological images from The Cancer Genome Atlas (TCGA) TNBC samples during the discovery phase (n=44). Strong correlations between salient morphological features and gene expression profiles from the same patients were identified. We then evaluated the same morphological features in predicting survival using a local TNBC cohort (n=143). We further tested the predictive power on patient prognosis of correlated gene clusters using two other public gene expression datasets.
Results and conclusion
Using TCGA data, we identified 48 pairs of significantly correlated morphological features and gene clusters; four morphological features were able to separate the local cohort with significantly different survival outcomes. Gene clusters correlated with these four morphological features further proved to be effective in predicting patient survival using multiple public gene expression datasets. These results suggest the efficacy of our workflow and demonstrate that integrative analysis holds promise for discovering biomarkers of complex diseases.
PMCID: PMC3721170  PMID: 23585272
Triple Negative Breast Cancer; Computational Biology; Image Analysis; Cancer Survival; Biomarker Identification; The Cancer Genome Atlas
8.  Optimal chemotherapy treatment for women with recurrent ovarian cancer 
Current Oncology  2007;14(5):195-208.
What is the optimal chemotherapy treatment for women with recurrent ovarian cancer who have previously received platinum-based chemotherapy?
Currently, standard primary therapy for advanced disease involves a combination of maximal cytoreductive surgery and chemotherapy with carboplatin plus paclitaxel or with carboplatin alone. Despite initial high response rates, a large proportion of patients relapse, resulting in a therapeutic challenge. Because these patients are not curable, the goal of therapy becomes improvement in both quality and length of life. The search has therefore been to find active agents for women with recurrent disease following platinum-based chemotherapy.
Outcomes of interest included any combination of tumour response rate, progression-free survival, overall survival, adverse events, and quality of life.
The MEDLINE, EMBASE, and Cochrane Library databases were systematically searched for primary articles and practice guidelines. The resulting evidence informed the development of clinical practice recommendations. The systematic review and recommendations were approved by the Report Approval Panel of the Program in Evidence-Based Care, and by the Gynecology Cancer Disease Site Group (DSG). The practice guideline was externally reviewed by a sample of practitioners from Ontario, Canada.
Thirteen randomized trials compared various chemotherapy regimens for patients with recurrent ovarian cancer.
In five of the thirteen trials in which 100% of patients were considered sensitive to platinum-containing chemotherapy, further platinum-based combination chemotherapy significantly improved response rates (two trials), progression-free survival (four trials), and overall survival (three trials) when compared with single-agent chemotherapy involving carboplatin or paclitaxel. Only two of these randomized trials compared the same chemotherapy regimens: carboplatin alone versus the combination of carboplatin and paclitaxel. Both trials were consistent in reporting improved survival outcomes with the combination of carboplatin and paclitaxel. In one trial, the combination of carboplatin and gemcitabine resulted in significantly higher response rates and improved progression-free survival when compared with carboplatin alone. Median survival with carboplatin alone ranged from 17 months to 24 months in four trials.
In eight of the thirteen trials in which 35%–100% of patients had platinum-refractory or -resistant disease, one trial reported a statistically significant 2-month improvement in overall survival with liposomal doxorubicin as compared with topotecan (15 months vs. 13 months, p = 0.038; hazard ratio: 1.23; 95% confidence interval: 1.01 to 1.50). In that trial, because of the limited clinical benefit and the unusual finding that a survival difference emerged only after a year of treatment with no corresponding improvement in the rate of response or of progression-free survival, the authors concluded that further confirmation by results from randomized trials was needed to establish the superiority of one agent over another. In one trial, topotecan was superior to treosulphan in progression-free survival by approximately 2 months (5.4 months vs. 3.0 months, p < 0.001).
Toxicity was reported in all of the randomized trials, and although data on adverse events varied by treatment regimen, the observed adverse events correlated with known toxicity profiles. As expected, combination chemotherapy was associated with higher rates of adverse events.
Practice Guideline
Target Population
This clinical recommendation applies to women with recurrent epithelial ovarian cancer who have previously received platinum-based chemotherapy. Of specific interest are women who have previously shown sensitivity to platinum therapy and those who previously were refractory or resistant to platinum-based chemotherapy. As a general categorization within what is actually a continuum, “platinum sensitivity” refers to disease recurrence 6 months or more after prior platinum-containing chemotherapy, and “platinum resistance” refers to a response to platinum-based chemotherapy followed by relapse less than 6 months after chemotherapy is stopped. “Platinum-refractory disease” refers to a lack of response or to progression while on platinum-based chemotherapy.
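The three working definitions above can be written as a small decision rule. The sketch below is only an illustration of the stated 6-month cut-off; as the guideline itself notes, the categories sit on a continuum, and the function name, argument names, and the handling of patients with no recorded recurrence are assumptions.

```python
def classify_platinum_status(responded, progressed_on_treatment, months_to_relapse=None):
    """Categorize prior response to platinum-based chemotherapy.

    responded: True if the disease responded to platinum-based chemotherapy.
    progressed_on_treatment: True if the disease progressed while on treatment.
    months_to_relapse: months between stopping chemotherapy and relapse,
        or None if no relapse has been recorded (assumed extra case).
    """
    # Refractory: no response, or progression during treatment.
    if not responded or progressed_on_treatment:
        return "platinum-refractory"
    if months_to_relapse is None:
        return "no recurrence recorded"
    # Resistant: response followed by relapse in under 6 months.
    if months_to_relapse < 6:
        return "platinum-resistant"
    # Sensitive: recurrence 6 months or more after prior platinum therapy.
    return "platinum-sensitive"


example_status = classify_platinum_status(True, False, months_to_relapse=9)
```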
Although the body of evidence that informs the clinical recommendations is based on randomized trial data, those data are incomplete. Based on the available data and expert consensus opinion, the Gynecology Cancer DSG makes these recommendations:
Systemic therapy for recurrent ovarian cancer is not curative. It is therefore recognized that each patient must be individually assessed to determine optimal therapy in terms of recurrence, sensitivity to platinum, toxicity, ease of administration, and patient preference. All suitable patients should be offered the opportunity to participate in randomized trials, if available.
In the absence of contraindications, combination platinum-based chemotherapy should be considered for patients with prior sensitivity to platinum-containing chemotherapy. As compared with carboplatin alone, the combination of carboplatin and paclitaxel significantly improved both progression-free and overall survival.
If combination platinum-based chemotherapy is not indicated, then a single platinum agent should be considered. Carboplatin has demonstrated efficacy across trials and has a manageable toxicity profile.
If a single platinum agent is not being considered, then monotherapy with paclitaxel, topotecan, or pegylated liposomal doxorubicin is seen as a reasonable treatment option.
Some patients may be repeatedly sensitive to treatment and may benefit from multiple lines of chemotherapy.
For patients with platinum-refractory or platinum-resistant disease, the goals of treatment should be to improve quality of life by extending the symptom-free interval, by reducing symptom intensity, and by increasing progression-free interval, and, if possible, to prolong life.
With non-platinum agents, monotherapy should be considered because no advantage appears to accrue to the use of non-platinum-containing combination chemotherapy in this group of patients. Single-agent paclitaxel, topotecan, or pegylated liposomal doxorubicin have demonstrated activity in this patient population and are reasonable treatment options.
No evidence either supports or refutes the use of more than one line of chemotherapy in patients with platinum-refractory or platinum-resistant recurrence. Many treatment options have shown modest response rates, but their benefits over best supportive care have not been studied in clinical trials.
PMCID: PMC2002482  PMID: 17938703
Chemotherapy; drug therapy; ovarian cancer; ovarian neoplasms; practice guideline; systematic review
9.  An eight-miRNA signature as a potential biomarker for predicting survival in lung adenocarcinoma 
Lung adenocarcinoma (LUAD) is a heterogeneous disease that creates challenges for classification and management. The purpose of this study is to identify specific miRNA markers closely associated with the survival of LUAD patients from a large dataset of significantly altered miRNAs, and to assess the prognostic value of this miRNA expression profile for overall survival in patients with LUAD.
We obtained miRNA expression profiles and corresponding clinical information for 372 LUAD patients from The Cancer Genome Atlas (TCGA), and identified the most significantly altered miRNAs between tumor and normal samples. Using survival analysis and supervised principal components method, we identified an eight-miRNA signature for the prediction of overall survival (OS) of LUAD patients. The relationship between OS and the identified miRNA signature was self-validated in the TCGA cohort (randomly classified into two subgroups: n = 186 for the training set and n = 186 for the testing set). Survival receiver operating characteristic (ROC) analysis was used to assess the performance of survival prediction. The biological relevance of putative miRNA targets was also analyzed using bioinformatics.
Sixteen of the 111 most significantly altered miRNAs were associated with OS across different clinical subclasses of the TCGA-derived LUAD cohort. A linear prognostic model of eight miRNAs (miR-31, miR-196b, miR-766, miR-519a-1, miR-375, miR-187, miR-331 and miR-101-1) was constructed and weighted by the importance scores from the supervised principal component method to divide patients into high- and low-risk groups. Patients assigned to the high-risk group exhibited poor OS compared with patients in the low-risk group (hazard ratio [HR] = 1.99, P <0.001). The eight-miRNA signature is an independent prognostic marker of OS of LUAD patients and demonstrates good performance for predicting 5-year OS (Area Under the respective ROC Curves [AUC] = 0.626, P = 0.003), especially for non-smokers (AUC = 0.686, P = 0.023).
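The linear risk model described above can be sketched as a weighted sum of miRNA expression values followed by a cutoff. This is a minimal illustration only: the weights, the three-miRNA subset, the patient values, and the choice of the median as the cutoff are all assumptions, since the abstract reports neither the importance scores nor the threshold used.

```python
from statistics import median

# Hypothetical weights for illustration only; the abstract does not report
# the importance scores used to weight each miRNA.
WEIGHTS = {
    "miR-31": 0.4,
    "miR-196b": 0.3,
    "miR-375": -0.2,
}


def risk_score(expression, weights=WEIGHTS):
    """Weighted linear combination of miRNA expression values."""
    return sum(weights[name] * expression[name] for name in weights)


def stratify(patients, weights=WEIGHTS):
    """Label each patient 'high' or 'low' risk, using the median score as an
    assumed cutoff (the abstract does not state the threshold)."""
    scores = {pid: risk_score(expr, weights) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for four hypothetical patients.
patients = {
    "P1": {"miR-31": 2.0, "miR-196b": 1.0, "miR-375": 0.5},
    "P2": {"miR-31": 0.5, "miR-196b": 0.2, "miR-375": 2.0},
    "P3": {"miR-31": 1.0, "miR-196b": 1.5, "miR-375": 1.0},
    "P4": {"miR-31": 0.1, "miR-196b": 0.1, "miR-375": 3.0},
}
groups = stratify(patients)
```

With these toy values, P1 and P3 fall above the median score and are labeled high risk, while P2 and P4 are labeled low risk.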
We identified an eight-miRNA signature that is prognostic of LUAD. The miRNA signature, if validated in other prospective studies, may have important implications in clinical practice, in particular identifying a subgroup of patients with LUAD who are at high risk of mortality.
PMCID: PMC4062505  PMID: 24893932
Lung adenocarcinoma; MicroRNA; Prognostic markers; Overall survival
10.  A DNA Repair Pathway–Focused Score for Prediction of Outcomes in Ovarian Cancer Treated With Platinum-Based Chemotherapy 
New tools are needed to predict outcomes of ovarian cancer patients treated with platinum-based chemotherapy. We hypothesized that a molecular score based on expression of genes that are involved in platinum-induced DNA damage repair could provide such prognostic information.
Gene expression data was extracted from The Cancer Genome Atlas (TCGA) database for 151 DNA repair genes from tumors of serous ovarian cystadenocarcinoma patients (n = 511). A molecular score was generated based on the expression of 23 genes involved in platinum-induced DNA damage repair pathways. Patients were divided into low (scores 0–10) and high (scores 11–20) score groups, and overall survival (OS) was analyzed by Kaplan–Meier method. Results were validated in two gene expression microarray datasets. Association of the score with OS was compared with known clinical factors (age, stage, grade, and extent of surgical debulking) using univariate and multivariable Cox proportional hazards models. Score performance was evaluated by receiver operating characteristic (ROC) curve analysis. Correlations between the score and likelihood of complete response, recurrence-free survival, and progression-free survival were assessed. Statistical tests were two-sided.
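The score split and Kaplan–Meier analysis described above can be illustrated with a small product-limit estimator. This is a sketch under stated assumptions: the function names and the follow-up data are hypothetical, and only the 0–10 versus 11–20 grouping comes from the abstract.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


def score_group(score):
    """Grouping stated in the abstract: scores 0-10 are 'low', 11-20 'high'."""
    return "low" if score <= 10 else "high"


# Made-up follow-up data (months) for five hypothetical patients.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
```

In practice, one curve would be estimated per score group and the groups compared with a log-rank test; a dedicated survival analysis library is the usual choice for real data.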
Improved survival was associated with being in the high-scoring group (high vs low scores: 5-year OS, 40% vs 17%, P < .001), and results were reproduced in the validation datasets (P < .05). The score was the only pretreatment factor that showed a statistically significant association with OS (high vs low scores, hazard ratio of death = 0.40, 95% confidence interval = 0.32 to 0.66, P < .001). ROC curves indicated that the score outperformed the known clinical factors (score in a validation dataset vs clinical factors, area under the curve = 0.65 vs 0.52). The score positively correlated with complete response rate, recurrence-free survival, and progression-free survival (Pearson correlation coefficient [r2] = 0.60, 0.84, and 0.80, respectively; P < .001 for all).
The DNA repair pathway–focused score can be used to predict outcomes and response to platinum therapy in ovarian cancer patients.
PMCID: PMC3341307  PMID: 22505474
11.  Time to Recurrence and Survival in Serous Ovarian Tumors Predicted from Integrated Genomic Profiles 
PLoS ONE  2011;6(11):e24709.
Serous ovarian cancer (SeOvCa) is an aggressive disease with differential and often inadequate therapeutic outcome after standard treatment. The Cancer Genome Atlas (TCGA) has provided rich molecular and genetic profiles from hundreds of primary surgical samples. These profiles confirm mutations of TP53 in ∼100% of patients and an extraordinarily complex profile of DNA copy number changes with considerable patient-to-patient diversity. This raises the joint challenge of exploiting all new available datasets and reducing their confounding complexity for the purpose of predicting clinical outcomes and identifying disease relevant pathway alterations. We therefore set out to use multi-data type genomic profiles (mRNA, DNA methylation, DNA copy-number alteration and microRNA) available from TCGA to identify prognostic signatures for the prediction of progression-free survival (PFS) and overall survival (OS).
Methodology/Principal Findings
We implemented a multivariate Cox Lasso model and median time-to-event prediction algorithm and applied it to two datasets integrated from the four genomic data types. We (1) selected features through cross-validation; (2) generated a prognostic index for patient risk stratification; and (3) directly predicted continuous clinical outcome measures, that is, the time to recurrence and survival time. We used Kaplan-Meier p-values, hazard ratios (HR), and concordance probability estimates (CPE) to assess prediction performance, comparing separate and integrated datasets. Data integration resulted in the best PFS signature (withheld data: p-value = 0.008; HR = 2.83; CPE = 0.72).
We provide a prediction tool that inputs genomic profiles of primary surgical samples and generates patient-specific predictions for the time to recurrence and survival, along with outcome risk predictions. Using integrated genomic profiles resulted in information gain for prediction of outcomes. Pathway analysis provided potential insights into functional changes affecting disease progression. The prognostic signatures, if prospectively validated, may be useful for interpreting therapeutic outcomes for clinical trials that aim to improve the therapy for SeOvCa patients.
PMCID: PMC3207809  PMID: 22073136
12.  The Proneural Molecular Signature Is Enriched in Oligodendrogliomas and Predicts Improved Survival among Diffuse Gliomas 
PLoS ONE  2010;5(9):e12548.
The Cancer Genome Atlas Project (TCGA) has produced an extensive collection of ‘-omic’ data on glioblastoma (GBM), resulting in several key insights on expression signatures. Despite the richness of TCGA GBM data, the absence of lower grade gliomas in this data set prevents analysis of genes related to progression and the uncovering of predictive signatures. A complementary dataset exists in the form of the NCI Repository for Molecular Brain Neoplasia Data (Rembrandt), which contains molecular and clinical data for diffuse gliomas across the full spectrum of histologic class and grade. Here we present an investigation of the significance of the TCGA consortium's expression classification when applied to Rembrandt gliomas. We demonstrate that the proneural signature predicts improved clinical outcome among 176 Rembrandt gliomas that include all histologies and grades, including GBMs (log rank test p = 1.16e-6), but also among 75 grade II and grade III samples (p = 2.65e-4). This gene expression signature was enriched in tumors with oligodendroglioma histology and also predicted improved survival in this tumor type (n = 43, p = 1.25e-4). Thus, expression signatures identified in the TCGA analysis of GBMs also have intrinsic prognostic value for lower grade oligodendrogliomas, and likely represent important differences in tumor biology with implications for treatment and therapy. Integrated DNA and RNA analysis of low-grade and high-grade proneural gliomas identified increased expression and gene amplification of several genes including GLIS3, TGFB2, TNC, AURKA, and VEGFA in proneural GBMs, with corresponding loss of DLL3 and HEY2. Pathway analysis highlights the importance of the Notch and Hedgehog pathways in the proneural subtype.
This demonstrates that the expression signatures identified in the TCGA analysis of GBMs also have intrinsic prognostic value for low-grade oligodendrogliomas, and likely represent important differences in tumor biology with implications for treatment and therapy.
PMCID: PMC2933229  PMID: 20838435
13.  Functional characterization of breast cancer using pathway profiles 
BMC Medical Genomics  2014;7:45.
The molecular characteristics of human diseases are often represented by a list of genes termed “signature genes”. A significant challenge facing this approach is that of reproducibility: signatures developed on a set of patients may fail to perform well on different sets of patients. As diseases result from perturbed cellular functions, irrespective of the particular genes that contribute to the function, it may be more appropriate to characterize diseases based on these perturbed cellular functions.
We proposed a profile-based approach to characterize a disease using a binary vector whose elements indicate whether a given function is perturbed based on the enrichment analysis of expression data between normal and tumor tissues. Using breast cancer and its four primary clinically relevant subtypes as examples, this approach is evaluated based on the reproducibility, accuracy and resolution of the resulting pathway profiles.
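A binary pathway profile of the kind described above can be represented as a list of 0/1 flags over a fixed pathway list. The sketch below compares two such profiles with the Jaccard index; this overlap measure is an assumption chosen for illustration and is not necessarily the reproducibility metric used by the authors. The pathway names are likewise placeholders.

```python
def pathway_profile(enriched, all_pathways):
    """Binary vector: 1 if the pathway is enriched (perturbed), else 0."""
    return [1 if p in enriched else 0 for p in all_pathways]


def jaccard(profile_a, profile_b):
    """Share of pathways flagged in both profiles among those flagged in either."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two empty profiles are treated as identical.
    return both / either if either else 1.0


# Placeholder pathway list and enrichment results for two hypothetical data sets.
pathways = ["cell cycle", "DNA replication", "p53 signaling", "focal adhesion"]
profile_1 = pathway_profile({"cell cycle", "DNA replication", "p53 signaling"}, pathways)
profile_2 = pathway_profile({"cell cycle", "p53 signaling", "focal adhesion"}, pathways)
overlap = jaccard(profile_1, profile_2)
```

Here two of the four pathways are flagged in both profiles and all four are flagged in at least one, so the overlap is 0.5.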
Pathway profiles for breast cancer and its subtypes are constructed based on data obtained from microarray and RNA-Seq data sets provided by The Cancer Genome Atlas (TCGA), and an additional microarray data set provided by The European Genome-phenome Archive (EGA). An average reproducibility of 68% is achieved between different data sets (TCGA microarray vs. EGA microarray data) and 67% average reproducibility is achieved between different technologies (TCGA microarray vs. TCGA RNA-Seq data). Among the enriched pathways, 74% of them are known to be associated with breast cancer or other cancers. About 40% of the identified pathways are enriched in all four subtypes, with 4, 2, 4, and 7 pathways enriched only in luminal A, luminal B, triple-negative, and HER2+ subtypes, respectively. Comparison of profiles between subtypes, as well as other diseases, shows that luminal A and luminal B subtypes are more similar to the HER2+ subtype than to the triple-negative subtype, and subtypes of breast cancer are more likely to be closer to each other than to other diseases.
Our results demonstrate that pathway profiles can successfully characterize both common and distinct functional characteristics of four subtypes of breast cancer and other related diseases, with acceptable reproducibility, high accuracy and reasonable resolution.
PMCID: PMC4113668  PMID: 25041817
Signature genes; Pathway; Pathway profile; Enrichment analysis; Breast cancer
14.  Gene-gene interaction network analysis of ovarian cancer using TCGA data 
The Cancer Genome Atlas (TCGA) Data portal provides a platform for researchers to search, download, and analyze data generated by TCGA. The objective of this study was to explore the molecular mechanism of ovarian cancer pathogenesis.
Microarray data of ovarian cancer were downloaded from the TCGA database, and the Limma package in R was used to identify the differentially expressed genes (DEGs) between ovarian cancer and normal samples, followed by function and pathway annotation of the DEGs. Next, NetBox software was used to construct the gene-gene interaction (GGI) network and identify the corresponding modules, and functions of genes in the modules were screened using DAVID.
Our study identified 332 DEGs, including 146 up-regulated genes, which were mainly involved in cell cycle related functions and the cell cycle pathway, and 186 down-regulated genes, which were enriched in the extracellular region part function and the ether lipid metabolism pathway. The GGI network was constructed from 127 DEGs and 209 genes that interacted significantly with them (LINKERs). Of the top 10 nodes ranked by degree in the network, 5 were LINKERs. In total, 7 functional modules in the network were selected, and they were enriched in different functions and pathways, such as mitosis process, DNA replication and DNA double-strand synthesis, lipid synthesis processes and metabolic pathways. AR, BRCA1, TFDP1, FOXM1, CDK2, and DBF4 were identified as the transcription factors of the 7 modules.
Our data provide a comprehensive bioinformatics analysis of genes, functions, and pathways which may be involved in the pathogenesis of ovarian cancer.
PMCID: PMC4029308  PMID: 24314048
Differentially expressed genes; Function and pathway annotation; Gene-gene interaction network; Functional modules
15.  Src homology domain-containing phosphatase 2 suppresses cellular senescence in glioblastoma 
British Journal of Cancer  2011;105(8):1235-1243.
Epidermal growth factor receptor (EGFR) signalling is frequently altered during glioblastoma de novo pathogenesis. An important downstream modulator of this signal cascade is SHP2 (Src homology domain-containing phosphatase 2).
We examined The Cancer Genome Atlas (TCGA) database for SHP2 mutations. We also examined the expression of a further 191 phosphatases in the TCGA database and used principal component and comparative marker analysis available from the Broad Institute to recapitulate the TCGA-defined subgroups and identify the specific phosphatases defining each subgroup. We identified five siRNAs from two independent commercial sources that were reported by the vendor to be pre-optimised in their specificity of SHP2 silencing. The specificity and physiological effects of these siRNAs were tested using an in vitro glioma model.
TCGA data demonstrate SHP2 to be mutated in 2% of the glioblastoma multiforme cases studied. Both mutations identified in this study are likely to be activating mutations. We found that the four subgroups of GBM as defined by TCGA differ significantly with regard to the expression level of specific phosphatases as revealed by comparative marker analysis. Surprisingly, the four subgroups can be defined solely on the basis of phosphatase expression level by principal component analysis. This result suggests that critical phosphatases are responsible for the modulation of specific molecular pathways within each subgroup. Src homology domain-containing phosphatase 2 constitutes one of the 12 phosphatases that define the classical subgroup. We confirmed the biological significance by siRNA knockdown of SHP2. All five siRNAs tested reduced SHP2 expression by 70–100% and reduced glioblastoma cell line growth by up to 80%. Profiling the established molecular targets of SHP2 (ERK1/2 and STAT3) confirmed specificity of these siRNAs. The loss of cell viability induced by SHP2 silencing could not be explained by a significant increase in apoptosis alone as demonstrated by terminal deoxyribonucleotidyl transferase-mediated nick-end labelling and propidium iodide staining. Src homology domain-containing phosphatase 2 silencing, however, did induce an increase in β-galactosidase staining. Propidium iodide staining also showed that SHP2 silencing increases the population of glioblastoma cells in the G1 phase of the cell cycle and reduces the population of such cells in the G2/M- and S-phase.
Src homology domain-containing phosphatase 2 promotes the growth of glioblastoma cells by suppression of cellular senescence, a phenomenon not described previously. Selective inhibitors of SHP2 are commercially available and may be considered as a strategy for glioblastoma therapy.
PMCID: PMC3208488  PMID: 21934682
glioblastoma; phosphatases; SHP2; senescence
16.  Identification of ovarian cancer driver genes by using module network integration of multi-omics data 
Interface Focus  2013;3(4):20130013.
The increasing availability of multi-omics cancer datasets has created a new opportunity for data integration that promises a more comprehensive understanding of cancer. The challenge is to develop mathematical methods that allow the integration and extraction of knowledge from large datasets such as The Cancer Genome Atlas (TCGA). This has led to the development of a variety of omics profiles that are highly correlated with each other; however, it remains unknown which profile is the most meaningful and how to efficiently integrate different omics profiles. We developed AMARETTO, an algorithm to identify cancer drivers by integrating a variety of omics data from cancer and normal tissue. AMARETTO first models the effects of genomic/epigenomic data on disease-specific gene expression. AMARETTO's second step involves constructing a module network to connect the cancer drivers with their downstream targets. We observed that more gene expression variation can be explained when using disease-specific gene expression data. We applied AMARETTO to the ovarian cancer TCGA data and identified several cancer driver genes of interest, including novel genes in addition to known drivers of cancer. Finally, we showed that certain modules are predictive of good versus poor outcome, and the associated drivers were related to DNA repair pathways.
PMCID: PMC3915833  PMID: 24511378
gene expression; DNA methylation; copy number; ovarian cancer; data integration
17.  Identification of PBX1 Target Genes in Cancer Cells by Global Mapping of PBX1 Binding Sites 
PLoS ONE  2012;7(5):e36054.
PBX1 is a TALE homeodomain transcription factor involved in organogenesis and tumorigenesis. Although it has been shown that ovarian, breast, and melanoma cancer cells depend on PBX1 for cell growth and survival, the molecular mechanism of how PBX1 promotes tumorigenesis remains unclear. Here, we applied an integrated approach by overlapping PBX1 ChIP-chip targets with the PBX1-regulated transcriptome in ovarian cancer cells to identify genes whose transcription was directly regulated by PBX1. We further determined if PBX1 target genes identified in ovarian cancer cells were co-overexpressed with PBX1 in carcinoma tissues. By analyzing TCGA gene expression microarray datasets from ovarian serous carcinomas, we found co-upregulation of PBX1 and a significant number of its direct target genes. Among the PBX1 target genes, a homeodomain protein MEOX1 whose DNA binding motif was enriched in PBX1-immunoprecipitated DNA sequences was selected for functional analysis. We demonstrated that MEOX1 protein interacts with PBX1 protein and inhibition of MEOX1 yields a similar growth inhibitory phenotype as PBX1 suppression. Furthermore, ectopically expressed MEOX1 functionally rescued the PBX1-withdrawn effect, suggesting MEOX1 mediates the cellular growth signal of PBX1. These results demonstrate that MEOX1 is a critical target gene and cofactor of PBX1 in ovarian cancers.
PMCID: PMC3342315  PMID: 22567123
18.  An integrative characterization of recurrent molecular aberrations in glioblastoma genomes 
Nucleic Acids Research  2013;41(19):8803-8821.
Glioblastoma multiforme (GBM) is the most common and malignant primary brain tumor in adults. Decades of investigations and the recent effort of the Cancer Genome Atlas (TCGA) project have mapped many molecular alterations in GBM cells. Alterations on DNAs may dysregulate gene expressions and drive malignancy of tumors. It is thus important to uncover causal and statistical dependency between ‘effector’ molecular aberrations and ‘target’ gene expressions in GBMs. A rich collection of prior studies attempted to combine copy number variation (CNV) and mRNA expression data. However, systematic methods to integrate multiple types of cancer genomic data—gene mutations, single nucleotide polymorphisms, CNVs, DNA methylations, mRNA and microRNA expressions and clinical information—are relatively scarce. We proposed an algorithm to build ‘association modules’ linking effector molecular aberrations and target gene expressions and applied the module-finding algorithm to the integrated TCGA GBM data sets. The inferred association modules were validated by six tests using external information and datasets of central nervous system tumors: (i) indication of prognostic effects among patients; (ii) coherence of target gene expressions; (iii) retention of effector–target associations in external data sets; (iv) recurrence of effector molecular aberrations in GBM; (v) functional enrichment of target genes; and (vi) co-citations between effectors and targets. Modules associated with well-known molecular aberrations of GBM—such as chromosome 7 amplifications, chromosome 10 deletions, EGFR and NF1 mutations—passed the majority of the validation tests. Furthermore, several modules associated with less well-reported molecular aberrations—such as chromosome 11 CNVs, CD40, PLXNB1 and GSTM1 methylations, and mir-21 expressions—were also validated by external information. 
In particular, modules constituting trans-acting effects with chromosome 11 CNVs and cis-acting effects with chromosome 10 CNVs manifested strong negative and positive associations with survival times in brain tumors. By aligning the information of association modules with the established GBM subclasses based on transcription or methylation levels, we found each subclass possessed multiple concurrent molecular aberrations. Furthermore, the joint molecular characteristics derived from 16 association modules had prognostic power not explained away by the strong biomarker of CpG island methylator phenotypes. Functional and survival analyses indicated that immune/inflammatory responses and epithelial-mesenchymal transitions were among the most important determining processes of prognosis. Finally, we demonstrated that certain molecular aberrations uniquely recurred in GBM but were relatively rare in non-GBM glioma cells. These results justify the utility of an integrative analysis on cancer genomes and provide testable characterizations of driver aberration events in GBM.
PMCID: PMC3799430  PMID: 23907387
19.  Integrated Analysis of Gene Expression and Tumor Nuclear Image Profiles Associated with Chemotherapy Response in Serous Ovarian Carcinoma 
PLoS ONE  2012;7(5):e36383.
Small sample sizes used in previous studies result in a lack of overlap between the reported gene signatures for prediction of chemotherapy response. Although morphologic features, especially tumor nuclear morphology, are important for cancer grading, little research has been reported on quantitatively correlating cellular morphology with chemotherapy response, especially in a large data set. In this study, we have used a large population of patients to identify molecular and morphologic signatures associated with chemotherapy response in serous ovarian carcinoma.
Methodology/Principal Findings
A gene expression model that predicts response to chemotherapy is developed and validated using a large-scale data set consisting of 493 samples from The Cancer Genome Atlas (TCGA) and 244 samples from an Australian report. An identified 227-gene signature achieves an overall predictive accuracy of greater than 85% with a sensitivity of approximately 95% and specificity of approximately 70%. The gene signature significantly distinguishes between patients with unfavorable versus favorable prognosis, when applied to either an independent data set (P = 0.04) or an external validation set (P<0.0001). In parallel, we present the production of a tumor nuclear image profile generated from 253 sample slides by characterizing patients with nuclear features (such as size, elongation, and roundness) in incremental bins, and we identify a morphologic signature that demonstrates a strong association with chemotherapy response in serous ovarian carcinoma.
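For reference, the accuracy, sensitivity, and specificity figures quoted above are all derived from the same confusion counts. The following sketch shows the arithmetic on made-up labels; the function name and the data are hypothetical and do not reproduce the study's results.

```python
def classification_metrics(actual, predicted):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    actual, predicted -- sequences of 1 (responder) and 0 (non-responder)
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the actual labels")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Made-up labels for six hypothetical patients.
metrics = classification_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
```

A classifier can combine high sensitivity with lower specificity, as in the abstract, when it rarely misses responders but labels some non-responders as responders.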
A gene signature discovered on a large data set provides robustness in accurately predicting chemotherapy response in serous ovarian carcinoma. The combination of the molecular and morphologic signatures yields a new understanding of potential mechanisms involved in drug resistance.
PMCID: PMC3348145  PMID: 22590536
20.  Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data 
BMC Systems Biology  2013;7(Suppl 2):S4.
Understanding the molecular mechanisms underlying cancer is an important step for the effective diagnosis and treatment of cancer patients. With the huge volume of data from the large-scale cancer genomics projects, an open challenge is to distinguish driver mutations, pathways, and gene sets (or core modules) that contribute to cancer formation and progression from random passengers which accumulate in somatic cells but do not contribute to tumorigenesis. Due to mutational heterogeneity, current analyses are often restricted to known pathways and functional modules for enrichment of somatic mutations. Therefore, discovery of new pathways and functional modules is a pressing need.
In this study, we propose a novel method to identify Mutated Core Modules in Cancer (iMCMC) without any prior information other than cancer genomic data from patients with tumors. This is a network-based approach in which three kinds of data are integrated: somatic mutations, copy number variations (CNVs), and gene expressions. Firstly, the first two datasets are merged to obtain a mutation matrix, based on which a weighted mutation network is constructed where the vertex weight corresponds to gene coverage and the edge weight corresponds to the mutual exclusivity between gene pairs. Similarly, a weighted expression network is generated from the expression matrix where the vertex and edge weights correspond to the influence of a gene mutation on other genes and the Pearson correlation of gene mutation-correlated expressions, respectively. Then an integrative network is obtained by further combining these two networks, and the most coherent subnetworks are identified by using an optimization model. Finally, we obtained the core modules for tumors by filtering with significance and exclusivity tests. We applied iMCMC to the Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and ovarian carcinoma data, and identified several mutated core modules, some of which are involved in known pathways. Most of the implicated genes are oncogenes or tumor suppressors previously reported to be related to carcinogenesis. As a comparison, we also performed iMCMC on two of the three kinds of data, i.e., the datasets combining somatic mutations with CNVs and secondly the datasets combining somatic mutations with gene expressions. The results indicate that gene expressions or CNVs indeed provide extra useful information to the original data for the identification of core modules in cancer.
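The weighted mutation network described above can be sketched from a per-gene set of mutated samples. The formulas here are assumptions for illustration, since the abstract does not define them: coverage is taken as the fraction of samples in which a gene is mutated, and mutual exclusivity as the fraction of samples mutated in exactly one of two genes among samples mutated in either. The sample sets are toy data.

```python
from itertools import combinations


def coverage(mutated_samples, n_samples):
    """Assumed vertex weight: fraction of samples in which the gene is mutated."""
    return len(mutated_samples) / n_samples


def mutual_exclusivity(samples_a, samples_b):
    """Assumed edge weight: samples mutated in exactly one gene, divided by
    samples mutated in at least one of the two genes."""
    union = samples_a | samples_b
    if not union:
        return 0.0
    return len(samples_a ^ samples_b) / len(union)


def mutation_network(mutations, n_samples):
    """Build vertex and edge weight dictionaries from {gene: set_of_samples}."""
    vertices = {g: coverage(s, n_samples) for g, s in mutations.items()}
    edges = {
        (a, b): mutual_exclusivity(mutations[a], mutations[b])
        for a, b in combinations(sorted(mutations), 2)
    }
    return vertices, edges


# Toy mutation data over six hypothetical samples.
mutations = {
    "TP53": {"s1", "s2", "s3"},
    "EGFR": {"s4", "s5"},
    "PTEN": {"s3", "s4"},
}
vertices, edges = mutation_network(mutations, n_samples=6)
```

In this toy example EGFR and TP53 never co-occur, so their edge weight is 1.0, while PTEN and TP53 share one sample and score 0.75.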
This study demonstrates the utility of our iMCMC by integrating multiple data sources to identify mutated core modules in cancer. In addition to presenting a generally applicable methodology, our findings provide several candidate pathways or core modules recurrently perturbed in GBM or ovarian carcinoma for further studies.
PMCID: PMC3851989  PMID: 24565034
21.  Cancer genomic research at the crossroads: realizing the changing genetic landscape as intratumoral spatial and temporal heterogeneity becomes a confounding factor 
Cancer Cell International  2014;14(1):115.
The US National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) created the Cancer Genome Atlas (TCGA) Project in 2006. The TCGA’s goal was to sequence the genomes of 10,000 tumors to identify common genetic changes among different types of tumors for developing genetic-based treatments. TCGA offered great potential for cancer patients, but in reality has had little impact on clinical applications. Recent reports place the past TCGA approach of testing a small tumor mass at a single time-point at a crossroads. This crossroads presents us with the conundrum of whether we should sequence more tumors or obtain multiple biopsies from each individual tumor at different time points. Sequencing more tumors with the past TCGA approach of single time-point sampling can neither capture the heterogeneity between different parts of the same tumor nor catch the heterogeneity that occurs as a function of time, error rates, and random drift. Obtaining multiple biopsies from each individual tumor presents multiple logistical and financial challenges. Here, we review current literature and rethink the utility and application of the TCGA approach. We discuss that the TCGA-led catalogue may provide insights into studying the functional significance of oncogenic genes in reference to non-cancer genetic background. Different methods to enhance identifying cancer targets, such as single cell technology, real time imaging of cancer cells with a biological global positioning system, and cross-referencing big data sets, are offered as ways to address sampling discrepancies in the face of tumor heterogeneity. We predict that TCGA landmarks may prove far more useful for cancer prevention than for cancer diagnosis and treatment when considering the effect of non-cancer genes and the normal genetic background on tumor microenvironment.
Cancer prevention can be better realized once we understand how therapy affects the genetic makeup of cancer over time in a clinical setting. This may help create novel therapies for gene mutations that arise during a tumor’s evolution from the selection pressure of treatment.
PMCID: PMC4236490  PMID: 25411563
22.  SubPatCNV: approximate subspace pattern mining for mapping copy-number variations 
BMC Bioinformatics  2015;16(1):16.
Many DNA copy-number variations (CNVs) are known to lead to phenotypic variations and pathogenesis. While CNVs are often only common in a small number of samples in the studied population or patient cohort, previous work has not focused on customized identification of CNV regions that only exhibit in subsets of samples with advanced data mining techniques to reliably answer questions such as “Which are all the chromosomal fragments showing nearly identical deletions or insertions in more than 30% of the individuals?”.
We introduce a tool for mining CNV subspace patterns, namely SubPatCNV, which is capable of identifying all aberrant CNV regions specific to arbitrary sample subsets larger than a support threshold. By design, SubPatCNV implements a variation of an approximate association pattern mining algorithm under a spatial constraint on the positional CNV probe features. In a benchmark test, SubPatCNV was applied to identify population-specific germline CNVs from four populations of HapMap samples. In experiments on the TCGA ovarian cancer dataset, SubPatCNV discovered many large aberrant CNV events in patient subgroups, and reported regions enriched with cancer-relevant genes. In both HapMap data and TCGA data, it was observed that SubPatCNV employs approximate pattern mining to more effectively identify CNV subspace patterns that are consistent within a subgroup from high-density array data.
SubPatCNV is available as a scalable open-source software tool that provides the flexibility of identifying CNV regions specific to sample subgroups of different sizes from high-density CNV array data.
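The support-threshold idea described above can be illustrated with a small sketch. This is not the SubPatCNV algorithm: it uses exact rather than approximate matching, and a greedy left-to-right scan rather than exhaustive pattern mining, so it can miss overlapping regions. The function name, the -1/0/+1 call encoding, and the toy matrix are all assumptions made for illustration.

```python
from typing import List, Tuple

Region = Tuple[int, int, int, List[int]]  # (first probe, last probe, state, carrier samples)


def frequent_cnv_regions(calls: List[List[int]], min_support: float,
                         min_len: int = 2) -> List[Region]:
    """Greedy scan for contiguous probe runs shared by a large enough sample subset.

    calls[s][p] is a hypothetical discretized call for sample s at probe p:
    -1 (loss), 0 (neutral), +1 (gain). A run is reported when more than
    min_support * n_samples samples carry the same state at every probe in it.
    """
    n_samples = len(calls)
    n_probes = len(calls[0]) if calls else 0
    needed = min_support * n_samples
    regions: List[Region] = []
    for state in (-1, 1):
        p = 0
        while p < n_probes:
            carriers = {s for s in range(n_samples) if calls[s][p] == state}
            if len(carriers) <= needed:
                p += 1
                continue
            end = p
            # Extend to the right while enough carriers still share the state.
            while end + 1 < n_probes:
                still = {s for s in carriers if calls[s][end + 1] == state}
                if len(still) <= needed:
                    break
                carriers = still
                end += 1
            if end - p + 1 >= min_len:
                regions.append((p, end, state, sorted(carriers)))
            p = end + 1
    return regions


# Toy matrix: 4 samples x 5 probes (made-up values).
toy_calls = [
    [-1, -1, -1, 0, 1],
    [-1, -1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, -1, -1, 0, 0],
]
print(frequent_cnv_regions(toy_calls, min_support=0.3))
```

With a 30% threshold on four samples, a region needs at least two carriers. Because the scan is greedy, the loss shared by samples 0 and 3 at probes 1-2 is not reported; an exhaustive miner would be needed to enumerate all such subsets.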
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0426-7) contains supplementary material, which is available to authorized users.
PMCID: PMC4305219  PMID: 25591662
DNA copy-number variations; Approximate pattern mining; HapMap; Cancer
23.  Multi-Gene Expression Predictors of Single Drug Responses to Adjuvant Chemotherapy in Ovarian Carcinoma: Predicting Platinum Resistance 
PLoS ONE  2012;7(2):e30550.
Despite advances in radical surgery and chemotherapy delivery, ovarian cancer is the most lethal gynecologic malignancy. Standard therapy includes treatment with platinum-based combination chemotherapies, yet there is no biomarker model to predict patients' responses to these agents. Here we developed and independently tested multi-gene molecular predictors for forecasting patients' responses to individual drugs on a cohort of 55 ovarian cancer patients. To independently validate these molecular predictors, we performed microarray profiling on FFPE tumor samples of 55 ovarian cancer patients (UVA-55) treated with platinum-based adjuvant chemotherapy. Genome-wide chemosensitivity biomarkers were initially discovered from the in vitro drug activities and genomic expression data for carboplatin and paclitaxel, respectively. Multivariate predictors were trained with the cell line data and then evaluated with a historical patient cohort. For the UVA-55 cohort, the carboplatin, taxol, and combination predictors significantly stratified responder patients and non-responder patients (p = 0.019, 0.04, 0.014) with sensitivity = 91%, 96%, 93% and NPV = 57%, 67%, 67% in pathologic clinical response. The combination predictor also demonstrated a significant survival difference between predicted responders and non-responders with a median survival of 55.4 months vs. 32.1 months. Thus, COXEN single- and combination-drug predictors successfully stratified platinum resistance and taxane response in an independent cohort of ovarian cancer patients based on their FFPE tumor samples.
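For readers unfamiliar with the two metrics reported above, the following sketch shows how sensitivity and negative predictive value (NPV) are computed from a confusion matrix. The counts are invented for illustration and are not taken from the UVA-55 cohort; the function names are likewise hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual responders that the predictor calls responders."""
    return true_pos / (true_pos + false_neg)


def negative_predictive_value(true_neg: int, false_neg: int) -> float:
    """Fraction of predicted non-responders that truly did not respond."""
    return true_neg / (true_neg + false_neg)


# Made-up counts: 10 actual responders (9 caught, 1 missed) and
# 4 predicted non-responders (3 correct, 1 missed responder).
print(sensitivity(true_pos=9, false_neg=1))
print(negative_predictive_value(true_neg=3, false_neg=1))
```

A predictor can combine high sensitivity with a modest NPV when few patients are predicted to be non-responders, since a small number of missed responders then makes up a large share of that group.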
PMCID: PMC3277593  PMID: 22348014
24.  Network modeling of the transcriptional effects of copy number aberrations in glioblastoma 
DNA copy number aberrations (CNAs) are a characteristic feature of cancer genomes. In this work, Rebecka Jörnsten, Sven Nelander and colleagues combine network modeling and experimental methods to analyze the systems-level effects of CNAs in glioblastoma.
We introduce a modeling approach termed EPoC (Endogenous Perturbation analysis of Cancer), enabling the construction of global, gene-level models that causally connect gene copy number with expression in glioblastoma. On the basis of the resulting model, we predict genes that are likely to be disease-driving and validate selected predictions experimentally. We also demonstrate that further analysis of the network model by sparse singular value decomposition allows stratification of patients with glioblastoma into short-term and long-term survivors, introducing decomposed network models as a useful principle for biomarker discovery. Finally, in systematic comparisons, we demonstrate that EPoC is computationally efficient and yields more consistent results than mRNA-only methods, standard eQTL methods, and two recent multivariate methods for genotype–mRNA coupling.
Gains and losses of chromosomal material (DNA copy number aberrations; CNAs) are a characteristic feature of cancer genomes. At the level of a single locus, it is well known that increased copy number (gene amplification) typically leads to increased gene expression, whereas decreased copy number (gene deletion) leads to decreased gene expression (Pollack et al, 2002; Lee et al, 2008; Nilsson et al, 2008). However, CNAs also affect the expression of genes located outside the amplified/deleted region itself via indirect mechanisms. To fully understand the action of CNAs, it is therefore necessary to analyze their action in a network context. Toward this goal, improved computational approaches will be important, if not essential.
To determine the global effects on transcription of CNAs in the brain tumor glioblastoma, we develop EPoC (Endogenous Perturbation analysis of Cancer), a computational technique capable of inferring sparse, causal network models by combining genome-wide, paired CNA- and mRNA-level data. EPoC aims to detect disease-driving copy number aberrations and their effect on target mRNA expression, and stratify patients into long-term and short-term survivors. Technically, EPoC relates CNA perturbations to mRNA responses by matrix equations, derived from a steady-state approximation of the transcriptional network. Patient prognostic scores are obtained from singular value decompositions of the network matrix. The models are constructed by solving a large-scale, regularized regression problem.
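The pipeline outlined above (a linear CNA-to-mRNA model fitted by regularized regression, followed by a singular value decomposition of the network matrix) can be sketched as follows. This is a loose illustration under stated assumptions, not the EPoC implementation: the linear form Y ≈ U G, the use of plain ridge regression in place of the sparse regression described in the abstract, the scoring rule, and all names and synthetic data are assumptions.

```python
import numpy as np


def fit_network(cna: np.ndarray, mrna: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Ridge estimate of G in mrna ≈ cna @ G (assumed linear steady-state form).

    cna and mrna are patients x genes matrices; G is genes x genes.
    Ridge is a stand-in here; it does not produce a sparse network.
    """
    n_genes = cna.shape[1]
    gram = cna.T @ cna + lam * np.eye(n_genes)
    return np.linalg.solve(gram, cna.T @ mrna)


def patient_scores(network: np.ndarray, cna: np.ndarray, component: int = 0) -> np.ndarray:
    """Project each patient's CNA profile onto one left singular vector of G.

    The sign of a singular vector is arbitrary, so only the relative
    ordering of scores (up to a global sign flip) is meaningful.
    """
    left, _singular_values, _right_t = np.linalg.svd(network, full_matrices=False)
    return cna @ left[:, component]


# Synthetic example: 50 patients, 4 genes, noise-free responses.
rng = np.random.default_rng(0)
U = rng.normal(size=(50, 4))
G_true = rng.normal(size=(4, 4))
Y = U @ G_true

G_hat = fit_network(U, Y, lam=1e-8)
scores = patient_scores(G_hat, U)
print(np.allclose(G_hat, G_true, atol=1e-5), scores.shape)
```

In a real analysis the number of genes far exceeds the number of patients, which is why a sparsity-inducing penalty, rather than the ridge penalty used here for its closed-form solution, is the natural choice.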
We apply EPoC to glioblastoma data from The Cancer Genome Atlas (TCGA) consortium (186 patients). The identified CNA-driven network comprises 10 672 genes, and contains a number of copy number-altered genes that control multiple downstream genes. Highly connected hub genes include well-known oncogenes and tumor suppressor genes that are frequently deleted or amplified in glioblastoma, including EGFR, PDGFRA, CDKN2A and CDKN2B, confirming a clear association between these aberrations and transcriptional variability of these brain tumors. In addition, we identify a number of hub genes that have previously not been associated with glioblastoma, including interferon alpha 1 (IFNA1), myeloid/lymphoid or mixed-lineage leukemia translocated to 10 (MLLT10, a well-known leukemia gene), glutamate decarboxylase 2 (GAD2), a postulated glutamate receptor (GPR158) and Necdin (NDN). Furthermore, we demonstrate that the network model contains useful information on downstream target genes (including stem cell regulators), and possible drug targets.
We proceed to explore the validity of a small network region experimentally. Introducing experimental perturbations of NDN and other targets in four glioblastoma cell lines (T98G, U-87MG, U-343MG and U-373MG), we confirm several predicted mechanisms. We also demonstrate that the TCGA glioblastoma patients can be stratified into long-term and short-term survivors, using our proposed prognostic scores derived from a singular value decomposition of the network model. Finally, we compare EPoC to existing methods for mRNA network analysis and expression quantitative trait locus (eQTL) methods, and demonstrate that EPoC produces more consistent models between technically independent glioblastoma data sets, and that the EPoC models exhibit better overlap with known protein–protein interaction networks and pathway maps.
In summary, we conclude that large-scale integrative modeling reveals mechanistically and prognostically informative networks in human glioblastoma. Our approach operates at the gene level and our data support that individual hub genes can be identified in practice. Very large aberrations, however, cannot be fully resolved by the current modeling strategy.
DNA copy number aberrations (CNAs) are a hallmark of cancer genomes. However, little is known about how such changes affect global gene expression. We develop a modeling framework, EPoC (Endogenous Perturbation analysis of Cancer), to (1) detect disease-driving CNAs and their effect on target mRNA expression, and to (2) stratify cancer patients into long- and short-term survivors. Our method constructs causal network models of gene expression by combining genome-wide DNA- and RNA-level data. Prognostic scores are obtained from a singular value decomposition of the networks. By applying EPoC to glioblastoma data from The Cancer Genome Atlas consortium, we demonstrate that the resulting network models contain known disease-relevant hub genes, reveal interesting candidate hubs, and uncover predictors of patient survival. Targeted validations in four glioblastoma cell lines support selected predictions, and implicate the p53-interacting protein Necdin in suppressing glioblastoma cell growth. We conclude that large-scale network modeling of the effects of CNAs on gene expression may provide insights into the biology of human cancer. Free software in MATLAB and R is provided.
PMCID: PMC3101951  PMID: 21525872
cancer biology; cancer genomics; glioblastoma
25.  Activity of the multikinase inhibitor dasatinib against ovarian cancer cells 
British Journal of Cancer  2009;101(10):1699-1708.
Here, we explore the therapeutic potential of dasatinib, a small-molecule inhibitor that targets multiple cytosolic and membrane-bound tyrosine kinases, including members of the Src kinase family, EphA2, and focal adhesion kinase for the treatment of ovarian cancer.
We examined the effects of dasatinib on proliferation, invasion, apoptosis, cell-cycle arrest, and kinase activity using a panel of 34 established human ovarian cancer cell lines. Molecular markers for response prediction were studied using gene expression profiling. Multiple drug effect/combination index (CI) isobologram analysis was used to study the interactions with chemotherapeutic drugs.
Concentration-dependent anti-proliferative effects of dasatinib were seen in all ovarian cancer cell lines tested, but varied significantly between individual cell lines with up to a 3 log-fold difference in the IC50 values (IC50 range: 0.001–11.3 μmol l−1). Dasatinib significantly inhibited invasion and induced apoptosis, but caused less cell-cycle arrest. At a wide range of clinically achievable drug concentrations, additive and synergistic interactions were observed for dasatinib plus carboplatin (mean CI values, range: 0.73–1.11) or paclitaxel (mean CI values, range: 0.76–1.05). In this study, 24 out of 34 (71%) representative ovarian cancer cell lines were highly sensitive to dasatinib, compared with only 8 out of 39 (21%) representative breast cancer cell lines previously reported. Cell lines with high expression of Yes, Lyn, EphA2, caveolin-1 and 2, moesin, annexin-1, and uPA were particularly sensitive to dasatinib.
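The combination index values quoted above can be read with the help of a short sketch. The abstract does not state the formula used; the code assumes the commonly used Chou-Talalay form for mutually exclusive drugs, CI = d1/D1 + d2/D2, where d1 and d2 are the doses used in combination and D1 and D2 are the doses of each drug alone that produce the same effect. The doses and the tolerance band around 1 are hypothetical.

```python
def combination_index(d1: float, d2: float, D1: float, D2: float) -> float:
    """Chou-Talalay CI (mutually exclusive form, assumed here).

    d1, d2: doses of each drug in the combination reaching a given effect.
    D1, D2: doses of each drug alone reaching the same effect.
    """
    return d1 / D1 + d2 / D2


def classify_interaction(ci: float, tolerance: float = 0.1) -> str:
    """Label a CI value; the width of the additive band is an illustrative choice."""
    if ci < 1.0 - tolerance:
        return "synergistic"
    if ci > 1.0 + tolerance:
        return "antagonistic"
    return "additive"


# Hypothetical doses in arbitrary units, not taken from the study.
ci = combination_index(d1=0.5, d2=2.0, D1=2.0, D2=4.0)
print(ci, classify_interaction(ci))
```

Under this convention, values below 1 indicate that the combination reaches the effect with less drug than expected from the single agents, which is how the reported ranges straddling 1 are interpreted as additive to synergistic.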
These data provide a clear biological rationale to test dasatinib as a single agent or in combination with chemotherapy in patients with ovarian cancer.
PMCID: PMC2778533  PMID: 19861960
Src; Eph2A; FAK; uPA; dasatinib; ovarian cancer
