Search tips
Search criteria

Results 1-3 (3)

Clipboard (0)
Year of Publication
Document Types
1.  Pathway-based classification of cancer subtypes 
Biology Direct  2012;7:21.
Molecular markers based on gene expression profiles have been used in experimental and clinical settings to distinguish cancerous tumors in stage, grade, survival time, metastasis, and drug sensitivity. However, most significant gene markers are unstable (not reproducible) among data sets. We introduce a standardized method for representing cancer markers as 2-level hierarchical feature vectors, with a basic gene level as well as a second level of (more stable) pathway markers, for the purpose of discriminating cancer subtypes. This extends standard gene expression arrays with new pathway-level activation features obtained directly from off-the-shelf gene set enrichment algorithms such as GSEA. Such so-called pathway-based expression arrays are significantly more reproducible across datasets. Such reproducibility will be important for clinical usefulness of genomic markers, and augment currently accepted cancer classification protocols.
The present method produced more stable (reproducible) pathway-based markers for discriminating breast cancer metastasis and ovarian cancer survival time. Between two datasets for breast cancer metastasis, the intersection of standard significant gene biomarkers totaled 7.47% of selected genes, compared to 17.65% using pathway-based markers; the corresponding percentages for ovarian cancer datasets were 20.65% and 33.33% respectively. Three pathways, consisting of Type_1_diabetes mellitus, Cytokine-cytokine_receptor_interaction and Hedgehog_signaling (all previously implicated in cancer), are enriched in both the ovarian long survival and breast non-metastasis groups. In addition, integrating pathway and gene information, we identified five (ID4, ANXA4, CXCL9, MYLK, FBXL7) and six (SQLE, E2F1, PTTG1, TSTA3, BUB1B, MAD2L1) known cancer genes significant for ovarian and breast cancer respectively.
Standardizing the analysis of genomic data in the process of cancer staging, classification and analysis is important as it has implications for both pre-clinical as well as clinical studies. The paradigm of diagnosis and prediction using pathway-based biomarkers as features can be an important part of the process of biomarker-based cancer analysis, and the resulting canonical (clinically reproducible) biomarkers can be important in standardizing genomic data. We expect that identification of such canonical biomarkers will improve clinical utility of high-throughput datasets for diagnostic and prognostic applications.
This article was reviewed by John McDonald (nominated by I. King Jordon), Eugene Koonin, Nathan Bowen (nominated by I. King Jordon), and Ekaterina Kotelnikova (nominated by Mikhail Gelfand).
PMCID: PMC3485163  PMID: 22759382
Classification; Pathway; Cancer subtypes; Ovarian cancer; Breast cancer; Gene set enrichment analysis
2.  In silico regulatory analysis for exploring human disease progression 
Biology Direct  2008;3:24.
An important goal in bioinformatics is to unravel the network of transcription factors (TFs) and their targets. This is important in the human genome, where many TFs are involved in disease progression. Here, classification methods are applied to identify new targets for 152 transcriptional regulators using publicly-available targets as training examples. Three types of sequence information are used: composition, conservation, and overrepresentation.
Starting with 8817 TF-target interactions we predict an additional 9333 targets for 152 TFs. Randomized classifiers make few predictions (~2/18660) indicating that our predictions for many TFs are significantly enriched for true targets. An enrichment score is calculated and used to filter new predictions.
Two case-studies for the TFs OCT4 and WT1 illustrate the usefulness of our predictions:
• Many predicted OCT4 targets fall into the Wnt-pathway. This is consistent with known biology as OCT4 is developmentally related and Wnt pathway plays a role in early development.
• Beginning with 15 known targets, 354 predictions are made for WT1. WT1 has a role in formation of Wilms' tumor. Chromosomal regions previously implicated in Wilms' tumor by cytological evidence are statistically enriched in predicted WT1 targets. These findings may shed light on Wilms' tumor progression, suggesting that the tumor progresses either by loss of WT1 or by loss of regions harbouring its targets.
• Targets of WT1 are statistically enriched for cancer related functions including metastasis and apoptosis. Among new targets are BAX and PDE4B, which may help mediate the established anti-apoptotic effects of WT1.
• Of the thirteen TFs found which co-regulate genes with WT1 (p ≤ 0.02), 8 have been previously implicated in cancer. The regulatory-network for WT1 targets in genomic regions relevant to Wilms' tumor is provided.
We have assembled a set of features for the targets of human TFs and used them to develop classifiers for the determination of new regulatory targets. Many predicted targets are consistent with the known biology of their regulators, and new targets for the Wilms' tumor regulator, WT1, are proposed. We speculate that Wilms' tumor development is mediated by chromosomal rearrangements in the location of WT1 targets.
This article was reviewed by Trey Ideker, Vladimir A. Kuznetsov(nominated by Frank Eisenhaber), and Tzachi Pilpel.
PMCID: PMC2464594  PMID: 18564415
3.  Classifying transcription factor targets and discovering relevant biological features 
Biology Direct  2008;3:22.
An important goal in post-genomic research is discovering the network of interactions between transcription factors (TFs) and the genes they regulate. We have previously reported the development of a supervised-learning approach to TF target identification, and used it to predict targets of 104 transcription factors in yeast. We now include a new sequence conservation measure, expand our predictions to include 59 new TFs, introduce a web-server, and implement an improved ranking method to reveal the biological features contributing to regulation. The classifiers combine 8 genomic datasets covering a broad range of measurements including sequence conservation, sequence overrepresentation, gene expression, and DNA structural properties.
Principal Findings
(1) Application of the method yields an amplification of information about yeast regulators. The ratio of total targets to previously known targets is greater than 2 for 11 TFs, with several having larger gains: Ash1(4), Ino2(2.6), Yaf1(2.4), and Yap6(2.4).
(2) Many predicted targets for TFs match well with the known biology of their regulators. As a case study we discuss the regulator Swi6, presenting evidence that it may be important in the DNA damage response, and that the previously uncharacterized gene YMR279C plays a role in DNA damage response and perhaps in cell-cycle progression.
(3) A procedure based on recursive-feature-elimination is able to uncover from the large initial data sets those features that best distinguish targets for any TF, providing clues relevant to its biology. An analysis of Swi6 suggests a possible role in lipid metabolism, and more specifically in metabolism of ceramide, a bioactive lipid currently being investigated for anti-cancer properties.
(4) An analysis of global network properties highlights the transcriptional network hubs; the factors which control the most genes and the genes which are bound by the largest set of regulators. Cell-cycle and growth related regulators dominate the former; genes involved in carbon metabolism and energy generation dominate the latter.
Postprocessing of regulatory-classifier results can provide high quality predictions, and feature ranking strategies can deliver insight into the regulatory functions of TFs. Predictions are available at an online web-server, including the full transcriptional network, which can be analyzed using VisAnt network analysis suite.
This article was reviewed by Igor Jouline, Todd Mockler(nominated by Valerian Dolja), and Sandor Pongor.
PMCID: PMC2441612  PMID: 18513408

Results 1-3 (3)