
Results 1-8 (8)

1.  CSNK1E/CTNNB1 Are Synthetic Lethal To TP53 in Colorectal Cancer and Are Markers for Prognosis 
Neoplasia (New York, N.Y.)  2014;16(5):441-450.
Two genes are called synthetic lethal (SL) if their simultaneous mutations lead to cell death, but each individual mutation does not. Targeting SL partners of mutated cancer genes can kill cancer cells specifically, but leave normal cells intact. We present an integrated approach to uncovering SL pairs in colorectal cancer (CRC). Screening verified SL pairs using microarray gene expression data of cancerous and normal tissues, we first identified potential functionally relevant (simultaneously differentially expressed) gene pairs. From the top-ranked pairs, about 20 genes were chosen for immunohistochemistry (IHC) staining in 171 CRC patients. To find novel SL pairs, all 169 combined pairs from the individual IHC were synergistically correlated to five clinicopathological features, e.g. overall survival. Of the 11 predicted SL pairs, MSH2-POLB and CSNK1E-MYC were consistent with the literature, and we validated the top two pairs, CSNK1E-TP53 and CTNNB1-TP53, using RNAi knockdown and small molecule inhibitors of CSNK1E in isogenic HCT-116 and RKO cells. Furthermore, synthetic lethality of CSNK1E and TP53 was verified in a mouse model. Importantly, multivariate analysis revealed that CSNK1E-P53, CTNNB1-P53, MSH2-RB1, and BRCA1-WNT5A were prognostic markers independent of stage, with CSNK1E-P53 applicable to early-stage disease and the remaining three throughout all stages. Our findings suggest that CSNK1E is a promising target for TP53-mutant CRC patients, who constitute approximately 40% to 50% of patients, while to date safety regarding inhibition of TP53 is controversial. Thus the integrated approach is useful in finding novel SL pairs for cancer therapeutics, and it is readily accessible and applicable to other cancers.
PMCID: PMC4198690  PMID: 24947187
CRC, colorectal cancer; IHC, immunohistochemistry; FDR, false discovery rate; TD, tumor-dependent; SL, synthetic lethal
2.  Inferring Genetic Interactions via a Data-Driven Second Order Model 
Genetic/transcriptional regulatory interactions have been shown to predict partial components of signaling pathways, which have been recognized as vital to complex human diseases. Both activator (A) and repressor (R) are known to coregulate their common target gene (T). Xu et al. (2002) proposed to model this coregulation by a fixed second order response surface (called the RS algorithm), in which T is a function of A, R, and AR. Unfortunately, the RS algorithm did not result in a sufficient number of genetic interactions (GIs) when it was applied to a group of 51 yeast genes in a pilot study. Thus, we propose a data-driven second order model (DDSOM), an approximation to the non-linear transcriptional interactions, to infer genetic and transcriptional regulatory interactions. For each triplet of genes of interest (A, R, and T), we regress the expression of T at time t + 1 on the expression of A, R, and AR at time t. Next, these well-fitted regression models (viewed as points in R^3) are collected, and the center of these points is used to identify triples of genes having the A-R-T relationship or GIs. The DDSOM and RS algorithms are first compared on inferring transcriptional compensation interactions of a group of yeast genes in DNA synthesis and DNA repair using microarray gene expression data; the DDSOM algorithm results in a higher modified true positive rate (about 75%) than that of the RS algorithm, checked against quantitative RT-polymerase chain reaction results. These validated GIs are reported, among which some coincide with certain interactions in DNA repair and genome instability pathways in yeast. This suggests that the DDSOM algorithm has the potential to predict pathway components. Further, both algorithms are applied to predict transcriptional regulatory interactions of 63 yeast genes. Checked against the known transcriptional regulatory interactions queried from TRANSFAC, the proposed DDSOM algorithm also performs better than the RS algorithm.
PMCID: PMC3342528  PMID: 22563331
gene expression; genetic interaction; microarray data; pathway; regression; transcriptional regulatory interaction
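The regression step described in this abstract (T at time t + 1 regressed on A, R, and AR at time t) can be illustrated with a minimal least-squares sketch. This is not the authors' DDSOM code: the intercept term, the function names `solve_linear` and `fit_second_order`, and the use of plain normal equations are all assumptions made for illustration, and the later step of collecting well-fitted models and locating their center is not shown.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_second_order(a, r, t):
    """Regress T(t+1) on [1, A(t), R(t), A(t)*R(t)].

    Returns [intercept, coef_A, coef_R, coef_AR]. The intercept is an
    assumption of this sketch; the three remaining coefficients are the
    kind of point in R^3 the abstract refers to.
    """
    rows = [[1.0, a[i], r[i], a[i] * r[i]] for i in range(len(t) - 1)]
    y = t[1:]
    k = 4
    xtx = [[sum(row[p] * row[q] for row in rows) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(rows, y)) for p in range(k)]
    return solve_linear(xtx, xty)
```

Calling `fit_second_order(a, r, t)` with three equal-length expression time series returns the four fitted coefficients; a negative coefficient on R together with a positive one on A would be consistent with an activator-repressor-target reading, although how such coefficients are thresholded is not stated in the abstract.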
3.  H2B ubiquitylation is part of chromatin architecture that marks exon-intron structure in budding yeast 
BMC Genomics  2011;12:627.
The packaging of DNA into chromatin regulates transcription from initiation through 3' end processing. One aspect of transcription in which chromatin plays a poorly understood role is the co-transcriptional splicing of pre-mRNA.
Here we provide evidence that H2B monoubiquitylation (H2BK123ub1) marks introns in Saccharomyces cerevisiae. A genome-wide map of H2BK123ub1 in this organism reveals that this modification is enriched in coding regions and that its levels peak at the transcribed regions of two characteristic subgroups of genes. First, long genes are more likely to have higher levels of H2BK123ub1, correlating with the postulated role of this modification in preventing cryptic transcription initiation in ORFs. Second, genes that are highly transcribed also have high levels of H2BK123ub1, including the ribosomal protein genes, which comprise the majority of intron-containing genes in yeast. H2BK123ub1 is also a feature of introns in the yeast genome, and the disruption of this modification alters the intragenic distribution of H3 trimethylation on lysine 36 (H3K36me3), which functionally correlates with alternative RNA splicing in humans. In addition, the deletion of genes encoding the U2 snRNP subunits, Lea1 or Msl1, in combination with an htb-K123R mutation, leads to synthetic lethality.
These data suggest that H2BK123ub1 facilitates cross talk between chromatin and pre-mRNA splicing by modulating the distribution of intronic and exonic histone modifications.
PMCID: PMC3274495  PMID: 22188810
4.  Modeling and comparing the organization of circular genomes 
Bioinformatics  2011;27(7):912-918.
Motivation: Most prokaryotes, which comprise bacteria and archaea, have a genome consisting of a single circular chromosome (called a circular genome). Orthologous genes (abbreviated as orthologs) are genes directly evolved from an ancestor gene, and can be traced through different species in evolution. Shared orthologs between bacterial genomes have been used to measure their genome evolution. Here, organization of circular genomes is analyzed via distributions of shared orthologs between genomes. However, these distributions are often asymmetric and bimodal; to date, there is no joint distribution to model such data. This motivated us to develop a family of bivariate distributions with generalized von Mises marginals (BGVM) and its statistical inference.
Results: A new measure based on circular grade correlation and the fraction of shared orthologs is proposed for association between circular genomes, and a visualization tool is developed to depict genome structure similarity. The proposed procedures are applied to eight pairs of prokaryotes separated from domain down to species, and 13 mycoplasma bacteria that are mammalian pathogens belonging to the same genus. We close with remarks on further applications to many features of genomic organization, e.g. shared transcription factor binding sites, between any pair of circular genomes. Thus, the proposed procedures may be applied to identifying conserved chromosome backbones, among others, for genome construction in synthetic biology.
Availability: All codes of the BGVM procedures and 1000+ prokaryotic genomes are available at∼gshieh/bgvm.htm.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3065686  PMID: 21278186
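To make the idea of comparing ortholog positions on two circular genomes concrete, the sketch below maps positions to angles and computes a circular correlation. It uses the Jammalamadaka-SenGupta circular correlation coefficient as a stand-in; it is not the circular grade correlation or the BGVM model proposed in the paper, and the function names and the position-to-angle mapping are assumptions for illustration.

```python
import math


def to_angles(positions, genome_length):
    """Map base-pair positions on a circular genome to angles in [0, 2*pi)."""
    return [2.0 * math.pi * p / genome_length for p in positions]


def circular_mean(angles):
    """Mean direction of a set of angles."""
    return math.atan2(sum(math.sin(a) for a in angles),
                      sum(math.cos(a) for a in angles))


def circular_correlation(alpha, beta):
    """Jammalamadaka-SenGupta circular correlation between paired angles.

    alpha[i] and beta[i] are the angular positions of the same ortholog
    in genome 1 and genome 2 (pairing is assumed to be given).
    """
    mean_a, mean_b = circular_mean(alpha), circular_mean(beta)
    sa = [math.sin(a - mean_a) for a in alpha]
    sb = [math.sin(b - mean_b) for b in beta]
    numerator = sum(x * y for x, y in zip(sa, sb))
    denominator = math.sqrt(sum(x * x for x in sa) * sum(y * y for y in sb))
    return numerator / denominator
```

Because the coefficient depends only on deviations from each genome's mean direction, rotating one genome's coordinate origin leaves it unchanged, which is a convenient property when the origin of a circular chromosome is arbitrary.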
5.  Inferring genetic interactions via a nonlinear model and an optimization algorithm 
BMC Systems Biology  2010;4:16.
Biochemical pathways are gradually becoming recognized as central to complex human diseases and recently genetic/transcriptional interactions have been shown to be able to predict partial pathways. With the abundant information made available by microarray gene expression data (MGED), nonlinear modeling of these interactions is now feasible. Two of the latest advances in nonlinear modeling used sigmoid models to depict transcriptional interaction of a transcription factor (TF) for a target gene, but do not model cooperative or competitive interactions of several TFs for a target.
An S-shape model and an optimization algorithm (GASA) were developed to infer genetic interactions/transcriptional regulation of several genes simultaneously using MGED. GASA consists of a genetic algorithm (GA) and a simulated annealing (SA) algorithm, which is enhanced by a steepest gradient descent algorithm to avoid being trapped in a local minimum. Using simulated data with various degrees of noise, we studied how GASA with two model selection criteria and two search spaces performed. Furthermore, GASA was shown to outperform network component analysis, the time series network inference algorithm (TSNI), GA with regular GA (GAGA) and GA with regular SA. Two applications are demonstrated. First, GASA was applied to infer a subnetwork of human T-cell apoptosis. Several of the predicted interactions are supported by the literature. Second, GASA was applied to infer the transcription factors of 34 cell cycle regulated targets in S. cerevisiae, and GASA performed better than one of the latest advances in nonlinear modeling, GAGA and TSNI. Moreover, GASA is able to predict multiple transcription factors for certain targets, and these results coincide with experimentally confirmed data in YEASTRACT.
GASA is shown to infer both genetic interactions and transcriptional regulatory interactions well. In particular, GASA seems able to characterize the nonlinear mechanism of transcriptional regulatory interactions (TIs) in yeast, and may be applied to infer TIs in other organisms. The predicted genetic interactions of a subnetwork of human T-cell apoptosis coincide with existing partial pathways, suggesting the potential of GASA for inferring biochemical pathways.
PMCID: PMC2848194  PMID: 20184777
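As a rough illustration of fitting an S-shaped (sigmoid) response by simulated annealing, the sketch below fits a two-parameter sigmoid to one regulator-target series. It shows plain simulated annealing only: the genetic algorithm and steepest-descent components of GASA are omitted, and the model form, step size, cooling schedule, and all names are assumptions rather than details taken from the paper.

```python
import math
import random


def sigmoid(x, w, b):
    """S-shaped response of a target to regulator level x (hypothetical form)."""
    z = max(-60.0, min(60.0, w * x + b))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-z))


def cost(params, xs, ys):
    """Sum of squared errors between the sigmoid prediction and the data."""
    w, b = params
    return sum((sigmoid(x, w, b) - y) ** 2 for x, y in zip(xs, ys))


def anneal(xs, ys, start=(0.0, 0.0), steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over (w, b); returns the best parameters and cost seen."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start, xs, ys)
    best, best_cost = current, current_cost
    for i in range(steps):
        temp = t0 * cooling ** i  # geometric cooling schedule
        candidate = (current[0] + rng.gauss(0, 0.3),
                     current[1] + rng.gauss(0, 0.3))
        cand_cost = cost(candidate, xs, ys)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost
```

Accepting occasional uphill moves at high temperature is what lets the search leave a poor local minimum; a hybrid such as the one the abstract describes would additionally maintain a population and refine candidates with gradient steps.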
6.  WebPARE: web-computing for inferring genetic or transcriptional interactions 
Bioinformatics  2009;26(4):582-584.
Summary: Inferring genetic or transcriptional interactions, when done successfully, may provide insights into biological processes or biochemical pathways of interest. Unfortunately, most computational algorithms require a certain level of programming expertise. To provide a simple web interface for users to infer interactions from time course gene expression data, we present WebPARE, which is based on the pattern recognition algorithm (PARE). For expression data, in which each type of interaction (e.g. activator target) and the corresponding paired gene expression pattern are significantly associated, PARE uses a non-linear score to classify gene pairs of interest into a few subclasses of various time lags. In each subclass, PARE learns the parameters in the decision score using known interactions from biological experiments or published literature. Subsequently, the trained algorithm predicts interactions of a similar nature. Previously, PARE was shown to infer two sets of interactions in yeast successfully. Moreover, several predicted genetic interactions coincided with existing pathways; this indicates the potential of PARE in predicting partial pathway components. Given a list of gene pairs or genes of interest and expression data, WebPARE invokes PARE and outputs predicted interactions and their networks in directed graphs.
Availability: A web-computing service WebPARE is publicly available at:
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2820674  PMID: 20007742
7.  Uncovering transcriptional interactions via an adaptive fuzzy logic approach 
BMC Bioinformatics  2009;10:400.
To date, only a limited number of transcriptional regulatory interactions have been uncovered. In a pilot study integrating sequence data with microarray data, a position weight matrix (PWM) performed poorly in inferring transcriptional interactions (TIs), which represent physical interactions between transcription factors (TF) and upstream sequences of target genes. Inferring a TI means that the promoter sequence of a target is inferred to match the consensus sequence motifs of a potential TF, and their interaction type such as AT or RT is also predicted. Thus, a robust PWM (rPWM) was developed to search for consensus sequence motifs. In addition to rPWM, one feature extracted from ChIP-chip data was incorporated to identify potential TIs under specific conditions. An interaction type classifier was assembled to predict activation/repression of potential TIs using microarray data. This approach, combining an adaptive (learning) fuzzy inference system and an interaction type classifier to predict transcriptional regulatory networks, was named AdaFuzzy.
AdaFuzzy was applied to predict TIs using real genomics data from Saccharomyces cerevisiae. Following one of the latest advances in predicting TIs, constrained probabilistic sparse matrix factorization (cPSMF), and using 19 transcription factors (TFs), we compared AdaFuzzy to four well-known approaches using over-representation analysis and gene set enrichment analysis. AdaFuzzy outperformed these four algorithms. Furthermore, AdaFuzzy was shown to perform comparably to the 'ChIP-experimental method' in inferring TIs identified by each of two sets of large-scale ChIP-chip data. AdaFuzzy was also able to classify all predicted TIs into one or more of the four promoter architectures. The results coincided with known promoter architectures in yeast and provided insights into transcriptional regulatory mechanisms.
AdaFuzzy successfully integrates multiple types of data (sequence, ChIP, and microarray) to predict transcriptional regulatory networks. The validated success in the prediction results implies that AdaFuzzy can be applied to uncover TIs in yeast.
PMCID: PMC2797023  PMID: 19961622
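For readers unfamiliar with position weight matrices, the following sketch builds a standard log-odds PWM from aligned binding sites and scans a promoter sequence for the best-scoring window. It is a textbook PWM, not the robust PWM (rPWM) or the fuzzy inference system of AdaFuzzy; the pseudocount, the uniform background, the example sequences, and the function names are assumptions.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds position weight matrix from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2(counts[base] / total / background)
                    for base in "ACGT"})
    return pwm


def best_match(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = [(sum(pwm[i][sequence[start + i]] for i in range(width)), start)
              for start in range(len(sequence) - width + 1)]
    return max(scored)
```

In a pipeline like the one the abstract outlines, a window score of this kind would be only one input feature, to be combined with ChIP-chip evidence and expression-based activation/repression calls.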
8.  Inferring transcriptional compensation interactions in yeast via stepwise structure equation modeling 
BMC Bioinformatics  2008;9:134.
With the abundant information produced by microarray technology, various approaches have been proposed to infer transcriptional regulatory networks. However, few approaches have studied subtle and indirect interactions such as genetic compensation, the existence of which is widely recognized although its mechanism has yet to be clarified. Furthermore, when inferring gene networks most models include only observed variables, whereas latent factors, such as proteins and mRNA degradation that are not measured by microarrays, do participate in networks in reality.
Motivated by inferring transcriptional compensation (TC) interactions in yeast, a stepwise structural equation modeling algorithm (SSEM) is developed. In addition to observed variables, SSEM also incorporates hidden variables to capture interactions (or regulations) from latent factors. Simulated gene networks are used to determine with which of six possible model selection criteria (MSC) SSEM works best. SSEM with Bayesian information criterion (BIC) results in the highest true positive rates, the largest percentage of correctly predicted interactions from all existing interactions, and the highest true negative (non-existing interactions) rates. Next, we apply SSEM using real microarray data to infer TC interactions among (1) small groups of genes that are synthetic sick or lethal (SSL) to SGS1, and (2) a group of SSL pairs of 51 yeast genes involved in DNA synthesis and repair that are of interest. For (1), SSEM with BIC is shown to outperform three Bayesian network algorithms and a multivariate autoregressive model, checked against the results of qRT-PCR experiments. The predictions for (2) are shown to coincide with several known pathways of Sgs1 and its partners that are involved in DNA replication, recombination and repair. In addition, experimentally testable interactions of Rad27 are predicted.
SSEM is a useful tool for inferring genetic networks, and the results reinforce the possibility of predicting pathways of protein complexes via genetic interactions.
PMCID: PMC2323972  PMID: 18312694
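The role of the Bayesian information criterion in choosing among candidate models can be sketched as follows. This uses the common Gaussian-error form of BIC, n ln(RSS/n) + k ln(n), which may differ by constants or details from the form used in SSEM; the function names and the candidate-model dictionary are hypothetical.

```python
import math


def bic(rss, n, k):
    """BIC for a Gaussian-error model: n * ln(RSS / n) + k * ln(n).

    rss: residual sum of squares, n: number of observations,
    k: number of free parameters. Lower is better.
    """
    return n * math.log(rss / n) + k * math.log(n)


def select_model(candidates, n):
    """Pick the candidate name with the lowest BIC.

    candidates maps a model name to a (rss, k) tuple.
    """
    return min(candidates, key=lambda name: bic(candidates[name][0], n, candidates[name][1]))
```

The k ln(n) penalty means an extra edge or hidden variable is kept only if it reduces the residual error enough to pay for its added parameters, which is the trade-off a stepwise procedure evaluates at each step.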
