Results 1-15 (15)

1.  QChIPat: a quantitative method to identify distinct binding patterns for two biological ChIP-seq samples in different experimental conditions 
BMC Genomics  2013;14(Suppl 8):S3.
Many computational programs have been developed to identify enriched regions for a single biological ChIP-seq sample. Because many biological questions involve comparing two different conditions, it is important to develop new programs that compare two biological ChIP-seq samples. Several programs have been designed to address this question, but they suffer from drawbacks: they cannot determine whether the identified differentially enriched regions are themselves significantly enriched, they do not distinguish binding patterns, and they neglect normalization between samples.
In this study, we developed a novel quantitative method for comparing two biological ChIP-seq samples, called QChIPat. Our method employs a new global normalization method, nonparametric empirical Bayes (NEB) correction normalization; utilizes pre-defined enriched regions identified by single-sample peak calling programs; uses statistical methods to define differentially enriched regions; and then defines binding (histone modification) pattern information for those differentially enriched regions. Our program was tested on a benchmark dataset: the histone modification data used by ChIPDiffs. It was then applied to two case studies: one to identify differential histone modification sites in H3K27me3 and H3K9me2 ChIP-seq data from AKT1-transfected MCF10A cells, the other to identify differential binding sites in TCF7L2 ChIP-seq data from MCF7 and PANC1 cells.
Several advantages of our program include: 1) it considers a control (or input) experiment; 2) it incorporates a novel global normalization strategy: nonparametric empirical Bayes correction normalization; 3) it provides the binding pattern information among different enriched regions. QChIPat is implemented in R, Perl and C++, and has been tested under Linux. The R package is available at
PMCID: PMC4042236  PMID: 24564479
2.  A modulator based regulatory network for ERα signaling pathway 
BMC Genomics  2012;13(Suppl 6):S6.
Estrogens control multiple functions of hormone-responsive breast cancer cells. They regulate diverse physiological processes in various tissues through genomic and non-genomic mechanisms that result in activation or repression of gene expression. Transcription regulation upon estrogen stimulation is a critical biological process underlying the onset and progress of the majority of breast cancers. ERα requires distinct co-regulators or modulators for efficient transcriptional regulation, and together they form a regulatory network. Knowing this regulatory network will enable systematic study of the effect of ERα on breast cancer.
To investigate the regulatory network of ERα and discover novel modulators of ERα functions, we proposed an analytical method based on a linear regression model to identify translational modulators and their network relationships. In the network analysis, a group of specific modulators and target genes was selected according to the functionality of the modulator and the ERα binding. The network formed from target genes with ERα binding was called the ERα genomic regulatory network, while the network formed from target genes without ERα binding was called the ERα non-genomic regulatory network. Considering the active or repressive function of ERα, the active or repressive function of a modulator, and the agonist or antagonist effect of a modulator on ERα, the ERα/modulator/target relationships were categorized into 27 classes.
Using the gene expression data and ERα ChIP-seq data from the MCF-7 cell line, the ERα genomic and non-genomic regulatory networks were built by merging ERα/modulator/target triplets (TF, M, T), where TF refers to ERα, M refers to the modulator, and T refers to the target. Comparing the two networks, the ERα non-genomic network has a lower false discovery rate (FDR) than the genomic network. To validate the two networks, the same network analysis was performed on gene expression data from the ZR-75.1 cell line. The network overlap analysis between the two cancer cell lines showed 1% overlap for the ERα genomic regulatory network, but 4% overlap for the non-genomic regulatory network.
We proposed a novel approach to infer the ERα/modulator/target relationships and to construct the genomic and non-genomic regulatory networks in two cancer cell lines. We found that the non-genomic regulatory network is more reliable than the genomic regulatory network.
PMCID: PMC3481450  PMID: 23134758
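The network overlap comparison in entry 2 can be illustrated with a small sketch. The abstract does not say how overlap was computed, so the definition used here (shared edges divided by all distinct edges, i.e. a Jaccard index expressed as a percentage) is an assumption, and the edge names are hypothetical placeholders rather than real modulators or targets.

```python
def percent_overlap(edges_a, edges_b):
    """Shared edges as a percentage of all distinct edges (Jaccard index x 100).

    This definition of overlap is an assumption made for illustration.
    """
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Hypothetical (modulator, target) edges for two cell lines.
mcf7_edges = {("M1", "T1"), ("M1", "T2"), ("M2", "T3"), ("M3", "T4")}
zr751_edges = {("M1", "T1"), ("M2", "T5"), ("M4", "T6")}

overlap = percent_overlap(mcf7_edges, zr751_edges)
print(f"overlap: {overlap:.1f}%")
```

Other definitions, such as dividing by the size of the smaller network, would give different percentages for the same pair of networks.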
3.  Multivalent epigenetic marks confer microenvironment-responsive epigenetic plasticity to ovarian cancer cells 
Epigenetics  2010;5(8):716-729.
“Epigenetic plasticity” refers to the capability of mammalian cells to alter their differentiation status via chromatin remodeling-associated alterations in gene expression. While epigenetic plasticity has been best associated with lineage commitment of embryonic stem cells, recent studies have demonstrated chromatin remodeling even in terminally differentiated normal cells and advanced-stage melanoma and breast cancer cells, in context-dependent responses to alterations in their microenvironment. In the current study, we extend this attribute of epigenetic plasticity to aggressive ovarian cancer cells, by using an integrative approach to associate cellular phenotypes with chromatin modifications (“ChIP-chip”) and mRNA and microRNA expression. While we identified numerous gene promoters possessing the well-known “bivalent mark” of H3K27me3/H3K4me2, we also report 14 distinct, lesser known bi-, tri- and tetravalent combinations of activating and repressive chromatin modifications, in platinum-resistant CP 70 ovarian cancer cells. The vast majority (>90%) of all the histone marks studied localized to regions within 2,000 bp of transcription start sites, supporting a role in gene regulation. Upon a simple alteration in the microenvironment, transition from two- to three-dimensional culture, an increase (17–38%) in repressive-only marked promoters was observed, concomitant with a decrease (31–21%) in multivalent (i.e., juxtaposed permissive and repressive histone marked) promoters. Like embryonic/tissue stem and other (non-ovarian) carcinoma cells, ovarian cancer cell epigenetic plasticity reflects an inherent transcriptional flexibility for context-responsive alterations in phenotype. It is possible that this plasticity could be therapeutically exploited for the management of this lethal gynecologic malignancy.
PMCID: PMC3052886  PMID: 20676026
histone modifications; gene expression; chromatin remodeling; ovarian cancer; epigenetic plasticity; tumor microenvironment; bivalent histone mark
4.  Preprocessing differential methylation hybridization microarray data 
BioData Mining  2011;4:13.
DNA methylation plays a very important role in the silencing of tumor suppressor genes in various tumor types. In order to gain a genome-wide understanding of how changes in methylation affect tumor growth, the differential methylation hybridization (DMH) protocol has been developed and large amounts of DMH microarray data have been generated. However, it is still unclear how to preprocess this type of microarray data and how the background correction and normalization methods used for two-color gene expression arrays perform on methylation microarray data. In this paper, we demonstrate our discovery of a set of internal control probes that have log ratios (M) theoretically equal to zero according to this DMH protocol. With the aid of this set of control probes, we propose two LOESS (or LOWESS, locally weighted scatter-plot smoothing) normalization methods that are novel and unique to DMH microarray data. Together with two other options (global LOESS and no normalization), this gives four normalization methods to compare. In addition, we compare five different background correction methods.
We study 20 different preprocessing methods, which are the combinations of five background correction methods and four normalization methods. To compare these 20 methods, we evaluate their performance in identifying known methylated and unmethylated housekeeping genes based on two statistics. Comparison details are illustrated using breast cancer cell line and ovarian cancer patient methylation microarray data. Our comparison results show that the different background correction methods perform similarly; however, the four normalization methods perform very differently. In particular, all three LOESS normalization methods perform better than no normalization.
It is necessary to do within-array normalization, and the two LOESS normalization methods based on specific DMH internal control probes produce more stable and relatively better results than the global LOESS normalization method.
PMCID: PMC3118966  PMID: 21575229
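The idea behind the control probes in entry 4 (probes whose log ratio M should be zero) can be sketched in a much simpler form than the LOESS methods the abstract describes. The example below is not LOESS: it only shifts all M values by the median M of the control probes, which ignores any intensity dependence. The function name and the data are hypothetical.

```python
from statistics import median


def center_on_control_probes(m_values, is_control):
    """Shift log ratios so that the control probes have a median M of zero.

    A deliberately simple stand-in for control-probe-based LOESS: it applies
    one constant shift instead of an intensity-dependent smooth curve.
    """
    control = [m for m, flag in zip(m_values, is_control) if flag]
    if not control:
        raise ValueError("at least one control probe is required")
    shift = median(control)
    return [m - shift for m in m_values]


# Hypothetical log ratios; the first three probes are internal controls.
m_values = [0.5, 0.7, 0.6, 2.0, -1.0]
is_control = [True, True, True, False, False]

centered = center_on_control_probes(m_values, is_control)
print(centered)
```

A LOESS variant would replace the constant shift with a smooth function of average intensity fitted to the control probes only.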
5.  A modulated empirical Bayes model for identifying topological and temporal estrogen receptor α regulatory networks in breast cancer 
BMC Systems Biology  2011;5:67.
Estrogens regulate diverse physiological processes in various tissues through genomic and non-genomic mechanisms that result in activation or repression of gene expression. Transcription regulation upon estrogen stimulation is a critical biological process underlying the onset and progress of the majority of breast cancers. Dynamic gene expression changes have been shown to characterize the breast cancer cell response to estrogens, the exact molecular mechanism of which is still not well understood.
We developed a modulated empirical Bayes model and constructed a novel topological and temporal transcription factor (TF) regulatory network in the MCF7 breast cancer cell line upon stimulation by 17β-estradiol. In the network, significant TF genomic hubs were identified, including ER-alpha and AP-1; significant non-genomic hubs include ZFP161, TFDP1, NRF1, TFAP2A, EGR1, E2F1, and PITX2. Although the early and late networks were distinct (<5% overlap of ERα target genes between the 4 and 24 h time points), all nine hubs were significantly represented in both networks. In MCF7 cells with acquired resistance to tamoxifen, the ERα regulatory network was unresponsive to 17β-estradiol stimulation. The significant loss of hormone responsiveness was associated with marked epigenomic changes, including hyper- or hypo-methylation of promoter CpG islands and repressive histone methylations.
We identified a number of estrogen-regulated target genes and established an estrogen-regulated network that distinguishes the genomic and non-genomic actions of the estrogen receptor. Many gene targets of this network were no longer active in anti-estrogen resistant cell lines, possibly because their DNA methylation and histone acetylation patterns have changed.
PMCID: PMC3117732  PMID: 21554733
6.  Identifying hypermethylated CpG islands using a quantile regression model 
BMC Bioinformatics  2011;12:54.
DNA methylation has been shown to play an important role in the silencing of tumor suppressor genes in various tumor types. In order to have a system-wide understanding of the methylation changes that occur in tumors, we have developed a differential methylation hybridization (DMH) protocol that can simultaneously assay the methylation status of all known CpG islands (CGIs) using microarray technologies. A large percentage of signals obtained from microarrays can be attributed to various measurable and unmeasurable confounding factors unrelated to the biological question at hand. In order to correct the bias due to noise, we first implemented a quantile regression model, with a quantile level equal to 75%, to identify hypermethylated CGIs in an earlier work. As a proof of concept, we applied this model to methylation microarray data generated from breast cancer cell lines. However, we were unsure whether 75% was the best quantile level for identifying hypermethylated CGIs. In this paper, we attempt to determine which quantile level should be used to identify hypermethylated CGIs and their associated genes.
We introduce three statistical measurements to compare the performance of the proposed quantile regression model at different quantile levels (95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%), using known methylated genes and unmethylated housekeeping genes reported in breast cancer cell lines and ovarian cancer patients. Our results show that the quantile levels ranging from 80% to 90% are better at identifying known methylated and unmethylated genes.
In this paper, we propose to use a quantile regression model to identify hypermethylated CGIs by incorporating probe effects to account for noise due to unmeasurable factors. Our model can efficiently identify hypermethylated CGIs in both breast and ovarian cancer data.
PMCID: PMC3051900  PMID: 21324121
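The role of the quantile level in entry 6 can be illustrated with the loss function that quantile regression minimizes. The sketch below is an intercept-only case under stated assumptions: it has no probe effects or covariates, so it is not the authors' model, and the signal values are hypothetical. It shows only that minimizing the pinball (check) loss at level tau recovers the tau-th quantile of the data.

```python
def pinball_loss(values, q, tau):
    """Total check loss of predicting the constant q at quantile level tau."""
    total = 0.0
    for x in values:
        residual = x - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total


def fit_constant_quantile(values, tau):
    """Intercept-only quantile regression: search the observed values for the
    constant that minimizes the pinball loss."""
    return min(values, key=lambda q: pinball_loss(values, q, tau))


# Hypothetical probe signals.
signals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

q75 = fit_constant_quantile(signals, 0.75)
q85 = fit_constant_quantile(signals, 0.85)
print(q75, q85)
```

Raising tau moves the fitted value toward the upper tail, so fewer observations lie above it; this is why the choice of quantile level changes which CGIs would be called hypermethylated.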
7.  Identification of novel DNA methylation inhibitors via a two-component reporter gene system 
Targeting abnormal DNA methylation represents a therapeutically relevant strategy for cancer treatment, as demonstrated by the US Food and Drug Administration approval of the DNA methyltransferase inhibitors azacytidine and 5-aza-2'-deoxycytidine for the treatment of myelodysplastic syndromes. However, their use is associated with increased incidences of bone marrow suppression. Alternatively, procainamide has emerged as a potential DNA demethylating agent for clinical translation. While procainamide is much safer than 5-aza-2'-deoxycytidine, it requires high concentrations to be effective at DNA demethylation and suppression of cancer cell growth. Thus, our laboratories have embarked on the pharmacological exploitation of procainamide to develop potent DNA methylation inhibitors through lead optimization.
We report the use of a DNA methylation two-component enhanced green fluorescent protein reporter system as a screening platform to identify novel DNA methylation inhibitors from a compound library containing procainamide derivatives.
A lead agent IM25, which exhibits substantially higher potency in GSTp1 DNA demethylation with lower cytotoxicity in MCF7 cells relative to procainamide and 5-aza-2'-deoxycytidine, was identified by the screening platform.
Our data provide a proof-of-concept that procainamide could be pharmacologically exploited to develop novel DNA methylation inhibitors, of which the translational potential in cancer therapy/prevention is currently under investigation.
PMCID: PMC3025941  PMID: 21219604
8.  Identifying differentially methylated genes using mixed effect and generalized least square models 
BMC Bioinformatics  2009;10:404.
DNA methylation plays an important role in the process of tumorigenesis. Identifying differentially methylated genes or CpG islands (CGIs) associated with genes between two tumor subtypes is thus an important biological question. The methylation status of all CGIs in the whole genome can be assayed with differential methylation hybridization (DMH) microarrays. However, patient samples or cell lines are heterogeneous, so their methylation pattern may be very different. In addition, neighboring probes at each CGI are correlated. How these factors affect the analysis of DMH data is unknown.
We propose a new method for identifying differentially methylated (DM) genes by identifying the associated DM CGI(s). At each CGI, we implement four different mixed effect and generalized least square models to identify DM genes between two groups. We compare these four models with a simple least square regression model to study the impact of incorporating random effects and correlations.
We demonstrate that the inclusion (or exclusion) of random effects and the choice of correlation structures can significantly affect the results of the data analysis. We also assess the false discovery rate of different models using CGIs associated with housekeeping genes.
PMCID: PMC2800121  PMID: 20003206
9.  An integrative ChIP-chip and gene expression profiling to model SMAD regulatory modules 
BMC Systems Biology  2009;3:73.
The TGF-β/SMAD pathway is part of a broader signaling network in which crosstalk between pathways occurs. While the molecular mechanisms of the TGF-β/SMAD signaling pathway have been studied in detail, the global networks downstream of SMAD remain largely unknown. The regulatory effect of the SMAD complex likely depends on transcriptional modules, in which the SMAD binding elements and partner transcription factor binding sites (SMAD modules) are present in a specific context.
To address this question and develop a computational model for SMAD modules, we simultaneously performed chromatin immunoprecipitation followed by microarray analysis (ChIP-chip) and mRNA expression profiling to identify TGF-β/SMAD regulated and synchronously coexpressed gene sets in ovarian surface epithelium. Intersecting the ChIP-chip and gene expression data yielded 150 direct targets, of which 141 were grouped into 3 co-expressed gene sets (sustained up-regulated, transient up-regulated and down-regulated), based on their temporal changes in expression after TGF-β activation. We developed a data-mining method driven by the Random Forest algorithm to model SMAD transcriptional modules in the target sequences. The predicted SMAD modules contain a SMAD binding element and up to 2 of 7 other transcription factor binding sites (E2F, P53, LEF1, ELK1, COUPTF, PAX4 and DR1).
Together, the computational results further the understanding of the interactions between SMAD and other transcription factors at specific target promoters, and provide the basis for more targeted experimental verification of the co-regulatory modules.
PMCID: PMC2724489  PMID: 19615063
10.  Integrated analysis of DNA methylation and gene expression reveals specific signaling pathways associated with platinum resistance in ovarian cancer 
BMC Medical Genomics  2009;2:34.
Cisplatin and carboplatin are the primary first-line therapies for the treatment of ovarian cancer. However, resistance to these platinum-based drugs occurs in the large majority of initially responsive tumors, resulting in fully chemoresistant, fatal disease. Although the precise mechanism(s) underlying the development of platinum resistance in late-stage ovarian cancer patients currently remains unknown, CpG-island (CGI) methylation, a phenomenon strongly associated with aberrant gene silencing and ovarian tumorigenesis, may contribute to this devastating condition.
To model the onset of drug resistance, and investigate DNA methylation and gene expression alterations associated with platinum resistance, we treated clonally derived, drug-sensitive A2780 epithelial ovarian cancer cells with increasing concentrations of cisplatin. After several cycles of drug selection, the isogenic drug-sensitive and -resistant pairs were subjected to global CGI methylation and mRNA expression microarray analyses. To identify chemoresistance-associated, biological pathways likely impacted by DNA methylation, promoter CGI methylation and mRNA expression profiles were integrated and subjected to pathway enrichment analysis.
Promoter CGI methylation revealed a positive association (Spearman correlation of 0.99) between the total number of hypermethylated CGIs and GI50 values (i.e., increased drug resistance) following successive cisplatin treatment cycles. In accord with that result, chemoresistance was reversible by DNA methylation inhibitors. Pathway enrichment analysis revealed hypermethylation-mediated repression of cell adhesion and tight junction pathways and hypomethylation-mediated activation of the cell growth-promoting pathways PI3K/Akt, TGF-beta, and cell cycle progression, which may contribute to the onset of chemoresistance in ovarian cancer cells.
Selective epigenetic disruption of distinct biological pathways was observed during development of platinum resistance in ovarian cancer. Integrated analysis of DNA methylation and gene expression may allow for the identification of new therapeutic targets and/or biomarkers prognostic of disease response. Finally, our results suggest that epigenetic therapies may facilitate the prevention or reversal of transcriptional repression responsible for chemoresistance and the restoration of sensitivity to platinum-based chemotherapeutics.
PMCID: PMC2712480  PMID: 19505326
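Entry 10 reports a Spearman correlation between the number of hypermethylated CGIs and GI50 values. As a minimal sketch of how such a rank correlation is computed, the example below ranks both variables (averaging ranks for ties) and takes the Pearson correlation of the ranks. The numbers are hypothetical and are not the study's data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values per treatment cycle.
hypermethylated_cgis = [120, 180, 260, 400, 650]
gi50_values = [1.0, 2.5, 4.0, 9.0, 20.0]

rho = spearman(hypermethylated_cgis, gi50_values)
print(round(rho, 3))
```

Because the statistic depends only on ranks, any strictly increasing relationship gives a value of 1 regardless of its shape.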
11.  STAT3 can be activated through paracrine signaling in breast epithelial cells 
BMC Cancer  2008;8:302.
Many cancers, including breast cancer, have been identified with increased levels of the phosphorylated, or active, form of Signal Transducers and Activators of Transcription 3 (STAT3) protein. However, whether the tumor microenvironment plays a role in this activation is still poorly understood.
Conditioned media, which contain soluble factors from MDA-MB-231 and MDA-MB-468 breast cancer cells and breast cancer associated fibroblasts, were added to MCF-10A breast epithelial and MDA-MB-453 breast cancer cells. The stimulation of phosphorylated STAT3 (p-STAT3) levels by conditioned media was assayed by Western blot in the presence or absence of a neutralizing IL-6 antibody or a JAK/STAT3 inhibitor, JSI-124. The stimulation of cell proliferation in MCF-10A cells by conditioned media in the presence or absence of JSI-124 was assessed by MTT analysis. IL-6, IL-10, and VEGF levels were determined by ELISA analysis.
Our results demonstrated that conditioned media from cell lines with constitutively active STAT3 are sufficient to induce p-STAT3 levels in various recipients that do not possess elevated p-STAT3 levels. This signaling occurs through the JAK/STAT3 pathway, leading to STAT3 phosphorylation as early as 30 minutes and is persistent for at least 24 hours. ELISA analysis confirmed a correlation between elevated levels of IL-6 production and p-STAT3. Neutralization of the IL-6 ligand or gp130 was sufficient to block increased levels of p-STAT3 (Y705) in treated cells. Furthermore, soluble factors within the MDA-MB-231 conditioned media were also sufficient to stimulate an increase in IL-6 production from MCF-10A cells.
These results demonstrate STAT3 phosphorylation in breast epithelial cells can be stimulated by paracrine signaling through soluble factors from both breast cancer cells and breast cancer associated fibroblasts with elevated STAT3 phosphorylation. The induction of STAT3 phosphorylation is through the IL-6/JAK pathway and appears to be associated with cell proliferation. Understanding how IL-6 and other soluble factors may lead to STAT3 activation via the tumor microenvironment will provide new therapeutic regimens for breast carcinomas and other cancers with elevated p-STAT3 levels.
PMCID: PMC2582243  PMID: 18939993
12.  Probe signal correction for differential methylation hybridization experiments 
BMC Bioinformatics  2008;9:453.
Non-biological signal (or noise) has been the bane of microarray analysis. Hybridization effects related to probe-sequence composition and DNA dye-probe interactions have been observed in differential methylation hybridization (DMH) microarray experiments as well as other effects inherent to the DMH protocol.
We suggest two models to correct for non-biologically relevant probe signal with an overarching focus on probe-sequence composition. The estimated effects are evaluated and the strengths of the models are considered in the context of DMH analyses.
The majority of estimated parameters were statistically significant in all considered models. Model selection for signal correction is based on interpretation of the estimated values and their biological significance.
PMCID: PMC2603337  PMID: 18947421
13.  A Poisson mixture model to identify changes in RNA polymerase II binding quantity using high-throughput sequencing technology 
BMC Genomics  2008;9(Suppl 2):S23.
We present a mixture model-based analysis for identifying differences in the distribution of RNA polymerase II (Pol II) in transcribed regions, measured using ChIP-seq (chromatin immunoprecipitation followed by massively parallel sequencing). The statistical model assumes that the number of Pol II-targeted sequences contained within each genomic region follows a Poisson distribution. A Poisson mixture model was then developed to distinguish Pol II binding changes in transcribed regions using an empirical approach, with an expectation-maximization (EM) algorithm developed for estimation and inference. In order to achieve a global maximum in the M-step, a particle swarm optimization (PSO) was implemented. We applied this model to Pol II binding data generated from hormone-dependent MCF7 breast cancer cells and antiestrogen-resistant MCF7 breast cancer cells before and after treatment with 17β-estradiol (E2). We determined that in the hormone-dependent cells, ~9.9% (2527) of genes showed significant changes in Pol II binding after E2 treatment. However, only ~0.7% (172) of genes displayed significant Pol II binding changes in E2-treated antiestrogen-resistant cells. These results show that a Poisson mixture model can be used to analyze ChIP-seq data.
PMCID: PMC2559888  PMID: 18831789
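The Poisson mixture idea in entry 13 can be sketched with a generic two-component mixture fitted by EM. This is an illustration under stated assumptions, not the authors' model: it fits raw counts rather than binding changes, it uses the closed-form M-step of a plain Poisson mixture instead of the particle swarm optimization mentioned in the abstract, and the counts, starting values, and function names are hypothetical.

```python
import math


def poisson_logpmf(k, lam):
    """Log probability of count k under a Poisson distribution with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def fit_two_component_poisson(counts, lam_init=(1.0, 10.0), n_iter=200):
    """EM for a two-component Poisson mixture with a closed-form M-step."""
    lam = list(lam_init)
    weight = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each count.
        resp = []
        for k in counts:
            logp = [math.log(weight[c]) + poisson_logpmf(k, lam[c]) for c in range(2)]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            norm = sum(unnorm)
            resp.append([v / norm for v in unnorm])
        # M-step: weighted proportions and weighted means, floored to stay positive.
        for c in range(2):
            total = max(sum(r[c] for r in resp), 1e-12)
            weight[c] = max(total / len(counts), 1e-12)
            lam[c] = max(sum(r[c] * k for r, k in zip(resp, counts)) / total, 1e-9)
    return weight, lam, resp


# Hypothetical read counts per region: a low group and a high group.
counts = [0, 1, 2, 1, 0, 2, 1, 20, 22, 19, 21, 18]

weight, lam, resp = fit_two_component_poisson(counts)
print([round(w, 3) for w in weight], [round(v, 2) for v in lam])
```

The posterior probabilities in `resp` give a soft assignment of each region to the low or high component, which is the quantity one would threshold to call a region as belonging to one group.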
14.  Methylation Linear Discriminant Analysis (MLDA) for identifying differentially methylated CpG islands 
BMC Bioinformatics  2008;9:337.
Hypermethylation of promoter CpG islands is strongly correlated to transcriptional gene silencing and epigenetic maintenance of the silenced state. As well as its role in tumor development, CpG island methylation contributes to the acquisition of resistance to chemotherapy. Differential Methylation Hybridisation (DMH) is one technique used for genome-wide DNA methylation analysis. The study of such microarray data sets should ideally account for the specific biological features of DNA methylation and the non-symmetrical distribution of the ratios of unmethylated and methylated sequences hybridised on the array. We have therefore developed a novel algorithm tailored to this type of data, Methylation Linear Discriminant Analysis (MLDA).
MLDA was programmed in R (version 2.7.0) and the package is available at CRAN [1]. This approach utilizes linear regression models of non-normalised hybridisation data to define methylation status. Log-transformed signal intensities of unmethylated controls on the microarray are used as a reference. The signal intensities of DNA samples digested with methylation sensitive restriction enzymes and mock digested are then transformed to the likelihood of a locus being methylated using this reference. We tested the ability of MLDA to identify loci differentially methylated as analysed by DMH between cisplatin sensitive and resistant ovarian cancer cell lines. MLDA identified 115 differentially methylated loci and 23 out of 26 of these loci have been independently validated by Methylation Specific PCR and/or bisulphite pyrosequencing.
MLDA has advantages for analyzing methylation data from CpG island microarrays, since there is a clear rationale for the definition of methylation status, it uses DMH data without between-group normalisation, and it is less influenced by cross-hybridisation of loci. The MLDA algorithm successfully identified differentially methylated loci between two classes of samples analysed by DMH using CpG island microarrays.
PMCID: PMC2529322  PMID: 18691414
15.  Genome-wide analysis of alternative promoters of human genes using a custom promoter tiling array 
BMC Genomics  2008;9:349.
Independent lines of evidence suggested that a large fraction of human genes possess multiple promoters driving gene expression from distinct transcription start sites. Understanding which promoter is employed in which cellular context is required to unravel gene regulatory networks within the cell.
We have developed a custom microarray platform that tiles roughly 35,000 alternative putative promoters from nearly 7,000 genes in the human genome. To demonstrate the utility of this array platform, we have analyzed the patterns of promoter usage in 17β-estradiol (E2)-treated and untreated MCF7 cells and show widespread usage of alternative promoters. Most intriguingly, we show that the downstream promoter in E2-sensitive multiple promoter genes tends to be very close to the 3'-terminus of the gene, suggesting exotic mechanisms of expression regulation in these genes.
The usage of alternative promoters greatly multiplies the transcriptional complexity available within the human genome. The fact that many of these promoters are incapable of driving the synthesis of a meaningful protein-encoding transcript further complicates the story.
PMCID: PMC2527337  PMID: 18655706
