Background
DNA methylation is an important epigenetic mark, and dysregulation of DNA methylation is associated with many diseases, including cancer. Advances in next-generation sequencing now allow unbiased methylome profiling of entire patient cohorts, greatly facilitating biomarker discovery and presenting new opportunities to understand the biological mechanisms by which changes in methylation contribute to disease. Enrichment-based sequencing assays such as MethylCap-seq are a cost-effective solution for genome-wide determination of methylation status, but the technical reliability of methylation reconstruction from raw sequencing data has not been well characterized.
Methods
We analyze three MethylCap-seq data sets and perform two different analyses to assess data quality. First, we investigate how data quality is affected by excluding samples that do not meet quality control cutoff requirements. Second, we consider the effect of additional reads on enrichment score, saturation, and coverage. Lastly, we verify a method for the determination of the global amount of methylation from MethylCap-seq data by comparing to a spiked-in control DNA of known methylation status.
Results
We show that rejection of samples based on our quality control parameters leads to a significant improvement of methylation calling. Additional reads beyond ~13 million unique aligned reads improved coverage, modestly improved saturation, and did not impact enrichment score. Lastly, we find that a global methylation indicator calculated from MethylCap-seq data correlates well with the global methylation level of a sample as obtained from a spike-in DNA of known methylation level.
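The sample-rejection step described above can be sketched as a simple filter over per-sample QC metrics. This is a minimal illustration only: the abstract does not state the actual cutoff values, so the class `SampleQC`, the function names, and every threshold passed in below are hypothetical placeholders rather than the authors' parameters.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Per-sample QC metrics named in the abstract (field names are hypothetical)."""
    name: str
    unique_aligned_reads: int
    enrichment_score: float
    saturation: float


def passes_qc(sample, min_reads, min_enrichment, min_saturation):
    """Return True if the sample meets every cutoff (cutoffs are caller-supplied)."""
    return (
        sample.unique_aligned_reads >= min_reads
        and sample.enrichment_score >= min_enrichment
        and sample.saturation >= min_saturation
    )


def split_by_qc(samples, min_reads, min_enrichment, min_saturation):
    """Partition samples into (kept, rejected) lists according to the cutoffs."""
    kept, rejected = [], []
    for s in samples:
        target = kept if passes_qc(s, min_reads, min_enrichment, min_saturation) else rejected
        target.append(s)
    return kept, rejected


# Illustrative values only; none of these numbers come from the paper.
samples = [
    SampleQC("sample_A", 15_000_000, 2.4, 0.92),
    SampleQC("sample_B", 4_000_000, 2.6, 0.70),
    SampleQC("sample_C", 16_000_000, 1.1, 0.95),
]
kept, rejected = split_by_qc(
    samples, min_reads=10_000_000, min_enrichment=1.5, min_saturation=0.8
)
```

Keeping the cutoffs as explicit arguments makes it easy to rerun the downstream methylation calling with and without the rejected samples, which is the comparison the abstract describes.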
Conclusions
We show that with appropriate quality control MethylCap-seq is a reliable tool, suitable for cohorts of hundreds of patients, that provides reproducible methylation information on a feature by feature basis as well as information about the global level of methylation.
doi:10.1186/1471-2164-13-S8-S6
PMCID: PMC3535705
PMID: 23281662
Rodriguez, Benjamin AT | Frankhouser, David | Murphy, Mark | Trimarchi, Michael | Tam, Hok-Hei | Curfman, John | Huang, Rita | Chan, Michael WY | Lai, Hung-Cheng | Parikh, Deval | Ball, Bryan | Schwind, Sebastian | Blum, William | Marcucci, Guido | Yan, Pearlly | Bundschuh, Ralf
Background
Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measures, and data visualization. Currently there is a lack of workflows for efficient analysis of large MethylCap-seq datasets containing multiple sample groups.
Methods
The NGS application MethylCap-seq involves the in vitro capture of methylated DNA and subsequent analysis of enriched fragments by massively parallel sequencing. The workflow we describe performs MethylCap-seq experimental Quality Control (QC), sequence file processing and alignment, differential methylation analysis of multiple biological groups, hierarchical clustering, assessment of genome-wide methylation patterns, and preparation of files for data visualization.
Results
Here, we present a scalable, flexible workflow for MethylCap-seq QC, secondary data analysis, tertiary analysis of multiple experimental groups, and data visualization. We demonstrate the experimental QC procedure with results from a large ovarian cancer study dataset and propose parameters which can identify problematic experiments. Promoter methylation profiling and hierarchical clustering analyses are demonstrated for four groups of acute myeloid leukemia (AML) patients. We propose a Global Methylation Indicator (GMI) function to assess genome-wide changes in methylation patterns between experimental groups. We also show how the workflow facilitates data visualization in a web browser with the application Anno-J.
Conclusions
This workflow and its suite of features will assist biologists in conducting methylation profiling projects and facilitate meaningful biological interpretation.
doi:10.1186/1471-2164-13-S6-S14
PMCID: PMC3481483
PMID: 23134780
Villar, Diego | Ortiz-Barahona, Amaya | Gómez-Maldonado, Laura | Pescador, Nuria | Sánchez-Cabo, Fátima | Hackl, Hubert | Rodriguez, Benjamin A. T. | Trajanoski, Zlatko | Dopazo, Ana | Huang, Tim H. M. | Yan, Pearlly S. | del Peso, Luis | Mata, Juan
The transcriptional response driven by Hypoxia-inducible factor (HIF) is central to the adaptation to oxygen restriction. Despite recent characterization of genome-wide HIF DNA binding locations and hypoxia-regulated transcripts in different cell types, the molecular bases of HIF target selection remain unresolved. Herein, we combined multi-level experimental data and computational predictions to identify sequence motifs that may contribute to HIF target selectivity. We obtained a core set of bona fide HIF binding regions by integrating multiple HIF1 DNA binding and hypoxia expression profiling datasets. This core set exhibits evolutionarily conserved binding regions and is enriched in functional responses to hypoxia. Computational prediction of enriched transcription factor binding sites identified sequence motifs corresponding to several stress-responsive transcription factors, such as activator protein 1 (AP1), cAMP response element-binding (CREB), or CCAAT-enhancer binding protein (CEBP). Experimental validations on HIF-regulated promoters suggest a functional role of the identified motifs in modulating HIF-mediated transcription. Accordingly, transcriptional targets of these factors are over-represented in a sorted list of hypoxia-regulated genes. Altogether, our results implicate cooperativity among stress-responsive transcription factors in fine-tuning the HIF transcriptional response.
doi:10.1371/journal.pone.0045708
PMCID: PMC3454324
PMID: 23029193
He, Huiling | Liyanarachchi, Sandya | Akagi, Keiko | Nagy, Rebecca | Li, Jingfeng | Dietrich, Rosemary C | Li, Wei | Sebastian, Nikhil | Wen, Bernard | Xin, Baozhong | Singh, Jarnail | Yan, Pearlly | Alder, Hansjuerg | Haan, Eric | Wieczorek, Dagmar | Albrecht, Beate | Puffenberger, Erik | Wang, Heng | Westman, Judith A. | Padgett, Richard A | Symer, David E | de la Chapelle, Albert
Small nuclear RNAs (snRNAs) are essential factors in mRNA splicing. By homozygosity mapping and deep sequencing, we show that a gene encoding U4atac snRNA, a component of the minor U12-dependent spliceosome, is mutated in individuals with microcephalic osteodysplastic primordial dwarfism type I (MOPD I), a severe developmental disorder characterized by extreme intrauterine growth retardation and multiple organ abnormalities. Functional assays show that mutations (30G>A, 51G>A, 55G>A, and 111G>A) associated with MOPD I cause defective U12-dependent splicing. Endogenous U12-dependent but not U2-dependent introns are poorly spliced in MOPD I patient fibroblast cells while introduction of wild type U4atac snRNA into MOPD I cells enhances U12-dependent splicing. These results illustrate the critical role of minor intron splicing in human development.
doi:10.1126/science.1200587
PMCID: PMC3380448
PMID: 21474760
microcephalic osteodysplastic primordial dwarfism type I; RNU4ATAC; mutation; splicing; snRNA; minor spliceosome
Rodriguez, Benjamin A.T. | Weng, Yu-I | Liu, Ta-Ming | Zuo, Tao | Hsu, Pei-Yin | Lin, Ching-Hung | Cheng, Ann-Lii | Cui, Hengmi | Yan, Pearlly S. | Huang, Tim H.-M.
While tumor suppressor genes frequently undergo epigenetic silencing in cancer, how the instructions directing this transcriptional repression are transmitted in cancer cells remains largely unclear. Expression of cyclin-dependent kinase inhibitor 1C (CDKN1C), an imprinted gene on chromosomal band 11p15.5, is reduced or lost in the majority of breast cancers. Here, we report that CDKN1C is suppressed by estrogen through epigenetic mechanisms involving the chromatin-interacting noncoding RNA KCNQ1OT1 and CCCTC-binding factor (CTCF). Activation of estrogen signaling reduced CDKN1C expression 3-fold (P < 0.001) and established repressive histone modifications at the 5′ regulatory region of the locus. These events were concomitant with induction of KCNQ1OT1 expression as well as increased recruitment of CTCF to both the distal KCNQ1OT1 promoter-associated imprinting control region (ICR) and the CDKN1C locus. Transient depletion of CTCF by small interfering RNA increased CDKN1C expression and significantly reduced the estrogen-mediated repression of CDKN1C. Further studies in breast cancer cell lines indicated that the epigenetic silencing of CDKN1C occurs in part as the result of genetic loss of the inactive methylated 11p15.5 ICR allele (R² = 0.612, P < 0.001). We also found a novel cis-encoded antisense transcript, CDKN1C-AS, which is induced by estrogen signaling following pharmacologic inhibition of DNA methyltransferase and histone deacetylase activity. Forced expression of CDKN1C-AS was capable of repressing endogenous CDKN1C in vivo. Our findings suggest that in addition to promoter hypermethylation, epigenetic repression of tumor suppressor genes by CTCF and noncoding RNA transcripts could be more common and important than previously understood.
doi:10.1093/carcin/bgr017
PMCID: PMC3106431
PMID: 21304052
Yeh, Kun-Tu | Chen, Tze-Ho | Yang, Hui-Wen | Chou, Jian-Liang | Chen, Lin-Yu | Yeh, Chia-Ming | Chen, Yu-Hsin | Lin, Ru-Inn | Su, Her-Young | Chen, Gary CW | Deatherage, Daniel E | Huang, Yi-Wen | Yan, Pearlly S | Lin, Huey-Jen | Nephew, Kenneth P | Huang, Tim H-M | Lai, Hung-Cheng | Chan, Michael WY
Aberrant TGFβ signaling may alter the expression of downstream targets and promote ovarian carcinogenesis. However, the mechanism of this impairment is not fully understood. Our previous study identified RunX1T1 as a putative SMAD4 target in an immortalized ovarian surface epithelial cell line, IOSE. In this study, we report that transcription of RunX1T1 was confirmed to be positively regulated by SMAD4 in IOSE cells and epigenetically silenced in a panel of ovarian cancer cell lines by promoter hypermethylation and histone methylation at H3 lysine 9. SMAD4 depletion increased repressive histone modifications of the RunX1T1 promoter without affecting promoter methylation in IOSE cells. Epigenetic treatment can restore RunX1T1 expression by reversing its epigenetic status in MCP 3 ovarian cancer cells. When transiently treated with a demethylating agent, the expression of RunX1T1 was partially restored in MCP 3 cells, but gradual re-silencing through promoter re-methylation was observed after the treatment. Interestingly, SMAD4 knockdown accelerated this re-silencing process, suggesting that normal TGFβ signaling is essential for the maintenance of RunX1T1 expression. In vivo analysis confirmed that hypermethylation of RunX1T1 was detected in 35.7% (34/95) of ovarian tumors with high clinical stages (p = 0.035) and in 83% (5/6) of primary ovarian cancer-initiating cells. Additionally, concurrent methylation of RunX1T1 and another SMAD4 target, FBXO32, which was previously found to be hypermethylated in ovarian cancer, was observed in this same sample cohort (p < 0.05). Restoration of RunX1T1 inhibited cancer cell growth. Taken together, dysregulated TGFβ/SMAD4 signaling may lead to epigenetic silencing of a putative tumor suppressor, RunX1T1, during ovarian carcinogenesis.
doi:10.4161/epi.6.6.15856
PMCID: PMC3359493
PMID: 21540640
ovarian cancer; epigenetics; TGFβ; RunX1T1
von dem Knesebeck, Anna | Felsberg, Jörg | Waha, Anke | Hartmann, Wolfgang | Scheffler, Björn | Glas, Martin | Hammes, Jennifer | Mikeska, Thomas | Yan, Pearlly S | Endl, Elmar | Simon, Matthias | Reifenberger, Guido | Pietsch, Torsten | Waha, Andreas
Alterations of DNA methylation play an important role in gliomas. In a genome-wide screen, we identified a CpG-rich fragment within the 5′ region of the tumor necrosis factor receptor superfamily, member 11A gene (TNFRSF11A) that showed de novo methylation in gliomas. TNFRSF11A, also known as receptor activator of NF-κB (RANK), activates several signaling pathways, such as NF-κB, JNK, ERK, p38α, and Akt/PKB. Using pyrosequencing, we detected RANK/TNFRSF11A promoter methylation in 8 (57.1%) of 14 diffuse astrocytomas, 17 (77.3%) of 22 anaplastic astrocytomas, 101 (84.2%) of 120 glioblastomas, 6 (100%) of 6 glioma cell lines, and 7 (100%) of 7 glioma stem cell-enriched glioblastoma primary cultures but not in four normal white matter tissue samples. Treatment of glioma cell lines with the demethylating agent 5-aza-2′-deoxycytidine significantly reduced the methylation level and resulted in increased RANK/TNFRSF11A mRNA expression. Overexpression of RANK/TNFRSF11A in glioblastoma cell lines leads to a significant reduction in focus formation and elevated apoptotic activity after flow cytometric analysis. Reporter assay studies of transfected glioma cells supported these results by showing the activation of signaling pathways associated with regulation of apoptosis. We conclude that RANK/TNFRSF11A is a novel and frequent target for de novo methylation in gliomas, which affects apoptotic activity and focus formation thereby contributing to the molecular pathogenesis of gliomas.
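As a small arithmetic check, the methylation frequencies quoted above can be recomputed from the reported counts. The counts are taken directly from the abstract; the dictionary layout and the helper name `percent` are illustrative choices.

```python
# (methylated, total) counts as reported in the abstract.
counts = {
    "diffuse astrocytomas": (8, 14),
    "anaplastic astrocytomas": (17, 22),
    "glioblastomas": (101, 120),
    "glioma cell lines": (6, 6),
    "glioma stem cell-enriched primary cultures": (7, 7),
}


def percent(methylated, total):
    """Methylation frequency as a percentage, rounded to one decimal place."""
    return round(100 * methylated / total, 1)


frequencies = {group: percent(k, n) for group, (k, n) in counts.items()}
for group, value in frequencies.items():
    print(f"{group}: {value}%")
```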
PMCID: PMC3394195
PMID: 22787434
Rodriguez, Benjamin | Tam, Hok-Hei | Frankhouser, David | Trimarchi, Michael | Murphy, Mark | Kuo, Chris | Parikh, Deval | Ball, Bryan | Schwind, Sebastian | Curfman, John | Blum, William | Marcucci, Guido | Yan, Pearlly | Bundschuh, Ralf
Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measures, and data visualization. The NGS application MethylCap-seq involves the in vitro capture of methylated DNA and subsequent analysis of enriched fragments by massively parallel sequencing. Here, we present a scalable, flexible workflow for MethylCap-seq Quality Control, secondary data analysis, tertiary analysis of multiple experimental groups, and data visualization. This workflow and its suite of features will assist biologists in conducting methylation profiling projects and facilitate meaningful biological interpretation.
doi:10.1109/GENSiPS.2011.6169426
PMCID: PMC3320741
PMID: 22484542
next generation sequencing; DNA methylation; epigenetics; cancer; data analysis; data visualization
Next Generation Sequencing (NGS) is highly resource intensive. NGS tasks related to data processing, management and analysis require high-end computing servers or even clusters. Additionally, processing NGS experiments requires suitable storage space and significant manual interaction. At The Ohio State University's Biomedical Informatics Shared Resource, we designed and implemented a scalable architecture to address the challenges associated with the resource intensive nature of NGS secondary analysis, built around Illumina Genome Analyzer II sequencers and Illumina’s Gerald data processing pipeline. The software infrastructure includes a distributed computing platform consisting of a LIMS called QUEST (http://bisr.osumc.edu), an Automation Server, a computer cluster for processing NGS pipelines, and a network attached storage device expandable up to 40TB. The system has been architected to scale to multiple sequencers without requiring additional computing or labor resources. This platform demonstrates how to manage and automate NGS experiments in an institutional or core facility setting.
PMCID: PMC3392054
PMID: 22779037
van Roon, Eddy H J | de Miranda, Noël F C C | van Nieuwenhuizen, Merlijn P | de Meijer, Emile J | van Puijenbroek, Marjo | Yan, Pearlly S | Huang, Tim H-M | van Wezel, Tom | Morreau, Hans | Boer, Judith M
DNA methylation is a hallmark in a subset of right-sided colorectal cancers. Methylation-based screening may improve prevention and survival rate for this type of cancer, which is often clinically asymptomatic in the early stages. We aimed to discover prognostic or diagnostic biomarkers for colon cancer by comparing DNA methylation profiles of right-sided colon tumours and paired normal colon mucosa using an 8.5 k CpG island microarray. We identified a diagnostic CpG-rich region, located in the first intron of the protein-tyrosine phosphatase gamma gene (PTPRG), with altered methylation already in the adenoma stage, that is, before the carcinoma transition. Validation of this region in an additional cohort of 103 sporadic colorectal tumours and 58 paired normal mucosa tissue samples showed 94% sensitivity and 96% specificity. Interestingly, comparable results were obtained when screening a cohort of Lynch syndrome-associated cancers. Functional studies showed that PTPRG intron 1 methylation did not directly affect PTPRG expression; however, the methylated region overlapped with a binding site of the insulator protein CTCF. Chromatin immunoprecipitation (ChIP) showed that methylation of the locus was associated with absence of CTCF binding. Methylation-associated changes in CTCF binding to PTPRG intron 1 could have implications for tumour gene expression through enhancer blocking, chromosome loop formation or abrogation of its insulator function. The high sensitivity and specificity of PTPRG intron 1 methylation in both sporadic and hereditary colon cancers support its biomarker potential for early detection of colon cancer.
doi:10.1038/ejhg.2010.187
PMCID: PMC3061992
PMID: 21150880
PTPRG; colorectal cancer; CTCF; DNA methylation; Lynch syndrome
Background
DNA methylation plays a very important role in the silencing of tumor suppressor genes in various tumor types. In order to gain a genome-wide understanding of how changes in methylation affect tumor growth, the differential methylation hybridization (DMH) protocol has been developed and large amounts of DMH microarray data have been generated. However, it is still unclear how to preprocess this type of microarray data and how the background correction and normalization methods used for two-color gene expression arrays perform on methylation microarray data. In this paper, we describe a set of internal control probes whose log ratios (M) are theoretically equal to zero under this DMH protocol. With the aid of this set of control probes, we propose two LOESS (or LOWESS, locally weighted scatter-plot smoothing) normalization methods that are novel and specific to DMH microarray data. Together with two other approaches (global LOESS and no normalization), this gives four normalization methods to compare. In addition, we compare five different background correction methods.
Results
We study 20 different preprocessing methods, which are the combinations of five background correction methods and four normalization methods. In order to compare these 20 methods, we evaluate their performance in identifying known methylated and unmethylated housekeeping genes based on two statistics. Comparison details are illustrated using breast cancer cell line and ovarian cancer patient methylation microarray data. Our comparison results show that the different background correction methods perform similarly; however, the four normalization methods perform very differently. In particular, all three LOESS normalization methods perform better than no normalization.
Conclusions
It is necessary to do within-array normalization, and the two LOESS normalization methods based on specific DMH internal control probes produce more stable and relatively better results than the global LOESS normalization method.
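The idea of normalizing against control probes whose log ratio M should be zero can be sketched as follows. This is a generic illustration, not the authors' exact procedure: it fits a locally weighted linear trend of M against average intensity A on the control probes only, then subtracts that trend from every probe. The function names, the tricube weighting, and the `span` parameter are assumptions made for the sketch.

```python
def local_linear_fit(x_ctrl, y_ctrl, x_new, span=0.5):
    """Locally weighted linear regression (LOESS-style) fitted on control probes.

    For each point in x_new, the nearest fraction `span` of control probes is
    weighted with a tricube kernel and a weighted straight line is fitted.
    """
    n = len(x_ctrl)
    k = max(2, int(round(span * n)))
    fitted = []
    for x0 in x_new:
        dists = sorted(abs(x - x0) for x in x_ctrl)
        # Bandwidth: distance to the k-th nearest control probe, slightly
        # inflated so that probes exactly at the boundary keep a nonzero weight.
        h = dists[k - 1] * 1.01 + 1e-12
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in x_ctrl]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, x_ctrl)) / sw
        my = sum(wi * y for wi, y in zip(w, y_ctrl)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, x_ctrl))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, x_ctrl, y_ctrl))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalize_with_controls(a_all, m_all, is_control, span=0.5):
    """Subtract the intensity-dependent bias estimated from control probes."""
    a_ctrl = [a for a, c in zip(a_all, is_control) if c]
    m_ctrl = [m for m, c in zip(m_all, is_control) if c]
    bias = local_linear_fit(a_ctrl, m_ctrl, a_all, span)
    return [m - b for m, b in zip(m_all, bias)]


# Synthetic example: control probes carry only a linear dye bias, 0.1 * A - 0.3.
a_values = [float(i) for i in range(10)] + [4.5]
m_values = [0.1 * a - 0.3 for a in a_values[:10]] + [0.1 * 4.5 - 0.3 + 2.0]
control_flags = [True] * 10 + [False]
normalized = normalize_with_controls(a_values, m_values, control_flags)
```

Because the trend is estimated only from probes expected to have M = 0, a genuinely methylated probe (the last one in the synthetic data) keeps its signal after the bias is removed, whereas a global fit over all probes could absorb part of it.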
doi:10.1186/1756-0381-4-13
PMCID: PMC3118966
PMID: 21575229
Shen, Changyu | Huang, Yiwen | Liu, Yunlong | Wang, Guohua | Zhao, Yuming | Wang, Zhiping | Teng, Mingxiang | Wang, Yadong | Flockhart, David A | Skaar, Todd C | Yan, Pearlly | Nephew, Kenneth P | Huang, Tim HM | Li, Lang
Background
Estrogens regulate diverse physiological processes in various tissues through genomic and non-genomic mechanisms that result in activation or repression of gene expression. Transcription regulation upon estrogen stimulation is a critical biological process underlying the onset and progression of the majority of breast cancers. Dynamic gene expression changes have been shown to characterize the breast cancer cell response to estrogens, the molecular mechanism of which is still not well understood.
Results
We developed a modulated empirical Bayes model, and constructed a novel topological and temporal transcription factor (TF) regulatory network in the MCF7 breast cancer cell line upon stimulation by 17β-estradiol. In the network, significant TF genomic hubs were identified, including ER-alpha and AP-1; significant non-genomic hubs include ZFP161, TFDP1, NRF1, TFAP2A, EGR1, E2F1, and PITX2. Although the early and late networks were distinct (<5% overlap of ERα target genes between the 4 and 24 h time points), all nine hubs were significantly represented in both networks. In MCF7 cells with acquired resistance to tamoxifen, the ERα regulatory network was unresponsive to 17β-estradiol stimulation. The significant loss of hormone responsiveness was associated with marked epigenomic changes, including hyper- or hypo-methylation of promoter CpG islands and repressive histone methylations.
Conclusions
We identified a number of estrogen-regulated target genes and established an estrogen-regulated network that distinguishes the genomic and non-genomic actions of estrogen receptor. Many gene targets of this network were no longer active in anti-estrogen resistant cell lines, possibly because their DNA methylation and histone acetylation patterns have changed.
doi:10.1186/1752-0509-5-67
PMCID: PMC3117732
PMID: 21554733
Background
DNA methylation has been shown to play an important role in the silencing of tumor suppressor genes in various tumor types. In order to have a system-wide understanding of the methylation changes that occur in tumors, we have developed a differential methylation hybridization (DMH) protocol that can simultaneously assay the methylation status of all known CpG islands (CGIs) using microarray technologies. A large percentage of signals obtained from microarrays can be attributed to various measurable and unmeasurable confounding factors unrelated to the biological question at hand. In order to correct the bias due to noise, we first implemented a quantile regression model, with a quantile level equal to 75%, to identify hypermethylated CGIs in an earlier work. As a proof of concept, we applied this model to methylation microarray data generated from breast cancer cell lines. However, we were unsure whether 75% was the best quantile level for identifying hypermethylated CGIs. In this paper, we attempt to determine which quantile level should be used to identify hypermethylated CGIs and their associated genes.
Results
We introduce three statistical measurements to compare the performance of the proposed quantile regression model at different quantile levels (95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%), using known methylated genes and unmethylated housekeeping genes reported in breast cancer cell lines and ovarian cancer patients. Our results show that the quantile levels ranging from 80% to 90% are better at identifying known methylated and unmethylated genes.
Conclusions
In this paper, we propose to use a quantile regression model to identify hypermethylated CGIs by incorporating probe effects to account for noise due to unmeasurable factors. Our model can efficiently identify hypermethylated CGIs in both breast and ovarian cancer data.
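To make the role of the quantile level concrete, the sketch below shows the check (pinball) loss that quantile regression minimizes, and verifies on toy data that the constant minimizing it is the corresponding empirical quantile. This is only the intercept-only special case; the paper's model additionally includes probe effects, which are not reproduced here. The function names and the toy signal values are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau: under-predictions cost tau per unit,
    over-predictions cost (1 - tau) per unit."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def total_loss(values, c, tau):
    """Total check loss when every value is predicted by the constant c."""
    return sum(pinball_loss(v - c, tau) for v in values)


def best_constant(values, tau):
    """Constant minimizing the check loss; a minimizer always exists among the data points."""
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


# Toy probe signals (illustrative only).
signals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Thresholds implied by the quantile levels compared in the abstract.
levels = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60]
thresholds = {tau: best_constant(signals, tau) for tau in levels}
```

Raising the quantile level moves the fitted threshold upward, so fewer CGIs exceed it; this is the trade-off behind comparing levels from 60% to 95% against known methylated and unmethylated genes.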
doi:10.1186/1471-2105-12-54
PMCID: PMC3051900
PMID: 21324121
Hsu, Chia-Chen | Leu, Yu-Wei | Tseng, Min-Jen | Lee, Kuan-Der | Kuo, Tzen-Yu | Yen, Jia-Yi | Lai, Yen-Ling | Hung, Yi-Chen | Sun, Wei-Sheng | Chen, Chien-Min | Chu, Pei-Yi | Yeh, Kun-Tu | Yan, Pearlly S | Chang, Yu-Sun | Huang, Tim H-M | Hsiao, Shu-Huei
Background
The Cdc42-interacting protein-4, Trip10 (also known as CIP4), is a multi-domain adaptor protein involved in diverse cellular processes, which functions in a tissue-specific and cell lineage-specific manner. We previously found that Trip10 is highly expressed in estrogen receptor-expressing (ER+) breast cancer cells. Estrogen receptor depletion reduced Trip10 expression by progressively increasing DNA methylation. We hypothesized that Trip10 functions as a tumor suppressor and may be involved in the malignancy of ER-negative (ER-) breast cancer. To test this hypothesis and evaluate whether Trip10 is epigenetically regulated by DNA methylation in other cancers, we evaluated DNA methylation of Trip10 in liver cancer, brain tumor, ovarian cancer, and breast cancer.
Methods
We applied methylation-specific polymerase chain reaction and bisulfite sequencing to determine the DNA methylation of Trip10 in various cancer cell lines and tumor specimens. We also overexpressed Trip10 to observe its effect on colony formation and in vivo tumorigenesis.
Results
We found that Trip10 is hypermethylated in brain tumor and breast cancer, but hypomethylated in liver cancer. Overexpressed Trip10 was associated with endogenous Cdc42 and huntingtin in IMR-32 brain tumor cells and CP70 ovarian cancer cells. However, overexpression of Trip10 promoted colony formation in IMR-32 cells and tumorigenesis in mice inoculated with IMR-32 cells, whereas overexpressed Trip10 substantially suppressed colony formation in CP70 cells and tumorigenesis in mice inoculated with CP70 cells.
Conclusions
Trip10 regulates cancer cell growth and death in a cancer type-specific manner. Differential DNA methylation of Trip10 can either promote cell survival or cell death in a cell type-dependent manner.
doi:10.1186/1423-0127-18-12
PMCID: PMC3044094
PMID: 21299869
Camerlengo, Terry | Ozer, Hatice Gulcin | Yan, Pearlly | Parvin, Jeffrey | Huang, Tim | Huang, Kun | Perez, Francisco | Teng, Mingxiang | Li, Lang | Liu, Yunlong | Kurc, Tahsin
Enabling data analysis in large data depositories for high throughput experimental data such as gene microarrays and ChIP-seq is challenging. In this paper, we discuss three methods for integrating QUEST, a data depository for epigenetic experiments, with a web-based data analysis platform GenePattern. These methods are universal and can serve as an exemplary implementation resolving the dilemma facing many similar database systems in integrating data analysis tools.
doi:10.1109/BIBM.2009.84
PMCID: PMC2998767
PMID: 21151835
high-throughput database; GenePattern; ChIP-seq
Motivation: Antibody-based Chromatin Immunoprecipitation assay followed by high-throughput sequencing technology (ChIP-seq) is a relatively new method to study the binding patterns of specific protein molecules over the entire genome. ChIP-seq technology allows scientists to obtain more comprehensive results in a shorter time. Here, we present a non-linear normalization algorithm and a mixture modeling method for comparing ChIP-seq data from multiple samples and characterizing genes based on their RNA polymerase II (Pol II) binding patterns.
Results: We apply a two-step non-linear normalization method based on a locally weighted regression (LOESS) approach to compare ChIP-seq data across multiple samples and model the difference using an Exponential-NormalK mixture model. The fitted model is used to identify genes associated with differential binding sites based on local false discovery rate (fdr). These genes are then standardized and hierarchically clustered to characterize their Pol II binding patterns. As a case study, we apply the analysis procedure to compare a normal breast cancer cell line (MCF7) with a tamoxifen-resistant (OHT) cell line. We find enriched regions that are associated with cancer (P < 0.0001). Our findings also imply that there may be a dysregulation of cell cycle and gene expression control pathways in the tamoxifen-resistant cells. These results show that the non-linear normalization method can be used to analyze ChIP-seq data across multiple samples.
Availability: Data are available at http://www.bmi.osu.edu/~khuang/Data/ChIP/RNAPII/
Contact: taslim.2@osu.edu; khuang@bmi.osu.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btp384
PMCID: PMC2800347
PMID: 19561022
Ghoshal, Kalpana | Motiwala, Tasneem | Claus, Rainer | Yan, Pearlly | Kutay, Huban | Datta, Jharna | Majumder, Sarmila | Bai, Shoumei | Majumder, Arnab | Huang, Tim | Plass, Christoph | Jacob, Samson T. | Hotchin, Neil A.
Background
A hallmark of cancer cells is hypermethylation of CpG islands (CGIs), which probably arises from upregulation of one or more DNA methyltransferases. The purpose of this study was to identify the targets of DNMT3B, an essential DNA methyltransferase in mammals, in colon cancer.
Methodology/Principal Findings
Chromatin immunoprecipitation with DNMT3B specific antibody followed by CGI microarray identified genes with or without CGIs, repeat elements and genomic contigs in RKO cells. ChIP-Chop analysis showed that the majority of the target genes including P16, DCC, DISC1, SLIT1, CAVEOLIN1, GNA11, TBX5, TBX18, HOXB13 and some histone variants, that harbor CGI in their promoters, were methylated in multiple colon cancer cell lines but not in normal colon epithelial cells. Further, these genes were reactivated in RKO cells after treatment with 5-aza-2′-deoxycytidine, a DNA hypomethylating agent. COBRA showed that the CGIs encompassing the promoter and/or coding region of DCC, TBX5, TBX18, SLIT1 were methylated in primary colorectal tumors but not in matching normal colon tissues whereas GNA11 was methylated in both. MassARRAY analysis demonstrated that the CGI located ∼4.5 kb upstream of HOXB13 +1 site was tumor-specifically hypermethylated in primary colorectal cancers and cancer cell lines. HOXB13 upstream CGI was partially hypomethylated in DNMT1−/− HCT cells but was almost methylation free in cells lacking both DNMT1 and DNMT3B. Analysis of tumor suppressor properties of two aberrantly methylated transcription factors, HOXB13 and TBX18, revealed that both inhibited growth and clonogenic survival of colon cancer cells in vitro, but only HOXB13 abolished tumor growth in nude mice.
Conclusions/Significance
This is the first report that identifies several important tumor suppressors and transcription factors as direct DNMT3B targets in colon cancer and as potential biomarkers for this cancer. Further, this study shows that methylation at an upstream CGI of HOXB13 is unique to colon cancer.
doi:10.1371/journal.pone.0010338
PMCID: PMC2861599
PMID: 20454457
The methylated DNA immunoprecipitation microarray (MeDIP-chip) is a genome-wide, high-resolution approach to detect DNA methylation across the whole genome or in CpG (cytosine base followed by a guanine base) islands. The method utilizes an anti-methylcytosine antibody to immunoprecipitate DNA that contains highly methylated CpG sites. Enriched methylated DNA can be interrogated using DNA microarrays or by massively parallel sequencing techniques. This combined approach allows researchers to rapidly identify methylated regions in a genome-wide manner, and to compare DNA methylation patterns between two samples with widely differing DNA methylation status. MeDIP-chip has been applied successfully to the analysis of methylated DNA in different targets, including animal and plant tissues (1, 2). Here we present a MeDIP-chip protocol that is routinely used in our laboratory, illustrated with specific examples from MeDIP-chip analysis of breast cancer cell lines. Potential technical pitfalls and solutions are also provided to serve as workflow guidelines.
doi:10.1007/978-1-60327-378-7_10
PMCID: PMC2845920
PMID: 19763503
DNA methylation; epigenetics; MeDIP-chip; microarray; cancer
Differential Methylation Hybridization (DMH) is a high-throughput DNA methylation screening tool that utilizes methylation-sensitive restriction enzymes to profile methylated fragments by hybridizing them to a CpG island microarray. This array contains probes spanning all the 27,800 islands annotated in the UCSC Genome Browser. Herein we describe a DMH protocol with clearly identified quality control points. In this manner, samples that are unlikely to provide good read-outs for differential methylation profiles between the test and the control samples will be identified and repeated with appropriate modifications. The step-by-step laboratory DMH protocol is described. In addition, we provide descriptions regarding DMH data analysis, including image quantification, background correction, and statistical procedures for both exploratory analysis and more formal inferences. Issues regarding quality control are addressed as well.
doi:10.1007/978-1-60327-192-9_9
PMCID: PMC2838393
PMID: 19488875
DNA methylation; Differential Methylation Hybridization (DMH); CpG islands (CGI); microarray
Chou, Jian-Liang | Su, Her-Young | Chen, Lin-Yu | Liao, Yu-Ping | Hartman-Frey, Corinna | Lai, Yi-Hui | Yang, Hui-Wen | Deatherage, Daniel E. | Kuo, Chieh-Ti | Huang, Yi-Wen | Yan, Pearlly S. | Hsiao, Shu-Huei | Tai, Chien-Kuo | Lin, Huey-Jen L. | Davuluri, Ramana V. | Chao, Tai-Kuang | Nephew, Kenneth P. | Huang, Tim H.-M. | Lai, Hung-Cheng | Chan, Michael W.Y.
Resistance to TGF-β is frequently observed in ovarian cancer, and disrupted TGF-β/SMAD4 signaling results in aberrant expression of downstream target genes in the disease. Our previous study showed that ADAM19, a SMAD4 target gene, is down-regulated through epigenetic mechanisms in ovarian cancer with aberrant TGF-β/SMAD4 signaling. In this study, we investigated the mechanism of down-regulation of FBXO32, another SMAD4 target gene, and the clinical significance of loss of FBXO32 expression in ovarian cancer. Expression of FBXO32 was observed in normal ovarian surface epithelium but not in ovarian cancer cell lines. FBXO32 methylation was seen in ovarian cancer cell lines displaying constitutive TGF-β/SMAD4 signaling, and epigenetic drug treatment restored FBXO32 expression in ovarian cancer cell lines regardless of FBXO32 methylation status, suggesting that epigenetic regulation of this gene in ovarian cancer may be a common event. In advanced stage ovarian tumors, significant (29.3%; P<0.05) methylation frequency of FBXO32 was observed and the association between FBXO32 methylation and shorter progression-free survival was significant, as determined by both Kaplan-Meier analysis (P<0.05) and multivariate Cox regression analysis (hazard ratio 1.003, P<0.05). Re-expression of FBXO32 markedly reduced proliferation of a platinum-resistant ovarian cancer line both in vitro and in vivo, due to increased apoptosis of the cells, and resensitized ovarian cancer cells to cisplatin. In conclusion, the novel tumor suppressor FBXO32 is epigenetically silenced in ovarian cancer cell lines with disrupted TGF-β/SMAD4 signaling and FBXO32 methylation status predicts survival in patients with ovarian cancer.
doi:10.1038/labinvest.2009.138
PMCID: PMC2829100
PMID: 20065949
Ovarian cancer; epigenetics; TGF-β; FBXO32
HUANG, YI-WEN | JANSEN, RACHEL A. | FABBRI, ENRICA | POTTER, DUSTIN | LIYANARACHCHI, SANDYA | CHAN, MICHAEL W.Y. | LIU, JOSEPH C. | CRIJNS, ANNE P.G. | BROWN, ROBERT | NEPHEW, KENNETH P. | VAN DER ZEE, ATE G.J. | COHN, DAVID E. | YAN, PEARLLY S. | HUANG, TIM H.-M. | LIN, HUEY-JEN L.
Ovarian cancer is the most lethal of the gynecologic neoplasms. To develop potential biomarkers for diagnosis, we used quantitative methylation-specific polymerase chain reaction to identify five novel genes (CYP39A1, GTF2A1, FOXD4L4, EBP, and HAAO) that are hypermethylated in ovarian tumors compared with non-malignant normal ovarian surface epithelia. Multivariate Cox regression analysis showed that hypermethylation of CYP39A1 correlated with an increased rate of relapse (P=0.032, hazard ratio >1). Concordant hypermethylation in at least three loci was observed in 50 out of 55 (91%) of the ovarian tumors examined. The test sensitivity and specificity were assessed to be 96 and 67% for CYP39A1; 95 and 88% for GTF2A1; 93 and 67% for FOXD4L4; 81 and 67% for EBP; and 89 and 82% for HAAO, respectively. Our data identify, for the first time, GTF2A1 alone, or GTF2A1 plus HAAO, as excellent candidate biomarkers for detecting this disease. Moreover, the known functions of these gene products further suggest that dysregulated transcriptional control, cholesterol metabolism, or synthesis of quinolinic acids may play important roles in contributing to ovarian neoplasms. Molecular therapies that reverse the aberrant epigenomes using inhibitory agents, or that abrogate the upstream signaling pathways conveying the epigenomic perturbations, may be developed into promising treatment regimens.
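The sensitivity and specificity figures above follow the standard definitions. As a minimal sketch, the calculation can be written as below; the per-marker counts in the usage lines are hypothetical, chosen only to illustrate the arithmetic, and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of tumor samples that the marker calls methylated."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of normal samples that the marker calls unmethylated."""
    return true_neg / (true_neg + false_pos)


def percent(fraction: float) -> int:
    """Round a fraction to a whole-number percentage."""
    return round(100 * fraction)


# Hypothetical counts for one marker (illustration only, not study data).
tumors_methylated, tumors_unmethylated = 48, 2
normals_unmethylated, normals_methylated = 4, 2

print(percent(sensitivity(tumors_methylated, tumors_unmethylated)))    # 96
print(percent(specificity(normals_unmethylated, normals_methylated)))  # 67

# The concordance figure quoted in the abstract: 50 of 55 tumors.
print(percent(50 / 55))  # 91
```

Reporting both numbers matters because a marker can reach high sensitivity simply by calling most samples methylated, which would show up as low specificity.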
PMCID: PMC2829240
PMID: 19724865
ovarian cancer; epigenetics; DNA methylation; biomarkers; quantitative methylation-specific polymerase chain reaction
Lin, Huey-Jen L. | Zuo, Tao | Lin, Ching-Hung | Kuo, Chieh Ti | Liyanarachchi, Sandya | Sun, Shuying | Shen, Rulong | Deatherage, Daniel E. | Potter, Dustin | Asamoto, Lisa | Lin, Shili | Yan, Pearlly S. | Cheng, Ann-Lii | Ostrowski, Michael C. | Huang, Tim H.-M.
The interplay between histone modifications and promoter hypermethylation provides a causative explanation for epigenetic gene silencing in cancer. Less is known about the upstream initiators that direct this process. Here, we report that the Cystatin M (CST6) tumor suppressor gene is concurrently down-regulated with other loci in breast epithelial cells co-cultured with cancer-associated fibroblasts (CAFs). Promoter hypermethylation of CST6 is associated with aberrant AKT1 activation in epithelial cells, as well as with disabling of the INPP4B regulator resulting from suppression by CAFs. Repressive chromatin, marked by trimethyl-H3K27 and dimethyl-H3K9, and de novo DNA methylation are established at the promoter. The findings suggest that microenvironmental stimuli are triggers in this epigenetic cascade, leading to the long-term silencing of CST6 in breast tumors. Our present findings implicate a causal mechanism defining how tumor stromal fibroblasts support neoplastic progression by manipulating the epigenome of mammary epithelial cells. The result also highlights the importance of direct cell-cell contact between epithelial cells and the surrounding fibroblasts in conferring this epigenetic perturbation. Since this two-way interaction is anticipated, the described co-culture system can be used to determine the effect of epithelial factors on fibroblasts in future studies.
doi:10.1158/0008-5472.CAN-08-0288
PMCID: PMC2821873
PMID: 19074894
Background
DNA methylation plays an important role in the process of tumorigenesis. Identifying differentially methylated genes or CpG islands (CGIs) associated with genes between two tumor subtypes is thus an important biological question. The methylation status of all CGIs in the whole genome can be assayed with differential methylation hybridization (DMH) microarrays. However, patient samples or cell lines are heterogeneous, so their methylation pattern may be very different. In addition, neighboring probes at each CGI are correlated. How these factors affect the analysis of DMH data is unknown.
Results
We propose a new method for identifying differentially methylated (DM) genes by identifying the associated DM CGI(s). At each CGI, we implement four different mixed-effect and generalized least squares models to identify DM genes between two groups. We compare these four models with a simple least squares regression model to study the impact of incorporating random effects and correlations.
Conclusions
We demonstrate that the inclusion (or exclusion) of random effects and the choice of correlation structures can significantly affect the results of the data analysis. We also assess the false discovery rate of different models using CGIs associated with housekeeping genes.
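To see why ignoring the correlation between neighboring probes can change the conclusions, consider the variance of a CGI-level mean. The sketch below assumes an exchangeable (compound-symmetry) correlation `rho` between every pair of probes, which is only one possible correlation structure and not necessarily one used by the authors; the function names and example values are illustrative.

```python
import math


def variance_of_mean(sigma2: float, n_probes: int, rho: float) -> float:
    """Variance of the mean of n equally correlated probes.

    Under exchangeable correlation rho, Var(mean) = sigma2/n * (1 + (n-1)*rho).
    With rho = 0 this reduces to the independent-probe value sigma2/n.
    """
    return sigma2 / n_probes * (1 + (n_probes - 1) * rho)


def z_statistic(mean_diff: float, sigma2: float, n_probes: int, rho: float) -> float:
    """Z statistic for a difference between two group means at one CGI,
    assuming both groups share sigma2, n_probes and rho (an assumption)."""
    se = math.sqrt(2 * variance_of_mean(sigma2, n_probes, rho))
    return mean_diff / se


# Illustrative values only: 8 probes, unit variance, mean difference of 1.0.
naive = z_statistic(1.0, 1.0, 8, rho=0.0)     # treats probes as independent
adjusted = z_statistic(1.0, 1.0, 8, rho=0.5)  # accounts for the correlation

print(round(naive, 2), round(adjusted, 2))  # 2.0 0.94
```

With these made-up numbers, the same observed difference looks nominally significant when probes are treated as independent and unremarkable once positive correlation is modeled, which is the kind of sensitivity to model choice the abstract describes.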
doi:10.1186/1471-2105-10-404
PMCID: PMC2800121
PMID: 20003206
Qin, Huaxia | Chan, Michael WY | Liyanarachchi, Sandya | Balch, Curtis | Potter, Dustin | Souriraj, Irene J | Cheng, Alfred SL | Agosto-Perez, Francisco J | Nikonova, Elena V | Yan, Pearlly S | Lin, Huey-Jen | Nephew, Kenneth P | Saltz, Joel H | Showe, Louise C | Huang, Tim HM | Davuluri, Ramana V
Background
The TGF-β/SMAD pathway is part of a broader signaling network in which crosstalk between pathways occurs. While the molecular mechanisms of the TGF-β/SMAD signaling pathway have been studied in detail, the global networks downstream of SMAD remain largely unknown. The regulatory effect of the SMAD complex likely depends on transcriptional modules, in which the SMAD binding elements and partner transcription factor binding sites (SMAD modules) are present in a specific context.
Results
To address this question and develop a computational model for SMAD modules, we simultaneously performed chromatin immunoprecipitation followed by microarray analysis (ChIP-chip) and mRNA expression profiling to identify TGF-β/SMAD regulated and synchronously coexpressed gene sets in ovarian surface epithelium. Intersecting the ChIP-chip and gene expression data yielded 150 direct targets, of which 141 were grouped into 3 co-expressed gene sets (sustained up-regulated, transient up-regulated and down-regulated), based on their temporal changes in expression after TGF-β activation. We developed a data-mining method driven by the Random Forest algorithm to model SMAD transcriptional modules in the target sequences. The predicted SMAD modules contain a SMAD binding element and up to 2 of 7 other transcription factor binding sites (E2F, P53, LEF1, ELK1, COUPTF, PAX4 and DR1).
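The intersection-and-grouping step described above can be sketched as follows. The gene identifiers, the two time points, and the fold-change threshold are all hypothetical placeholders; the abstract does not state the actual classification rule, so this is only one plausible way to assign the three temporal patterns.

```python
from collections import defaultdict

THRESHOLD = 1.0  # hypothetical log2 fold-change cutoff


def classify(early: float, late: float) -> str:
    """Assign a temporal pattern from early and late log2 fold changes."""
    if early >= THRESHOLD and late >= THRESHOLD:
        return "sustained up-regulated"
    if early >= THRESHOLD:
        return "transient up-regulated"
    if early <= -THRESHOLD or late <= -THRESHOLD:
        return "down-regulated"
    return "unclassified"


def group_direct_targets(chip_bound, expression):
    """Intersect ChIP-chip hits with expression-responsive genes, then group."""
    direct_targets = set(chip_bound) & set(expression)
    groups = defaultdict(list)
    for gene in sorted(direct_targets):
        early, late = expression[gene]
        groups[classify(early, late)].append(gene)
    return dict(groups)


# Placeholder data: gene -> (early, late) log2 fold change after TGF-beta.
chip_bound = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
expression = {
    "GENE_A": (1.8, 1.5),
    "GENE_B": (2.1, 0.2),
    "GENE_C": (-1.4, -1.9),
    "GENE_E": (1.2, 1.1),  # responsive but not bound, so not a direct target
}

print(group_direct_targets(chip_bound, expression))
```

Genes that fit none of the patterns fall into an `unclassified` bucket, analogous to the direct targets in the abstract that were not assigned to a co-expressed set.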
Conclusion
Together, the computational results further the understanding of the interactions between SMAD and other transcription factors at specific target promoters, and provide the basis for more targeted experimental verification of the co-regulatory modules.
doi:10.1186/1752-0509-3-73
PMCID: PMC2724489
PMID: 19615063
Several studies have reported that a high expression ratio of HOXB13 to IL17BR predicts tumor recurrence in node-negative, estrogen receptor (ER) α-positive breast cancer patients treated with tamoxifen. The molecular mechanisms underlying this dysregulation of gene expression remain to be explored. Our epigenetic analysis has found that increased promoter methylation of one of these genes, HOXB13, correlates with decreased expression of its transcript in breast cancer cell lines (P < 0.005). Transcriptional silencing of this gene can be reversed by a demethylation treatment. HOXB13 is suppressed by the activation of estrogen signaling in ERα-positive breast cancer cells. However, treatment with 4-hydroxytamoxifen (4-OHT), an antiestrogen, abrogates the ERα-mediated suppression in cancer cells. The observation that this transcriptional induction of HOXB13 occurs in vitro with simultaneous exposure to both estrogen and 4-OHT may provide a biological explanation for its aberrant expression in many node-negative patients undergoing tamoxifen therapy. Interestingly, promoter hypermethylation of HOXB13 is more frequently observed in ERα-positive patients with increased lymph node metastasis (P = 0.031) and large tumor size (>5 cm) (P = 0.008). In addition, this aberrant epigenetic event is associated with shorter disease-free survival (P = 0.029) in cancer patients. These results suggest that hypermethylation of HOXB13 is a late event in breast tumorigenesis and a poor prognostic indicator for node-positive cancer patients.
doi:10.1093/carcin/bgn115
PMCID: PMC2899848
PMID: 18499701