Genomics has substantially changed our approach to cancer research. Gene expression profiling, for example, has been utilized to delineate subtypes of cancer, and facilitated derivation of predictive and prognostic signatures. The emergence of technologies for the high resolution and genome-wide description of genetic and epigenetic features has enabled the identification of a multitude of causal DNA events in tumors. This has afforded the potential for large scale integration of genome and transcriptome data generated from a variety of technology platforms to acquire a better understanding of cancer.
Here we show how multi-dimensional genomics data analysis would enable the deciphering of mechanisms that disrupt regulatory/signaling cascades and downstream effects. Since not all gene expression changes observed in a tumor are causal to cancer development, we demonstrate an approach based on multiple concerted disruption (MCD) analysis of genes that facilitates the rational deduction of aberrant genes and pathways, which otherwise would be overlooked in single genomic dimension investigations.
Notably, this is the first comprehensive study of breast cancer cells by parallel integrative genome wide analyses of DNA copy number, LOH, and DNA methylation status to interpret changes in gene expression pattern. Our findings demonstrate the power of a multi-dimensional approach to elucidate events which would escape conventional single dimensional analysis and as such, reduce the cohort sample size for cancer gene discovery.
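The idea behind multiple concerted disruption (MCD) analysis can be illustrated with a minimal sketch. It assumes each gene already carries a yes/no alteration call per dimension; the gene names, the calls, the function name `mcd_genes`, and the `min_dna_hits` threshold are all hypothetical and are not taken from the study.

```python
from typing import Dict, List

# Hypothetical per-gene calls for one sample: True means "altered in this dimension".
# The dimension names follow the abstract; the values are invented for illustration.
calls: Dict[str, Dict[str, bool]] = {
    "GENE_A": {"copy_number": True, "loh": True, "methylation": False, "expression": True},
    "GENE_B": {"copy_number": False, "loh": False, "methylation": True, "expression": False},
    "GENE_C": {"copy_number": True, "loh": False, "methylation": True, "expression": True},
}

DNA_DIMENSIONS = ("copy_number", "loh", "methylation")


def mcd_genes(calls: Dict[str, Dict[str, bool]], min_dna_hits: int = 2) -> List[str]:
    """Return genes with at least `min_dna_hits` DNA-level alterations
    that also show a change in expression (assumed selection rule)."""
    selected = []
    for gene, dims in calls.items():
        dna_hits = sum(dims[d] for d in DNA_DIMENSIONS)
        if dna_hits >= min_dna_hits and dims["expression"]:
            selected.append(gene)
    return sorted(selected)


print(mcd_genes(calls))
```

Requiring an expression change alongside several DNA-level hits is one plausible way to separate candidate causal genes from passenger expression changes; the actual criteria used in the study may differ.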
High throughput microarray technologies have afforded the investigation of genomes, epigenomes, and transcriptomes at unprecedented resolution. However, software packages to handle, analyze, and visualize data from these multiple 'omics disciplines have not been adequately developed.
Here, we present SIGMA2, a system for the integrative genomic multi-dimensional analysis of cancer genomes, epigenomes, and transcriptomes. Multi-dimensional datasets can be simultaneously visualized and analyzed with respect to each dimension, allowing combinatorial integration of the different assays belonging to the different 'omics.
The identification of genes altered at multiple levels such as copy number, loss of heterozygosity (LOH), DNA methylation and the detection of consequential changes in gene expression can be concertedly performed, establishing SIGMA2 as a novel tool to facilitate the high throughput systems biology analysis of cancer.
Cancer is thought to be caused by a sequence of multiple genetic and epigenetic alterations which occur in one or more of the genes controlling cell cycle progression and signal transduction. The complexity of carcinogenic mechanisms leads to heterogeneity in molecular phenotype, pathology, and prognosis of cancers.
Genome-wide mutational analysis of cancer genes in individual tumors is the most direct way to elucidate the complex process of disease progression, although such high-throughput sequencing technologies are not yet fully developed. As a surrogate marker for pathway activation analysis, expression profiling using microarrays has been successfully applied for the classification of tumor types, stages of tumor progression, or in some cases, prediction of clinical outcomes. However, the biological implication of those gene expression signatures is often unclear.
Systems biological approaches leverage the signature genes as a representation of changes in signaling pathways, instead of interpreting the relevance between each gene and phenotype. This approach, which can be achieved by comparing the gene set or the expression profile with those of reference experiments in which a defined pathway is modulated, will improve our understanding of cancer classification, clinical outcome, and carcinogenesis. In this review, we will discuss recent studies on the development of expression signatures to monitor signaling pathway activities and how these signatures can be used to improve the identification of responders to anticancer drugs.
Expression signature; signaling pathway; drug discovery; cancer therapy; systems biology.
After completion of the human genome, genome-wide association studies were conducted to identify single nucleotide polymorphisms (SNPs) associated with cancer initiation and progression. Most of the studies identified SNPs that were located outside the coding region, and the odds ratios were too low to implement in clinical practice. Although the genome gives information about genome sequence and structure, the human epigenome provides functional aspects of the genome. Epigenome-wide association studies (EWAS) provide an opportunity to identify genome-wide epigenetic variants that are associated with cancer. However, there are problems and issues in implementing EWAS to establish an association between epigenetic profiles and cancer. Challenges include selection and handling of samples, choice of population and sample size, accurate measurement of exposure, integrating data, and insufficient information about the role of repeat sequences. The current status of EWAS, challenges in the field, and their potential solutions are discussed in this article.
Acetylation; biomarker; chromatin; environmental mutagens; epidemiology; epigenetics; histone acetyl transferase (HAT); histone; histone deacetylase (HDAC); histone code; imprinting; methylation; methyl transferase; mutagens; risk assessment; screening.
Genetic and epigenetic changes contribute to deregulation of gene expression and development of human cancer. Changes in DNA methylation are key epigenetic factors regulating gene expression and genomic stability. Recent progress in microarray technologies has resulted in the development of high resolution platforms for profiling of genetic, epigenetic and gene expression changes. Osteosarcoma (OS) is a pediatric bone tumor with a characteristically high level of numerical and structural chromosomal changes. Furthermore, little is known about DNA methylation changes in OS. Our objective was to develop an integrative approach for analysis of high-resolution epigenomic, genomic, and gene expression profiles in order to identify functional epi/genomic differences between OS cell lines and normal human osteoblasts. A combination of Affymetrix Promoter Tiling Arrays for DNA methylation, Agilent array-CGH platform for genomic imbalance and Affymetrix Gene 1.0 platform for gene expression analysis was used. As a result, an integrative high-resolution approach for interrogation of genome-wide tumour-specific changes in DNA methylation was developed. This approach was used to provide the first genomic DNA methylation maps, and to identify and validate genes with aberrant DNA methylation in OS cell lines. This first integrative analysis of global cancer-related changes in DNA methylation, genomic imbalance, and gene expression has provided comprehensive evidence of the cumulative roles of epigenetic and genetic mechanisms in deregulation of gene expression networks.
Epigenomes are comprised, in part, of all genome-wide chromatin modifications including DNA methylation and histone modifications. Unlike the genome, epigenomes are dynamic during development and differentiation in order to establish and maintain cell type-specific gene expression states that underlie cellular identity and function. Chromatin modifications are particularly labile, providing a mechanism for organisms to respond and adapt to environmental cues. Results from studies in animal models clearly demonstrate that epigenomic variability leads to phenotypic variability including susceptibility to disease that is not recognized at the DNA sequence level. Thus, capturing epigenomic information is invaluable for comprehensively understanding development, differentiation, and disease. Herein, we provide a brief overview of epigenetic processes, how they are relevant to human health, and review studies utilizing technologies that enable epigenome mapping. We conclude by describing feasible applications of epigenome mapping, focusing on epigenome-wide association studies (eGWAS), which have the potential to revolutionize current studies of human diseases and will likely promote the discovery of novel diagnostic, preventative, and treatment strategies.
Epigenetics; genetics; gene expression; gene regulation
Cancer evolution at all stages is driven by both epigenetic abnormalities as well as genetic alterations. Dysregulation of epigenetic control events may lead to abnormal patterns of DNA methylation and chromatin configurations, both of which are critical contributors to the pathogenesis of cancer. These epigenetic abnormalities are set and maintained by multiple protein complexes and the interplay between their individual components including DNA methylation machinery, histone modifiers, particularly, polycomb (PcG) proteins, and chromatin remodeling proteins. Recent advances in genome-wide technology have revealed that the involvement of these dysregulated epigenetic components appears to be extensive. Moreover, there is a growing connection between epigenetic abnormalities in cancer and concepts concerning stem-like cell subpopulations as a driving force for cancer. Emerging data suggest that aspects of the epigenetic landscape inherent to normal embryonic and adult stem/progenitor cells may help foster, under the stress of chronic inflammation or accumulating reactive oxygen species, evolution of malignant subpopulations. Finally, understanding molecular mechanisms involved in initiation and maintenance of epigenetic abnormalities in all types of cancer has great potential for translational purposes. This is already evident for epigenetic biomarker development, and for pharmacological targeting aimed at reversing cancer-specific epigenetic alterations.
cancer epigenetics; DNA methylation; polycomb proteins; cancer stem cells; biomarkers; epigenetic therapy
Papillary thyroid carcinoma (PTC) accounts for over 80% of all thyroid malignancies. The molecular pathogenesis remains incompletely clarified although activation of the RET fusion oncogenes, and RAS and BRAF oncogenes, has been well characterized. Novel technologies using genome-wide approaches to study tumor genomes and epigenomes have provided great insights into tumor development. Growing evidence shows that acquired epigenetic abnormalities participate with genetic alterations to cause altered patterns of gene expression/function. It has been established beyond doubt that promoter cytosine methylation in CpG islands, and the subsequent gene silencing, is intimately involved in cancer development. These epigenetic events very likely contribute to significant variation in gene expression profiling, phenotypic features, and biologic characteristics seen in PTC. Hypermethylation of promoter regions has also been analyzed in PTC, and most studies have focused on individual genes or a small cohort of genes implicated in tumorigenesis.
In high throughput cancer genomic studies, results from the analysis of single datasets often suffer from a lack of reproducibility because of small sample sizes. Integrative analysis can effectively pool and analyze multiple datasets and provides a cost effective way to improve reproducibility. In integrative analysis, simultaneously analyzing all genes profiled may incur high computational cost. A computationally affordable remedy is prescreening, which fits marginal models, can be conducted in a parallel manner, and has low computational cost.
An integrative prescreening approach is developed for the analysis of multiple cancer genomic datasets. Simulation shows that the proposed integrative prescreening has better performance than alternatives, particularly including prescreening with individual datasets, an intensity approach and meta-analysis. We also analyze multiple microarray gene profiling studies on liver and pancreatic cancers using the proposed approach.
The proposed integrative prescreening provides an effective way to reduce the dimensionality in cancer genomic studies. It can be coupled with existing analysis methods to identify cancer markers.
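A rough sketch of integrative prescreening follows. The abstract does not specify the marginal model or how evidence is pooled across datasets, so this example assumes a Welch t-statistic per gene per dataset and ranks genes by the sum of squared statistics. The function names, the toy data, and the `keep` parameter are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch t-statistic comparing two groups of expression values."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def integrative_prescreen(datasets, keep):
    """Rank genes by the sum of squared marginal t-statistics across datasets
    (an assumed pooling rule) and return the top `keep` genes."""
    scores = {}
    for data in datasets:
        for gene, (cases, controls) in data.items():
            # Each gene is scored on its own, so this loop could run in parallel.
            scores[gene] = scores.get(gene, 0.0) + welch_t(cases, controls) ** 2
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]


# Two invented datasets: gene -> (case values, control values).
dataset1 = {
    "g1": ([5.0, 5.2, 4.9], [1.0, 1.1, 0.9]),
    "g2": ([2.0, 2.1, 1.9], [2.0, 1.9, 2.1]),
    "g3": ([3.0, 3.4, 2.6], [2.5, 2.9, 2.1]),
}
dataset2 = {
    "g1": ([4.8, 5.1, 5.3], [1.2, 0.8, 1.0]),
    "g2": ([1.8, 2.0, 2.2], [2.2, 1.8, 2.0]),
    "g3": ([3.1, 2.7, 3.5], [2.4, 2.0, 2.8]),
}

print(integrative_prescreen([dataset1, dataset2], keep=2))
```

Genes that survive this cheap marginal step would then be passed to a more expensive joint analysis, which is where the dimensionality reduction pays off.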
For therapeutic purposes, non-small cell lung cancer (NSCLC) has traditionally been regarded as a single disease. However, recent evidence suggests that the two major subtypes of NSCLC, adenocarcinoma (AC) and squamous cell carcinoma (SqCC), respond differently to both molecular targeted and new generation chemotherapies. Therefore, identifying the molecular differences between these tumor types may impact novel treatment strategies. We performed the first large-scale analysis of 261 primary NSCLC tumors (169 AC and 92 SqCC), integrating genome-wide DNA copy number, methylation and gene expression profiles to identify subtype-specific molecular alterations relevant to new agent design and choice of therapy. Comparison of AC and SqCC genomic and epigenomic landscapes revealed 778 altered genes with corresponding expression changes that are selected during tumor development in a subtype-specific manner. Analysis of >200 additional NSCLCs confirmed that these genes are responsible for driving the differential development and resulting phenotypes of AC and SqCC. Importantly, we identified key oncogenic pathways disrupted in each subtype that likely serve as the basis for their differential tumor biology and clinical outcomes. Downregulation of HNF4α target genes was the most common pathway specific to AC, while SqCC demonstrated disruption of numerous histone modifying enzymes as well as the transcription factor E2F1. In silico screening of candidate therapeutic compounds using subtype-specific pathway components identified HDAC and PI3K inhibitors as potential treatments tailored to lung SqCC. Together, our findings suggest that AC and SqCC develop through distinct pathogenetic pathways that have significant implications for our approach to the clinical management of NSCLC.
Accumulating databases in human genome research have enabled integrated genome-wide studies of complicated diseases such as cancers. A practical approach is to mine a global transcriptome profile of a disease from public databases. New concepts of these diseases might emerge by landscaping this profile.
In this study, we clustered human colorectal normal mucosa (N), inflammatory bowel disease (IBD), adenoma (A) and cancer (T) related expressed sequence tags (ESTs) into UniGenes via an in-house GetUni software package and analyzed the transcriptome overview of these libraries by GOTree Machine (GOTM). Additionally, we downloaded UniGene based cDNA libraries of colon and analyzed them by Xprofiler to cross validate the efficiency of GetUni. Semi-quantitative RT-PCR was used to validate the expression of β-catenin and 7 novel genes in colorectal cancers.
The efficiency of GetUni was successfully validated by Xprofiler and RT-PCR. Genes in library N, IBD and A were all found in library T. A total of 14,879 genes were identified with 2,355 of them having at least 2 transcripts. Differences in gene enrichment among these libraries were statistically significant in 50 signal transduction pathways and Pfam protein domains by GOTM analysis (P < 0.01, hypergeometric test). Genes in two metabolic pathways, ribosome and glycolysis, were more enriched in the expression profiles of A and IBD than in N and T. Seven transmembrane receptor superfamily genes were typically abundant in cancers.
Colorectal cancers are genetically heterogeneous. Transcription variants are common in them. Aberrations of ribosome and glycolysis pathway might be early indicators of precursor lesions in colon cancers. The electronic gene expression profile could be used to highlight the integral molecular events in colorectal cancers.
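The hypergeometric enrichment test mentioned above can be sketched in a few lines. This is a generic right-tail test written with the standard library only; it is not the GOTM implementation, and the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, category: int, drawn: int, observed: int) -> float:
    """Probability of seeing at least `observed` category genes when `drawn`
    genes are sampled without replacement from `population` genes, of which
    `category` belong to the pathway or domain of interest."""
    upper = min(category, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(category, k) * comb(population - category, drawn - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Toy numbers: 20 genes overall, 5 in the pathway, 5 genes in the library,
# 3 of which fall in the pathway.
p_value = hypergeom_enrichment_p(population=20, category=5, drawn=5, observed=3)
print(p_value)
```

A pathway would be called enriched in a library when this tail probability falls below the chosen cutoff, such as the P < 0.01 used in the study.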
Covalent modification of DNA distinguishes cellular identities and is crucial for regulating the pluripotency and differentiation of embryonic stem (ES) cells. The recent demonstration that 5-methylcytosine (5-mC) may be further modified to 5-hydroxymethylcytosine (5-hmC) in ES cells has revealed a novel regulatory paradigm to modulate the epigenetic landscape of pluripotency. To understand the role of 5-hmC in the epigenomic landscape of pluripotent cells, here we profile the genome-wide 5-hmC distribution and correlate it with the genomic profiles of 11 diverse histone modifications and six transcription factors in human ES cells. By integrating genomic 5-hmC signals with maps of histone enrichment, we link particular pluripotency-associated chromatin contexts with 5-hmC. Intriguingly, through additional correlations with defined chromatin signatures at promoter and enhancer subtypes, we show distinct enrichment of 5-hmC at enhancers marked with H3K4me1 and H3K27ac. These results suggest potential role(s) for 5-hmC in the regulation of specific promoters and enhancers. In addition, our results provide a detailed epigenomic map of 5-hmC from which to pursue future functional studies on the diverse regulatory roles associated with 5-hmC.
Recent studies revealed the oxygenase-catalyzed production of 5-hydroxymethylcytosine (5-hmC) as a modification to mammalian DNA. 5-hmC is known to play important roles in self-renewal and cell lineage specification in embryonic stem (ES) cells, suggesting a potential role for 5-hmC–mediated epigenetic regulation in modulating the pluripotency of ES cells. To unveil this new regulatory paradigm in human ES cells, here we use a 5-hmC–specific chemical labeling approach to capture 5-hmC and profile its genome-wide distribution in human ES cells. We show that 5-hmC is an important epigenetic modification associated with the pluripotent state that could play role(s) in a subset of promoters and enhancers with defined chromatin signatures in ES cells.
New technologies are enabling the measurement of many types of genomic and epigenomic information at scales ranging from the atomic to nuclear. Much of this new data is increasingly structural in nature, and is often difficult to coordinate with other data sets. There is a legitimate need for integrating and visualizing these disparate data sets to reveal structural relationships not apparent when looking at these data in isolation.
We have applied object-oriented technology to develop a downloadable visualization tool, Genome3D, for integrating and displaying epigenomic data within a prescribed three-dimensional physical model of the human genome. In order to integrate and visualize large volumes of data, novel statistical and mathematical approaches have been developed to reduce the size of the data. To our knowledge, this is the first such tool developed that can visualize the human genome in three dimensions. We describe here the major features of Genome3D and discuss our multi-scale data framework using a representative basic physical model. We then demonstrate many of the issues and benefits of multi-resolution data integration.
Genome3D is a software visualization tool that explores a wide range of structural genomic and epigenetic data. Data from various sources of differing scales can be integrated within a hierarchical framework that is easily adapted to new developments concerning the structure of the physical genome. In addition, our tool has a simple annotation mechanism to incorporate non-structural information. Genome3D is unique in its ability to manipulate large amounts of multi-resolution data from diverse sources to uncover complex and new structural relationships within the genome.
The importance of genetic and epigenetic alterations may lie in their aggregate role in altering core pathways in tumorigenesis.
Merging genome-wide genomic and epigenomic alterations, we identify key genes and pathways altered in colorectal cancers (CRC). DNA Methylation analysis was tested for predicting survival in CRC patients using Cox proportional hazard model.
We identified 29 low frequency mutated genes that are also inactivated by epigenetic mechanisms in CRC. Pathway analysis showed the extracellular matrix (ECM) remodeling pathway is silenced in CRC. Six ECM pathway genes were tested for their prognostic potential in large CRC cohorts (n=777). DNA methylation of IGFBP3 and EVL predicted poor survival (IGFBP3: HR=2.58, 95%CI:1.37-4.87, p=0.004; EVL: HR=2.48, 95%CI:1.07-5.74, p=0.034) and simultaneous methylation of multiple genes predicted significantly worse survival (HR=8.61, 95%CI:2.16-34.36, p<0.001 for methylation of IGFBP3, EVL, CD109 and FLNC). DNA methylation of IGFBP3 and EVL was validated as a prognostic marker in an independent contemporary matched cohort (IGFBP3 HR=2.06, 95%CI:1.04-4.09, p=0.038; EVL HR=2.23, 95%CI:1.00-5.0, p=0.05) and EVL DNA methylation remained significant in a secondary historical validation cohort (HR=1.41, 95%CI:1.05-1.89, p=0.022). Moreover, DNA methylation of selected ECM genes helps to stratify the high-risk Stage 2 colon cancer patients who would benefit from adjuvant chemotherapy (HR: 5.85, 95%CI:2.03-16.83, p=0.001 for simultaneous methylation of IGFBP3, EVL and CD109).
CRCs with silenced ECM pathway components show worse survival, suggesting that our finding provides novel prognostic biomarkers for CRC and reflects the high importance of integrative analyses linking genetic and epigenetic abnormalities with pathway disruption in cancer.
DNA Methylation; Extracellular Matrix Pathway; Prognostic Biomarker; Colorectal cancer
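To show how a hazard ratio relates to its confidence interval, the sketch below uses the standard Wald relations for a Cox model: HR = exp(β) and 95% CI = exp(β ± 1.96·SE). It back-calculates the standard error from the IGFBP3 figures reported above (HR=2.58, 95%CI:1.37-4.87). The function name is hypothetical, and the normal-approximation p-value is only an approximation; the study may have used a different test, so small differences from the reported p-value are expected.

```python
import math


def wald_from_ci(hr: float, lower: float, upper: float, z_crit: float = 1.96):
    """Recover the log-scale standard error, z-score and two-sided p-value
    from a hazard ratio and its confidence interval (Wald approximation)."""
    log_se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / log_se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return log_se, z, p


log_se, z, p = wald_from_ci(hr=2.58, lower=1.37, upper=4.87)

# On the log scale the interval is symmetric, so the geometric mean of the
# bounds should sit close to the reported hazard ratio.
midpoint = math.sqrt(1.37 * 4.87)

print(round(log_se, 3), round(z, 2), round(midpoint, 2))
```

The same check can be applied to any of the HR and CI pairs listed above as a quick consistency test on reported survival statistics.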
With the advance of large-scale omics technologies, it is now feasible to reverse engineer the underlying genetic networks that describe the complex interplays of molecular elements that lead to complex diseases. Current networking approaches mainly focus on building genetic networks at large without probing the interaction mechanisms specific to a physiological or disease condition. The aim of this study was thus to develop such a novel networking approach based on the relevance concept, which is ideal to reveal integrative effects of multiple genes in the underlying genetic circuit for complex diseases.
The approach started with identification of multiple disease pathways, called a gene forest, in which the genes were extracted from the decision forest constructed by supervised learning of the genome-wide transcriptional profiles for patients and normal samples. Based on the newly identified disease mechanisms, a novel pair-wise relevance metric, adjusted frequency value, was used to define the degree of genetic relationship between two molecular determinants. We applied the proposed method to analyze a publicly available microarray dataset for colon cancer. The results demonstrated that the colon cancer-specific gene network captured the most important genetic interactions in several cellular processes, such as proliferation, apoptosis, differentiation, mitogenesis and immunity, which are known to be pivotal for tumourigenesis. Further analysis of the topological architecture of the network identified three known hub cancer genes [interleukin 8 (IL8) (p ≈ 0), desmin (DES) (p = 2.71 × 10⁻⁶) and enolase 1 (ENO1) (p = 4.19 × 10⁻⁵)], while two novel hub genes [RNA binding motif protein 9 (RBM9) (p = 1.50 × 10⁻⁴) and ribosomal protein L30 (RPL30) (p = 1.50 × 10⁻⁴)] may define new central elements in the gene network specific to colon cancer. Gene Ontology (GO) based analysis of the colon cancer-specific gene network and the sub-network that consisted of three-way gene interactions suggested that tumourigenesis in colon cancer resulted from dysfunction in protein biosynthesis and categories associated with ribonucleoprotein complex, which are well supported by multiple lines of experimental evidence.
This study demonstrated that IL8, DES and ENO1 act as the central elements in colon cancer susceptibility, and protein biosynthesis and the ribosome-associated function categories largely account for colon cancer tumourigenesis. Thus, the newly developed relevancy-based networking approach offers a powerful means to reverse-engineer the disease-specific network, a promising tool for systematic dissection of complex diseases.
The complexity of the mammalian genome is regulated by heritable epigenetic mechanisms, which provide the basis for differentiation, development and cellular homeostasis. These mechanisms act on the level of chromatin, by modifying DNA, histone proteins and nucleosome density/composition. During the last decade it became clear that cancer is defined by a variety of epigenetic changes, which occur in early stages of disease and parallel genetic mutations. With the advent of new technologies we are just starting to unravel the cancer epigenome and latest mechanistic findings provide the first clue as to how altered epigenetic patterns might occur in different cancers. Here we review latest findings on chromatin related mechanisms and hypothesize how their impairment might contribute to the altered epigenome of cancer cells.
► Genome-wide analyses reveal epigenomic differences in functional regions. ► Epigenetic patterns occur in large domains in the genome. ► Epigenetic mechanisms are interrelated. ► Mutations of epigenetic enzymes are frequently associated with cancer. ► Changes in nuclear architecture are related to epigenomic alterations in cancer.
Cancer epigenetics; Chromatin; Nuclear architecture; DNA methylation; Histone modification; lncRNA
Gene expression profiling using microarray technologies provides a powerful approach to understand complex biological systems and the pathogenesis of diseases. In the field of liver cancer research, a number of genome-wide profiling studies have been published. These studies have provided gene sets, that is, signatures, which can classify tumors and predict clinical outcomes such as survival, recurrence, and metastasis. More recently, the application of genomic profiling has been extended to identify molecular targets, pathways, and the cellular origins of the tumors. Systemic and integrative analyses of multiple data sets and emerging new technologies also accelerate the progress of cancer genomic studies. Here, we review the genomic signatures identified from the genomic profiling studies of hepatocellular carcinoma (HCC), and categorize and characterize them into prediction, phenotype, function, and molecular target signatures according to their utilities and properties. Our classification of the signatures should help in understanding and designing studies with extended application of genomic profiles.
signature; microarray; integrative analysis; hepatocellular carcinoma
Recent epigenomic studies have identified significant differences between developmental stages and cell types. While the importance of epigenetic regulation has been increasingly recognized, it remains unclear how the global epigenetic patterns are established and maintained. Here I review a number of recent studies with the emphasis on the role of the genomic sequence in shaping the epigenetic landscape. These studies strongly suggest that the sequence information is important not just for controlling target specificity but for orchestrating the diversity of epigenetic patterns among different cell types. The epigenome is maintained by the complex network of a large number of interactions. Integrative approaches are needed to gain insights into these networks.
Renal cell carcinoma (RCC) is characterized by a number of diverse molecular aberrations that differ among individuals. Recent approaches to molecularly classify RCC were based on clinical, pathological as well as on single molecular parameters. As a consequence, gene expression patterns reflecting the sum of genetic aberrations in individual tumors may not have been recognized. In an attempt to uncover such molecular features in RCC, we used a novel, unbiased and integrative approach.
We integrated gene expression data from 97 primary RCC of different pathologic parameters, 15 RCC metastases as well as 34 cancer cell lines for two-way nonsupervised hierarchical clustering using gene groups suggested by the PANTHER Classification System. We depicted the genomic landscape of the resulting tumor groups by means of Single Nucleotide Polymorphism (SNP) technology. Finally, the achieved results were immunohistochemically analyzed using a tissue microarray (TMA) composed of 254 RCC.
We found robust, genome wide expression signatures, which split RCC into three distinct molecular subgroups. These groups remained stable even if randomly selected gene sets were clustered. Notably, the pattern obtained from RCC cell lines was clearly distinguishable from that of primary tumors. SNP array analysis demonstrated differing frequencies of chromosomal copy number alterations among RCC subgroups. TMA analysis with group-specific markers showed a prognostic significance of the different groups.
We propose the existence of characteristic and histologically independent genome-wide expression outputs in RCC with potential biological and clinical relevance.
DNA-microarray; SNP-array; RCC subgroups; Tissue microarray; Outcome
Genome-wide techniques such as microarray analysis, Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS), linkage analysis and association studies are used extensively in the search for genes that cause diseases, and often identify many hundreds of candidate disease genes. Selection of the most probable of these candidate disease genes for further empirical analysis is a significant challenge. Additionally, identifying the genes that cause complex diseases is problematic due to low penetrance of multiple contributing genes. Here, we describe a novel bioinformatic approach that selects candidate disease genes according to their expression profiles. We use the eVOC anatomical ontology to integrate text-mining of biomedical literature and data-mining of available human gene expression data. To demonstrate that our method is successful and widely applicable, we apply it to a database of 417 candidate genes containing 17 known disease genes. We successfully select the known disease gene for 15 out of 17 diseases and reduce the candidate gene set to 63.3% (±18.8%) of its original size. This approach facilitates direct association between genomic data describing gene expression and information from biomedical texts describing disease phenotype, and successfully prioritizes candidate genes according to their expression in disease-affected tissues.
Follicular lymphoma (FL) is a form of non-Hodgkin's lymphoma (NHL) that arises from germinal center (GC) B-cells. Despite the significant advances in immunotherapy, FL is still not curable. Beyond transcriptional profiling and genomics datasets, there currently is no epigenome-scale dataset or integrative biology approach that can adequately model this disease and therefore identify novel mechanisms and targets for successful prevention and treatment of FL.
We performed methylation-enriched genome-wide bisulfite sequencing of FL cells and normal CD19+ B-cells using 454 sequencing technology. The methylated DNA fragments were enriched with methyl-binding proteins, treated with bisulfite, and sequenced using the Roche-454 GS FLX sequencer. The total number of bases covered in the human genome was 18.2 and 49.3 million including 726,003 and 1.3 million CpGs in FL and CD19+ B-cells, respectively. 11,971 and 7,882 methylated regions of interest (MRIs) were identified respectively. The genome-wide distribution of these MRIs displayed significant differences between FL and normal B-cells. A reverse trend in the distribution of MRIs between the promoter and the gene body was observed in FL and CD19+ B-cells. The MRIs identified in FL cells also correlated well with transcriptomic data and ChIP-on-Chip analyses of genome-wide histone modifications such as tri-methyl-H3K27, and tri-methyl-H3K4, indicating a concerted epigenetic alteration in FL cells.
This study is the first to provide a large scale and comprehensive analysis of the DNA methylation sequence composition and distribution in the FL epigenome. These integrated approaches have led to the discovery of novel and frequent targets of aberrant epigenetic alterations. The genome-wide bisulfite sequencing approach developed here can be a useful tool for profiling DNA methylation in clinical samples.
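The principle behind reading methylation from bisulfite sequencing can be sketched as follows. Bisulfite treatment converts unmethylated cytosines to uracil, which is read as T after amplification, while methylated cytosines remain C. The example assumes an ungapped, forward-strand alignment of equal length; the function name and both sequences are hypothetical, and a real pipeline would also handle the reverse strand, mismatches, and incomplete conversion.

```python
from typing import Dict


def call_cpg_methylation(reference: str, read: str) -> Dict[int, str]:
    """Classify each reference CpG cytosine covered by an ungapped,
    forward-strand bisulfite read of the same length as the reference."""
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            base = read[i]
            if base == "C":
                calls[i] = "methylated"      # protected from conversion
            elif base == "T":
                calls[i] = "unmethylated"    # converted C -> U, read as T
            else:
                calls[i] = "ambiguous"       # sequencing error or variant
    return calls


# Invented example: CpG cytosines sit at positions 1, 5 and 9 of the reference.
reference = "ACGTTCGACCGA"
read = "ACGTTTGATCGA"
calls = call_cpg_methylation(reference, read)
print(calls)
```

Aggregating such per-read calls over all reads covering a position gives a methylation level per CpG, which can then be grouped into regions of interest.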
Genome-wide gene expression profiling using deep sequencing technologies can drive the discovery of cancer biomarkers and therapeutic targets. Such efforts are often limited to profiling the expression signature of either mRNA or microRNA (miRNA) in a single type of cancer.
Here we provided an integrated analysis of the genome-wide mRNA and miRNA expression profiles of three different genitourinary cancers: carcinomas of the bladder, kidney and testis.
Our results highlight the general or cancer-specific roles of several genes and miRNAs that may serve as candidate oncogenes or suppressors of tumor development. Further comparative analyses at the systems level revealed that significant aberrations of the cell adhesion process, p53 signaling, calcium signaling, the ECM-receptor and cell cycle pathways, the DNA repair and replication processes and the immune and inflammatory response processes were the common hallmarks of human cancers. Gene sets showing testicular cancer-specific deregulation patterns were mainly implicated in processes related to male reproductive function, and general disruptions of multiple metabolic pathways and processes related to cell migration were the characteristic molecular events for renal and bladder cancer, respectively. Furthermore, we also demonstrated that tumors with the same histological origins and genes with similar functions tended to group together in a clustering analysis. By assessing the correlation between the expression of each miRNA and its targets, we determined that deregulation of ‘key’ miRNAs may result in the global aberration of one or more pathways or processes as a whole.
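The miRNA-target correlation step can be sketched as follows. The abstract does not state which correlation measure or cutoff was used, so this example assumes Pearson correlation and an arbitrary threshold of -0.8 for flagging anti-correlated targets; the expression values, target names, and the function `anticorrelated_targets` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def anticorrelated_targets(mirna_expr, target_expr, threshold=-0.8):
    """Return targets whose expression correlates with the miRNA at or below threshold."""
    return sorted(
        gene for gene, values in target_expr.items()
        if pearson(mirna_expr, values) <= threshold
    )


# Hypothetical expression values across five samples.
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
targets = {
    "TARGET_X": [10.0, 8.0, 6.0, 4.0, 2.0],  # falls as the miRNA rises
    "TARGET_Y": [1.0, 2.0, 3.0, 4.0, 5.0],   # rises with the miRNA
    "TARGET_Z": [3.0, 1.0, 4.0, 1.0, 5.0],   # weak positive relationship
}
hits = anticorrelated_targets(mirna, targets)
```

Only TARGET_X is flagged in this toy data. A miRNA with many such anti-correlated targets in one pathway would be a candidate 'key' miRNA in the sense used above, although the study's own criteria may differ.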
This systematic analysis deciphered the molecular phenotypes of three genitourinary cancers and investigated their variations at the miRNA level simultaneously. Our results provided a valuable source for future studies and highlighted some promising genes, miRNAs, pathways and processes that may be useful for diagnostic or therapeutic applications.
The recently developed RNA interference (RNAi) technology has created an unprecedented opportunity to interrogate the function of individual genes in whole organisms or cell lines at genome-wide scale. However, multiple issues, such as off-target effects or low efficacy in knocking down certain genes, make RNAi screening results noisy, with potentially high rates of both false positives and false negatives. Integrating RNAi screening results with other information, such as protein-protein interaction (PPI) data, may therefore help to address these issues.
By analyzing 24 genome-wide RNAi screens interrogating various biological processes in Drosophila, we found that RNAi positive hits were significantly more connected to each other when analyzed within a protein-protein interaction network, as opposed to random cases, for nearly all screens. Based on this finding, we developed a network-based approach to identify false positives (FPs) and false negatives (FNs) in these screening results. This approach relied on a scoring function, which we termed NePhe, to integrate information obtained from both PPI network and RNAi screening results. Using a novel rank-based test, we compared the performance of different NePhe scoring functions and found that diffusion kernel-based methods generally outperformed others, such as direct neighbor-based methods. Using two genome-wide RNAi screens as examples, we validated our approach extensively from multiple aspects. We prioritized hits in the original screens that were more likely to be reproduced by the validation screen and recovered potential FNs whose involvements in the biological process were suggested by previous knowledge and mutant phenotypes. Finally, we demonstrated that the NePhe scoring system helped to biologically interpret RNAi results at the module level.
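To make the contrast between direct neighbor-based and diffusion kernel-based scoring concrete, here is a minimal sketch on a toy network. The abstract does not give the NePhe formulas, so the sketch assumes two common forms: a direct-neighbor score that sums the phenotype values of a gene's immediate interactors, and a diffusion score computed as K·p with K = exp(-βL), where L is the graph Laplacian and p the phenotype vector. The network, the value β = 0.5, and all function names are hypothetical.

```python
def laplacian(nodes, edges):
    """Build the graph Laplacian L = D - A for an undirected network."""
    idx = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    lap = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        i, j = idx[a], idx[b]
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
        lap[i][i] += 1.0
        lap[j][j] += 1.0
    return lap


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(lap, beta=0.5, terms=40):
    """Approximate K = exp(-beta * L) with a truncated Taylor series."""
    n = len(lap)
    step = [[-beta * v for v in row] for row in lap]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, step)]
        kernel = [[kernel[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return kernel


def direct_neighbor_scores(nodes, edges, phenotype):
    """Score each gene by the summed phenotype of its immediate interactors."""
    scores = {name: 0.0 for name in nodes}
    for a, b in edges:
        scores[a] += phenotype[b]
        scores[b] += phenotype[a]
    return scores


def diffusion_scores(nodes, edges, phenotype, beta=0.5):
    """Score each gene by phenotype signal diffused over the whole network."""
    kernel = diffusion_kernel(laplacian(nodes, edges), beta)
    return {
        name: sum(kernel[i][j] * phenotype[other] for j, other in enumerate(nodes))
        for i, name in enumerate(nodes)
    }


# Hypothetical PPI network: A and B are screen hits (phenotype 1), others are not.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
phenotype = {"A": 1.0, "B": 1.0, "C": 0.0, "D": 0.0, "E": 0.0}

direct = direct_neighbor_scores(nodes, edges, phenotype)
diffusion = diffusion_scores(nodes, edges, phenotype)
```

In this toy network the direct-neighbor score gives D a value of zero because neither of its interactors is a hit, whereas the diffusion score assigns D and E small positive values that decrease with network distance from the hits. This illustrates one way a kernel-based score can use information beyond immediate neighbors; it does not reproduce the paper's comparison.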
By comprehensively analyzing multiple genome-wide RNAi screens, we conclude that network information can be effectively integrated with RNAi results to suggest likely FPs and FNs and to bring biological insight to the screening results.
Lung cancer has become a global public health burden, further substantiating the need for early diagnosis and more effective targeted therapies. The key to accomplishing both these goals is a better understanding of the genes and pathways disrupted during the initiation and progression of this disease. Gene promoter hypermethylation is an epigenetic modification of DNA at promoter CpG islands that, together with changes in histone structure, culminates in loss of transcription. The fact that gene promoter hypermethylation is a major mechanism for silencing genes in lung cancer has stimulated the development of screening approaches to identify additional genes and pathways that are disrupted within the epigenome. These approaches include restriction landmark scanning, methylation CpG island amplification coupled with representational difference analysis, and transcriptome-wide screening. Genes identified by these approaches, their function, and their prevalence in lung cancer are described. Recently, we used global screening approaches to interrogate 43 genes in and around the candidate lung cancer susceptibility locus, 6q23–25. Five genes, TCF21, SYNE1, AKAP12, IL20RA, and ACAT2, were methylated at 14 to 81% prevalence, but methylation was not associated with age at diagnosis or stage of lung cancer. These candidate tumor suppressor genes likely play key roles in contributing to sporadic lung cancer. The realization that methylation is a dominant mechanism in lung cancer etiology, and that it can be reversed by pharmacologic agents, has led to the initiation of translational studies to develop biomarkers in sputum for early detection and to test demethylating agents and histone deacetylase inhibitors for the treatment of lung cancer.
gene promoter hypermethylation; lung cancer; chromosome 6; epigenetics
Epigenetic states are governed by DNA methylation and a host of modifications to histones bound to DNA. These states are essential for proper developmentally regulated gene expression and are perturbed in many diseases. There is great interest in identifying epigenetic mark placement genome-wide and understanding how these marks vary among cell types, with changes in environment, or according to health and disease status. Current epigenomic analyses employ bisulfite sequencing and chromatin immunoprecipitation, but query only one type of epigenetic mark at a time (DNA methylation or histone modifications) and often require substantial input material. To overcome these limitations, we established a method using nanofluidics and multi-color fluorescence microscopy to detect DNA and histones in individual chromatin fragments at about 10 Mbp/min. We demonstrated its utility for epigenetic analysis by identifying DNA methylation on individual molecules. This technique will provide an unprecedented opportunity for genome-wide, simultaneous analysis of multiple epigenetic states on single molecules using femtogram quantities of material.
Single-molecule; chromatin; epigenetics; epigenomics; DNA methylation; nanofluidics; laser-induced fluorescence; methyl binding domain protein; green fluorescent protein; HeLa cell