2.  The Dynamics of DNA Methylation Covariation Patterns in Carcinogenesis 
PLoS Computational Biology  2014;10(7):e1003709.
Recently it has been observed that cancer tissue is characterised by an increased variability in DNA methylation patterns. However, how the correlative patterns in genome-wide DNA methylation change during the carcinogenic progress has not yet been explored. Here we study genome-wide inter-CpG correlations in DNA methylation, in addition to single site variability, during cervical carcinogenesis. We demonstrate how the study of changes in DNA methylation covariation patterns across normal, intra-epithelial neoplasia and invasive cancer allows the identification of CpG sites that indicate the risk of neoplastic transformation in stages prior to neoplasia. Importantly, we show that the covariation in DNA methylation at these risk CpG loci is maximal immediately prior to the onset of cancer, supporting the view that high epigenetic diversity in normal cells increases the risk of cancer. Consistent with this, we observe that invasive cancers exhibit increased covariation in DNA methylation at the risk CpG sites relative to normal tissue, but lower levels relative to pre-cancerous lesions. We further show that the identified risk CpG sites undergo preferential DNA methylation changes in relation to human papilloma virus infection and age. Results are validated in independent data including prospectively collected samples prior to neoplastic transformation. Our data are consistent with a phase transition model of carcinogenesis, in which epigenetic diversity is maximal prior to the onset of cancer. The model and algorithm proposed here may allow, in future, network biomarkers predicting the risk of neoplastic transformation to be identified.
Author Summary
DNA methylation is a covalent modification of DNA which can regulate how active genes are. DNA methylation is altered at many genomic loci in cancer cells, leading to widespread functional disruption. Importantly, DNA methylation alterations across the genome are seen even in early carcinogenesis. Although the pattern of DNA methylation change during carcinogenesis has been studied at individual genomic loci, no study has yet analysed how these patterns change at a systems level, specifically how DNA methylation patterns at pairs of genomic sites change during disease progression. Doing so can shed light on how the epigenetic diversity of cell populations changes during the carcinogenic process. This study performs a systems-level analysis of the dynamic changes in DNA methylation correlation pattern during cervical carcinogenesis, demonstrating that epigenetic diversity is maximal just prior to the onset of cancer. Importantly, this supports the view that the risk of cancer development is closely related to an increase in epigenetic diversity in apparently healthy cells. In addition, the study provides a computational algorithm which successfully identifies the altered genomic sites conferring the risk of cervical cancer.
PMCID: PMC4091688  PMID: 25010556
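The idea of inter-CpG covariation described in the abstract above can be illustrated with a small sketch. This is not the authors' algorithm; it only assumes that covariation within one tissue group is summarised as the mean absolute Pearson correlation over all pairs of CpG sites. The function names and the CpG identifiers (cg1, cg2, cg3) are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def mean_abs_covariation(beta_by_cpg):
    """Mean |correlation| over all CpG pairs.

    beta_by_cpg maps a CpG identifier (hypothetical names here) to its
    methylation beta-values across the samples of one tissue group.
    """
    pairs = list(combinations(sorted(beta_by_cpg), 2))
    if not pairs:
        return 0.0
    total = sum(abs(pearson(beta_by_cpg[a], beta_by_cpg[b])) for a, b in pairs)
    return total / len(pairs)
```

Computing this summary separately for normal, pre-cancerous and cancer samples would give one number per stage that can be compared across the progression.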
3.  A BRCA1-mutation associated DNA methylation signature in blood cells predicts sporadic breast cancer incidence and survival 
Genome Medicine  2014;6(6):47.
BRCA1 mutation carriers have an 85% risk of developing breast cancer but the risk of developing non-hereditary breast cancer is difficult to assess. Our objective is to test whether a DNA methylation (DNAme) signature derived from BRCA1 mutation carriers is able to predict non-hereditary breast cancer.
In a case/control setting (72 BRCA1 mutation carriers and 72 BRCA1/2 wild type controls) blood cell DNA samples were profiled on the Illumina 27 k methylation array. Using the Elastic Net classification algorithm, a BRCA1-mutation DNAme signature was derived and tested in two cohorts: (1) The NSHD (19 breast cancers developed within 12 years after sample donation and 77 controls) and (2) the UKCTOCS trial (119 oestrogen receptor positive breast cancers developed within 5 years after sample donation and 122 controls).
We found that our blood-based BRCA1-mutation DNAme signature applied to blood cell DNA from women in the NSHD resulted in a receiver operating characteristics (ROC) area under the curve (AUC) of 0.65 (95% CI 0.51 to 0.78, P = 0.02) which did not validate in buccal cells from the same individuals. Applying the signature in blood DNA from UKCTOCS volunteers resulted in AUC of 0.57 (95% CI 0.50 to 0.64; P = 0.03) and is independent of family history or any other known risk factors. Importantly, the BRCA1-mutation DNAme signature was able to predict breast cancer mortality (AUC = 0.67; 95% CI 0.51 to 0.83; P = 0.02). We also found that the 1,074 CpGs which are hypermethylated in BRCA1 mutation carriers are significantly enriched for stem cell polycomb group target genes (P < 10^-20).
A DNAme signature derived from BRCA1 carriers is able to predict breast cancer risk and death years in advance of diagnosis. Future studies may need to focus on DNAme profiles in epithelial cells in order to reach the AUC thresholds required of preventative measures or early detection strategies.
PMCID: PMC4110671  PMID: 25067956
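The AUC values quoted above have a direct interpretation: the probability that a randomly chosen case receives a higher signature score than a randomly chosen control. A minimal sketch of that calculation follows. It is a generic rank-based AUC, not the study's code, and the function name and inputs are illustrative assumptions.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case
    scores higher; ties contribute one half.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to a score that does not separate cases from controls, which is why values such as 0.57 and 0.65 indicate a modest signal.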
4.  Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis 
As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance.
PMCID: PMC4100184  PMID: 24937687
non-Gaussian statistical models; dimension reduction; unsupervised learning; feature selection; DNA methylation analysis
5.  Genome-wide age-related changes in DNA methylation and gene expression in human PBMCs 
Age  2014;36(3):9648.
Aging is a progressive process that results in the accumulation of intra- and extracellular alterations that in turn contribute to a reduction in health. Age-related changes in DNA methylation have been reported before and may be responsible for aging-induced changes in gene expression, although a causal relationship has yet to be shown. Using genome-wide assays, we analyzed age-induced changes in DNA methylation and their effect on gene expression with and without transient induction with the synthetic transcription modulating agent WY14,643. To demonstrate feasibility of the approach, we isolated peripheral blood mononucleated cells (PBMCs) from five young and five old healthy male volunteers and cultured them with or without WY14,643. Infinium 450K BeadChip and Affymetrix Human Gene 1.1 ST expression array analysis revealed significant differential methylation of at least 5 % (ΔYO > 5 %) at 10,625 CpG sites between young and old subjects, but only a subset of the associated genes were also differentially expressed. Age-related differential methylation of previously reported epigenetic biomarkers of aging including ELOVL2, FHL2, PENK, and KLF14 was confirmed in our study, but these genes did not display an age-related change in gene expression in PBMCs. Bioinformatic analysis revealed that differentially methylated genes that lack an age-related expression change predominantly represent genes involved in carcinogenesis and developmental processes, and expression of most of these genes was silenced in PBMCs. No changes in DNA methylation were found in genes displaying transiently induced changes in gene expression. In conclusion, aging-induced differential methylation often targets developmental genes and occurs mostly without change in gene expression.
Electronic supplementary material
The online version of this article (doi:10.1007/s11357-014-9648-x) contains supplementary material, which is available to authorized users.
PMCID: PMC4082572  PMID: 24789080
Molecular aging; Epigenetics; DNA methylation; Gene expression; PBMCs; Epigenetic biomarkers of aging
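The ΔYO > 5 % criterion mentioned above can be sketched as an effect-size filter on mean beta-values. This sketch assumes ΔYO is the difference between the mean methylation of old and young subjects at a site, and it omits the significance testing the abstract also refers to. The function names and CpG identifiers are hypothetical.

```python
def delta_yo(young, old):
    """Mean beta-value in old subjects minus mean beta-value in young subjects."""
    return sum(old) / len(old) - sum(young) / len(young)


def sites_exceeding(betas_young, betas_old, threshold=0.05):
    """Return CpG identifiers whose |delta| exceeds the threshold.

    Both arguments map a CpG identifier to a list of beta-values (0 to 1),
    one per subject. Only the effect size is checked; a real analysis
    would add a statistical test and multiple-testing correction.
    """
    return sorted(
        cpg
        for cpg in betas_young
        if cpg in betas_old
        and abs(delta_yo(betas_young[cpg], betas_old[cpg])) > threshold
    )
```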
6.  Genomic architecture characterizes tumor progression paths and fate in breast cancer patients 
Science translational medicine  2010;2(38):38ra47.
Distinct molecular subtypes of breast carcinomas have been identified, but translation into clinical use has been limited. We have developed two platform independent algorithms to explore genomic architectural distortion using array comparative genomic hybridization (aCGH) data to measure 1) whole arm gains and losses (WAAI) and 2) complex rearrangements (CAAI). By applying CAAI and WAAI to data from 595 breast cancer patients we were able to separate the cases into eight subgroups with different distributions of genomic distortion. Within each subgroup data from expression analyses, sequencing and ploidy indicated that progression occurs along separate paths into more complex genotypes. Histological grade had prognostic impact only in the Luminal related groups while the complexity identified by CAAI had an overall independent prognostic power. This study emphasizes the relationship between structural genomic alterations, molecular subtype and clinical behavior, and shows that an objective score of genomic complexity (CAAI) is an independent prognostic marker in breast cancer.
PMCID: PMC3972440  PMID: 20592421
7.  JAK2-Centered Interactome Hotspot Identified by an Integrative Network Algorithm in Acute Stanford Type A Aortic Dissection 
PLoS ONE  2014;9(2):e89406.
The precise mechanisms underlying dissections, especially those without connective tissue diseases or congenital vascular diseases, are incompletely understood. This study attempted to identify both the expression profile of the dissected ascending aorta and the interactome hotspots associated with the disease, using microarray technology and gene regulatory network analysis. There were 2,737 genes differentially expressed between patients with acute Stanford type A aortic dissection and controls. Eight interactome hotspots significantly associated with aortic dissection were identified by an integrative network algorithm. In particular, we identified a JAK2-centered expression module, which was validated in an independent gene expression microarray data set, and which was characterized by over-expressed cytokines and receptors in acute aortic dissection cases, indicating that JAK2 may play a key role in the inflammatory process, which potentially contributes to the occurrence of acute aortic dissection. Overall, the analytical strategy used in this study made it possible to identify functionally relevant network modules and thereby facilitated the biological interpretation of this complex disease.
PMCID: PMC3933461  PMID: 24586754
8.  Using high-density DNA methylation arrays to profile copy number alterations 
Genome Biology  2014;15(2):R30.
The integration of genomic and epigenomic data is an increasingly popular approach for studying the complex mechanisms driving cancer development. We have developed a method for evaluating both methylation and copy number from high-density DNA methylation arrays. Comparing copy number data from Infinium HumanMethylation450 BeadChips and SNP arrays, we demonstrate that Infinium arrays detect copy number alterations with the sensitivity of SNP platforms. These results show that high-density methylation arrays provide a robust and economic platform for detecting copy number and methylation changes in a single experiment. Our method is available in the ChAMP Bioconductor package:
PMCID: PMC4054098  PMID: 24490765
9.  Role of DNA Methylation and Epigenetic Silencing of HAND2 in Endometrial Cancer Development 
PLoS Medicine  2013;10(11):e1001551.
Please see later in the article for the Editors' Summary
Endometrial cancer incidence is continuing to rise in the wake of the current ageing and obesity epidemics. Much of the risk for endometrial cancer development is influenced by the environment and lifestyle. Accumulating evidence suggests that the epigenome serves as the interface between the genome and the environment and that hypermethylation of stem cell polycomb group target genes is an epigenetic hallmark of cancer. The objective of this study was to determine the functional role of epigenetic factors in endometrial cancer development.
Methods and Findings
Epigenome-wide methylation analysis of >27,000 CpG sites in endometrial cancer tissue samples (n = 64) and control samples (n = 23) revealed that HAND2 (a gene encoding a transcription factor expressed in the endometrial stroma) is one of the most commonly hypermethylated and silenced genes in endometrial cancer. A novel integrative epigenome-transcriptome-interactome analysis further revealed that HAND2 is the hub of the most highly ranked differential methylation hotspot in endometrial cancer. These findings were validated using candidate gene methylation analysis in multiple clinical sample sets of tissue samples from a total of 272 additional women. Increased HAND2 methylation was a feature of premalignant endometrial lesions and was seen to parallel a decrease in RNA and protein levels. Furthermore, women with high endometrial HAND2 methylation in their premalignant lesions were less likely to respond to progesterone treatment. HAND2 methylation analysis of endometrial secretions collected using high vaginal swabs taken from women with postmenopausal bleeding specifically identified those patients with early stage endometrial cancer with both high sensitivity and high specificity (receiver operating characteristics area under the curve = 0.91 for stage 1A and 0.97 for higher than stage 1A). Finally, mice harbouring a Hand2 knock-out specifically in their endometrium were shown to develop precancerous endometrial lesions with increasing age, and these lesions also demonstrated a lack of PTEN expression.
HAND2 methylation is a common and crucial molecular alteration in endometrial cancer that could potentially be employed as a biomarker for early detection of endometrial cancer and as a predictor of treatment response. The true clinical utility of HAND2 DNA methylation, however, requires further validation in prospective studies.
Editors' Summary
Cancer, which is responsible for 13% of global deaths, can develop anywhere in the body, but all cancers are characterized by uncontrolled cell growth and reduced cellular differentiation (the process by which unspecialized cells such as “stem” cells become specialized during development, tissue repair, and normal cell turnover). Genetic alterations—changes in the sequence of nucleotides (DNA's building blocks) in specific genes—are required for this cellular transformation and subsequent cancer development (carcinogenesis). However, recent evidence suggests that epigenetic modifications—reversible, heritable changes in gene function that occur in the absence of nucleotide sequence changes—may also be involved in carcinogenesis. For example, the addition of methyl groups to a set of genes called stem cell polycomb group target genes (PCGTs; polycomb genes control the expression of their target genes by modifying their DNA or associated proteins) is one of the earliest molecular changes in human cancer development, and increasing evidence suggests that hypermethylation of PCGTs is an epigenetic hallmark of cancer.
Why Was This Study Done?
The methylation of PCGTs, which is triggered by age and by environmental factors that are associated with cancer development, reduces cellular differentiation and leads to the accumulation of undifferentiated cells that are susceptible to cancer development. It is unclear, however, whether epigenetic modifications have a causal role in carcinogenesis. Here, the researchers investigate the involvement of epigenetic factors in the development of endometrial (womb) cancer. The risk of endometrial cancer (which affects nearly 50,000 women annually in the United States) is largely determined by environmental and lifestyle factors. Specifically, the risk of this cancer is increased in women in whom estrogen (a hormone that drives cell proliferation in the endometrium) is functionally dominant over progesterone (a hormone that inhibits endometrial proliferation and causes cell differentiation); obese women and women who have taken estrogen-only hormone replacement therapies fall into this category. Thus, endometrial cancer is an ideal model in which to study whether epigenetic mechanisms underlie carcinogenesis.
What Did the Researchers Do and Find?
The researchers collected data on genome-wide DNA methylation at cytosine- and guanine-rich sites in endometrial cancers and normal endometrium and integrated this information with the human interactome and transcriptome (all the physical interactions between proteins and all the genes expressed, respectively, in a cell) using an algorithm called Functional Epigenetic Modules (FEM). This analysis identified HAND2 as the hub of the most highly ranked differential methylation hotspot in endometrial cancer. HAND2 is a progesterone-regulated stem cell PCGT. It encodes a transcription factor that is expressed in the endometrial stroma (the connective tissue that lies below the epithelial cells in which most endometrial cancers develop) and that suppresses the production of the growth factors that mediate the growth-inducing effects of estrogen on the endometrial epithelium. The researchers hypothesized, therefore, that epigenetic deregulation of HAND2 could be a key step in endometrial cancer development. In support of this hypothesis, the researchers report that HAND2 methylation was increased in premalignant endometrial lesions (cancer-prone, abnormal-looking tissue) compared to normal endometrium, and was associated with suppression of HAND2 expression. Moreover, a high level of endometrial HAND2 methylation in premalignant lesions predicted a poor response to progesterone treatment (which stops the growth of some endometrial cancers), and analysis of HAND2 methylation in endometrial secretions collected from women with postmenopausal bleeding (a symptom of endometrial cancer) accurately identified individuals with early stage endometrial cancer. Finally, mice in which the Hand2 gene was specifically deleted in the endometrium developed precancerous endometrial lesions with age.
What Do These Findings Mean?
These and other findings identify HAND2 methylation as a common, key molecular alteration in endometrial cancer. These findings need to be confirmed in more women, and studies are needed to determine the immediate molecular and cellular consequences of HAND2 silencing in endometrial stromal cells. Nevertheless, these results suggest that HAND2 methylation could potentially be used as a biomarker for the early detection of endometrial cancer and for predicting treatment response. More generally, these findings support the idea that methylation of HAND2 (and, by extension, the methylation of other PCGTs) is not a passive epigenetic feature of cancer but is functionally involved in cancer development, and provide a framework for identifying other genes that are epigenetically regulated and functionally important in carcinogenesis.
Additional Information
Please access these websites via the online version of this summary at
The US National Cancer Institute provides information on all aspects of cancer and has detailed information about endometrial cancer for patients and professionals (in English and Spanish)
The not-for-profit organization American Cancer Society provides information on cancer and how it develops and specific information on endometrial cancer (in several languages)
The UK National Health Service Choices website includes an introduction to cancer, a page on endometrial cancer, and a personal story about endometrial cancer
The not-for-profit organization Cancer Research UK provides general information about cancer and specific information about endometrial cancer
Wikipedia has a page on cancer epigenetics (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
The Eve Appeal charity that supported this research provides useful information on gynecological cancers
PMCID: PMC3825654  PMID: 24265601
10.  Cellular network entropy as the energy potential in Waddington's differentiation landscape 
Scientific Reports  2013;3:3039.
Differentiation is a key cellular process in normal tissue development that is significantly altered in cancer. Although molecular signatures characterising pluripotency and multipotency exist, there is, as yet, no single quantitative mark of a cellular sample's position in the global differentiation hierarchy. Here we adopt a systems view and consider the sample's network entropy, a measure of signaling pathway promiscuity, computable from a sample's genome-wide expression profile. We demonstrate that network entropy provides a quantitative, in-silico, readout of the average undifferentiated state of the profiled cells, recapitulating the known hierarchy of pluripotent, multipotent and differentiated cell types. Network entropy further exhibits dynamic changes in time course differentiation data, and in line with a sample's differentiation stage. In disease, network entropy predicts a higher level of cellular plasticity in cancer stem cell populations compared to ordinary cancer cells. Importantly, network entropy also allows identification of key differentiation pathways. Our results are consistent with the view that pluripotency is a statistical property defined at the cellular population level, correlating with intra-sample heterogeneity, and driven by the degree of signaling promiscuity in cells. In summary, network entropy provides a quantitative measure of a cell's undifferentiated state, defining its elevation in Waddington's landscape.
PMCID: PMC3807110  PMID: 24154593
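To make the notion of network entropy as "signaling promiscuity" concrete, here is a simplified sketch of a local entropy at a single node. It assumes that the probability of signalling from a gene to each interaction partner is proportional to that partner's expression level, and it normalises by the maximum possible entropy for that node's degree. This is an illustrative simplification, not the published definition; the function name, the graph and the expression values are hypothetical.

```python
from math import log


def local_entropy(node, adjacency, expression):
    """Normalised Shannon entropy of the signalling distribution at one node.

    adjacency maps each gene to a list of its interaction partners;
    expression maps each gene to a positive expression value.
    Returns a value between 0 (one dominant partner) and 1 (all partners
    equally likely). Nodes with fewer than two partners return 0.0.
    """
    neighbours = adjacency[node]
    if len(neighbours) < 2:
        return 0.0
    total = sum(expression[j] for j in neighbours)
    probs = [expression[j] / total for j in neighbours]
    entropy = -sum(p * log(p) for p in probs if p > 0)
    return entropy / log(len(neighbours))
```

Under this assumption, a node whose partners are all equally expressed has maximal entropy, while a node dominated by one highly expressed partner has low entropy.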
11.  Epigenetic aging: insights from network biology 
Aging (Albany NY)  2013;5(10):719-720.
PMCID: PMC3838773  PMID: 24145222
12.  Age-associated epigenetic drift: implications, and a case of epigenetic thrift? 
Human Molecular Genetics  2013;22(R1):R7-R15.
It is now well established that the genomic landscape of DNA methylation (DNAm) gets altered as a function of age, a process we here call ‘epigenetic drift’. The biological, functional, clinical and evolutionary significance of this epigenetic drift, however, remains unclear. We here provide a brief review of epigenetic drift, focusing on the potential implications for ageing, stem cell biology and disease risk prediction. It has been demonstrated that epigenetic drift affects most of the genome, suggesting a global deregulation of DNAm patterns with age. A component of this drift is tissue-specific, allowing remarkably accurate age-predictive models to be constructed. Another component is tissue-independent, targeting stem cell differentiation pathways and affecting stem cells, which may explain the observed decline of stem cell function with age. Age-associated increases in DNAm target developmental genes, overlapping those associated with environmental disease risk factors and with disease itself, notably cancer. In particular, cancers and precursor cancer lesions exhibit aggravated age DNAm signatures. Epigenetic drift is also influenced by genetic factors. Thus, drift emerges as a promising biomarker for premature or biological ageing, and could potentially be used in geriatrics for disease risk prediction. Finally, we propose, in the context of human evolution, that epigenetic drift may represent a case of epigenetic thrift, or bet-hedging. In summary, this review demonstrates the growing importance of the ‘ageing epigenome’, with potentially far-reaching implications for understanding the effect of age on stem cell function and differentiation, as well as for disease prevention.
PMCID: PMC3782071  PMID: 23918660
13.  Corruption of the Intra-Gene DNA Methylation Architecture Is a Hallmark of Cancer 
PLoS ONE  2013;8(7):e68285.
Epigenetic processes - including DNA methylation - are increasingly seen as having a fundamental role in chronic diseases like cancer. It is well known that methylation levels at particular genes or loci differ between normal and diseased tissue. Here we investigate whether the intra-gene methylation architecture is corrupted in cancer and whether the variability of levels of methylation of individual CpGs within a defined gene is able to discriminate cancerous from normal tissue, and is associated with heterogeneous tumour phenotype, as defined by gene expression. We analysed 270985 CpGs annotated to 18272 genes, in 3284 cancerous and 681 normal samples, corresponding to 14 different cancer types. In doing so, we found novel differences in intra-gene methylation pattern across phenotypes, particularly in those genes which are crucial for stem cell biology; our measures of intra-gene methylation architecture are a better determinant of phenotype than measures based on mean methylation level alone (K-S test in all 14 diseases tested). These per-gene methylation measures also represent a considerable reduction in complexity, compared to conventional per-CpG beta-values. Our findings strongly support the view that intra-gene methylation architecture has great clinical potential for the development of DNA-based cancer biomarkers.
PMCID: PMC3712966  PMID: 23874574
14.  An integrative network algorithm identifies age-associated differential methylation interactome hotspots targeting stem-cell differentiation pathways 
Scientific Reports  2013;3:1630.
Epigenetic changes have been associated with ageing and cancer. Identifying and interpreting epigenetic changes associated with such phenotypes may benefit from integration with protein interactome models. We here develop and validate a novel integrative epigenome-interactome approach to identify differential methylation interactome hotspots associated with a phenotype of interest. We apply the algorithm to cancer and ageing, demonstrating the existence of hotspots associated with these phenotypes. Importantly, we discover tissue independent age-associated hotspots targeting stem-cell differentiation pathways, which we validate in independent DNA methylation data sets, encompassing over 1000 samples from different tissue types. We further show that these pathways would not have been discovered had we used a non-network based approach and that the use of the protein interaction network improves the overall robustness of the inference procedure. The proposed algorithm will be useful to any study seeking to identify interactome hotspots associated with common phenotypes.
PMCID: PMC3620664  PMID: 23568264
15.  An evaluation of analysis pipelines for DNA methylation profiling using the Illumina HumanMethylation450 BeadChip platform 
Epigenetics  2013;8(3):333-346.
The proper identification of differentially methylated CpGs is central in most epigenetic studies. The Illumina HumanMethylation450 BeadChip is widely used to quantify DNA methylation; nevertheless, the design of an appropriate analysis pipeline faces severe challenges due to the convolution of biological and technical variability and the presence of a signal bias between Infinium I and II probe design types. Despite recent attempts to investigate how to analyze DNA methylation data with such an array design, it has not been possible to perform a comprehensive comparison between different bioinformatics pipelines due to the lack of appropriate data sets having both large sample size and sufficient number of technical replicates. Here we perform such a comparative analysis, targeting the problems of reducing the technical variability, eliminating the probe design bias and reducing the batch effect by exploiting two unpublished data sets, which included technical replicates and were profiled for DNA methylation either on peripheral blood, monocytes or muscle biopsies. We evaluated the performance of different analysis pipelines and demonstrated that: (1) it is critical to correct for the probe design type, since the amplitude of the measured methylation change depends on the underlying chemistry; (2) the effect of different normalization schemes is mixed, and the most effective methods in our hands were quantile normalization and Beta Mixture Quantile dilation (BMIQ); (3) it is beneficial to correct for batch effects. In conclusion, our comparative analysis using a comprehensive data set suggests an efficient pipeline for proper identification of differentially methylated CpGs using the Illumina 450K arrays.
PMCID: PMC3669124  PMID: 23422812
technical variability; DNA methylation; microarray; Illumina 450K; normalization
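Quantile normalization, one of the two schemes the abstract above found most effective, can be sketched in a few lines. This is the generic textbook procedure, not the pipeline evaluated in the paper: each sample's values are replaced by the across-sample mean at the same rank. The function name is illustrative, ties are broken by position rather than averaged, and all samples are assumed to have the same number of probes.

```python
def quantile_normalize(samples):
    """Force every sample to share the same empirical distribution.

    samples is a list of equal-length lists, one list of probe values
    per sample. Returns new lists in the original probe order.
    """
    n_probes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean value at each rank across samples.
    rank_means = [
        sum(s[r] for s in sorted_samples) / len(samples) for r in range(n_probes)
    ]
    normalized = []
    for s in samples:
        # Probe indices ordered from smallest to largest value.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After normalization every sample contains exactly the same set of values, differing only in which probe carries which value.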
16.  Identification and functional validation of HPV-mediated hypermethylation in head and neck squamous cell carcinoma 
Genome Medicine  2013;5(2):15.
Human papillomavirus-positive (HPV+) head and neck squamous cell carcinoma (HNSCC) represents a distinct clinical and epidemiological condition compared with HPV-negative (HPV-) HNSCC. To test the possible involvement of epigenetic modulation by HPV in HNSCC, we conducted a genome-wide DNA-methylation analysis.
Using laser-capture microdissection of 42 formalin-fixed paraffin wax-embedded (FFPE) HNSCCs, we generated DNA-methylation profiles of 18 HPV+ and 14 HPV- samples, using Infinium 450 k BeadArray technology. Methylation data were validated in two sets of independent HPV+/HPV- HNSCC samples (fresh-frozen samples and cell lines) using two independent methods (Infinium 450 k and whole-genome methylated DNA immunoprecipitation sequencing (MeDIP-seq)). For the functional analysis, an HPV- HNSCC cell line was transduced with lentiviral constructs containing the two HPV oncogenes (E6 and E7), and effects on methylation were assayed using the Infinium 450 k technology.
Results and discussion
Unsupervised clustering over the methylation variable positions (MVPs) with greatest variation showed that samples segregated in accordance with HPV status, but also that HPV+ tumors are heterogeneous. MVPs were significantly enriched at transcriptional start sites, leading to the identification of a candidate CpG island methylator phenotype in a sub-group of the HPV+ tumors. Supervised analysis identified a strong preponderance (87%) of MVPs towards hypermethylation in HPV+ HNSCC. Meta-analysis of our HNSCC and publicly available methylation data in cervical and lung cancers confirmed the observed DNA-methylation signature to be HPV-specific and tissue-independent. Grouping of MVPs into functionally more significant differentially methylated regions identified 43 hypermethylated promoter DMRs, including for three cadherins of the Polycomb group target genes. Integration with independent expression data showed strong negative correlation, especially for the cadherin gene-family members. Combinatorial ectopic expression of the two HPV oncogenes (E6 and E7) in an HPV- HNSCC cell line partially phenocopied the hypermethylation signature seen in HPV+ HNSCC tumors, and established E6 as the main viral effector gene.
Our data establish that archival FFPE tissue is very suitable for this type of methylome analysis, and suggest that HPV modulates the HNSCC epigenome through hypermethylation of Polycomb repressive complex 2 target genes such as cadherins, which are implicated in tumor progression and metastasis.
PMCID: PMC3706778  PMID: 23419152
17.  A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data 
Bioinformatics  2012;29(2):189-196.
Motivation: The Illumina Infinium 450 k DNA Methylation Beadchip is a prime candidate technology for Epigenome-Wide Association Studies (EWAS). However, a difficulty associated with these beadarrays is that probes come in two different designs, characterized by widely different DNA methylation distributions and dynamic range, which may bias downstream analyses. A key statistical issue is therefore how best to adjust for the two different probe designs.
Results: Here we propose a novel model-based intra-array normalization strategy for 450 k data, called BMIQ (Beta MIxture Quantile dilation), to adjust the beta-values of type2 design probes into a statistical distribution characteristic of type1 probes. The strategy involves application of a three-state beta-mixture model to assign probes to methylation states, subsequent transformation of probabilities into quantiles and finally a methylation-dependent dilation transformation to preserve the monotonicity and continuity of the data. We validate our method on cell-line data, fresh frozen and paraffin-embedded tumour tissue samples and demonstrate that BMIQ compares favourably with two competing methods. Specifically, we show that BMIQ improves the robustness of the normalization procedure, reduces the technical variation and bias of type2 probe values and successfully eliminates the type1 enrichment bias caused by the lower dynamic range of type2 probes. BMIQ will be useful as a preprocessing step for any study using the Illumina Infinium 450 k platform.
Availability: BMIQ is freely available from
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3546795  PMID: 23175756
18.  Differential network entropy reveals cancer system hallmarks 
Scientific Reports  2012;2:802.
The cellular phenotype is described by a complex network of molecular interactions. Elucidating network properties that distinguish disease from the healthy cellular state is therefore of critical importance for gaining systems-level insights into disease mechanisms and ultimately for developing improved therapies. By integrating gene expression data with a protein interaction network we here demonstrate that cancer cells are characterised by an increase in network entropy. In addition, we formally demonstrate that gene expression differences between normal and cancer tissue are anticorrelated with local network entropy changes, thus providing a systemic link between gene expression changes at the nodes and their local correlation patterns. In particular, we find that genes which drive cell-proliferation in cancer cells and which often encode oncogenes are associated with reductions in network entropy. These findings may have potential implications for identifying novel drug targets.
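The abstract above does not state how local network entropy is computed. As a rough illustration only, the sketch below assumes one common construction: a node's non-negative edge weights to its network neighbours are normalised into a probability distribution, and the Shannon entropy of that distribution is divided by its maximum possible value. The function name `local_entropy` and the toy weights are hypothetical and are not taken from the paper.

```python
import math


def local_entropy(weights):
    """Normalised Shannon entropy of one node's edge weights (assumed definition).

    weights: non-negative weights on the edges from a node to its neighbours.
    Returns a value in [0, 1]; 1 means all neighbours are weighted equally.
    """
    k = len(weights)
    total = sum(weights)
    if k < 2 or total <= 0:
        # Entropy is not informative for an isolated or single-neighbour node.
        return 0.0
    # Turn weights into probabilities, skipping zeros (0 * log 0 is taken as 0).
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by log(k), the entropy of the uniform distribution over k neighbours.
    return entropy / math.log(k)


# Hypothetical example: a node whose interactions are spread evenly versus
# a node dominated by a single interaction.
even = local_entropy([1.0, 1.0, 1.0, 1.0])
skewed = local_entropy([10.0, 0.5, 0.5, 0.5])
```

Under this assumed definition, a node whose weight concentrates on one neighbour has lower local entropy than a node with evenly spread weights, which is the kind of reduction the abstract associates with proliferation-driving genes.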
PMCID: PMC3496163  PMID: 23150773
19.  Comments on: Interpretation of genome-wide infinium methylation data from ligated DNA in formalin-fixed paraffin-embedded paired tumor and normal tissue 
BMC Research Notes  2012;5:631.
BMC Research Notes recently published a research article regarding the use of ligated DNA extracted from formalin-fixed paraffin-embedded (FFPE) tissue on the Illumina Infinium methylation platform - “Interpretation of genome-wide infinium methylation data from ligated DNA in formalin-fixed, paraffin-embedded paired tumor and normal tissue” Jasmine et al. BMC Research Notes 2012, 5:117. This article repeatedly refers to our previous work and concludes that methylation data obtained from ligated FFPE extracted DNA should be used with great caution. In this Discussion we review the data analysis performed in the paper by Jasmine et al. and point out limitations which led the authors to draw what we believe are incorrect conclusions. Moreover, we continue to analyse genome-wide methylation data from DNA extracted from FFPE tissue successfully on both the HumMeth27 and 450 K arrays.
PMCID: PMC3531275  PMID: 23148593
20.  Differential oestrogen receptor binding is associated with clinical outcome in breast cancer 
Nature  2012;481(7381):389-393.
Oestrogen receptor-α (ER) is the defining and driving transcription factor in the majority of breast cancers and its target genes dictate cell growth and endocrine response, yet genomic understanding of ER function has been restricted to model systems [1-3]. We now map genome-wide ER binding events, by chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq), in primary breast cancers from patients with different clinical outcome and in distant ER positive (ER+) metastases. We find that drug resistant cancers still have ER-chromatin occupancy, but that ER binding is a dynamic process, with the acquisition of unique ER binding regions in tumours from patients that are likely to relapse. The acquired, poor outcome ER regulatory regions observed in primary tumours reveal gene signatures that predict clinical outcome in ER+ disease exclusively. We find that the differential ER binding programme observed in tumours from patients with poor outcome is not due to the selection of a rare subpopulation of cells, but is due to the FoxA1-mediated reprogramming of ER binding on a rapid time scale. The parallel redistribution of ER and FoxA1 cis-regulatory elements in drug resistant cellular contexts is supported by histological co-expression of ER and FoxA1 in metastatic samples. By establishing transcription factor mapping in primary tumour material, we show that there is plasticity in ER binding capacity, with distinct combinations of cis-regulatory elements linked with the different clinical outcomes.
PMCID: PMC3272464  PMID: 22217937
21.  A comparison of feature selection and classification methods in DNA methylation studies using the Illumina Infinium platform 
BMC Bioinformatics  2012;13:59.
The 27k Illumina Infinium Methylation Beadchip is a popular high-throughput technology that allows the methylation state of over 27,000 CpGs to be assayed. While feature selection and classification methods have been comprehensively explored in the context of gene expression data, relatively little is known as to how best to perform feature selection or classification in the context of Illumina Infinium methylation data. Given the rising importance of epigenomics in cancer and other complex genetic diseases, and in view of the upcoming epigenome wide association studies, it is critical to identify the statistical methods that offer improved inference in this novel context.
Using a total of 7 large Illumina Infinium 27k Methylation data sets, encompassing over 1,000 samples from a wide range of tissues, we here provide an evaluation of popular feature selection, dimensional reduction and classification methods on DNA methylation data. Specifically, we evaluate the effects of variance filtering, supervised principal components (SPCA) and the choice of DNA methylation quantification measure on downstream statistical inference. We show that for relatively large sample sizes feature selection using test statistics is similar for M and β-values, but that in the limit of small sample sizes, M-values allow more reliable identification of true positives. We also show that the effect of variance filtering on feature selection is study-specific and dependent on the phenotype of interest and tissue type profiled. Specifically, we find that variance filtering improves the detection of true positives in studies with large effect sizes, but that it may lead to worse performance in studies with smaller yet significant effect sizes. In contrast, supervised principal components improves the statistical power, especially in studies with small effect sizes. We also demonstrate that classification using the Elastic Net and Support Vector Machine (SVM) clearly outperforms competing methods like LASSO and SPCA. Finally, in unsupervised modelling of cancer diagnosis, we find that non-negative matrix factorisation (NMF) clearly outperforms principal components analysis.
Our results highlight the importance of tailoring the feature selection and classification methodology to the sample size and biological context of the DNA methylation study. The Elastic Net emerges as a powerful classification algorithm for large-scale DNA methylation studies, while NMF does well in the unsupervised context. The insights presented here will be useful to any study embarking on large-scale DNA methylation profiling using Illumina Infinium beadarrays.
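The comparison of M-values and β-values above rests on a simple transformation. The sketch below uses the standard logit relationship M = log2(β / (1 − β)); the clamping constant `eps` and the function names are assumptions added here for illustration and do not come from the paper.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a beta-value in [0, 1] to an M-value via the base-2 logit."""
    # Clamp away from 0 and 1 so the logarithm stays finite (eps is an assumption).
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Invert the transformation: recover a beta-value from an M-value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# A half-methylated site maps to M = 0; values near 0 or 1 are stretched out.
m_mid = beta_to_m(0.5)
m_high = beta_to_m(0.8)
```

Because the logit stretches values near 0 and 1, M-values have a more homogeneous spread across the methylation range than β-values, which is one plausible reason test statistics could behave differently on the two scales at small sample sizes.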
PMCID: PMC3364843  PMID: 22524302
DNA methylation; Classification; Feature selection; Beadarrays
22.  Epigenetic variability in cells of normal cytology is associated with the risk of future morphological transformation 
Genome Medicine  2012;4(3):24.
Recently, it has been proposed that epigenetic variation may contribute to the risk of complex genetic diseases like cancer. We aimed to demonstrate that epigenetic changes in normal cells, collected years in advance of the first signs of morphological transformation, can predict the risk of such transformation.
We analyzed DNA methylation (DNAm) profiles of over 27,000 CpGs in cytologically normal cells of the uterine cervix from 152 women in a prospective nested case-control study. We used statistics based on differential variability to identify CpGs associated with the risk of transformation and a novel statistical algorithm called EVORA (Epigenetic Variable Outliers for Risk prediction Analysis) to make predictions.
We observed many CpGs that were differentially variable between women who developed a non-invasive cervical neoplasia within 3 years of sample collection and those that remained disease-free. These CpGs exhibited heterogeneous outlier methylation profiles and overlapped strongly with CpGs undergoing age-associated DNA methylation changes in normal tissue. Using EVORA, we demonstrate that the risk of cervical neoplasia can be predicted in blind test sets (AUC = 0.66 (0.58 to 0.75)), and that assessment of DNAm variability allows more reliable identification of risk-associated CpGs than statistics based on differences in mean methylation levels. In independent data, EVORA showed high sensitivity and specificity to detect pre-invasive neoplasia and cervical cancer (AUC = 0.93 (0.86 to 1) and AUC = 1, respectively).
We demonstrate that the risk of neoplastic transformation can be predicted from DNA methylation profiles in the morphologically normal cell of origin of an epithelial cancer. Having profiled only 0.1% of CpGs in the human genome, studies of wider coverage are likely to yield improved predictive and diagnostic models with the accuracy needed for clinical application.
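The abstract contrasts statistics based on differential variability with statistics based on differences in mean methylation, but does not name the specific test. As a minimal sketch, assuming a plain log variance ratio as the variability statistic (this is not EVORA itself, and the function name, variance floor and toy values are all hypothetical), the idea can be illustrated for a single CpG:

```python
import math
import statistics


def log_variance_ratio(cases, controls, floor=1e-8):
    """log2 ratio of sample variances at one CpG (illustrative statistic only).

    Positive values mean methylation is more variable in cases than controls.
    """
    # Floor the variances so a constant group does not cause division by zero.
    v_case = max(statistics.variance(cases), floor)
    v_ctrl = max(statistics.variance(controls), floor)
    return math.log2(v_case / v_ctrl)


# Made-up beta-values: controls are tightly grouped, while two of the five
# cases are outliers, mimicking a heterogeneous outlier profile.
controls = [0.10, 0.12, 0.11, 0.09, 0.10]
cases = [0.10, 0.11, 0.45, 0.09, 0.60]
score = log_variance_ratio(cases, controls)
```

A statistic of this kind responds to a few outlier samples in one group, whereas a comparison of group means can be diluted when most samples in both groups look alike.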
Trial registration
The ARTISTIC trial is registered with the International Standard Randomised Controlled Trial Number ISRCTN25417821.
PMCID: PMC3446274  PMID: 22453031
24.  The Dynamics and Prognostic Potential of DNA Methylation Changes at Stem Cell Gene Loci in Women's Cancer 
PLoS Genetics  2012;8(2):e1002517.
Aberrant DNA methylation is an important cancer hallmark, yet the dynamics of DNA methylation changes in human carcinogenesis remain largely unexplored. Moreover, the role of DNA methylation for prediction of clinical outcome is still uncertain and confined to specific cancers. Here we perform the most comprehensive study of DNA methylation changes throughout human carcinogenesis, analysing 27,578 CpGs in each of 1,475 samples, ranging from normal cells in advance of non-invasive neoplastic transformation to non-invasive and invasive cancers and metastatic tissue. We demonstrate that hypermethylation at stem cell PolyComb Group Target genes (PCGTs) occurs in cytologically normal cells three years in advance of the first morphological neoplastic changes, while hypomethylation occurs preferentially at CpGs which are heavily Methylated in Embryonic Stem Cells (MESCs) and increases significantly with cancer invasion in both the epithelial and stromal tumour compartments. In contrast to PCGT hypermethylation, MESC hypomethylation progresses significantly from primary to metastatic cancer and defines a poor prognostic signature in four different gynaecological cancers. Finally, we associate expression of TET enzymes, which are involved in active DNA demethylation, to MESC hypomethylation in cancer. These findings have major implications for cancer and embryonic stem cell biology and establish the importance of systemic DNA hypomethylation for predicting prognosis in a wide range of different cancers.
Author Summary
DNA methylation is an important chemical modification of DNA that can affect and regulate the activity of genes in human tissue. Abnormal DNA methylation and its subsequent effects on gene activity are a hallmark of cancer, yet when precisely these DNA methylation changes occur and how they contribute to the development of cancer remains largely unexplored. In this work we measure the methylation state of DNA at over 14,000 genes in over 1,475 samples, including normal and benign cells, invasive cancers, and metastatic cancer tissue. Using cervical cancer as a model, we show that gain of abnormal methylation at genes typically un-methylated in stem cells can be detected up to 3 years in advance of the appearance of pre-cancerous cells, while those genes typically methylated in stem cells lose this methylation progressively throughout cancer development. Furthermore, we discover that this process of methylation loss during cancer progression is a marker of poor disease outcome common to all four major women-specific cancers: breast, ovarian, endometrial, and cervical cancers. Finally we demonstrate the relationship between loss of methylation and cancer-specific over-production of a specific protein known to play an active role in removing methylation from DNA. Taken together these findings highlight the complex nature of DNA methylation dynamics in cancer development as well as their potential exploitation for clinical gain.
PMCID: PMC3276553  PMID: 22346766
25.  DART: Denoising Algorithm based on Relevance network Topology improves molecular pathway activity inference 
BMC Bioinformatics  2011;12:403.
Inferring molecular pathway activity is an important step towards reducing the complexity of genomic data, understanding the heterogeneity in clinical outcome, and obtaining molecular correlates of cancer imaging traits. Increasingly, approaches towards pathway activity inference combine molecular profiles (e.g. gene or protein expression) with independent and highly curated structural interaction data (e.g. protein interaction networks) or more generally with prior knowledge pathway databases. However, it is unclear how best to use the pathway knowledge information in the context of molecular profiles of any given study.
We present an algorithm called DART (Denoising Algorithm based on Relevance network Topology) which filters out noise before estimating pathway activity. Using simulated and real multidimensional cancer genomic data and by comparing DART to other algorithms which do not assess the relevance of the prior pathway information, we here demonstrate that substantial improvement in pathway activity predictions can be made if prior pathway information is denoised before predictions are made. We also show that genes encoding hubs in expression correlation networks represent more reliable markers of pathway activity. Using the Netpath resource of signalling pathways in the context of breast cancer gene expression data we further demonstrate that DART leads to more robust inferences about pathway activity correlations. Finally, we show that DART identifies a hypothesized association between oestrogen signalling and mammographic density in ER+ breast cancer.
Evaluating the consistency of prior information of pathway databases in molecular tumour profiles may substantially improve the subsequent inference of pathway activity in clinical tumour specimens. This de-noising strategy should be incorporated in approaches which attempt to infer pathway activity from prior pathway models.
PMCID: PMC3228554  PMID: 22011170
