Related Articles

1.  A species-generalized probabilistic model-based definition of CpG islands 
The DNA of most vertebrates is depleted in CpG dinucleotides, the target for DNA methylation. The remaining CpGs tend to cluster in regions referred to as CpG islands (CGI). CGI have been useful for marking functionally relevant epigenetic loci for genome studies. For example, CGI are enriched in the promoters of vertebrate genes and are thought to play an important role in regulation. Currently, CGI are defined algorithmically as an observed-to-expected ratio (O/E) of CpG greater than 0.6, G+C content greater than 0.5, and usually but not necessarily greater than a certain length. Here we find that the current definition leaves out important CpG clusters associated with epigenetic marks, relevant to development and disease, and does not apply at all to nonvertebrate genomes. We propose an alternative Hidden Markov model-based approach that solves these problems. We fit our model to genomes from 30 species, and the results support a new epigenomic view toward the development of DNA methylation in species diversity and evolution. The O/E of CpG in islands and nonislands segregated closely phylogenetically and showed substantial loss in both groups in animals of greater complexity, while maintaining a nearly constant difference in CpG O/E between islands and nonisland compartments. Lists of CGI for some species are available at http://www.rafalab.org.
doi:10.1007/s00335-009-9222-5
PMCID: PMC2962567  PMID: 19777308
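The threshold-based definition quoted in the abstract above (CpG O/E greater than 0.6, G+C content greater than 0.5, plus a minimum length) can be illustrated with a minimal Python sketch. The function names and the 200 bp default length are assumptions made for illustration; the abstract does not fix a length, and this is not the authors' code.

```python
def cpg_stats(seq):
    """Return (gc_content, cpg_obs_exp) for a DNA sequence string."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")          # observed CpG dinucleotides
    gc_content = (c + g) / n
    expected = c * g / n         # expected CpGs given the C and G counts
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_content, obs_exp


def looks_like_cgi(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Apply the three sequence criteria; min_len=200 is an assumed default."""
    gc, oe = cpg_stats(seq)
    return len(seq) >= min_len and gc > min_gc and oe > min_oe


if __name__ == "__main__":
    print(cpg_stats("CG" * 100))
    print(looks_like_cgi("AT" * 100))
```

In practice such a test would be applied to sliding windows along a genome rather than to a whole sequence at once.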
2.  Redefining CpG islands using hidden Markov models 
Biostatistics (Oxford, England)  2010;11(3):499-514.
The DNA of most vertebrates is depleted in CpG dinucleotides: a C followed by a G in the 5′ to 3′ direction. CpGs are the target for DNA methylation, a chemical modification of cytosine (C) heritable during cell division and the most well-characterized epigenetic mechanism. The remaining CpGs tend to cluster in regions referred to as CpG islands (CGI). Knowing CGI locations is important because they mark functionally relevant epigenetic loci in development and disease. For various mammals, including human, a widely used list of CGI is readily available from the UCSC Genome Browser. This list was derived using algorithms that search for regions satisfying a definition of CGI proposed by Gardiner-Garden and Frommer more than 20 years ago. Recent findings, enabled by advances in technology that permit direct measurement of epigenetic endpoints at a whole-genome scale, motivate the need to adapt the current CGI definition. In this paper, we propose a procedure, guided by hidden Markov models, that permits an extensible approach to detecting CGI. The main advantage of our approach over others is that it summarizes the evidence for CGI status as probability scores. This provides flexibility in the definition of a CGI and facilitates the creation of CGI lists for other species. The utility of this approach is demonstrated by generating the first CGI lists for invertebrates, and by the fact that we can create CGI lists that substantially increase overlap with recently discovered epigenetic marks. A CGI list and the probability scores, as a function of genome location, for each species are available at http://www.rafalab.org.
doi:10.1093/biostatistics/kxq005
PMCID: PMC2883304  PMID: 20212320
CpG island; Epigenetics; Hidden Markov model; Sequence analysis
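To make the idea of "probability scores" for CGI status concrete, here is a minimal sketch of forward-backward smoothing in a generic two-state hidden Markov model. It is not the model from the paper: the binary observations (1 if a window contains a CpG), the transition and emission values, and the function name are all assumptions chosen for illustration.

```python
def posterior_island(obs, p_stay=0.9,
                     emit=((0.9, 0.1), (0.3, 0.7)),
                     start=(0.5, 0.5)):
    """Posterior P(state = island) per position for a 2-state HMM.

    State 0 is background and state 1 is island. obs is a sequence of
    0/1 symbols; emit[s][o] is P(o | state s). All values are made up.
    """
    trans = ((p_stay, 1 - p_stay), (1 - p_stay, p_stay))
    n = len(obs)
    if n == 0:
        return []
    states = (0, 1)

    # Forward pass, rescaled at every step to avoid underflow.
    cur = [start[s] * emit[s][obs[0]] for s in states]
    z = sum(cur)
    fwd = [[x / z for x in cur]]
    for t in range(1, n):
        cur = [emit[s][obs[t]] * sum(fwd[-1][r] * trans[r][s] for r in states)
               for s in states]
        z = sum(cur)
        fwd.append([x / z for x in cur])

    # Backward pass, also rescaled at every step.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        cur = [sum(trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r]
                   for r in states) for s in states]
        z = sum(cur)
        bwd[t] = [x / z for x in cur]

    # Combine and normalise to get the posterior of the island state.
    post = []
    for f, b in zip(fwd, bwd):
        w0, w1 = f[0] * b[0], f[1] * b[1]
        post.append(w1 / (w0 + w1))
    return post


if __name__ == "__main__":
    scores = posterior_island([0] * 5 + [1] * 5 + [0] * 5)
    print([round(p, 3) for p in scores])
```

Returning a score per position, rather than a hard yes/no call, is what lets a downstream user choose their own cutoff when building a CGI list.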
3.  Analysis of CpG methylation sites and CGI among human papillomavirus DNA genomes 
BMC Genomics  2011;12:580.
Background
The Human Papillomavirus (HPV) genome is divided into early and late coding sequences, including 8 open reading frames (ORFs) and a regulatory region (LCR). Viral gene expression may be regulated through epigenetic mechanisms, including cytosine methylation at CpG dinucleotides. We have analyzed the distribution of CpG sites and CpG islands/clusters (CGI) among 92 different HPV genomes grouped according to their preferential tropism: cutaneous or mucosal. We calculated the proportion of CpG sites (PCS) for each ORF and calculated the expected CpG values for each viral type.
Results
CpGs are underrepresented in viral genomes. We found a positive correlation between CpG observed and expected values, with mucosal high-risk (HR) virus types showing the smallest O/E ratios. The ranges of the PCS were similar for most genomic regions except E4, where the majority of CpGs are found within islands/clusters. At least one CGI belongs to each E2/E4 region. We found positive correlations between PCS for each viral ORF when compared with the others, except for the LCR against four ORFs and E6 against three other ORFs. The distribution of CpG islands/clusters among HPV groups is heterogeneous, and mucosal HR-HPV types exhibit both fewer and shorter islands than cutaneous and mucosal low-risk (LR) HPVs (all differences statistically significant).
Conclusions
There is a difference between viral and cellular CpG underrepresentation. There are significant correlations between complete genome PCS and a lack of correlations between several genomic region pairs, especially those involving LCR and E6. L2 and L1 ORF behavior is opposite to that of oncogenes E6 and E7. The first pair possesses relatively low numbers of CpG sites clustered in CGIs while the oncogenes possess a relatively high number of CpG sites not associated with CGIs. In all HPVs, E2/E4 is the only region with at least one CGI and shows a higher content of CpG sites in every HPV type with an identified E4. The mucosal HR-HPVs show both the shortest CGI size, followed by the mucosal LR-HPVs and lastly by the cutaneous viral subgroup, and a trend toward the lowest CGI number, followed by the cutaneous viral subgroup and lastly by the mucosal LR-HPVs.
doi:10.1186/1471-2164-12-580
PMCID: PMC3293833  PMID: 22118413
4.  CpG_MI: a novel approach for identifying functional CpG islands in mammalian genomes 
Nucleic Acids Research  2009;38(1):e6.
CpG islands (CGIs) are CpG-rich regions compared to CpG-depleted bulk DNA of mammalian genomes and are generally regarded as the epigenetic regulatory regions in association with unmethylation, promoter activity and histone modifications. Accurate identification of CpG islands with epigenetic regulatory function in bulk genomes is of wide interest. Here, the common features of functional CGIs are identified using an average mutual information method to differentiate functional CGIs from the remaining CGIs. A new approach (CpG mutual information, CpG_MI) was further explored to identify functional CGIs based on the cumulative mutual information of physical distances between two neighboring CpGs. Compared to current approaches, CpG_MI achieved the highest prediction accuracy. This approach also identified new functional CGIs overlapping with gene promoter regions which were missed by other algorithms. Nearly all CGIs identified by CpG_MI overlapped with histone modification marks. CpG_MI could also be used to identify potential functional CGIs in other mammalian genomes, as the CpG dinucleotide contents and cumulative mutual information distributions are almost the same among six mammalian genomes in our analysis. It is a reliable quantitative tool for the identification of functional CGIs from bulk genomes and helps in understanding the relationships between genomic functional elements and epigenomic modifications.
doi:10.1093/nar/gkp882
PMCID: PMC2800233  PMID: 19854943
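The abstract above describes CpG_MI as being built on the physical distances between neighboring CpGs. The Python sketch below shows only that first ingredient, extracting CpG positions and the gaps between consecutive ones. It does not implement the mutual information calculation, and the function names are hypothetical.

```python
def cpg_positions(seq):
    """0-based start positions of every CG dinucleotide in seq."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def neighbor_distances(seq):
    """Distances in bp between each pair of consecutive CpGs."""
    pos = cpg_positions(seq)
    return [b - a for a, b in zip(pos, pos[1:])]


if __name__ == "__main__":
    example = "ACGTTCGCGA"
    print(cpg_positions(example))       # [1, 5, 7]
    print(neighbor_distances(example))  # [4, 2]
```

Short distances clustered together indicate a CpG-dense stretch, which is the kind of signal a distance-based method would then score.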
5.  CpG Island Mapping by Epigenome Prediction 
PLoS Computational Biology  2007;3(6):e110.
CpG islands were originally identified by epigenetic and functional properties, namely, absence of DNA methylation and frequent promoter association. However, this concept was quickly replaced by simple DNA sequence criteria, which allowed for genome-wide annotation of CpG islands in the absence of large-scale epigenetic datasets. Although widely used, the current CpG island criteria incur significant disadvantages: (1) reliance on arbitrary threshold parameters that bear little biological justification, (2) failure to account for widespread heterogeneity among CpG islands, and (3) apparent lack of specificity when applied to the human genome. This study is driven by the idea that a quantitative score of “CpG island strength” that incorporates epigenetic and functional aspects can help resolve these issues. We construct an epigenome prediction pipeline that links the DNA sequence of CpG islands to their epigenetic states, including DNA methylation, histone modifications, and chromatin accessibility. By training support vector machines on epigenetic data for CpG islands on human Chromosomes 21 and 22, we identify informative DNA attributes that correlate with open versus compact chromatin structures. These DNA attributes are used to predict the epigenetic states of all CpG islands genome-wide. Combining predictions for multiple epigenetic features, we estimate the inherent CpG island strength for each CpG island in the human genome, i.e., its inherent tendency to exhibit an open and transcriptionally competent chromatin structure. We extensively validate our results on independent datasets, showing that the CpG island strength predictions are applicable and informative across different tissues and cell types, and we derive improved maps of predicted “bona fide” CpG islands. 
The mapping of CpG islands by epigenome prediction is conceptually superior to identifying CpG islands by widely used sequence criteria since it links CpG island detection to their characteristic epigenetic and functional states. And it is superior to purely experimental epigenome mapping for CpG island detection since it abstracts from specific properties that are limited to a single cell type or tissue. In addition, using computational epigenetics methods we could identify high correlation between the epigenome and characteristics of the DNA sequence, a finding which emphasizes the need for a better understanding of the mechanistic links between genome and epigenome.
Author Summary
A key challenge for bioinformatic research is the identification of regulatory regions in the human genome. Regulatory regions are DNA elements that control gene expression and thereby contribute to the organism's phenotype. An important class of regulatory regions consists of so-called CpG islands, which are characterized by frequent occurrence of the CG sequence pattern. CpG islands are strongly associated with open and transcriptionally competent chromatin structure, they play a critical role in gene regulation, and they are involved in the epigenetic causes of cancer. In this article we make several conceptual improvements to the definition and mapping of CpG islands. First, we show that the traditional distinction between CpG islands and non-CpG islands is too harsh, and instead we propose a quantitative measure of CpG island strength to gradually distinguish between stronger and weaker regulatory regions. Second, by genome-wide comparison of multiple epigenome datasets we identify high correlation between features of the genome's DNA sequence and the epigenome, indicating strong functional interdependence. Third, we develop and apply a novel method for predicting the strength of all CpG islands in the human genome, giving rise to an improved and more accurate CpG island mapping.
doi:10.1371/journal.pcbi.0030110
PMCID: PMC1892605  PMID: 17559301
7.  CpG island chromatin 
Epigenetics  2011;6(2):147-152.
The majority of mammalian gene promoters are encompassed within regions of the genome called CpG islands that have an elevated level of non-methylated CpG dinucleotides. Despite over 20 years of study, the precise mechanisms by which CpG islands contribute to regulatory element function remain poorly understood. Recently it has been demonstrated that specific histone modifying enzymes are recruited directly to CpG islands through recognition of non-methylated CpG dinucleotide sequence. These enzymes then impose unique chromatin architecture on CpG islands that distinguish them from the surrounding genome. In the context of this work we discuss how CpG island elements may contribute to the function of gene regulatory elements through the utilization of chromatin and epigenetic processes.
doi:10.4161/epi.6.2.13640
PMCID: PMC3278783  PMID: 20935486
CpG island; chromatin; histone; methylation; promoter; ZF-CxxC domain; transcription; histone lysine demethylase
8.  CpG island hypermethylation in human astrocytomas 
Cancer research  2010;70(7):2718-2727.
Astrocytomas are common and lethal human brain tumors. We have analyzed the methylation status of over 28,000 CpG islands and 18,000 promoters in normal human brain and in astrocytomas of various grades using the methylated-CpG island recovery assay (MIRA). We identified six to seven thousand methylated CpG islands in normal human brain. ~5% of the promoter-associated CpG islands in normal brain are methylated. Promoter CpG island methylation is inversely and intragenic methylation is directly correlated with gene expression levels in brain tissue. In astrocytomas, several hundred CpG islands undergo specific hypermethylation relative to normal brain with 428 methylation peaks common to more than 25% of the tumors. Genes involved in brain development and neuronal differentiation, such as BMP4, POU4F3, GDNF, OTX2, NEFM, CNTN4, OTP, SIM1, FYN, EN1, CHAT, GSX2, NKX6-1, PAX6, RAX, and DLX2, were strongly enriched among genes frequently methylated in tumors. There was an overrepresentation of homeobox genes and 31% of the most commonly methylated genes represent targets of the Polycomb complex. We identified several chromosomal loci in which many (sometimes more than 20) consecutive CpG islands were hypermethylated in tumors. Seven of such loci were near homeobox genes, including the HOXC and HOXD clusters, and the BARHL2, DLX1, and PITX2 genes. Two other clusters of hypermethylated islands were at sequences of recent gene duplication events. Our analysis offers mechanistic insights into brain neoplasia suggesting that methylation of genes involved in neuronal differentiation, in cooperation with other oncogenic events, may shift the balance from regulated differentiation towards gliomagenesis.
doi:10.1158/0008-5472.CAN-09-3631
PMCID: PMC2848870  PMID: 20233874
9.  CpGislandEVO: A Database and Genome Browser for Comparative Evolutionary Genomics of CpG Islands 
BioMed Research International  2013;2013:709042.
Hypomethylated, CpG-rich DNA segments (CpG islands, CGIs) are epigenome markers involved in key biological processes. Aberrant methylation is implicated in the appearance of several disorders such as cancer, immunodeficiency, or centromere instability. Furthermore, methylation differences at promoter regions between human and chimpanzee strongly associate with genes involved in neurological/psychological disorders and cancers. Therefore, the evolutionary comparative analyses of CGIs can provide insights into the functional role of these epigenome markers in both health and disease. Given the lack of specific tools, we developed CpGislandEVO. Briefly, we first compile a database of statistically significant CGIs for the best assembled mammalian genome sequences available to date. Second, by means of a coupled browser front-end, we focus on the CGIs overlapping orthologous genes extracted from OrthoDB, thus ensuring the comparison between CGIs located on truly homologous genome segments. This allows comparing the main compositional features between homologous CGIs. Finally, to facilitate nucleotide comparisons, we lifted genome coordinates between assemblies from different species, which enables the analysis of sequence divergence by direct count of nucleotide substitutions and indels occurring between homologous CGIs. The resulting CpGislandEVO database, linking together CGIs and single-cytosine DNA methylation data from several mammalian species, is freely available at our website.
doi:10.1155/2013/709042
PMCID: PMC3800598  PMID: 24205506
10.  CpG islands: algorithms and applications in methylation studies 
Methylation occurs frequently at 5′-cytosine of the CpG dinucleotides in vertebrate genomes; however, this epigenetic feature is rarely observed in CpG islands (CGIs) or CpG clusters in the promoter regions of genes. Aberrant methylation of the promoter-associated CGIs might influence gene expression and cause carcinogenesis. Because of the functional importance, multiple algorithms have been available for identifying CGIs in a genome or a sequence. They can be categorized into the traditional algorithms (e.g., Gardiner-Garden and Frommer (1987), Takai and Jones (2002), and CpGPRoD (2002)) or statistical property based algorithms (CpGcluster (2006) and CG cluster (2007)). We reviewed the features of these algorithms and evaluated their performance on identifying functional CGIs using genome-wide methylation data. Moreover, identification of CGIs is an initial step in many recent studies for predicting methylation status as well as in the design of methylation detection platforms. We reviewed the benchmarks and features used in these studies.
doi:10.1016/j.bbrc.2009.03.076
PMCID: PMC2679166  PMID: 19302978
CpG island; CpG cluster; CG clusters; Methylation; Epigenetics; Promoter; Prediction algorithm
11.  The Relationship of DNA Methylation with Age, Gender and Genotype in Twins and Healthy Controls 
PLoS ONE  2009;4(8):e6767.
Cytosine-5 methylation within CpG dinucleotides is a potentially important mechanism of epigenetic influence on human traits and disease. In addition to influences of age and gender, genetic control of DNA methylation levels has recently been described. We used whole blood genomic DNA in a twin set (23 MZ twin-pairs and 23 DZ twin-pairs, N = 92) as well as healthy controls (N = 96) to investigate heritability and relationship with age and gender of selected DNA methylation profiles using readily commercially available GoldenGate bead array technology. Despite the inability to detect meaningful methylation differences in the majority of CpG loci due to tissue type and locus selection issues, we found replicable significant associations of DNA methylation with age and gender. We identified associations of genetically heritable single nucleotide polymorphisms with large differences in DNA methylation levels near the polymorphism (cis effects) as well as associations with much smaller differences in DNA methylation levels elsewhere in the human genome (trans effects). Our results demonstrate the feasibility of array-based approaches in studies of DNA methylation and highlight the vast differences between individual loci. The identification of CpG loci of which DNA methylation levels are under genetic control or are related to age or gender will facilitate further studies into the role of DNA methylation and disease.
doi:10.1371/journal.pone.0006767
PMCID: PMC2747671  PMID: 19774229
12.  Putative Zinc Finger Protein Binding Sites Are Over-Represented in the Boundaries of Methylation-Resistant CpG Islands in the Human Genome 
PLoS ONE  2007;2(11):e1184.
Background
The majority of CpG dinucleotides in mammalian genomes tend to undergo DNA methylation, but most CpG islands are resistant to such epigenetic modification. The mechanisms that may lead to the methylation resistance of CpG islands are still very poorly understood.
Methodology/Principal Findings
Using the genome-scale in vivo DNA methylation data from human brain, we investigated the flanking sequence features of methylation-resistant CpG islands, and discovered that there are several over-represented putative Transcription Factor Binding Sites (TFBSs) in methylation-resistant CpG islands, and a specific group of zinc finger protein binding sites are over-represented in boundary regions (∼400 bp) flanking such CpG islands. About 77% of the over-represented putative TFBSs are conserved among human, mouse and rat. We also observed the enrichment of 4 histone methylations in methylation-resistant CpG islands or their boundaries.
Conclusions/Significance
Our results suggest a possible mechanism that certain putative zinc finger protein binding sites over-represented in the boundary regions of the methylation-resistant CpG islands may block the spreading of methylation into these islands, and those TFBSs over-represented within the islands may both reinforce the methylation blocking and promote transcription. Some histone modifications may also enhance the immunity of the CpG islands against DNA methylation by augmenting these TFs' binding. We speculate that the dynamical equilibrium between methylation spreading and blocking is likely to be responsible for the establishment and maintenance of the relatively stable DNA methylation pattern in human somatic cells.
doi:10.1371/journal.pone.0001184
PMCID: PMC2065907  PMID: 18030324
13.  Bio-CAP: a versatile and highly sensitive technique to purify and characterise regions of non-methylated DNA 
Nucleic Acids Research  2011;40(4):e32.
Across vertebrate genomes methylation of cytosine residues within the context of CpG dinucleotides is a pervasive epigenetic mark that can impact gene expression and has been implicated in various developmental and disease-associated processes. Several biochemical approaches exist to profile DNA methylation, but recently an alternative approach based on profiling non-methylated CpGs was developed. This technique, called CxxC affinity purification (CAP), uses a ZF-CxxC (CxxC) domain to specifically capture DNA containing clusters of non-methylated CpGs. Here we describe a new CAP approach, called biotinylated CAP (Bio-CAP), which eliminates the requirement for specialized equipment while dramatically improving and simplifying the CxxC-based DNA affinity purification. Importantly, this approach isolates non-methylated DNA in a manner that is directly proportional to the density of non-methylated CpGs, and discriminates non-methylated CpGs from both methylated and hydroxymethylated CpGs. Unlike conventional CAP, Bio-CAP can be applied to nanogram quantities of genomic DNA and in a magnetic format is amenable to efficient parallel processing of samples. Furthermore, Bio-CAP can be applied to genome-wide profiling of non-methylated DNA with relatively small amounts of input material. Therefore, Bio-CAP is a simple and streamlined approach for characterizing regions of the non-methylated DNA, whether at specific target regions or genome wide.
doi:10.1093/nar/gkr1207
PMCID: PMC3287171  PMID: 22156374
14.  Latent Regulatory Potential of Human-Specific Repetitive Elements 
Molecular Cell  2013;49(2):262-272.
Summary
At least half of the human genome is derived from repetitive elements, which are often lineage specific and silenced by a variety of genetic and epigenetic mechanisms. Using a transchromosomic mouse strain that transmits an almost complete single copy of human chromosome 21 via the female germline, we show that a heterologous regulatory environment can transcriptionally activate transposon-derived human regulatory regions. In the mouse nucleus, hundreds of locations on human chromosome 21 newly associate with activating histone modifications in both somatic and germline tissues, and influence the gene expression of nearby transcripts. These regions are enriched with primate and human lineage-specific transposable elements, and their activation corresponds to changes in DNA methylation at CpG dinucleotides. This study reveals the latent regulatory potential of the repetitive human genome and illustrates the species specificity of mechanisms that control it.
Highlights
► A mouse carrying human chromosome 21 fails to repress primate-specific repeats
► The lack of repression was revealed by H3K4me3 and transcription factor binding
► Activation corresponded to a decrease in CpG methylation
► Primate-specific repeats activated in human testes were activated in the Tc1 mouse
doi:10.1016/j.molcel.2012.11.013
PMCID: PMC3560060  PMID: 23246434
15.  Role of hMOF-dependent histone H4 lysine 16 acetylation in the maintenance of TMS1/ASC gene activity
Cancer research  2008;68(16):6810-6821.
Epigenetic silencing of tumor suppressor genes in human cancers is associated with aberrant methylation of promoter region CpG islands and local alterations in histone modifications. However, the mechanisms that drive these events remain unclear. Here, we establish an important role for histone H4 lysine 16 acetylation (H4K16Ac) and the histone acetyltransferase hMOF in the regulation of TMS1/ASC, a proapoptotic gene that undergoes epigenetic silencing in human cancers. In the unmethylated and active state, the TMS1 CpG island is spanned by positioned nucleosomes and marked by histone H3K4 methylation. H4K16Ac was uniquely localized to two sharp peaks that flanked the unmethylated CpG island and corresponded to strongly positioned nucleosomes. Aberrant methylation and silencing of TMS1 was accompanied by loss of the H4K16Ac peaks, loss of nucleosome positioning, hypomethylation of H3K4 and hypermethylation of H3K9. In addition, a single peak of histone H4 lysine 20 trimethylation was observed near the transcription start site. Downregulation of hMOF or another component of the MSL complex resulted in a gene-specific decrease in H4K16Ac, loss of nucleosome positioning and silencing of TMS1. Gene silencing induced by H4K16 deacetylation occurred independently of changes in histone methylation and DNA methylation and was reversed upon hMOF re-expression. These results indicate that the selective marking of nucleosomes flanking the CpG island by hMOF is required to maintain TMS1 gene activity, and suggest that the loss of H4K16Ac, mobilization of nucleosomes and transcriptional downregulation may be important events in the epigenetic silencing of certain tumor suppressor genes in cancer.
doi:10.1158/0008-5472.CAN-08-0141
PMCID: PMC2585755  PMID: 18701507
DNA methylation; gene regulation; histone modifications; chromatin; cancer
16.  CpG Island microarray probe sequences derived from a physical library are representative of CpG Islands annotated on the human genome 
Nucleic Acids Research  2005;33(9):2952-2961.
An effective tool for the global analysis of both DNA methylation status and protein–chromatin interactions is a microarray constructed with sequences containing regulatory elements. One type of array suited for this purpose takes advantage of the strong association between CpG Islands (CGIs) and gene regulatory regions. We have obtained 20 736 clones from a CGI Library and used these to construct CGI arrays. The utility of this library requires proper annotation and assessment of the clones, including CpG content, genomic origin and proximity to neighboring genes. Alignment of clone sequences to the human genome (UCSC hg17) identified 9595 distinct genomic loci; 64% were defined by a single clone while the remaining 36% were represented by multiple, redundant clones. Approximately 68% of the loci were located near a transcription start site. The distribution of these loci covered all 23 chromosomes, with 63% overlapping a bioinformatically identified CGI. The high representation of genomic CGI in this rich collection of clones supports the utilization of microarrays produced with this library for the study of global epigenetic mechanisms and protein–chromatin interactions. A browsable database is available on-line to facilitate exploration of the CGIs in this library and their association with annotated genes or promoter elements.
doi:10.1093/nar/gki582
PMCID: PMC1137027  PMID: 15911630
17.  A Novel CpG Island Set Identifies Tissue-Specific Methylation at Developmental Gene Loci 
PLoS Biology  2008;6(1):e22.
CpG islands (CGIs) are dense clusters of CpG sequences that punctuate the CpG-deficient human genome and associate with many gene promoters. As CGIs also differ from bulk chromosomal DNA by their frequent lack of cytosine methylation, we devised a CGI enrichment method based on nonmethylated CpG affinity chromatography. The resulting library was sequenced to define a novel human blood CGI set that includes many that are not detected by current algorithms. Approximately half of CGIs were associated with annotated gene transcription start sites, the remainder being intra- or intergenic. Using an array representing over 17,000 CGIs, we established that 6%–8% of CGIs are methylated in genomic DNA of human blood, brain, muscle, and spleen. Inter- and intragenic CGIs are preferentially susceptible to methylation. CGIs showing tissue-specific methylation were overrepresented at numerous genetic loci that are essential for development, including HOX and PAX family members. The findings enable a comprehensive analysis of the roles played by CGI methylation in normal and diseased human tissues.
Author Summary
The human genome contains about 22,000 genes, each encoding one of the proteins required for human life. A particular cell type (e.g., blood, skin, etc.) expresses a specific subset of protein genes and silences the remainder. To shed light on the mechanisms that cause genes to be activated or shut down, we studied DNA sequences called “CpG islands” (CGIs). These sequences are found at over half of all human genes and can exist in either the active or silent state depending on the presence or absence of methyl groups on the DNA. We devised a method for purifying all CGIs and showed that, unexpectedly, only half occur at the beginning of genes near the promoter, the rest occurring within or between genes. Notably, methylation of CGIs causes stable gene silencing. We tested 17,000 CGIs in four human tissues and found that 6%–8% were methylated in each. Genes whose protein products play an essential role during embryonic development were preferentially methylated, suggesting that gene expression during development could be regulated by CGI methylation.
CpG island methylation, an epigenetic phenomenon usually associated with abnormality in disease, is little characterised in the context of "normal" human cells. Here we highlight tissue-specific CpG island methylation, which frequently associates with developmental genes.
doi:10.1371/journal.pbio.0060022
PMCID: PMC2214817  PMID: 18232738
18.  Strategies for discovery and validation of methylated and hydroxymethylated DNA biomarkers 
Cancer Medicine  2012;1(2):237-260.
DNA methylation, consisting of the addition of a methyl group at the fifth position of cytosine in a CpG dinucleotide, is one of the most well-studied epigenetic mechanisms in mammals with important functions in normal and disease biology. Disease-specific aberrant DNA methylation is a well-recognized hallmark of many complex diseases. Accordingly, various studies have focused on characterizing unique DNA methylation marks associated with distinct stages of disease development as they may serve as useful biomarkers for diagnosis, prognosis, prediction of response to therapy, or disease monitoring. Recently, novel CpG dinucleotide modifications with potential regulatory roles such as 5-hydroxymethylcytosine, 5-formylcytosine, and 5-carboxylcytosine have been described. These potential epigenetic marks cannot be distinguished from 5-methylcytosine by many current strategies and may potentially compromise assessment and interpretation of methylation data. A large number of strategies have been described for the discovery and validation of DNA methylation-based biomarkers, each with its own advantages and limitations. These strategies can be classified into three main categories: restriction enzyme digestion, affinity-based analysis, and bisulfite modification. In general, candidate biomarkers are discovered using large-scale, genome-wide, methylation sequencing, and/or microarray-based profiling strategies. Following discovery, biomarker performance is validated in large independent cohorts using highly targeted locus-specific assays. There are still many challenges to the effective implementation of DNA methylation-based biomarkers. Emerging innovative methylation and hydroxymethylation detection strategies are focused on addressing these gaps in the field of epigenetics.
The development of DNA methylation- and hydroxymethylation-based biomarkers is an exciting and rapidly evolving area of research that holds promise for potential applications in diverse clinical settings.
doi:10.1002/cam4.22
PMCID: PMC3544446  PMID: 23342273
Affinity-based methylation analysis; bisulfite modification; hydroxymethylation; methylation-sensitive restriction enzymes; microarrays; next-generation sequencing
19.  CpG island density and its correlations with genomic features in mammalian genomes 
Genome Biology  2008;9(5):R79.
A systematic analysis of CpG islands in ten mammalian genomes suggests that an increase in chromosome number elevates GC content and prevents loss of CpG islands.
Background
CpG islands, which are clusters of CpG dinucleotides in GC-rich regions, are considered gene markers and represent an important feature of mammalian genomes. Previous studies of CpG islands have largely been on specific loci or within one genome. To date, there seems to be no comparative analysis of CpG islands and their density at the DNA sequence level among mammalian genomes and of their correlations with other genome features.
Results
In this study, we performed a systematic analysis of CpG islands in ten mammalian genomes. We found that both the number of CpG islands and their density vary greatly among genomes, though many of these genomes encode similar numbers of genes. We observed significant correlations between CpG island density and genomic features such as number of chromosomes, chromosome size, and recombination rate. We also observed a trend of higher CpG island density in telomeric regions. Furthermore, we evaluated the performance of three computational algorithms for CpG island identification. Finally, we compared our observations in mammals with those in non-mammalian vertebrates.
Conclusion
Our study revealed that CpG islands vary greatly among mammalian genomes. Some factors such as recombination rate and chromosome size might have influenced the evolution of CpG islands in the course of mammalian evolution. Our results suggest a scenario in which an increase in chromosome number increases the rate of recombination, which in turn elevates GC content to help prevent loss of CpG islands and maintain their density. These findings should be useful for studying mammalian genomes, the role of CpG islands in gene function, and molecular evolution.
doi:10.1186/gb-2008-9-5-r79
PMCID: PMC2441465  PMID: 18477403
20.  Genomic Distribution and Inter-Sample Variation of Non-CpG Methylation across Human Cell Types 
PLoS Genetics  2011;7(12):e1002389.
DNA methylation plays an important role in development and disease. The primary sites of DNA methylation in vertebrates are cytosines in the CpG dinucleotide context, which account for roughly three quarters of the total DNA methylation content in human and mouse cells. While the genomic distribution, inter-individual stability, and functional role of CpG methylation are reasonably well understood, little is known about DNA methylation targeting CpA, CpT, and CpC (non-CpG) dinucleotides. Here we report a comprehensive analysis of non-CpG methylation in 76 genome-scale DNA methylation maps across pluripotent and differentiated human cell types. We confirm non-CpG methylation to be predominantly present in pluripotent cell types and observe a decrease upon differentiation and near complete absence in various somatic cell types. Although no function has been assigned to it in pluripotency, our data highlight that non-CpG methylation patterns reappear upon iPS cell reprogramming. Intriguingly, the patterns are highly variable and show little conservation between different pluripotent cell lines. We find a strong correlation of non-CpG methylation and DNMT3 expression levels while showing statistical independence of non-CpG methylation from pluripotency associated gene expression. In line with these findings, we show that knockdown of DNMT3A and DNMT3B in hESCs results in a global reduction of non-CpG methylation. Finally, non-CpG methylation appears to be spatially correlated with CpG methylation. In summary, these results contribute further to our understanding of cytosine methylation patterns in human cells using a large representative sample set.
Author Summary
Epigenetic modifications including DNA methylation at the position 5 of the cytosine base provide regulatory information to the genome sequence. The primary target of cytosine methylation in mammals is the CpG dinucleotide. However, previous studies in the mouse and more recent work in humans have highlighted the presence of non-CpG methylation in pluripotent cells. Currently, little is known about the role of this type of DNA methylation. We sought to further characterize non-CpG methylation by employing a comprehensive data set of genome-scale methylation maps across various human cell types. Our analysis reveals that non-CpG methylation varies dramatically between pluripotent cells and is closely linked to CpG methylation. Moreover, we show that depletion of the de novo DNA methyltransferases results in a global reduction of non-CpG methylation levels. Taken together, these findings further advance our understanding of cytosine methylation and describe its distribution among a large number of human cell types.
doi:10.1371/journal.pgen.1002389
PMCID: PMC3234221  PMID: 22174693
21.  Distinct DNA methylation changes highly correlated with chronological age in the human brain 
Human Molecular Genetics  2011;20(6):1164-1172.
Methylation at CpG sites is a critical epigenetic modification in mammals. Altered DNA methylation has been suggested to be a central mechanism in development, some disease processes and cellular senescence. Quantifying the extent and identity of epigenetic changes in the aging process is therefore potentially important for understanding longevity and age-related diseases. In the current study, we have examined DNA methylation at >27 000 CpG sites throughout the human genome, in frontal cortex, temporal cortex, pons and cerebellum from 387 human donors between the ages of 1 and 102 years. We identify CpG loci that show a highly significant, consistent correlation between DNA methylation and chronological age. The majority of these loci are within CpG islands and there is a positive correlation between age and DNA methylation level. Lastly, we show that the CpG sites where the DNA methylation level is significantly associated with age are physically close to genes involved in DNA binding and regulation of transcription. This suggests that specific age-related DNA methylation changes may have quite a broad impact on gene expression in the human brain.
doi:10.1093/hmg/ddq561
PMCID: PMC3043665  PMID: 21216877
22.  Discovering Cooperative Relationships of Chromatin Modifications in Human T Cells Based on a Proposed Closeness Measure 
PLoS ONE  2010;5(12):e14219.
Background
Eukaryotic transcription is accompanied by combinatorial chromatin modifications that serve as functional epigenetic markers. Composition of chromatin modifications specifies histone codes that regulate the associated gene. Discovering novel chromatin regulatory relationships is of general interest.
Methodology/Principal Findings
Based on the hypothesis that interactions among chromatin modifications influence CpG methylation, we present a closeness measure to characterize the regulatory interactions of epigenomic features. The closeness measure is applied to genome-wide CpG methylation and histone modification datasets in human CD4+ T cells to select a subset of potential features. To uncover epigenomic and genomic patterns, CpG loci are clustered into nine modules associated with distinct chromatin and genomic signatures based on terms of biological function. We then performed Bayesian network inference to uncover inherent regulatory relationships from the feature-selected closeness measure profile and all nine module-specific profiles, respectively. The global and module-specific networks exhibit topological proximity and modularity. We found that the regulatory patterns of chromatin modifications differ significantly across modules and that distinct patterns are related to specific transcriptional levels and biological function. DNA methylation and genomic features are found to have little regulatory function. The regulatory relationships were partly validated by literature reviews. We also used partial correlation analysis in other cells to verify novel regulatory relationships.
Conclusions/Significance
The interactions among chromatin modifications and genomic elements characterized by a closeness measure help elucidate cooperative patterns of chromatin modification in transcriptional regulation and help decipher complex histone codes.
doi:10.1371/journal.pone.0014219
PMCID: PMC2997069  PMID: 21151929
23.  Methylation detection oligonucleotide microarray analysis: a high-resolution method for detection of CpG island methylation 
Nucleic Acids Research  2009;37(12):e89.
Methylation of CpG islands associated with genes can affect the expression of the proximal gene, and methylation of non-associated CpG islands correlates with genomic instability. This epigenetic modification has been shown to be important in many processes, from development to disease, including cancer. We report the development of a novel high-resolution microarray that detects the methylation status of over 25 000 CpG islands in the human genome. Experiments were performed to demonstrate low system noise in the methodology and that the array probes have a high signal-to-noise ratio. Methylation measurements between different cell lines were validated, demonstrating the accuracy of measurement. We then identified alterations in CpG islands, both those associated with gene promoters and non-promoter-associated islands, in a set of breast and ovarian tumors. We demonstrate that this methodology accurately identifies methylation profiles in cancer and that, in principle, it can differentiate any CpG methylation alterations and can be adapted to analyze other species.
doi:10.1093/nar/gkp413
PMCID: PMC2709589  PMID: 19474344
24.  Bioinformatic interrogation of expression array data to identify nutritionally regulated genes potentially modulated by DNA methylation 
Genes & Nutrition  2008;3(3-4):167-171.
DNA methylation occurs at CpG dinucleotide sites within the genome and is recognised as one of the mechanisms involved in regulation of gene expression. CpG sites are relatively underrepresented in the mammalian genome, but occur densely in regions called CpG islands (CGIs). CGIs located in the promoters of genes inhibit transcription when methylated by impeding transcription factor binding. Due to the malleable nature of DNA methylation, environmental factors are able to influence promoter CGI methylation patterns and thus influence gene expression. Recent studies have provided evidence that nutrition (and other environmental exposures) can cause altered CGI methylation but, with a few exceptions, the genes influenced by these exposures remain largely unknown. Here we describe a novel bioinformatics approach for the analysis of gene expression microarray data designed to identify regulatory sites within promoters of differentially expressed genes that may be influenced by changes in DNA methylation.
doi:10.1007/s12263-008-0095-0
PMCID: PMC2593010  PMID: 19034551
Bioinformatics; CpG islands; DNA methylation; Gene expression; In silico promoter analysis; Transcription factor binding sites
25.  Genome-Wide Analysis of DNA Methylation in Human Amnion 
The Scientific World Journal  2013;2013:678156.
The amnion is a specialized tissue in contact with the amniotic fluid, which is in a constantly changing state. To investigate the importance of epigenetic events in this tissue in the physiology and pathophysiology of pregnancy, we performed genome-wide DNA methylation profiling of human amnion from term (with and without labor) and preterm deliveries. Using the Illumina Infinium HumanMethylation27 BeadChip, we identified genes exhibiting differential methylation associated with normal labor and preterm birth. Functional analysis of the differentially methylated genes revealed biologically relevant enriched gene sets. Bisulfite sequencing analysis of the promoter region of the oxytocin receptor (OXTR) gene detected two CpG dinucleotides showing significant methylation differences among the three groups of samples. Hypermethylation of the CpG island of the solute carrier family 30 member 3 (SLC30A3) gene in preterm amnion was confirmed by methylation-specific PCR. This work provides preliminary evidence that DNA methylation changes in the amnion may be at least partially involved in the physiological process of labor and the etiology of preterm birth and suggests that DNA methylation profiles, in combination with other biological data, may provide valuable insight into the mechanisms underlying normal and pathological pregnancies.
doi:10.1155/2013/678156
PMCID: PMC3590748  PMID: 23533356