
Results 1-20 (20)

1.  Regions of focal DNA hypermethylation and long-range hypomethylation in colorectal cancer coincide with nuclear lamina–associated domains 
Nature genetics  2011;44(1):40-46.
Extensive changes in DNA methylation are common in cancer and may contribute to oncogenesis through transcriptional silencing of tumor-suppressor genes [1]. Genome-scale studies have yielded important insights into these changes [2-5] but have focused on CpG islands or gene promoters. We used whole-genome bisulfite sequencing (bisulfite-seq) to comprehensively profile a primary human colorectal tumor and adjacent normal colon tissue at single-basepair resolution. Regions of focal hypermethylation in the tumor were located primarily at CpG islands and were concentrated within regions of long-range (>100 kb) hypomethylation. These hypomethylated domains covered nearly half of the genome and coincided with late replication and attachment to the nuclear lamina in human cell lines. We confirmed the confluence of hypermethylation and hypomethylation within these domains in 25 diverse colorectal tumors and matched adjacent tissue. We propose that widespread DNA methylation changes in cancer are linked to silencing programs orchestrated by the three-dimensional organization of chromatin within the nucleus.
PMCID: PMC4309644  PMID: 22120008
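As an illustration of what "long-range (>100 kb) hypomethylation" means operationally, the following is a minimal sketch that scans binned methylation averages for contiguous low-methylation runs longer than 100 kb. The bin size, the 0.6 threshold, and the function name are assumptions chosen for the example; this is not the segmentation method used in the paper.

```python
def hypomethylated_domains(bin_means, bin_size=10_000, threshold=0.6,
                           min_length=100_000):
    """Return (start, end) coordinates of runs of low-methylation bins.

    bin_means  -- mean methylation (0..1) per fixed-size genomic bin
    bin_size   -- bin width in base pairs (assumed value)
    threshold  -- bins below this are called hypomethylated (assumed value)
    min_length -- a run is reported only if it is longer than this (bp)
    """
    domains = []
    run_start = None
    # A trailing None acts as a sentinel that closes a run ending at the last bin.
    for i, value in enumerate(list(bin_means) + [None]):
        is_low = value is not None and value < threshold
        if is_low and run_start is None:
            run_start = i
        elif not is_low and run_start is not None:
            if (i - run_start) * bin_size > min_length:
                domains.append((run_start * bin_size, i * bin_size))
            run_start = None
    return domains


# Toy example: an 11-bin (110 kb) low run is reported, a 5-bin (50 kb) run is not.
toy_bins = [0.8] * 2 + [0.4] * 11 + [0.8] * 3 + [0.4] * 5
domains = hypomethylated_domains(toy_bins)
```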
2.  Multiscale Representation of Genomic Signals 
Nature methods  2014;11(6):689-694.
Genomic information is encoded on a wide range of distance scales, ranging from tens of base pairs to megabases. We developed a multiscale framework to analyze and visualize the information content of genomic signals. Different types of signals, such as GC content or DNA methylation, are characterized by distinct patterns of signal enrichment or depletion across scales spanning several orders of magnitude. These patterns are associated with a variety of genomic annotations, including genes, nuclear lamina associated domains, and repeat elements. By integrating the information across all scales, as compared to using any single scale, we demonstrate improved prediction of gene expression from Polymerase II chromatin immunoprecipitation sequencing (ChIP-seq) measurements and we observed that gene expression differences in colorectal cancer are not most strongly related to gene body methylation, but rather to methylation patterns that extend beyond the single-gene scale.
PMCID: PMC4040162  PMID: 24727652
3.  Global loss of DNA methylation uncovers intronic enhancers in genes showing expression changes 
Genome Biology  2014;15(9):469.
Gene expression is epigenetically regulated by a combination of histone modifications and methylation of CpG dinucleotides in promoters. In normal cells, CpG-rich promoters are typically unmethylated, marked with histone modifications such as H3K4me3, and are highly active. During neoplastic transformation, CpG dinucleotides of CG-rich promoters become aberrantly methylated, corresponding with the removal of active histone modifications and transcriptional silencing. Outside of promoter regions, distal enhancers play a major role in the cell type-specific regulation of gene expression. Enhancers, which function by bringing activating complexes to promoters through chromosomal looping, are also modulated by a combination of DNA methylation and histone modifications.
Here we use HCT116 colorectal cancer cells with and without mutations in DNA methyltransferases, the latter of which results in a 95% reduction in global DNA methylation levels. These cells are used to study the relationship between DNA methylation, histone modifications, and gene expression. We find that the loss of DNA methylation is not sufficient to reactivate most of the silenced promoters. In contrast, the removal of DNA methylation results in the activation of a large number of enhancer regions as determined by the acquisition of active histone marks.
Although the transcriptome is largely unaffected by the loss of DNA methylation, we identify two distinct mechanisms resulting in the upregulation of distinct sets of genes. One is a direct result of DNA methylation loss at a set of promoter regions and the other is due to the presence of new intragenic enhancers.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0469-0) contains supplementary material, which is available to authorized users.
PMCID: PMC4203885  PMID: 25239471
4.  Characterizing the genetic basis of methylome diversity in histologically normal human lung tissue 
Nature communications  2014;5:3365.
The genetic regulation of the human epigenome is not fully appreciated. Here we describe the effects of genetic variants on the DNA methylome in human lung based on methylation-quantitative trait loci (meQTL) analyses. We report 34,304 cis- and 585 trans-meQTLs, a genetic-epigenetic interaction of surprising magnitude, including a regulatory hotspot. These findings are replicated in both breast and kidney tissues and show distinct patterns: cis-meQTLs mostly localize to CpG sites outside of genes, promoters, and CpG islands (CGIs), while trans-meQTLs are over-represented in promoter CGIs. meQTL SNPs are enriched in CTCF binding sites, DNaseI hypersensitivity regions and histone marks. Importantly, 4 of the 5 established lung cancer risk loci in European ancestry are cis-meQTLs and, in aggregate, cis-meQTLs are enriched for lung cancer risk in a genome-wide analysis of 11,587 subjects. Thus, inherited genetic variation may affect lung carcinogenesis by regulating the human methylome.
PMCID: PMC3982882  PMID: 24572595
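A methylation quantitative trait locus test asks whether methylation at a CpG site changes with genotype. The sketch below fits a simple least-squares line of methylation level on allele dosage (0, 1, or 2 copies). It assumes an additive model with no covariates and no multiple-testing correction, so it illustrates the concept only and is not the analysis performed in the study.

```python
def meqtl_fit(dosages, methylation):
    """Least-squares fit of methylation (0..1) on allele dosage (0/1/2).

    Returns (slope, intercept); the slope is the estimated change in
    methylation per additional copy of the allele.
    """
    if len(dosages) != len(methylation) or not dosages:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_m = sum(methylation) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("genotype has no variation in this sample")
    sxy = sum((g - mean_g) * (m - mean_m) for g, m in zip(dosages, methylation))
    slope = sxy / sxx
    return slope, mean_m - slope * mean_g


# Toy data: methylation rises by 0.2 per allele copy.
slope, intercept = meqtl_fit([0, 1, 2, 0, 1, 2], [0.2, 0.4, 0.6, 0.2, 0.4, 0.6])
```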
6.  Integrated Transcriptomic and Epigenomic Analysis of Primary Human Lung Epithelial Cell Differentiation 
PLoS Genetics  2013;9(6):e1003513.
Elucidation of the epigenetic basis for cell-type specific gene regulation is key to gaining a full understanding of how the distinct phenotypes of differentiated cells are achieved and maintained. Here we examined how epigenetic changes are integrated with transcriptional activation to determine cell phenotype during differentiation. We performed epigenomic profiling in conjunction with transcriptomic profiling using in vitro differentiation of human primary alveolar epithelial cells (AEC). This model recapitulates an in vivo process in which AEC transition from one differentiated cell type to another during regeneration following lung injury. Interrogation of histone marks over time revealed enrichment of specific transcription factor binding motifs within regions of changing chromatin structure. Cross-referencing of these motifs with pathways showing transcriptional changes revealed known regulatory pathways of distal alveolar differentiation, such as the WNT and transforming growth factor beta (TGFB) pathways, and putative novel regulators of adult AEC differentiation including hepatocyte nuclear factor 4 alpha (HNF4A), and the retinoid X receptor (RXR) signaling pathways. Inhibition of the RXR pathway confirmed its functional relevance for alveolar differentiation. Our incorporation of epigenetic data allowed specific identification of transcription factors that are potential direct upstream regulators of the differentiation process, demonstrating the power of this approach. Integration of epigenomic data with transcriptomic profiling has broad application for the identification of regulatory pathways in other models of differentiation.
Author Summary
Understanding the role of epigenetic control of gene expression is critical to the full description of biological processes, such as development and regeneration. Herein we utilize the differentiation of cells from the distal lung to gain insight into the correlation between the epigenetic landscape, molecular signaling events, and eventual changes in transcription and phenotype. We found that by integrating epigenetic profiling with whole genome transcriptomic data we were able to determine which molecular signaling events were activated and repressed during adult alveolar epithelial cell differentiation, and we identified epigenetic changes that contributed to these changes. Furthermore, we validated the role of one of these predicted but not previously identified pathways, retinoid X receptor signaling, in this process.
PMCID: PMC3688557  PMID: 23818859
8.  Bis-SNP: Combined DNA methylation and SNP calling for Bisulfite-seq data 
Genome Biology  2012;13(7):R61.
Bisulfite treatment of DNA followed by high-throughput sequencing (Bisulfite-seq) is an important method for studying DNA methylation and epigenetic gene regulation, yet current software tools do not adequately address single nucleotide polymorphisms (SNPs). Identifying SNPs is important for accurate quantification of methylation levels and for identification of allele-specific epigenetic events such as imprinting. We have developed a model-based bisulfite SNP caller, Bis-SNP, that results in substantially better SNP calls than existing methods, thereby improving methylation estimates. At an average 30× genomic coverage, Bis-SNP correctly identified 96% of SNPs using the default high-stringency settings. The open-source package is available at
PMCID: PMC3491382  PMID: 22784381
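Why SNPs matter for methylation estimates can be shown with a small count-based sketch. After bisulfite conversion an unmethylated C is read as T, so a genomic C>T variant looks like an unmethylated site on the converted strand; reads from the opposite strand, which show G for a true C and A for a true T, can reveal the difference. The function below is a simplified heuristic with an assumed cutoff of 0.2, not the probabilistic model implemented in Bis-SNP.

```python
def methylation_call(c_reads, t_reads, opp_g_reads, opp_a_reads,
                     snp_fraction=0.2):
    """Estimate methylation at a reference cytosine and flag a possible C>T SNP.

    c_reads, t_reads         -- C and T counts on the bisulfite-converted strand
    opp_g_reads, opp_a_reads -- G and A counts on the opposite strand
    snp_fraction             -- assumed cutoff for the A fraction that
                                raises the SNP flag
    """
    opp_total = opp_g_reads + opp_a_reads
    snp_suspected = opp_total > 0 and opp_a_reads / opp_total >= snp_fraction
    total = c_reads + t_reads
    level = c_reads / total if total > 0 else None
    return {"methylation": level, "snp_suspected": snp_suspected}


# A mostly methylated site with no evidence of a variant.
clean_site = methylation_call(8, 2, 10, 0)
# All T on the converted strand and all A opposite: likely a T allele,
# so the apparent 0% methylation should not be trusted.
variant_site = methylation_call(0, 10, 0, 10)
```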
9.  FunciSNP: an R/bioconductor tool integrating functional non-coding data sets with genetic association studies to identify candidate regulatory SNPs 
Nucleic Acids Research  2012;40(18):e139.
Single nucleotide polymorphisms (SNPs) are increasingly used to tag genetic loci associated with phenotypes such as risk of complex diseases. Technically, this is done genome-wide without prior restriction or knowledge of biological feasibility in scans referred to as genome-wide association studies (GWAS). Depending on the linkage disequilibrium (LD) structure at a particular locus, such tagSNPs may be surrogates for many thousands of other SNPs, and it is difficult to distinguish those that may play a functional role in the phenotype from those simply genetically linked. Because a large proportion of tagSNPs have been identified within non-coding regions of the genome, distinguishing functional from non-functional SNPs has been an even greater challenge. A strategy was recently proposed that prioritizes surrogate SNPs based on non-coding chromatin and epigenomic mapping techniques that have become feasible with the advent of massively parallel sequencing. Here, we introduce an R/Bioconductor software package that enables the identification of candidate functional SNPs by integrating information from tagSNP locations, lists of linked SNPs from the 1000 genomes project and locations of chromatin features which may have functional significance. Availability: FunciSNP is available from Bioconductor (
PMCID: PMC3467035  PMID: 22684628
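The core operation described here, intersecting linked SNPs with chromatin features, reduces to an interval overlap test. The following is a minimal sketch in Python with hypothetical SNP identifiers, coordinates, and feature names; it does not use or mirror the FunciSNP R API.

```python
def snps_in_features(snps, features):
    """Return (snp_id, feature_name) pairs where the SNP lies inside the feature.

    snps     -- iterable of (snp_id, chrom, position)
    features -- iterable of (chrom, start, end, name), half-open [start, end)
    """
    hits = []
    for snp_id, chrom, pos in snps:
        for f_chrom, start, end, name in features:
            if chrom == f_chrom and start <= pos < end:
                hits.append((snp_id, name))
    return hits


# Hypothetical inputs for illustration only.
linked_snps = [("snpA", "chr8", 1500), ("snpB", "chr8", 5000), ("snpC", "chr1", 1500)]
chromatin_features = [("chr8", 1000, 2000, "enhancer_1"), ("chr8", 1400, 1600, "DHS_1")]
candidates = snps_in_features(linked_snps, chromatin_features)
```

The nested loop is quadratic, which is fine for a handful of loci; a genome-wide analysis would normally use sorted intervals or an interval tree.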
10.  Dynamic Nucleosome-Depleted Regions at Androgen Receptor Enhancers in the Absence of Ligand in Prostate Cancer Cells
Molecular and Cellular Biology  2011;31(23):4648-4662.
Nucleosome positioning at transcription start sites is known to regulate gene expression by altering DNA accessibility to transcription factors; however, its role at enhancers is poorly understood. We investigated nucleosome positioning at the androgen receptor (AR) enhancers of TMPRSS2, KLK2, and KLK3/PSA in prostate cancer cells. Surprisingly, a population of enhancer modules in androgen-deprived cultures showed nucleosome-depleted regions (NDRs) in all three loci. Under androgen-deprived conditions, NDRs at the TMPRSS2 enhancer were maintained by the pioneer AR transcriptional collaborator GATA-2. Androgen treatment resulted in AR occupancy, an increased number of enhancer modules with NDRs without changes in footprint width, increased levels of histone H3 acetylation (AcH3), and dimethylation (H3K4me2) at nucleosomes flanking the NDRs. Our data suggest that, in the absence of ligand, AR enhancers exist in an equilibrium in which a percentage of modules are occupied by nucleosomes while others display NDRs. We propose that androgen treatment leads to the disruption of the equilibrium toward a nucleosome-depleted state, rather than to enhancer de novo “remodeling.” This allows the recruitment of histone modifiers, chromatin remodelers, and ultimately gene activation. The “receptive” state described here could help explain AR signaling activation under very low ligand concentrations.
PMCID: PMC3232925  PMID: 21969603
11.  Genome-wide Mapping of Nucleosome Positioning and DNA Methylation Within Individual DNA Molecules 
DNA methylation and nucleosome positioning work together to generate chromatin structures that regulate gene expression. Nucleosomes are typically mapped using nuclease digestion requiring significant amounts of material and varying enzyme concentrations. We have developed a method that uses a GpC methyltransferase (M.CviPI) and next generation sequencing to footprint nucleosome positioning genome-wide using less than 1 million cells, which does not suffer from sequence based biases associated with MNase digestion and retains endogenous DNA methylation information. Using a novel bioinformatics pipeline we identify chromatin configurations associated with a variety of functional genomic loci including distinct promoter types, enhancers, insulators, X-inactivated and imprinted genes. Importantly, DNA methylation and nucleosome positioning information are obtained from the same DNA molecule, giving the first genome-wide DNA methylation and nucleosome positioning correlation at the single molecule level that can be used to monitor disease progression and response to therapy.
PMCID: PMC3630584
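Because the GpC methyltransferase marks accessible DNA while endogenous methylation occurs at CpG sites, reads from this kind of assay are usually interpreted by the sequence context of each cytosine. The sketch below assumes the common convention of treating GCH (H = A, C, or T) as the accessibility readout, HCG as the endogenous methylation readout, and GCG as ambiguous; the abstract itself does not spell out these rules.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at index i of seq by its flanking bases.

    Returns 'GCH' (accessibility readout), 'HCG' (endogenous methylation
    readout), 'GCG' (ambiguous), or 'other'.
    """
    seq = seq.upper()
    if i <= 0 or i >= len(seq) - 1 or seq[i] != "C":
        return "other"
    prev_base, next_base = seq[i - 1], seq[i + 1]
    if prev_base == "G" and next_base == "G":
        return "GCG"
    if prev_base == "G" and next_base in "ACT":
        return "GCH"
    if prev_base in "ACT" and next_base == "G":
        return "HCG"
    return "other"


# Toy sequence with one cytosine in each informative context.
toy_seq = "AGCATCGAGCG"
contexts = {i: cytosine_context(toy_seq, i) for i, b in enumerate(toy_seq) if b == "C"}
```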
12.  Genome-wide Runx2 occupancy in prostate cancer cells suggests a role in regulating secretion 
Nucleic Acids Research  2011;40(8):3538-3547.
Runx2 is a metastatic transcription factor (TF) increasingly expressed during prostate cancer (PCa) progression. Using PCa cells conditionally expressing Runx2, we previously identified Runx2-regulated genes with known roles in epithelial–mesenchymal transition, invasiveness, angiogenesis, extracellular matrix proteolysis and osteolysis. To map Runx2-occupied regions (R2ORs) in PCa cells, we first analyzed regions predicted to bind Runx2 based on the expression data, and found that recruitment to sites upstream of the KLK2 and CSF2 genes was cyclical over time. Genome-wide ChIP-seq analysis at a time of maximum occupancy at these sites revealed 1603 high-confidence R2ORs, enriched with cognate motifs for RUNX, GATA and ETS TFs. The R2ORs were distributed with little regard to annotated transcription start sites (TSSs), mainly in introns and intergenic regions. Runx2-upregulated genes, however, displayed enrichment for R2ORs within 40 kb of their TSSs. The main annotated functions enriched in 98 Runx2-upregulated genes with nearby R2ORs were related to invasiveness and membrane trafficking/secretion. Indeed, using SDS–PAGE, mass spectrometry and western analyses, we show that Runx2 enhances secretion of several proteins, including fatty acid synthase and metastasis-associated laminins. Thus, combined analysis of Runx2's transcriptome and genomic occupancy in PCa cells led to defining its novel role in regulating protein secretion.
PMCID: PMC3333873  PMID: 22187159
13.  H2A.Z Maintenance During Mitosis Reveals Nucleosome Shifting on Mitotically Silenced Genes 
Molecular cell  2010;39(6):901-911.
Profound chromatin changes occur during mitosis to allow for gene silencing and chromosome segregation followed by re-activation of memorized transcription states in daughter cells. Using genome-wide sequencing, we found that H2A.Z-containing +1 nucleosomes of active genes shift upstream to occupy TSSs during mitosis, significantly reducing nucleosome-depleted regions. Single molecule analysis confirmed nucleosome shifting and demonstrated that mitotic shifting is specific to active genes that are silenced during mitosis and thus is not seen on promoters that are silenced by methylation or on mitotically expressed genes. Using the GRP78 promoter as a model, we found H3K4 tri-methylation is also maintained while other indicators of active chromatin are lost and expression is decreased. These key changes provide a potential mechanism for rapid silencing and re-activation of genes during the cell cycle.
PMCID: PMC2947862  PMID: 20864037
14.  Identification of a CpG Island Methylator Phenotype that Defines a Distinct Subgroup of Glioma 
Cancer cell  2010;17(5):510-522.
We have profiled promoter DNA methylation alterations in 272 glioblastoma tumors in the context of The Cancer Genome Atlas (TCGA). We found that a distinct subset of samples displays concerted hypermethylation at a large number of loci, indicating the existence of a glioma-CpG Island Methylator Phenotype (G-CIMP). We validated G-CIMP in a set of non-TCGA glioblastomas and low-grade gliomas. G-CIMP tumors belong to the Proneural subgroup, are more prevalent among low-grade gliomas, display distinct copy-number alterations and are tightly associated with IDH1 somatic mutations. Patients with G-CIMP tumors are younger at the time of diagnosis and experience significantly improved outcome. These findings identify G-CIMP as a distinct subset of human gliomas on molecular and clinical grounds.
PMCID: PMC2872684  PMID: 20399149
DNA methylation; glioma; CIMP; IDH1; TCGA
15.  Functional Enhancers at the Gene-Poor 8q24 Cancer-Linked Locus 
PLoS Genetics  2009;5(8):e1000597.
Multiple discrete regions at 8q24 were recently shown to contain alleles that predispose to many cancers including prostate, breast, and colon. These regions are far from any annotated gene and their biological activities have been unknown. Here we profiled a 5-megabase chromatin segment encompassing all the risk regions for RNA expression, histone modifications, and locations occupied by RNA polymerase II and androgen receptor (AR). This led to the identification of several transcriptional enhancers, which were verified using reporter assays. Two enhancers in one risk region were occupied by AR and responded to androgen treatment; one contained a single nucleotide polymorphism (rs11986220) that resides within a FoxA1 binding site, with the prostate cancer risk allele facilitating both stronger FoxA1 binding and stronger androgen responsiveness. The study reported here exemplifies an approach that may be applied to any risk-associated allele in non-protein coding regions as it emerges from genome-wide association studies to better understand the genetic predisposition of complex diseases.
Author Summary
Genome-wide scans of inherited genetic variation in the normal population have recently identified many sites (loci) associated with the predisposition to complex diseases such as cancer. Some of these cancer-associated loci, however, are devoid of genes (situated in so-called “gene deserts”) and the mechanism(s) of the association are not readily apparent. In the work reported here, we show that loci associated with several cancers in a gene desert found at chromosomal area 8q24 have embedded regulatory sequences affecting gene expression as enhancers, and in one case this activity is modulated by genetic variation. The results provide insight into the mechanism(s) governing genetic cancer risk.
PMCID: PMC2717370  PMID: 19680443
16.  Genomic Androgen Receptor-Occupied Regions with Different Functions, Defined by Histone Acetylation, Coregulators and Transcriptional Capacity 
PLoS ONE  2008;3(11):e3645.
Background
The androgen receptor (AR) is a steroid-activated transcription factor that binds at specific DNA locations and plays a key role in the etiology of prostate cancer. While numerous studies have identified a clear connection between AR binding and expression of target genes for a limited number of loci, high-throughput elucidation of these sites allows for a deeper understanding of the complexities of this process.
Methodology/Principal Findings
We have mapped 189 AR occupied regions (ARORs) and 1,388 histone H3 acetylation (AcH3) loci to a 3% continuous stretch of human genomic DNA using chromatin immunoprecipitation (ChIP) microarray analysis. Of 62 highly reproducible ARORs, 32 (52%) were also marked by AcH3. While the number of ARORs detected in prostate cancer cells exceeded the number of nearby DHT-responsive genes, the AcH3 mark defined a subclass of ARORs much more highly associated with such genes – 12% of the genes flanking AcH3+ARORs were DHT-responsive, compared to only 1% of genes flanking AcH3−ARORs. Most ARORs contained enhancer activities as detected in luciferase reporter assays. Analysis of the AROR sequences, followed by site-directed ChIP, identified binding sites for AR transcriptional coregulators FoxA1, CEBPβ, NFI and GATA2, which had diverse effects on endogenous AR target gene expression levels in siRNA knockout experiments.
Conclusions/Significance
We suggest that only some ARORs function under the given physiological conditions, utilizing diverse mechanisms. This diversity points to differential regulation of gene expression by the same transcription factor related to the chromatin structure.
PMCID: PMC2577007  PMID: 18997859
17.  Global analysis of patterns of gene expression during Drosophila embryogenesis 
Genome Biology  2007;8(7):R145.
Embryonic expression patterns for 6,003 (44%) of the 13,659 protein-coding genes identified in the Drosophila melanogaster genome were documented, of which 40% show tissue-restricted expression.
Cell and tissue specific gene expression is a defining feature of embryonic development in multi-cellular organisms. However, the range of gene expression patterns, the extent of the correlation of expression with function, and the classes of genes whose spatial expression are tightly regulated have been unclear due to the lack of an unbiased, genome-wide survey of gene expression patterns.
We determined and documented embryonic expression patterns for 6,003 (44%) of the 13,659 protein-coding genes identified in the Drosophila melanogaster genome with over 70,000 images and controlled vocabulary annotations. Individual expression patterns are extraordinarily diverse, but by supplementing qualitative in situ hybridization data with quantitative microarray time-course data using a hybrid clustering strategy, we identify groups of genes with similar expression. Of 4,496 genes with detectable expression in the embryo, 2,549 (57%) fall into 10 clusters representing broad expression patterns. The remaining 1,947 (43%) genes fall into 29 clusters representing restricted expression, 20% patterned as early as blastoderm, with the majority restricted to differentiated cell types, such as epithelia, nervous system, or muscle. We investigate the relationship between expression clusters and known molecular and cellular-physiological functions.
Nearly 60% of the genes with detectable expression exhibit broad patterns reflecting quantitative rather than qualitative differences between tissues. The other 40% show tissue-restricted expression; the expression patterns of over 1,500 of these genes are documented here for the first time. Within each of these categories, we identified clusters of genes associated with particular cellular and developmental functions.
PMCID: PMC2323238  PMID: 17645804
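The percentages quoted in this abstract follow directly from the reported gene counts, as the short check below shows. The variable names are illustrative; only the counts come from the text.

```python
# Counts taken from the abstract.
genes_total = 13_659          # protein-coding genes in the genome
genes_documented = 6_003      # genes with documented expression patterns
genes_expressed = 4_496       # genes with detectable embryonic expression
genes_broad = 2_549           # genes in the 10 broad-expression clusters
genes_restricted = 1_947      # genes in the 29 restricted-expression clusters


def percent(part, whole):
    """Percentage of whole represented by part, rounded to a whole number."""
    return round(100 * part / whole)


documented_pct = percent(genes_documented, genes_total)
broad_pct = percent(genes_broad, genes_expressed)
restricted_pct = percent(genes_restricted, genes_expressed)
```

The broad and restricted groups together account for every gene with detectable expression, which is why the two shares sum to 100%.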
18.  Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura 
Genome Biology  2004;5(9):R61.
27 predicted gene-regulatory regions in the Drosophila melanogaster genome were analyzed in vivo, confirming 15 active enhancer regions. A comparison with Drosophila pseudoobscura sequences revealed that conservation of binding-site clusters accurately discriminates functional regions from non-functional ones.
The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters.
We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns.
Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
PMCID: PMC522868  PMID: 15345045
19.  Annotation of the Drosophila melanogaster euchromatic genome: a systematic review 
Genome Biology  2002;3(12):research0083.1-83.22.
The recent completion of the Drosophila melanogaster genomic sequence to high quality and the availability of a greatly expanded set of Drosophila cDNA sequences, aligning to 78% of the predicted euchromatic genes, afforded FlyBase the opportunity to significantly improve genomic annotations. We made the annotation process more rigorous by inspecting each gene visually, utilizing a comprehensive set of curation rules, requiring traceable evidence for each gene model, and comparing each predicted peptide to SWISS-PROT and TrEMBL sequences.
Although the number of predicted protein-coding genes in Drosophila remains essentially unchanged, the revised annotation significantly improves gene models, resulting in structural changes to 85% of the transcripts and 45% of the predicted proteins. We annotated transposable elements and non-protein-coding RNAs as new features, and extended the annotation of untranslated (UTR) sequences and alternative transcripts to include more than 70% and 20% of genes, respectively. Finally, cDNA sequence provided evidence for dicistronic transcripts, neighboring genes with overlapping UTRs on the same DNA sequence strand, alternatively spliced genes that encode distinct, non-overlapping peptides, and numerous nested genes.
Identification of so many unusual gene models not only suggests that some mechanisms for gene regulation are more prevalent than previously believed, but also underscores the complex challenges of eukaryotic gene prediction. At present, experimental data and human curation remain essential to generate high-quality genome annotations.
PMCID: PMC151185  PMID: 12537572
20.  Functional annotation of colon cancer risk SNPs 
Nature Communications  2014;5:5114.
Colorectal cancer (CRC) is a leading cause of cancer-related deaths in the United States. Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with increased risk for CRC. A molecular understanding of the functional consequences of this genetic variation has been complicated because each GWAS SNP is a surrogate for hundreds of other SNPs, most of which are located in non-coding regions. Here we use genomic and epigenomic information to test the hypothesis that the GWAS SNPs and/or correlated SNPs are in elements that regulate gene expression, and identify 23 promoters and 28 enhancers. Using gene expression data from normal and tumour cells, we identify 66 putative target genes of the risk-associated enhancers (10 of which were also identified by promoter SNPs). Employing CRISPR nucleases, we delete one risk-associated enhancer and identify genes showing altered expression. We suggest that similar studies be performed to characterize all CRC risk-associated enhancers.
Previous studies identified genetic variants associated with colorectal cancer (CRC), but the functional consequences of these genetic risk factors remain poorly understood. Here, the authors report that CRC risk variants reside in promoters and enhancers and could increase colon cancer risk through gene expression regulation.
PMCID: PMC4200523  PMID: 25268989
