The consistent, proper function of all biological processes relies on the precise spatial and temporal expression of genes (1–3). Development, differentiation, proliferation, apoptosis and even aging are the culmination of cell type-specific and ubiquitous gene expression. Since transcription was first described, researchers have sought to define the molecular mechanisms that regulate it, driven by the belief that understanding the gene expression profiles of normal and disease states will facilitate the discovery of therapeutic targets to alleviate human and animal suffering. These studies have defined several types of cis-acting transcriptional regulators, including promoters, enhancers, insulators and locus control regions (LCRs) (1,4,5), and the trans-acting factors that bind to them. Nonetheless, the relative roles of these regulatory DNA elements have yet to be fully elucidated. The introduction of high-throughput sequencing (6), and the massive amounts of data it generates across entire genomes, has provided a platform from which we may begin to examine global patterns of gene expression and compare these patterns among different cell types to gain a clearer understanding of the molecular mechanisms underlying the dynamic and complex processes of life.
Next-generation sequencing (NGS) has become a popular approach for identifying gene regulatory elements and performing accurate functional analysis (6). NGS of DNA–protein complexes isolated by chromatin immunoprecipitation, a procedure known as ChIP-seq, has allowed for global localization of regulatory elements associated with a specific protein of interest (7–11). Unfortunately, this combined technique is applicable only to known (previously characterized) trans-acting factors and is limited by its requirement for a high-quality, ChIP-grade antibody to isolate the transcription factor (TF) to be analyzed (9). Coupling NGS with DNaseI hypersensitive (DNaseI HS) site mapping, long considered the gold standard for comprehensively identifying the location of various classes of transcriptional regulatory elements, yielded a particularly powerful high-resolution procedure, DNase-seq (12–18). Like ChIP-seq, though, DNase-seq has inherent limitations: it provides only location data and cannot directly characterize function or identify the particular TF(s) associated with a region.
The data obtained from each of these combined-NGS procedures may be analyzed in parallel (along with data obtained from gene expression arrays) to facilitate the identification of bona fide transcriptional regulatory elements. First, though, we must obtain a thorough understanding of the different types of cis-regulatory sequence elements and epigenetic modulatory mechanisms in order to accurately investigate their contributions to spatial and temporal gene expression.
The first genome-wide maps of histone methylation (7) and acetylation marks (19) were generated from human resting CD4+ T cells. Histone modifications associated with gene transcription were designated as active, while those associated with repressed transcription were designated as repressive. Intriguingly, some of the ‘active’ modifications were identified in transcriptionally silent genes (7,20–23), suggesting that these modifications may act more as markers of genes primed for transcriptional activity. Not surprisingly, then, histone modification is not the sole mediator of expression level (24). By performing DNase-seq and hybridization of DNase fragments to microarray chips (DNase-chip), Boyle et al. (25) created a comprehensive genome-wide map of the open chromatin regions in CD4+ T cells. Their analysis of the resulting data sets did not identify a clear correlation between DNaseI HS and levels of gene expression. Shortly thereafter, Xi et al. (26) used DNase-chip to compare six human cell types in order to identify functional cell type-specific and ubiquitous DNaseI HS sites (DHSs). Their examination of 1% of the human genome revealed that cell type-specific DHSs co-localized with cell type-specific gene expression. Recently, Stitzel et al. (27) conducted a genome-wide analysis of DHSs in human islets, and Ling et al. (28) produced a set of detailed, high-quality, genome-wide DNaseI hypersensitivity maps of the mouse liver in vivo. These studies highlight the utility of DNase-seq for systematically uncovering cis-regulatory elements on a genome-wide scale.
In the present study, we performed a genome-wide meta-analysis of DNaseI HS sites identified in 29 different cell types. We sought to determine the relationship between DNaseI HS, histone modifications and gene expression. We found that specific correlations exist between DNaseI HS, gene expression and the amounts of active and repressive histone modifications across different cell types. These correlations displayed four distinct modes (repressive, active, bivalent and primed), reflecting different functions of the chromatin domains. Furthermore, CCCTC-binding factor (CTCF) binding sites were newly identified based on these integrative data. Our findings reveal complex regulation of gene expression mediated by DNaseI hypersensitive chromatin regions and their histone modifications.
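To make the idea of a correlation "across different cell types" concrete, the following minimal sketch computes a Spearman rank correlation between two per-cell-type signals at a single locus. It is an illustration only and is not the statistical procedure used in this study: the choice of Spearman correlation, the function names (`average_ranks`, `pearson`, `spearman`) and all numeric values are assumptions introduced here for the example.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for one locus measured in five cell types
# (invented for illustration; not data from this study).
dhs_signal = [0.2, 1.5, 3.1, 0.9, 2.4]   # DNaseI HS signal
expression = [1.0, 4.2, 9.8, 2.5, 6.0]   # expression of the nearby gene
h3k27me3 = [5.0, 3.0, 1.0, 4.0, 2.0]     # a repressive histone mark

rho_dhs = spearman(dhs_signal, expression)
rho_mark = spearman(h3k27me3, expression)
print(f"DHS vs expression:      rho = {rho_dhs:+.2f}")
print(f"H3K27me3 vs expression: rho = {rho_mark:+.2f}")
```

In this toy example the DNaseI HS signal rises with expression while the repressive mark falls, so the two coefficients have opposite signs. A rank-based measure is used in the sketch because sequencing signal and expression values are rarely on comparable linear scales; other measures could equally be substituted.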