One of the most prominent features of transcriptional enhancers, compared with promoters and insulator elements, is their cell-type-specific activity. These cell-type-specific regulatory interactions play an essential role in establishing cell-type- and developmental-stage-specific gene expression patterns in higher eukaryotes.
Several recent genome-wide expression quantitative trait loci (eQTLs) studies in humans have provided us a first glimpse of regulatory variations in the human population (1–5). Strikingly, about 70–80% of regulatory variants operate in a cell-type-specific manner and are found at larger distances from protein-coding genes, suggesting that a large proportion of these variants could be located in distal enhancers.
In terms of human diseases, a large body of previous studies has uncovered many causal and risk-conferring mutations located in transcriptional enhancers. Examples include thalassemia (6, 7), preaxial polydactyly (8, 9), Hirschsprung's disease (10, 11), cleft lip (12) and prostate cancer (13), among others. At a genome scale, Visel et al. (14) recently performed a meta-analysis of 1200 single nucleotide polymorphisms (SNPs) identified as the most significantly trait- and/or disease-associated variants in a compendium of genome-wide association studies (GWAS) published up to March 2009 (15). Using conservative parameters that tend to overestimate the size of linkage disequilibrium blocks, they found that in 40% of cases (472 of 1170) no known exon overlaps either the linked SNP or its associated haplotype block, suggesting that in more than one-third of cases non-coding sequence variation causally contributes to the traits under investigation. The major classes of non-coding sequences include enhancers, proximal promoters, insulators and non-coding RNAs. Among these, enhancers comprise a large fraction. Therefore, it is likely that many yet-to-be-discovered causal genetic variations reside in enhancers.
Taken together, recent genome-wide mapping of regulatory variants in both healthy and diseased cells has demonstrated the abundance of enhancer sequence variation and its impact on gene expression and disease etiology. Therefore, a comprehensive set of enhancers may facilitate the identification of many causal non-coding variants. To this end, integrating genome-wide enhancer catalogs with GWAS data becomes an effective strategy for linking enhancer mutations with diseases. Likewise, integrating enhancer catalogs with eQTL data will enable us to establish regulatory relationships between enhancers and their target promoters at the systems level.
Transcriptional enhancers are notoriously difficult to map, which hinders studies of their biology and links to diseases. In the past, reporter gene assays, comparative genomics and transcription factor (TF) ChIP-Chip/Seq have been used to experimentally map enhancers. Computational algorithms based on DNA sequence analysis have also been developed to predict enhancers. However, significant challenges remain for the aforementioned approaches, including low throughput, lack of tissue-specific information, high cost and low accuracy. Recently, a number of studies (16–21) have demonstrated that unique chromatin modification patterns associated with enhancer elements can serve as an effective and accurate mark for cell-type-specific enhancers. Compared with previous approaches, this chromatin-signature-based approach is better suited for finding cell-type- and developmental-stage-specific enhancers, since the activity of enhancers is often modulated by chromatin structure in a condition-specific manner.
Towards the goal of a systems-level understanding of cell-type-specific enhancers, we have used cell-type-specific histone modification maps to generate a genome-wide atlas of transcriptional enhancers in three human cell types: B lymphocytes, T lymphocytes and embryonic stem cells (ES cells). We corroborated the set of predicted enhancers using several complementary lines of evidence: overlap with other genomic marks for enhancers, location bias of enhancers towards cell-type-specific genes, and enrichment of cell-type-specific TF binding sites (TFBSs). Our integrative analyses generated a wealth of high-confidence novel enhancers for each cell type. Most importantly, we used our set of predictions to gain insights into enhancer evolution and links to disease. We first examined the connections between enhancers and mobile DNA elements (MEs). We also mapped a compendium of eQTL and GWAS SNPs onto our predicted enhancers. Our analyses led to a number of hypotheses suggesting a role of predicted enhancers in disease etiology. Further, comparative analyses of enhancers from different cell types revealed unique characteristics of ES cell enhancers in terms of their evolutionary history and disease association.
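As an illustration of what mapping SNPs onto predicted enhancers involves computationally, the following is a minimal Python sketch and not the procedure used in this study. It assumes enhancers are given as half-open intervals (chromosome, start, end) that do not overlap one another on the same chromosome, and that each SNP is a single genomic position. The function names `build_index` and `map_snps`, the coordinates and the SNP identifiers are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(enhancers):
    """Index (chrom, start, end) half-open intervals by chromosome.

    Assumption: enhancers on the same chromosome do not overlap.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in enhancers:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        # Keep a parallel list of start coordinates for binary search.
        index[chrom] = ([s for s, _ in intervals], intervals)
    return index


def map_snps(snps, index):
    """Return {snp_id: (chrom, start, end)} for SNPs inside an enhancer."""
    hits = {}
    for snp_id, chrom, pos in snps:
        if chrom not in index:
            continue
        starts, intervals = index[chrom]
        # Rightmost enhancer whose start is <= pos, if any.
        i = bisect_right(starts, pos) - 1
        if i >= 0:
            start, end = intervals[i]
            if start <= pos < end:
                hits[snp_id] = (chrom, start, end)
    return hits


# Hypothetical toy data, for illustration only.
enhancers = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 50, 80)]
snps = [
    ("snpA", "chr1", 150),  # inside the first chr1 enhancer
    ("snpB", "chr1", 200),  # at the excluded end coordinate
    ("snpC", "chr2", 79),   # last base of the chr2 enhancer
    ("snpD", "chr3", 10),   # chromosome with no enhancers
]
hits = map_snps(snps, build_index(enhancers))
```

With sorted start coordinates, each lookup is a binary search, so the cost grows with the number of SNPs times the logarithm of the number of enhancers per chromosome. Overlapping enhancer predictions would need an interval tree or a merge step first, which this sketch does not cover.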