Previous studies attempting to identify critical regulatory elements of membrane protein genes have largely been limited to the study of core promoters, relying primarily on in vitro assays, such as reporter gene assays and EMSA (6
). ChIP techniques have rarely been applied to the study of these genes (108
). When ChIP has been employed, the data have been of limited utility for a number of reasons, including the limited regions analyzed, the time and cost required to complete the studies, and investigator bias in choosing regions of study. High throughput ChIP-based technologies overcome many of these problems, allowing unbiased analyses of large contiguous genomic regions and using cost-effective commercially available platforms, as well as allowing comparisons and classifications of numerous genes in a single experiment.
This is the first report to use ChIP-chip to study, in an unbiased manner, both GATA-1 and NF-E2 binding in erythroid cell-expressed genes at numerous loci spread throughout the human genome. This approach revealed that the majority of regions of GATA-1 and NF-E2 occupancy are not at core erythrocyte promoters, as might have been expected, but are located apart from the core promoter region, primarily in introns and frequently in intron 1. In vivo binding of GATA-1 and NF-E2 outside core promoter regions has been described in erythrocyte enhancers, most notably the β-globin LCR, but extensive non-promoter binding throughout numerous genomic loci was unexpected. High throughput ChIP-based studies of other transcription factors have also demonstrated that the majority of binding sites of some, but not all, regulatory proteins are not necessarily located at promoters or CpG islands. Together, these data add to the growing body of evidence supporting the critical role of long-range mechanisms, such as chromatin looping, in gene regulation (9
Genome-wide ChIP-based studies provide additional data supporting the role of long-range interactions in gene regulation. Several reports utilizing high throughput ChIP-based techniques have found multiple regions of factor occupancy in vivo that do not contain a consensus binding site in the corresponding DNA (4
). Similarly, not all regions of GATA-1 occupancy in membrane protein genes contain a consensus GATA-1 binding site in the corresponding DNA. GATA-1 has been shown to play a critical role in long-range gene interactions at the c-kit and β-globin loci via formation of chromatin loops (39
), a possible mechanism of action in regions of GATA-1 and NF-E2 binding that lack consensus GATA-1 or NF-E2 DNA binding sites.
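Whether an occupied region contains a consensus site can be checked with a simple scan of both strands. The following is a minimal sketch, not the method used in this study: it assumes the commonly cited GATA consensus WGATAR, and the function names (`find_sites`, `reverse_complement`, `iupac_to_regex`) are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

# Complement table that also handles the ambiguity codes above.
COMPLEMENT = str.maketrans("ACGTWRSYN", "TGCAWYSRN")


def reverse_complement(motif):
    """Reverse complement of a sequence or IUPAC motif."""
    return motif.translate(COMPLEMENT)[::-1]


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as WGATAR into a regex string."""
    return "".join(IUPAC[base] for base in motif)


def find_sites(sequence, motif="WGATAR"):
    """Return sorted (start, strand) tuples for every motif match.

    Positions are 0-based on the forward strand. The default motif is an
    assumption (a commonly cited GATA consensus), not taken from this study.
    A lookahead is used so that overlapping matches are all reported.
    """
    seq = sequence.upper()
    hits = set()
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        regex = re.compile("(?=(" + iupac_to_regex(pattern) + "))")
        for match in regex.finditer(seq):
            hits.add((match.start(), strand))
    return sorted(hits)
```

An empty result for the DNA underlying a ChIP-positive region would correspond to the case discussed above, in which occupancy is observed without a consensus site in the corresponding sequence.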
Many different strategies have been employed to precisely identify and/or predict which GATA-1 consensus binding sites in genomic DNA interact with the GATA-1 protein in vivo. These have included various in silico and experimental techniques which have demonstrated that GATA-1 binding sites that regulate gene expression during erythropoiesis are under strong selection constraint (10
). As other reports have shown, regions of DNA in membrane protein genes demonstrating GATA-1 binding in vivo were more likely to show evolutionary conservation across species. However, even though these regions were identified as having conservation scores predictive of a cis-regulatory element, attempts to refine sites of GATA-1 binding by using in vitro binding with EMSA were unsuccessful, with no correlation between EMSA binding and PhastCons score. Numerous factors contribute to DNA-protein binding and subsequent erythrocyte gene expression, including cis sequences, concentration and stability of regulatory proteins, and chromatin architecture (7
). It is likely that a complex combination of factors regulates GATA-1 binding to its cognate DNA binding site.
As noted above, several recent studies have detailed binding partners at sites of GATA-1 and NF-E2 binding, as well as the composition of DNA-protein complexes. The GATA-1-associated proteins identified for membrane protein genes, including FOG-1, SCL, and MTA-2, have been described previously. In addition to heterodimerizing with small Maf proteins, p45 NF-E2 has been shown to interact with various proteins, such as MLL2, WWP1, and MCRS2 (2
). With a few exceptions, the proteins and multiprotein regulatory complexes that interact and associate with NF-E2 have not been characterized to the extent that those of GATA-1 have (68
). Finding SCL and MTA-2 at all sites of NF-E2 binding implies that these transcription factors may play a broader role in the regulation of erythroid cell-expressed genes than may previously have been appreciated.
Importantly, these studies revealed a common erythroid cell type-specific chromatin signature located throughout the genomic loci of many erythroid cell-expressed genes. This mark includes H3Me3K4, with cooccupancy of GATA-1, NF-E2, FOG-1, SCL, and MTA-2 proteins, present at approximately a quarter of the GATA-1 binding sites. In the genomic DNA underlying half of these sites are single GATA-1-E-box consensus motifs separated from NF-E2 consensus motifs by 34 to 90 bp. An important goal in understanding erythrocyte gene regulation would be to have a cell type-specific topographic map of chromatin architecture with the interacting regulatory proteins, as well as the primary sequence and epigenetic configuration of the associated genomic DNA.
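The spacing relationship described above can be expressed as a simple pairing of motif coordinates. The sketch below is illustrative only and assumes that motif positions are already available as half-open (start, end) intervals and that the 34 to 90 bp separation is measured edge to edge; the text does not state how the distance was measured, and the function name `paired_motifs` is hypothetical.

```python
def paired_motifs(composite_sites, nfe2_sites, min_gap=34, max_gap=90):
    """Pair GATA-1-E-box motifs with NF-E2 motifs by their separation.

    Each site is a half-open (start, end) interval in base pairs.
    The gap is the edge-to-edge distance between the two intervals
    (an assumption). Overlapping motifs are skipped.
    Returns a list of (composite_site, nfe2_site, gap) tuples.
    """
    pairs = []
    for composite in composite_sites:
        for nfe2 in nfe2_sites:
            if composite[1] <= nfe2[0]:
                # NF-E2 motif lies downstream of the composite motif.
                gap = nfe2[0] - composite[1]
            elif nfe2[1] <= composite[0]:
                # NF-E2 motif lies upstream of the composite motif.
                gap = composite[0] - nfe2[1]
            else:
                # Intervals overlap, so no gap is defined.
                continue
            if min_gap <= gap <= max_gap:
                pairs.append((composite, nfe2, gap))
    return pairs
```

For example, a composite motif at (100, 120) and an NF-E2 motif at (160, 171) are separated by 40 bp and would be reported as a pair, whereas an NF-E2 motif at (130, 141) would not.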
Disorders of erythrocyte shape comprise an important group of inherited hemolytic anemias. These disorders include the hereditary spherocytosis, elliptocytosis, and pyropoikilocytosis syndromes, which are often associated with qualitative and quantitative abnormalities of major erythrocyte membrane proteins. In many cases, causative mutations have been identified in the genes encoding these proteins. However, in as many as 25% of cases, despite specific protein deficiency and/or genetic linkage, the causative mutation is not identified, even after nucleotide sequence analysis of the coding exons, the immediate flanking intronic sequences, and the promoter regions (103
). The high throughput genomic strategies employed in this study identify numerous excellent candidate regions for mutations associated with membrane-linked hemolytic anemia.