GATA factors interact with simple DNA motifs (WGATAR) to regulate critical processes, including hematopoiesis, but very few WGATAR motifs are occupied in genomes. Given the rudimentary knowledge of mechanisms underlying this restriction, and how GATA factors establish genetic networks, we used ChIP-seq to define GATA-1 and GATA-2 occupancy genome-wide in erythroid cells. Coupled with genetic complementation analysis and transcriptional profiling, these studies revealed a rich collection of targets containing a characteristic binding motif of greater complexity than WGATAR. GATA factors occupied loci encoding multiple components of the Scl/TAL1 complex, a master regulator of hematopoiesis and leukemogenic target. Mechanistic analyses provided evidence for cross-regulatory and autoregulatory interactions among components of this complex, including GATA-2 induction of the hematopoietic corepressor ETO-2 and an ETO-2 negative autoregulatory loop. These results establish fundamental principles underlying GATA factor mechanisms in chromatin and illustrate a complex network of considerable importance for the control of hematopoiesis.
Master regulators of development are commonly transcription factors that instigate complex genetic networks. Mechanisms underlying the function of these regulators are highly stringent, as deviations in their expression, chromatin site selection and protein-protein interactions elicit catastrophic phenotypes. In the context of hematopoiesis, defective genetic networks cause anemias, leukemias and lymphomas. Given the essential role of GATA factors in controlling hematopoiesis (Cantor and Orkin, 2002) and mutations in human leukemias (Crispino, 2005), it is crucial to elucidate their mechanisms, with perhaps the most rudimentary goal to establish the ensemble of target genes genome-wide.
GATA-2 expression occurs early in hematopoiesis and is required for the maintenance and expansion of hematopoietic stem cells and/or multipotent progenitors (Tsai et al., 1994). GATA-1 expression is induced subsequent to GATA-2 and is essential for the development of erythrocytes (Pevny et al., 1991; Simon et al., 1992), megakaryocytes (Shivdasani et al., 1997), eosinophils (Yu et al., 2002) and mast cells (Migliaccio et al., 2003). Whereas GATA-1 and GATA-2 bind DNA with a similar specificity (Ko and Engel, 1993; Merika and Orkin, 1993), and function redundantly to promote primitive erythroblast development (Fujiwara et al., 2004), they also exert distinct functions. GATA factors activate and repress genes, with or without the coregulator Friend of GATA-1 (FOG-1) (Crispino et al., 1999; Johnson et al., 2007; Tsang et al., 1997). FOG-1-dependent activation involves facilitation of GATA-1 chromatin occupancy (Letting et al., 2004; Pal et al., 2004a) and GATA-2 displacement from target sites (Pal et al., 2004a). FOG-1-dependent repression can be accompanied by broad histone deacetylation (Grass et al., 2003), and FOG-1 binds two corepressors, NuRD (Hong et al., 2005) and CtBP (Turner and Crossley, 1998).
Analyses at several loci suggest that GATA-1 and GATA-2 occupy a small fraction of the abundant WGATAR motif (Grass et al., 2003; Grass et al., 2006; Im et al., 2005; Johnson et al., 2002; Pal et al., 2004b). Even the presence of a conserved motif appears to be insufficient for implicating a GATA factor in regulation. Given that GATA factor DNA binding specificities were defined with naked DNA, nucleotides flanking WGATAR or cis-elements near WGATAR may mediate occupancy in vivo, or WGATAR might not be critical in chromatin.
GATA-1 and GATA-2 function cooperatively with the master regulator of hematopoiesis Scl/TAL1 on E-box (CANNTG)-WGATAR-containing composite elements (Lahlil et al., 2004; Vyas et al., 1999; Wadman et al., 1997; Wozniak et al., 2007; Xu et al., 2003). GATA-1 (Tripic et al., 2008) and GATA-2 (Wozniak et al., 2008) co-localize on chromatin sites with Scl/TAL1. Scl/TAL1 assembles a multimeric complex containing E2A, LMO2, Ldb1 and GATA-1 (Gottgens et al., 2002; Lahlil et al., 2004; Lecuyer et al., 2002; Wadman et al., 1997; Xu et al., 2003). Another critical factor that binds the Scl/TAL1 complex is the corepressor ETO-2 (Amann et al., 2001; Schuh et al., 2005), which like Scl/TAL1 (Aplan et al., 1992) and LMO2 (Hacein-Bey-Abina et al., 2003), is disrupted in leukemia (Gamou et al., 1998). Targeted disruption of Cbfa2t3, which encodes ETO-2, revealed an ETO-2 requirement for hematopoietic progenitor fate decisions, proliferation, and stress-dependent hematopoiesis (Chyla et al., 2008). Although other studies implicated ETO-2 in controlling erythropoiesis (Goardon et al., 2006) and megakaryopoiesis (Hamlett et al., 2008), little is known about its function in GATA-2-expressing cells.
GATA-2 and Scl/TAL1 co-localize at chromatin sites containing E-box-WGATAR motifs (Wozniak et al., 2008). As only a small fraction of E-box-WGATAR motifs are occupied in chromatin, resembling that of WGATAR motifs (Wozniak et al., 2008), the composite motif does not appear to confer a major advantage for chromatin occupancy vs. WGATAR.
Since GATA factor DNA binding specificities have been deduced through in vitro analyses, it is critical to generate and validate datasets of GATA factor occupancy in vivo. The magnitude and qualitative features of GATA factor target gene ensembles are unclear. We describe ChIP-seq analysis, in conjunction with expression profiling, target validation in primary cells, and computational mining, which yielded principles governing GATA factor-chromatin interactions and a genetic network of considerable importance for controlling hematopoiesis.
ChIP-seq was conducted with human K562 erythroleukemia cells that express GATA-1 (Tsai et al., 1989) and GATA-2 (Dorfman et al., 1992) and are studied intensively in the ENCODE project (http://genome.ucsc.edu/ENCODE). The ChIP assay was validated by measuring GATA-1 occupancy at β-globin LCR HS2 (Johnson et al., 2002). Immunoprecipitated DNA from two biological replicates was used to prepare libraries for deep-sequencing. Sequences were mapped to the UCSC Human Genome assembly. Replicates A and B yielded 4.4 million and 10.3 million uniquely mapped sequences, respectively. Using a false discovery rate of 0.001, we identified 1,536 and 6,104 GATA-1 binding sites (peaks) in replicates A and B respectively. To assess reproducibility, peak overlap was determined. 90% of the peaks (1,380) from replicate A were present in replicate B. All reads for both replicates were merged to yield 5,749 sites (Suppl. Table 1) corresponding to 4,061 genes. The peak heights ranged from 20-259 sequence reads, with an average height of 38 and width of 327 bp, and peak number did not correlate with chromosome size (Suppl. Table 2). The analysis revealed established and new GATA-1 targets, including hematopoietic transcription factors, signaling molecules, and red cell cytoskeletal components (Fig. 1).
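The replicate comparison described above amounts to asking what fraction of the peaks in one replicate overlap a peak in the other. The following is a minimal Python sketch of that calculation, not the pipeline used in this study; it assumes peaks are given as (chromosome, start, end) tuples with half-open coordinates, and the function name and toy coordinates are hypothetical.

```python
from collections import defaultdict


def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open [start, end) coordinates.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    def overlaps(chrom, start, end):
        # Two half-open intervals overlap when each starts before the other ends.
        return any(s < end and start < e for s, e in by_chrom[chrom])

    if not peaks_a:
        return 0.0
    hits = sum(1 for chrom, start, end in peaks_a if overlaps(chrom, start, end))
    return hits / len(peaks_a)


# Toy data with hypothetical coordinates (not peaks from this study).
rep_a = [("chr1", 100, 400), ("chr1", 1000, 1300), ("chr2", 50, 350)]
rep_b = [("chr1", 350, 700), ("chr2", 0, 60), ("chr3", 10, 20)]
print(fraction_overlapping(rep_a, rep_b))  # two of the three peaks in rep_a overlap
```

With the counts reported above, 1,380 of 1,536 replicate A peaks corresponds to a fraction of about 0.90. The linear scan per peak is adequate for a few thousand peaks; a sorted or indexed structure would be preferable for larger sets.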
K562 cells are transformed and have a primitive erythroid phenotype (Lozzio et al., 1979), whereas murine G1E cells are untransformed and resemble normal definitive proerythroblasts (Weiss et al., 1997; Welch et al., 2004). Given these differences, GATA-1 target sites might differ greatly in the two cell types. Quantitative ChIP analysis was conducted with untreated and β-estradiol-treated G1E cells stably expressing ER-GATA-1 (G1E-ER-GATA-1) to test whether ER-GATA-1 and GATA-2 occupy sites containing conserved WGATAR motifs that were detected by ChIP-seq. Although the absolute levels of occupancy were higher for peaks with high vs. low peak values, ER-GATA-1 and GATA-2 occupancy was detected at 10/13 (Fig. 2A) and 8/13 (Fig. 2B) of the peaks, respectively.
Prior work defined GATA-1 and GATA-2 occupancy at dispersed regions of several loci (Grass et al., 2003; Grass et al., 2006; Im et al., 2005; Johnson et al., 2007; Johnson et al., 2002; Martowicz et al., 2005; Munugalavadla et al., 2005; Pal et al., 2004b; Scherzer et al., 2008; Wozniak et al., 2007). At the β-globin locus, GATA-1 instigates chromatin looping (Vakoc et al., 2005), which brings dispersed complexes in close proximity to each other. Since GATA-1 occupancy has been analyzed at very few loci, it is unclear whether occupancy occurs mainly at promoters or other regions. Location analysis with the 5,749 peaks, using the Cis-regulatory Element Annotation System (http://ceas.cbi.pku.edu.cn/), revealed occupancy predominantly within introns (37%) and >1 kb away from RefSeq genes (“enhancer”) (47%) (Fig. 3A). The highest frequency of peaks was at −10 to −100 kb and +10 kb to +100 kb (Suppl. Fig. 1). Only 10% of the sites reside in proximal promoters (<1 kb upstream of RefSeq 5′ start).
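The distance-based part of such a location analysis can be illustrated with a short Python sketch. This is not the Cis-regulatory Element Annotation System itself; it only assumes that each peak is assigned to one transcription start site (TSS) on the same chromosome, and the function names, bin boundaries and coordinates are hypothetical choices made for illustration.

```python
def signed_distance_to_tss(peak_start, peak_end, tss, strand):
    """Distance in bp from the peak midpoint to a TSS.

    Negative values are upstream of the gene and positive values downstream,
    taking the gene's strand ("+" or "-") into account.
    """
    midpoint = (peak_start + peak_end) // 2
    offset = midpoint - tss
    return offset if strand == "+" else -offset


def distance_bin(distance):
    """Label a signed distance with a side and a coarse distance bin."""
    side = "upstream" if distance < 0 else "downstream"
    size = abs(distance)
    if size < 1_000:
        label = "<1 kb"
    elif size < 10_000:
        label = "1-10 kb"
    elif size < 100_000:
        label = "10-100 kb"
    else:
        label = ">100 kb"
    return f"{side} {label}"


# Hypothetical peak and TSS coordinates on the same chromosome.
d = signed_distance_to_tss(peak_start=100, peak_end=300, tss=25_200, strand="+")
print(d, distance_bin(d))  # -25000 upstream 10-100 kb
```

Under this convention, a peak counted as a proximal promoter site in the text would fall in the "upstream <1 kb" bin. Intron assignment requires gene models and is not covered by the sketch.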
GATA-1 preferentially binds WGATAR-containing DNA (Evans et al., 1988). In a site selection analysis, recombinant chicken GATA-1 preferentially bound NNAGATAANN (Ko and Engel, 1993). Site selection with recombinant mouse GATA-1 (Merika and Orkin, 1993) and Mouse Erythroleukemia Cell (MEL) extracts (Wadman et al., 1997) yielded consensus sequences (G/C/A)NGAT(A/G/T)G(GCT) and CGATAA, respectively.
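In IUPAC notation, W denotes A or T and R denotes A or G, so WGATAR stands for four hexamers, and its reverse complement is YTATCW. A minimal Python sketch of scanning a sequence for such a degenerate motif on both strands follows; the function names and the toy sequence are hypothetical, and this is not the software used in the study.

```python
import re

# IUPAC codes needed here, written as regular-expression character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTWRYN", "TGCAWYRN")


def reverse_complement(motif):
    """Reverse complement of a motif written with the IUPAC codes above."""
    return motif.translate(COMPLEMENT)[::-1]


def find_motif(sequence, motif="WGATAR"):
    """Return sorted (start, strand) hits, 0-based, allowing overlapping matches."""
    sequence = sequence.upper()
    hits = []
    for strand, m in (("+", motif), ("-", reverse_complement(motif))):
        # A lookahead makes each match zero-width, so overlapping sites are found.
        pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in m) + "))")
        hits.extend((match.start(), strand) for match in pattern.finditer(sequence))
    return sorted(hits)


print(reverse_complement("WGATAR"))            # YTATCW
print(find_motif("CCAGATAAGGTTATCTCC"))        # [(2, '+'), (10, '-')]
```

Minus-strand hits are reported at their plus-strand start coordinate. A palindromic motif would be reported once per strand at the same position, which is not a concern for WGATAR.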
Based on the highly restricted occupancy of WGATAR-containing sites in cells, sequences preferred in vitro might not be an obligate requirement in chromatin. DNA binding-defective mutants of Scl/TAL1 (Porcher et al., 1999) and the glucocorticoid receptor (Reichardt et al., 1998) can function in vivo, presumably through binding DNA-bound activators. Thus, GATA-1 tethered to DNA-bound factors might crosslink to sites lacking WGATAR motifs.
De novo motif finding with Cosmo (Bembom et al., 2007) and MEME (Bailey and Elkan, 1995) was used to ascertain the percentage of targets containing specific cis-elements. Of the 5,749 ChIP-seq peaks, 5,051 (88%), 155 (3%), 193 (3%), and 79 (1%) contained WGATAR, WGATA, GATAR, and WGATA + GATAR motifs (Fig. 3B), respectively, which were conserved similarly from mouse to man. Of the 6,976,111 WGATAR motifs in the human genome, 9,741 (0.14%) reside within the 5,051 peaks (0.07-0.14% occupancy). A small subset of these motifs (6.0%) reside in E-box-WGATAR motifs. De novo motif finding using peaks containing at least one WGATAR identified a highly significant (E-value = 2.8e−3685) position weight matrix with the consensus (C/G)(A/T)GATAA(G/A/C)(G/A/C) (Fig. 3C, left), which occurs 297,124 times in the human genome. Of the 5,051 WGATAR-containing peaks, 3,165 (63%) contain this consensus. Based on 3,165 peaks containing ≥1 copy of the consensus, 0.71% were occupied by GATA-1, an order of magnitude higher than the 0.07% occupancy of WGATAR (p < 0.0001). As the three extended positions (2, 9, 10) exhibit compositions that deviate greatly from the equal probability of bases and the [(A, T): 0.3] and [(C, G):0.2] configuration (p < 1e−100), we further evaluated the significance of this composition by randomly drawing the same number of WGATAR occurrences from the genome and constructing a position weight matrix with WGATAR and its first left and two right flanking positions. We repeated this process 1000 times, and the flanking region information contents were significantly smaller than that of the position weight matrix constructed from WGATAR-containing peaks (p = 0 based on 1000 randomization experiments).
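The randomization described above can be sketched in a few lines of Python. The sketch makes several assumptions that are not taken from the study: sites are pre-aligned 9-mers (one left flank, WGATAR, two right flanks), information content is measured against a uniform background rather than the genomic base composition, and the function names are hypothetical.

```python
import math
import random
from collections import Counter


def column_information(column, alphabet="ACGT"):
    """Information content in bits of one alignment column (uniform background)."""
    counts = Counter(column)
    n = len(column)
    ic = math.log2(len(alphabet))
    for base in alphabet:
        p = counts[base] / n
        if p > 0:
            ic += p * math.log2(p)
    return ic


def flank_information(sites, flank_positions):
    """Summed information content over the chosen 0-based flanking positions."""
    return sum(column_information([s[i] for s in sites]) for i in flank_positions)


def empirical_p(observed_sites, background_sites, flank_positions,
                n_iter=1000, seed=0):
    """Fraction of random draws whose flank information reaches the observed value."""
    rng = random.Random(seed)
    observed = flank_information(observed_sites, flank_positions)
    k = len(observed_sites)
    exceed = 0
    for _ in range(n_iter):
        sample = rng.sample(background_sites, k)  # same number of sites as observed
        if flank_information(sample, flank_positions) >= observed:
            exceed += 1
    return exceed / n_iter


# Flanks of a 9-mer laid out as [left flank][WGATAR][two right flanks].
FLANKS = (0, 7, 8)
print(column_information("AAAA"))  # 2.0 bits: fully conserved column
print(column_information("ACGT"))  # 0.0 bits: all four bases equally frequent
```

An empirical p of 0 from 1000 draws, as reported above, means that no random draw reached the observed flank information; it bounds the p-value at roughly below 1/1000 rather than showing it to be exactly zero.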
Of the 62,412 E-box-WGATAR composite elements in the human genome, 307 reside within 304 peaks (0.49% occupancy). Analysis of the 301 peaks containing one E-box-WGATAR element revealed the logo in Fig. 3C, right. Positions 16, 23, and 24 resemble the respective positions of the more complex WGATAR motif (Fig. 3C) and deviate from random nucleotides (p < 1e−90). The analysis also revealed unexpected information content in the NN residues of CANNTG (Fig. 3C, right). GATA-1-occupied composite elements had a similar probability of having G or T in the 1st N position and C, A, or G in the 2nd N position (Fig. 3C). Site-selection analysis with recombinant Scl/TAL1-E2A heterodimers identified the consensus AACAGATGGT, and 86-100% of bound sequences had GA in the NN positions (Hsu et al., 1994). The in vitro preference for AA and GT at the 5′ and 3′ ends deviated from occupied chromatin sites, which had no sequence preference at the 5′ end and either C, G, or T following TG. Site-selection studies with MEL cell extracts identified E-box-GATA elements with the consensus CAGGTG(N)9GATA (Wadman et al., 1997). The GG sequence in the NN positions differed from GATA-1-occupied E-box-WGATAR elements in chromatin.
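The occupancy percentages quoted for WGATAR and E-box-WGATAR elements are simple ratios of occupied to genomic counts. As an arithmetic check, the short Python sketch below recomputes them from the counts given in the text; the helper name is hypothetical.

```python
def percent_occupancy(occupied, genomic_total):
    """Occupied elements as a percentage of all such elements in the genome."""
    return 100.0 * occupied / genomic_total


# Counts taken from the text above.
wgatar_in_peaks = percent_occupancy(9_741, 6_976_111)   # WGATAR motifs inside peaks
wgatar_peaks = percent_occupancy(5_051, 6_976_111)      # one motif counted per peak
composite = percent_occupancy(307, 62_412)              # E-box-WGATAR elements

print(f"{wgatar_in_peaks:.2f}%")  # 0.14%
print(f"{wgatar_peaks:.2f}%")     # 0.07%
print(f"{composite:.2f}%")        # 0.49%
```

The two WGATAR values correspond to the 0.07-0.14% range given earlier: the lower bound counts each WGATAR-containing peak once, and the upper bound counts every WGATAR motif that lies within a peak.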
De novo motif finding on the 698 peaks lacking WGATAR identified GGAATGGAATG as overrepresented in this group. This sequence appears 3 and 64 times in WGATAR-containing and -lacking peaks, respectively (p < 2.2e−16). GGAATGGAATG resides in microsatellite repeats (Gangwal et al., 2008) and contains the Ets binding motif GGAA (Sharrocks, 2001). The oncogenic Ewing's sarcoma protein fusion to FLI functions through this sequence (Gangwal et al., 2008). GAATGGAATGGAAT-containing GATA-1 occupancy sites defined by ChIP-seq lack WGATAR motifs. As GATA factors physically associate with Ets factors (Rekhtman et al., 1999), a DNA-bound Ets factor might tether GATA-1 to chromatin at this class of sites.
To assess whether the ChIP-seq peaks pinpoint GATA-1-regulated genes, gene expression was profiled in untreated and β-estradiol-treated G1E-ER-GATA-1 cells (Suppl. Table 3). ER-GATA-1 induced and repressed 1,166 and 1,010 genes, respectively, >1.5 fold. Merging ChIP-seq and profiling datasets revealed 142 and 154 activated and repressed genes, respectively, which were GATA-1-occupied (the top 60 are shown in Fig. 3D). These genes include known GATA-1 targets, such as Slc4a1 and Epb4.9 (Kim et al., 2007) encoding red cell cytoskeletal proteins (Mohandas and Gallagher, 2008), yet many had not been implicated in GATA-1 function or hematopoiesis. Of the genes in Fig. 3D, 44% were not described in a prior profiling analysis in this system (Welch et al., 2004). In primary Ter119+ bone marrow erythroblasts, GATA-1 occupancy was significantly higher than at the negative control Ey promoter at 32/36 of the sites, and significantly higher than at the positive control β-globin HS2 at 14/36 of the sites (p < 0.05) (Fig. 3E). Using tiled microarrays containing sequences from 120 genes of Fig. 3D (with 120,000 bp of upstream and downstream sequence), ChIP-chip analysis in primary Ter119+ bone marrow erythroblasts revealed significant GATA-1 occupancy at 90% of the targets (98% and 82% of activated and repressed targets, respectively) (Fig. 3F, Suppl. Table 4).
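Merging the occupancy and expression datasets is, in essence, a set intersection with a fold-change filter. The Python sketch below illustrates the idea under stated assumptions: gene identifiers have already been matched between the human ChIP-seq data and the murine expression data, fold change is expressed as the treated/untreated ratio, and the function name, the placeholder genes GeneX and GeneY, and all numerical values are hypothetical.

```python
def occupied_regulated_genes(fold_change, occupied, threshold=1.5):
    """Split occupied genes into activated and repressed sets.

    fold_change maps gene -> treated/untreated expression ratio;
    occupied is the set of genes with at least one ChIP-seq peak.
    """
    activated = {g for g, fc in fold_change.items()
                 if fc > threshold and g in occupied}
    repressed = {g for g, fc in fold_change.items()
                 if fc < 1 / threshold and g in occupied}
    return activated, repressed


# Hypothetical ratios; the gene symbols are used only as labels.
fold_change = {"Slc4a1": 8.0, "Epb4.9": 4.0, "Gata2": 0.2,
               "GeneX": 3.0, "GeneY": 1.1}
occupied = {"Slc4a1", "Epb4.9", "Gata2", "GeneY"}

activated, repressed = occupied_regulated_genes(fold_change, occupied)
print(sorted(activated))  # ['Epb4.9', 'Slc4a1']
print(sorted(repressed))  # ['Gata2']
```

GeneX is regulated but not occupied, and GeneY is occupied but not regulated beyond the threshold, so neither is returned.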
Our previous comparison of GATA-1 and GATA-2 occupancy at several loci in G1E and G1E-ER-GATA-1 cells revealed that almost all of the GATA-1-occupied sites were occupied by GATA-2 in uninduced G1E-ER-GATA-1 cells and G1E cells. To assess the extent of overlap in a cell expressing endogenous GATA-1 and GATA-2, we analyzed GATA-2 occupancy by ChIP-seq in K562 cells. Analysis of two biological replicates (21,167 peaks) (Suppl. Table 5) revealed major overlap (Fig. 4A, B). Since the GATA-1 and GATA-2 samples analyzed by ChIP-seq were isolated on different days, we conducted quantitative ChIP analysis for GATA-1 and GATA-2 at the same time. Sampling representative GATA-1- and GATA-2-unique peaks identified GATA-1- and GATA-2-selective targets (Fig. 4C, D). The extensive sharing of sites by GATA-1 and GATA-2 provides insights into the finding that GATA-1 and GATA-2 function redundantly to generate primitive erythroblasts (Fujiwara et al., 2004). Our analysis also revealed GATA factor-selective targets, including: a kinase critical for controlling hematopoiesis (AK2) (Lagresle-Peyrou et al., 2009); a cell type-specific component of the Mediator complex (MED10) that regulates Wnt and Nodal signaling (Lin et al., 2007b); a Hox gene (HOXB9) induced by Wnt signaling (Nguyen et al., 2009); a Forkhead transcription factor (FOXK2); a factor (BST2) that suppresses HIV-1 release from the cell surface (Goffinet et al., 2009) and is downregulated by Kaposi's sarcoma herpesvirus (Mansouri et al., 2009); a Set domain-containing histone H3K4 methyltransferase (SMYD3) (Nguyen et al., 2009); and a putative RNA binding protein (RBM15) that regulates Notch signaling (Ma et al., 2007), controls hematopoiesis (Raffel et al., 2007) and is implicated in acute megakaryoblastic leukemia (Ma et al., 2001). 
As GATA-1 mutations are linked to acute megakaryoblastic leukemia (Wechsler et al., 2002), and GATA-1 represses GATA2 transcription (Grass et al., 2003), our discovery of RBM15 as a GATA-2-selective target is intriguing.
ChIP-seq identified five GATA factor-bound sites at CBFA2T3 (Fig. 4B), which encodes ETO-2, a co-repressor that controls hematopoiesis. We tested whether GATA factors occupy all or a subset of conserved WGATAR motifs at and near Cbfa2t3. Conserved WGATAR motifs reside at −37.3, −35.7, −21.7, −21.6, −21.6, −21.5, −13.3, −1.9, −0.2, +13.3 and +15.1 kb relative to the start site (Fig. 5A). Conservation of the immediate region was high (>75%) at −35.7, −21.7, −21.6, −21.6, −21.5, −13.3, −1.9 and −0.2 kb, intermediate (>50%) at +13.3 kb, and low (<50%) at −37.3 and +15.1 kb (Fig. 5A). Among five ChIP-seq peaks at human CBFA2T3 (Fig. 4B), −27.3, −15.6, −2.1, and −0.1 kb peaks corresponded to −21.6, −13.3, −1.9 kb, and the promoter of murine Cbfa2t3 (Fig. 5A). GATA-2 occupied −37.3, −21.6, −13.3, −1.9, −0.2, and +13.3 kb sites (Suppl. Fig. 2A).
Scl/TAL1 forms a complex with E2A, LMO2, and Ldb1 (Lahlil et al., 2004; Wadman et al., 1997), which co-localizes with GATA-2 (Wozniak et al., 2008) and GATA-1 (Anguita et al., 2004; Lahlil et al., 2004; Tripic et al., 2008; Xu et al., 2003) in chromatin. We tested whether this complex resides at GATA-2-occupied regions of Cbfa2t3. Scl/TAL1 occupied all except the −37.3, −35.7, and +15.1 kb sites (Suppl. Fig. 2B). The Scl/TAL1-interacting factor ETO-2 (Schuh et al., 2005) occupied only the Scl/TAL1-bound sites (Suppl. Fig. 2C). The occupancy of each factor correlated with the others (Suppl. Fig. 2D).
To test whether ETO-2 occupancy at Cbfa2t3 (Fig. 5A, Suppl. Fig. 2C) reflects negative or positive autoregulation, ETO-2 was knocked-down in G1E cells with Cbfa2t3 siRNA. Cbfa2t3 primary transcripts were quantitated as a metric of transcription. Western blotting and real-time RT-PCR revealed a knockdown of ETO-2 protein and Cbfa2t3 mRNA, respectively (Fig. 5B), and ETO-2 occupancy at −21.6 kb was reduced by 74% (p = 0.002) (Fig. 5C). The knockdown increased Cbfa2t3 primary transcripts (58%, p = 0.01; 71%, p = 0.00009) (Fig. 5D), suggesting that ETO-2 represses Cbfa2t3. To further evaluate negative autoregulation, we measured Pol II and Ser 5-phosphorylated Pol II (P-Ser5-Pol II) at the Cbfa2t3 promoter. The knockdown increased Pol II and P-Ser5-Pol II occupancy at two sites (~2.5 and ~2 fold, respectively, p < 0.05), without affecting occupancy at the RPII215 promoter (Fig. 5E). As ETO-2 interacts with class I HDACs (Amann et al., 2001), we asked whether knocking-down ETO-2 affects histone acetylation at Cbfa2t3. Acetylated histone H3 increased at −21.6 kb (p = 0.015) and the promoter (p = 0.016), but not at the RPII215 promoter (Fig. 5F). These results establish an ETO-2 negative autoregulatory loop.
ETO-2-mediated negative autoregulation might reflect a non-redundant repressor function at all of its target genes. We asked whether ETO-2 occupied and regulated other GATA factor targets (Fig. 6A), including Scl/TAL1 (Lugus et al., 2007) and Lmo2 (Landry et al., 2009). ETO-2 occupied these sites with a level comparable to that at Cbfa2t3 in G1E cells (Fig. 6B, Suppl. Fig. 2C). ETO-2 knockdown induced expression of certain GATA factor targets (Icam4, Epb4.9, Slc4a1, μmajor, Eraf, and Alas2), while others were unaffected (Fig. 6C). Similarly, ETO-2 knockdown facilitated ER-GATA-1-mediated activation of certain, but not all, targets (Fig. 6D). The knockdown might not reduce ETO-2 below a threshold at which occupancy at all targets would be impaired. If ETO-2 interacts with targets in different chromatin environments with distinct affinities, this could explain the differential sensitivities. Thus, knocking-down ETO-2 would reduce its concentration sufficiently to impair occupancy and regulation at sites with the lowest apparent affinities. However, the knockdown significantly decreased ETO-2 occupancy at sensitive and resistant targets (Fig. 6B), inconsistent with this possibility.
Since Slc4a1, which is strongly induced by the ETO-2 knockdown, is bound by GATA-2 in the repressed state, ETO-2 loss might suffice for induction, or might be inextricably coupled to GATA-2 loss (Fig. 6E). The ETO-2 knockdown induced ETO-2, but not GATA-2, loss from the Slc4a1 promoter (Fig. 6F), indicating that transcriptional activation solely requires ETO-2 eviction.
Despite the lack of GATA-1 in G1E cells, knocking-down ETO-2 induced certain GATA-1 targets. GATA-1-mediated activation might therefore require ETO-2 displacement. However, GATA-1 activation of Slc4a1 in G1E-ER-GATA-1 cells is associated with increased ETO-2 occupancy at its promoter (data not shown). Thus, GATA-1-associated coactivators dominate over ETO-2 corepressor activity, negating the need to evict ETO-2, or ETO-2 has dual corepressor/coactivator activities.
GATA-1, GATA-2, and Scl/TAL1 occupancy at Cbfa2t3 suggested that these factors regulate Cbfa2t3 transcription. Interactions among components of GATA factor – Scl/TAL1 complexes include GATA-1 repression of Gata2 (Grass et al., 2003), Scl/TAL1 induction of Gata2 (Lugus et al., 2007), and GATA-2 induction of Scl/TAL1 (Chan et al., 2007; Lugus et al., 2007; Wozniak et al., 2008). We used knockout and conditional overexpression approaches in ES cells to examine all possible regulatory influences of GATA-1, GATA-2 and Scl/TAL1 on Gata1, Gata2, Scl/TAL1, Lmo2, Ldb1 and Cbfa2t3 expression. Gene expression was analyzed upon conditional expression of GATA-2 and Scl/TAL1 during ES cell differentiation (Lugus et al., 2007) and in Gata2−/− and Scl/TAL1−/− vs. wild-type EBs. Dox-mediated induction of GATA-2 induced Gata1, Cbfa2t3, and Lmo2 (383, 6.2, and 5.1-fold; p = 0.003, 0.03, and 0.002, respectively) (Fig. 7A). Previously, we demonstrated that GATA-2 increases Scl/TAL1 expression in this system 12-fold (Wozniak et al., 2008). Gata1, Cbfa2t3, Scl/TAL1, and Lmo2 were weakly to modestly downregulated in Gata2-null EBs (p = 0.0055, 0.09, 0.20 and 0.58, respectively) (Fig. 7A). As GATA-3 can compensate for GATA-2 (Kobayashi-Osaki et al., 2005), this might limit the downregulation. In Gata2-null cells, reduced Cbfa2t3 expression would abrogate the negative autoregulatory loop. The altered Cbfa2t3 expression in GATA-2-expressing and Gata2-null ES cells, as well as in G1E-ER-GATA-1 cells, establishes Cbfa2t3 as a GATA factor target. Dox-mediated induction of Scl/TAL1 increased Gata2 (p < 0.001) and Lmo2 expression (p < 0.0001), and their expression, as well as that of Gata1, was downregulated in Scl/TAL1-null EBs (p = 0.0044, 0.0089, and 0.04 respectively). Cbfa2t3 was repressed in Scl-null EBs (p = 0.025), and Scl/TAL1 induction in iSCL EBs repressed its expression (−2.2-fold, p < 0.0001) (Fig. 7A).
The genome-wide analysis of GATA-1 chromatin occupancy identified a consensus element as a hallmark of the repertoire of GATA-1 chromatin occupancy sites, which changes the paradigm of how GATA-1 selects sequences at target genes. Contrasting with naked DNA binding in which WGATAR is believed to be sufficient, specific nucleotides at the 5′ and 3′ flanks of WGATAR were preferred, and A dominated in the R position, both with WGATAR alone and E-box-WGATAR elements, yielding the chromatin occupancy consensus (C/G)(A/T)GATAA(G/A/C)(G/A/C) (Fig. 3C). Within GATA-1-occupied E-box-WGATAR elements, the NN residues of the CANNTG E-box consensus exhibited significant sequence preferences. As FOG-1 facilitates GATA-1 chromatin occupancy (Letting et al., 2004; Pal et al., 2004a), and FoxA1 stabilizes GATA-4 chromatin complexes (Sekiya et al., 2009), protein-protein interactions are also important determinants. An obvious candidate for controlling GATA factor occupancy at E-box-WGATAR composite elements is the E-box, although GATA-1 occupancy was only slightly greater at composite elements vs. WGATAR motifs.
The localization of 90% of the occupied sites away from promoters (Fig. 3A) indicates that a canonical mode of GATA factor function involves long-range control. GATA-1 induces looping at the β-globin locus (Kim et al., 2007; Kim et al., 2009; Vakoc et al., 2005) and alters a pre-existing loop at c-kit (Jing et al., 2008), but whether looping is common or infrequent for GATA factors was unknown. Estrogen receptor-α induces looping (Carroll et al., 2005), and genome-wide analyses of estrogen receptor-α chromatin occupancy (Lin et al., 2007a) revealed abundant non-promoter sites. By contrast, >80% of E2F1 targets are promoters (Bieda et al., 2006). While GATA-1 (Blobel et al., 1998) and E2F1 (Fry et al., 1999; Trouche and Kouzarides, 1996) utilize CBP/p300 to regulate transcription, GATA-1 uniquely utilizes FOG-1 (Crispino et al., 1999) and MED1 (Stumpf et al., 2006). The primary sequence determinants and topographic constraints constitute chromatin occupancy rules that govern how GATA-1 establishes genetic networks that control critical processes.
Analysis of histone modification patterns in K562 cells generated by the ENCODE project with the identical line of K562 cells used in our ChIP-seq analysis (UCSC Genome Browser) revealed GATA-1 occupancy in introns that was often mutually exclusive with histone H3K36 trimethylation (Suppl. Fig. 3). Trimethylated H3K36 marks open reading frames, facilitates HDAC recruitment, and prevents aberrant transcription initiation (Carrozza et al., 2005; Li et al., 2007). Acetylated H3K9, dimethylated H3K4, and monomethylated H3K4 were commonly enriched at GATA-1 occupancy sites (Suppl. Fig. 3). This result suggests that H3K36 trimethylation is incompatible with GATA-1 function at intronic sites. Since H3K36 trimethylation mediates HDAC recruitment (Li et al., 2007), H3K36 trimethylation-dependent deacetylation at GATA-1-bound introns might oppose GATA-1-induced looping and/or intergenic transcription (Kim et al., 2009). Alternatively, H3K36 trimethylation or the associated HDAC recruitment might disfavor the assembly of functional GATA factor complexes within introns. As mutually exclusive factor occupancy and H3K36 trimethylation had not been described, it will be important to assess whether this is unique to GATA factors or can be applied in a broader context.
Mining the consolidated ChIP-seq and transcriptional profiling dataset revealed a rich set of GATA factor-shared and -selective targets, which were validated with high fidelity in primary erythroblasts. Many of these targets had not been implicated in GATA factor function or hematopoiesis, while others, such as RBM15, had been implicated in hematopoiesis and leukemogenesis, but were not known to function in GATA factor pathways. Our GATA-1 and GATA-2 ChIP-seq data provide an important resource for elucidating hematopoietic regulatory mechanisms.
Using rigorous computational and mechanistic analyses, our results establish a conceptual framework for understanding how the actions of GATA factors, Scl/TAL1, and ETO-2 are integrated to establish a genetic network that controls hematopoiesis and instigates leukemogenesis. GATA-1 directly represses Gata2 by displacing GATA-2 from this locus (Grass et al., 2003; Grass et al., 2006; Martowicz et al., 2005). Given the short t1/2 of GATA-2 (Lurie et al., 2008), GATA switches rapidly yield cells solely expressing GATA-1. By contrast to GATA-2 and GATA-1, which are expressed early and late in hematopoiesis, respectively, Scl/TAL1 is expressed in both stages (Begley et al., 1989; Lecuyer et al., 2002; Porcher et al., 1996; Schuh et al., 2005; Shivdasani et al., 1995). Though the mechanism by which LDB1 and Lmo2 mediate Scl/TAL1 function is unclear, ETO-2 is an attractive candidate for differentially controlling Scl/TAL1 activity during hematopoiesis. Reduced ETO-2 expression during erythropoiesis (Goardon et al., 2006) can be explained by our discovery that GATA-2 directly induces ETO-2, which is followed by ETO-2 negative autoregulation and GATA switch-mediated ETO-2 repression (Fig. 7B,C).
Our loss-of-function and gain-of-function studies define fundamental insights into how cell type-specific trans-acting factors function combinatorially to control a complex developmental process. The specific interactions include: GATA-2 induction of interacting trans-acting factors (Scl/TAL1, GATA-1, LMO2) and an interacting co-repressor (ETO-2); GATA-1 repression of GATA-2; GATA-1 repression of ETO-2; and ETO-2 negative autoregulation (Fig. 7B,C). Since GATA-2 and Scl/TAL1 co-localize (Wozniak et al., 2008) on chromatin, and ETO-2 antagonizes Scl/TAL1 (Goardon et al., 2006; Schuh et al., 2005), ETO-2 indirectly inhibits GATA-2. Accordingly, ETO-2 counteracts GATA-2-mediated induction of GATA-1, and given that GATA-1 represses ETO-2 expression, the balance between these positive and negative interactions must be strictly managed and dynamically regulated to ensure high fidelity of hematopoiesis. The elaborate integration of the activities of GATA factors, Scl/TAL1, and ETO-2 ensures the efficient and rapid transition from a GATA-2- to a GATA-1-driven genetic network. The differential co-repressor function of ETO-2 at endogenous targets, including key regulators of erythropoiesis and erythroid cell function, suggests that minimizing ETO-2 level/activity during later stages of hematopoiesis enables GATA-1 and Scl/TAL1 to efficiently establish the genetic network of the developing red blood cell. Further mining of our highly validated resource is expected to reveal additional fundamental insights into hematopoiesis and a broader spectrum of important biological processes.
GATA-1-null G1E (Weiss et al., 1997) and G1E-ER-GATA-1 cells (Gregory et al., 1999; Johnson et al., 2002) were maintained as described (Gregory et al., 1999; Johnson et al., 2002) and in Supplementary Experimental Procedures. ES cells were cultured and differentiated as described (Park et al., 2004) and in Supplementary Experimental Procedures.
Antibodies are described in Supplementary Experimental Procedures.
Quantitative chromatin immunoprecipitation (ChIP) analysis was conducted and validated as described (Im et al., 2004).
ChIP-seq analysis was conducted as described in Supplementary Experimental Procedures.
The knockdown was conducted as described in Supplementary Experimental Procedures.
Quantitative RT-PCR analysis was conducted as described in Supplementary Experimental Procedures.
Murine bone marrow erythroblasts were separated with a magnetic cell sorting system (Miltenyi Biotec) using anti-Ter119 microbeads (Miltenyi Biotec).
This work was funded by NIH grants DK50107 (EHB), DK68634 (EHB), HG003747 (SK), HL55337 (KC) and 1U54HG004558 (PJF). We thank Stuart Orkin for providing Gata2−/− and Scl−/− ES cells.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.