Our results are consistent with prior studies on the specific activity of individual members of a family of transcription factors and suggest an emerging model of how related transcription factors maintain some common functions and yet achieve specific transcriptional activity. As reported for the ETS family of factors, MYOD and NEUROD2 bind to a shared E-box motif, and each has its own distinct private E-box motif. Binding at the NEUROD2 private sites, and to a lesser extent at the MYOD private sites, is correlated with transcriptional activation of their respective differentiation programs, which is similar to the reported association of factor-specific binding sites with genes regulated by individual members of the ETS family. In contrast, binding at the NEUROD2 or MYOD shared sites does not show the same degree of regional gene activation. In addition, NEUROD2 showed stronger transcriptional activation of a reporter driven by its private E-boxes than of one driven by the shared E-box motifs, and MYOD showed the same trend. This difference does not appear to be secondary to binding affinity, since peak height was similar at private and shared sites (data not shown). These findings indicate that motif sequence might confer a level of transcriptional activity on the bound NEUROD2 or MYOD, similar to the sequence-specific allosteric activation described for the glucocorticoid receptor (GR) (Meijsing et al., 2009).
The E-box motif for NEUROD2 is similar to the consensus binding site identified for the related neurogenic bHLH factor ATOH1 (RMCAKMTGKY) in a ChIP-Seq study from mouse cerebellum (Klisch et al., 2011). The central dinucleotide preferences are similar to those of NEUROD2, whereas ATOH1 appears to have a palindromic flanking nucleotide preference different from that of NEUROD2, although this might result from the motif-finding algorithm used. Interestingly, a subset of flanking nucleotides is enriched at ATOH1 E-boxes in enhancers of genes expressed in dorsal interneurons (AMCAGMTG) (Lai et al., 2011), suggesting that E-box specificity might have a role in neuronal subtype gene regulation; however, functional differences were not observed in this study.
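The consensus sequences quoted above use IUPAC degenerate nucleotide codes (R = A/G, M = A/C, K = G/T, Y = C/T). As a reading aid, the following minimal Python sketch expands such a consensus into a regular expression and scans a sequence for matches. It is an illustration only and not the motif analysis used in this study: the function names and the example sequence are hypothetical, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Expand an IUPAC consensus (e.g. 'RMCAKMTGKY') into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_sites(motif, sequence):
    """Return (start, matched_text) for every forward-strand match.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence made up for illustration, not taken from the data.
    example = "TTAACAGATGGCTT"
    print(find_sites("RMCAKMTGKY", example))  # ATOH1 consensus (Klisch et al., 2011)
    print(find_sites("AMCAGMTG", example))    # dorsal interneuron variant (Lai et al., 2011)
```

A consensus match of this kind only tests whether a sequence is compatible with the motif; it does not capture the position-specific nucleotide frequencies that a motif-discovery algorithm reports.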
The biological role of the NEUROD2 and MYOD shared sites remains unclear. However, the induction of a bimodal H4 acetylation signal at these sites is similar to the criteria developed for biologically functional binding sites for several transcription factors (FOXA2, PDX1, HNF4A) in liver development (Hoffman et al., 2010), and it is interesting to speculate that the alteration of histone modifications at many thousands of sites genome-wide might have an as yet unknown biological function that is distinct from regional transcription, perhaps related to nuclear compartments and/or architecture (Lieberman-Aiden et al., 2009).
It is interesting that MYOD and NEUROD2 both induce the expression of Znf238. ZNF238 binds to a consensus sequence that includes the CAGATG E-box, whereas the ZEB1 site includes the CAGGTG E-box. In skeletal muscle cells, ZNF238 has been shown to inhibit the expression of the Id genes and its binding appears to prevent MYOD activity at the same region (Yokoyama et al., 2009). Similarly, ZEB has been shown to bind the E-box in the IgH enhancer and prevent its activation in non-B cells (Genetta et al., 1994). Therefore, MYOD and NEUROD2 initiate the expression of factors that can suppress their activities at a subset of E-boxes, possibly limiting the genes regulated by each factor. This is consistent with the transient activation of Id genes by MYOD and might be a general method of suppressing the early programs initiated by MYOD and NEUROD2.
It is interesting that NEUROD2 activates approximately the same number of genes in MEFs as in P19 cells but there is very little overlap in the set of regulated genes. Similarly, MYOD activated different sets of genes with partial overlap in P19 cells and MEFs. Therefore, both are active transcription factors in both cell types, but the cell type determines the target genes that will be activated. Our nuclease access studies indicate that chromatin structure is a major determinant of binding site accessibility in the different cell lineages. This is consistent with the studies showing that nuclease accessibility predicts GR binding (Biddie et al., 2011; John et al., 2011). However, accessibility is not the only determinant of binding at a particular site. Motif analysis determined that additional E-boxes were associated with both NEUROD2 and MYOD peaks, and that PBX and homeobox-like motifs were associated with NEUROD2 peaks. This study together with our prior MYOD ChIP-Seq study (Cao et al., 2010) identified MEIS and RUNX motifs associated with MYOD peaks. Therefore, accessibility is important for the spectrum of sites available for MYOD and NEUROD2, whereas other factor motifs may influence the degree of binding at particular accessible sites. Although these motifs are associated with NEUROD2 or MYOD binding, we did not find that they were specifically associated with regulated genes (data not shown), suggesting a role in binding rather than transcriptional activation. This is in contrast to the strong association of RUNX1 motifs near TAL1 binding sites in T-cells (Palii et al., 2011), where RUNX appears to play a direct role in TAL1 binding and gene regulation. It is also important to note that in our study we are identifying associated motifs and have not directly identified the factors binding at these motifs.
In this study and in our prior MYOD ChIP-Seq study (Cao et al., 2010) we identified tens of thousands of bound sites. In both studies, neither peak height nor p-value accurately predicted peaks that were associated with a regulated gene. An important consideration in this study is that we have forced the expression of both MYOD and NEUROD2 by lentiviral transduction. Our previous publication comparing endogenous MYOD binding in C2C12 mouse muscle cells with binding in MEFs virally transduced with MYOD showed a 90% similarity in peak location. In addition, comparison of the lentiviral MYOD binding in MEFs with endogenous MYOD in C2C12 cells and primary mouse myotubes shows a similar level of concordance (Z. Yao, manuscript in preparation). These comparisons indicate that lentiviral transduction produces an accurate representation of the binding of endogenous MYOD, possibly because of limiting amounts of the endogenous E-protein dimerization partner, which would also be true for NEUROD2.
In summary, both NEUROD2 and MYOD bind to tens of thousands of sites genome-wide. Factor-specific transcriptional programs appear to be encoded, at least in part, by private E-boxes that drive the transcriptional programs of neurons and muscles in P19 cells and MEFs, respectively, whereas many thousands of shared sites are associated with histone acetylation but not as strongly associated with regional gene transcription, particularly for NEUROD2. Cell lineage determines the accessibility of the sites and constrains the transcriptional response to each factor. The fact that NEUROD2 and MYOD activate the expression of large numbers of genes that are not normally a part of their differentiation program when expressed in a different lineage (i.e., NEUROD2 in MEFs and MYOD in P19 cells) indicates that lineage transitions, such as epithelial-to-mesenchymal transition, could profoundly alter the transcriptional program of these, or other, transcription factors.