A critical question in gene regulation is how selective sets of transcription factors are specifically recruited to their target sites. For site-specific DNA binding factors, a major component of the genomic recruitment mechanism is the highly specific interaction of the DNA binding protein with its consensus motif. The relatively new technology of ChIP-seq has allowed very precise analyses of sequences involved in recruitment of site-specific DNA binding factors (and protein complexes associated with DNA binding factors) to specific genomic locations. For example, the
in vivo binding sites for transcription factors such as p63, STAT1, and REST show high enrichment for a specific motif. In fact, ~75% of the peaks identified by ChIP-seq for these factors contain the known consensus motif for that factor within 50 nucleotides of either side of the center of the peak (
1). However, there are clear examples of genomic recruitment of site-specific transcription factors being dictated, at least in part, by protein-protein interactions. For example, approximately half of the binding sites for the serum response factor are cell type-specific, and it has been proposed that the cell type-specific binding is due to serum response factor making different protein-protein interactions in different cell types (
2). Although tethered recruitment has been proposed as a mechanism by which human transcription factors can be recruited to the genome, very few studies have tested this possibility by analyzing the
in vivo binding patterns of transcription factors that have been mutated in their DNA binding and/or protein interaction domains. However, a recent study has shown that the estrogen receptor can be recruited to the genome through both a direct interaction of its DNA binding domain with a well characterized estrogen response element and via tethering mediated by interactions of the estrogen receptor and other DNA binding proteins such as Runx (
3).
E2F1 is the founding member of a set of transcription factors that have been implicated in controlling critical cellular (entrance into S phase, regulation of mitosis, apoptosis, DNA repair, and DNA damage checkpoint control) and organismal (regulation of differentiation, development, and tumorigenesis) functions (
4–
6). There are eight genes for E2F family members encoded in the human genome (see Refs.
5 and
7 for recent reviews of the E2F family), with the highest degree of homology among the E2F family members being in their DNA binding domains (DBDs).
E2F family members bind DNA poorly
in vitro unless they are complexed with a member of the DP family of transcription factors (
5,
8–
10). However, E2F7 and E2F8 are exceptions to this rule, functioning as homodimers or heterodimers with each other (
11–
17). The DBD of E2F1, spanning amino acids 120–191, consists of a basic helix-loop-helix structure (
4), with a fold resembling a winged helix DNA binding motif, as revealed by crystal structure analysis (
18). Although the DBD is required for direct binding to DNA, it is not sufficient for
in vitro binding. High affinity binding to DNA also requires the contribution of the adjacent hydrophobic heptad repeat leucine zipper domain (amino acids 188–241), which is known to be involved in heterodimerization with the DP family of transcription factors (
10,
19–
23). A multitude of
in vitro DNA-protein interaction studies and promoter reporter assays have identified an E2F consensus motif of TTTSSCGC, where S is either a G or a C (
4,
24), which is both necessary and sufficient for E2F binding
in vitro (
4,
24).
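As an illustration of what the degenerate consensus TTTSSCGC means in practice, the following minimal Python sketch expands the IUPAC code S (G or C) into a regular expression and scans both strands of a sequence. The function names, the 0-based forward-strand coordinates, and the default 50-nucleotide flank around a peak center are assumptions made here for illustration; this is not the analysis pipeline used in this study or in the cited work.

```python
import re

# IUPAC code S stands for G or C, so TTTSSCGC expands to TTT[GC][GC]CGC.
E2F_MOTIF = re.compile(r"TTT[GC][GC]CGC")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_e2f_sites(seq):
    """Return sorted (start, strand) tuples for consensus matches.

    Starts are 0-based positions on the forward strand (hypothetical
    convention chosen for this sketch).
    """
    seq = seq.upper()
    hits = [(m.start(), "+") for m in E2F_MOTIF.finditer(seq)]
    for m in E2F_MOTIF.finditer(reverse_complement(seq)):
        # Map the reverse-strand match back to forward-strand coordinates.
        hits.append((len(seq) - m.end(), "-"))
    return sorted(hits)


def has_motif_near_center(seq, center, flank=50):
    """True if a consensus match lies entirely within `flank` nt of `center`."""
    window = seq[max(0, center - flank): center + flank]
    return bool(find_e2f_sites(window))
```

A call such as `has_motif_near_center(peak_sequence, peak_center)` could then be applied to each ChIP-seq peak to estimate the fraction of peaks carrying the consensus, under the stated assumptions about coordinates and window size.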
Although the DNA binding domain of E2F1 is clearly critical for
in vitro DNA binding (
25), it has also been suggested that other site-specific transcription factors may influence the recruitment of E2F family members to
in vivo binding sites. For example, using cells stably transfected with wild type (WT) or mutant herpes simplex virus thymidine kinase promoter constructs, Karlseder
et al. (
26) showed that occupancy of the E2F site in that promoter required the adjacent SP1 consensus site. Furthermore, the N terminus of the E2F1 protein was shown to directly interact with SP1, suggesting that tethering of E2F1 to the genome was mediated by SP1 (
27). Several additional studies have investigated a possible partnership between these two transcription factors and confirmed cooperative binding between SP1 and E2F1 at the c-
myc,
DHFR, and mouse
TK promoters (
26,
28,
29). Because an SP1 consensus motif has been identified as one of the most common motifs present in human promoters (
30), it is possible that tethering of E2F1 to the genome via interaction of its N terminus with SP1 may be an important recruitment mechanism. In addition to the N terminus, other domains of E2F1 have been implicated in protein-protein interactions. For example, previous studies have demonstrated that TFE-3 physically interacts with E2F3 and helps to recruit E2F3 to the ribonucleotide reductase 1, ribonucleotide reductase 2, and DNA polymerase α p68 subunit promoters (
31,
32). Similarly, RYBP (Ring1 and YY1 binding protein) was identified as a “bridging” molecule between YY1 and certain E2F family members that can assist in the regulation of the
CDC6 promoter (
33). Of note, the protein-protein interactions of either TFE-3 or RYBP with E2F proteins were shown to be dependent on the E2F marked box domain (amino acids 243–358). The E2F marked box domain has also been implicated in facilitating DNA binding of E2F proteins via its interaction with DP1, in contributing to E2F-mediated DNA bending (
34,
35), and in interactions with other factors such as Jab1 (
36). Finally, NF-YA has been shown to be required for adjacent binding of E2F3 to the
cdc2 promoter (
36), whereas E2F4 binding to the c-
myc promoter was shown to depend on simultaneous binding of the SMAD proteins (
37). However, in these latter two cases, the domain of E2F required for the interaction has not been delineated.
In addition to interacting with other site-specific DNA binding factors, members of the E2F family have also been shown to interact with components of the general transcriptional machinery and/or other types of co-regulatory proteins. For example, the C-terminal transactivation domain of E2F1 (amino acids 368–437) can interact with the basal transcription factors TFIID, TFIIH, and TBP, as well as with transcription coactivators, including CBP/p300, TRRAP, GCN5, Tip60, and NCOA3 (
38–
46). Unlike many transcription factors that bind to both promoter and enhancer regions (see Ref.
47 for a review), E2F1 binds almost exclusively to core promoter regions (
48–
50), and the binding pattern of E2F1 is essentially indistinguishable from that of RNA polymerase II or TAF1 (the largest subunit of TFIID). Therefore, it is quite possible that E2F1 could be tethered to certain promoters via the strong interactions of its C-terminal transactivation domain with general transcription factors. The transactivation domain of E2F1 can also interact with members of the retinoblastoma tumor suppressor protein family. Although retinoblastoma lacks the ability to bind directly to DNA, it does interact with site-specific transcription factors such as AP2 and thus may serve as a bridge that allows AP2 to tether E2F1 to the genome (
51,
52). The C-terminal 70 amino acids of E2F1 can also interact with ANCCA (AAA nuclear coregulator cancer-associated protein, also known as ATAD2) (
53). In addition to interacting with SP1, the N terminus of E2F1 can also interact with ANCCA and with cyclin A (
53,
54).
Taken together, the many functional studies of the E2F family suggest that protein-protein interactions may play an important role in recruiting E2F1 to the genome. However, most of the above-mentioned studies were performed in vitro or focused on one, or at most a handful of, genomic binding sites. Therefore, we have now used ChIP-seq to test the hypothesis that protein-protein interactions are involved in recruiting E2F1 to target sites in the human genome.