The Polycomb group proteins (PcGs) play a vital role throughout development by maintaining precise gene expression patterns. In Drosophila melanogaster, PcG-mediated gene silencing is achieved through DNA elements called Polycomb response elements (PREs); however, the mechanism for establishing silencing and the requirements and composition of a working PRE are not fully understood. We have used the computer program jPREdictor to uncover PREs located within the invected (inv) locus. The functionalities of these predicted PREs were tested in two different assays: one analyzing their abilities to maintain expression of a β-galactosidase reporter gene and the other evaluating their abilities to establish pairing-sensitive silencing of the mini-white reporter in the vector pCaSpeR. We have identified two previously uncharacterized PREs at the inv gene and demonstrate that they produce similar results in the two assays. Our results indicate that clusters of protein binding sites do not accurately predict PREs and provide new insight into the DNA sequence requirements for the binding of the PcG protein Pho. Finally, our data show that PREs and regulatory DNA from different genes can function together to establish PcG-mediated silencing, highlighting the versatility of PREs despite discrepancies in the number and location of DNA binding sites.
Polycomb group (PcG) genes encode a large group of conserved proteins that act on chromatin to repress transcription. Originally identified in Drosophila melanogaster as silencers of homeotic genes, PcG proteins act as transcriptional repressors for many diverse targets. PcG proteins act in multiple protein complexes to modify chromatin with marks typically associated with gene repression, including trimethylation of histone H3 at lysine 27 and ubiquitination of H2A at lysine 119 (for a review, see reference 32). Histone deacetylases may also be involved. The exact protein complexes and mechanisms of PcG protein function are areas of intense investigation, with new complexes still being discovered (for a review, see references 32 and 38). Genetic evidence suggests that not all PcG proteins work on all targets (41), so it is probable that, in addition to having shared targets, different PcG protein complexes may regulate different genes. While all PcG proteins are specifically associated with chromatin, the only known PcG proteins with sequence-specific DNA binding activity are Pleiohomeotic (Pho) and the related protein Pho-like (Phol) (3, 5). Pho is present in a complex with dSfmbt (Pho-RC) and is thought to play a key role in recruitment of other PcG protein complexes (24).
In Drosophila, PcG proteins are recruited to specific cis-regulatory DNA elements called Polycomb response elements (PREs). These elements are typically found upstream of the transcript region and vary greatly in size and sequence (for a review, see reference 31). Extensive analysis of PREs at a variety of genes, including the bithorax complex (BX-C) genes, engrailed (en), polyhomeotic (PH), and others, has revealed that PREs consist of multiple short motifs (34). DNA binding proteins implicated in PRE function include the GAGA factor (GAF) (42) and Pipsqueak (Psq) (27), which bind the same sequence, and Zeste (20, 36), Pho (5, 28), Sp1/Krüppel-like factor (KLF) family members (4), DSP1 (9), and Grainyhead (Grh) (2).
As mentioned previously, the Pho subunit of Pho-RC is a DNA binding protein and has been shown to be a central component of PREs and PcG protein function. In vitro binding studies of the Pho protein and its mammalian homolog, YY1, have resulted in the generation of a core consensus sequence for Pho binding (GCCAT) (14, 21, 29). Recent chromatin immunoprecipitation (ChIP)-on-chip experiments have led to the development of longer, more stringent versions of the Pho consensus sequence (26, 33). Once the Pho-RC complex is bound to PREs, it is believed to assist in actively recruiting the other PcG complexes to DNA, placing Pho in a role of great importance for the establishment of PcG-mediated silencing. While it is easy to speculate that Pho is a central component of PcG function, it is interesting that the removal of Pho consensus sequences from PREs does not always result in the complete loss of PcG-mediated silencing (4, 5); furthermore, while Pho mutants exhibit phenotypes associated with Polycomb (PC), they are able to survive to the pharate adult stage (16). The identification of Phol, a factor with a high degree of homology to Pho and some degree of functional redundancy, helps explain why the pho mutant phenotype is not more severe (3).
Since deletion of any single DNA binding site often does not have a dramatic effect on PRE activity, it has been speculated that the multiple binding sites work together to cooperatively recruit the factors necessary to establish gene silencing. Furthermore, it has been difficult to determine which combination of binding sites is required for PRE activity. To better understand which key elements and binding sites may be required for PRE function, Ringrose et al. (35) analyzed experimentally defined PREs throughout the genome in an attempt to find similarities that might allow for detection of additional PREs. Their findings support the idea that the DNA binding sites work cooperatively, as PREs are likely to contain multiple binding sites close to one another, with multiple Pho binding sites being the strongest predictor of a PRE.
One particular target of PcG regulation, en, has been found to have two PREs and has been the subject of many studies regarding PcG function and PRE activity (1, 5, 22). en is part of a gene complex which includes another gene, invected (inv). The two genes have been shown previously to code for proteins with similar sizes and sequences, each containing a homeodomain (6, 18). Results from additional experiments indicate that the two genes are coregulated (17, 18). These data led us to the question of whether inv is regulated by its own set of PREs or whether en PREs regulate both genes. To date, no PREs have been identified at inv, so we sought to predict new PREs at inv by utilizing the established prediction methods. Here, we report the finding of two new PREs at inv. The PREs were identified using a combination of the jPREdictor program (13) and published ChIP-on-chip data (7). While the jPREdictor program helped us identify one of the PREs, it failed to identify the second PRE, presumably because this particular PRE lacks canonical Pho consensus sequences. Despite the lack of consensus Pho binding sites, we show that Pho binds to various regions of this PRE fragment in vitro, indicating that Pho is able to bind a variety of DNA sequences.
A 30-kb DNA sequence (from FlyBase D. melanogaster release 5.9) surrounding the inv transcription start site was entered into the jPREdictor program (http://bibiserv.techfak.uni-bielefeld.de/jpredictor/). The frequencies of individual and paired DNA binding motifs (as described in reference 35) were determined, and the numerical data output from the jPREdictor program was plotted in Microsoft Excel. The coordinates of the identified fragments, including the fragment-specific primer sequences (with restriction sites and FLP recombination target [FRT] sequences not included), are listed in Table 1.
All constructs for the pairing-sensitive silencing (PSS) assay were derived from the pCaSpeR4 parent construct by using KpnI/XbaI sites for cloning of the inv fragments into pCaSpeR4. The SD10 construct was used to generate constructs for the embryonic PRE activity assay. SD10 was generated by cutting construct P[en2] (10) with SphI to remove a 2-kb fragment of DNA including PSE1 and PSE2 fragments. The inv fragments were generated by PCR using primers that contained SphI sites and FRT sites. Fragment insertion and orientation were verified by DNA sequencing.
All constructs were injected into w1118 embryos by Genetic Services (Sudbury, MA), and transformants were identified by the presence of eye color. PRE deletion (ΔPRE) lines were generated by crossing transgenic lines to a line with heat-sensitive FLP and heat shocking embryos for 1 h at 37°C 2 days post-egg laying. Removal of the inv inserts was verified by PCR.
Preparation of embryos for immunoperoxidase staining was done as described by DiNardo et al. (11). Anti-β-galactosidase (anti-β-Gal) primary antibody (1:15,000) was used with anti-rabbit secondary antibody (1:200), and antibodies were detected using an ABC elite kit (Vector Labs).
Full-length Pho was synthesized in vitro from the T7link/PHO vector (14) by using the TNT coupled transcription/translation system (Promega). Gel shifts were performed as described by Americo et al. (1), except that in vitro-synthesized full-length Pho was used in place of nuclear extract.
To better understand the coregulation of the en and inv genes, we sought to identify potential PREs upstream of and within the inv transcript. Previous studies by Ringrose et al. (35) concluded that combinations of specific PRE binding sites, including those for Pho, GAF, Zeste, and EN1, can predict that a DNA fragment is a PRE. This prediction analysis was subsequently presented in a format accessible online as the jPREdictor program (13). The DNA sequence around the inv gene (kb −12 to +13) was entered into the jPREdictor program, and prediction scores, graphically displayed in Fig. 1, were generated using the standard PC combined-motif analysis included in the program. The PRE prediction scores ranged from 0 to 90, with an average background score of 6.2.
Recent evidence from multiple groups has shown that the jPREdictor program, while useful in narrowing down DNA fragments that may have PRE functions, is not very accurate in identifying PREs, as it misidentifies many and misses some (33, 37). In addition, while our scores were not extremely low, they were not as high as scores for some well-defined PREs, such as PRE1 and PRE2 at en, which scored 235 and 134, respectively (data not shown). To help us better differentiate between real and background peaks in our prediction, we took advantage of published genome-wide ChIP-on-chip data (7) to give us a better understanding of the protein-DNA interactions around inv. Protein binding data for the PC and Polyhomeotic (PH) PcG proteins (7) were aligned with the prediction data and plotted on the graph in Fig. 1. As shown previously, the PC binding data reflect a general coating of PC along the length of the DNA, with no isolated peaks (data not shown). In contrast, the PH binding profile shows two main peaks: one approximately 4 kb upstream of the inv transcription start site and the second at the beginning of the inv transcript. The intergenic PH peak coincided with one of the strongest peaks identified by the jPREdictor program, suggesting that this particular prediction has a good chance of representing a biologically functioning PRE. Interestingly, the second PH binding peak, located around the transcription start site, did not align with any significant PRE prediction peaks. The remaining PRE prediction peaks did not coincide with any unique aspects of the PH or PC binding profiles. From these various data sets, we identified six DNA fragments (numbered 1, 2, and 4 through 7) that might be inv PREs and one control fragment (numbered 3) (Fig. 1).
The phenomenon of PSS was fortuitously discovered during studies of a 2.6-kb fragment of en DNA, subsequently found to contain two PREs (22, 23). Each of these PREs is able to repress the expression of the mini-white gene that is present in the constructs as a reporter gene. This repression is called PSS because it is much stronger in flies homozygous for a PRE-containing mini-white transgene than in flies heterozygous for the transgene. In practical terms, this often means that heterozygotes have orange eyes and that homozygotes have white eyes. Many other PRE fragments have since been shown to exhibit this ability to silence the mini-white gene, making PSS a good assay to screen for PREs among our inv DNA fragments. DNA fragments were inserted upstream of the mini-white gene in the orientation opposite of their orientation toward their own promoter (Fig. 2A). Since PRE fragments have been shown previously to be orientation independent (1, 25), these fragments should be able to function in either direction. Transgenic flies were obtained, and the number of fly lines exhibiting PSS compared to the total number of transgenic lines was determined for each inv fragment (Fig. 2B). Data on PSS by the well-characterized en PREs, PRE1 and PRE2, are included in Fig. 2B for comparison.
The inv DNA fragments gave a range of results, from strong PSS activity (fragment 1) to moderate PSS activity (fragment 4) to little (fragment 5) or no PSS activity. Fragments 1 and 4 had the highest levels of PSS activity. Figure 1 shows that they are both located at the peak PH binding sites identified by ChIP, indicating that PcG proteins are able to bind to these DNA sequences in vivo and would likely contribute to their ability to silence mini-white in our assay. Fragment 5 overlaps with fragment 4 and appears to contain a low level of activity. Interestingly, while fragment 1 corresponds to a strong PRE prediction, indicative of having multiple PRE consensus binding sites present, fragment 4 has a much lower prediction score of 26.7, a comparatively insignificant score. This finding will be further discussed later in this study. It is also interesting that some of the fragments that have the highest scores for PRE prediction, such as fragments 6 and 7, have no PSS activity.
The PSS assay gives a good indication of the abilities of the predicted fragments to invoke downregulation of a reporter gene, in this case, the mini-white gene. While the PRE-containing mini-white transgene is present in every cell in the embryo, the mini-white gene is poised for activation only in the eye and can be assayed only with this tissue. In embryos, PcG proteins act on genes that are active in some cells and repressed in others and the PcG proteins must differentiate between genes that should be kept off and those that should be left on. Thus, while the PSS assay gives a good indication of whether or not the inv fragments are capable of silencing a reporter gene, it does not tell us whether these DNA fragments are capable of recruiting PcG proteins in a discriminatory manner to maintain specific gene-silencing patterns.
To specifically assay for PRE activity, we tested the abilities of the inv DNA fragments to maintain a specific pattern of expression of a reporter gene. We began with a vector that contains the en promoter and 8 kb of en regulatory DNA cloned upstream of the β-Gal reporter gene (P[en2]) and that maintains expression in en pattern-like stripes throughout embryonic development. This construct contains regulatory DNA for en stripes and a 2-kb DNA fragment that contains en PRE1 and PRE2. We removed the 2-kb fragment that contains these PREs to generate our vector, SD10. Without the en PREs, β-Gal is expressed in stripes early in development but severe misexpression occurs during later developmental stages (10). Thus, our test inv fragments will restore the en stripe pattern late in development only if they function similarly to the en PREs.
Each inv DNA fragment was inserted into the SD10 construct between the upstream en regulatory DNA and the en promoter, yielding SD10-inv constructs. With the exception of fragment 4, all fragments were inserted in the same orientation as the en DNA and the β-Gal gene. Since fragment 4 contains the inv promoter, it was inserted in the reverse orientation to circumvent potential problems with the β-Gal reporter gene. In addition, each inv fragment was flanked by FRT sites to allow for specific excision of the fragments. These constructs were injected into fly embryos, and expression patterns of β-Gal in fixed embryos of transgenic fly lines were analyzed. While multiple lines were analyzed for each construct, representative results are shown in Fig. 3B and C.
The results obtained from this assay correlated well with those obtained from the PSS assay. Fragments 2, 3, 6, and 7 had no activity in this assay (an example of these results is presented in Fig. 3C). In general, the embryos carrying these fragments showed large amounts of misexpression in between stripes, indicating a lack of PcG-mediated repression of the en promoter in our transgene. However, some SD10-inv constructs exhibited significant variability in the degree of misexpression produced among the independent transgenic fly lines. For example, while most lines generated from the SD10 construct carrying fragment 7 (SD10-inv7) showed significant misexpression, suggesting that inv fragment 7 (inv7) does not function as a PRE, a few SD10-inv7 lines showed very little misexpression (data not shown). We have shown previously that the function of en regulatory DNA in this construct can sometimes be influenced through interaction with flanking genomic PREs, maintaining striped expression patterns late in development (10). Therefore, we utilized the FLP recombinase system to remove the inv fragments, leaving the remainder of the en promoter and upstream regulatory DNA intact. If the restrictive expression patterns observed for β-Gal are indeed the result of PRE activity from the inv fragments, then removal of the putative PRE should result in increased misexpression of the β-Gal reporter gene.
Embryos were collected and stained for β-Gal expression after removal of the inv DNA fragments. Fragments 1, 4, and 5 all led to increased misexpression in the ΔPRE embryos compared to the wild-type embryos (Fig. 3B and data not shown). The other inv fragment insertions, including the inv3 insertion (Fig. 3C and data not shown), produced no observable difference in β-Gal expression between the ΔPRE and wild-type embryos, further indicating that fragments 2, 3, 6, and 7 do not exhibit PRE activity.
Our results from the β-Gal expression experiments indicated that multiple DNA fragments from the inv locus can maintain en-like expression patterns of β-Gal in a manner indicative of PRE function. These results touch on the questions of what elements are required for PRE function and whether PREs, if fundamentally similar, are interchangeable. Our results indicate that inv PREs can substitute for en PREs at the en promoter. However, evidence suggests that inv and en are functionally redundant and likely share regulatory DNA (17, 18), so it is possible that these DNA fragments have evolved to interact with one another in vivo. To explore the possibility of whether other previously identified PRE fragments could also function with the en promoter, we inserted PRED (from the bxd region of Ubx) upstream of the en promoter in the SD10 construct to see if it was able to regulate β-Gal expression. The embryo stainings in Fig. 3D show that PRED is able to maintain almost perfect expression of β-Gal in en pattern-like stripes, even very late in embryonic development (data not shown). This result reaffirms not only that PRED is a strong PRE, but also that it is able to regulate the promoter of another gene, suggesting that there is some core element of these PREs that allows them to function with other promoters. PRED also appeared to be a stronger PRE than any of the inv PREs. That is, while nearly perfect maintenance of stripes was seen in the two PRED lines, all inv lines showed some misexpression between stripes (Fig. 3). Previous evidence suggests that there are several PRED subfragments that can silence mini-white, indicating that PRED is a complex element (19).
The data generated from the previous two experiments suggest that out of six predicted PREs in the inv locus, two acted as strong PREs and one acted as a weak PRE. Since most of the predicted fragments were chosen based on the frequency of consensus DNA binding sites in the jPREdictor program, we were interested to see if there were any clear differences in the binding site characteristics between the three identified PREs and the remaining negative fragments. In addition, other consensus binding sites that are thought to contribute to PRE function have been discovered since the 2003 study, which may aid in the development of an improved set of prediction criteria.
The DNA sequence for each predicted PRE fragment in this study was analyzed for the presence of various consensus binding sequences. In addition to the consensus sequences used in jPREdictor, we searched for DSP1 binding sites (GAAAA) (9) and Sp1 binding sites (RRGGYG) (4). Sp1 binding sites were included in the Ringrose study but were not specifically enriched in known PRE sequences versus random DNA sequences (35). DSP1 was discovered more recently in a screen for corepressors of Dorsal and was subsequently reported to aid in the recruitment of PcG proteins to the Ab-Fab PRE of Abd-B (8, 9). However, while DSP1 may aid in PRE function at some genes, it has been shown to be dispensable at others, rendering its involvement in general PRE function still uncertain (15, 25). Similarly, while the Grh protein has been shown to aid in Pho binding to the PREs, it has also been reported previously that Grh is not sufficient for PcG recruitment and also functions as an activator elsewhere in the genome (2). This observed variability in protein requirements suggests that not all PREs have the same DNA binding proteins.
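The consensus search described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study or the jPREdictor implementation; it simply expands IUPAC degenerate codes (for example, R and Y in the Sp1 motif RRGGYG) into a regular expression and reports matches on both strands. The helper names (`motif_to_regex`, `find_sites`) and the toy sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes (R <-> Y, K <-> M, ...).
COMPLEMENT = str.maketrans("ACGTRYWSKMBDHVN", "TGCAYRWSMKVHDBN")


def reverse_complement(motif):
    """Reverse complement of a motif written in IUPAC codes."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile a motif into a lookahead regex so overlapping sites are found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(sequence, motif):
    """Return sorted (position, strand) pairs for matches on either strand.

    Positions are 0-based and refer to the left end of the match on the
    given (top-strand) sequence. Hypothetical helper, for illustration only.
    """
    seq = sequence.upper()
    motif = motif.upper()
    hits = {(m.start(), "+") for m in motif_to_regex(motif).finditer(seq)}
    rc = reverse_complement(motif)
    if rc != motif:  # skip the second scan for palindromic motifs
        hits |= {(m.start(), "-") for m in motif_to_regex(rc).finditer(seq)}
    return sorted(hits)


# Toy sequence (not from the inv locus) with one DSP1 and one Sp1 site.
toy = "TTGAAAACCAAGGCGTT"
for name, motif in (("DSP1", "GAAAA"), ("Sp1", "RRGGYG")):
    print(name, find_sites(toy, motif))
```

Counting sites per fragment this way gives only the raw tallies; as the results show, such tallies did not separate active from inactive fragments.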
The distribution of the binding consensus sites identified in each inv fragment is shown in Fig. 4 and numerically displayed in Table 2; the same analysis was done for the well-characterized en PRE fragments for comparison. Upon first inspection, the visual comparison between the fragments seemed to show that active PRE fragments (inv1 and inv4) had clusters of binding sites, in contrast to other fragments (inv3 and inv7) in which the binding sites appeared to be more diffuse across the region. However, inv6 also had clustered binding sites, and this fragment was negative for PRE function, so binding site proximity does not seem to be a clear marker of PREs. Looking at the numbers of binding sites in Table 2, we find that inv4 does not contain any Pho binding sites. This was surprising since multiple studies have shown that Pho is an important contributor to the recruitment of PcG proteins (30, 34, 43). Analysis of the DSP1 and Sp1 sites did not show any significant enrichment in the PRE fragments compared to the negative fragments, and no Grh sites were found in any of the inv fragments.
Overall, it did not seem that DNA consensus sequences were useful in predicting functional PREs. This is especially obvious when comparing negative fragments with high prediction scores (like inv6) to inv4, which had strong PRE activity but virtually no PRE prediction score (Fig. 1).
DNA sequence analysis of the inv PRE fragments revealed that the inv4 PRE did not contain a consensus Pho binding site, which is believed to be important for PRE function. However, recent ChIP-on-chip binding profiles show that Pho is bound to the regions where inv1 and inv4 are located, with stronger binding of Pho at inv1 and weaker binding at inv4 (33). This raises an interesting question: how does the Pho protein bind a PRE that appears to lack Pho binding sites?
The DNA sequence required for Pho binding has been a source of much investigation. Pho is homologous to the mammalian transcription factor YY1, exhibiting 95% identity in the DNA binding domain (5). Previous studies have tried to identify a Pho-specific consensus sequence, proposing CNGCCATNDNND (28) and GCCATHWY (14) as possible motifs. However, these efforts have not yielded agreement on any sequences outside a GCCAT core sequence, similar to the YY1 consensus sequence [(C/g/a)(G/l)(C/t/a)CATN(T/a)(T/g/c), where uppercase letters indicate the preferred base] (21, 44). Since the initial analysis of the inv DNA fragments was done with the core GCCAT motif, any potential Pho sequences should have been identified, regardless of the flanking DNA sequences. However, as no Pho sites that matched this core sequence were found, we manually scanned the inv4 fragment for sequences that matched the Pho or YY1 sequence with minimal base pair mismatches. Six possible Pho binding sequences were identified, and Pho binding to these short sequences was tested using in vitro competition binding assays (Fig. 5). As expected, cold Pho-specific oligonucleotides are able to compete with radiolabeled Pho oligonucleotides for protein binding, as indicated by the lack of a radiolabeled band shift. In contrast, mutated Pho oligonucleotides lose the ability to compete for Pho binding. Interestingly, five of the six inv oligonucleotides were able to compete with the radiolabeled Pho-specific oligonucleotide, indicating that the oligonucleotide sequences have high affinities for Pho binding in vitro. We were unable to determine by further analysis of DNA sequences why the inv4-2 oligonucleotide was unable to compete for Pho binding. Thus, despite the lack of a traditional Pho consensus binding site, which resulted in our failure to computationally predict fragment 4, we find that Pho is still able to bind fragment 4 in vitro.
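The scan in this study was done manually. As a rough computational analogue, the sketch below lists every window on either strand that differs from the GCCAT core by at most a chosen number of mismatches. The function name `near_matches`, the mismatch threshold, and the toy sequences are assumptions made for illustration; they are not taken from the study.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string (A, C, G, T)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def near_matches(sequence, core="GCCAT", max_mismatches=1):
    """List windows within max_mismatches of the core on either strand.

    Returns sorted (position, strand, window, mismatches) tuples; positions
    are 0-based on the given sequence. Hypothetical helper: a relaxed scan
    like this only nominates candidates and says nothing about binding.
    """
    seq = sequence.upper()
    k = len(core)
    hits = []
    for strand, probe in (("+", core), ("-", reverse_complement(core))):
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            d = hamming(window, probe)
            if d <= max_mismatches:
                hits.append((i, strand, window, d))
    return sorted(hits)


# Toy sequences, not taken from the inv4 fragment.
print(near_matches("AAGCCATAA", max_mismatches=0))  # exact core only
print(near_matches("AAGACATAA", max_mismatches=1))  # one-mismatch variant
```

Because the core is only 5 bp long, allowing even one mismatch nominates many windows in a fragment of several hundred base pairs, which is why candidates of this kind still need an experimental test such as the competition assays described above.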
We utilized PRE prediction programs and protein binding ChIP-on-chip data to identify new PREs at the inv gene. Two PREs were identified upstream of the inv transcription start site, and their activities were confirmed using two assays which tested their abilities to silence different reporter genes at various developmental stages. Interestingly, we found that of the two major PREs identified, only one was identified using the jPREdictor program and that both PREs tracked with the PH ChIP binding profile. The absence of a PRE prediction score for the downstream PRE was due to the lack of Pho consensus sequences, one of the strongest known predictors of PREs (35). However, biochemical data presented here as well as recently published ChIP data indicate that Pho is capable of binding to this PRE fragment despite its lack of a traditional Pho consensus sequence. A third fragment was also capable of functioning as a PRE, albeit weakly in the assays used in this study. Given that the weak PRE, fragment 5, overlaps with 485 bp of fragment 4, a stronger PRE, we anticipated that the low level of activity was attributable to the partial overlap of the two fragments, thus giving fragment 5 only some of the information needed to function fully as a PRE. However, further analysis of the overlap between fragments 4 and 5 revealed that they share only four protein binding sites—two GAF sites and two Sp1 sites. Thus, the DNA sequence alone is not enough to help us understand why fragment 4 acts as a stronger PRE than fragment 5 when fragment 5 appears to contain all of the requisite binding sites for a well-functioning PRE. While inv5 is capable of producing a low level of PRE activity on its own, its close proximity to inv4 in vivo suggests that it may work with inv4 to produce an even stronger PRE near the inv transcription start site. Thus, we feel that overall, only two major PREs have been identified at inv.
PREs are an integral component of the cell differentiation process, ensuring that specific genes remain silenced throughout the remainder of development. Given their crucial involvement in various developmental processes, much time and effort have been devoted to gaining a better understanding of what basic components constitute a PRE. Many groups have demonstrated that elimination of more than one protein binding site is usually required to significantly affect PRE activity, indicating that PREs contain a multitude of protein binding sites that function cooperatively to recruit PcG proteins. However, our findings, as well as other published data (9, 33, 37), seem to indicate that specific clusters of binding sites are not enough to constitute a PRE. Additional unidentified binding sites or other chromatin-related modifications may be some of the missing links.
There is also the added challenge of identifying the exact sequences required for protein binding. Many protein consensus binding sequences are initially identified by using in vitro biochemical assays or by evaluating DNA sequence homology among multiple DNA loci. As research continues, additional experimental evidence often leads to further refinement of the consensus sequences with enhanced specificity, as recently demonstrated with longer consensus sequences identified for the Pho protein (26, 33). While these superstringent DNA sequences aid in more rapid identification of potential binding sites, they also pose the risk of eliminating other potential candidate sites from consideration. Our results from studying the inv DNA sequence demonstrate this pitfall, as one of our PREs (inv4) was capable of acting as a PRE in vivo and specifically binding the Pho protein in vitro despite its apparent lack of a Pho consensus sequence. Had the ChIP-on-chip data not been available to us, the lack of Pho consensus sites would have prevented us from identifying the inv4 PRE fragment. Furthermore, other DNA regions that had much stronger prediction scores, such as fragments 2 and 6, did not show any PRE activity. Thus, while computerized prediction tools can be a useful starting point, it is clear that PREs cannot be identified by DNA sequence alone, at least not at this time.
There has been much speculation regarding the strength of the relationship between PREs and PSS. It is not completely understood at what stage PSS takes effect, but the results are manifested in adults. Our findings allow us to make a much stronger conclusion regarding the relationship between these two silencing phenomena, and we find that in each case where PSS was observed, the same fragment was also able to maintain en pattern-like stripes as seen with en PREs. The magnitude of these silencing effects was also highly correlated with PRE activity; fragments that exhibited higher frequencies of PSS also maintained sharper stripes in the en/β-Gal PRE assay, and vice versa. While this correlation is very strong in this particular case, there are reports of fragments of DNA that can act as pairing-sensitive silencers but not as PREs in embryos (reviewed in reference 22a). In these cases, it seems likely that the DNA fragments that can mediate PSS act in combination with additional, nearby DNA fragments to form fully active PREs (as described previously for iab-2 ). Thus, available evidence suggests that DNA fragments that mediate PSS are components of PREs.
The DNA consensus sequence for Pho was identified from sequence comparisons among various PREs, including the 181-bp en PRE fragment originally used to isolate Pho (5, 28). The original core sequence identified to be necessary for Pho binding was similar to the CCATNTT consensus sequence identified for YY1, the mammalian homolog of Pho (40). The core sequence for Pho was later honed down to the GCCAT motif, although more recent studies on the Pho protein and its target DNA binding sites have led to further refinement of the Pho consensus sequence, with the most recent submission being the 11-bp sequence G(C/A)(C/G)GCCAT(T/C)TT (33). These identified sequences are frequently used in genome-wide searches for predicting Pho binding sites and are a central component in predicting PREs. Thus, it is important to include a site that is specific enough to separate the true Pho binding sites from the false positives. However, our findings indicate that future PRE predictions should be made cautiously when DNA sequences are scoured for potential Pho binding sites, as longer, more stringent DNA consensus sequences may force many PREs to be overlooked. This is further demonstrated by reports that PcG binding profiles, as determined by ChIP-on-chip studies, coincide with an average of only 20% of the PREs predicted by jPREdictor (33, 37). The goal of computationally predicting PREs is laudable, as PREs are developmentally regulated and will vary in different tissues and, thus, detection using ChIP-on-chip may miss PREs present in only a few cells of an embryo or larva.
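The trade-off between the short core and the longer consensus can be made concrete with a small sketch. It assumes the 11-bp sequence quoted above, G(C/A)(C/G)GCCAT(T/C)TT, is read as the regular expression `G[CA][CG]GCCAT[TC]TT`, and it checks the given strand only. The function name `classify_site` and the three example sequences are hypothetical.

```python
import re

# Short core motif shared by the published Pho consensus sequences.
CORE = re.compile(r"GCCAT")
# The 11-bp consensus quoted in the text, G(C/A)(C/G)GCCAT(T/C)TT,
# written as a regular expression (an assumed reading of that notation).
EXTENDED = re.compile(r"G[CA][CG]GCCAT[TC]TT")


def classify_site(site):
    """Report whether a sequence contains the core and/or extended motif.

    Only the given strand is examined; a full scan would also test the
    reverse complement. Hypothetical helper, for illustration only.
    """
    s = site.upper()
    return {
        "core": bool(CORE.search(s)),
        "extended": bool(EXTENDED.search(s)),
    }


# Hypothetical sequences chosen to show the three possible outcomes.
examples = {
    "matches both": "GCGGCCATTTT",
    "core only": "AAAGCCATAAA",
    "neither": "AAAGACATAAA",
}
for label, seq in examples.items():
    print(label, classify_site(seq))
```

Every sequence that satisfies the extended pattern also contains the core, but not the reverse, so tightening the consensus can only shrink the candidate set. A site like the third example, which misses even the core, would be invisible to both searches.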
Finally, our data suggest that the inv PREs are weaker than the bxd and en PREs, which are all composite PREs. Since PREs mediate long-distance interactions, we suggest that in vivo, the two inv PREs interact with each other and also with the en PREs to stabilize PcG repression at the en-inv locus.
We thank Jürg Müller for sending the ChIP-on-chip data on Pho binding to the en and inv regions of the genome prior to publication and Bernd Schuettengruber and Giacomo Cavalli for sending us their tiling array data for PH and PC in the region of en and inv. We thank Kris Langlais and Yuzhong Cheng for comments on the manuscript.
This research was supported by the Intramural Research Program of the NIH, NICHD.
Published ahead of print on 30 November 2009.