We have established a pipeline to identify transcription factor binding sites in vivo in C. elegans. This pipeline is designed to take advantage of the stability of fosmid-based transgenes, as well as their reliability in reproducing native expression patterns. The transgenic lines emerging from this pipeline tend to have between one and three copies of the transgene, and exhibit minimal, if any, over-expression (Sarov et al., in prep). Our initial trials with the RNA polymerase II subunit AMA-1 indicate that the transgenic, tagged version of a transcriptional regulator can indeed recapitulate the DNA binding properties of the native factor. This pipeline can now be applied to additional factors and, because the same antibody is used for every immunoprecipitation, it will provide a fairly uniform survey of the binding sites of multiple factors and aid in the dissection of regulatory networks in development.
As a first step toward this major goal, we identified candidate gene targets of PHA-4 in vivo at two distinct developmental stages. We chose PHA-4 as the initial factor for binding site identification for three primary reasons. First, it is a well-characterized factor with fundamentally important, yet distinct, functions at different times in development. Second, a handful of direct transcriptional targets of PHA-4 have been independently identified and validated, providing some key positive controls. Finally, PHA-4, unlike AMA-1, is expressed tissue-specifically, primarily in digestion-related tissues such as the pharynx and intestine. Thus, it provides a test case for whether ChIP can be performed on transcription factors with restricted expression.
A little over half of the PHA-4 targets we identified are shared between these two stages, suggesting that PHA-4 has a general function in the regulation of gene expression. However, over 40% are preferentially bound in one stage relative to the other, indicating that the ability of PHA-4 to mediate different processes likely arises through a shift in the set of targets it regulates. These data indicate that transcription factors can have diverse and key roles in distinct biological processes, and they underscore the importance of identifying binding sites under multiple conditions.
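The arithmetic behind a shared versus stage-specific comparison of this kind can be illustrated with a minimal sketch. It assumes that each stage is summarized as a simple set of target gene identifiers and that fractions are taken over the union of both sets; the function name and the gene labels are hypothetical placeholders, not data from this study. Preferential binding in the analysis above may rest on quantitative differences in signal rather than on presence or absence, so this is only a simplified illustration.

```python
def compare_targets(embryo_targets, l1_targets):
    """Return the fraction of targets that are shared or stage-specific.

    Assumption: fractions are computed over the union of both target
    sets, using presence/absence only (no quantitative binding signal).
    """
    embryo = set(embryo_targets)
    l1 = set(l1_targets)
    union = embryo | l1
    if not union:
        # No targets in either stage: avoid division by zero.
        return {"shared": 0.0, "embryo_only": 0.0, "l1_only": 0.0}
    total = len(union)
    return {
        "shared": len(embryo & l1) / total,       # bound in both stages
        "embryo_only": len(embryo - l1) / total,  # bound only in embryos
        "l1_only": len(l1 - embryo) / total,      # bound only in L1s
    }


if __name__ == "__main__":
    # Hypothetical placeholder identifiers, not real targets.
    embryo = {"gene_a", "gene_b", "gene_c", "gene_d"}
    l1 = {"gene_c", "gene_d", "gene_e"}
    for category, fraction in compare_targets(embryo, l1).items():
        print(f"{category}: {fraction:.0%}")
```

Taking the union as the denominator makes the three fractions sum to one; a different denominator, such as the targets of a single stage, would give different percentages.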
Several interesting differences in PHA-4 binding were noted between the two stages. For instance, among the many examples listed, several genes encoding members of the dosage compensation complex (DCC) were preferentially bound by PHA-4 in embryos relative to L1s. During embryogenesis, PHA-4 helps specify the pharynx at the same time that the DCC is beginning to implement a two-fold reduction of transcription levels across the entire X chromosome. Little is known about how the DCC interacts with tissue-specific programs, and our data suggest that PHA-4 may help control DCC levels in order to provide more or less dosage compensation in that tissue as needed. Possibly, master regulators in other tissues also regulate DCC levels in order to bring the level of dosage compensation into alignment with the needs of a specific tissue.
We have also demonstrated a novel role for PHA-4 in promoting the survival of larvae during starvation. Reduced PHA-4 levels resulted in decreased survival, while, conversely, expression of PHA-4:GFP in a wild-type background increased survival. In particular, the increased survival indicates that the role of PHA-4 in this process is a regulatable function. This function is in keeping with its noted role in regulating environmental responses, as well as controlling longevity and dauer formation [16], [18]. Identification of the PHA-4 binding sites under the starvation condition illuminates some aspects of this function. A striking increase in bound genes involved in fatty acid metabolism and sterol biosynthesis was seen in L1s relative to embryos. Accordingly, many nuclear hormone receptor genes, which encode proteins that bind steroid hormones, were preferentially bound by PHA-4. The nuclear hormone receptor gene family in C. elegans is much expanded relative to other organisms, and many of the ligands for these proteins are unknown. It is possible that a subset of these proteins respond to endogenous steroid hormones generated in response to starvation, and that PHA-4 mediates their induction.
Overall, the experimental ChIP-Seq pipeline we developed has produced global binding data, expanding the view of how PHA-4 works as both a master regulator of organ development and a mediator of starvation survival. Based on our analysis of gene expression concomitant with the binding analysis, PHA-4 functions primarily as an activator in both situations. It is likely that the different binding patterns of PHA-4 are mediated by potential cofactors such as SMK-1 [16], as well as by interactions with other transcription factors, such as the GAGA-binding protein suggested by the motif analysis here and by other studies [15]. The binding sites of these factors can be identified using the tagging system and experimental pipeline that we have established, and integrated with the PHA-4 binding data to understand the functional relationships among these factors. Ultimately, the global DNA binding datasets we gather will greatly facilitate the formulation of developmental gene regulatory networks in C. elegans.