The lineage-specific basic helix-loop-helix transcription factor Ptf1a is a critical driver for development of both the pancreas and nervous system. How one transcription factor controls diverse programs of gene expression is a fundamental question in developmental biology. To uncover molecular strategies for the program-specific functions of Ptf1a, we identified bound genomic regions in vivo during development of both tissues. Most regions bound by Ptf1a are specific to each tissue, lie near genes needed for proper formation of each tissue, and coincide with regions of open chromatin. The specificity of Ptf1a binding is encoded in the DNA surrounding the Ptf1a-bound sites, because these regions are sufficient to direct tissue-restricted reporter expression in transgenic mice. Fox and Sox factors were identified as potential lineage-specific modifiers of Ptf1a binding, since binding motifs for these factors are enriched in Ptf1a-bound regions in pancreas and neural tube, respectively. Of the Fox factors expressed during pancreatic development, Foxa2 plays a major role. Indeed, Ptf1a and Foxa2 colocalize in embryonic pancreatic chromatin and can act synergistically in cell transfection assays. Together, these findings indicate that lineage-specific chromatin landscapes likely constrain the DNA binding of Ptf1a, and they identify Fox and Sox gene families as part of this process.
For several decades, investigators have attempted to elucidate how members of transcription factor families with highly similar DNA binding motifs are able to achieve specificity of binding and activation of select target genes. Equally puzzling is how a factor may be utilized in multiple organ systems to direct exceedingly disparate developmental programs. An emerging theme is the widespread use of binding partners to determine DNA binding selectivity, as extensive studies with homeodomain and Sox factors have demonstrated (summarized in references 1 and 2). In these instances, two or more DNA-binding transcription factors bind cooperatively and thus extend the recognition element to include both cognate motifs. Many of these studies examined isolated examples of binding partner utilization, and few have been able to address such in vivo mechanisms genome-wide. Moreover, the increased binding specificity derived from DNA-binding cofactors appears to be insufficient to account for most tissue-specific transcriptional differences (3). Others have proposed that target selection is mediated instead by local chromatin architecture. For instance, genome-wide studies in yeast to mammalian cell lines have demonstrated that the combination of chromatin accessibility and binding site affinity dictates transcription factor binding patterns (4–8). We address this fundamental question and show that the basic helix-loop-helix (bHLH) transcription factor Ptf1a combines both strategies to drive distinct programs for pancreas and neural tube development.
Like other class II bHLH factors, Ptf1a forms a heterodimer with E proteins; however, it requires a third component, Rbpj, to form a trimeric complex (PTF1) that activates transcription of target genes (9–11). During acinar pancreatic development, this complex activates the expression of Rbpjl, which then replaces Rbpj in the PTF1 complex to complete acinar maturation and maintain exocrine gene expression into adulthood (10, 12). Irrespective of whether Rbpj or Rbpjl is present in the PTF1 complex, in the small number of downstream targets identified, the complex binds a bipartite motif containing an E box (the bHLH consensus site) and a TC box (the Rbpj/Rbpjl consensus site) separated by one, two, or three helical turns of DNA (9, 13–17).
Loss-of-function studies have demonstrated the necessity for Ptf1a both for the commitment to the pancreatic lineage from endoderm and the maintenance of the acinar pancreas, as well as for the specification of inhibitory neuron populations in the dorsal spinal cord, cerebellum, and retina (18–22). In the absence of Ptf1a, the pancreas fails to form, and Ptf1a lineage cells instead adopt either a duodenal or a biliary identity (10, 18, 23). Similarly, in multiple regions of the nervous system, inhibitory neurons fail to form in Ptf1a mutants, and Ptf1a lineage cells switch fates to acquire an excitatory neuronal identity (19–22, 24). These functions of Ptf1a are dependent on a Ptf1a-Rbpj transcription complex (10, 11). However, how Ptf1a in the PTF1 complex regulates different target genes in the developing pancreas versus the neural tube remains unknown.
In this study, we used a combination of genome-wide deep sequencing technologies to address possible mechanisms by which Ptf1a can give rise to such different tissues as the pancreas and neural tube. The majority of Ptf1a binding regions do not overlap between these tissues, and those that do tend to be near genes for non-tissue-specific developmental signaling processes. Results from experiments to distinguish between the roles of lineage-specific chromatin accessibility and lineage-specific cofactors support the involvement of both mechanisms for the tissue-specific activities of Ptf1a. In particular, the forkhead factor Foxa2 is localized with Ptf1a in the pancreas-specific genomic regions, whereas Sox motifs are found near Ptf1a-bound sites in the neural tube. Further experiments testing direct interactions between Foxa2 and Ptf1a showed no such complex formation and suggest alternative mechanisms for their collaboration in pancreas development. Taken together, our results support a model whereby Ptf1a directs distinct gene expression programs that are established by the interplay of chromatin accessibility, collaborating transcription factors, and the DNA motifs within tissue-specific regulatory enhancers.
Neural tube, telencephalon, and limb tissues from embryonic day 12.5 (E12.5) mouse embryos and pancreata from E17.5 embryos were dissected and placed in buffer A (15 mM HEPES [pH 7.6], 60 mM KCl, 15 mM NaCl, 0.2 mM EDTA, 0.5 mM EGTA, 0.34 M sucrose) on ice. Nuclei were liberated by Dounce homogenization and purified by centrifugation through a sucrose gradient (15 mM HEPES [pH 7.6], 60 mM KCl, 15 mM NaCl, 0.1 mM EDTA, 0.25 mM EGTA, 1.25 M sucrose). Nuclei were then fixed in 1% formaldehyde for 10 min at 30°C, and fixation was terminated by adding glycine to a final concentration of 0.125 M. After centrifuging through another sucrose gradient, fixed nuclei were lysed in sonication buffer (1% Triton, 0.1% sodium deoxycholate, 150 mM NaCl, 50 mM Tris, 5 mM EDTA). Chromatin was sheared using a Diagenode Bioruptor for 30 min on high power with 30 s on-off cycles. One hundred micrograms (pancreas) or 250 μg (neural tube and telencephalon/limb) of chromatin was immunoprecipitated with either 2.4 μg of affinity-purified rabbit anti-Ptf1a antibody (25), 5 μl of rabbit Rbpjl antiserum (9), 5 μl of rabbit Rbpj antisera (laboratory generated), or 4 μg of goat anti-Foxa2 antibody (Hnf3b M-20) (catalog no. sc-6554; Santa Cruz) and protein A/G agarose beads (Santa Cruz). Rabbit polyclonal Rbpj antibodies were generated (Covance) using the peptide sequence (CKKK)NSSQVPSNESNTNSE, an epitope near the C terminus of Rbpj that has no similarity to the related Rbpjl. Captured bead-antibody complexes were washed twice with sonication buffer, three times with a high-salt buffer (1% Triton, 0.1% sodium deoxycholate, 500 mM NaCl, 50 mM Tris, 5 mM EDTA), twice with LiCl (0.5% NP-40, 0.5% sodium deoxycholate, 250 mM LiCl, 10 mM Tris, 1 mM EDTA), and once with TE (10 mM Tris pH 8.0, 1 mM EDTA). Two 15-min elutions were performed with 1% SDS, 0.1 M NaHCO3, and 10 mM Tris at room temperature. The immunoprecipitated DNA was then purified with a Qiagen PCR purification kit.
Independent sequencing libraries were prepared for two E12.5 neural tube samples and one E12.5 telencephalon sample for Ptf1a and Rbpj ChIPs and for two E17.5 pancreas samples for Ptf1a, Rbpjl, and Foxa2 ChIPs. In addition, input samples from E17.5 pancreas and E12.5 neural tube were prepared for sequencing. All libraries were made according to Illumina's ChIP-Seq DNA sample preparation protocol. Amplified libraries were prepared using Invitrogen's Platinum Pfx polymerase for 16 cycles of 94°C for 15 s, 62°C for 30 s, and 72°C for seconds. Single-end sequencing of 36 bp (Ptf1a and Rbpjl) or 40 bp (Rbpj and Foxa2) was conducted for all samples on an Illumina GAIIx sequencer.
Tissue samples from E12.5 neural tube and E17.5 pancreas were dissected in cold phosphate-buffered saline (PBS), homogenized, and cross-linked with 1% formaldehyde for 8 min at room temperature, and fixation was terminated by adding glycine to a final concentration of 0.125 M. Samples were sonicated using a Diagenode Bioruptor for 36 min with 30 s on-off cycle times. Samples were centrifuged at 14,000 rpm to pellet debris. Two rounds of phenol-chloroform extraction were performed, yielding 500 μl aqueous phase, to which 8.25 μl 5 M NaCl solution was added. Samples were incubated overnight at 65°C to reverse cross-links. Samples were ethanol precipitated, resuspended in 50 μl 10 mM Tris-HCl (pH 7.5), and digested for 4 h using 5 μl 10-mg/ml proteinase K. One microliter of 10-mg/ml RNase A was added and incubated for an additional 30 min at 37°C. Samples were purified using a Qiagen PCR Cleanup kit, prepared for sequencing using the NEBNext ChIP-Seq Library Prep Set Master Mix for Illumina, and sequenced on the Illumina HiSeq2000.
Mouse neural tubes were dissected from E12.5 2.3Ptf1a-GFP embryos (16) into Dulbecco modified Eagle medium (DMEM)–F-12 on ice and dissociated in 0.25% trypsin for 15 min at 37°C. Trypsin activity was quenched with 2% fetal bovine serum, and green fluorescent protein (GFP)-positive cells were purified from the resulting single-cell suspension by fluorescence-activated cell sorting (FACS). Total RNA from the sorted cells was extracted and purified with Zymo's Mini RNA isolation kit. mRNA was isolated, reverse transcribed, and amplified for sequencing with Illumina's mRNA-Seq kit. Two independent libraries were made from RNA isolated from 6 and 9 embryos, respectively. For the pancreas and transgenic embryos, total RNA was isolated using TRIzol (Invitrogen). mRNA was isolated, reverse transcribed, and amplified to prepare libraries for sequencing with Illumina's Tru-Seq kit. Whole pancreata from E15.5 2.3Ptf1a::Sox1 and wild-type (WT) littermates and neural tubes from E12.5 R7Ptf1a::Foxa2 embryos and wild-type littermates were used. Two Foxa2 transgenic E12.5 neural tube libraries and three Sox1 transgenic E15.5 pancreas libraries were made from RNA isolated from two embryos that were confirmed to express each transgene. One wild-type E12.5 neural tube library was made from RNA isolated from three neural tubes, and two wild-type E15.5 pancreas libraries were made from RNA isolated from three pancreata.
ChIP-Seq and FAIRE-Seq sequence reads were mapped to the mm9 assembly of the mouse genome with Bowtie (26). Duplicate reads were removed, and the remaining unique reads were normalized to a 10-million-read count for peak calling analysis. Peak calling was performed by HOMER (27) using an FDR cutoff of 0.001. Putative peaks were required to have 4-fold enrichment over the control/input sample and a cumulative Poisson P value of <0.0001. All pancreas ChIP-Seq samples were compared to the pancreas input, while the neural tube Rbpj samples were compared to the neural tube input. Neural tube Ptf1a ChIP-Seq samples were compared to a Ptf1a ChIP-Seq sample of the telencephalon, a neural tissue at a similar developmental stage that does not express Ptf1a, to control for nonspecific background from the antibody. This control was more stringent than using neural tube input chromatin, but similar results were found with both methods. For downstream analysis, only ChIP-Seq peaks that were called in both biological replicates were used. Common binding sites were required to be within 150 bp of each peak center. Quantification of the data at peak regions was performed by HOMER, and results were loaded into Multiple Experiment Viewer (MeV) v4.8.1 (28) to generate heat maps.
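The replicate-overlap criterion (peak centers within 150 bp of each other) can be illustrated with a minimal sketch. It is not HOMER's own implementation; the function name `common_peaks` and the `(chromosome, center)` tuple representation are assumptions made for illustration only.

```python
from bisect import bisect_left

def common_peaks(rep1, rep2, max_dist=150):
    """Keep rep1 peaks whose center lies within max_dist bp of some
    rep2 peak center on the same chromosome.

    rep1, rep2: iterables of (chromosome, center) tuples (hypothetical format).
    """
    # Index replicate 2 centers by chromosome and sort them for binary search.
    by_chrom = {}
    for chrom, center in rep2:
        by_chrom.setdefault(chrom, []).append(center)
    for centers in by_chrom.values():
        centers.sort()

    shared = []
    for chrom, center in rep1:
        centers = by_chrom.get(chrom, [])
        # First rep2 center that is >= center - max_dist.
        i = bisect_left(centers, center - max_dist)
        if i < len(centers) and centers[i] <= center + max_dist:
            shared.append((chrom, center))
    return shared

rep1 = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
rep2 = [("chr1", 1100), ("chr1", 5200), ("chr3", 300)]
shared = common_peaks(rep1, rep2)
```

In this toy input only the first chr1 peak is retained: the second pair of centers is 200 bp apart, and the chr2 peak has no counterpart on the same chromosome.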
Sequence reads were aligned to the mm9 build of the mouse genome using TopHat v2.0.4 (26). Default settings were used with the following exceptions: -G option (which instructs TopHat to initially map reads onto a supplied reference transcriptome) and --no-novel-juncs option (which ignores putative splice junctions occurring outside known gene models). If a biological replicate was available, then it was specified and used to build an expression level model determined by the FPKM method of Cuffdiff v2.0.2 (29). Upper-quartile normalization (-N option) was performed to more accurately estimate levels of low-abundance transcripts, and multiple-read correction (-u option) was selected to better distribute reads mapping to multiple genomic locations. To correct any sequence-specific bias introduced during the library preparation, option -b was used. All other settings were left at default values.
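For readers unfamiliar with the unit, the basic FPKM quantity (fragments per kilobase of transcript per million mapped fragments) can be sketched as follows. This shows only the textbook definition; Cuffdiff's actual model additionally applies effective-length, upper-quartile, multi-read, and bias corrections that are not reproduced here, and the function name `fpkm` is hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million
    mapped fragments. Simplified; not Cuffdiff's full statistical model."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

# 500 fragments on a 2-kb transcript in a library of 10 million mapped fragments.
example = fpkm(500, 2_000, 10_000_000)
```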
Gene ontologies and gene annotations for ChIP-Seq peaks were obtained using GREAT v1.82 (30). GREAT assigns a gene to a binding region if the region falls within 5 kb 5′ or 1 kb 3′ of the transcription start site (basal region) with a maximum extension of 1,000 kb in either direction. If the binding region falls within the basal regions of multiple genes, then more than one assignment is made. All parameters were left at their default settings.
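The basal-plus-extension assignment rule can be sketched for a single chromosome as below. This is a simplified illustration, not GREAT's code: it assumes that the extension is measured from the transcription start site, that the neighbouring gene is the one with the adjacent TSS, and that genes are given as hypothetical `(name, tss, strand)` tuples.

```python
def regulatory_domains(genes, upstream=5_000, downstream=1_000, max_ext=1_000_000):
    """Compute a basal-plus-extension domain for each gene on one chromosome.

    genes: list of (name, tss, strand) tuples (hypothetical format).
    Returns {name: (start, end)}.
    """
    basal = []
    for name, tss, strand in genes:
        if strand == "+":
            lo, hi = tss - upstream, tss + downstream
        else:  # minus strand: 5' is at higher coordinates
            lo, hi = tss - downstream, tss + upstream
        basal.append((name, tss, max(lo, 0), hi))
    basal.sort(key=lambda g: g[1])

    domains = {}
    for i, (name, tss, lo, hi) in enumerate(basal):
        # Extension stops at the neighbour's basal domain or at max_ext
        # from the TSS, whichever comes first (assumption, see lead-in).
        left_limit = basal[i - 1][3] if i > 0 else 0
        right_limit = basal[i + 1][2] if i + 1 < len(basal) else float("inf")
        start = min(lo, max(tss - max_ext, left_limit))
        end = max(hi, min(tss + max_ext, right_limit))
        domains[name] = (max(start, 0), end)
    return domains

def assign_genes(pos, domains):
    """Return every gene whose regulatory domain contains position pos."""
    return sorted(name for name, (start, end) in domains.items() if start <= pos <= end)

domains = regulatory_domains([("A", 10_000, "+"), ("B", 50_000, "+")])
```

With these two toy genes, a binding region at position 30,000 lies in the extended domains of both A and B and is therefore assigned to both, which mirrors how a single distal region can receive more than one gene assignment.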
Peak regions were first trimmed to 150 bp centered about the peak summit to increase the likelihood of finding motifs associated with binding and to decrease the computational burden. Motif discovery was conducted with the HOMER package v3.8.1 using the following settings: -size 150 -S 10 -len 8,10,12 -bits. Motif density plots were generated in HOMER (annotatePeaks.pl) by identifying all matches to HOMER-generated motif matrices within 1 kb surrounding peak regions.
Locations and numbers of enriched motifs were obtained by HOMER (annotatePeaks.pl). Motif spacing was calculated from midpoint to midpoint between any E box and any TC box, Sox, Hox, Fox, or GATA motif found in the same peak. Histogram frequencies were then normalized to the total number of spacings found within each ChIP-Seq data set for better comparison.
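The spacing calculation described above can be sketched as follows, assuming motif midpoints have already been extracted for each peak. The dictionary keys `ebox` and `partner` and the function name are hypothetical; the sketch pairs every E box with every partner motif in the same peak and normalizes the histogram to the total number of spacings.

```python
from collections import Counter

def spacing_histogram(peaks):
    """Midpoint-to-midpoint spacing between every E box and every partner
    motif within the same peak, as fractions of all spacings observed.

    peaks: list of dicts with 'ebox' and 'partner' lists of midpoints in bp
    (hypothetical format).
    """
    counts = Counter()
    for peak in peaks:
        for e in peak["ebox"]:
            for p in peak["partner"]:
                counts[abs(p - e)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize so that data sets with different peak numbers are comparable.
    return {spacing: n / total for spacing, n in sorted(counts.items())}

toy_peaks = [
    {"ebox": [100], "partner": [111, 122]},
    {"ebox": [50, 60], "partner": [71]},
]
hist = spacing_histogram(toy_peaks)
```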
LacZ reporter transgenes were generated by cloning PCR fragments (~1 kb) from mouse genomic DNA for the regions listed in Table S3 in the supplemental material 5′ of the basal promoter of a lacZ reporter plasmid. The lacZ reporter contains the basal promoter from the rat Ela1 gene, a simian virus 40 (SV40) nuclear localization signal, the lacZ coding sequence, and the 3′ untranslated region plus a poly(A) addition signal from bovine growth hormone, as well as an insulator sequence from the Gallus HS4 gene (14). Transgenic constructs for misexpressing Foxa2 or Sox1 were generated using regulatory sequences from the Ptf1a locus. The R7Ptf1a::Foxa2 transgene contains a 1.2-kb enhancer 3′ of the Ptf1a gene called R7 that has the same activity in transgenic mice as the 12.4-kb Ptf1a enhancer described previously (reference 16 and unpublished data). R7 (chromosome 2 [chr2]:19380202 to chr2:19381456 from mm9) was cloned upstream of the β-globin basal promoter driving mouse Foxa2 cDNA followed by a bovine growth hormone poly(A) signal sequence. The 2.3Ptf1a::Sox1 transgene contains the 2.3-kb autoregulatory enhancer region (chr2:19351717 to chr2:19354014 from mm9) (14) cloned upstream of the β-globin basal promoter driving mouse Sox1 cDNA followed by a bovine growth hormone poly(A) signal sequence.
Transgenes were prepared for microinjection using Elutip-d columns (Schleicher & Schuell). Transgenic embryos were generated by pronuclear injection of linear transgene DNA into mouse eggs by the UT Southwestern Transgenic Core Facility. All housing, husbandry, and procedures were performed in accordance with NIH guidelines, under protocols approved by the Institutional Animal Care and Use Committee at UT Southwestern Medical Center.
For lacZ reporter gene expression, embryos were harvested at E12.5 or E15.5, and yolk sacs were PCR genotyped using primers directed against the lacZ gene. The primers used were 5′-TGCTGCTGGTGTTTTGCTTCC-3′ and 5′-CGATATTATTTGCCCGATGTA-3′. E12.5 embryos were immersion fixed in 4% formaldehyde–PBS for 30 min and then washed in PBS before being transferred to an X-Gal (5-bromo-4-chloro-3-indolyl-β-d-galactopyranoside) staining solution overnight. E15.5 embryos were dissected to remove the head and the contents of the peritoneum, fixed for 1 h, and washed in PBS before being transferred to an X-Gal staining solution overnight. The X-Gal staining solution consisted of 5 mM potassium ferrocyanide trihydrate, 5 mM potassium ferricyanide, 2 mM MgCl2, and 1 mg/ml X-Gal in PBS. Embryos/tissues were washed in PBS and postfixed in 4% formaldehyde–PBS at 4°C before embedding for cryosection.
HEK293 or pancreatic 266-6 cells were transfected with plasmid DNA using FuGene 6 (Promega) as described previously (14). Transfection results reflect at least four independent experiments assayed in duplicate. β-Galactosidase (β-gal) levels expressed by reporter genes were normalized according to the levels of Renilla luciferase from the cotransfected pRL-CMV (Promega). To permit transfections performed at different times to be analyzed together in the HEK293 experiments, final results were normalized according to the level of the positive-control transfections of pCMVbeta (Clontech). Results are presented as normalized relative light units (nRLU). Transcription factor expression vectors used the cytomegalovirus (CMV) promoter to direct expression of mouse Ptf1a (in pcDNA3.1) (10) and mouse Foxa2 (in pCIG) (31). For reporter constructs, one to three repeats in tandem of the tested motifs (for sequences, see Fig. 7C) were cloned adjacent to the promoter of the lacZ reporter plasmid described above for mouse transgenes. Empty pcDNA3.1 vector was used to keep the total amount of expression vector and CMV promoter constant in each transfection.
Proteins were synthesized in vitro (in vitro transcription and translation [IVTT]) using the TnT coupled reticulocyte system (Promega). DNA probes were radiolabeled with [γ-32P]ATP using T4 polynucleotide kinase (New England BioLabs). Binding reaction mixtures (20 μl) were incubated for 30 min at 30°C and set up as follows: 5 μl 4× binding buffer (20% glycerol, 40 mM HEPES-NaOH [pH 7.9], 16 mM Tris-HCl [pH 8.0], 320 mM NaCl, and 4 mM EDTA, pH 8.0), 1 μl 100 mM dithiothreitol (DTT), 2 μl 800 ng/μl poly(dI-dC), 1 to 5 μl IVTT protein extract(s), and 1 μl labeled probe (20,000 cpm). Binding reaction products were separated on 4.5% polyacrylamide gels by applying a constant current of 15 mA/cm2 for 2 h at 4°C; gels were prepared using 1× running buffer (50 mM Tris-HCl, 380 mM glycine, and 2.8 mM EDTA) and were prerun for 15 min at a constant current of 7.5 mA/cm2 at 4°C.
HEK293T cells were seeded at a concentration of 8 × 10⁵ cells/6-cm dish. The following morning, cells were cotransfected as indicated using FuGene 6 (Roche Life Sciences). The cells were washed with ice-cold PBS and collected at 48 h after transfection. Cell pellets were dissolved with 5 pellet volumes of lysis buffer (50 mM HEPES-NaOH [pH 7.4], 5 mM MgCl2, 100 mM KCl, 1 mM DTT, 0.2% Triton X-100, 1 mM phenylmethylsulfonyl fluoride [PMSF], and EDTA-free protease inhibitor cocktail [Roche Life Sciences]) and homogenized by passing through a 25-gauge needle 5 times. After resting on ice for 20 min, lysates were cleared by centrifugation at 20,000 × g for 20 min at 4°C. Anti-Ptf1a antibody (11) was conjugated to protein A/G beads (Santa Cruz) in lysis buffer by rotating for 2 h at 4°C. One hundred microliters of the resulting antibody-bead mixture was added to cleared lysate and rotated overnight at 4°C. Beads were washed 4 times with lysis buffer and eluted with a small volume of 2× SDS-PAGE sample buffer (30 mM Tris-HCl [pH 6.8], 1% SDS, 5% glycerol, 0.0125% bromophenol blue, and 100 mM DTT). Coimmunoprecipitation was assessed by Western blotting using anti-myc (A-14) (1:2,000; Santa Cruz) and anti-Foxa2 (3143) (1:1,000; Cell Signaling Technology) antibodies. Note that the heavy-chain fragment (approximately 50 kDa) from the anti-Ptf1a antibody used for immunoprecipitation comigrates with Foxa2 and weakly cross-reacts with the horseradish peroxidase (HRP)-conjugated secondary antibody used for detection.
All sequencing data sets have been submitted to GEO under accession no. GSE47459.
An intuitive explanation for the divergent functions of Ptf1a in pancreatic and neural development is that it binds to distinct enhancers and thus regulates tissue-specific gene programs. To test this, we performed genome-wide chromatin immunoprecipitation sequencing (ChIP-Seq) to identify chromatin sites bound by Ptf1a in the E12.5 neural tube and E17.5 pancreas (see Table S1 in the supplemental material). Using ChIP-Seq data from two replicates each from neural tube and pancreas, we identified genomic regions with enriched Ptf1a binding (peaks) signifying potential Ptf1a-regulated enhancers. Taking the peaks present in both replicates in a given tissue, our analysis revealed 2,241 Ptf1a-bound sites in the neural tube and 9,985 in the pancreas, of which 501 were common to both tissues (Fig. 1A and B). The neural tube ChIP had a lower signal-to-noise ratio, probably due to the heterogeneity in the tissue used relative to the pancreas sample, and thus the neural tube count is likely an underestimate of the number of Ptf1a-bound sites in neural tissue. Nevertheless, a majority of genomic sites bound by Ptf1a are distinct between neural tube and pancreas. This conclusion holds true even when the stringency of the peak calling is reduced.
To gain more insight into the characteristics of Ptf1a binding, we assigned peaks to genes using the algorithm GREAT developed for ChIP-Seq data (30). Notably, many genes have more than one associated Ptf1a-bound region (Fig. 1C), and the overwhelming majority of Ptf1a-bound regions fall well distal to their assigned genes (i.e., 5 to 500 kb either upstream or downstream from the transcription start sites) (Fig. 1D) (17). These findings are consistent with other recent studies for site-specific binding factors (30). Gene ontology analysis from GREAT showed that genes assigned to neural tube-restricted binding sites are significantly enriched for neural processes such as the development of spinal cord and eye and for axon guidance. Similarly, genes assigned to pancreas-specific sites are heavily enriched for processes of glandular and exocrine development and differentiation, including enrichments for protein synthesis, endoplasmic reticulum (ER) functions, intracellular protein trafficking, and the response pathways for unfolded secretory proteins. Target genes common to both tissues are involved in more general functions, including cell proliferation and regulation of signal transduction (Fig. 1E).
Ptf1a exists in a trimeric complex with an E protein and Rbpj or Rbpjl, and it is these complexes that are required for transcription activation and the development of the neural tube and pancreas (9–11). From previous studies largely using select gene targets in the pancreas, the trimeric form of Ptf1a (PTF1) was found to bind to sequences containing an E box (CANNTG) and a TC box (YTYYCA) separated (midpoint to midpoint) by one or two helical turns (9).
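The two consensus sequences can be located in a DNA sequence with a short scan that expands IUPAC ambiguity codes and searches both strands. This is a minimal sketch and not the HOMER-based analysis used in the study; the helper names and the toy sequence are hypothetical.

```python
import re

# IUPAC ambiguity codes used by the motifs in the text, as regex classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "Y": "[CT]",
         "R": "[AG]", "S": "[CG]", "W": "[AT]", "M": "[AC]", "K": "[GT]"}
COMPLEMENT = str.maketrans("ACGTNYRSWMK", "TGCANRYSWKM")

def revcomp(motif):
    """Reverse complement of an IUPAC motif."""
    return motif.translate(COMPLEMENT)[::-1]

def midpoints(seq, motif):
    """0-based midpoints of all (overlapping) motif matches on either strand."""
    hits = set()
    for m in {motif, revcomp(motif)}:
        # Lookahead so that overlapping matches are also reported.
        pattern = "(?=(" + "".join(IUPAC[b] for b in m) + "))"
        for match in re.finditer(pattern, seq.upper()):
            hits.add(match.start() + (len(m) - 1) / 2)
    return sorted(hits)

# Toy sequence: a CAGCTG E box and a TTCCCA TC box whose midpoints are 11 bp apart.
seq = "AAACAGCTGCCCCCTTCCCAGGG"
eboxes = midpoints(seq, "CANNTG")
tcboxes = midpoints(seq, "YTYYCA")
spacing = tcboxes[0] - eboxes[0]
```

In the toy sequence the midpoint-to-midpoint spacing is 11 bp, which corresponds to roughly one helical turn of DNA.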
One possible explanation for the tissue-specific binding preferences of Ptf1a is that the PTF1 complex binds distinct DNA motifs in the neural tube and pancreas. To address this, we analyzed 150 bp surrounding the apex of the Ptf1a peaks for motifs using HOMER (27). For both tissues, we identified an E box as the most enriched motif, with only subtle sequence preferences in the two tissues (Fig. 2A; also see Fig. 5). Both tissues have a strong prevalence of the palindromic CAGCTG E box; however, the E box CATCTG/CAGATG is enriched in neural tube binding regions, while CACCTG/CAGGTG E boxes are enriched in the pancreas (Fig. 2A; see Fig. S1 in the supplemental material). The second most enriched motif was the cognate Rbpj/Rbpjl binding sequence, or TC box, which also exhibited only slight differences in each tissue. The neural tube TC box, YTSYCA, shows greater degeneracy at the third position than does the pancreas TC box, YTCYCA (Fig. 2A). The TC box from Ptf1a-bound regions shared between the two tissues, TTCYCA, shows even less degeneracy. Because the differences represent preponderance rather than the absolute range of variations, which is the same among the three groups, it is unclear whether these subtle tissue-specific preferences affect the specificity of in vivo binding or function of PTF1.
To investigate whether the previously proposed spacing constraint for 1 or 2 helical turns between the E box and TC box applied to all Ptf1a-bound sites, we determined the spacing of all E-box and TC-box motifs in each peak, measuring the distance midpoint to midpoint. The most frequent spacing in both tissues occurs around 11 and 22 bp, consistent with previous studies (Fig. 2A) (9, 13–16). The Ptf1a peaks in pancreas demonstrated enrichment for a 3-helical-turn spacing as well, consistent with a recent study using the acinar cell line 266-6 (17). A majority of Ptf1a peaks in each group (pancreas only, 67%; neural tube only, 53%; and common, 81%) have at least one E box/TC box with the PTF1 constrained spacing (Fig. 2A, pie charts). It is likely that this analysis underestimates bona fide PTF1 sites given PTF1's tolerance of motif degeneracy (9). Taking the results together, the genome-wide analysis of Ptf1a binding alone in the two tissues confirms the E box/TC box with spacing constraints as previously predicted for Ptf1a in the pancreas. No novel spacing preference between E box and TC box in either tissue was revealed.
The motif analysis of Ptf1a-bound sequences implies Rbpj and Rbpjl binding at many of these same sites, which is also expected given the apparent requirement of Rbpj or Rbpjl for the transcriptional activity of Ptf1a (9–11). To assess this genome-wide, we performed ChIP-Seq for Rbpj and Rbpjl with chromatin from the neural tube and pancreas, respectively (see Table S1 in the supplemental material). Rbpjl was used for pancreas rather than Rbpj since it is the dominant binding partner for Ptf1a at E17.5 in the pancreas (10), and ChIP-Seq for Rbpj at this stage identified fewer than 1,000 peaks, making comparative analysis more difficult. Using analyses similar to those described above, we identified 11,678 Rbpjl binding sites in the pancreas at E17.5 and 3,035 significantly enriched Rbpj binding sites in the neural tube at E12.5 (Fig. 2B and C). As predicted, most (71%) Ptf1a-bound regions in the pancreas displayed colocalization with Rbpjl (Fig. 2B). Significant colocalization between Ptf1a and Rbpj occurred for the neural tube, although with decreased frequency (25% of Ptf1a peaks) (Fig. 2C), consistent with the lower frequency of E box/TC boxes with constrained helical turn spacing found by the motif analysis. The stringent criteria for peak calling discard many low-affinity Ptf1a and Rbpj/Rbpjl binding events, and visual inspection of the heat maps (Fig. 2B and C) suggests that the overlap is likely greater than 25% for neural tube and may approach 100% for pancreas. Together these results assessing Ptf1a and Rbpj/Rbpjl binding genome-wide support the broad existence of the PTF1 complex in both pancreas and neural tube, although the decreased frequency in the neural tube suggests a possible Rbpj-independent role for Ptf1a in the nervous system.
Chromatin accessibility as defined by the epigenetic landscape is known to change over developmental time from stem cell stages to differentiated cell types in mature tissues (32, 33). Chromatin accessibility is a likely candidate for explaining the distinct Ptf1a binding in the pancreas and neural tube chromatin given the different developmental histories for these tissues. To determine if chromatin accessibility differences correlate with the ability of Ptf1a to bind its recognition motif in the two tissues, we performed formaldehyde-assisted identification of regulatory elements followed by deep sequencing (FAIRE-Seq) (34). In this procedure, open chromatin regions are preferentially isolated and subsequently sequenced. We performed FAIRE-Seq with E12.5 neural tube and E17.5 pancreas chromatin (Fig. 3). A heat map of Ptf1a-bound sites designated neural tube only, pancreas only, or in common in the two tissues is shown next to the FAIRE-Seq signals at these sites in the two tissues. Sites bound by Ptf1a in each tissue show concordant FAIRE enrichment, as the chromatin accessibility model predicts. In the pancreas-only Ptf1a-bound regions, FAIRE-Seq indicated that 53% of these were “open” (>5 sequence tags) in pancreas and only 10% were “open” in the neural tube. Conversely, in the neural tube-only Ptf1a-bound regions, FAIRE-Seq indicated that 75% were “open” in the neural tube, compared to 15% in the pancreas. Similarly, Ptf1a-bound sites in common between the two tissues are at regions with high FAIRE values in both tissues, with 61% of the sites “open” in both tissues. This is illustrated both globally (Fig. 3A) and for specific examples around Klf7 (neural tube only), Cela3b (pancreas only), and Fbxl19 (common target) (Fig. 3B and 4). Thus, chromatin accessibility is at least part of the mechanism of tissue-specific Ptf1a binding in the genome.
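The "open" classification used for these percentages (more than 5 FAIRE-Seq sequence tags at a Ptf1a-bound site) amounts to a simple threshold. A minimal sketch follows, assuming per-site tag counts are already available as a list; the function name and the toy counts are hypothetical and do not reproduce the study's data.

```python
def fraction_open(tag_counts, threshold=5):
    """Fraction of sites whose FAIRE-Seq tag count exceeds the threshold
    (strictly greater than, matching the '>5 sequence tags' criterion)."""
    if not tag_counts:
        return 0.0
    return sum(1 for n in tag_counts if n > threshold) / len(tag_counts)

# Toy tag counts at four bound sites; two exceed the threshold.
toy_fraction = fraction_open([0, 6, 10, 5])
```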
To determine whether genomic regions bound by Ptf1a and Rbpj/Rbpjl represent active enhancers that retain their tissue specificity out of the normal genomic context, we tested the ChIP-Seq identified regions in lacZ reporter constructs for activity in transgenic mouse embryos. One such region located between Kirrel2 and Nphs1 is occupied by PTF1 complexes in both the neural tube and pancreas (Fig. 4A). This region contains two consensus PTF1 motifs; both are required for reporter expression in vitro (35, 36). At E12.5, the Kirrel2/Nphs1 enhancer faithfully recapitulates endogenous Ptf1a expression, as well as that of Kirrel2 and Nphs1 (35–39), in both dorsal neural tube and pancreas (Fig. 4B and C; see Table S3 in the supplemental material). Similarly, reporter expression is detected in the correct Ptf1a expression domains in the developing cerebellum and medulla, and in the pancreas, at E15.5 (Fig. 4D to D″; see Table S3 in the supplemental material).
We next examined whether regions with tissue-specific Ptf1a binding in vivo were capable of directing tissue-restricted expression in transgenic animals despite the similarity in the consensus PTF1 motifs found in these regions. In this fashion, we hoped to address whether the intrinsic sequence, and hence the binding, of additional tissue-specific factors contributes to the distinct Ptf1a binding in the two tissues. Regions chosen for testing had strong Ptf1a binding in the neural tube and no significantly enriched binding in the pancreas and vice versa (see Table S3 in the supplemental material). Additionally, genes near each tested region had expression only within the tissue in which there was Ptf1a binding, as determined by RNA-Seq analyses for each tissue (see Table S3 in the supplemental material). Preference was also given to regions with significant evolutionary conservation and the presence of a PTF1-binding motif.
Of the six neural tube-specific regions tested in this study, three had reproducible activity in the dorsal neural tube at E12.5 (Fig. 4E to G; see Fig. S2 and Table S3 in the supplemental material). Histologic analysis of these embryos revealed reporter staining in the correct Ptf1a domain in the ventricular zone, as well as in more mature neurons of this lineage in the mantle zone (Fig. 4F to G′; see Fig. S2 in the supplemental material). No pancreatic expression was detected in any of these embryos (Fig. 4F″ to G″; see Fig. S2 in the supplemental material). Similarly, of the eight pancreas-specific regions tested, five efficiently drove reporter expression within the pancreas at both E12.5 (Fig. 4H to J′; see Fig. S2 in the supplemental material) and E15.5 (Fig. 4I″ to J″; see Fig. S2 and Table S3 in the supplemental material). Only one region, that upstream from Dlk1, also directed expression in Ptf1a domains in the nervous system at E12.5 (see Fig. S2 in the supplemental material). The tissue-specific activity of Ptf1a-bound regions in vivo, therefore, is intrinsic to the DNA sequence and is not due to the imposition of chromatin architecture by the flanking regions of the endogenous loci.
The intrinsic tissue-specific activity of Ptf1a-bound regions strongly suggests that other tissue-restricted factors are needed for pancreas- versus neural tube-specific PTF1 functions. To investigate this possibility, we searched for other enriched motifs within Ptf1a-bound regions. Indeed, besides the E box and TC box, which were found with the highest frequency, HOMER identified additional enriched motifs that are distinct for the two tissues (Fig. 5).
For the neural tube, the consensus Sox family binding site ACAAWG (40, 41) was present in 22% of Ptf1a-bound regions, and the consensus Hox family binding site RTTAATY (42) occurred in 19% (Fig. 5A). These consensus sites are not enriched in the pancreatic Ptf1a-bound regions. To determine if these motifs are indicative of additional or alternative binding partners for Ptf1a, we examined whether they occurred at stereotyped intervals from the E box. Unlike the E box and the TC box, which are centered around the apex of the Ptf1a peak and are enriched in the 1- and 2-helical spacing (Fig. 2A and 5B), the Sox and Hox motifs are randomly spread across the peak region and have no apparent spacing constraints relative to the E box (Fig. 5B and C). These results suggest that Sox and Hox factors are not forming DNA-bound complexes with Ptf1a and Rbpj that have strict stereochemical constraints but rather may be serving a more general role to stabilize the region in a neural tube-specific open chromatin state.
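The spacing analysis described above can be illustrated with a minimal sketch. It is not the HOMER workflow used in this study; it only shows how an IUPAC consensus such as ACAAWG or RTTAATY can be matched in a sequence and how start-to-start distances to an E box can be tabulated. The function names, the toy sequence, and the use of the generic E-box consensus CANNTG are assumptions made for illustration, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "N": "[ACGT]",
}

def motif_to_regex(consensus):
    """Compile an IUPAC consensus into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=(" + body + "))")

def find_sites(consensus, sequence):
    """Return 0-based start positions of the consensus on the forward strand."""
    pattern = motif_to_regex(consensus)
    return [m.start() for m in pattern.finditer(sequence.upper())]

def offsets_to_anchor(anchor, motif, sequence):
    """Signed start-to-start distances from every anchor site to every motif site."""
    anchors = find_sites(anchor, sequence)
    sites = find_sites(motif, sequence)
    return [s - a for a in anchors for s in sites]

# Hypothetical toy sequence (not from the study) holding one E box,
# one Sox consensus site, and one Hox consensus site.
toy = "GGCAGCTGTTACAATGCCATTAATTGG"

EBOX = "CANNTG"   # generic E-box consensus (assumption for this sketch)
SOX = "ACAAWG"    # Sox consensus quoted in the text
HOX = "RTTAATY"   # Hox consensus quoted in the text

sox_offsets = offsets_to_anchor(EBOX, SOX, toy)
hox_offsets = offsets_to_anchor(EBOX, HOX, toy)
```

Pooling such offsets over all bound regions and plotting their histogram would show a sharp peak for a motif with constrained spacing and a flat distribution for a motif without one.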
In contrast, in the Ptf1a-bound regions specific to the pancreas, the next most enriched motifs after the E box and TC box were a consensus forkhead motif (Fox), RYMAAYA (43), found in 16% of the peaks and a GATA family motif, GATAA, in 12% of the peaks (Fig. 5D). Like the Sox and Hox motifs in the neural tube Ptf1a peaks, the GATA motif was randomly spread across the peak region and had no spacing constraint relative to the E box. However, the Fox motif was also enriched at the center of the peak with the E box and TC box (Fig. 5E). Notably, a compound motif with the E box-Fox with a single-base-pair separation 3′ of the E box was identified (Fig. 5F). Neither this compound motif nor the Fox motif alone occurs at any particular interval with respect to the TC box (data not shown), suggesting that Ptf1a, and not Rbpjl or the PTF1 complex, may form a novel stereochemically constrained complex with a pancreas-specific forkhead protein.
The presence of the E-box–Fox motifs with the constrained spacing in 430 (4.3%) of the pancreatic Ptf1a-bound regions led us to hypothesize that a forkhead factor might bind cooperatively with Ptf1a on such sites. Among forkhead family members present in the pancreas at E15.5 (see Table S4 in the supplemental material), we chose Foxa2 as the best candidate for interaction with Ptf1a based on its high level of expression and its role in pancreatic development (44, 45). To determine if Foxa2 localizes to the same genomic regions as Ptf1a in vivo, we performed ChIP-Seq for Foxa2 with chromatin from E17.5 pancreas (see Table S1 in the supplemental material). Surprisingly, Foxa2 localizes to at least 31% of all Ptf1a-localized regions and 38% of PTF1 sites defined by Ptf1a and Rbpjl overlapping peaks (Fig. 6A and B). An example of a site with colocalized Foxa2, Ptf1a, and Rbpjl is shown for the Gata4 locus (Fig. 6C). GO analysis reveals that genes near regions bound by Ptf1a, Rbpjl, and Foxa2 function generally in organogenesis, such as stem cell development, branching morphogenesis, and Notch and RhoA signaling (Fig. 6D).
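A colocalization percentage of this kind reduces to counting how many peaks of one factor overlap at least one peak of another. The following is a minimal sketch of that calculation, not the pipeline used in this study; the interval representation, the function names, and the toy coordinates are all hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def fraction_overlapping(query, reference):
    """Fraction of query intervals that overlap at least one reference interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)

# Hypothetical toy peak sets; coordinates are invented for illustration only.
ptf1a_peaks = [
    ("chr1", 100, 300),
    ("chr1", 1000, 1200),
    ("chr2", 50, 250),
    ("chr3", 10, 90),
]
foxa2_peaks = [
    ("chr1", 250, 400),
    ("chr2", 0, 60),
    ("chr4", 10, 90),
]

shared = fraction_overlapping(ptf1a_peaks, foxa2_peaks)
```

The pairwise comparison is quadratic in the number of peaks, which is acceptable for a toy example; genome-scale peak sets would normally be sorted or indexed first.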
The ChIP-Seq results demonstrate that Ptf1a, Rbpjl, and Foxa2 localize to many of the same genomic regions, but they cannot determine if these factors are present at these sites at the same time. To test whether Foxa2 binds cooperatively with Ptf1a to the compound E-box–Fox motifs, we performed EMSAs with several sequences present in pancreatic Ptf1a- and Foxa2-bound regions. Foxa2 and PTF1 (Ptf1a-E12-Rbpj) bound the motifs separately, but no evidence of a higher-order complex between Foxa2 and the PTF1 trimer or Ptf1a-E12 dimer was detected, indicating that PTF1 and Foxa2 bind DNA independently of one another (Fig. 7A and B and data not shown). Furthermore, coimmunoprecipitation studies with transfected cells failed to demonstrate an association of Foxa2 and Ptf1a (Fig. 7D). Thus, there is no evidence for a stable complex containing Foxa2 and Ptf1a in vitro.
Three compound sites that bind Ptf1a, Rbpjl, and Foxa2 in close proximity were selected from genomic control regions that direct pancreas-specific expression (Clps, Dlk1, and Rbpjl regulatory regions) (see Table S3 in the supplemental material) and examined for their ability to activate transcription in response to Ptf1a and Foxa2 in cell transfection assays (Fig. 7C, E, and F). These experiments provided insight into how Ptf1a and Foxa2 work together to activate pancreas-specific target genes. First, in mouse pancreatic 266-6 cells, which express endogenous PTF1 and Foxa2, all three PTF1-Fox compound sites could activate reporter gene transcription (Fig. 7E). Activity of the Fox-PTF1 site from the Clps control region in particular required all three binding motifs (E box, TC box, and Fox).
We then used HEK293 cells, which do not contain endogenous Ptf1a or Foxa2 but contain ample Rbpj and E proteins (46), to test the individual versus joint contributions of these factors. Each compound site responded to Ptf1a and Foxa2 differently. For the Fox-PTF1 site from the Clps regulatory region, Ptf1a and Foxa2 had strong synergistic activity (Fig. 7F). Importantly, this activity required the presence of each individual motif, just as for 266-6 cells. In contrast, the addition of Foxa2 had no effect on the Ptf1a-induced activity of the Fox-PTF1 site from the Dlk1 regulatory region. Finally, Foxa2 inhibited the Ptf1a activation of the Fox-PTF1 site from the Rbpjl regulatory region (Fig. 7F). Thus, Ptf1a and Foxa2 can work together in different ways to provide distinct regulatory outcomes, depending on the nucleotide sequences of compound sites.
Our results demonstrate that Foxa2 and Ptf1a bind in close apposition in many genomic regions in vivo, including hundreds that contain a compound E-box–Fox motif. It appears, however, that no stable physical interaction between these factors occurs either in solution or on DNA, indicating independent rather than cooperative binding and the possibility of sequential or alternating binding. In addition, the effect of Foxa2 on Ptf1a-mediated transcription is variable and may depend on the relative spacing or precise sequences of their binding motifs in compound sites.
We tested whether Foxa2 or Sox1 might collaborate with Ptf1a to control pancreas and neural programs, respectively, given the enrichment of motifs for these factors in Ptf1a-bound regions. Transgenic embryos that misexpressed Foxa2 in Ptf1a-expressing cells in the dorsal neural tube or Sox1 in Ptf1a-expressing cells in the developing pancreas were generated. In this paradigm, the ectopic expression of Foxa2 or Sox1 occurs simultaneously with the endogenous Ptf1a rather than prior to Ptf1a as would happen during normal development. Changes in gene expression as a consequence of misexpressing Foxa2 or Sox1 were assayed by RNA-Seq. We found that the combination of Foxa2 and Ptf1a in neural tissue or of Sox1 and Ptf1a in the pancreas as used in this paradigm is not sufficient in vivo to reprogram neural tissue to pancreas or pancreas to neural tissue, respectively (data not shown). The effects are most supportive of a model whereby the lineage-specific factors Foxa2 and Sox1, normally expressed prior to Ptf1a, generate lineage-specific chromatin accessibility that allows Ptf1a to bind when it appears at later developmental stages (Fig. 8) (see Discussion).
The bHLH transcription factor Ptf1a is critical to the development of both the pancreas and nervous system. In this study, we addressed a fundamental question in developmental biology of how one transcription factor can control diverse gene expression programs by assessing in vivo Ptf1a binding genome-wide in the two developing tissues. We demonstrated that the genomic occupancy of Ptf1a is largely distinct between the pancreas and neural tube, yet there is little difference in the most common binding motifs in these regions. These motifs contain an E box and a TC box separated by 1-, 2-, or 3-helical-turn spacing, as previously predicted by a select set of targets identified in the pancreas and in pancreatic 266-6 cells (9, 17). Thus, the broad existence of this compound E-box–TC-box motif was shown in vivo in pancreas and in neural tissue. Despite the use of an alternate binding partner in the pancreas (Rbpjl, which recognizes the identical TC-box consensus sequence), the motifs bound by the PTF1 trimer in the pancreas and neural tube are nearly identical. Indeed, Rbpj can compensate for the absence of Rbpjl in E17.5 and adult pancreatic acinar cells (12). Although we detected subtle differences in E-box and TC-box preferences between neural tube and pancreas, it seems unlikely that such subtleties alone could dictate the observed binding profiles of Ptf1a. Indeed, several of the first validated PTF1 sites near exocrine pancreatic genes possess widely degenerate sequences (9).
Although the PTF1 motif was found in a majority of Ptf1a-bound sites, there were still many for which HOMER did not detect a TC box, particularly in the neural tube. While visual inspection often identified potential TC boxes not detected by HOMER, there remain numerous examples where no clear PTF1 motif or TC box is evident. This suggests that Ptf1a may serve another function independent of the PTF1 complex, at least in neural tissue. Since the Ptf1a/E-protein heterodimer does not activate transcription (9), it is possible that Ptf1a competes at enhancers used by other E-box binding proteins near genes involved in opposing programs.
Ptf1a-bound sites restricted to either tissue define tissue-specific enhancers in transgenic mice. This result was not necessarily predicted, because transgenes integrate in nonhomologous loci where the lineage-specific chromatin landscape is presumably absent. The alternatives were that the enhancers would be inactive or their specificity lost and thus be active in both neural tube and developing pancreas where there is endogenous Ptf1a. The fact that neither occurred indicates that information present in the transgene, including the Ptf1a-bound site and surrounding sequence, is sufficient to retain the tissue specificity. Motif analysis with HOMER identified multiple tissue-specific consensus motifs and suggested that additional collaborating factors may provide such specificity. Fox and GATA motifs were enriched specifically in the Ptf1a peaks restricted to the pancreas, whereas Sox and Hox motifs were enriched specifically in the Ptf1a peaks restricted to the neural tube. Factors in these families, such as Foxa2 and Gata4/6, or the SoxB1 family, represent known lineage-defining factors for pancreas and neural tissue, respectively (47–49). The simplest model currently is that factors such as Foxa2 and Gata4 combine to define accessible sites within the chromatin, allowing Ptf1a in the PTF1 complex to bind and activate transcription of its targets in pancreas (50). Conversely, SoxB1 and Hox family factors combine to define sites accessible to PTF1 in neural tissue (49, 51) (Fig. 8). The transgenes chosen originally based on Ptf1a binding also possess consensus sites for these collaborating factors and potentially confer tissue specificity on these enhancers. Artificial coexpression of Foxa2 or Sox1 with Ptf1a was not sufficient to direct Ptf1a to regulate pancreas versus neural targets in transgenic mice, which suggests that additional factors are also required or that the correct temporal relationship of expression must also be respected.
Not all Ptf1a regions tested for enhancer activity were sufficient to direct expression in transgenic mice. Since Ptf1a often binds multiple regions near genes it regulates, it may require more than one of these enhancers for detectable activity. Similar observations have been made with Crx in the adult mouse retina, in which Crx binding regions cluster around genes important for photoreceptor identity (52). When tested individually, many of these regions were not able to direct reporter expression; however, several such bound regions in the Rho locus dramatically increased the activity of a functioning enhancer when tested in combination. The use of alternate enhancers to regulate critical development genes has been well described in Drosophila (53, 54), implying that the combinatorial action of cis-regulatory elements in modulating target levels is a widespread and highly conserved phenomenon. Perhaps in more complicated genomes there exist many partial enhancers necessary for proper gene regulation that are insufficient on their own to induce gene transcription. If this is true, then it could account for the fact that not all Ptf1a-bound regions that we tested functioned individually as enhancers.
It has been appreciated for many years that the chromatin landscape changes as development proceeds and cells become restricted to specific lineages (32, 33). We assessed differences in chromatin accessibility between embryonic pancreas and neural tube tissue using FAIRE-Seq. This analysis illustrated the differences in these two tissues and demonstrated that Ptf1a often binds to chromatin defined as open by this assay, consistent with a model whereby the chromatin landscape defines which PTF1 sites are available for binding. Although from these data alone we cannot distinguish whether Ptf1a binding is a consequence of the availability of its motif or whether Ptf1a can direct the opening of the chromatin, the fact that Ptf1a binding is tissue specific suggests the former. Furthermore, based on previous studies with Foxa2 and Sox factors, it is more likely that these factors prepare specific regions of chromatin in early developmental progenitor cells with broad potency in anticipation of binding of more restricting lineage factors, such as Ptf1a, later (Fig. 8) (50, 51).
Seminal work by Zaret and colleagues has elucidated the pioneering function of Foxa2 in liver development, in which Foxa2 acts as a placeholder to preserve access to critical regulatory regions for specification factors (summarized in reference 55). Our observations are consistent with a similar role for Foxa2 in pancreatic development. Although Foxa2 and Ptf1a bind many of the same regions in vivo and there is a compound motif with a constrained spacing of the E box and Fox motif, we were unable to detect any physical interaction between Foxa2 and Ptf1a either in solution or on DNA. These findings favor a model of mutually independent binding, perhaps with Foxa2 protecting PTF1 binding regions for future use. Given the disparate combined activities of Ptf1a and Foxa2 in our cell transfection experiments, it is tempting to speculate that Foxa2 buffers PTF1 activity to achieve optimal levels of target transcription. For example, Foxa2 augments Ptf1a function on the Clps element, which itself possesses a low-affinity PTF1 binding site, while it decreases Ptf1a-mediated transcription on the high-affinity PTF1 site in the Rbpjl element. Nonetheless, we cannot discount that additional cofactors or posttranslational modifications may be needed to correctly assemble a higher-order complex including Foxa2 and Ptf1a.
In summary, the tissue-specific transcription factor Ptf1a utilizes multiple strategies to activate transcription of lineage-specific genes. In previous work, Ptf1a was shown to form a transcriptional activator complex with its E-protein partner and Rbpj or Rbpjl. These factors cooperatively bind a bipartite DNA motif, resulting in increased sequence specificity and efficient transcriptional activation, a mechanism shared in pancreas and neuronal development. From results in this study, we show that the distinct activity of Ptf1a in pancreas versus neural tube utilizes additional mechanisms that likely combine collaboration with other tissue-specific factors, either directly or indirectly, with lineage-specific chromatin accessibility to regulate target genes. Furthermore, we identify Foxa2 and GATA factors, or Sox and Hox factors, as the most likely candidates for the collaborating factors that influence Ptf1a specificity in the pancreas and neural tube, respectively.
We acknowledge Yanjie Chang for valuable assistance in analyzing transgenic mice and the excellent services of the UTSW Transgenic Mouse Technology, Microarray, and FACS cores. We thank Stavros Malas for the Sox1 cDNA-containing plasmid and Helen Lai and Tou Yia Vue for many helpful discussions throughout this study.
This work was supported by Public Health Service grants DK89570 and HD60222 to R.J.M., NS067553 and HD037932 to J.E.J., NS061440 to D.M.M., and NS070559 to M.D.B. from the National Institutes of Health.
Published ahead of print 10 June 2013
Supplemental material for this article may be found at http://dx.doi.org/10.1128/MCB.00364-13.