Pax3 and Pax7 regulate stem cell function in skeletal myogenesis. However, molecular insight into their distinct roles has remained elusive. Using gene expression data combined with genome wide binding-site analysis we show that both Pax3 and Pax7 bind identical DNA motifs and jointly activate a large panel of genes involved in muscle stem cell function. Surprisingly, in adult myoblasts Pax3 binds a subset (6.4%) of Pax7 targets. Despite a significant overlap in their transcriptional network, Pax7 regulates distinct panels of genes involved in the promotion of proliferation and inhibition of myogenic differentiation. We show that Pax7 has a higher binding affinity to the homeodomain-binding motif relative to Pax3, suggesting that intrinsic differences in DNA binding contribute to the observed functional difference between Pax3 and Pax7 binding in myogenesis. Together, our data demonstrates distinct attributes of Pax7 function and provides mechanistic insight into the non-redundancy of Pax3 and Pax7 in muscle development.
The maintenance and repair of adult muscle tissue is directed by satellite cells. Quiescent satellite cells are activated by exercise or injury and enter the cell cycle to produce progeny myogenic precursor cells that undergo multiple rounds of division before entering terminal differentiation and fusing to multinucleated myofibers (Charge and Rudnicki, 2004). Moreover, satellite cells exist as a heterogeneous population based on Myf5 expression, a feature that divides the satellite cell pool into a subpopulation of self-renewing stem cells (Myf5−) and committed progenitors (Myf5+) (Kuang et al., 2007).
Satellite cells express the paired box transcription factors Pax3 and Pax7, which lie genetically upstream of the myogenic regulatory factors (MRFs) MyoD and Myf5 (Buckingham and Relaix, 2007a). Pax7 is uniformly expressed at high levels in satellite cells and plays a critical role in regulating their function. By contrast, an equivalent level of Pax3 is expressed in satellite cells in a subset of muscles such as the diaphragm, but satellite cells in most muscle groups express very low levels (Kassar-Duchossoy et al., 2005). Extensive analyses of Pax7−/− mice have confirmed the progressive loss of the satellite cell lineage in multiple muscle groups (Kuang et al., 2006; Oustanina et al., 2004; Relaix et al., 2006; Seale et al., 2000). Small numbers of Pax7-deficient cells do survive in the satellite cell position but these cells are progressively lost, likely due to survival deficits or precocious differentiation. Pax7−/− muscles are reduced in size, the fibers contain approximately 50% of the normal number of nuclei, and fiber diameters are significantly reduced (Kuang et al., 2006). Notably, Pax7-dependency in satellite cells has been suggested to be limited to a critical juvenile period when satellite cells are transitioning to a quiescent state (Lepper et al., 2009).
Our current understanding is that Pax3 and Pax7 play some overlapping but mostly non-redundant roles in the specification and progression of the adult satellite cell lineage. During early myogenesis, Pax3 and Pax7 expression defines a population of embryonic progenitors that have been suggested to later give rise to satellite cells. In the absence of both Pax3/7, these progenitors undergo apoptosis, or adopt alternative non-myogenic cell fates (Kassar-Duchossoy et al., 2005; Relaix et al., 2005). However, Pax3 expression does not rescue the loss of Pax7 in adult satellite cells. For example, while Pax3-expressing cells are found in the limb muscles of Pax7−/− mice and express MyoD, these cells are progressively lost, display poor myogenic potential and are located in the interstitial space rather than the satellite cell niche (Kuang et al., 2006; Relaix et al., 2006). Notably, satellite cells in the diaphragm express high levels of Pax3, but nonetheless, in Pax7−/− mice are similarly lost leading to progressive muscle wasting in the same manner as other muscle groups (Seale et al., 2000). Together, this data underscores the non-redundant functions of Pax3 and Pax7 in the myogenic program.
The transcriptional network which controls satellite cell lineage progression is similar to that deployed during embryonic myogenesis. During development, Pax3 has been shown to be involved in the delamination and migration of embryonic myoblasts towards developing limb buds (Bober et al., 1994). Pax3 can also directly activate Myf5 and MyoD in the embryo, thereby initiating myogenic differentiation (Punch et al., 2009). Embryonic progenitors that give rise to satellite cells are characterized by Pax3/Pax7 expression and the lack of expression of other myogenic regulatory factors (MRFs: Myf5, MyoD, Mrf4 and Myogenin). Pax3/7+ cells enter the myogenic program by up-regulation of Myf5/MyoD, or give rise to satellite cells during late fetal myogenesis without up-regulating MRFs. Pax3/7+MRF− progenitors are first found to align with nascent myotubes at E15.5, then become satellite cells by taking a sublaminar position (Relaix et al., 2005). Intriguingly, Pax3/7+MRF− progenitors rapidly up-regulate Myf5 and down-regulate Pax3 expression upon arrival at the nascent muscle groups (Kassar-Duchossoy et al., 2005). Lineage tracing suggests that Pax3+ cells contribute to embryonic myoblasts and the endothelial lineage whereas Pax7+ cells contribute to fetal myoblasts supporting the notion that these represent distinct myogenic lineages (Hutcheson et al., 2009). Thus, satellite stem cells (Pax7+/Myf5−) in adult muscle may be a lineage continuum of the embryonic Pax3/7+MRF− progenitors (Kuang et al., 2007).
Pax3 and Pax7 have overlapping but largely non-redundant roles in the myogenic developmental program. An outstanding question is which target genes mediate the unique function of Pax7 in adult satellite cells. Given the 86% sequence similarity between Pax3 and Pax7 proteins, an even more intriguing question is which attributes of Pax7 specify those unique biological functions. Here, we tackled these key questions using genomic approaches. Our experiments provide a mechanistic understanding of the transcriptional activities of Pax7 and Pax3 in adult satellite cells.
To facilitate comparative analysis between Pax3 and Pax7 we produced stable cell lines with similar expression levels of Pax3-TAP and Pax7-TAP (Figure 1A). Importantly, this approach facilitated our analysis as endogenous Pax7 was completely repressed by proviral expression of either Pax3-TAP or Pax7-TAP, and thus not involved in the dimerization of Pax complexes. We estimated proviral expression levels against wild type myoblasts by Fluorescence-Activated Cell Sorting (FACS) coupled with quantitative Western blot analysis. Proviral Pax7 was expressed at 6×10⁵ molecules per cell compared to 2.75×10⁴ endogenous Pax7 molecules per cell (Figure S1).
The use of ChTAP allowed us to perform ChIP using two sequential immunoprecipitation steps under identical conditions, using the same affinity reagents in experimental and control reactions and thereby reducing the likelihood of false positives (Figure S2, Supplementary Materials and Methods). This approach allowed us to conduct a direct comparison between Pax3 and Pax7.
Ultra-high-throughput sequencing of ChTAP fragments on a GAIIx sequencer (Illumina) (see Supplementary Materials and Methods) yielded 11.4 million uniquely mapped reads for Pax7, 12.4 million reads for Pax3, and 8.5 million reads for the control sample (the ELAND output of the Pax3 and Pax7 ChIP-seq data is available in GEO as the series GSE25064, containing TAP-tagged Pax7, TAP-tagged Pax3 and Control samples; GSM615619, GSM615620 and GSM615621 respectively). To identify genome-wide Pax3 and Pax7 binding sites, reads from Pax3 or Pax7 together with the corresponding control reads were analyzed using Model-based Analysis for ChIP-Seq (MACS v 1.3) (Zhang et al., 2008). Binding sites (peaks) were identified by comparing Pax3 and Pax7 read densities against the TAP background (control).
ChIP-PCR established the veracity of binding site identification in 100% of the sites tested (Figure S2).
We have previously shown that the presence of C-terminal TAP tag does not interfere with Pax7 function (McKinnell et al., 2008). Similarly, we found that the C-terminal TAP tagged Pax3 fully retains its function (Figure S3).
The vast majority of Pax3 and Pax7 binding sites were located within intergenic or intronic regions away from the transcription start sites (TSS) (Figure 1B, Figure S4). Few binding sites were observed within 5’ or 3’ UTR regions. The majority of Pax3 binding sites were found to overlap Pax7 binding sites (Figure 1C). Surprisingly, while 52,683 genomic loci were enriched for Pax7 binding, only 4,648 sites were similarly enriched for Pax3 (Figure 1C). This striking difference in binding was observed despite each transcription factor having a comparable number of reads and equivalent expression levels (Figure 1A, 1D).
Although the majority of Pax3 binding sites overlap with Pax7 binding sites, a substantial set of 1,267 binding sites was uniquely occupied by Pax3 and not by Pax7 (Figure 1C). Analysis revealed that the Pax3/Pax7 common peak set had the highest quality score compared to the other peak sets (Figure 1D). Analysis of read distribution within the common peak set showed a modest correlation between Pax3 and Pax7. However, these peaks were more enriched for Pax7 reads than Pax3 reads (Figure 1E). Taken together, these findings suggest that Pax7 has more extensive genome coverage in adult myoblasts than Pax3. However, Pax3 binds a substantial set of targets not bound by Pax7. Thus, these data argue that Pax3 and Pax7 binding dynamics are markedly different in primary adult myoblasts.
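The peak counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that every Pax3 peak is either Pax3-unique or shared with exactly one Pax7 peak (real peak overlaps need not be one-to-one); the variable and function names are illustrative and do not come from the original analysis.

```python
# Peak counts as reported in the text (Figure 1C).
pax7_total = 52_683   # loci enriched for Pax7 binding
pax3_total = 4_648    # loci enriched for Pax3 binding
pax3_unique = 1_267   # Pax3 sites not occupied by Pax7

# Assumption: each Pax3 peak is either unique or shared one-to-one with a Pax7 peak.
shared = pax3_total - pax3_unique
pax7_unique = pax7_total - shared


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


print("shared peaks:", shared)
print("Pax7-only peaks:", pax7_unique)
print("shared as % of Pax7 peaks:", pct(shared, pax7_total))
print("shared as % of Pax3 peaks:", pct(shared, pax3_total))
```

Under these assumptions the shared set comes to 3,381 peaks, or about 6.4% of all Pax7 peaks, which agrees with the subset size quoted in the abstract.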
To determine the consensus-binding motif of Pax3 and Pax7 we ran MEME (Bailey et al., 2009) on the highest scoring Pax3 and Pax7 unique peaks (Figure 1C–D), restricting our search to the 50 base pair region around the peak summit. Peaks that spanned more than 800 base pairs were excluded from all downstream analysis as these were thought less likely to be the result of a single binding site. The MEME search of Pax7 peaks found the homeobox-binding (hbox) motif (Figure 2A) in 409 out of the top 500 peaks. The search of Pax3 peaks identified a paired (prd) motif (Figure 2B) in 131 out of 499 peaks.
We then used the Position Weight Matrix (PWM) of these motifs to search a 200 base pair region surrounding the summit of all Pax3, Pax7, Pax3 unique and Pax7 unique peaks using FIMO (Grant et al., 2011). Globally, we found that both Pax7 and Pax3 peaks were highly enriched for hbox motifs (Figure 2C–E). On average more than 60% of all Pax7 and 30% of all Pax3 peaks contained hbox motif (Figure 2C–E). Interestingly, both Pax3 peaks and Pax3 unique peaks were more enriched for the prd motif than the Pax7 peaks (Figure 2D, 2F).
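The idea behind scanning a window with a PWM can be illustrated with a short sketch. This is not FIMO and not the motif reported here: the count matrix below is a hypothetical four-position TAAT-like core, the scan covers the forward strand only, and hits are called by a log-odds score cutoff rather than by the p-value thresholds FIMO uses.

```python
import math

# Hypothetical count matrix for a TAAT-like core (NOT the motif from the paper).
COUNTS = {
    "A": [0, 20, 20, 0],
    "C": [0, 0, 0, 0],
    "G": [0, 0, 0, 0],
    "T": [20, 0, 0, 20],
}


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] + pseudocount for b in "ACGT")
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return (start, window, score) for forward-strand windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(b not in "ACGT" for b in window):
            continue  # skip windows containing ambiguous bases
        score = sum(pwm[i][b] for i, b in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


pwm = log_odds_pwm(COUNTS)
hits = scan("GGTAATCCGATTAGG", pwm, threshold=5.0)
print(hits)
```

In practice both strands are scanned; the toy sequence contains ATTA, the reverse complement of TAAT, which this forward-only sketch deliberately does not report.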
To test the statistical significance of the differential enrichment of Pax3 and Pax7 peaks for prd versus hbox motifs we ranked Pax3 and Pax7 peaks into percentiles and determined the mean number of hbox and prd motifs within each group. The Wilcoxon Rank-Sum test showed that the differential enrichment of Pax3 and Pax7 for prd and hbox motifs, respectively, is statistically significant (p-value <0.001) (Figure 2G–H). This finding suggests that while both Pax3 and Pax7 recognize the same DNA motifs, there is a differential preference for prd versus hbox between the two transcription factors.
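For readers unfamiliar with the rank-sum test, a minimal sketch follows. It assumes a two-sided test with the normal approximation and no tie correction of the variance, and the two input lists are hypothetical per-percentile mean motif counts invented for illustration; they are not values from this study.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation."""
    pooled = sorted((value, group) for group, values in enumerate((x, y)) for value in values)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values so they share an average rank.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w = sum(rank for rank, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return w, z, p


# Hypothetical mean hbox counts per percentile group (illustrative only).
group_a = [0.9, 1.1, 1.2, 1.4, 1.5, 1.6, 1.8, 1.9, 2.0, 2.2]
group_b = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8, 0.8, 1.0]

w, z, p = rank_sum_test(group_a, group_b)
print(f"W = {w}, z = {z:.2f}, p = {p:.2g}")
```

Because the test works on ranks rather than raw values, it makes no assumption that motif counts are normally distributed, which is presumably why it suits count data of this kind.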
To test the biological significance of Pax3 and Pax7 bias towards prd and hbox motifs we performed electromobility shift assay (EMSA) using short oligonucleotides derived from a common Pax3/Pax7 binding site (Myf5 ECR111, Supplementary Materials and Methods and described below) that contains both hbox and prd motifs (Figure 3). EMSA confirmed the differential binding bias for hbox versus paired-domain motifs between Pax7 and Pax3 respectively (Figure 3A–B). Pax3 weakly shifted the hbox probe, whereas equivalent amounts of Pax7 protein resulted in a robust shift (Figure 3A). Two species of shifted probe were identified, likely representing monomer and dimer binding consistent with a previous report suggesting that homeobox repeats facilitate dimerization (Birrane et al., 2009).
To further document the differential affinities of Pax3 versus Pax7 for paired- and homeo-domain motifs, we performed luciferase reporter assays on a set of reporter plasmids containing different permutations of the Myf5 −111kb binding site (ECR111). Reporter constructs were engineered that contained either three copies of the paired-domain sequence (prd), three copies of the homeo-domain sequence (hbox), or three copies of the full binding site (prd + hbox) (Supplementary Materials and Methods).
Reporter plasmids were cotransfected with plasmids expressing Pax3-VP16, Pax7-VP16, or empty vector, into 10T1/2 fibroblasts. Both Pax3- and Pax7-VP16 were able to induce the expression of the luciferase reporter gene when prd and hbox were juxtaposed (Figure 3C). However, only Pax7 significantly induced the expression of the reporter gene from the hbox motif alone (Figure 3D), and both Pax3- and Pax7-VP16 were weak inducers from the prd motif alone (Figure 3D). Pax3- and Pax7-VP16 showed comparable effects on induction of reporter gene expression from the juxtaposed prd and hbox motifs (Figure 3C). Notably, the presence of both prd and hbox motifs has a synergistic effect on gene expression.
The unique ability of Pax7 to induce expression from the hbox motif in addition to prd/hbox is consistent with the enrichment of Pax7 binding to hbox motifs in the genome-wide binding data (Figure 2). Therefore, high affinity of Pax7 for hbox motif (Figure 3A) and the higher fraction of Pax7 peaks containing this motif (Figure 2C–D) is consistent with the notion that Pax7-binding to hbox motifs has a biologically important role in adult myogenesis. Conversely, lower affinity of Pax3 for the hbox motif (Figure 3A) combined with the lower fraction of Pax3 peaks containing this motif (Figure 2E–H) suggests that the two Pax proteins have markedly different DNA binding dynamics.
Investigation of Pax3 and Pax7 genome-wide binding data suggests that both can regulate their targets from a wide range of distances to the TSS (Figure 1B, Figure S4). To investigate the functional significance of genes associated with Pax3 and Pax7 peaks we used the Genomic Regions Enrichment of Annotations Tool (GREAT) (McLean et al., 2010) to perform association analysis. We used a basal regulatory domain of −5kb to +1kb around the TSS, extended up to 500kb in both the upstream and downstream directions, to obtain gene ontologies that were significantly associated with Pax3 and Pax7 binding sites.
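The basal-plus-extension association rule described above can be sketched as follows. This is a simplified illustration and not GREAT itself: it handles a single chromosome, assumes the 500kb extension is measured from the TSS, and uses hypothetical gene names and coordinates.

```python
def basal_domain(tss, strand, upstream=5_000, downstream=1_000):
    """Basal regulatory domain around a TSS, oriented by strand."""
    if strand == "+":
        return tss - upstream, tss + downstream
    return tss - downstream, tss + upstream


def regulatory_domains(genes, max_extension=500_000):
    """genes: list of (name, tss, strand) on one chromosome -> {name: (start, end)}."""
    genes = sorted(genes, key=lambda g: g[1])
    basal = [basal_domain(tss, strand) for _, tss, strand in genes]
    domains = {}
    for i, (name, tss, _) in enumerate(genes):
        low, high = basal[i]
        # Extension stops at the neighbouring gene's basal domain, if any.
        left_limit = basal[i - 1][1] if i > 0 else float("-inf")
        right_limit = basal[i + 1][0] if i + 1 < len(genes) else float("inf")
        start = min(low, max(tss - max_extension, left_limit))
        end = max(high, min(tss + max_extension, right_limit))
        domains[name] = (max(start, 0), end)
    return domains


def genes_for_peak(position, domains):
    """Names of all genes whose regulatory domain contains the peak position."""
    return sorted(name for name, (start, end) in domains.items() if start <= position <= end)


# Hypothetical genes (name, TSS coordinate, strand).
toy_genes = [("geneA", 100_000, "+"), ("geneB", 400_000, "-"), ("geneC", 2_000_000, "+")]
domains = regulatory_domains(toy_genes)
print(domains)
print(genes_for_peak(250_000, domains))
print(genes_for_peak(1_200_000, domains))
```

A peak lying between two genes is assigned to both of them under this rule, while a peak farther than the maximum extension from any TSS is assigned to none, which is how distal intergenic sites can still contribute to the ontology analysis.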
We observed significant enrichment of Gene Ontology terms such as regulation of myoblast proliferation, hepatocyte growth factor receptor signaling, regulation of mitosis, regulation of glucagon secretion and apical junction assembly, among others, associated with Pax7 or Pax3 peaks (Table S1–2). Interestingly, peaks that were uniquely present in the Pax3 dataset and absent from the Pax7 dataset showed significant enrichment for ontology terms such as skeletal muscle morphogenesis, neural tube formation and epithelial tube formation, among others (not shown). This finding suggests that Pax3 peaks that are not present in the Pax7 dataset may represent a set of Pax3 targets involved in embryonic myogenesis.
To assess the relationship between genome-wide binding data and gene expression we performed microarray analysis on Pax3- and Pax7-cTAP over-expressing skeletal myoblasts. We used Significance Analysis of Microarray (SAM) with a raw p-value <0.01 and a fold change ≥ 2.0 (positive or negative) to derive the list of genes that were differentially regulated. Consistent with our observation on genome-wide binding data, Pax7 over-expression resulted in differential regulation of a larger set of genes than Pax3 over-expression (Figure S4).
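The thresholding step described above (raw p-value <0.01 and at least a two-fold change in either direction) can be sketched as a simple filter. SAM itself derives its statistics from a modified t-test with permutations, which is not reproduced here; the gene labels, fold changes and p-values below are hypothetical.

```python
import math

# Hypothetical records: (gene label, fold change versus control, raw p-value).
records = [
    ("gene_a", 4.0, 0.001),
    ("gene_b", 0.25, 0.004),
    ("gene_c", 1.5, 0.0005),
    ("gene_d", 3.0, 0.05),
    ("gene_e", 0.5, 0.009),
]


def differentially_regulated(rows, min_fold=2.0, max_p=0.01):
    """Split rows passing the p-value cutoff into up- and down-regulated gene lists."""
    cutoff = math.log2(min_fold)
    up, down = [], []
    for gene, fold, p_value in rows:
        if p_value >= max_p:
            continue  # fails the significance threshold
        log2_fc = math.log2(fold)
        if log2_fc >= cutoff:
            up.append(gene)
        elif log2_fc <= -cutoff:
            down.append(gene)
    return up, down


up, down = differentially_regulated(records)
print("up:", up)
print("down:", down)
```

Working on the log2 scale treats a two-fold increase and a two-fold decrease symmetrically, so a fold change of 0.5 passes the same cutoff as a fold change of 2.0.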
A set of 840 genes was significantly regulated by both Pax7 and Pax3. While 128 genes were significantly regulated by Pax3 only, a larger set of 439 genes was regulated by Pax7 only (Figure S4). Many genes associated with growth and proliferation, such as Fgfr2, Egfr and BMP4, were significantly up-regulated by both Pax3 and Pax7 (Table S3). Differentiation-specific genes such as Myh1, Myh3, Myh8, Mef2c, Myog and Ttn were significantly repressed by both Pax3 and Pax7 (Table S3).
Gene Ontology analysis showed that both Pax3 and Pax7 regulated genes are involved in cell growth, proliferation, signaling, adhesion etc. (Table S4–5). However, Pax7 induced the expression of many genes involved in growth and proliferation of muscle cells and repressed a large set of genes involved in muscle cell differentiation. The latter gene sets are not significantly regulated by Pax3 (Table S6).
Based on a statistical analysis of peak association with significantly regulated gene expression (not shown) we chose a 35kb window (−30kb to +5kb) around the TSS. To determine the significance of association between Pax3 and Pax7 binding and expression of target genes we grouped genes into 20 bins ranked by their expression fold change. A binomial test on the number of genes with associated peaks in each bin versus the proportion of all genes with associated peaks showed that genes that are up- or down-regulated in Pax7 over-expressing cells and genes that are up-regulated in Pax3 over-expressing cells have significantly more associated peaks than would be expected at random or non-regulated genes (Figure 4).
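The per-bin binomial test described above can be sketched as follows. The data are a hypothetical toy set (20 bins of 10 genes, with peaks concentrated in the two extreme bins) chosen only to show the calculation; the one-sided upper tail and the 0.01 cutoff are assumptions, not parameters reported in the text.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical toy data: genes ranked by fold change and split into 20 bins.
# True marks a gene with at least one peak in the window around its TSS.
bins = []
for b in range(20):
    with_peak = 8 if b in (0, 19) else 2
    bins.append([True] * with_peak + [False] * (10 - with_peak))

# Background: proportion of all genes with an associated peak.
all_flags = [flag for genes in bins for flag in genes]
background = sum(all_flags) / len(all_flags)

# One-sided test per bin: are there more peak-associated genes than expected?
p_values = [binom_upper_tail(sum(genes), len(genes), background) for genes in bins]
enriched_bins = [b for b, p in enumerate(p_values) if p < 0.01]

print("background proportion:", background)
print("enriched bins:", enriched_bins)
```

In this toy example only the most strongly down-regulated and most strongly up-regulated bins exceed the background rate, mirroring the pattern reported for Pax7.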
We performed similar analysis using Gene Set Enrichment Analysis (GSEA) (Broad Institute, v 2.07) (Subramanian et al., 2005) and observed comparable results (not shown). Taken together, these analyses suggest that overall transcriptional activity of Pax3 and Pax7 can be predicted from their binding patterns. Moreover, despite a significant amount of functional overlap between the two Pax genes in skeletal muscle cells Pax7 plays a more dominant role in adult skeletal myogenesis.
Pax3 has been documented to target several enhancers that are activated during embryonic myogenesis. We examined Pax7 and Pax3 binding to these regions to elucidate whether the adult myogenic program is simply a continuation of the embryonic myogenic program. Surprisingly, we did not find extensive Pax3 binding to known targets. For example, no Pax3 peak was identified at the paired motif-containing +22kb enhancer downstream from Fgfr4, which is bound by Pax3 during embryonic limb development (Bajard et al., 2006; Lagha et al., 2008). However, this element is strongly bound by Pax7 (Figure S5).
Dmrt2, a known Pax3 target, is involved in somite maturation and is regulated by Pax3 via a conserved element located 18kb upstream of the TSS (Sato et al., 2010). However, our data show that in adult myoblasts this element is strongly bound by Pax7 but not Pax3 (Figure S5). Another Pax3-regulated gene, Spry1 (Lagha et al., 2008), has a Pax7 binding site but no Pax3 binding at a conserved element located 12kb downstream of the TSS (Figure S5).
Binding sites uniquely recognized by Pax7 were also found near key myogenic genes. For example, two prominent Pax7 binding sites were found in conserved intergenic regions upstream of MyoD. These sites were also not bound by Pax3 (Figure S5). In other instances we observed that Pax7 binding sites were frequently adjacent to Pax3 target genes. For example, prominent binding sites were observed in the promoter of Lbx1, and within and downstream of Itm2a (Figure S5).
Consistent with previous findings, common Pax3/7 binding sites were observed within or near known target genes, including Myf5 (Bajard et al., 2006), C-met (Epstein et al., 1996), and Cdh11 (McKinnell et al., 2008) (Figure S5). Other Pax3/7 sites were observed adjacent to genes linked to muscle-related processes such as myogenic inhibition (Mdfic; Ma et al., 2003) and myogenic signaling (Stat1; Sun et al., 2007) (Figure S5 and data not shown). Although Myf5 showed Pax3/7 binding at its −57kb limb enhancer (Bajard et al., 2006), we noted a larger peak located at −111kb from the Myf5 TSS in both Pax3 and Pax7 datasets (Figure 5B). These observations suggest that in adult cells Pax7 binds strongly to many sites occupied by Pax3 in the embryo. Taken together, these data support the notion that adult and embryonic programs represent discrete myogenic programs.
Our experiments reveal that Pax7 plays an extensive role in adult myogenesis by activating multiple programs to promote myoblast growth and inhibit differentiation. Pax7 also activates components of multiple signaling pathways that have been implicated in adult myogenesis.
Myf5 is a direct Pax3/7 target gene that is transcriptionally poised in quiescent satellite stem cells. Myf5 is induced through asymmetric satellite stem cell division and is rapidly up regulated during activation of satellite myogenic cells (Kuang et al., 2007; Le Grand et al., 2009; McKinnell et al., 2008). Pax7 is a potent positive regulator of Myf5 in cultured myogenic cells. BAC transgenes carrying 195kb upstream of the Myf5 transcriptional start site are sufficient to recapitulate the expression pattern of Myf5 and Myf6 in the embryo (Carvajal et al., 2001), and in the adult (Zammit et al., 2004). The −140kb to −88kb interval is necessary for the expression of Myf5 in quiescent satellite cells, while the element directing Myf5 expression in proliferating myogenic cells is located between −59kb and +40.6kb (Zammit et al., 2004) (summarized in Figure 5).
Analysis of Pax3 and Pax7 ChIP-Seq data identified a number of binding sites across the Myf5 regulatory region (Figures 5A–C). Sites bound by both Pax3 and Pax7 are located at conserved loci at −57.5kb, −111kb, and −129kb (relative to the Myf5 TSS). Within the 52kb upstream interval (−140kb to −88kb) harboring the satellite cell enhancer, we identified four Evolutionary Conserved Regions (ECRs) (Figure 5B); among them, ECR111 (Ribas et al., 2011) is conserved in all vertebrates. We confirmed the precise binding of Pax3/7 to the −57kb enhancer (Figure 5C) and ECR111 by DNaseI footprinting (Figure 5D; Supplementary Materials and Methods).
We examined Myf5 expression in vivo using a Myf5-nLacZ BAC carrying 195kb of upstream sequence (Figures 6A–H, Figure S6 and Supplementary Materials and Methods). In transgenic mice, deletion of the ECR111 sequence from the BAC transgene (Ribas et al., 2011) abolished the expression of Myf5-nLacZ in adult quiescent satellite cells, but not in muscle spindles (Figure 6, Figure S6). These results indicate that ECR111 is required for the expression of Myf5 in satellite cells, but is not required for its expression in muscle spindles. Interestingly, activated satellite cell-derived myogenic cells from cultured EDL single fibers expressed Myf5-nLacZ in the absence of ECR111 (data not shown). Therefore, we conclude that ECR111 is a bona fide Pax7-dependent enhancer, which is required for Myf5 expression in adult quiescent satellite cells.
In the hierarchy of the myogenic transcriptional network, Pax3 and Pax7 lie upstream of the basic Helix-Loop-Helix (bHLH) transcription factors Myf5 and MyoD. Together these transcription factors regulate satellite cell commitment to the myogenic lineage and self-renewal through a spatial-temporal network regulated by various signaling pathways. Deciphering the precise underlying molecular mechanisms that regulate this network is fundamentally important in understanding muscle development and diseases. Pax3/7 expressing cells that originate from somites give rise to the satellite cells of postnatal muscle (Buckingham and Relaix, 2007b).
Gene replacement studies have demonstrated temporal and functional overlap of Pax3 and Pax7 in both embryonic and adult tissue. However, both factors have critical non-redundant roles in embryonic and adult muscle development. Pax7 can competently replace Pax3 in neural crest cells, dorsal neural tube, and trunk muscles, but cannot compensate for Pax3 function in the delamination and migration of limb muscle progenitor cells (Relaix et al., 2004). Conversely, Pax7 alone is essential for satellite cells up to a critical postnatal period (Kuang et al., 2007; Lepper et al., 2009; Oustanina et al., 2004; Relaix et al., 2006; Seale et al., 2000).
To mechanistically investigate the functional differences between Pax3 and Pax7 in skeletal myogenesis we combined global gene expression profiling with genome wide binding site analysis of these two closely related transcription factors in satellite cell-derived myoblasts. Our analyses of genome-wide binding site and gene expression data point to a more dominant role for Pax7 in adult skeletal muscle cells. Surprisingly, we found that Pax7 binds to many more sites (Figure 1) than the number of genes it regulates (Figure S4). This discrepancy between binding and expression data can be partially explained by the occurrence of multiple binding sites on the same gene. For example, analysis of Pax7 regulated genes revealed that on average there were three binding sites within 35kb of the TSS of these genes. However, a large fraction of the remaining binding sites were dispersed throughout the genome. The functional relevance of these binding sites is unknown.
Analysis of peak to gene association revealed that both Pax3 and Pax7 regulated genes are significantly enriched for Pax3 and Pax7 peaks compared to non-regulated genes (Figure 4). Association studies between binding sites and nearby genes using Genomic Regions Enrichment of Annotations Tool (GREAT) (McLean et al., 2010) revealed that Pax7 binding sites are highly associated with genes involved in growth, myoblast proliferation and muscle cell differentiation (Table S1). This finding is consistent with the expression data in which Pax7 up-regulates a significant set of genes that are involved in cellular growth, adhesion, and signaling pathways (Table S3). On the other hand, Pax7 also represses numerous genes that are involved in terminal differentiation (Table S3). Global analysis of peak to gene association showed that both up- and down-regulated genes are significantly enriched for Pax7 binding sites (Figure 4). This raises the question of whether Pax7 can act as a repressor. Previous studies have indicated that Pax7 appears exclusively associated with active chromatin (McKinnell et al., 2008). Therefore, it is interesting to speculate that Pax7 binding to differentiation-specific genes functions to maintain these genes transcriptionally poised during progenitor proliferation.
Strikingly, our data show that despite recognizing the same binding motifs, Pax3 and Pax7 have significantly different affinities for paired versus hbox motifs (Figures 2 and 3). Pax7 strongly binds to hbox motifs (Figure 3A) and potently induces transcription of a reporter gene from this motif (Figure 3D), while Pax3 has low affinity for hbox motifs (Figure 3A) and a correspondingly lower transcriptional output on the same motif relative to Pax7 (Figure 3D). On the other hand, when prd and hbox are juxtaposed, both Pax3 and Pax7 have a similar effect on the transcription of a reporter gene (Figure 3C). This finding is consistent with the distribution of prd and hbox motifs in the full set of Pax3 and Pax7 binding sites (Figure 2). The observed differences between Pax3 and Pax7 affinities for prd and hbox motifs are surprising given the degree of protein sequence similarity between the two transcription factors in their paired and homeo domains.
Our data demonstrates that these intrinsic differences in DNA binding between Pax3 and Pax7 drive differential activation from promoters containing specific paired and hbox configurations. Work in Drosophila suggests that Pax proteins recognize target genes through various combinations of DNA binding domains. Synergistic binding of paired- and homeo-domains is required for the expression of even-skipped (Jun and Desplan, 1996); similarly we observed synergistic activation of the ECR111 element in Myf5 when both domains were juxtaposed (Figure 3C). Our data suggests that the distinguishing feature of Pax7 and Pax3 in adult skeletal muscles is that Pax7 can activate target gene expression from combined prd/hbox or hbox motifs alone, while Pax3 is ineffective in inducing transcription from only hbox motifs. We hypothesize that the observed transcriptional dominance of Pax7 over Pax3 is largely due to the greater affinity of Pax7 for hbox elements relative to Pax3.
Additional mechanisms may differentially influence the DNA binding affinity of Pax3 and Pax7. We observed several cases where embryonic Pax3 binding sites are poorly associated with Pax3 in adult cells, but robustly bound by Pax7 as indicated previously. DNA binding by Pax3 may require the presence of specific co-activators or post-translational modifications. It has been suggested that Pax3 activity is dependent on phosphorylation, as Pax3 transcriptional activity is blocked by kinase inhibition (Amstutz et al., 2008; Miller et al., 2008). Pax3 is phosphorylated at Ser205, proximal to the octapeptide domain. Mutation of this site not only abolishes phosphorylation, but also disrupts dimerization of Pax3 proteins. Homeodomain-binding motifs are thought to facilitate dimerization through inverted repeats of “TAAT” and we see a substantial difference in affinity between Pax3 and Pax7 binding to hbox motifs. Pax3 has been shown to cooperate with Sox10 to synergistically activate target gene expression (Mascarenhas et al., 2010).
Epigenetic modifications may also play a role in directing availability of binding sites to Pax3 or Pax7. For example, RARγ/RXRα heterodimers competently bind and induce retinoic acid response elements (RAREs) regulating Hoxa1 and Cyp26a1 in F9 teratocarcinoma stem cells, but not in Balb/c 3T3 fibroblasts (Kashyap and Gudas, 2010). This is attributed to retinoic acid-induced reduction of polycomb protein Suz12 and associated H3K27 trimethylation of Hoxa1 and Cyp26a1 RAREs. To what extent each of these factors contribute to the overall transcriptional network of Pax3 and Pax7 during myogenesis remains largely unknown.
Our experiments indicate that in adult myoblasts, Pax7 regulates many more genes than Pax3. We propose that many Pax7-only binding sites are regulatory elements for genes essential to the normal function of adult myogenic cells while their low affinity to Pax3 explains the inability of Pax3 to compensate for Pax7 in Pax7−/− muscles such as in the diaphragm. Ultimately, the unique characteristics of Pax3 and Pax7 binding provide a framework for understanding the differences in transcriptional network organization between embryonic and adult myogenesis. Both factors may be sufficient to initiate myogenic programs, but different developmental contexts may utilize alternative networks to fulfill other essential roles.
Consistent with the dominant role of Pax7 DNA binding, our gene expression data show a panel of 439 genes that are significantly regulated by Pax7 but not Pax3. These genes are involved in diverse biological functions such as growth, signaling, cell adhesion and muscle cell differentiation (Table S6). For example, the ability of Pax7 to induce Rspo1 and BMP4, among others (Table S3), suggests a mechanism by which Pax7 functions in the expansion of the satellite cell population. Rspo1, a Wnt/β-catenin activator, is known to regulate Myf5 expression in myoblasts (Han et al., 2011). We observed a Pax7 peak located at +12kb of the TSS of Rspo1. BMP4 is characterized by its ability to block the differentiation of myogenic cells (Dahlqvist et al., 2003). We observed a Pax7 peak on a highly conserved element at −28.7kb of BMP4. Additionally, Pax7 over-expression resulted in the repression of many muscle differentiation genes (Table S3, S6).
Myf5 is activated by Pax3 or Pax7 in both embryonic and adult skeletal muscle, and it has been shown that adult Myf5 enhancers operate independently of one another (Carvajal et al., 2008; Zammit et al., 2004). Embryonic regulation of Myf5 can occur through the −57.5kb enhancer, whereas deletions within Myf5-LacZ BACs show that the −140kb to −88kb region is critical for the expression of Myf5 in quiescent satellite cells. Our data confirm that ECR111 is a Pax7-dependent enhancer that is evolutionarily conserved and directs Myf5 expression in quiescent satellite cells (Figure 6, Figure S6). The continued expression of Myf5-nLacZ in muscle spindles of the ECR111 mutant reinforces the idea that separate genetic elements control Myf5 expression in quiescent satellite cells and in muscle spindles.
We have mapped Pax3 and Pax7 binding sites and identified common and discrete target genes. Our experiments have demonstrated that the disparate affinities of Pax3 and Pax7 for paired- versus homeo-motif provide a mechanistic explanation for the distinct role played by Pax7 in adult myogenesis. Importantly, the ability of Pax7 to specify myogenic identity while stimulating growth and inhibiting differentiation points to a central role in the regulation of myogenic progression of the adult satellite cell lineage. This work has facilitated the identification of gene interactions and represents an important step towards comprehensively defining the myogenic regulatory transcriptional network. Understanding the myogenic transcriptional network will have important implications for elucidating the molecular control of the myogenic developmental program, and for the genetic modulation of stem cells for use in the amelioration of muscle disease.
Sub-confluent myoblasts stably expressing C-terminally TAP-tagged Pax3 or Pax7 were cross-linked with 1% formaldehyde in 1× PBS. Cells were harvested by scraping and the cell pellet was dissolved in ChIP lysis buffer (Supplementary Materials and Methods). Chromatin was purified using two sequential immunoprecipitation steps. The first immunoprecipitation was performed for 2 hours at 4°C using anti-3xFLAG antibody conjugated to agarose beads (Sigma-Aldrich), with 20 mg of cell lysate as input. Beads containing the antigen/antibody complex were washed 3 times with 10 mM Tris-HCl, pH 8.0; 100 mM NaCl; 0.1% Triton-X100 containing protease inhibitors. The protein complex was eluted from the M2-conjugated agarose beads by proteolytic cleavage with Tobacco Etch Virus (TEV) protease (Invitrogen) together with competition with 3xFLAG peptide (Sigma-Aldrich) at 4°C overnight. Two additional rounds of elution were performed using 200 µg/ml of 3xFLAG peptide in TBS (50 mM Tris-HCl, pH 7.4; 150 mM NaCl). The eluted product was used as input for the second immunoprecipitation using His-Select nickel beads (Sigma-Aldrich) for 3 hours at 4°C, following the manufacturer's recommendations. Beads were washed three times with wash buffer (20 mM Tris-HCl, pH 7.4; 150 mM NaCl; 5 mM imidazole). The final DNA/protein complex was eluted with 400 mM imidazole at room temperature. Reverse cross-linking and phenol/chloroform extraction of chromatin were done as described previously (Gillespie et al., 2009). For ChIP with Pax7-FLAG, EGFP, and puro cell lines, cells were fixed in 1% formaldehyde solution and immunoprecipitated with M2-agarose (Sigma-Aldrich). Washing and DNA elution were performed according to the ChIP Assay Kit (Upstate). Enrichment of particular loci within ChIP DNA pools was determined by quantitative real-time PCR.
Total RNA was isolated from mouse skeletal muscle cells in triplicate using the RNeasy kit (Qiagen), hybridized to Affymetrix Mouse Gene 1.0 ST arrays, and scanned using a GeneChip Scanner 3000 7G. Raw expression data were assembled using the Affymetrix GeneChip Command Console v1.1. Significance Analysis of Microarrays (SAM) was used to identify differentially expressed genes. See Supplementary Materials and Methods for more details.
Peaks in the mapped Pax7 and Pax3 sequences were identified using MACS v1.3 (Zhang et al., 2008), with an empty TAP vector ChIP-seq as the control. The following parameters were used for peak calling: mfold=16, bw=150, and a p-value cutoff of 10⁻⁵. Peaks greater than 800bp in width were excluded from further analysis, based on the size of excised ChIP fragments used for Solexa sequencing.
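The width filter described above can be illustrated with a minimal Python sketch. Only the 800bp cutoff comes from the text; the record layout (chromosome, start, end), the example coordinates, and the function name are hypothetical, and peak width is assumed to be end minus start.

```python
# Hypothetical peak records: (chromosome, start, end). Coordinates are illustrative only.
MAX_PEAK_WIDTH_BP = 800  # cutoff stated in the text


def filter_peaks_by_width(peaks, max_width=MAX_PEAK_WIDTH_BP):
    """Keep peaks whose width (end - start, an assumption) does not exceed max_width."""
    return [peak for peak in peaks if (peak[2] - peak[1]) <= max_width]


example_peaks = [
    ("chr1", 1000, 1400),  # 400 bp wide: kept
    ("chr1", 5000, 5800),  # 800 bp wide: kept, since only peaks greater than 800 bp are excluded
    ("chr2", 2000, 3200),  # 1200 bp wide: excluded
]
kept_peaks = filter_peaks_by_width(example_peaks)
```

In this sketch a peak of exactly 800bp is retained, matching the wording "greater than 800bp".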
MEME (Bailey et al., 2009) was used to identify over-represented motifs in subsets of the Pax3 and Pax7 peak sets, restricting the search to 50 base pairs around the peak summit. Peaks were filtered by width, MACS score (−10 * log10 (p-value)) and repeat content to select regions most likely to reflect specific high-affinity binding events, and least likely to identify spurious motifs within repetitive sequences. Peaks greater than 800 base pairs in length were excluded from the analysis. See Supplementary Materials and Methods for more details.
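As a rough illustration of the two quantities mentioned above, the following Python sketch computes the MACS score from a p-value and derives a search window around a peak summit. The function names are hypothetical. The text does not state whether 50 base pairs refers to the total window or to each side of the summit; the sketch assumes each side and exposes this as a parameter.

```python
import math


def macs_score(p_value):
    """MACS score as defined in the text: -10 * log10(p-value)."""
    return -10 * math.log10(p_value)


def summit_window(summit, flank=50):
    """Return (start, end) of the motif search window around a peak summit.

    ASSUMPTION: `flank` is applied on each side of the summit; the text does
    not specify whether 50 bp is the total width or the distance per side.
    The start is clamped at 0 so the window never precedes the chromosome start.
    """
    return max(0, summit - flank), summit + flank


# A peak called exactly at the 10^-5 p-value cutoff has a score of about 50.
score_at_cutoff = macs_score(1e-5)
window = summit_window(1_200)
```

Under this definition, smaller p-values give larger scores, so filtering by a minimum score keeps the most significant peaks.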
Nine microarrays were used in the analysis. See Supplementary Materials and Methods for GEO accession numbers. Transcript cluster identifiers (TCIDs) of the Affymetrix MoGene 1.0 ST chipsets were RMA-normalized (Irizarry et al., 2003) using the xps R package in Bioconductor (Gentleman et al., 2004) with the Affymetrix-provided MoGene-1_0-st-v1.r4 chip layout and scheme files. The RMA-normalized results were log2 transformed and analysed using the Significance Analysis of Microarrays (SAM) method (Tusher et al., 2001) as implemented in the Bioconductor siggenes package. The resulting SAM fold changes and raw p-values were integrated into a single table by common TCID and annotated with gene associations based on probe-to-gene mappings downloaded from Ensembl v64 BioMart. TCIDs mapping to zero or more than one gene were excluded from the analysis. Gene properties were downloaded from Ensembl v64 BioMart, and for each gene we provided the count of MACS-identified peak positions falling within the chosen gene association window (−30kb to +5kb around the TSS).
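The peak-to-gene association step can be sketched in Python as follows. Only the −30kb to +5kb window comes from the text. The gene names, coordinates, data layout, and the strand-aware orientation of the window (upstream taken relative to the direction of transcription) are assumptions made for illustration.

```python
UPSTREAM_BP = 30_000    # -30kb relative to the TSS (from the text)
DOWNSTREAM_BP = 5_000   # +5kb relative to the TSS (from the text)


def association_window(tss, strand):
    """Return (low, high) genomic bounds of the gene association window.

    ASSUMPTION: the window is oriented by strand, so for minus-strand genes
    the 30kb upstream region lies at higher genomic coordinates.
    """
    if strand == "+":
        return tss - UPSTREAM_BP, tss + DOWNSTREAM_BP
    return tss - DOWNSTREAM_BP, tss + UPSTREAM_BP


def count_peaks_per_gene(genes, peak_positions):
    """Count peak positions falling inside each gene's association window."""
    counts = {}
    for name, (chrom, tss, strand) in genes.items():
        low, high = association_window(tss, strand)
        counts[name] = sum(
            1 for peak_chrom, pos in peak_positions
            if peak_chrom == chrom and low <= pos <= high
        )
    return counts


# Hypothetical genes and peak positions, for illustration only.
genes = {"geneA": ("chr1", 100_000, "+"), "geneB": ("chr1", 200_000, "-")}
peak_positions = [
    ("chr1", 72_000), ("chr1", 104_000),   # inside the geneA window
    ("chr1", 110_000),                      # outside both windows
    ("chr1", 228_000),                      # inside the geneB window
    ("chr2", 100_000),                      # different chromosome
]
peak_counts = count_peaks_per_gene(genes, peak_positions)
```

A peak can be counted for more than one gene if the windows of neighbouring genes overlap; whether the original analysis allowed this is not stated in the text.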
See Supplementary Materials and Methods for a more extensive description of procedures and computational methods.
Baculovirus-purified Pax3-FLAG, Pax3-CTAP or Pax7-FLAG was incubated with 1.25ng of [γ-32P]ATP-labeled DNA probe and nonspecific carrier (poly dI-dC) in a binding buffer containing 75mM NaCl, 1mM EDTA, 1mM DTT, 10mM Tris pH 7.5, 6% glycerol, and 0.25% BSA at room temperature for 20 minutes (Supplementary Materials and Methods). 25× cold probe was used for competitive binding assays. Supershifts were carried out with the addition of 1µg M2- or M5-FLAG antibody (Sigma-Aldrich). Reactions were analyzed on a 5% non-denaturing polyacrylamide gel (0.5× TBE), dried onto 3MM Whatman paper and exposed to BioMax MS x-ray film (Kodak).
ECR111 reporter vectors were constructed as follows. For the PD and HD reporters, three copies of 30bp sequences covering the prd and hbox motifs were directionally concatamerized and cloned upstream of a minimal Myf5 promoter driving the luc gene of pGL4.10 (Promega). A 70bp sequence covering the entire −111kb region, including both the prd and hbox regions, was also concatamerized and cloned as above to generate the PDHD reporter. Reporter vectors were cotransfected with a Renilla vector standard and either empty vector (VP16), VP16-Pax7 or VP16-Pax3 for 24 hours into C3H10T1/2 fibroblasts (also see Supplementary Materials and Methods). Luciferase assays were carried out using the Dual-Luciferase Reporter Assay System (Promega) and analysed on a Lumistar Optima (BMG Labtech) fluorescent plate reader.
Baculovirus-purified Pax7 or Pax3 protein, or an equivalent amount of BSA (NEB), was equilibrated for 10 minutes at room temperature in binding buffer (150mM KCl, 5mM MgCl2, 0.1mM EDTA, 8% glycerol, 30mM Tris-Cl pH 8.0, 1mM DTT) along with 1µg of poly dI-dC (Sigma). DNA probes were made from ECR57 and ECR111 of the Myf5 locus (Supplementary Materials and Methods). 150µg of each probe was then added for 20 minutes at room temperature. Probes were then digested with 7.5×10⁻³ U of DNase I (Worthington Biochemicals) for 15 minutes at room temperature. DNA was purified using MinElute enzymatic reaction cleanup (Qiagen). Control digestions were performed on an IgH probe. Digested DNA was added to HiDi formamide (Applied Biosystems) and 0.1µl GeneScan 500 LIZ size standards (Applied Biosystems). Mixtures were then heat denatured for 5 min at 95°C, immediately cooled on ice, and analyzed with a 3730 DNA Analyzer (ABI) running a G5 dye set (see Supplementary Materials and Methods for more details).
Gene expression values from Pax3- and Pax7-overexpressing myoblasts were each ranked by SAM fold change and binned into 20 groups. For each bin, we performed a one-tailed binomial test of the proportion of peak-associated genes in that bin versus the proportion among all genes, testing the hypothesis that the binned genes are more likely to be associated with peaks. The graphs in Figure 4 show the base 10 logarithms of the binomial test p-values.
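The binned enrichment test can be sketched in Python under stated assumptions. The input layout (a list of fold-change and peak-flag pairs), the function names, the descending ranking order, and the synthetic example data are all hypothetical. The background proportion is taken as the fraction of peak-associated genes among all genes, and the one-tailed p-value is computed as the upper binomial tail P(X ≥ k).

```python
import math


def binomial_upper_tail(k, n, p):
    """One-tailed binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binned_enrichment(genes, n_bins=20):
    """Return log10 p-values, one per bin of fold-change-ranked genes.

    `genes` is a list of (fold_change, has_peak) pairs (a hypothetical layout).
    Genes are ranked from highest to lowest fold change (an assumption).
    """
    ranked = sorted(genes, key=lambda gene: gene[0], reverse=True)
    background = sum(1 for _, has_peak in ranked if has_peak) / len(ranked)
    bin_size = math.ceil(len(ranked) / n_bins)
    log_p_values = []
    for start in range(0, len(ranked), bin_size):
        chunk = ranked[start:start + bin_size]
        hits = sum(1 for _, has_peak in chunk if has_peak)
        p_value = binomial_upper_tail(hits, len(chunk), background)
        log_p_values.append(math.log10(p_value))
    return log_p_values


# Synthetic example: 40 genes with peaks concentrated among the top-ranked genes.
example_genes = [
    (40 - rank, rank < 4 or rank % 10 == 0) for rank in range(40)
]
log_p = binned_enrichment(example_genes, n_bins=20)
```

In this toy example the two highest-ranked bins receive the most negative log10 p-values because both of their genes carry peaks. The direct summation is adequate for small bins; for bins containing thousands of genes, a numerically robust survival function from a statistics library would be a safer choice.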
We thank Dr Ricardo Ribas for providing the B195APZ BAC constructs. We thank Dr Hang Yin at the Ottawa Hospital Research Institute (OHRI) for his valuable comments and suggestions on the manuscript. M.A.R. holds the Canada Research Chair in Molecular Genetics and is an International Research Scholar of the Howard Hughes Medical Institute. This work was supported by grants from the National Institutes of Health, the Howard Hughes Medical Institute, the Canadian Institutes of Health Research, and the Canada Research Chair Program to MAR, by EuTRACC, a European Commission 6th Framework grant to FG, and by the Institute of Cancer Research, a Medical Research Council Programme Grant, and MYORES, a European Commission 6th Framework grant to JJC and PWJR.
The authors declare no conflict of interest.