Facioscapulohumeral dystrophy (FSHD) is one of the most common inherited muscular dystrophies. The causative gene remains controversial and the mechanism of pathophysiology unknown. Here we identify genes associated with germline and early stem cell development as targets of the DUX4 transcription factor, a leading candidate gene for FSHD. The genes regulated by DUX4 are reliably detected in FSHD muscle but not in controls, providing direct support for the model that misexpression of DUX4 is a causal factor for FSHD. Additionally, we show that DUX4 binds and activates LTR elements from a class of MaLR endogenous primate retrotransposons and suppresses the innate immune response to viral infection, at least in part through the activation of DEFB103, a human defensin that can inhibit muscle differentiation. These findings suggest specific mechanisms of FSHD pathology and identify candidate biomarkers for disease diagnosis and progression.
Facioscapulohumeral dystrophy (FSHD) is the third most common muscular dystrophy. The mutation that causes FSHD was identified nearly 20 years ago (Wijmenga et al., 1992), yet the molecular mechanism(s) of the disease remain elusive. The most prevalent form of FSHD (FSHD1) is caused by the deletion of a subset of D4Z4 macrosatellite repeats in the subtelomeric region of chromosome 4q. Unaffected individuals have 11-100 of the 3.3 kb D4Z4 repeat units, whereas FSHD1 individuals have 10 or fewer repeats. At least one repeat unit appears necessary for FSHD because no case has been identified with a complete deletion of D4Z4 repeats (Tupler et al., 1996). Each repeat unit contains a copy of the double homeobox retrogene DUX4 (Clapp et al., 2007; Gabriels et al., 1999; Lyle et al., 1995), and inappropriate expression of DUX4 was initially proposed as a possible cause of FSHD. This was supported by the observations that repeat contraction is associated with decreased repressive epigenetic marks in the remaining D4Z4 units (van Overveld et al., 2003; Zeng et al., 2009) and that overexpression of the DUX4 protein in a variety of cells, including skeletal muscle, causes apoptotic cell death (Kowaljow et al., 2007; Wallace et al., 2011; Wuebbles et al., 2010). However, initial attempts to identify DUX4 mRNA transcripts in FSHD muscle were unsuccessful, leading to the suggestion that other genes in the region were causative for FSHD (Gabellini et al., 2002; Klooster et al., 2009; Laoudj-Chenivesse et al., 2005; Reed et al., 2007).
Recent progress has returned the focus to the DUX4 retrogene as a leading candidate for FSHD. First, a subset of individuals with clinical features of FSHD do not have contracted D4Z4 repeats on chromosome 4 but do have decreased repressive heterochromatin at the D4Z4 repeats (de Greef et al., 2009) (FSHD2), indicating that loss of repressive chromatin at D4Z4 is the primary cause of FSHD. Second, genetic studies identified polymorphisms that create a DUX4 polyadenylation site as necessary for a D4Z4 contraction to cause FSHD (Lemmers et al., 2010). Third, high sensitivity RT-PCR assays detect DUX4 mRNA specifically in FSHD muscle (Dixit et al., 2007; Snider et al., 2010). Still, a major problem with the hypothesis that DUX4 expression causes FSHD has been the extremely low abundance of the mRNA and inability to reliably detect the protein in FSHD biopsy samples. Our prior work demonstrated that the low abundance of DUX4 in FSHD muscle cells represents a relatively high expression in a small subset of nuclei (Snider et al., 2010). However, it remained unclear whether the expression of DUX4 in FSHD muscle has a biological consequence that might drive the pathophysiology of FSHD.
The coding sequence of the DUX4 retrogene has been conserved in primates (Clapp et al., 2007), but whether this retrogene has a normal physiological function is unknown. Previously we found that DUX4 is normally expressed at high levels in germ cells of human testes and is epigenetically repressed in somatic tissues (Snider et al., 2010), whereas the epigenetic repression of the DUX4 locus in somatic tissues is less efficient in both FSHD1 and FSHD2, resulting in DUX4 expression in FSHD muscle cell nuclei. The germline-specific expression pattern of DUX4 is similar to that of other double homeodomain proteins (Booth and Holland, 2007; Wu et al., 2010). The function of this distinct family of DNA-binding proteins is unknown, but their shared tissue expression pattern may indicate a possible role for double homeodomain transcription factors in reproductive biology.
Here we report that DUX4 regulates the expression of genes involved in germline and early stem cell development. These DUX4 target genes are aberrantly expressed in FSHD skeletal muscle but not in control muscle biopsies. Therefore, the low level of DUX4 expression in FSHD is sufficient to effect numerous downstream changes and activate genes of germ cell and early development in postmitotic skeletal muscle. Additionally, we show that DUX4 binds and activates LTR elements from a class of MaLR endogenous primate retrotransposons and at the same time suppresses the innate immune response to retroviral infection, at least in part through transcriptional activation of DEFB103, a human defensin that can inhibit muscle differentiation. These findings suggest specific mechanisms of FSHD pathology and identify candidate biomarkers for disease diagnosis and progression.
Previously, we identified two different DUX4 mRNA transcripts in human skeletal muscle, both at extremely low abundance: a full-length open reading frame mRNA (DUX4-fl) only detected in FSHD muscle and an internally spliced form of DUX4 mRNA (DUX4-s) that maintains the N-terminal double-homeobox domains but deletes the C-terminal domain and is detected in both control and FSHD muscle (Snider et al., 2010). Forced over-expression of DUX4-fl is toxic to cells, inducing apoptotic cell death (Kowaljow et al., 2007; Wallace et al., 2011), whereas forced over-expression of DUX4-s is not toxic to cultured human skeletal muscle cells (Geng et al., 2011). To determine whether gene expression is regulated by DUX4-fl and/or DUX4-s in human muscle cells, we transduced primary myoblasts from a control individual (unaffected by muscle disease) with a lentiviral vector expressing either DUX4-fl or DUX4-s and performed expression microarrays. At 24 hours after transduction, DUX4-fl increased the expression of 1071 genes and decreased the expression of 837 genes compared to a control myoblast population similarly infected with a GFP expressing lentivirus (2-fold change and FDR<0.01); whereas DUX4-s increased the expression of 159 genes and decreased expression of 45 genes (Figure 1A and see Table S1 for the complete list of genes regulated by DUX4-fl and/or DUX4-s). Using a slightly more stringent 3-fold criterion (> 1.584 log2-fold change and FDR<0.01), 466 genes were increased and 244 decreased by DUX4-fl; and 37 were increased and one decreased by DUX4-s. Only two annotated genes were increased 3-fold or more by both (CCNA1, MAP2), and none were decreased 3-fold or more by both. A representative sample of genes activated by DUX4-fl is shown in Table 1 and the full set of genes regulated by DUX4-fl or DUX4-s is in Table S1.
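The fold-change and FDR cut-offs described above can be illustrated with a short Python sketch. It is not the analysis code used in the study (which relied on the LIMMA package); the function name `classify_gene` and its arguments are hypothetical, and the use of an inclusive boundary is an assumption.

```python
import math

def classify_gene(log2_fc, fdr, fold=2.0, fdr_cutoff=0.01):
    """Label one gene as 'up', 'down', or 'unchanged'.

    log2_fc: log2 fold change versus the GFP control (hypothetical input).
    fdr: false discovery rate for that gene.
    fold: linear fold-change cut-off (2 or 3 in the text).
    """
    threshold = math.log2(fold)  # a 3-fold change is about 1.585 on the log2 scale
    if fdr >= fdr_cutoff:
        return "unchanged"
    if log2_fc >= threshold:
        return "up"
    if log2_fc <= -threshold:
        return "down"
    return "unchanged"
```

Under these assumptions, a gene with a log2 fold change of 1.5 and a small FDR passes the 2-fold cut-off but not the 3-fold one, which is why the stricter list is a subset of the first.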
The Gene Ontology (GO) terms significantly enriched in 3-fold up-regulated genes by DUX4-fl included categories such as RNA polymerase II mediator complexes, RNA splicing and processing, and gamete/spermatogenesis (Table S2A); whereas down-regulated genes were enriched in immune response pathways (Table S2B). The up-regulation of a large number of transcription-related and RNA processing factors suggests that DUX4-fl might be a central component of a complex gene regulatory network, and the large number of germline associated genes suggests a possible role in reproductive biology.
In primary human myoblasts, DUX4-fl robustly induced a large number of genes not normally detected in skeletal muscle (see the contrail of genes along the Y-axis in the left panel of Figure 1A). These genes are good candidate biomarkers of DUX4 activity in skeletal muscle, since there is little to no background expression in control muscle. GO analysis for these highly induced genes showed enrichment for gamete generation and spermatogenesis categories (Table S2A). In many cases, DUX4-fl activated multiple members of gene families involved in germ cell biology and early development, including some primate-specific genes (Table 2). We validated the differential expression of 15 of the DUX4-fl regulated genes by RT-PCR (Figure S1).
Double homeodomain proteins comprise a distinct group of DNA-binding proteins (Holland et al., 2007), but their consensus recognition sites and genomic targets are unknown. Therefore, we performed chromatin immunoprecipitation combined with high throughput sequencing (ChIP-Seq) to identify DUX4-binding sites in human muscle cells. We used two polyclonal rabbit antisera against DUX4 (Figure S2A and S2B) to immunoprecipitate DUX4-fl from human primary myoblasts 24 hours after transduction with lentiviral expressed DUX4-fl or control non-transduced primary myoblasts. Non-redundant reads unambiguously mapped to the human genome were computationally extended to a total length of 200 nucleotides and “peaks” were defined as regions where the number of reads was higher than a statistical threshold compared to the background (see Methods). Reads mapping to the X and Y chromosomes were excluded from our analysis.
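As a rough illustration of the read-extension step, the Python sketch below removes duplicate reads, extends each remaining read to 200 nucleotides in its sequencing orientation, and counts per-base coverage. The tuple layout of the reads, the 0-based half-open coordinates, and the function names are assumptions made for this example; it does not reproduce the pipeline used in the study.

```python
from collections import Counter

def extend_read(start, end, strand, total_length=200):
    """Extend a read to total_length in its sequencing orientation.

    Coordinates are assumed 0-based, half-open: [start, end).
    """
    if strand == "+":
        return start, start + total_length
    # Minus-strand reads are extended toward lower coordinates.
    return max(0, end - total_length), end

def coverage(reads, total_length=200):
    """Per-base coverage from (chrom, start, end, strand) tuples."""
    unique_reads = set(reads)  # keep one copy of each duplicated read
    depth = Counter()
    for chrom, start, end, strand in unique_reads:
        ext_start, ext_end = extend_read(start, end, strand, total_length)
        for pos in range(ext_start, ext_end):
            depth[(chrom, pos)] += 1
    return depth
```

Extending reads this way lets plus-strand and minus-strand reads that flank the same binding site pile up over the site itself, so coverage peaks near the bound fragment's center.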
A total of 62,028 and 39,737 peaks were identified at P-value thresholds of 10^-10 and 10^-15, respectively, after subtracting background peaks in the control samples. DUX4-fl peaks were widely distributed both upstream and downstream of gene transcription start sites (TSSs) with higher numbers in introns and intergenic regions, but showing a relatively constant peak density in all genomic regions when normalized for the size of the genomic compartment (Figure 1B, left). This pattern differs from that reported for many other transcription factors, such as MYOD (Cao et al., 2010), shown for comparison (Figure 1B, right), that show higher average peak density in regions near TSSs.
A de novo motif analysis identified the sequence TAAYBBAATCA (IUPAC nomenclature) (Figure 1C, top) near the center of greater than 90% of peaks. To our knowledge, this motif has not been described for any other transcription factor, but does contain two canonical homeodomain binding motifs (TAAT) arranged in tandem and separated by one nucleotide. Approximately 30% of sequences under the DUX4-fl peaks also contained a second larger motif that encompasses the primary DUX4-fl binding motif. This longer motif matches the long terminal repeat (LTR) of retrotransposons (Figure 1C, bottom). Assessment of the representation of DUX4-fl binding at different annotated repetitive elements in the genome shows a nearly 10-fold enrichment of DUX4-fl binding in the Mammalian apparent LTR-Retrotransposon (MaLR) family of retrotransposons and some enrichment in the related ERV family (Table S3). Note that the quantitative estimate of repeat-associated binding sites is conservative since reads mapping to more than one locus are excluded from our analysis.
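To make the IUPAC notation concrete, the following Python sketch expands the consensus TAAYBBAATCA into a regular expression and scans both strands of a sequence for matches. The helper names and the example sequences in the usage note are hypothetical; this is a simple consensus scan and not the motif discovery method used in the study.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(motif):
    """Translate an IUPAC consensus string into a regex pattern."""
    return "".join(IUPAC[base] for base in motif.upper())

def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif(seq, motif="TAAYBBAATCA"):
    """Return sorted (start, strand) hits; start is 0-based on the given strand."""
    # The lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    seq = seq.upper()
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    hits += [(len(seq) - m.start() - len(motif), "-") for m in pattern.finditer(rc)]
    return sorted(hits)
```

For example, `find_motif("GGTAATCTAATCAGG")` reports one plus-strand hit starting at position 2, while replacing the central AA with GG removes the match.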
MaLR family members expanded in the primate lineages (Smit, 1993). Thus, if DUX4-fl binding sites were carried throughout the genome during this expansion, these newer sites might have a different sequence motif compared to DUX4-fl binding sites located outside of MaLR repeats. To determine if the expansion of MaLR-associated binding sites might obscure the identification of a different DUX4 binding motif in non-repetitive elements, we performed separate motif analysis of MaLR-associated sites and sites not associated with repeats; both yielded nearly identical core motifs, TAAYYBAATCA and TAAYBYAATCA, respectively, but the repeat-associated motifs had slightly more flanking nucleotide preferences reflecting the LTR sequence (Figure S2C). Electrophoretic mobility shift assay (EMSA) confirmed that DUX4-fl binds the core motif present in both MaLR-associated and non-repeat associated sites and that mutation of the core nucleotides abolishes binding (Figure S2D-F), including sites from both repeat and non-repeat regions. Because the DUX4-s alternative splice form retains the N-terminal DNA-binding homeodomains, we hypothesized that it would bind to the same sites as DUX4-fl. EMSA confirmed that DUX4-s specifically binds the same core binding site as DUX4-fl in vitro (Figure S2G). Thus, DUX4-s can bind the same sequences as DUX4-fl but does not activate transcription of the same genes, which supports the prior determination that the C-terminus contains a transactivation domain (Kawamura-Saito et al., 2006).
The number of DUX4-fl binding locations exceeds the number of genes that robustly increase expression in muscle cells following transduction with DUX4-fl. A genome-wide analysis of peak height and regional gene expression shows only a weak association of binding and gene expression for DUX4-fl (Figure 2A). To determine whether DUX4-fl binding might function as a transcriptional activator at some of the identified binding sites, DUX4 binding sites from selected genes were cloned upstream of the SV40 promoter in the pGL3-promoter luciferase construct. Co-transfection with DUX4-fl in human rhabdomyosarcoma cell line RD significantly induced luciferase expression independent of orientation or position, and mutation of the DUX4 binding motif eliminated the induction (Figure 2B and 2C). In contrast to DUX4-fl, DUX4-s did not activate expression despite demonstrating in vitro binding to this site (Figure 2B and Figure S2G).
To determine whether DUX4 binding might directly regulate transcription of select genes, we cloned the 1.9 kb enhancer and promoter region of the ZSCAN4 gene that includes four DUX4 binding sites into pGL3-basic reporter vector. Co-transfection with DUX4-fl significantly induced expression of this reporter and mutation of three of the four DUX4 binding sites nearly abolished the induction (Figure 2D, left). DUX4-s interfered with the activity of DUX4-fl when the two were co-expressed (Figure 2D, right), suggesting that DUX4-s acts as a dominant negative for DUX4-fl activity.
DUX4-fl also activated transcription through DUX4 sites in repetitive elements: DUX4-fl activated transcription of a luciferase reporter containing DUX4 binding sites cloned from LTRs at a MaLR THE1D element (Figure 2E, left) and RT-PCR showed induction of endogenous MaLR transcripts in muscle cells transduced with DUX4-fl (Figure 2E, right).
To identify the set of genes that might reflect the function of DUX4-fl prior to the expansion of MaLRs in primates, we identified the subset of genes activated at least 3-fold by DUX4-fl that also contain a non-repeat associated binding site within six kilobases of the TSS and not separated from the TSS by a binding site for the insulator factor CTCF (Table S4). The 74 genes meeting these criteria are highly enriched for genes involved in stem and germ cell functions, RNA processing, and regulated components of the PolII complex, similar to the major GO categories identified for all of the genes regulated by DUX4-fl. Quantitative RT-PCR of six DUX4-regulated genes on paired samples of testis mRNA and skeletal muscle mRNA from two control individuals found high expression of these targets in the testes and absent, or nearly absent, expression in skeletal muscle (Figure 3). We also detected the expression of the related DUXA and DUX1 genes in healthy human testis (data not shown), further supporting the notion that this family of double homeodomain proteins has a role in germ cell biology.
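The selection criteria for this gene subset can be sketched in Python as a simple filter. The data layout (one dictionary per gene, with peak positions flagged as repeat-associated or not, and CTCF site positions on the same chromosome) and all names are hypothetical; the sketch only illustrates the logic of the three criteria and is not the code used to build Table S4.

```python
def has_unblocked_site(tss, peaks, ctcf_sites, max_dist=6000):
    """True if a non-repeat peak lies within max_dist of the TSS
    with no CTCF site strictly between the peak and the TSS.

    peaks: list of (position, in_repeat) tuples.
    ctcf_sites: list of CTCF site positions.
    """
    for pos, in_repeat in peaks:
        if in_repeat or abs(pos - tss) > max_dist:
            continue
        lo, hi = sorted((pos, tss))
        if not any(lo < ctcf < hi for ctcf in ctcf_sites):
            return True
    return False

def select_targets(genes, min_fold=3.0, max_dist=6000):
    """Names of genes meeting the fold-change and binding-site criteria."""
    return [
        gene["name"]
        for gene in genes
        if gene["fold_change"] >= min_fold
        and has_unblocked_site(gene["tss"], gene["peaks"], gene["ctcf"], max_dist)
    ]
```

A gene is kept only if all three conditions hold at once, so a strongly induced gene whose only nearby peak lies in a repeat, or behind a CTCF site, is excluded.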
To determine whether the low levels of endogenous DUX4-fl mRNA detected in FSHD skeletal muscle are sufficient to activate DUX4 target genes, we assessed the expression of these genes in a set of control and FSHD muscle. Cultured muscle cells from control biopsies showed low or absent expression of the six DUX4-fl regulated genes, whereas these genes were expressed at significantly higher levels in the FSHD muscle cultures (Figure 4A), including those from both FSHD1 and FSHD2 individuals.
Similar to the expression of DUX4-fl regulated targets in cultured FSHD muscle, muscle biopsies from FSHD individuals had readily detectable mRNA of DUX4-fl regulated genes, although at varying levels in different biopsies (Figure 4B). The DUX4-fl mRNA is at extremely low abundance in FSHD muscle and it is notable that some biopsy samples in which the DUX4-fl mRNA was not detected showed elevation of DUX4 regulated targets (Table S5 and Figure S3A), indicating that the target mRNA is of significantly higher abundance and perhaps more stable than the DUX4 mRNA.
To determine whether the expression of these genes in FSHD muscle cells was directly due to DUX4, we transfected FSHD muscle cells with siRNA to the endogenous DUX4-fl mRNA. The siRNA sequences that decreased the DUX4-fl mRNA also resulted in decreased expression of the DUX4 target genes (Figure 4C, siRNA 1 and siRNA 4), confirming that endogenous DUX4 drives the expression of these genes in FSHD muscle cells. In addition, expression of the dominant negative DUX4-s (see Figure 2D) also inhibited the endogenous expression of the target genes (Figure S3B).
As noted above, genes enriched in the innate immunity pathways were expressed at lower levels in myoblasts transduced with lenti-DUX4-fl compared to the lenti-GFP or lenti-DUX4-s (see Table S2B). When compared to non-transduced cells, it was evident that about 350 genes, most of which were in the innate immunity pathway, were unchanged in the lenti-DUX4-fl transduced myoblasts but increased in cells transduced with either control lenti-GFP or lenti-DUX4-s (Table S6). Therefore, lentiviral induction of the innate immune response in human muscle cells appeared to be inhibited by DUX4-fl. RT-qPCR validated that lenti-GFP, lenti-DUX4-s, and multiple other lentivirus constructs induced the innate immune response in myoblasts, whereas similar titers of lenti-DUX4-fl did not (Figure 5A, left, and data not shown). Additionally, supernatant from DUX4-fl infected cells reduced the induction of these genes by lenti-GFP (Figure 5B), indicating that a secreted factor induced by DUX4-fl could mediate this suppressive effect.
DUX4-fl robustly induced expression of DEFB103A and DEFB103B (β-defensin 3) (Figure 5A, right, Table 1 and S1), which has been shown to inhibit the transcription of pro-inflammatory genes in TLR4-stimulated macrophages (Semple et al., 2011). Indeed, addition of DEFB103 peptide also inhibited the induction of the innate immune response to lenti-GFP when added to the muscle cells at the time of infection (Figure 5B) but did not prevent viral entry and transduction as measured by copies of viral integrants in the genome and levels of GFP mRNA expressed (data not shown). Thus, DUX4 can prevent the innate immune response to viral infection in skeletal muscle cells, at least in part, through the transcriptional induction of DEFB103.
Like other DUX4-regulated genes, endogenous expression of DEFB103 was detected in FSHD cultured muscle cells, FSHD muscle biopsies, and in healthy testes, but little to none was seen in control skeletal muscle (Figure 5C). DEFB103 has been previously shown to bind the CCR6, CCR2, and melanocortin receptors and to be an antagonist ligand for the CXCR4 receptor, which is important for muscle cell migration and differentiation (Candille et al., 2007; Feng et al., 2006; Jin et al., 2010; Yang et al., 1999). To determine whether DEFB103 could affect myoblasts or muscle differentiation, we treated cultured control human muscle cells with DEFB103 peptide at concentrations used in anti-microbial and immunological assays (2.5-5.0 μg/ml) (Funderburg et al., 2007; Midorikawa et al., 2003; Semple et al., 2011) and assessed changes with gene expression arrays. Based on a 2-fold change threshold, DEFB103 did not alter the expression of any genes in myoblasts, although it is of interest that myostatin was upregulated approximately 50% and RT-qPCR confirmed that DEFB103 increased the mRNA for myostatin in myoblasts (Figure 5D). In contrast, exposing differentiating muscle cells to DEFB103 reduced the expression of 44 genes relative to the untreated control, the majority of which were genes associated with muscle differentiation (Table S7), and RT-qPCR on select genes (ACTA1, CKM, MYH2 and TNNT3) validated the array results (Figure 5E). Furthermore, muscle cultures exposed to DEFB103 during 72 hr of differentiation media showed decreased fusion to myotubes and decreased expression of myosin heavy chain, a marker of terminal muscle differentiation (Figure 5F). Therefore, DEFB103 activates the expression of myostatin in myoblasts and inhibits the expression of genes necessary for normal muscle differentiation, although it remains to be determined whether this activity is mediated by the CXCR4 receptor.
Therefore, DUX4-mediated expression of DEFB103 in FSHD muscle can modulate the innate immune response and can inhibit myogenic differentiation.
Recent genetic and molecular studies indicated DUX4 as the likely candidate gene for FSHD (Dixit et al., 2007; Lemmers et al., 2010; Snider et al., 2010). Although the abundance of DUX4-fl mRNA was extremely low in FSHD muscle, we previously showed that this represented relatively high expression of both DUX4-fl mRNA and protein in a small percentage of muscle nuclei at any time point, either because the gene was on transiently or the expressing nuclei were eliminated (Snider et al., 2010). Yet, it remained unclear whether DUX4-fl expression had a biological consequence in FSHD. In our current study, we identify genes regulated by DUX4-fl and show that they are expressed at readily detectable levels in FSHD skeletal muscle, both cell lines and muscle biopsies, but not in control tissues, providing direct support for the model that misexpression of DUX4-fl is a causal factor for FSHD. The genes regulated by DUX4-fl suggest several specific mechanisms for FSHD pathophysiology.
Many of the genes highly upregulated by DUX4-fl normally function in the germline and/or early stem cells and are not present in healthy adult skeletal muscle. This supports a biological role for DUX4-fl in germ cell development and suggests potential disease mechanisms for FSHD. Activation of the gametogenic program might be incompatible with post-mitotic skeletal muscle, leading to apoptosis or cellular dysfunction. Also, the testis is an immune-privileged site and testis proteins misexpressed in cancers can induce an adaptive immune response (Simpson et al., 2005). In fact, some of the genes regulated by DUX4-fl, such as the PRAME family (Chang et al., 2011), are known cancer testis antigens, so it is reasonable to suggest that expression of these genes in skeletal muscle might also induce an adaptive immune response. An immune-mediated mechanism for FSHD is consistent with the focal inflammation and CD8+ T-cell infiltrates that characterize FSHD muscle biopsies (Frisullo et al., 2011; Molnar et al., 1991).
The induction of DEFB103 by DUX4 might influence both the adaptive and the innate immune response. DEFB103 can have a pro-inflammatory role in the adaptive immune response (Funderburg et al., 2007) and can act as a chemo-attractant for monocytes, lymphocytes and dendritic cells (Lai and Gallo, 2009). In this regard, it might enhance an adaptive immune response to germline antigens expressed in FSHD muscle. Though traditionally known for its role in antimicrobial defense (Sass et al., 2010), DEFB103 has been shown to suppress the innate immune response to LPS and TLR4 stimulation in macrophages (Semple et al., 2011; Semple et al., 2010), and has also been shown to be an antagonistic ligand of the CXCR4 receptor (Feng et al., 2006), which is important for muscle migration, regeneration, and differentiation (Griffin et al., 2010; Melchionna et al., 2010). In this study we show that DEFB103 inhibited the innate immune response to lentiviral infection in skeletal muscle cells, modestly induced myostatin in myoblasts, and impaired muscle cell differentiation. Therefore, DEFB103 might contribute to FSHD pathology by modulating the adaptive and innate immune response, as well as through inhibiting muscle differentiation. In this regard it is interesting to note that expression of murine Defb6 in the skeletal muscle of transgenic mice induced progressive muscle degeneration (Yamaguchi et al., 2007), although the mechanism was not determined.
Reactivation of retroelements can result in genomic instability (Belancio et al., 2010) and transcriptional deregulation (Schulz, 2006). Therefore, DUX4 activation of MaLR transcripts might directly contribute to FSHD pathophysiology. It is interesting that DUX4 both activates retroelement transcription and suppresses the virally induced innate immune response. Although we have shown that DEFB103 can substitute for DUX4 to suppress the innate immune response, products of retroelements and endogenous retroviruses may do the same and, thus, the DUX4-mediated suppression of the innate immune response might be multi-factorial. Since DEFB103 is also expressed in the testis, it is interesting to consider whether the role of DUX4 in the germline might include a simultaneous activation of retroelement transcription and suppression of the innate immune response to those transcripts.
DUX4 regulated targets also include genes involved in RNA splicing, developmentally regulated components of the Pol II transcription complex, and ubiquitin-mediated protein degradation pathways, all of which may have pathophysiological consequences. A recent study indicated that induction of E3 ubiquitin ligases by DUX4 might cause muscle atrophy in FSHD (Vanderplanck et al., 2011), consistent with our findings that multiple ubiquitin ligase family members are induced by DUX4. In addition, myostatin induces some of these ubiquitin ligases in skeletal muscle (Lokireddy et al., 2011) and it is therefore possible that both DUX4 induction of ubiquitin ligases and the modest upregulation of myostatin by DEFB103 that we observed in this study contribute to muscle atrophy. DUX4 is also known to induce apoptosis in muscle cells and DUX4-mediated myopathy in mice has been shown to be p53-dependent (Wallace et al., 2011). As noted above, the activation of retrotransposons or reactivation of the gametogenic program, particularly inducers of cell cycle in post-mitotic muscle, might contribute to apoptosis. In addition, the altered expression of many factors involved in RNA transcription and splicing might affect cell differentiation and survival. In many human diseases a single mutation can effect multiple pathological pathways that collectively account for the complex disease phenotype. Our study of DUX4 regulated genes has identified several candidate pathways and future work will be necessary to determine their relative contributions to the disease phenotype.
In this regard, other genes have been identified as candidates for FSHD. For example, FRG1 expression has been reported to be elevated in FSHD muscle (Gabellini et al., 2002) and FRG1 transgenic mice display a muscular dystrophy phenotype (Gabellini et al., 2006). It is interesting that FRG1 is reported to alter RNA splicing in FSHD muscle (Gabellini et al., 2006) and that our study shows that DUX4-fl also alters the expression of many genes that regulate splicing and RNA processing. It will be important to determine the relative contributions of DUX4 and FRG1 to FSHD pathophysiology; however, the human genetics shows a convincing linkage to polymorphisms necessary for the polyadenylation of the DUX4 mRNA (Lemmers et al., 2010), indicating that DUX4 mRNA is a necessary component of the disease. Therefore, one therapeutic avenue to pursue for FSHD is to reduce the activity of DUX4, either by eliminating its expression in the muscle cells as we have done in vitro with an siRNA or by introducing a dominant negative, such as the DUX4-s splice form.
A previous study identified PITX1 as a DUX4 target gene expressed in FSHD skeletal muscle and in mouse cells transfected with DUX4 (Dixit et al., 2007). Others have expressed DUX4 in mouse muscle cells and identified repression of the glutathione redox pathway (Bosnakovski et al., 2008). Both of these findings are consistent with our expression array data. However, since many of the DUX4 binding sites reside in primate-specific MaLRs and some of the DUX4 targets are not conserved in mice, further studies are necessary to determine the conserved and primate-specific functions of DUX4, an important consideration for evaluating mouse models of FSHD.
In conclusion, our data support the model that inappropriate expression of DUX4 plays a causal role in FSHD skeletal muscle pathophysiology by activating germline gene expression, endogenous retrotransposons, and suppressors of differentiation in skeletal muscle. The set of genes robustly upregulated by DUX4 in FSHD skeletal muscle are candidate biomarkers because they are absent in control muscle and easily detected in FSHD1 and FSHD2 muscle. Furthermore, some target genes encode secreted proteins, which offer the potential for developing blood tests to diagnose FSHD or monitor response to interventions. Beyond their utilities as candidate biomarkers, the DUX4 targets identified in this study point to specific mechanisms of disease and may help guide the development of therapies for FSHD.
Primary human myoblasts were collected and cultured as previously described (Snider et al., 2010). Human RD cells were grown in DMEM in 10% bovine calf serum (Hyclone) and penicillin/streptomycin.
Quadruplicate total RNA samples were collected from control human primary myoblasts transduced with lentivirus carrying DUX4-fl, DUX4-s or GFP (MOI = 15) for 24 h. Samples were analyzed by Illumina HumanHT-12 v4 Expression BeadChip Whole Genome arrays. Probe intensities were corrected, normalized, and summarized by the Lumi package of Bioconductor (Du et al., 2008). Differentially expressed genes were identified by the LIMMA package of Bioconductor (Wettenhall and Smyth, 2004). Gene set enrichment analysis (GSEA) was performed using the Bioconductor GOstats package (Falcon and Gentleman, 2007).
ChIP was performed and ChIP DNA samples were prepared as previously described (Cao et al., 2010). Anti-DUX4 C-terminus rabbit polyclonal antibodies MO488 and MO489 were combined to immunoprecipitate DUX4-fl. The samples were sequenced with Illumina Genome Analyzer II.
Sequences were extracted by Illumina package GApipeline and reads were aligned using BWA to the human genome (hg18). We only kept one of the duplicated sequences to minimize the artifacts of PCR amplification. Reads mapping to multiple locations in the genome were excluded from our analysis. Each read was extended in the sequencing orientation to a total of 200 bases to infer the coverage at each genomic position. We performed peak calling by an in-house developed R package “peakSig” (pending submission to Bioconductor), which models background reads by a negative binomial distribution. The negative binomial distribution can be viewed as a continuous mixture of Poisson distributions where the mixing distribution of the Poisson rate is modeled as a Gamma prior. This prior distribution is used to capture the variation of background read density across the genome. Model parameters were estimated by fitting the truncated distribution on the number of bases with low coverage (one to three), to avoid the problem of inferring effective genome size excluding the non-mappable regions, and to eliminate contamination of any foreground signals in the high coverage regions. We also fit a GC dependent mixture model so that the significance of the peaks is determined not only by peak height, but also by the GC content of the neighboring genomic regions.
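The role of the negative binomial background model can be illustrated with a small Python sketch that computes the probability of observing a coverage at least as high as a given value. This is not the peakSig package and does not reproduce its parameter fitting or GC-dependent mixture; the function names and the (size, mu) parameterization, where size is the Gamma shape and mu the mean background coverage, are assumptions chosen for the example.

```python
import math

def nb_log_pmf(k, size, mu):
    """Log probability of k under a negative binomial with Gamma shape
    `size` and mean `mu` (mu > 0), i.e. a Gamma-Poisson mixture."""
    p = size / (size + mu)
    return (
        math.lgamma(k + size) - math.lgamma(k + 1) - math.lgamma(size)
        + size * math.log(p) + k * math.log1p(-p)
    )

def nb_upper_tail(k, size, mu, max_terms=100000):
    """P(X >= k), summed directly so that very small values keep precision."""
    total = 0.0
    for i in range(k, k + max_terms):
        term = math.exp(nb_log_pmf(i, size, mu))
        total += term
        # Past the mean the terms shrink monotonically; stop once negligible.
        if i > mu and term < total * 1e-16:
            break
    return min(total, 1.0)
```

A small `size` gives a heavier tail than a Poisson model with the same mean, so the same coverage is judged less significant when background density varies across the genome; as `size` grows large the result approaches the Poisson tail.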
We used an in-house R package, “motifRG” (pending submission to Bioconductor), described previously for discriminative de novo motif discovery (Palii et al., 2011). Briefly, it finds motifs that distinguish positive from negative sequence datasets, which in this study correspond to DUX4 binding sites and randomly sampled flanking regions of DUX4 binding sites, respectively. To generate a more accurate representation of the DUX4 binding sites than the consensus pattern returned by motifRG, we further fit a positional weight matrix (PWM) model, using the matches to the consensus pattern as the seed to initialize an iterative expectation-maximization (EM) refinement process similar to MEME. The motifs were extended iteratively as long as there was sequence preference in the flanking region, and refined in the same EM process.
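The seeding step of the PWM model can be illustrated with a minimal Python sketch. It covers only the initialization from seed matches and log-odds scoring, not the EM refinement or motif extension, and it is not the motifRG code. The seed sites, the pseudocount, and the uniform background frequency are assumptions chosen for the example.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Column-wise base frequencies from equal-length aligned seed sites."""
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in BASES})
    return pwm

def log_odds(pwm, window, background=0.25):
    """Log2 odds of a window under the PWM versus a uniform background."""
    return sum(math.log2(col[base] / background)
               for col, base in zip(pwm, window))

def best_match(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = [(log_odds(pwm, sequence[i:i + width]), i)
              for i in range(len(sequence) - width + 1)]
    return max(scored)

# Toy seed sites for illustration only (not binding sites from the study).
seeds = ["TAATCA", "TAATCA", "TGATCA", "TAATTA"]
pwm = build_pwm(seeds)
score, start = best_match(pwm, "GGGTAATCAGGG")
```

In an EM refinement of the kind described, the scores of all candidate windows would be converted to posterior site probabilities (E step) and the PWM columns re-estimated from those weighted windows (M step) until convergence.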
Transient DNA transfections of RD cells were performed using SuperFect (Qiagen) according to the manufacturer’s specifications. Briefly, 3 × 10⁵ cells were seeded per 35 mm plate the day prior to transfection. Cells were co-transfected with pCS2 expression vectors (2 μg/plate) carrying either β-galactosidase, DUX4-fl or DUX4-s and with either pGL3-promoter luciferase reporter vectors (1 μg/plate) carrying various putative DUX4 binding sites or mutant sites upstream of the SV40 promoter, or pGL3-basic reporter vectors (1 μg/plate) carrying test promoter fragments upstream of the firefly luciferase gene. Cells were lysed 24 h post-transfection in Passive Lysis Buffer (Promega). Luciferase activities were quantified using reagents from the Dual-Luciferase Reporter Assay System (Promega) following the manufacturer’s instructions. Light emission was measured using a BioTek Synergy2 luminometer. Luciferase data are given as the averages ± SD of at least triplicates.
One microgram of total RNA was reverse transcribed into first strand cDNA in a 20 μL reaction using SuperScript III (Invitrogen) and digested with 1 U of RNase H (Invitrogen) for 20 min at 37°C. cDNA was diluted and used for quantitative PCR with iTaq SYBR Green supermix with ROX (Bio-Rad). The relative expression levels of target genes were normalized to those of ribosomal protein L13A (RPL13A) by 2^ΔCt. Undetermined values were equated to zero. Standard deviations from the mean of the ΔCt values were calculated from triplicates. PCR primers used for detecting the transcripts of the selected genes are listed in Supplementary methods. All primers amplify with similar and high (>90%) efficiencies.
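The normalization can be written out as a short Python sketch. The text does not define the sign of ΔCt, so the sketch assumes ΔCt = Ct(RPL13A) − Ct(target), under which higher target expression gives a larger value. The function names and Ct values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev

def delta_ct(ct_target, ct_reference):
    """ΔCt under the assumed convention Ct(reference) - Ct(target)."""
    return ct_reference - ct_target

def relative_expression(ct_target, ct_reference):
    """Relative expression as 2 raised to ΔCt.

    An undetermined target Ct (passed as None) is equated to zero expression.
    """
    if ct_target is None:
        return 0.0
    return 2.0 ** delta_ct(ct_target, ct_reference)

# Hypothetical triplicate Ct values for a target gene and RPL13A.
target_ct = [25.0, 25.5, 24.5]
rpl13a_ct = [20.0, 20.0, 20.0]

deltas = [delta_ct(t, r) for t, r in zip(target_ct, rpl13a_ct)]
mean_delta = mean(deltas)          # mean ΔCt across triplicates
sd_delta = stdev(deltas)           # sample SD of the ΔCt values
expression = 2.0 ** mean_delta     # relative expression from the mean ΔCt
```

Because the SD is computed on ΔCt rather than on the exponentiated values, error bars on the expression scale are asymmetric (2^(mean − SD) to 2^(mean + SD)).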
Muscle biopsy samples were collected at the University of Rochester Fields Center from the vastus lateralis muscle of clinically affected and control individuals as previously described (Snider et al., 2010). RNA from matched tissues from healthy donors was purchased from BioChain (Hayward, CA).
Statistical significance between two means was determined by unpaired one-tailed t tests, with P < 0.05 considered significant. Statistics for the microarray and ChIP-Seq experiments are described separately.
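For reference, an unpaired one-tailed t test can be sketched in plain Python. The text does not state whether equal variances were assumed, so this sketch uses the pooled-variance (Student) form as an assumption, and it tests the alternative that the first group has the larger mean. The tail probability is obtained by numerical integration of the t density, and the example values are hypothetical.

```python
import math
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t), integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

def one_tailed_t_test(group_a, group_b):
    """Pooled-variance t test of H1: mean(group_a) > mean(group_b)."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df, t_upper_tail(t, df)

# Hypothetical triplicate measurements (not data from the study).
t_stat, dof, p_value = one_tailed_t_test([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
significant = p_value < 0.05
```

If unequal variances were intended, the Welch form would replace the pooled variance and the degrees of freedom with the Welch-Satterthwaite approximation.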
Supported by NIAMS R01AR045203, NINDS P01NS069539, and Friends of FSH Research; University of Washington Genome Sciences Training Grant (L.N.G.), and University of Washington Child Health Research Center, NIH U5K12HD043376-08 (A.P.F.). We thank M. Conerly, S. Diede, K. MacQuarrie, L. Maves and K. Siebenthall for review of the manuscript; A. Tyler for technical assistance; M. Gale and J. Smith for advice on innate immunity and defensins; and F. Rigo and F. Bennett at ISIS Pharmaceuticals for assistance with siRNA design.
Accession numbers: Microarray and ChIP-seq data have been deposited in Gene Expression Omnibus (GEO) under accession numbers GSE33799 and GSE33838, respectively.
The authors declare no conflict of interest.