Our quest for the nuclear factor binding the QTN site in the porcine
IGF2 gene has been driven by the vision that this factor must be important, since disruption of the interaction with one of its target sites alters body composition and promotes cardiac growth in pigs. Our previous experiments revealed a highly specific interaction between the factor and its target site in
IGF2 [2], but it was not until we used the ultrasensitive SILAC technology that we could take advantage of this specificity and isolate ZBED6. The results presented in this study have conclusively demonstrated that ZBED6 is the bona fide repressor binding the QTN site in pig
IGF2. This conclusion is based on (i) EMSA with recombinant ZBED6 protein, (ii) supershift of the EMSA complex using an anti-ZBED6 antibody, (iii) abolishment of the repressor function in a luciferase assay after siRNA silencing, and (iv) ChIP data. The biological significance of ZBED6 was underscored in this study, as siRNA silencing in C2C12 cells led to faster myotube formation and wound healing, and increased cell proliferation.
The difference between less complex eukaryotes like
Caenorhabditis elegans and more complex eukaryotes, such as humans, is related not to the number of protein-coding genes, but rather to the complexity of the gene regulatory networks. A large proportion of vertebrate genomes is composed of transposable elements, and their integration in the genome has contributed to the evolution of regulatory networks
[11]. The majority of these transposable elements are retrotransposons, but 5%–10% are derived from DNA transposons. In the initial analysis of the first human genome assembly, Lander et al.
[12] identified 47 human genes derived from transposable elements, as many as 43 of which are derived from DNA transposons; in fact, one of the genes listed in Table 13 of the human genome paper corresponds to
ZBED6. However,
ZBED6 has never been appropriately annotated in any mammalian genome, despite the fact that it constitutes an ~2,900-bp open reading frame and that part of the ZBED6 protein is extremely well conserved among placental mammals. A bioinformatic analysis of other vertebrate genomes did not reveal the presence of a functional
ZBED6 gene outside the placental mammals. We found evidence for a nonfunctional
ZBED6 sequence at the orthologous positions in the platypus and opossum genomes, but these genomes did not contain an extended open reading frame for
ZBED6 (unpublished data). This implies that the integration of
ZBED6 happened before the divergence of the monotremes from the other mammals, but that the gene has been inactivated or lost in monotremes and marsupials. Thus,
ZBED6 must have evolved its essential function in the time span after the split between marsupial and placental mammals, but before the radiation of different orders of placental mammals. An interesting topic for future research will be to reveal what advantage the development of ZBED6 as a new regulatory protein has provided to the placental mammals.
ZBED6 is an apparent example of a domesticated transposon that has lost its ability to transpose, because it occurs as a single copy gene at the same location in intron 1 of
ZC3H11A in all placental mammals for which at least a partial genome sequence is available. ZBED6 has evolved an essential function in this group, as indicated by the observation that the two DNA-binding BED domains (about 100 amino acids together) show nearly 100% amino acid identity across 26 placental mammals (
Figure S2). The two BED domains in ZBED6 have apparently evolved by internal duplication because the two copies are more similar to each other than to any other mammalian BED sequence. The mechanism by which ZBED6 acts as a repressor remains to be determined. Chromatin remodeling is an obvious possibility since other members of the ZBED family have this function. For instance, the
Drosophila Dref protein, a BED domain protein, is found in complex with the NURF chromatin remodeling complex and its human ortholog ZBED1 interacts with MI2, a chromatin remodeling factor, and PC2, a Polycomb group protein involved in heterochromatin formation
[13]. The ability of ZBED6 to interact with chromatin and affect transcriptional regulation is most likely a function derived from the ancestral transposase. The nucleolar localization of ZBED6 suggests that it may mediate transcriptional silencing by moving the
IGF2 locus and other targets to the nucleolus.
Our ChIP-sequencing experiment using mouse C2C12 myoblasts revealed more than 1,000 genes putatively regulated by ZBED6 in the mouse. We assume that a majority of these binding sites are true positives, because (i) we were able to generate a consensus binding motif with a perfect match to the established Igf2 binding site using peaks with both high and low enrichment levels, (ii) the majority of the binding sites occurred in the vicinity of a transcription start site (TSS), (iii) most of the binding sites occurred within or near CpG islands, in line with the established binding site in Igf2, and (iv) certain Gene Ontology terms were highly significantly enriched. Thus, although we are confident that ZBED6 interacts with a majority of the genes listed in
Table S1, transcriptome analysis will be required to assess the importance of ZBED6 for transcriptional regulation of these putative targets. In this context, it is worth emphasizing that disruption of the interaction between ZBED6 and the
IGF2 QTN in pigs leads to a 3-fold up-regulation of
IGF2 mRNA in skeletal muscle and altered body composition. Interestingly, our data indicated that the
Zbed6 gene itself was bound by ZBED6 (
Table S1), implying autoregulation of its expression.
About 1,200 of the ZBED6 binding sites in C2C12 cells occurred within 5 kb of the TSS of an annotated gene. The analysis of Gene Ontology terms associated with these genes revealed a highly significant enrichment for a number of important biological processes such as development, transcriptional regulation, and cell differentiation. As many as 262 of the putative target genes encode transcription factors, including 36 containing the homeobox domain, 26 members of the basic helix-loop-helix (bHLH) family, ten belonging to the FOX family, eight nuclear receptors, and seven members of the SOX family. Many of these putative ZBED6 targets have a crucial role during development, and the results suggest that ZBED6 is an important regulator of development, cell proliferation, and growth. The binding of ZBED6 to its target sites in
IGF2 leads to repression of
IGF2 expression both in pig skeletal muscle
[2] and in mouse C2C12 cells (this study). It may appear surprising that genes associated with neurogenesis were much more overrepresented in our peak list than genes associated with muscle development, given that we used mouse C2C12 myoblasts in this experiment. However, this pattern is expected if ZBED6 is primarily a repressor that silences genes that are not part of the developmental program of a given cell type. Another intriguing observation was the clear trend that ZBED6 preferentially binds downstream of the transcription start site, which appears logical for a repressor.
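The two positional criteria used above (a binding site within 5 kb of a TSS, and its placement upstream or downstream of that TSS) can be illustrated with a minimal sketch. The `Gene` record, the function names, and the coordinates are hypothetical and are not taken from the analysis pipeline of this study; the sketch also assumes the peak and the gene lie on the same chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; not the annotation format used in the study."""
    name: str
    tss: int      # coordinate of the transcription start site
    strand: str   # "+" or "-"


def signed_offset(peak_pos: int, gene: Gene) -> int:
    """Peak position relative to the TSS, in the direction of transcription.

    Positive values are downstream of the TSS, negative values upstream.
    Assumes the peak and the gene are on the same chromosome.
    """
    delta = peak_pos - gene.tss
    return delta if gene.strand == "+" else -delta


def classify_peak(peak_pos: int, gene: Gene, window: int = 5000) -> str:
    """Label a peak relative to a gene's TSS using a symmetric window (5 kb here)."""
    offset = signed_offset(peak_pos, gene)
    if abs(offset) > window:
        return "distal"
    if offset == 0:
        return "at_tss"
    return "downstream" if offset > 0 else "upstream"


# Illustrative coordinates only.
plus_gene = Gene("example_plus", tss=10_000, strand="+")
minus_gene = Gene("example_minus", tss=10_000, strand="-")
labels = [
    classify_peak(10_500, plus_gene),
    classify_peak(10_500, minus_gene),
    classify_peak(20_000, plus_gene),
]
```

Strand has to be taken into account because a peak at a higher coordinate than the TSS is downstream for a plus-strand gene but upstream for a minus-strand gene.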
Igf2 is an imprinted gene, but our list of top hits did not indicate any overrepresentation of imprinted genes. In this respect, it is noteworthy that the QTN mutation in pigs does not result in loss of imprinting, but rather exclusively increases the transcription from the paternal
Igf2 allele
[2]. Thus, ZBED6 is unlikely to be a regulator of imprinting. However, one of the identified ZBED6 targets is the gene for growth factor receptor-bound protein 10 (Grb10), also denoted Meg1 (maternally expressed gene 1), which is maternally expressed and encodes a potent growth inhibitor [14]. GRB10 binds to the insulin receptor (INSR) and the IGF1 receptor (IGF1R), and inhibits the growth-promoting activities of insulin (INS), IGF1, and IGF2.
The list of genes associated with ZBED6 binding sites (
Table S1) includes additional members, besides
Igf2, of the IGF-signaling pathway, namely the genes for the IGF1 receptor (
Igf1r), IGF2 binding protein 2 (
Igf2bp2), IGF binding protein 3 (
Igfbp3), and IGFBP-like protein 1 (
Igfbpl1), suggesting that ZBED6 is an important regulator of IGF signaling. Furthermore,
Grb10, as mentioned above, also takes part in the regulation of IGF signaling
[14].
Genome-wide association (GWA) studies have revealed a number of loci in the human genome associated with multifactorial disorders (Office of Population Genomics;
http://www.genome.gov/26525384). An examination of this database showed that the region harboring
ZBED6 is not one of the associated regions in any of the studies published so far. This means that the current GWA screens for different multifactorial disorders have not revealed any common
ZBED6 variants associated with disease. This does not exclude the possibility of rare sequence polymorphisms in ZBED6 affecting disease susceptibility in certain families. However, the ChIP-sequencing data indicated that ZBED6 has a fundamental role in regulating several biological processes. Mutations altering ZBED6 function or expression may therefore have severe pleiotropic effects through the many downstream targets. This notion is consistent with the nearly 100% conservation of the BED domains among placental mammals.
Our current model for ZBED6 function is summarized in the accompanying figure. Our data on the IGF2 locus indicate that ZBED6 acts primarily as a repressor, likely with a modulating effect, although it is entirely possible that it acts as a transcriptional activator under some circumstances.
First, germline or somatic mutations at target sites may lead to transcriptional up-regulation as demonstrated for the
IGF2 locus in pigs
[2]. Our findings that the mammalian genome contains thousands of putative ZBED6 targets and that these are enriched among genes associated with disease suggest that sequence polymorphism at ZBED6 target sites may contribute significantly to variation in disease susceptibility in humans. Furthermore, the ZBED6 binding motif contains a CpG dinucleotide, so we expect to find genetic polymorphisms in it, because CpG sites are associated with a high rate of C→T and G→A transitions
[15], as exemplified by the pig
IGF2 QTN. Gain or loss of ZBED6 binding sites may also have contributed to phenotypic evolution in placental mammals.
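The reasoning about CpG mutability can be made concrete with a small sketch that lists the single-nucleotide transition variants expected at each CpG in a motif. The function name is hypothetical, and the example sequence `GCTCGC` is an assumption used only for illustration; the motif sequence itself is not stated in this section.

```python
def cpg_transition_variants(motif: str) -> list[str]:
    """Return the C->T and G->A variants at every CpG dinucleotide in a motif.

    A C->T change on the opposite strand reads as G->A on the given strand,
    so both substitutions are listed for each CpG.
    """
    motif = motif.upper()
    variants = []
    for i in range(len(motif) - 1):
        if motif[i:i + 2] == "CG":
            # C->T at the cytosine of the CpG (CpG becomes TpG)
            variants.append(motif[:i] + "T" + motif[i + 1:])
            # G->A at the guanine of the CpG (CpG becomes CpA)
            variants.append(motif[:i + 1] + "A" + motif[i + 2:])
    return variants


# Assumed example sequence, for illustration only.
example_variants = cpg_transition_variants("GCTCGC")
```

Each CpG contributes two candidate variants, so a motif with a single CpG has two transition variants that would each disrupt the dinucleotide.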
Second, our data suggest that ZBED6 targets can be released from repression by epigenetic activation. This is implied by the EMSA finding that an oligonucleotide with a methylated CpG site was not bound by ZBED6
[2]. Interestingly, the pig QTN had no effect on
IGF2 transcription in liver, and the QTN region was shown to be methylated in this tissue, whereas it was undermethylated in skeletal muscle where the QTN had a drastic effect on
IGF2 expression
[2]. Thus, epigenetic regulation of the access of ZBED6 to its target sites may play an important role during development and cell differentiation.
Third, ZBED6 targets can be released from repression by down-regulation of ZBED6 expression, as demonstrated by siRNA experiments in the present study. Finally, loss-of-function mutations in ZBED6 are expected to up-regulate many target genes. Our finding that Zbed6 silencing in C2C12 cells leads to faster cell proliferation and wound healing, combined with the identification of a large number of cancer-associated downstream targets by ChIP sequencing, implies that further studies of ZBED6 function are of considerable interest for tumor biology.
Data reported here suggest that ZBED6 has an essential role in a number of crucial gene regulatory networks. Thus, the discovery of ZBED6 opens up many avenues for research that may have profound implications for human medicine.