We have performed an integrative study of complementary experimental techniques in order to identify the core gene regulatory network of OCT4 in the context of maintaining pluripotency and preventing differentiation along the trophoblast lineage. This network represents a rather conservative selection, since we have chosen only genes supported by strong evidence, namely significant results in ChIP-on-chip analysis, RNAi analysis and promoter sequence analysis.
Figure illustrates the limited overlap between the different technologies. For example, only 40 genes are shared between the ChIP-on-chip targets (12%) and the RNAi targets (4%). This observation is in line with the results of a similar approach that compared altered gene expression after Oct4 silencing with TF binding in mouse ESCs (<9% overlap) [20]. Genes that show altered expression but no binding site may be regulated by an inter-dependent network, in which loss of expression of one factor ultimately leads to the suppression of the others [16]. Additionally, the RNAi targets also include downstream effects that are independent of direct protein-DNA binding of OCT4, which explains the higher number of RNAi targets (1,104) compared to ChIP-on-chip targets (308). Alternatively, TF binding may not be limited to the promoter regions interrogated by the tiling arrays. On the other hand, genes that have OCT4 binding sites but do not show altered expression may be regulated by a more complex system involving OCT4 co-factors or epigenetic modifications such as methylation and demethylation of CpG dinucleotides within promoter regions, or may simply respond at later time points during differentiation into one of the three germ layers or into the trophectoderm lineage. Hence, independent validations such as the accessed knockdown experiment are critical in distinguishing functional from non-functional circuitries [20].
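The overlap comparison described above amounts to intersecting two target lists and expressing the shared genes as a fraction of each list. The following is a minimal Python sketch of that bookkeeping, not the pipeline used in this study. The function name `overlap_summary` and the placeholder gene identifiers are hypothetical and do not correspond to the actual target sets.

```python
def overlap_summary(chip_targets, rnai_targets):
    """Summarize the overlap between two lists of target gene identifiers."""
    chip = set(chip_targets)
    rnai = set(rnai_targets)
    shared = chip & rnai  # genes supported by both experiments
    return {
        "shared": sorted(shared),
        "n_shared": len(shared),
        # Fraction of each experiment's targets that the other one confirms.
        "frac_of_chip": len(shared) / len(chip) if chip else 0.0,
        "frac_of_rnai": len(shared) / len(rnai) if rnai else 0.0,
    }


# Hypothetical placeholder identifiers, for illustration only.
chip_toy = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
rnai_toy = ["GENE_A", "GENE_B", "GENE_E", "GENE_F",
            "GENE_G", "GENE_H", "GENE_I", "GENE_J"]

summary = overlap_summary(chip_toy, rnai_toy)
print(summary["n_shared"], summary["frac_of_chip"], summary["frac_of_rnai"])
```

Because the RNAi list is the larger one, the same shared genes make up a smaller fraction of it, which mirrors the asymmetry between the two percentages quoted above.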
Although the presented network is rather conservative and potentially neglects genes regulated by OCT4 together with unknown interaction partners, it represents the functional regulatory circuitry of direct OCT4 target genes in hESCs as deduced from the available data. Both ChIP-on-chip experiments and motif prediction analyses are well known to generate a large number of false positives. Additionally, RNAi experiments reveal not only direct but, to a much greater extent, indirect targets. A rather conservative procedure for identifying OCT4 target genes therefore has the benefit of reducing this large number of false positives. One indicator of this is that integrating the different experiments enriches the functional content of the resulting targets in all investigated functional classes by factors of 1.5 to 4, as shown in Figure .
Recent studies report OCT4/Oct4 expression in the adult, most frequently in the bone marrow of both humans and mice, particularly in hematopoietic and mesenchymal stem cells as well as in various sub-populations of multipotent progenitors [37]. It has been suggested that Oct4 may not only be crucial for the maintenance of pluripotency in embryonic cells but may also play an important role in the self-renewal of somatic stem cells [37]. However, Lengner et al. [37] have shown that Oct4, even if expressed at low levels in somatic cells, is dispensable for the self-renewal of somatic stem cells and for the regeneration of tissue in the adult, and is only rarely activated in somatic tumors. Based on these observations, we do not consider OCT4/Oct4 to be a key player in the transcriptional regulation of pluripotency in either mesenchymal stem cells or other adult stem cells. The identified core regulatory network of OCT4 was created in the context of maintaining pluripotency and preventing differentiation along the trophoblast lineage in human embryonic stem cells.
Our de novo motif discovery approach revealed not only known OCT4 binding sites but also motifs similar to binding sites recognized by regulators that are known to interact with components of the OCT4 regulatory network, as well as by genes that may have important functions as downstream effectors of OCT4 but have not yet been described. Besides the co-factors presented above, a predominantly occurring motif is similar to a binding site recognized by Sp1 (Specificity protein 1), a transcription regulator that plays a role in TGF beta induced cell migration and mesenchymal transition and regulates angiogenesis and heart contraction; its aberrant expression is associated with several types of cancer. Yang et al. proposed that Sp1 or Sp3 play a critical role in controlling the transcriptional activity of OCT4 by direct binding, and an overexpression study showed that Sp1 positively regulates OCT4 promoter activity [38]. The Sp1 motif was identified within the promoter regions of OCT4 and of other OCT4 target genes (see Figure ). Sp1 is closely connected to the network as it binds to FGF2, C-MYC, HOXB7, and Spp1 (the latter two genes are upregulated by Sp1), and interacts with Egr-1. Moreover, Sp1 interacts with CP2A, a TF which in turn regulates PAX6 [19] (not indicated in the extended network), a transcription factor which is a member of the differentiation related OCT4 target genes. From the mouse model it is known that Sp1 binds to Foxa1 and Cdx2. Egr-1 (Early growth response 1) is a transcription factor that acts in apoptosis, angiogenesis, and cell differentiation and regulates TNF production, cell proliferation, and adhesion; aberrant expression of the gene is associated with several types of cancer. HOXB7 (Homeo box B7) is a transcriptional activator and functions in DNA double strand break repair by nonhomologous end joining. Both Egr-1 and HOXB7 bind to the promoter region of FGF2, and a motif similar to the binding site of Egr-1 was obtained (see Figure ).
As another example, the de novo motif discovery identified a binding site similar to a motif recognized by STAT1 (Signal transducer and activator of transcription 1), a gene that mediates DNA replication, cell proliferation, apoptosis, and cell cycle regulation. It is known that STAT1 binds to C-MYC and is upregulated by Sp1. Several of the OCT4 target genes show a putative STAT1 binding site within their promoter region (see Figure ).
HNF4A (Hepatocyte nuclear factor 4 alpha) has a known binding site similar to one of the obtained motifs. It is a transcription factor that inhibits GH1 induced STAT5 and JAK2 phosphorylation and functions in hepatocyte differentiation and blood coagulation. HNF4A expression is upregulated by Sp1 and is a target of GATA6, a transcription factor which is a member of the differentiation related OCT4 target genes. RAB5A and TGIF2 show a putative HNF4A binding site within their promoter region (see Figure ).
PAX4 (Paired box gene 4) is a putative RNA polymerase II transcription factor that acts in positive regulation of cell proliferation, and motifs similar to the known binding site for PAX4 were obtained. PAX4 itself has a binding site for HNF4A, which is a downstream target of GATA6 [19]. There are putative PAX4 binding sites within the promoter regions of several OCT4 target genes (see Figure ).
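Putative binding sites such as those discussed above are located by matching a motif against promoter sequence on both strands. The sketch below shows the simplest form of that step: scanning for a known consensus written in IUPAC notation. This is a different and much cruder technique than the de novo motif discovery used in this study, which infers motifs from the sequences themselves. The helper names are hypothetical, the example sequences are made up, and the consensus `ATGCAAAT` is the commonly cited octamer motif, used here only for illustration.

```python
import re

# IUPAC nucleotide codes expanded into regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(promoter, consensus):
    """Return sorted (position, strand) hits of an IUPAC consensus.

    Positions are 0-based and always refer to the forward strand.
    A lookahead is used so that overlapping matches are reported too.
    """
    seq = promoter.upper()
    body = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile("(?=(" + body + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # Scan the reverse strand and map coordinates back to the forward strand.
    rc = reverse_complement(seq)
    n, k = len(seq), len(consensus)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


# Made-up sequences, for illustration only.
forward_hit = find_motif("GGATGCAAATCC", "ATGCAAAT")
reverse_hit = find_motif("CCATTTGCATGG", "ATGCAAAT")
print(forward_hit, reverse_hit)
```

An exact consensus match is a strict criterion. Motif models obtained from de novo discovery are usually represented as position weight matrices and scored against a threshold instead.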
The computed OCT4 core regulatory network can be utilized in multiple ways. Well-characterized OCT4 target genes will help in extending the OCT4 network by suggesting further experimental work. The relatively high proportion of TFs in the OCT4 target set can be used for further inhibition studies or protein-DNA binding experiments, which extends the radius of the network. For example, Additional file 4 shows that OCT4 has a positive regulatory effect on FGF2. FGF2 re-stimulation experiments performed by Greber et al. in hESCs revealed BMP4 as a downstream target of FGF2 signaling [27]. BMP4 expression was activated upon OCT4 knockdown in the original experiment as well, so both experiments consistently confirm that BMP4 is a negatively regulated downstream target of OCT4. Such an extended network, and even the constructed core regulatory network, will ultimately help in the study of stemness and early embryogenesis. Figure shows that functional enrichment for "embryonic development" increases from 8% to 30% with the integrative approach.
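The enrichment factors quoted in this discussion can be read as ratios of proportions. As a small worked example, assuming that 8% and 30% are the shares of targets annotated with "embryonic development" before and after integration, the corresponding factor is computed below. The function name `fold_enrichment` is hypothetical.

```python
def fold_enrichment(pct_after, pct_before):
    """Ratio of the annotated share after integration to the share before."""
    if pct_before <= 0:
        raise ValueError("pct_before must be positive")
    return pct_after / pct_before


# Percentages for "embryonic development" as quoted in the text.
embryonic_dev = fold_enrichment(30, 8)
print(f"{embryonic_dev:.2f}-fold")  # prints "3.75-fold"
```

A factor of 3.75 falls within the range of 1.5 to 4 reported for the investigated functional classes.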
Finally, the identification of targets and co-factors of OCT4 might help in the design of iPS reprogramming protocols that use different TFs in order to generate iPS cells and monitor cell status. C-MYC has already been successfully applied within a set of TFs for generating iPS cells through reprogramming. Figure provides guidance for testing a variety of these co-factors.