Genes that show a statistically significant difference in expression between iPSCs and ESCs and that are shared by several iPSC lines can be considered RRGs. RRGs were classified into two categories, Induced Genes and Inherited Genes, depending on their expression status in the somatic cells of origin. Induced Genes exhibit bivalent (H3K4K27) modification status in ES cells, with a predominance of Intermediate and Low CpG density promoters and promoters with non-defined CpG density (“ND”).
Faulty resetting of the inactive yet “poised” state of bivalent domains is critical for the subsequent differentiation of iPS cells into tissues (Kim et al., 2010). In contrast, the Inherited Genes category was enriched in univalent (H3K4) modification status in ES cells and showed a preponderance of High CpG density promoter genes.
Part of our Inherited Genes category (123 genes) was identified as fibroblast-associated in prior studies, while the Induced Genes category, associated with the iPS-specific reprogramming network, showed only a small overlap (8 genes) with resistant genes known from other studies (Chin et al., 2010; Gupta et al., 2010; Newman and Cooper, 2010; Lister et al., 2011; Ohi et al., 2011). The analysis of iPSC-differentiated cardiomyocyte beating clusters reported by Gupta et al. showed significant overlap with our results: 111 genes (87 up-regulated, 24 down-regulated) in our Inherited Genes category and only 6 genes in the Induced Genes category (Supplementary Tables 1, 2, 3, 4 “overlap” columns). Ten genes from our study (COMP1, DYNLT3, NME4, OXCT1, MGMT, PTGR1, MGC3207, CKLF, ZNF167, ZNF626) were verified by qPCR in their functional analysis to be over-expressed in somatic cells and to remain up-regulated in iPSC-derived cardiomyocyte beating clusters (Gupta et al., 2010). Seven of the 15 genes reported by Newman and Cooper as differentially expressed between iPS and ES cell lines across four laboratories were also found in the Inherited (6 genes) and Induced (1 gene) Genes categories in our study.
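The overlap counts above come down to intersecting gene lists. As a minimal sketch only, the following shows how such a comparison could be computed; the function name `overlap_report` and the external list (including the placeholder symbols `GENE_X` and `GENE_Y`) are hypothetical and are not taken from this study or from the cited reports. Only the ten gene symbols in `inherited_subset` appear in the text.

```python
def overlap_report(ours, reported):
    """Return the shared gene symbols and the fraction of `reported` found in `ours`.

    Symbols are compared case-insensitively; this is an illustrative
    convention, not necessarily the one used in the study.
    """
    ours_set = {gene.upper() for gene in ours}
    reported_set = {gene.upper() for gene in reported}
    shared = sorted(ours_set & reported_set)
    fraction = len(shared) / len(reported_set) if reported_set else 0.0
    return shared, fraction


# The ten qPCR-verified genes named in the text.
inherited_subset = [
    "COMP1", "DYNLT3", "NME4", "OXCT1", "MGMT",
    "PTGR1", "MGC3207", "CKLF", "ZNF167", "ZNF626",
]

# Hypothetical external list, for illustration only.
external_list = ["MGMT", "cklf", "GENE_X", "GENE_Y"]

shared, fraction = overlap_report(inherited_subset, external_list)
print(shared, fraction)
```

In practice the same function would be applied per category (Inherited, Induced) and per published list, which is what the “overlap” columns of the Supplementary Tables summarize.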
CSRP1 (cysteine and glycine rich protein 1; 6 cell lines, neonate and adult) and COMT (catechol-O-methyltransferase; 5 cell lines, neonate and adult) in the Inherited up-regulated group (Supplementary Table 3a) and C9orf64 (5 cell lines, neonate) in the Inherited down-regulated group (Supplementary Table 4a) are top somatic cell genes expressed in iPS cells (Ohi et al., 2011) that were also identified in our study. The fibroblast-associated gene CAT (catalase) (Warren et al., 2010) is the top shared gene (9/13 cell lines) in our Inherited up-regulated gene group (Supplementary Table 3a). The results of another functional analysis (Lister et al., 2011) overlapped with 5 genes in our list of Inherited Genes. These findings lead us to conclude that Induced Genes are more plastic with respect to reprogramming and subsequent differentiation into somatic cells than Inherited somatic memory genes.
Transcriptional analysis of the Induced Genes category revealed similar fractions of genes with predicted transcription factor binding motifs in the up- and down-regulated groups in each cell line. Stronger binding affinity was observed in the down-regulated groups of all cell lines, suggesting that stochastic gene silencing is the major problem to overcome for more successful reprogramming. The predicted NANOG binding motif in the proximity (−500 to +500) of the TSS in most cell lines leads us to infer its ectopic activation. OCT4 and KLF4 are comparatively regular in binding allocation, while SOX2 and c-Myc did not show any consistency in our study. All predicted TFBS are distally located, which is consistent with a recent experimental publication (Soufi et al., 2012).
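For readers who wish to reproduce the proximity criterion, the (−500 to +500) window around the TSS can be expressed as a strand-aware offset check. The sketch below is an illustration under stated assumptions: the function names, the coordinate convention (a single genomic position per motif), and the inclusive window bounds are ours for this example and do not describe the prediction software used in the study.

```python
def offset_from_tss(motif_pos, tss_pos, strand):
    """Signed distance of a motif from the TSS; negative values are upstream.

    On the minus strand, upstream corresponds to larger genomic coordinates,
    so the sign is flipped.
    """
    if strand == "+":
        return motif_pos - tss_pos
    if strand == "-":
        return tss_pos - motif_pos
    raise ValueError("strand must be '+' or '-'")


def in_proximal_window(motif_pos, tss_pos, strand, upstream=500, downstream=500):
    """True if the motif lies within [-upstream, +downstream] of the TSS (inclusive)."""
    offset = offset_from_tss(motif_pos, tss_pos, strand)
    return -upstream <= offset <= downstream


# Illustrative coordinates only (not real genomic positions).
print(in_proximal_window(1200, 1000, "+"))  # offset +200
print(in_proximal_window(1600, 1000, "-"))  # offset -600
```

Motifs failing this check would be counted as distal; the window sizes are parameters so that other definitions of promoter proximity can be tested.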
Concerning the general characteristics of RRGs at the pathway level, the Inherited Genes category included cancer- and apoptosis-related pathways, such as focal adhesion and the p53 signaling pathway, which were also observed in recent results (Soufi et al., 2012). This may contribute to an unwanted tumorigenic propensity of iPS cells, and further experimental verification of this issue is required. The Induced Genes category was enriched in the calcium signaling, cell adhesion, PPAR signaling, and tight junction pathways. These pathways may contribute to embryogenesis, development, and the immune response, but the biological implications of such differential expression are yet to be elucidated. It is evident that substantial numbers of genes are differentially expressed owing to various factors, leading to differences between iPS and ES cells.
Regarding the virus type, two conclusions can be drawn. First, virus-free and lentivirus-derived iPS cell lines have a lower number of down-regulated Inherited Genes, while retrovirus insertion in gene promoters seems to provoke a strong inhibitory effect. Remarkably, virus-free and lentivirus-derived iPS cell lines have a larger number of genes with bivalent (H3K4K27) modification status in ES cells, i.e., under these conditions somatic cells exhibit higher susceptibility to reprogramming than those generated through retrovirus transduction. Second, passage length anti-correlates with the number of RRGs.
For further improvement of iPSC technology, several factors should be taken into consideration. Based on the status of a gene in the somatic cell of origin, two types of RRGs should be considered: Induced Genes active in somatic substrate and Inherited Genes inactive in somatic substrate. A tiny fraction of Induced Genes (bivalent H3K4K27, LCP/ICP) was identified as up- or down-regulated in iPS-derived somatic cells (Gupta et al., 2010). This means that they might be activated during the first stage of reprogramming (Soufi et al., 2012), which leads us to suggest that longer passage can help to resolve this problem. More attention should be paid to the Inherited Genes category (univalent H3K4, HCP), which retains somatic cell transcriptional memory. These genes were abundantly found in iPS-derived somatic cells (Gupta et al., 2010) and overlap with RRGs from other studies (Chin et al., 2010; Newman and Cooper, 2010; Lister et al., 2011; Ohi et al., 2011). Active and demethylated High CpG density promoters might attract retrovirus insertion, which causes subsequent silencing of the promoter through de novo methylation. Virus-free or miRNA-mediated (Anokye-Danso et al., 2011) reprogramming may be a more plausible approach in the future. For future work, it is important to identify key transcription factors within the Inherited Genes category in order to reduce or block their activity. Donor age and developmental stage are important for the selection of the somatic cell substrate. While heterogeneous tissue culture does not simply reflect the epigenetic status of the substrate cell, several reports indicate that somatic cells/progenitor cells can be epigenetically favored substrates for nuclear resetting (Aasen et al., 2008; Silva et al., 2008).
The influence of RRGs on the integrity of function after the subsequent differentiation of iPSCs into organs and tissues is extremely important for the validation and standardization of iPSC technology, and our results can contribute to this effort.
Conflict of interest statement
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.