Induction of pluripotency by transcription factors has become a commonplace method to produce pluripotent stem cells. Great strides have been made in our understanding of the mechanism by which this occurs, particularly in terms of transcriptional and chromatin-based events, yet it is possible that only a small part of the complete picture has been revealed. Understanding the mechanism of reprogramming to pluripotency will not only help to improve the efficiency of the method, generate the highest-quality reprogrammed cells, and propel their therapeutic applications, but will also help to reveal the machinery that stabilizes cell identity and instruct the design of directed differentiation or lineage-switching strategies. To inform the next phase in our understanding of reprogramming, we review the latest findings, highlight ongoing debates, and outline future challenges to this important problem.
In 2006, Takahashi and Yamanaka published their milestone strategy to reprogram somatic mammalian cells to induced pluripotent stem cells (iPSCs) by overexpression of only four transcription factors: Oct4, Sox2, Klf4, and c-Myc1. The huge therapeutic potential of iPSCs makes understanding mechanisms underlying the reprogramming process of paramount importance. While reprogramming with transcription factors has become routine, we are only beginning to define how these factors induce pluripotency. During the last few years, however, several points have become clear. Many studies have demonstrated that mouse and human iPSCs are highly similar to their respective embryo-derived embryonic stem cell (ESC) counterparts morphologically, functionally, and molecularly at the level of transcription and genome-wide distribution of chromatin modifications2–16. The key mechanistic question of transcription factor-induced reprogramming to the iPSC state therefore is how the somatic program is erased and the ESC-like transcriptional network established to confer pluripotent capabilities. Despite the development of numerous methods to introduce the reprogramming factors into somatic cells, only a small percentage of cells expressing the factors make the complete trip to the pluripotent state. It is now believed that the inefficiency of reprogramming is attributable to epigenetic hurdles that are only overcome infrequently17,18. Steps are being defined that precede the activation of the endogenous pluripotency network and each step appears to be conquered by fewer and fewer cells18–24. Recent data also demonstrate that repressive chromatin states comprise a major mechanistic barrier to the induction of pluripotency6,23,25–29. Various extrinsic signals can modulate reprogramming and even affect the activity of the reprogramming factors, demonstrating the close relationship of extrinsic and intrinsic pathways in regulating reprogramming and cell identity21,22,24,30–35.
In addition, there is a question of whether a molecule that appears to accelerate reprogramming acts via changes in the cell cycle or by lowering reprogramming barriers17,36, and the notion that iPSCs carry an epigenetic memory of the starting cell may shed light on processes that are difficult to reset during reprogramming37,38.
In this review, we highlight recent important work on understanding transcription factor-induced reprogramming to iPSCs. While iPSCs can now be derived by various combinations of transcription factors and small molecules centering around Oct4 (for a review see39), we concentrate mostly on lessons learned from experiments performed with the original reprogramming factor cocktail consisting of Oct4, Sox2, Klf4, and cMyc. We discuss steps leading to faithful reprogramming, the function of the reprogramming factors in relation to the transcriptional network of pluripotent cells, and what is known about chromatin regulation in this process. We also consider whether analyses of molecular and functional similarities and differences between ESCs and iPSCs can illuminate mechanisms of reprogramming. Finally, we speculate on the best strategies to generate a complete accounting of the reprogramming process.
Reprogramming to pluripotency upon Oct4, Sox2, Klf4, and cMyc overexpression takes at least one to two weeks for the first reprogrammed cells to emerge in the culture dish. An important characteristic of this process is that only a few of the somatic cells that initially express the reprogramming factors eventually convert to the pluripotent state within this time frame17. In fact, an experiment that plated single pre-B cells into individual culture wells and quantified reprogramming in hundreds of these clonal cell populations demonstrated that only 3–5% of the wells induced pluripotency successfully within two weeks, and that even in successful wells only a small subset of daughter cells had undergone reprogramming17.
Even when most of the cells express all reprogramming factors, for instance through polycistronic cassettes that encode all four factors in a single construct or through secondary reprogramming systems that control for transgene expression, the number of faithfully reprogrammed colonies remains low relative to the number of dividing cells in the culture dish. This argues against the idea that the low efficiency of the process is attributable to heterogeneous transgene expression across the starting cell population40–49. The hypothesis that only non-lineage committed cells or adult stem cells are amenable to reprogramming has also been discarded as an explanation for the low efficiency, based on the ability of terminally differentiated cells, such as pancreatic islets or terminal blood lineages, to give rise to iPSCs50–56. In this regard, the aforementioned clonal reprogramming experiment demonstrated that virtually all cells in a donor pre-B cell population have the potential to give rise to a reprogramming event, because after 18 weeks in culture, more than 90% of the wells contained at least a few cells positive for a pluripotency marker17, further arguing against a model in which only a subset of cells can induce reprogramming. However, debate continues as to whether the degree of differentiation of cells within one lineage influences the efficiency and kinetics of the process17,51. Initially, it was also suspected that insertional mutagenesis upon viral insertion of the reprogramming factor coding DNA was required for reprogramming, but non-integrative reprogramming studies57–60 (for a review see61), mapping of viral insertion sites56,62,63, and the development of the “reprogrammable” mouse model with a defined integration site for a single inducible, polycistronic reprogramming factor cassette argue against this idea43,64.
Together, these findings have led to a model in which expression of the reprogramming factors per se is not sufficient to permit the transition to pluripotency and additional events are required to overcome major epigenetic barriers that prevent reprogramming17,18.
Due to the low efficiency of the process and the timescale involved, determining the events that occur between the initial expression of the reprogramming factors in somatic cells and the establishment of the pluripotent program has been challenging. To probe for mechanistic insights, mouse embryonic fibroblasts are most commonly used as starting cells for reprogramming experiments, and partially reprogrammed cells (pre-iPSCs) have been particularly valuable for analyzing some stages of the process (Box 1). In addition, the development of improved technologies, particularly of various tetracycline-inducible expression systems for the reprogramming factors (Table 1), and most recently of live imaging analysis, has had a huge impact on mechanistic studies, demonstrating how technology development and mechanistic insight are intimately connected in this field6,17–20,22,37,43–46,48,49,64.
Considering only those reprogramming events that occur within the first couple of weeks, many reports now suggest that successful reprogramming of fibroblasts requires the stepwise transition through key intermediate steps, and at each step, fewer and fewer cells advance due to secondary events that are still being discovered6,18–24. The steps of fibroblast reprogramming are described in more detail in the subsections below and the changes observed at each stage are shown in Figure 1. Briefly, induction of proliferation and downregulation of fibroblast-specific transcription are followed by acquisition of epithelial characteristics and activation of some ESC markers. Later, pluripotency-related genes are activated. Markers used to detect these steps include alkaline phosphatase and stage specific embryonic antigen-1 (SSEA1, also known as FUT4) at an intermediate stage and Nanog in the mouse system7,9,65 or the surface marker TRA-1-60 in human cells66 for the final stages. The steps en route to the iPSC state require continuous expression of the reprogramming factors but maintenance of iPSCs is independent of overexpression of these factors; this independence indicates a stable conversion of cell fate19,20.
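The arithmetic behind stepwise attrition can be sketched in a few lines of Python. This is only an illustration under a strong simplifying assumption: each stage is passed by a fixed fraction of the cells that reached it. The stage names follow the text, but the fractions and all identifiers (`STAGE_PASS_FRACTION`, `cumulative_fractions`, `overall_efficiency`) are hypothetical placeholders, not measured values or an established model.

```python
from math import prod

# Hypothetical fraction of cells that pass each stage, given that they
# reached it. These numbers are illustrative placeholders, not data.
STAGE_PASS_FRACTION = {
    "proliferation shift and somatic program downregulation": 0.20,
    "mesenchymal-to-epithelial transition": 0.50,
    "SSEA1 activation": 0.30,
    "pluripotency network activation": 0.10,
}


def cumulative_fractions(stages):
    """Return the fraction of starting cells remaining after each stage."""
    remaining = 1.0
    result = {}
    for name, fraction in stages.items():
        remaining *= fraction
        result[name] = remaining
    return result


def overall_efficiency(stages):
    """Overall efficiency is the product of the per-stage pass fractions."""
    return prod(stages.values())


if __name__ == "__main__":
    for name, frac in cumulative_fractions(STAGE_PASS_FRACTION).items():
        print(f"{name}: {frac:.2%} of starting cells remain")
```

Even with fairly permissive per-stage fractions, the product shrinks quickly, which is one way to read the observation that only a small percentage of factor-expressing cells complete the process.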
A high resolution time-lapse imaging approach that enabled retroactive tracking of faithful reprogramming events demonstrated that an increase in proliferation rate and a concomitant decrease in cell size are the first noticeable changes in the reprogramming of mouse fibroblasts, and occur as early as 24 hours post induction of the reprogramming factors18. These morphological and proliferative changes are accompanied molecularly by the induction of proliferation genes and downregulation of the somatic expression program6,22,67. Interestingly, a cell sorting experiment for Thy-1, a marker present on the surface of fibroblasts, suggests that the expression changes in this early phase of reprogramming occur in the majority of cells19. Thus, while the transcriptional response to the reprogramming factors may be population-wide, only a few cells undergo the rapid shift in proliferation that coincides with the reduction of cell size in this early phase of reprogramming, which can be tracked as the first morphological event in all successful reprogramming events. How much the expression changes seen at the population level reflect changes in these fast dividing cells remains unclear at this point.
Most cells expressing the reprogramming factors thus fail to successfully induce the first morphological change of proper reprogramming events, remain fibroblast-like and often undergo apoptosis, senescence or cell-cycle arrest. Notably, each of these processes is thought to be a barrier to reprogramming as methods that suppress these responses are associated with higher reprogramming efficiency68–73. Specifically, the silencing of central regulators of these pathways, such as p53 and p21 or Ink4a/Arf, is observed upon reprogramming and their experimental depletion enhances the efficiency and kinetics of iPSC generation17,68–73. It is important to note that the role of p53 in reprogramming has been debated and one study linked the positive effect of p53 depletion in pre-B cell reprogramming solely to an increased proliferation rate17. This would be consistent with data suggesting that even in wildtype cells, promoting the cell cycle improves reprogramming efficiency17,36,74. It is likely that active promotion of cell proliferation, i.e. more transitions through S-phase, enables efficient resetting of the transcriptional and chromatin landscapes. Intriguingly, when monitoring the effect of p53 knockdown at the single cell level, it was suggested that while proliferation is induced in more cells than in the control case, most of these cells derail from the reprogramming path later, yielding a lower overall reprogramming efficiency when normalized to the number of initially responding cells18. This study highlighted how single-cell analysis may reveal novel insights that cannot be obtained from typical population studies.
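The difference between raw efficiency and efficiency normalized to initially responding cells is easy to lose in prose, so a small worked example may help. The sketch below uses invented counts chosen only to show how a perturbation can raise colonies per plated cell while lowering colonies per responding cell; the function names and all numbers are hypothetical and do not come from the cited study.

```python
def raw_efficiency(colonies, cells_plated):
    """iPSC colonies per cell plated."""
    return colonies / cells_plated


def normalized_efficiency(colonies, responding_cells):
    """iPSC colonies per cell that showed the early proliferative response."""
    return colonies / responding_cells


# Hypothetical counts, chosen only to illustrate the arithmetic.
control = {"plated": 10_000, "responding": 100, "colonies": 10}
p53_knockdown = {"plated": 10_000, "responding": 1_000, "colonies": 40}

if __name__ == "__main__":
    for label, c in (("control", control), ("p53 knockdown", p53_knockdown)):
        raw = raw_efficiency(c["colonies"], c["plated"])
        norm = normalized_efficiency(c["colonies"], c["responding"])
        print(f"{label}: raw {raw:.2%}, normalized to responders {norm:.2%}")
```

In this made-up case the knockdown yields more colonies per plated cell, yet a smaller share of responding cells finish the process, which mirrors the qualitative pattern described for the single-cell analysis.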
The extinction of the somatic program is perhaps a lower barrier to reprogramming than the acquisition of the ESC program. It is conceivable that the induction of the pluripotent state may only be possible once drivers of the somatic state are efficiently silenced. In agreement with this idea, recent studies have demonstrated that expression of lineage-specific transcription factors blocks reprogramming of the somatic genome in a dominant fashion6,75.
ESCs and iPSCs have characteristics of epithelial cells with tight intercellular contacts and surface expression of the key epithelial gene E-cadherin. Thus, mesenchymal cells like fibroblasts need to gain an epithelial character during reprogramming. After suppression of the somatic transcriptional program, small, fast cycling cells cluster tightly and undergo coordinated changes in cell-cell and cell-matrix interactions that correspond to a loss of mesenchymal features and acquisition of epithelial cell characteristics18,21. These observations support the idea that reprogramming fibroblasts undergo a mesenchymal-to-epithelial transition (MET), thus reversing the epithelial-to-mesenchymal transition (EMT) that occurred during differentiation of fibroblasts in vivo21,22. Signaling pathways that promote or suppress MET affect the efficiency of the reprogramming process, indicating that MET is a critical step in fibroblast reprogramming. For example, inhibition of signaling by the transforming growth factor TGFβ improves reprogramming because TGFβ activity prevents MET by inhibiting both the upregulation of epithelial markers and the downregulation of the mesenchymal transcriptional repressor Snai121. In contrast, bone morphogenetic protein signaling enhances reprogramming via upregulation of pro-MET microRNAs22. Furthermore, E-cadherin, one of the epithelial genes upregulated in this phase of reprogramming, is critical for ESC pluripotency76, and its knockdown interferes with reprogramming21. Of course, not all cell types reprogrammed to date would have to go through MET. For instance, keratinocytes and hepatocytes are epithelial and reprogrammable with higher efficiencies than fibroblasts56,77. It is therefore tempting to speculate that their higher efficiency arises because MET is not a barrier to reprogramming in epithelial cells.
After epithelial cell character is established and as larger colonies are formed, other ESC markers such as the SSEA1 cell surface antigen are induced, likely only in a subset of E-cadherin positive cells19–21. At this point, the expression program of the basic state of embryonic, fast dividing cells appears to be established6,22,23. Experiments in which SSEA1-positive and negative cell populations were isolated from reprogramming cultures demonstrated that only SSEA1-positive cells can give rise to faithfully reprogrammed cells and activate the expression of the pluripotency network, i.e. of transcriptional or developmental regulators highly expressed in ESCs including the endogenously encoded Oct4, Sox2, and Nanog and many other pluripotency-related genes19,20. The upregulation of this core pluripotency network is considered the final step of reprogramming and, as in the other steps of reprogramming, only a few SSEA1-positive cells make this final transition19.
Understanding the contribution of each reprogramming factor to the different steps of reprogramming is ultimately required to reveal the molecular mechanisms underlying the induction of pluripotency. It is now thought that each reprogramming factor plays a distinct role21,23,78–80. This concept is nicely exemplified by their respective contributions to the MET during fibroblast reprogramming21. It was shown that Oct4 and Sox2 suppress the pro-mesenchymal regulator Snai1, while Klf4 induces the epithelial program by directly binding and activating epithelial genes including E-cadherin21. At the same time, cMyc reduces TGFβ signaling by repressing TGFβ1 and TGFβ receptor. The fact that the reprogramming factors collaborate in the MET by suppressing different pro-EMT molecules and promoting various pro-MET mechanisms may explain why the canonical four Yamanaka factors constitute such an efficient reprogramming cocktail.
To address how pluripotency-related genes are upregulated during the final phase of reprogramming, we have mapped the binding sites of Oct4, Sox2, Klf4, and cMyc in iPSCs and pre-iPSCs, which have acquired the proliferative and biosynthetic properties of iPSCs, but not yet activated the pluripotency network (Box 1)23. We found that in iPSCs the target genes of these four transcription factors are similar to those previously defined in ESCs, where Sox2 and Oct4 co-occupy promoters of highly expressed genes including their own promoters, and Klf4 shares about half of its targets with these two transcription factors23,81–84. Notably, based on limited target overlap, it was proposed that the function of cMyc differs from that of Oct4, Sox2, and Klf4 in ESCs/iPSCs23,81,82. Consistent with this notion, cMyc targets are predominantly involved in the regulation of cellular proliferation, metabolism, and biosynthetic pathways, whereas Oct4, Klf4, and Sox2 targets are skewed towards transcriptional and developmental regulators forming the pluripotency network in pluripotent cells81,82,85 (Figure 2).
These results imply that cMyc, in contrast to Oct4, Sox2 and Klf4, is not involved in the upregulation of the pluripotency network during the final step of reprogramming. A recent report also suggests that cMyc promotes the release of promoter-proximal pausing of RNA polymerase II (Pol II) and thereby enhances the elongation of transcripts, as opposed to mediating the initial recruitment of Pol II to promoters86. cMyc could therefore enhance but may not be absolutely required for transcription of its target genes. Together, these findings may explain why cMyc is dispensable for reprogramming while still able to enhance the efficiency and kinetics of the process79,80. Thus, during reprogramming, cMyc overexpression may lay the framework for the efficient induction of proliferation, the repression of the somatic expression program, and the acquisition of ESC-like biosynthetic properties, onto which Oct4, Sox2 and Klf4 can exert their function and finally activate the pluripotency network (Figure 2).
Along the same line, cMyc already binds many of its iPSC target genes in pre-iPSCs indicating that the cMyc transcriptional network is already largely engaged at an intermediate step of reprogramming23. In contrast, many pluripotency-related genes that are occupied by Oct4, Sox2 and Klf4 in iPSCs completely lack binding by these three reprogramming factors in pre-iPSCs23. Consequently, the expression of genes belonging to the cMyc target network is comparable between pre-iPSCs and iPSCs, but genes of the Oct4, Sox2, and Klf4 network are not activated in pre-iPSCs85. Genes belonging to the pluripotency network may therefore simply not be accessible for the reprogramming factors at this intermediate stage of reprogramming, and we posited that the engagement of Oct4, Sox2 and Klf4 at pluripotency genes and their subsequent transcriptional upregulation represents a major hurdle to the completion of reprogramming23 (Figure 2).
At least two models can be envisioned to explain the lack of binding to and upregulation of pluripotency genes in pre-iPSCs. The first suggests that additional transcription factors are required to cooperatively bind with Oct4, Klf4, and Sox2 and recruit co-activators that are not yet available at this intermediate stage23. This model is supported by studies of the Nanog transcription factor (Figure 2). Nanog has extensive protein-protein interactions with Sox2, Oct4 and other pluripotency transcription factors87 and co-binds many of their targets in ESCs82,83,88, but is not expressed at the pre-iPSC state as it is only upregulated during the final step of reprogramming when the pluripotency network is established6,23,35. Nanog is absolutely essential for the generation of iPSCs but required only during the final step of reprogramming35,89. Intriguingly, Nanog transcripts can be detected at low levels early in the transition from pre-iPSCs to iPSCs, which may be sufficient to promote Oct4, Sox2, and Klf4 function89. Accordingly, its overexpression in pre-iPSCs and during reprogramming enhances the induction of pluripotency by lowering cell-intrinsic barriers17,35,89. A second model suggests that repressive chromatin at pluripotency gene promoters and enhancers, which has formed to silence these genes during differentiation90, interferes with binding of the reprogramming factors (Figure 2).
In agreement with the second model, it is currently believed that repressive chromatin comprises a major mechanistic barrier to transcription factor-induced reprogramming. This is mainly suggested by the ability of agents that liberate repressive chromatin states, such as inhibitors of histone deacetylases, histone methyltransferases and demethylases, and DNA methyltransferase 1, to enhance the process6,25–29,91, as summarized in Box 2. Though small molecules have proven to be useful in showing that repressive chromatin states contribute to the stability of differentiated cell identity, the question of how exactly they affect reprogramming remains largely unclear, particularly since these inhibitors are likely altering global chromatin structure as well as targeting specific genes, and may act in several steps of reprogramming.
However, the regulatory regions of some pluripotency genes such as Oct4, Nanog, Utf1, Dppa5, Rex1, and Dppa3 are hypermethylated at the DNA level in somatic cells as well as partially reprogrammed cells1,6 and many pluripotency genes are enriched for repressive H3K27 and/or H3K9 methylation4,6,23,90. DNA demethylation and the loss of repressive histone methylation marks likely occur at the end of the reprogramming process, concomitant with the binding of the reprogramming factors Oct4, Sox2 and Klf4 and transcriptional upregulation of these genes (Figure 2)6,23. These findings are in agreement with the notion that the repressive chromatin state of promoters and enhancers of pluripotency-related genes may block engagement of the reprogramming factors.
Intriguingly, a recent report demonstrated that Nanog overexpression and inhibition of DNA methylation synergistically enhance the final phase of reprogramming indicating that both models proposed above may be in play to activate pluripotency-related genes89. In any case, the activation of Nanog or changes in repressive chromatin structure at pluripotency-related genes (or elsewhere) occur through so far unknown mechanisms during reprogramming.
A very recent twist is that specific chromatin changes precede the activation of pluripotency-related genes23,67. For example, many pluripotency-related genes with CpG-dense promoter and enhancer elements that are hypomethylated in fibroblasts gain H3K4me2 in the early phase of reprogramming, despite the fact that they are upregulated much later in the reprogramming process67 (Figure 1). The silent state of these genes is maintained at the early phase of reprogramming, because gain of H3K4me2 at the CpG island does not alter the repressive chromatin character in surrounding regions67. It remains to be tested whether H3K4me2 in pluripotency gene promoters is required for their subsequent activation, but changes in H3K4me2 apparently occur in the majority of fibroblasts in response to reprogramming factor expression even before the first cell division is initiated67. This means that the reprogramming factors are not only inducing major transcriptional changes early on in the reprogramming process but also affect the chromatin landscape in a global manner, without cell division, maybe by altering the activity or levels of chromatin remodelers or modifiers. In this context it should be noted that chromatin remodeling is critical to efficient reprogramming92,93 (Figure 2).
Chromatin states influence reprogramming at various stages. For instance, they also appear to determine where initial responses to reprogramming factor expression occur in somatic cells. Comparing the transcriptional response to reprogramming factor expression with genome-wide maps for histone modifications and DNA methylation in the early phase of reprogramming, it was found that transcriptional changes are limited to those promoters that carry histone H3K4me3, a histone modification strongly associated with transcriptional activation67. While the binding targets of the reprogramming factors at this early phase of reprogramming are not yet mapped, it is therefore likely that they can only target their binding sites in pre-existing open chromatin, further highlighting why they are more likely to enhance transcription of proliferation genes (cMyc) and silence somatic genes than to activate pluripotency genes early on in the process.
During reprogramming, silencing of somatic genes is associated with a change in chromatin structure at their enhancers and promoters, particularly a loss of histone H3K4me267. Interestingly, many fibroblast-specific enhancers need to gain DNA methylation during reprogramming as they are hypermethylated in ESCs, but appear to do so only towards the end of the process67. This finding may explain at least partially why cells on the reprogramming path that have not yet induced pluripotency can return to a fibroblast-like morphology upon withdrawal of the reprogramming factors, because DNA hypermethylation, among other mechanisms, may be required to stably lock in the silent state of somatic genes upon reprogramming.
Female mammalian cells silence one of the two X chromosomes in a process called X chromosome inactivation (XCI) (reviewed in94). XCI is initiated early during female embryonic development in mammals when pluripotent cells of the blastocyst differentiate. Thus, female mouse ESCs carry two active X chromosomes (XaXa) and initiate XCI upon differentiation by upregulating the large non-coding RNA Xist on the future inactive X chromosome (Xi) and inducing a cascade of events that leads to a heritable heterochromatic state (Figure 3a). Given that the X chromosome represents the largest continuous DNA segment that is subject to epigenetic silencing when pluripotent cells differentiate, a key question has been whether the Xi reactivates during reprogramming.
As expected from the XaXa pattern in mouse ESCs, we have shown that the Xi is reactivated in female mouse iPSCs and its heterochromatic state is reset to that of the Xa, enabling random XCI upon induction of differentiation (Figure 3a)5. Xi reactivation occurs very late in the reprogramming process, at around the time when the pluripotency network is activated19, emphasizing the tight link between pluripotency and the XaXa state. The pluripotency transcription factor network may even need to be established to allow downregulation of Xist and reactivation of the Xi, as it has been suggested that pluripotency transcription factors regulate Xist expression95–97 (Figure 3b). Because XCI can be studied at the single cell level by fluorescent imaging approaches, the process provides an attractive model to study changes in transcription and chromatin in the context of reprogramming.
For human iPSCs the picture appears to be different: our data show that reactivation of the Xi does not occur when female human cells are reprogrammed98. Because of their clonality and lack of Xi reactivation, the cells of a given iPSC line all have the same X silenced, even though the fibroblast population they originate from is mosaic for which X chromosome is inactive (Figure 3a/c). The differences between human and mouse reprogramming are likely due to the cells not being developmentally equivalent rather than reflecting a difference in the way XCI is regulated. Human ESC/iPSCs are thought to be in a primed pluripotent state similar to that of mouse epiblast stem cells (EpiSCs)99, which is distinct from the naïve pluripotency of mouse ESCs. In agreement with this notion, female mouse EpiSCs derived from post-implantation embryos are XiXa100. Mirroring reprogramming experiments in which the transition of mouse EpiSCs to the naïve ESC-like state is accompanied by the reactivation of the Xi35,100,101, overexpression of KLF4 in human ESC/iPSCs in combination with a small molecule cocktail that supports growth of mouse ESCs leads to the establishment of a mouse ESC-like state in human cells with two Xa's102 (Figure 3c). Thus, lessons about the mechanism of reprogramming in mouse are informative for the human reprogramming process as well.
Notably, while most classical (mouse EpiSC-like) female human ESC lines, like iPSCs, are XiXa, XaXa ESCs can in some cases be generated and maintained, particularly when derived under hypoxic conditions to more accurately model the in vivo environment of the developing embryo103,104,105. However, using standard reprogramming methods even under hypoxic conditions, we have been unable to generate XaXa iPSCs98. This discrepancy could be because of inherent differences between hESCs and hiPSCs and it will take further work to understand what molecular differences between these cell types can tell us about the process of reprogramming.
Reprogramming to the iPSC state by introduction of pluripotency transcription factors appears to generate pluripotent stem cells that are superficially indistinguishable from embryo-derived counterparts5,7,8. However, numerous studies have now described molecular differences between iPSCs and ESCs in both mouse and human systems2,3,13,37,38,109–112, while others argue that there are no fundamental differences between them113. While future effort will provide clarity on this issue, for now we consider these studies with an eye towards using these findings to understand the mechanisms underlying the reprogramming process (Table 2).
So far, iPSCs and ESCs have been compared at the epigenetic, transcriptional, proteomic, and metabolic levels. Our group and others have performed several analyses of human iPSCs and ESCs and suggested that these two cell types, while very similar, could still be distinguished by their expression of protein-coding RNAs2,3,109,110. A significant portion of the gene expression differences between human ESCs and iPSCs were due to residual expression of somatic genes109,110, and many of these differences appear to dissipate upon extended passaging2,3. There are several possible explanations that are not mutually exclusive: reprogramming is not immediately complete upon induction of the endogenous pluripotency network; there is selection of authentic pluripotent cells within a heterogeneous culture over time; or perhaps the exogenous versions of the reprogramming factors need to be silenced completely before the process can be completed.
One group has shown that repression of a small group of non-coding RNAs encoded in the Dlk1-Dio3 imprinted gene cluster may distinguish mouse iPSCs from ESCs at a functional level13. These non-coding RNAs might signify a landmark for reprogramming such that all pluripotent stem cell lines (ESC or iPSC) that show normal expression of these genes are able to contribute to animals entirely derived from these cells in the tetraploid complementation assay, whereas those iPSCs that did not show expression at this locus were able to generate normal chimeras, but were incapable of satisfying this gold standard assay for mouse pluripotency. These experiments did suggest that reprogramming is complete in some cases. However, it should be noted that the lines were analyzed at later passage, when many of the expression differences often observed between iPSCs and ESCs have disappeared (Konrad Hochedlinger, personal communication). Because the mechanisms by which these imprinted non-coding RNAs are regulated have only begun to be explored, it is difficult to link the mechanism by which these RNAs are misexpressed and the process of reprogramming, but new data are beginning to shed light on this issue.
Recently, it was shown that 10 large intergenic non-coding (linc) RNAs are differentially expressed between human iPSCs and ESCs, and that at least one of these can play a role in the reprogramming process, as its overexpression enhances and its depletion inhibits this process114. The fact that at least some of the misregulated lincRNAs are targets of OCT4 and SOX2 in ESCs indicates that they could be de-regulated during reprogramming due to aberrant binding of the reprogramming factors.
Extensive examination of the chromatin state of iPSCs and ESCs has also shown that while these two cell types are clearly very similar, consistent differences can be observed, and some have even been shown to be functionally relevant. As described above, based on X inactivation status it could be argued that at least some human ESC lines are more epigenetically “pristine” than human iPSCs (even when apparently at the same developmental stage), but it is unknown whether X status simply reflects the biology of this chromosome or is a clue to more profound genome-wide epigenetic variability.
Genome-wide approaches to identify sites enriched in histone H3K4me3 and H3K27me3 suggested that human iPSCs and ESCs have identical patterns for these marks, even for promoters of genes that are differentially expressed between the two cell types2,4,113. On the other hand, the pattern of H3K9me3 within promoter regions was found to be different and this mark is overrepresented amongst genes that were differentially expressed between human ESCs and iPSCs4. Of course, there is a panoply of other histone modifications yet to be probed and it is challenging to demonstrate a functional role for these marks, so it is difficult to use these differences to elucidate mechanisms of reprogramming. Most of the transcriptional and chromatin differences described to date appear to reflect the state found in the cell type that was reprogrammed, suggesting a form of “memory” that might indicate incomplete reprogramming2,109,110.
In fact, two groups showed that the DNA methylation pattern of the original cell persists in mouse iPSCs, and demonstrated that this residual DNA methylation pattern affects their differentiation potential37,38. For instance, iPSCs from blood more easily differentiate towards blood lineages than iPSCs made from fibroblasts37,38. Importantly, many blood markers are hypermethylated at the DNA level in fibroblast-derived iPSCs, likely preventing their efficient upregulation upon induction of differentiation towards the blood lineage. Furthermore, treatment of iPSCs generated from non-blood lineages with histone deacetylase and DNA methylation inhibitors appeared to allow for more efficient blood differentiation. The fact that residual DNA methylation within lineage-specific genes is found in iPSCs provides tangible evidence that resetting of this mark is fundamental to reprogramming, and that failure to do so has a functional consequence. Notably, one of these studies also showed that continued passaging of the iPSCs appeared to erase this epigenetic memory37, a finding reminiscent of work in human iPSCs showing that continued passaging abrogated transcriptional differences between iPSCs and ESCs2,3.
Recent work has also uncovered an epigenetic memory in human iPSCs at the level of DNA methylation by generating single-base, whole genome DNA methylation maps115. This study argued that, in addition to a failure to properly erase parts of the DNA methylome, which leads to an epigenetic memory of the somatic DNA methylation pattern, reprogramming often induces aberrant methylation that seems to be specific to the iPSC state, and that some iPSCs are unable to re-establish methylation, particularly non-CpG methylation. These methylation differences between ESCs and iPSCs are associated with differences at the transcriptional level that can be found after many passages and might affect the differentiation behavior of these cells.
The difficulty with all these molecular comparisons is that both iPSCs and ESCs show significant variability amongst individual lines. To quantify such variability, a recent study profiled 20 human ESC and 12 iPSC lines and generated a “scorecard” to measure the fidelity and utility of reprogrammed lines versus a set of standard ESC lines112. This effort included DNA methylome, transcriptome, and differentiation studies to determine whether quantification of molecular similarity to gold standard pluripotent cells could be predictive of their ability to differentiate down various lineages. This study concluded that although ESCs exhibit significant variability across individual lines and some iPSCs fall within the variability of ESCs, iPSCs were more variable at the molecular level than ESCs. In the final analysis, it is imperative that any differences between iPSCs and ESCs be tested experimentally to determine whether they are functionally significant13,114, as these experiments will yield mechanistic insights into reprogramming and the pluripotent state, and will show whether one of these pluripotent cell types is more suitable to a desired application.
For now, the compendium of differences described between iPSCs and ESCs is further evidence that the reprogramming process requires a wide variety of molecular changes and that cells can either get trapped (partial reprogramming), get close to the final destination (reprogrammed state with epigenetic memory), or reach a bona fide pluripotent state. Perhaps the only way to truly understand the reprogramming process will be to combine single-cell analysis with fine temporal resolution, extending recent studies18,66.
To understand the mechanisms of the reprogramming factors, several groups have employed tet-inducible expression of the reprogramming factors (Table 1). An inducible system facilitates the identification of broad landmarks of reprogramming such as suppression of somatic genes and induction of epithelial and pluripotency genes, as discussed above. Strikingly, even with robust expression of all reprogramming factors by polycistronic methods, typically fewer than 10% of cells undergo complete reprogramming, suggesting formidable barriers to the process beyond expression of the Yamanaka factors. Since we can currently only identify faithful reprogramming events when the pluripotency network is expressed, it is nearly impossible to determine which molecular changes occur in cells that are destined to make it all the way to a pluripotent state versus those that will end up lost along the way. Overcoming this obstacle will require either a technique that allows for a very high reprogramming efficiency, or perhaps the discovery of early epigenetic landmarks that reliably mark cells that will proceed towards pluripotency. Similarly, inducible reprogramming factor expression and single-cell approaches need to be combined with genome-wide approaches such as transcriptome analysis and ChIP-seq. Currently, merging these technologies is still very challenging if not prohibitive, but it will be essential to understand the molecular steps underlying reprogramming. If someday all these issues can be adequately addressed, we may be able to gain a clear understanding of reprogramming. Until then, we will have to rely on studies that employ transcriptional or epigenetic manipulations that drive or impede the process to shed light on this fascinating phenomenon.
Mouse embryonic fibroblasts are most commonly used to probe for mechanistic insights because they can be easily generated from reporter mice that carry a knockin of the green fluorescent protein (GFP) coding sequence in the endogenous Nanog locus, or a transgenic GFP driven by Oct4 regulatory regions. Fibroblasts also combine a relatively high viral infection capacity for the delivery of the four factors with a reasonable reprogramming efficiency. Understanding the function of the reprogramming factors is particularly difficult at the intermediate and late phases of reprogramming as fewer and fewer cells are routed towards the pluripotent state. Thus, the isolation of intermediate stages has been informative. Partially reprogrammed cells (pre-iPSCs) have been established as a tool to study the late step of reprogramming, i.e. the upregulation of the pluripotency network6,23,24,31. These pre-iPSCs can be clonally expanded and emerge as ES cell-like colonies in the reprogramming dish. They have largely acquired the proliferative capacity and biosynthetic properties of pluripotent cells with silencing of many somatic genes, but fail to express many endogenous pluripotency genes such as Oct4 and Nanog6,23,24,31. Pre-iPSC clones obtained from different starting cell types share similar transcription profiles, suggesting that reprogramming from various starting cell types converges close to the pluripotent state and stalls at a similar barrier6. While it is not absolutely clear that pre-iPSCs represent an intermediate that occurs transiently during the reprogramming process, they are not simply an aborted reprogramming artifact, as they can be converted into iPSCs with small molecule treatments that also improve the efficiency and kinetics of the reprogramming process6,23,24,31,34 (Figure 1).
At this point, pre-iPSCs are therefore a useful platform for the identification of molecular mechanisms guiding the final steps of reprogramming and, because of their defined nature, allow population-based genomics approaches.
Small molecules have been successfully used to implicate repressive chromatin states as barriers to reprogramming. Histone deacetylases (HDACs) catalyze the removal of acetyl groups from lysine residues of histones, which is classically associated with chromatin condensation and transcriptional repression116. The four HDAC inhibitors suberoylanilide hydroxamic acid (SAHA), trichostatin A (TSA), butyrate, and valproic acid (VPA) greatly improve the reprogramming efficiency of mouse and/or human fibroblasts25,27–29. Additionally, VPA allowed the efficient induction of mouse iPSCs in the absence of ectopic cMyc, allowed the reprogramming of human fibroblasts without ectopic expression of KLF4 and cMYC, and made the generation of iPSCs using cell-penetrable recombinant proteins possible117. In contrast, butyrate requires ectopically expressed cMyc to exert its positive effect and functions in the early phase of the mouse reprogramming process29, but late in the human process, where it can efficiently substitute for ectopically expressed KLF4 or cMYC25. Though these studies come to different conclusions regarding the temporal requirement of HDAC inhibition and reprogramming factor replacement, they all agree that treatment of reprogramming cultures with VPA or butyrate induces a transcriptional change towards the ES cell state. This is consistent with the idea that inhibiting HDACs shifts the balance towards histone acetylation and activation of transcription25,27,29. Interestingly, the reprogramming factors have been shown to interact with various histone acetyltransferases, which could partially explain why their expression can be replaced by HDAC inhibition. However, in addition to regulating the acetylation state of histones, HDACs can deacetylate and regulate the activity of a number of other proteins, including the transcription factor p53118, which also has been implicated as a barrier to reprogramming.
Given that all of the listed HDAC inhibitors block the activity of several HDAC family members, the particular HDAC(s) implicated in reprogramming and its substrate(s) remain to be determined. Similar to HDAC inhibition, treatment with an inhibitor of DNA methylation, 5-azacytidine, or knockdown of DNA methyltransferase 1 (Dnmt1), the enzyme responsible for maintaining DNA methylation through DNA replication, enhances reprogramming and promotes the conversion of pre-iPSCs to the iPSC state6,27. In addition, the small molecule BIX-01294, which can inactivate the repressive histone H3K9 methyltransferases G9a and GLP119,26,120, and parnate, an inhibitor of the histone H3K4 demethylase lysine-specific demethylase 191, have similar effects and can compensate for the loss of various reprogramming factors.
If repressive chromatin marks interfere with reprogramming, how are they removed during successful reprogramming events? A passive mechanism could require DNA replication and dilute repressive marks over successive rounds of replication, simply by preventing the re-establishment of the parental chromatin pattern on newly incorporated histones and DNA. In support of this notion, an increased cell division rate accelerates and cell cycle arrest inhibits reprogramming17,36. Alternatively, DNA replication could facilitate the resetting of chromatin states, potentially by allowing the reprogramming factors to engage their target sites more effectively. However, active mechanisms may be more likely given that demethylating enzymes have been identified for almost every “repressive” methylated lysine within histones and are now also being uncovered for methylated DNA.
KP is supported by the NIH Director's Young Innovator Award (DP2OD001686) and a CIRM Young Investigator Award (RN1-00564). WEL is the Maria Rowena Ross Professor of Cell Biology and Biochemistry and is supported by the NIH, The March of Dimes, and the Fuller Foundation. KP and WEL are supported by the Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research at UCLA.
Kathrin Plath
Kathrin Plath earned her doctorate degree in cell biology from Harvard Medical School and Humboldt University in Berlin, Germany. She carried out her post-doctoral training at the University of California, San Francisco and the Whitehead Institute at MIT, Cambridge, focusing on the function of Polycomb Group proteins, a family of transcriptional repressors, in embryonic stem cells and X chromosome inactivation. She joined the faculty at the University of California Los Angeles in 2006 and is now Associate Professor in the Department of Biological Chemistry. Her research concerns mechanisms controlling the pluripotent state, particularly as they relate to chromatin and transcription.
William E Lowry
Bill received his B.S. from the University of Washington in Molecular and Cellular Biology in 1996. He then moved to New York to do his graduate work in signal transduction and cell biology with Xin-Yun Huang at Cornell Medical College in the fall of 1997. Bill then went to work with Elaine Fuchs at the Rockefeller University in 2002, where he studied the mechanisms of quiescence and activation in stem cells of the epidermis. In the summer of 2006, Bill joined the Department of Molecular, Cell and Developmental Biology at UCLA, where he is the Maria Rowena Ross Professor of Cell Biology and Biochemistry.