Human embryonic stem cells (hESCs), due to their pluripotent nature, represent a particularly relevant model system to study the relationship between the replication program and differentiation state. Here, we define the basic properties of the replication program in hESCs and compare them to the programs of hESC-derived multipotent cells (neural rosette cells) and primary differentiated cells (microvascular endothelial cells [MECs]). We characterized three genomic loci: two pluripotency regulatory genes, POU5F1 (OCT4) and NANOG, and the IGH locus, a locus that is transcriptionally active specifically in B-lineage cells. We applied a high-resolution approach to capture images of individual replicated DNA molecules. We demonstrate that for the loci studied, several basic properties of replication, including the average speed of replication forks and the average density of initiation sites, were conserved among the cells analyzed. We also demonstrate, for the first time, the presence of initiation zones in hESCs. However, significant differences were evident in other aspects of replication for the DNA segment containing the POU5F1 gene. Specifically, the locations of centers of initiation zones and the direction of replication fork progression through the POU5F1 gene were conserved in two independent hESC lines but were different in hESC-derived multipotent cells and MECs. Thus, our data identify features of the replication program characteristic of hESCs and define specific changes in replication during hESC differentiation.
Studies during the past few years suggest variability among different lines of human embryonic stem cells (hESCs) and human-induced pluripotent stem cells (hiPSCs) with regard to differentiation and lineage specification (42). Thus, inconsistencies in the quality and purity of undifferentiated and differentiated cell populations from different passages are a serious concern for the development of translational applications in human disease (35). Current approaches to characterize the pluripotent behavior of hESCs are primarily limited to assays such as marker expression, in vitro differentiation, and in vivo teratoma formation. Therefore, it is critical for the field to develop additional methods for identifying characteristics that define the pluripotent state, particularly ones that could detect incompletely reprogrammed hiPSCs. One very important and defining epigenetic characteristic of ESCs is their DNA replication program.
The DNA replication program specifies the sites along the DNA molecule at which replication initiates and when in the S phase these sites are activated. When tissue-specific gene loci are compared in different cell types, there are often differences in DNA replication timing, replication initiation sites, and the direction of replication fork progression (14, 24, 26, 27, 40). The replication program is implicated in many cellular functions, such as genome reprogramming, epigenetic modifications, gene expression, and development (reviewed in reference 20). In fact, small differences in the replication of a single DNA locus could critically affect developmental pathways. Because the replication program changes as differentiation proceeds, it is very likely that all pluripotent ESCs have a common replication program before development progresses. Furthermore, this could imply that if ESCs do not initially have the correct replication program, it is possible that developmental pathways will be affected.
Replication timing (the temporal order of DNA replication during the S phase) changes significantly during development (14, 18, 24, 26, 27, 45) and is often linked to gene expression. In one example, tissue-specific genes, such as mouse Hbb (β-globin) and the IGH locus, generally replicate earlier during S phase when they are active than when they are silent (19, 24, 28). A second example is the significant change in replication timing observed for the β-globin locus during erythroid cell development (3, 34). In a third example, a recent genome-wide study reported that replication timing for a considerable portion of the mouse genome (approximately 20%) changes significantly when mouse ESCs differentiate into neural precursor cells (27).
In addition to changes in replication timing, changes in the utilization and location of replication origins also accompany differentiation and development (reviewed in reference 20). For example, silent origins located within the DJC cluster of the mouse IGH locus are activated during B-cell development concomitant with early replication of the locus (22, 40). Upon differentiation of primary erythroid progenitor cells into erythrocytes, additional origins become active in the chicken β-globin gene cluster (13). Another example of origin plasticity occurs during retinoic acid induction of mouse P19 cells. Significant changes in origin usage take place in the transcriptionally activated HoxB gene cluster; several origins are silenced, and a single dominant origin is specified at the 3′ boundary of the locus (21). In addition, the directions of replication forks also can have important functions during development. For example, the direction of DNA replication fork progression determines mating type switching in the fission yeast Schizosaccharomyces pombe (10-12). Thus, a change in location of a single replication initiation site in ESCs can result in a change in replication fork direction through an important region of the genome, possibly leading to a developmental change.
Since DNA replication in higher eukaryotes is under epigenetic regulation (1, 2, 25, 30, 37, 50, 51), the distinctive chromatin properties of ESCs likely contribute to specifying distinctive DNA replication programs (31, 49). The chromatin in ESCs has an open configuration and undergoes dramatic reorganization during embryonic development and cellular differentiation (6, 32, 36, 38, 48). There is evidence that chromatin structure has a critical role in replication timing for mouse and human pluripotent ESCs (14, 26, 27, 45). Therefore, we sought to determine whether hESCs possess a unique replication program. For example, is the epigenetic signature of hESCs manifested by a high density of replication initiation sites and a great flexibility in origin usage? Do hESCs show a speed of replication fork progression substantially different from that of differentiated cells?
To examine these possibilities, we applied a sensitive, high-resolution approach to capture images of replication intermediates in a population of single DNA molecules. We analyzed DNA replication for three gene loci in pluripotent hESCs, multipotent neural rosette cells (R-NSCs), and differentiated microvascular endothelial cells (MECs). R-NSCs represent a recently defined early neural stem cell type isolated from hESCs (15). R-NSCs used in the current study were derived from the same hESC line (H9) studied here. The three loci studied included two pluripotent regulatory genes that are active in hESCs, POU5F1 (OCT4) and NANOG (7, 8, 33, 39, 43, 46), and the IGH locus, which is silenced in hESCs, MECs, and R-NSCs. We comprehensively characterized replication initiation, fork direction, termination, and replication fork speed. We demonstrate that several basic replication program properties, including the average speed of replication forks and the average density of initiation sites, were conserved between pluripotent hESCs, multipotent hESC-derived neural rosette cells, and differentiated microvascular endothelial cells.
Although the chromatin structure is generally more open in hESCs (see above), our data indicate that (for the loci studied) hESCs did not possess a speed of replication fork progression or a density of replication initiation sites that was dramatically different from that of the somatic cell types. However, several significant changes in the DNA replication program for the POU5F1 locus were evident. Specifically, the location of centers of replication initiation zones and the directions of replication forks proceeding through the POU5F1 gene differed significantly in the pluripotent hESCs (H9 and H14) compared to in the multipotent R-NSCs and differentiated MECs. Our results provide important insights into the basic features of the DNA replication program of hESCs, establishing aspects of replication that are characteristic of hESCs as well as replication program changes during their differentiation.
H9 hESCs (WA09) and H14 hESCs (WA14) were cultured in HES medium (80% Dulbecco's modified Eagle's medium-F12, 20% knockout serum replacement supplemented with 1 mM L-glutamine, 1% nonessential amino acids, 0.05 U/ml penicillin, 0.05 g/ml streptomycin [all from Invitrogen], 0.1 mM β-mercaptoethanol [Sigma-Aldrich, St. Louis, MO], and 4 ng/ml fibroblast growth factor 2 [FGF2; R&D, Minneapolis, MN]) on mouse embryonic fibroblasts (MEFs). Passage 45 H9 cells and passage 45 H14 cells were transferred onto Matrigel (BD Biosciences, Franklin Lakes, NJ) with dispase (1 mg/ml; Worthington, Lakewood, NJ) to eliminate the MEF population. Normal karyotypes were confirmed for all hESC lines (Memorial Sloan-Kettering Cancer Center Cytogenetics Core Facility) (data not shown). For the nucleoside labeling experiments, cell colonies were dissociated into single cells using Accutase (15 min; Innovative Cell Technologies, San Diego, CA), and about 10⁷ cells were grown in the presence of the nucleosides. Cell cycle analysis was performed by flow cytometry directly on an aliquot of the labeled cells.
Primary human cortex microvascular endothelial cells (MECs; Cell Systems, Rockland, ME; ACBRI 376 lots 2648 and 0537) were originally isolated from the human brain cortex of a healthy 24-year-old in the United States. The cells were elutriated from dispase-dissociated neurons and cultured on gelatin-coated tissue culture dishes in M199 medium supplemented with 20% newborn calf serum (NCS) (both from Invitrogen-Gibco, Carlsbad, CA), 5% human serum (BioCell, Rancho Dominguez, CA), 0.1 g/ml heparin, 0.05 g/ml ascorbic acid (Sigma-Aldrich), 1.6 mM L-glutamine (Invitrogen-Gibco, Carlsbad, CA), Sigma endothelial cell growth factor (Sigma-Aldrich), bovine brain extract (Clonetics BioWhittaker, Walkersville, MD), and 0.05 U/ml penicillin with 0.05 g/ml streptomycin (Invitrogen-Gibco, Carlsbad, CA) to maintain their differentiation state (16). All experiments were performed 6 to 8 days after passage of the cells, and only the early passages (11 to 24) were used. Nucleoside labeling experiments used approximately 4 × 10⁷ cells in 12 150-mm flasks. One flask of cells (approximately 4 × 10⁶ cells) was used for flow cytometric analysis.
H9 human embryonic stem cells (WA09) were differentiated toward R-NSCs as described previously (15). Mechanical isolation of rosettes was performed at day 16 of differentiation, followed by culture in N2 medium supplemented with sonic hedgehog (SHH) (50 ng/ml), FGF8 (100 ng/ml), brain-derived neurotrophic factor (BDNF; 20 ng/ml) (all R&D Systems), and ascorbic acid (100 μM; Sigma-Aldrich). At day 23 of differentiation, about 2 × 10⁷ cells were grown in the presence of the halogenated nucleosides and, prior to analysis, enriched for R-NSCs via FACS for expression of Hes5::eGFP (44) and N-cadherin (15). A control set of cells from the batch used for analysis of the replication program was replated under R-NSC conditions, and the percentage of R-NSC marker-positive cells was determined at day 27 of differentiation.
Cells were stained with antibodies to Oct4, Nanog, SSEA-4, TRA1-60, and TRA1-81. Cells were fixed in 4% paraformaldehyde and 0.015% picric acid for 20 min at room temperature, followed by three 5-min washes in phosphate-buffered saline (PBS). Fixed cells were blocked in PBS, 0.5% bovine serum albumin (BSA), and 0.3% Triton X-100 for 50 min at room temperature, followed by three 5-min washes in PBS. Cells were incubated overnight at 4°C with primary antibodies diluted in PBS-1% BSA (mouse anti-Oct3/4, 1:200 [Santa Cruz Biotechnology, Santa Cruz, CA]; goat anti-Nanog, 1:50 [R&D Systems, Minneapolis, MN]). For cell surface markers (mouse anti-SSEA-4, 1:50 [DSHB, Iowa City, IA]; TRA1-60 and TRA1-81, 1:50 [Chemicon-Millipore, Billerica, MA]), the PBS-BSA blocking solution did not contain Triton X-100. Cells were washed with PBS and incubated for 1 h at room temperature with secondary antibodies diluted in PBS-1% BSA (Alexa Fluor 488 anti-mouse antibody for SSEA-4, TRA1-60, and TRA1-81, Alexa Fluor 555 anti-mouse antibody for Oct3/4, and Alexa Fluor 488 anti-goat antibody for Nanog [Invitrogen/Molecular Probes]; all dilutions, 1:400). This was followed by a 4′,6-diamidino-2-phenylindole (DAPI) counterstain (100 ng/ml for 5 min at room temperature) and then by a PBS wash. Fluorescent images were acquired with an Olympus IX71 inverted microscope, using IPLab software. The same protocol was used for fluorescence-activated cell sorting (FACS) analysis, except that the cells were maintained in suspension and constantly on ice and the secondary antibodies were diluted 1:1,000.
Cells fixed in cold 70% ethanol were analyzed by double immunostaining with platelet/endothelial cell adhesion molecule 1 (PECAM-1; CD31) (Sigma-Aldrich, St. Louis, MO) and Von Willebrand factor (vWF) (Sigma-Aldrich, St. Louis, MO) expression using the previously described protocol (16). Briefly, the cells were incubated in blocking solution (5 mM EDTA, 1% fish gelatin, 1% essentially immunoglobulin-free BSA, and 2% horse serum) for 30 min at room temperature and then incubated in diluted primary antibody (goat anti-PECAM-1 antibody or rabbit anti-vWF antibody, 1:500 or 1:800, respectively) overnight at 4°C. The cells were then washed with PBS, incubated with fluorescein isothiocyanate (FITC)-conjugated anti-rabbit IgG F(ab′)2 (Sigma-Aldrich, St. Louis, MO) or Cy3-conjugated anti-goat IgG (Sigma-Aldrich, St. Louis, MO) for 1 h at room temperature, followed by another wash in PBS for 1 h. Samples were examined by confocal microscopy with a Leica microscope. Specificity was confirmed by replacing the primary antibody with preimmune serum. For FACS analysis, goat PECAM-1, vWF, and CD99 were used as described above. In addition, the MECs were tested with antibodies to Oct4 and Nanog.
Endothelial cells and Ramos cells (Igh-expressing cells) were also treated with goat anti-human IgM-FITC and goat anti-mouse IgM-FITC (Southern Biotechnology Associates, Birmingham, AL). The immunohistochemistry was performed as described above. Ramos cells were used as a positive control because they are a human Burkitt's lymphoma germinal-center-like B-cell line that expresses IgM.
Cells were fixed and processed for immunocytochemistry under conditions identical to those for undifferentiated hESCs. Antibodies used for characterizing R-NSCs include monoclonal mouse antibodies Nestin (1:500; Neuromics, Edina, MN), PLZF (1:50; Calbiochem, San Diego, CA), Pax6 (1:75; DSHB), and Oct3/4 (1:200; Santa Cruz Biotechnology, Santa Cruz, CA) and polyclonal rabbit antibodies Sox2 (1:200; Abcam) and ZO-1 (1:100; Zymed, San Francisco, CA).
The percentage of cells in S phase of the cell cycle was determined using propidium iodide. An aliquot of 10⁶ cells was centrifuged into a pellet, drained of liquid, and placed on ice for 15 min after the addition of propidium iodide.
The nucleoside incorporation into the DNA of iododeoxyuridine (IdU) and chlorodeoxyuridine (CldU) does not require cell synchrony or large numbers of cells. To incorporate halogenated nucleosides into DNA in vivo, the cells must be cycling and ideally be in exponential phase. The number of cells in S phase (~40% for hESCs and ~26% for MECs) was large enough to allow the incorporation of the two halogenated nucleosides, IdU and CldU, into a sufficient number of DNA molecules for these studies.
The cells were grown at 37°C for 3.5 h in the presence of 25 μM 5-iodo-2′-deoxyuridine (Sigma-Aldrich, St. Louis, MO). The medium was withdrawn, and medium with 25 μM 5-chloro-2′-deoxyuridine (Sigma-Aldrich, St. Louis, MO) was added to the cultures, followed by another 3.5 h of incubation. The cultures were rinsed with room temperature Dulbecco's phosphate-buffered saline (DPBS), and the cells were lifted with either Accutase or trypsin. Following centrifugation, the cells were resuspended at 3 × 10⁷ cells per ml in DPBS. An equal volume of melted 1% InCert agarose (Lonza Rockland, Inc., Rockland, ME) in DPBS was added to the cells at 42°C. The cell solution was pipetted into a chilled plastic mold with 0.5- by 0.2-cm wells with a depth of 0.9 cm for preparing DNA gel plugs. The gel plugs were allowed to polymerize on ice for 30 min and were then pushed out of the plastic mold into a 50-ml centrifuge tube containing lysis buffer (1% n-lauroylsarcosine [Sigma-Aldrich, St. Louis, MO], 0.5 M EDTA [pH 8], and 20 mg/ml proteinase K). The gel plugs remained at 50°C for 64 h and were treated with 20 mg/ml proteinase K, recombinant PCR grade (Roche Diagnostics, Mannheim, Germany), every 24 h.
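As a rough check on how much material each gel plug holds, the dilution and well dimensions given above can be worked through numerically. This is a sketch only: it assumes each well is filled completely and that no cells are lost during handling, neither of which is stated in the protocol, and the variable names are illustrative.

```python
# Values taken from the protocol text.
cells_per_ml_before = 3e7        # cells per ml in DPBS before adding agarose
agarose_percent_before = 1.0     # % InCert agarose in the added solution
well_cm = (0.5, 0.2, 0.9)        # well width, length, depth in cm

# Mixing equal volumes halves both concentrations.
cells_per_ml_final = cells_per_ml_before / 2
agarose_percent_final = agarose_percent_before / 2

# 1 cm^3 equals 1 ml. Assumption: the well is filled to the top.
plug_volume_ml = well_cm[0] * well_cm[1] * well_cm[2]
cells_per_plug = cells_per_ml_final * plug_volume_ml

print(f"final agarose: {agarose_percent_final}%")
print(f"plug volume: {plug_volume_ml * 1000:.0f} microliters")
print(f"cells per plug: {cells_per_plug:.2e}")
```

Under these assumptions each plug is about 90 μl of 0.5% agarose and contains on the order of 10⁶ cells.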
The gel plugs were rinsed several times with Tris-EDTA (TE) and once with phenylmethanesulfonyl fluoride (Sigma-Aldrich, St. Louis, MO). The plugs were rinsed with 10 mM MgCl2 and 10 mM Tris-HCl (pH 8.0). The genomic DNA in the gel plugs was digested with 50 units of PmeI (New England BioLabs Inc., Ipswich, MA) per plug in ~150 μl of digestion buffer containing 2× BSA (New England BioLabs Inc., Ipswich, MA) and 400 μM spermidine (New England BioLabs Inc., Ipswich, MA) at 37°C overnight.
The digested gel plugs were rinsed with TE and cast into a 0.7% gel (SeaPlaque GTG agarose [Lonza Rockland, Inc., Rockland, ME]). A gel lambda ladder PFG marker and yeast chromosome PFG marker (both from New England BioLabs Inc., Ipswich, MA) were cast next to the gel plugs.
A Southern transfer was performed to determine the location of the DNA fragment on the gel. The region of the gel containing the segment of interest was excised and set aside, while the rest of the DNA (which includes the chromosome ladders) was transferred to a membrane (Hybond-XL [Amersham Biosciences, Piscataway, NJ]) and hybridized with probes for either the POU5F1, Nanog, or IGH segments. Radiography was used to determine the location of the appropriate DNA segment or locus from the gel transfer. The remainder of the saved DNA gel slice was then cut into sequential 0.5-mm pieces of DNA and stored at 4°C in 50 mM EDTA and 10 mM Tris-HCl (pH 8.0).
The DNA molecules were stretched and fixed on glass slides and then hybridized with biotinylated probes that specify sequences on the molecule. Samples of gel slices from the appropriate positions in the pulsed-field electrophoresis gel were separately melted, and aliquots of the resulting DNA solutions were separately stretched on microscope slides. For each segment analyzed, the results for multiple slides did not show any apparent differences among melted DNA solutions, and the results of several experiments were in agreement (the results were pooled). The gel section containing the segment of interest was sliced into 1-mm sections, rinsed with TE several times, and melted at 72°C for 15 min. GELase enzyme (1 unit per 50 μl of agarose suspension) (Epicentre Biotechnologies, Madison, WI) was carefully added to digest the agarose. The DNA remained at 45°C for a minimum of 2 h, and the DNA strands were incubated with YOYO-1 iodide (Molecular Probes, Eugene, OR) for at least 1 h prior to stretching on 3-aminopropyltriethoxysilane (Sigma-Aldrich, St. Louis, MO)-coated glass slides. The DNA was pipetted along one side of a coverslip that had been placed on top of a silane-treated glass slide and allowed to enter by capillary action. The DNA was denatured with sodium hydroxide in ethanol and then fixed with glutaraldehyde.
Fluorescent antibodies located the molecules with the biotinylated probes and depicted the two different nucleoside labels. The slides were hybridized overnight with a biotinylated probe (see the blue probes diagrammed on the maps in Fig. 1, 2, 4, and 5). The next day, the slides were rinsed in 2× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate)-1% SDS and washed in 40% formamide solution containing 2× SSC at 45°C for 5 min and rinsed in 2× SSC-0.1% IGEPAL CA-630. Following several detergent rinses (4 times in 4× SSC-0.1% IGEPAL CA-630), the slides were blocked with 1% BSA for at least 20 min and treated with Alexa Fluor 350 (NeutrAvidin [Invitrogen Molecular Probes, Eugene, OR]) for 20 min. The slides were rinsed with PBS containing 0.03% IGEPAL CA-630, treated with biotinylated anti-avidin D (Vector Laboratories, Burlingame, CA) for 20 min, and rinsed again. The slides were then treated with Alexa Fluor 350 for 20 min and rinsed again, as in the previous step. The slides were incubated with the antibody to IdU, an anti-bromodeoxyuridine DNA (anti-BU-DNA) (Becton Dickinson Immunocytometry Systems, San Jose, CA), the antibody to CldU, a monoclonal rat antibromodeoxyuridine (anti-BrdU) (Accurate Chemical and Scientific Corporation, Westbury, NY), and biotinylated anti-avidin D for 1 h. This was followed by another PBS and CA-630 rinse and incubation with Alexa Fluor 350 conjugate (Molecular Probes, Eugene, OR), Alexa Fluor 568 goat anti-mouse IgG(H+L) (Invitrogen Molecular Probes, Eugene, OR), and Alexa Fluor 488 goat anti-rat IgG(H+L) (Invitrogen Molecular Probes, Eugene, OR) for 1 h. After a final PBS and CA-630 rinse, the coverslips were mounted with ProLong gold antifade reagent (Invitrogen). A Zeiss microscope was used for fluorescence microscopy to follow the nucleoside incorporation of the DNA molecules.
These results were very reproducible using several slices obtained from the same agarose strip from the pulsed-field gel in different experiments.
In order to delineate the DNA replication characteristics of pluripotent stem cells, we used two hESC lines (H9 and H14) and compared them to differentiated MECs and also to multipotent H9 hESC-derived R-NSCs (see below). To ensure that the cell lines used in this study were homogeneous, these cells were immunocytochemically characterized using fluorescent antibodies to cell-specific surface antigens and nuclear transcription factors associated with pluripotency (see Items S1, S1A, and S2 in the supplemental material). Results showed that the cells used in this study phenotypically represent the indicated cell types.
Replication of the DNA segments containing the genes of interest was studied using a previously established technique, single-molecule analysis of replicated DNA (SMARD), for fluorescent visualization of the DNA replication program across DNA segments (40, 41). Cells in exponential phase are pulsed with halogenated nucleotides: IdU followed by a second pulse with CldU. At the end of the labeling periods, DNA segments containing the loci studied here were either substituted along the whole length of the segment (totally substituted) with one nucleotide analog (see Items S3 to S11 in the supplemental material) or both nucleotide analogs, substituted along part of the length of the segment, or not substituted at all.
The cells were then suspended in agarose plugs and lysed and cleaved in the plugs to avoid DNA breakage. We then used pulsed-field gel electrophoresis to enrich for the specific segment containing the locus of interest and excised a slice of agarose that contained this segment. The agarose was melted and digested with agarase, and an aliquot of the DNA solution was allowed to slide under a coverslip; the molecules stretched as they attached to the surface of an amino silane-treated microscope slide. A red fluorescent antibody was used to detect IdU, while a green fluorescent antibody was used to detect CldU. Multiple biotinylated DNA probes were used for FISH to produce a distinctive blue “bar code” to distinguish and tag the DNA molecules containing the loci of interest and determine the orientation of these molecules. Photomicrographic images of tagged molecules that are totally substituted and contain both nucleotide analogs (red plus green along the full length) were aligned using the boundaries of the blue FISH signals. This was done in a nonsubjective manner in accordance with increasing incorporation of IdU (red stain) in randomly selected representative stretched DNA strands from the cells. Aligning each group of molecules in this manner effectively eliminates any potential bias in the arrangement of the molecules. Thus, if replication forks progress in a single direction through the examined region, the DNA molecules will have increasing red stain from one end (Fig. 1Ai and ii). Replication initiation is represented by a patch of red stain (IdU incorporation) flanked on both sides by green stain (CldU incorporation) (Fig. 1Aiii). Termination of replication is represented by a patch of green stain flanked by red stain on both sides (Fig. 1Aiv).
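The reading rules for red and green stain can be expressed as a small classifier. The sketch below is illustrative and is not the image-analysis procedure used in the study: the segment representation, the function name `classify_molecule`, and the choice to report the center of each patch as the event position are assumptions made for the example. A molecule is described from 5′ to 3′ as a list of stained stretches, with adjacent stretches of the same color already merged.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: (label, start_kb, end_kb), ordered 5' to 3'.
Segment = Tuple[str, float, float]

def classify_molecule(segments: List[Segment]) -> Dict[str, object]:
    """Find initiation and termination events in one fully substituted molecule."""
    initiations: List[float] = []
    terminations: List[float] = []
    for i in range(1, len(segments) - 1):
        label, start, end = segments[i]
        left, right = segments[i - 1][0], segments[i + 1][0]
        midpoint = (start + end) / 2  # assumption: event sits at the patch center
        if label == "red" and left == "green" and right == "green":
            initiations.append(midpoint)   # red flanked by green: initiation
        elif label == "green" and left == "red" and right == "red":
            terminations.append(midpoint)  # green flanked by red: termination

    # A single red-to-green transition means one fork crossed the segment.
    # Red (IdU, first label) marks the side the fork entered from.
    direction: Optional[str] = None
    if len(segments) == 2:
        labels = (segments[0][0], segments[1][0])
        if labels == ("red", "green"):
            direction = "5'-to-3'"
        elif labels == ("green", "red"):
            direction = "3'-to-5'"
    return {"initiations": initiations,
            "terminations": terminations,
            "direction": direction}

# Toy molecule with one internal red patch flanked by green.
print(classify_molecule([("green", 0, 100), ("red", 100, 160), ("green", 160, 350)]))
```

Molecules with more complex patterns, for example a red patch that touches the end of the segment, are left unclassified by this sketch because the flanking label on one side cannot be observed.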
The human IGH locus (~2 Mb) is a gene locus transcriptionally active only in B-lineage cells. We analyzed the replication of an ~161-kb PmeI DNA segment containing part of the IGH locus (IGH segment, including part of the V region, D, J, Cμ, and Cδ genes). We found that replication forks proceed exclusively in the 3′-to-5′ direction in all DNA molecules fully substituted with IdU and CldU (red-green molecules), lacking any internal initiation or termination sites (Fig. 1B, molecules 1 to 32, and 1C, molecules 1 to 23) in the hESCs and MECs. As shown in Fig. 1D and E, the replication profiles of the IGH segment show very similar patterns (r = 0.962; P < 0.001) which decrease progressively in a 3′-to-5′ direction. Thus, the replication programs for the transcriptionally silent IGH segment are very similar in hESCs and MECs; external replication initiation sites downstream of the segment were used to replicate the IGH segment.
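A replication profile of this kind, and the comparison of two profiles, can be sketched as follows. The example assumes that the profile is the percentage of molecules stained red (IdU) at each position, sampled at the midpoint of 5-kb bins, and that r is a Pearson correlation coefficient; the text does not specify either detail, and the function names and toy data are hypothetical.

```python
from math import sqrt
from typing import List, Tuple

# Hypothetical representation: (label, start_kb, end_kb), ordered 5' to 3'.
Segment = Tuple[str, float, float]

def replication_profile(molecules: List[List[Segment]],
                        length_kb: float,
                        bin_kb: float = 5.0) -> List[float]:
    """Percentage of molecules that are red (IdU) at each bin midpoint."""
    n_bins = int(length_kb // bin_kb)
    profile = []
    for b in range(n_bins):
        mid = (b + 0.5) * bin_kb
        red = sum(
            any(label == "red" and start <= mid < end for label, start, end in mol)
            for mol in molecules
        )
        profile.append(100.0 * red / len(molecules))
    return profile

def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equally long profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Toy data on a 20-kb segment: forks entering from the 3' end.
cells_a = [[("green", 0, 10), ("red", 10, 20)], [("green", 0, 15), ("red", 15, 20)]]
cells_b = [[("green", 0, 5), ("red", 5, 20)], [("green", 0, 15), ("red", 15, 20)]]
profile_a = replication_profile(cells_a, 20)
profile_b = replication_profile(cells_b, 20)
print(profile_a, profile_b, pearson_r(profile_a, profile_b))
```

In the toy data both profiles are highest at the 3′ end and fall toward the 5′ end, the shape expected when forks traverse the segment in the 3′-to-5′ direction.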
We next compared the replication program for an ~350-kb DNA segment containing the POU5F1 gene encoding the pluripotency regulator Oct4 (POU5F1 segment). Unlike the IGH segment, which is passively replicated by origins located downstream of the segment, several internal replication initiations were observed in the POU5F1 segment (Fig. 2). The identification of the positions of initiation sites determined by SMARD is based on the following two assumptions. First, the replication fork proceeds bidirectionally from the initiation sites. Second, the replication forks progress at an equal speed in both directions. Thus, the initiation site will be located at the center of a red patch surrounded by green, assuming there is only one initiation site in the IdU-incorporated (red-stained) patch in the molecule. It should be noted that these assumptions likely apply to most but not necessarily all replication initiations in mammalian cells. In the H9 and H14 hESCs, most of the replication initiation sites (asterisks in Fig. 2A and B) were close to the POU5F1 gene, exhibiting a peak centered at the POU5F1 gene in the replication profile (Fig. 2D and E). Forks originating from external replication origins both upstream and downstream of the segment also replicate the POU5F1 segments. For all of the red-green DNA molecules without initiations and terminations, 20% and 39% proceeded in the 5′-to-3′ direction in H9 and H14 hESCs, respectively (Fig. 2A, molecules 16 to 20, and B, molecules 21 to 31, respectively), and 80% and 61% proceeded in the 3′-to-5′ direction in H9 and H14 hESCs, respectively (Fig. 2A, molecules 21 to 40, and B, molecules 32 to 48, respectively). There were 8 molecules and 12 molecules with replication termination events that randomly occurred in the DNA segment for the H9 and H14 hESCs, respectively (Fig. 2A, molecules 41 to 48, and B, molecules 49 to 60, respectively).
These results showed that the DNA replication programs for the POU5F1 segment are very similar in the two hESC lines which exhibited very similar DNA replication profiles (r = 0.818; P < 0.001) (Fig. 2D and E).
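The fork-direction percentages reported for the two hESC lines can be cross-checked from the molecule ranges cited for Fig. 2A and B. The short sketch below does only that arithmetic; the helper names are illustrative, and the ranges are taken as inclusive.

```python
from typing import Tuple

def count_inclusive(first: int, last: int) -> int:
    """Number of molecules in an inclusive range such as 'molecules 16 to 20'."""
    return last - first + 1

def direction_percentages(five_to_three: Tuple[int, int],
                          three_to_five: Tuple[int, int]) -> Tuple[int, int]:
    """Rounded percentages of 5'-to-3' and 3'-to-5' molecules."""
    n53 = count_inclusive(*five_to_three)
    n35 = count_inclusive(*three_to_five)
    total = n53 + n35
    return round(100 * n53 / total), round(100 * n35 / total)

# H9: molecules 16 to 20 (5'-to-3') and 21 to 40 (3'-to-5').
print(direction_percentages((16, 20), (21, 40)))  # (20, 80)
# H14: molecules 21 to 31 (5'-to-3') and 32 to 48 (3'-to-5').
print(direction_percentages((21, 31), (32, 48)))  # (39, 61)
```

The counts are 5 and 20 molecules for H9 and 11 and 17 molecules for H14, which reproduce the stated 20%/80% and 39%/61% splits.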
In the MECs, DNA replication initiation sites were also detected within the POU5F1 segment (Fig. 2C, molecules 1 to 27). There were 17 termination sites in the segments, and many of these termination sites were located in the molecules that also have initiation sites (molecules 15 to 27, 44 to 47). With regard to fork direction, there were almost twice as many molecules with replication forks proceeding from 5′ to 3′ (Fig. 2C, molecules 28 to 37) as those with replication forks proceeding from 3′ to 5′ through the entire segment containing the POU5F1 gene (Fig. 2C, molecules 38 to 43). In particular, replication forks progressed predominantly in the 5′-to-3′ direction through the POU5F1 gene (Fig. 2C). This was reflected in the replication profile of the MECs (Fig. 2F), where there was a decrease in the percentage of molecules containing incorporated IdU from 5′ to 3′ near the POU5F1 gene. Importantly, this revealed a significant difference between hESCs and MECs in the direction of replication fork movement through the POU5F1 gene. In hESCs, forks proceeded through the gene predominantly in the 3′-to-5′ direction, in contrast to fork movement in the MECs, which was predominantly in the 5′-to-3′ direction (Fig. 2). Thus, replication fork progression through the POU5F1 gene in hESCs was clearly different from that in MECs.
An additional important finding was that the replication initiation program of the POU5F1 segment in H9 and H14 hESCs was clearly different from that of MECs (Fig. 2). In DNA molecules from different cells, initiation can occur at different sites. For the POU5F1 segment, as in many other loci, these initiation sites are located within a region called the initiation zone. DNA replication initiated within an ~100-kb region centered around the POU5F1 gene in H9 and H14 hESCs (Fig. 2A and B), whereas most of the initiation sites in MECs were located in a more distal region from the gene (Fig. 2C). The differences in the location of replication initiation sites in hESCs and MECs were evident in the replication profiles (Fig. 2D to F). The peaks in the profiles indicated the center of the initiation zone. To better visualize initiation events in DNA molecules, we determined the replication profile for DNA molecules that had replication initiations (initiation profile). We calculated the distribution of red-stained areas that are surrounded by green stain (initiation region) in DNA molecules and plotted this at intervals of 5 kb along the POU5F1 segment. Assuming the center of the red signal is the initiation site, the midpoint of the peak of the initiation profile will reflect the location of preferential initiation sites. Profiles of initiation for both hESC lines showed a peak midpoint near the POU5F1 gene (Fig. 3A and B), indicating that most initiation originated near the gene. In contrast, the peak midpoint of the replication initiation profile for MECs was ~70 kb upstream of the POU5F1 gene (Fig. 3C). These results thus indicated that there is a clear difference in the position of initiation sites for the POU5F1 segment; initiations are proximal to the gene in hESCs but distal to the gene in MECs.
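The initiation profile described here can be illustrated with a short sketch. It assumes that each 5-kb interval is scored at its midpoint, that the profile is a simple count of molecules whose initiation region covers that point, and that the peak is the contiguous run of intervals sharing the maximum count. These choices, the data layout, and the function names are illustrative assumptions rather than the procedure used to generate Fig. 3.

```python
from typing import List, Tuple

# Hypothetical representation: (label, start_kb, end_kb), ordered 5' to 3'.
Segment = Tuple[str, float, float]

def initiation_regions(segments: List[Segment]) -> List[Tuple[float, float]]:
    """Red patches flanked on both sides by green stain."""
    return [
        (start, end)
        for i, (label, start, end) in enumerate(segments)
        if label == "red"
        and 0 < i < len(segments) - 1
        and segments[i - 1][0] == "green"
        and segments[i + 1][0] == "green"
    ]

def initiation_profile(molecules: List[List[Segment]],
                       length_kb: float,
                       bin_kb: float = 5.0) -> List[int]:
    """Count of initiation regions covering each bin midpoint."""
    n_bins = int(length_kb // bin_kb)
    counts = [0] * n_bins
    for mol in molecules:
        for start, end in initiation_regions(mol):
            for b in range(n_bins):
                if start <= (b + 0.5) * bin_kb < end:
                    counts[b] += 1
    return counts

def peak_midpoint(counts: List[int], bin_kb: float = 5.0) -> float:
    """Midpoint (kb) of the bins with the maximum count; assumes they are contiguous."""
    top = max(counts)
    bins = [b for b, c in enumerate(counts) if c == top]
    return (bins[0] + bins[-1] + 1) * bin_kb / 2

# Toy data on a 40-kb segment: two molecules with nested initiation regions.
toy = [
    [("green", 0, 10), ("red", 10, 30), ("green", 30, 40)],
    [("green", 0, 15), ("red", 15, 25), ("green", 25, 40)],
]
profile = initiation_profile(toy, 40)
print(profile, peak_midpoint(profile))
```

In the toy data both red patches are centered at 20 kb, and the peak midpoint recovers that position even though the patches differ in length.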
We examined whether these differences in the fork directions and origin locations could be caused by significant variations in DNA sequences, for example, large DNA insertions or deletions or rearrangements in the examined segments. We did not detect any large duplications or deletions for the PmeI segment in the hESCs and MECs based on the segment sizes measured by pulsed-field gel electrophoresis (at a resolution of 5 kb). Additionally, on a more sensitive level, PCR results did not reveal any significant differences between hESCs and MECs when the lengths of the products (see Items S12 and S13 in the supplemental material) obtained from the POU5F1 and Nanog segments (where the differences in replication were detected) were compared. Therefore, it is highly unlikely that the differences in replication at the POU5F1 segment for hESCs and MECs could be attributed to detectable insertions or deletions. Since the directions of replication forks through the POU5F1 gene and locations of origins were conserved in the two hESC lines (and also in the H1 hESCs [data not shown]), these features appear to be an intrinsic epigenetic property of hESCs.
To further determine whether the DNA replication program used by the two hESC lines to replicate the POU5F1 segment was specific to these lines, we examined replication in R-NSCs (Fig. 4). R-NSCs are a novel neural stem cell type with broad differentiation potential toward central nervous system and peripheral nervous system fates and capable of in vivo engraftment (15). Furthermore, these cells share a genotype with the H9 hESCs studied here. Additional important features that distinguish R-NSCs from MECs include the broad differentiation repertoire (multipotency), expression of several markers shared with pluripotent cells, such as SOX2 (Fig. 4A) and LIN28, and the early embryonic nature of the cells emerging shortly after the loss of hESC pluripotency. Similar to MECs, R-NSCs do not express POU5F1 (Fig. 4A) or Nanog.
The replication of the POU5F1 segment in R-NSCs (Fig. 4B and C) was clearly different from that in the H9 ESCs from which they were derived as well as another independent hESC line, H14 (Fig. 2A and B), and was also different from that in the MECs (Fig. 2C). In H9 hESCs, replication mainly initiated within (showing a peak in the replication profile near the POU5F1 gene) (Fig. 2A) or downstream of the segment (predominance of 3′-to-5′ replication forks), while in the R-NSCs, replication initiated predominantly upstream of the segment (5′-to-3′ replication forks). As shown in Fig. 4B, the replication profile decreased in the 5′-to-3′ direction, indicating that most of the replication forks proceeded 5′ to 3′ through the segment. Forks proceeded predominantly 5′ to 3′ through the POU5F1 gene in R-NSCs, distinct from hESCs, in which the forks proceeded predominantly 3′ to 5′ through the gene. Furthermore, in the H9 hESCs, most of the replication initiation sites were found near the POU5F1 gene (no more than 50 kb upstream of the POU5F1 gene), while in R-NSCs, most replication initiations were located at positions more than 50 kb upstream of the gene (Fig. 3D and 4B). These findings demonstrate that DNA replication for the DNA segment containing the pluripotent locus POU5F1 in the H9 derivative R-NSCs is clearly distinct from that in the hESCs and the somatic MECs.
We analyzed the replication of the DNA segment containing another pluripotency gene, NANOG (Nanog segment, ~110 kb). In the H9 hESCs, most of the replication forks proceeded from 3′ to 5′. Three initiation and three termination sites were observed near or upstream of the NANOG gene. In the H14 hESCs, similar numbers of molecules (Fig. 5B, molecules 1 to 26) exhibited forks progressing from 5′ to 3′ and 3′ to 5′. Twenty-three percent of the DNA molecules exhibited replication initiation sites (Fig. 5B, molecules 27 to 35 and 40). There were five termination sites upstream and downstream of the NANOG gene for the H14 hESCs (Fig. 5B, molecules 36 to 40). Comparison of the replication profiles of H9 and H14 hESCs showed that the replication program of the segment was not completely conserved in the two hESCs analyzed (Fig. 5A and B).
In contrast to H14, in the MECs and R-NSCs, almost all of the molecules had replication forks proceeding only in the 3′-to-5′ direction (Fig. 5C, D, G, and H). We did not detect any DNA molecules that proceeded 5′ to 3′ through the segment in the MECs (Fig. 5C), and only one did so in the R-NSCs (Fig. 5D, molecule 1). Calculation of the directions of replication forks proceeding through the NANOG gene showed that replication forks proceeded predominantly 3′ to 5′ through the gene in H9 hESCs, MECs, and R-NSCs but showed similar numbers of 5′-to-3′ and 3′-to-5′ forks proceeding through the NANOG gene in H14 hESCs.
As with the POU5F1 segment, we did not detect any difference in the lengths of the Nanog segments among the cell types examined in this study based on pulsed-field gel electrophoresis and PCR analysis (20 pairs of primers were designed to span a 100-kb region [see Item S13 in the supplemental material]); hence, major sequence modifications did not account for the differences we detected.
Since the chromatin in hESCs is generally more open and more highly transcribed than it is in somatic cells (for a review, see reference 36), it is possible that replication forks will progress at significantly different speeds in hESCs (H9 and H14 hESCs) compared to those in MECs and R-NSCs. We calculated the average speeds of replication forks using an equation described previously (40, 41) (see Items S13 and S14 in the supplemental material). We found, however, that the average speed of replication forks in hESCs was similar to that in MECs and R-NSCs. For the DNA segments containing the POU5F1 and NANOG genes, which are expressed in hESCs, a significantly different average replication fork speed was not observed compared to the average speed observed for MECs and R-NSCs, in which the genes are silenced. In addition, we found that the speeds of replication forks in several segments containing portions of the IGH locus in H9 and H14 hESCs were not significantly different from those in human MECs (data not shown). To test whether this holds generally for hESCs, we extended our analysis and calculated the average speed of replication forks for many other genomic regions. We collected, from pulsed-field gels, DNA molecules with a length of ~350 kb from many different locations in the human genome and calculated the average speed of the replication forks using the same equation as described above (see Item S14 in the supplemental material). Sixty microscope fields were randomly selected from three independent slides for each cell type; molecules that were not well stretched were not included in the calculation. Consistent with the results for the three specific genomic regions studied in detail here, the results, from many different genomic regions, showed that the average speed of replication forks in hESCs was not unusual compared to speeds measured in somatic R-NSCs (see Item S16 in the supplemental material).
In conclusion, analysis of the POU5F1 segment revealed significant differences in hESCs versus MECs or R-NSCs in the direction of replication forks proceeding through the POU5F1 gene and in the positions of initiation sites relative to the POU5F1 gene. Analysis of the Nanog segment showed replication forks proceeding in the 3′-to-5′ and 5′-to-3′ directions through the NANOG gene in H14 hESCs, while predominantly 3′-to-5′ replication forks were detected in H9 hESCs and MECs. Thus, although several features of replication were shared between the pluripotent hESCs, multipotent R-NSCs, and differentiated MECs examined in this study, there were a number of distinctive differences in replication fork direction and initiation sites of the two pluripotent genes expressed in undifferentiated hESCs (Table 1).
Mammalian genomes exhibit great flexibility in utilizing sites to initiate DNA replication, but at the same time, the initiation sites are subject to precise developmental regulation. Replication studies have shown that mammalian chromosomes use thousands of replication origins that proceed bidirectionally with an average interorigin distance of 50 to 150 kb (29). Recently, we showed that many of the origins in the IGH region are present in zones that can contain 30 or more initiation sites and can be as large as 700 kb (40). The choice of replication initiation sites depends upon epigenetic modification and metabolic factors as well as cis-regulatory elements in DNA sequences (reviewed in references 2 and 25). The chromatin state in stem cells and differentiated cells likely affects the structure and function of specific genes and also likely influences replication. Changes in the differentiation state result in changes in origin usage; examples include the silencing of some origins in the transcriptionally active HoxB locus of P19 mouse cells, discussed above (21). In addition, it has been reported that changes in nucleotide pool sizes affect the specification of replication origins (5, 9).
Previous studies on mammalian DNA replication reported changes in the sizes of replication initiation zones during development (2). In pro- and pre-B cells, developmentally regulated origins separated by ~50 kb are temporally activated in the DJC gene cluster of the mouse IGH locus; these origins are silent in other cell types (40). The pattern of replication initiation in the β-globin locus is similar in differentiated and undifferentiated mouse ESCs (4). However, our results show an alteration in the position of replication initiation sites as well as in fork direction for transcriptionally active pluripotent genes in hESCs compared to MECs and R-NSCs, in which those genes are silent.
We found that the replication program for the POU5F1 segment was remarkably similar in the H9 and H14 hESC lines presented here as well as in the H1 hESC line (data not shown). These similarities included the region in which replication initiation occurred, the average number of initiation events, the average number of replication forks, and the average speed per replication fork. However, significant differences were detected when the replication program of pluripotent regulatory gene POU5F1 examined here in hESCs was compared to that in MECs and R-NSCs. In the DNA segment containing the POU5F1 gene, the location of the center of the replication initiation zone in hESCs was clearly different from the center of the initiation zone in somatic cells. The predominant direction of fork progression through the gene was also significantly different in hESCs compared to that in the somatic cells, indicating a differential origin usage in the region flanking the gene or outside the whole segment. Hence, our findings reveal that, in the cells studied here, there are characteristic differences in replication initiation and fork direction between hESCs and somatic cells.
Importantly, fork direction has been found to play a major role in a developmental pathway of S. pombe (mating type switching) (10-12); thus, the implications of replication fork direction in pluripotency could be an important question to address in the future. Although the R-NSCs represent a multipotent stem cell type with several properties in common with the hESCs they derive from, the fact that there are differences in the replication program of the POU5F1 segment in the R-NSCs provides additional evidence for tissue-specific replication programs.
The potential role of replication timing, transcriptional activity, and chromatin structure in the heterogeneity of initiation in zones in mammalian cells is currently unclear. Although initiation sites have a heterogeneous frequency of usage in both Saccharomyces cerevisiae and mammalian cells, there is a major difference. S. cerevisiae has initiation sites at defined DNA sequences that are used at different frequencies, in contrast to initiation in mammalian cells, which often occurs at a large number of sites within initiation zones. We present the first evidence for initiation zones in hESCs. We observed a DNA replication initiation zone within the 350-kb segment containing the POU5F1 gene and a zone within the 110-kb segment containing the NANOG gene in two hESC lines. Importantly, the major portion of initiations in the POU5F1 zone did not appear to occur at preferred sites in any of the cell lines examined here. Heterogeneity of initiation in hESCs is also exemplified by the Nanog segment. When two different hESC lines were compared with respect to replication of the segment containing the NANOG gene, there were significant differences in positions of initiation sites. An additional point is that variations in NANOG expression have been observed among ESCs (47). The POU5F1 and Nanog results provide further evidence for heterogeneity in the selection of initiation sites.
In hESCs, RNA levels from POU5F1 are approximately 30-fold higher than the average levels from seven other genes in the flanking 100-kb region (23). However, the increased transcriptional activity does not result in preferred initiation sites within the initiation zone (Fig. 2 and 3). In the hMECs and R-NSCs, where the POU5F1 gene is not expressed, there is still a replication initiation zone, but the center of the zone is shifted from POU5F1 in the direction of the TCF19 gene. Interestingly, in vascular endothelial cells and neuronal cells, the TCF19 gene is also a highly transcribed gene in this region (Gene Expression Omnibus series GSE19090). The shift in the position of the center of the POU5F1 initiation zone is not due to a change in replication timing in hESCs relative to that in somatic cells, since the 350-kb segment containing the POU5F1 gene replicates relatively early in S phase in all the cell types studied here (14, 23, 26; R. S. Hansen, personal communication; Gene Expression Omnibus series GSE19090). To assess replication timing, we also used the valuable Replication Domain website (http://www.replicationdomain.org) developed by David Gilbert. Thus, the results reported here support the heterogeneity of initiation sites and raise the possibility that transcriptional activity, chromatin structure, and other factors play a role in their selection; however, their direct role in establishing replication initiation sites appears to be very complex and is beyond the scope of this study (2).
The DNA replication programs of hESCs could also be influenced by their more open chromatin structure, resulting in the modulation of the speed of replication forks, the density of replication initiation sites, the rate of DNA replication, and/or the length of S phase. Importantly, in the hESCs studied here, the DNA segments containing the pluripotent genes did not have a faster or strikingly different speed of replication fork progression from that measured in somatic cells. In addition, speeds of replication forks for DNA segments in several other regions (called temporal transition regions [TTR]) have been determined. These include our results for the IGH locus TTR in human mesenchymal cells (data not shown) and the IGH locus in pro-B, pre-B, immature and mature B, and T cells (27). We show that these average replication fork speeds were not very different from speeds for the same segments in hESCs and from speeds for segments containing the POU5F1 and NANOG gene loci in hESCs. Furthermore, analysis of the average speed of replication forks for many other genomic regions from many different genomic locations confirmed that the fork rates do not differ dramatically (see Item S16 in the supplemental material). We also determined that the average densities of replication origins, calculated by choosing DNA molecules with a length of ~350 kb from many different locations in the human genome, were not significantly different when hESCs were compared to the somatic cells (see Item S16 in the supplemental material). It has been pointed out that unique cell cycle characteristics might play a role in maintaining the pluripotent characteristics of ES cells (17). Here, we have shown that several general characteristics of DNA replication during the S phase were not specifically distinct in hESCs. 
In addition, we found that the length of S phase for the hESCs was not significantly less than that in somatic cells, which is about 8 h (see Item S17 in the supplemental material). Thus, while changes in any of the above-examined characteristics of DNA replication could potentially lead to an altered replication program in hESCs, we did not observe significant differences in these aspects of replication. Our results showed that for the three DNA segments analyzed, several features of the basic DNA replication program, including the speed of replication forks and the density of replication origins, are conserved in hESCs and somatic cells. Importantly, these similarities are contrasted by the significant differences in the replication program of hESCs with respect to origin locations and fork directions observed in the pluripotent cell lines used in these experiments.
In summary, the studies on replication presented here delineate key epigenetic characteristics of hESCs that could reflect features specific to pluripotency in these hESC lines. It has been suggested that many abnormalities in nuclear transfer animals may have an epigenetic basis. Our results delineate changes in DNA replication, an intrinsic epigenetic property of cells. It is likely that these epigenetic characteristics must be established correctly during nuclear reprogramming in order to produce normal animals or during somatic cell reprogramming in the case of iPSCs. These results suggest that no major changes in the rate of replication fork progression are required for human somatic cell reprogramming. However, for the pluripotent regulatory gene POU5F1, the positions of replication initiation sites should change during reprogramming, and these changes will affect the direction of replication fork progression through this gene. These findings suggest that origin distribution may play an important role in defining the pluripotent state. Moreover, our study suggests that characterization of these changes in the DNA replication program will be an important tool for characterizing pluripotent cells. The lack of a suitable in vivo assay for hiPSCs stresses the need for developing such novel surrogate in vitro assays based on defined epigenetic features that can be used to identify cells that are pluripotent.
We thank Eliseo Eugenin and Joan Berman for characterizing and providing the human endothelial cells, Prasanta Patel for experiments, data collection, and analysis of the IGH locus in the human embryonic stem cells, Paolo Norio for providing advice on the SMARD technique, and Zeqiang Guan, Scott Hansen, and William Drosopoulos for valuable suggestions during the writing of the manuscript. We thank former lab member Xiaohua Qi for information about the replication of the Oct4 locus in H1 hESCs.
This work was supported by the National Cancer Institute Immunology and Immunooncology training grant 5T32CA00173, National Institutes of Health Molecular Neuropathology training grant NS 07098 (NIGMS P-20 GM 075037), National Institute of General Medical Sciences grant 5R01-GM045751 (C.L.S.), and the Empire State Stem Cell Fund through NYS contract C024348 (C.L.S. and L.S.).
Published ahead of print on 20 July 2010.
Supplemental material for this article may be found at http://mcb.asm.org/.