Methodology. Our technique to investigate the dynamics of ssDNA formation on a genomic scale is outlined in . We harvested cells at discrete times after releasing them from late G1 phase (alpha factor) arrest into a synchronous S phase in the presence of 200 mM HU (). Chromosomal DNA isolated from these S phase samples and an alpha factor arrested G1 control sample were differentially labeled with Cy-conjugated deoxyribonucleotides by random priming and synthesis without denaturation of the DNA, followed by co-hybridization to a microarray (). Because the labeling was done without denaturation of the template DNA, single-stranded regions of the genome should preferentially act as templates for dye incorporation. Although labeling DNA without random hexameric primers does render some incorporation of deoxyribonucleotides, the reaction can be enhanced approximately seven fold when random primers are included (data not shown). The average size of the labeled DNA was approximately 500 nt (data not shown). Comparison of experimental (S phase) and control (G1 phase) samples from the microarray hybridization revealed regions of the genome that became single-stranded in S phase.
Figure 1 Outline of experimental procedures. (A) Synchronization and yeast cell sample collections. (B) Labeling of DNA for microarray hybridization. (C) Slot blotting and hybridization for quantification of ssDNA. The total amount of ssDNA in the genome for each ...
We also assessed the total percentage of ssDNA in the samples by blotting native (undenatured) genomic DNA and fully denatured genomic DNA, followed by hybridization with a genomic DNA probe (). The calculated total percentages of ssDNA in the samples were then used to normalize the relative ratio of ssDNA (S/G1) (see Supplementary Information, Normalization), which, when plotted against chromosomal coordinates, generated a ssDNA profile (). The normalized relative ratio of ssDNA was then smoothed over a 4 kb window via Fourier transformation (see Supplementary Information, Smoothing). We identified peaks of ssDNA computationally (see Supplementary Information, Extrema detection). All experiments (including sample collection, DNA isolation, labeling and hybridization) were done at least twice with reproducible results. The results shown below are from one such experiment, for WT and rad53 cells each.
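To make the smoothing and extrema detection steps concrete, the following is a minimal sketch and not the analysis code used in this study. It substitutes a simple sliding-window mean for the Fourier-based smoothing described in the Supplementary Information, and the function names, probe coordinates and ratio values are hypothetical.

```python
def smooth(positions_kb, ratios, window_kb=4.0):
    """Sliding-window mean (a stand-in for the Fourier-based smoothing).

    Each probe is replaced by the mean of all probes lying within
    window_kb / 2 of it on the same chromosome.
    """
    half = window_kb / 2.0
    smoothed = []
    for center in positions_kb:
        in_window = [r for p, r in zip(positions_kb, ratios)
                     if abs(p - center) <= half]
        smoothed.append(sum(in_window) / len(in_window))
    return smoothed


def local_extrema(values):
    """Return (maxima, minima): indices of strict interior local extrema."""
    maxima, minima = [], []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] > values[i + 1]:
            maxima.append(i)
        elif values[i - 1] > values[i] < values[i + 1]:
            minima.append(i)
    return maxima, minima


# Hypothetical probe coordinates (kb) and normalized S/G1 ssDNA ratios.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ratios = [1.0, 1.0, 2.0, 4.0, 8.0, 4.0, 2.0, 1.0, 1.0]

profile = smooth(positions, ratios)
maxima, minima = local_extrema(profile)
peak_positions = [positions[i] for i in maxima]
```

With these toy values the smoothed profile has a single local maximum at 4 kb and no interior minima. A real profile would be processed one chromosome at a time.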
ssDNA formation in WT vs. rad53 cells.
WT cells synchronously entering S phase in the presence of 200 mM HU showed very little DNA synthesis for up to three hours (). The total amount of ssDNA in the genome (assayed as in ) remained at approximately 0.2% during this 3-hour period (). At 30 minutes following the release from the alpha factor block into medium containing HU, WT cells accumulated ssDNA predominantly at regions corresponding to known early replicating origins such as ARS305 and ARS306 (, 30 min). Over time, the amount of ssDNA at these early origins decreased gradually, accompanied by an increase of ssDNA at adjacent regions. By 3 hours in HU, ssDNA was broadly distributed throughout regions of early firing origins (for example, across two-thirds of chromosome III) (). These data suggest that replication forks in WT cells are moving in the presence of HU, albeit at a very slow pace, rather than stalling. These data are also consistent with the previous observation that the size of replication bubbles increases steadily in WT cells in HU, with the amount of ssDNA in the bubbles remaining relatively constant [4]. When compared to the replication profile of an isogenic strain during a normal S phase [5], it is clear that distinct peaks of ssDNA at the late/inefficient origins such as ARS301/320 and ARS316 were not readily observed for up to 3 hours in HU ().
Figure 2 Dynamics of ssDNA formation in WT cells and rad53 cells. (A) Flow cytometric analysis of an asynchronous cell population (Asy) and of cells undergoing S phase in the presence of 200 mM HU; the times indicated are those following the release from an alpha ...
In rad53 cells, at 30 minutes post release the levels and locations of ssDNA in the genome were comparable to those seen in WT cells at 30 minutes (, 30 min). However, ssDNA increased ~2.5 fold in rad53 cells between 30 and 60 minutes (). The amount of ssDNA at the early origins showed a sharp increase between 30 minutes and 1 hour post release in rad53 cells. At 1 hour, ssDNA peaks appeared at additional origins such as ARS301/302, ARS313/314 and the HMR-E ARS in rad53 cells that were not seen in WT cells (, 1 hr). These observations are consistent with the notion that Rad53 is involved in delaying or preventing the firing of some origins in the presence of HU [6-9]. Moreover, in rad53 cells the locations of ssDNA were confined to origins over time, up to 3 hours post release, suggesting that replication bubbles do not expand in HU in rad53 cells as they do in WT cells. This result is consistent with the observation that bubble size in rad53 cells in HU, as assayed by EM, does not increase over time [4]. Finally, the sub-telomeric regions in the rad53 cells showed elevated levels of ssDNA relative to the remainder of the chromosomes, in contrast to the WT cells (, see Supplementary Information, Fig. S1, and see below).
We quantified the accumulation of ssDNA in a 20 kb window from either end of each chromosome, excluding any data point in the most distal 4 kb window to avoid any artifacts introduced by the smoothing process near chromosome ends (). Comparison between the ratios of ssDNA at the telomeric vs. the internal regions of the chromosomes showed not only that the telomeric regions accumulated relatively higher amounts of ssDNA than the internal regions in rad53 cells, but also that the telomeric regions continued to increase in the amount of ssDNA over time in rad53 cells (). The disproportionate increase in ssDNA at the telomeres in rad53 cells could be the consequence of either replication or partial telomere erosion. If it is indeed due to replication-related events, the fact that the entire telomeric region (including local maxima and minima) showed an increase in ssDNA suggests that replication forks initiated in these regions are capable of moving in rad53 cells, insensitive to HU ( & see Supplementary Information, Fig. S1). It has been reported that a checkpoint mutation of rad53 causes telomere length shortening [10] and that Rad53 specifically inhibits the Exo1-dependent degradation of double-stranded DNA to ssDNA at unprotected telomeres in a cdc13 mutant. Whether HU treatment of the rad53 mutant mimics the unprotected telomere phenotype of the cdc13 mutant is not known. To test whether Exo1 is indeed responsible for the continued increase in ssDNA formation at the telomeres in rad53 cells in HU, we examined ssDNA formation in the presence of HU in a rad53 exo1 double mutant strain. Quantification of the ratio of telomeric ssDNA to internal (non-telomeric) ssDNA showed that, in the double mutant, the relative amount of ssDNA at the telomeres does not continue to increase over time as it does in rad53 cells (). This result suggests that at least the continued increase of ssDNA at the telomeres in HU in rad53 cells is dependent on Exo1.
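The telomeric versus internal comparison can be illustrated with a short sketch. It is not the code used for the analysis: the text does not state how values within each region were aggregated, so the use of a mean is an assumption, and the function name, coordinates and values are hypothetical.

```python
def telomere_to_internal_ratio(positions_kb, values, chrom_length_kb,
                               telomeric_kb=20.0, excluded_kb=4.0):
    """Ratio of mean ssDNA signal in telomeric vs. internal regions.

    Telomeric: within telomeric_kb of either chromosome end.
    Excluded:  within excluded_kb of either end (smoothing edge effects).
    Aggregating by the mean is an assumption made for this sketch.
    """
    telomeric, internal = [], []
    for pos, val in zip(positions_kb, values):
        dist_to_end = min(pos, chrom_length_kb - pos)
        if dist_to_end < excluded_kb:
            continue  # most distal window: dropped entirely
        if dist_to_end <= telomeric_kb:
            telomeric.append(val)
        else:
            internal.append(val)
    mean_telomeric = sum(telomeric) / len(telomeric)
    mean_internal = sum(internal) / len(internal)
    return mean_telomeric / mean_internal


# Hypothetical 100 kb chromosome: probe positions (kb) and smoothed ssDNA values.
positions = [2.0, 10.0, 18.0, 50.0, 60.0, 85.0, 98.0]
values = [9.0, 3.0, 3.0, 1.0, 1.0, 3.0, 9.0]

ratio = telomere_to_internal_ratio(positions, values, chrom_length_kb=100.0)
```

In this toy example the probes at 2 kb and 98 kb are excluded, the telomeric mean is 3 and the internal mean is 1, giving a ratio of 3. Tracking this ratio across the timed samples corresponds to the comparison described above.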
Figure 3 Elevated levels of ssDNA at telomeric regions. (A) Overlay of ssDNA profiles of chromosome VI for WT cells (light purple, purple, and dark blue for 1, 2, and 3 hours post release, respectively) and rad53 cells (yellow, orange, and red for 1, 2, and 3 ...
ssDNA locations accurately predict origin locations. We took advantage of the observation that ssDNA formation appeared to occur near origins of replication in rad53 cells in HU to explore the predictive power of our technique in origin identification. For simplicity, we are presenting only one set of experiments of rad53 cells in HU. We identified the locations of all ssDNA peaks and valleys, i.e., local maxima and minima, and scored a given ssDNA peak as significant if the peak differentials from both of its flanking local minima were at least three standard deviations above the background level of variation (see Supplementary Information, Identification of significant ssDNA peaks). We repeated this process for the other timed samples. These ssDNA peaks from the three timed samples were then clustered by genomic location, allowing a maximal 4 kb difference (the size of the sliding window used in the smoothing process) between a pair of ssDNA peaks from any two samples. From this single experiment, we identified a total of 364 unique ssDNA peak locations, of which 315 were present in at least two timed samples (with 249 present in all three timed samples) and were named “clustered ssDNA peaks”, while 49 appeared as singletons (Table S1). Even among the singletons, only 14 were true singletons inasmuch as the remaining 35 were identified at least twice in other repetitions of the experiment (Table S1).
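The peak-scoring and clustering logic can be sketched as follows. This is an illustration under stated assumptions rather than the procedure detailed in the Supplementary Information: the significance test is read here as requiring each peak-to-minimum differential to be at least three background standard deviations, the clustering is a simple greedy grouping of position-sorted peaks, and all names and numbers are hypothetical.

```python
def is_significant(peak, left_min, right_min, background_sd, n_sd=3.0):
    """True if the peak exceeds BOTH flanking minima by >= n_sd * background_sd."""
    threshold = n_sd * background_sd
    return (peak - left_min) >= threshold and (peak - right_min) >= threshold


def cluster_peaks(peaks_by_sample, max_gap_kb=4.0):
    """Group peaks from several samples (one chromosome) by position.

    Peaks are pooled and sorted; a peak joins the current cluster when it
    lies within max_gap_kb of the previous peak, otherwise it starts a new one.
    """
    pooled = sorted((pos, sample)
                    for sample, positions in peaks_by_sample.items()
                    for pos in positions)
    clusters = []
    for pos, sample in pooled:
        if clusters and pos - clusters[-1][-1][0] <= max_gap_kb:
            clusters[-1].append((pos, sample))
        else:
            clusters.append([(pos, sample)])
    return clusters


# Hypothetical significant peak positions (kb) on one chromosome.
peaks = {"1h": [10.0, 50.0, 90.0], "2h": [11.5, 52.0], "3h": [9.0, 120.0]}

clusters = cluster_peaks(peaks)
samples_per_cluster = [len({s for _, s in c}) for c in clusters]
clustered = [c for c, n in zip(clusters, samples_per_cluster) if n >= 2]
singletons = [c for c, n in zip(clusters, samples_per_cluster) if n == 1]
```

Here the peaks near 10 kb appear in all three samples, those near 50 kb in two, and the peaks at 90 and 120 kb are singletons. Note that greedy chaining can join peaks whose end-to-end spread exceeds 4 kb when several lie close together; a stricter pairwise rule would need a different grouping step.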
We compared our origin predictions from the genomic ssDNA profile in rad53 cells to a recently compiled list of 168 ABOs (array based origins) that are common origins identified in three independent microarray based studies [12] and found that 149 (89%) ABOs overlap with clustered ssDNA peaks within a 10 kb window, with a mean distance of 1.33 kb and a median of 1.07 kb. Out of the remaining 19 ABOs, 10 are located less than 4 kb away from another ABO, and therefore are not resolved by our analysis. The 19 unmatched ABOs are listed in Table S2. Among the remaining 166 clustered ssDNA peaks that do not correspond to an ABO, we estimated that 11 are new origins that do not match either a proARS [13] or a previously described replication origin [5]. We chose three ssDNA peaks for further analysis: one on each of three chromosomes previously well-studied for ARS activity, chromosomes III, VI, and X (at 206, 99, and 161.3 kb, respectively), that did not correspond to any known ARS or proARS (, D & see Supplementary Information, Fig. S1). The ssDNA peak at 99 kb on chromosome VI was defined by a single ORF, which on closer examination proved to contain a highly repetitive gene, PAU5. We initially did not filter out this repetitive sequence due to a change in nomenclature in the most recent version of the Saccharomyces Genome Database (SGD).
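The comparison of a reference origin list with ssDNA peak locations amounts to a nearest-neighbour match with a distance cutoff. The sketch below is illustrative only: it treats "within a 10 kb window" as a maximum distance of 10 kb, which is an assumption, and the function names and coordinates are hypothetical.

```python
from statistics import mean, median


def nearest_distance(query_kb, targets_kb):
    """Distance (kb) from a query position to the closest target position."""
    return min(abs(query_kb - t) for t in targets_kb)


def match_within(queries_kb, targets_kb, max_dist_kb):
    """Map each query that has a target within max_dist_kb to that distance."""
    matched = {}
    for q in queries_kb:
        d = nearest_distance(q, targets_kb)
        if d <= max_dist_kb:
            matched[q] = d
    return matched


# Hypothetical positions (kb) on one chromosome.
known_origins = [10.0, 40.0, 75.0, 200.0]
ssdna_peaks = [11.0, 42.0, 74.5, 150.0]

matched = match_within(known_origins, ssdna_peaks, max_dist_kb=10.0)
mean_dist = mean(matched.values())
median_dist = median(matched.values())
fraction_matched = len(matched) / len(known_origins)
```

Applied to the counts reported above, 149 matched ABOs out of 168 gives 149/168 ≈ 0.887, i.e. 89%, and 315 − 149 = 166 clustered peaks remain without an ABO.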
Figure 4 New origin identification from the ssDNA profiles of rad53 cells. (A) & (D) ssDNA profiles for rad53 cells at 1, 2 and 3 hours (yellow, orange, and red, respectively) post release from alpha factor arrest into S phase in the presence of 200 mM HU ...
We tested all three locations for potential origin activity by two-dimensional gel electrophoresis (chromosomal structures for III and X are shown in and E respectively). As expected, we did not detect bubble intermediates for the region on chromosome VI (data not shown). However, bubble intermediates were readily observed for the target regions on chromosomes III and X in an isogenic strain RM14-3a [5] and in HM14-3a, respectively (, F). We named these newly discovered origins ORI314.5 and ORI1007.5. We were particularly intrigued by the discovery of a new origin on chromosome III, a chromosome that has been exhaustively mapped for ARS activities [14]. We have confirmed that a plasmid bearing a fragment at the ORI314.5 location amplified from HM14-3a transforms yeast at high frequency (A. Hemmaplardh, W. Feng, M. K. Raghuraman, B. Brewer, unpublished results). In contrast, a plasmid bearing the same fragment covering ORI314.5 amplified from the strain described by Poloumienko et al. did not show high frequency transformation (A. Dershowitz and C. Newlon, personal communication and A. Hemmaplardh, W. Feng, M. K. Raghuraman, B. Brewer, unpublished results). These results suggest that the discovery of ORI314.5 may reflect strain polymorphisms. We are currently sub-cloning the fragment at ORI314.5 from the two laboratory strains in order to identify sequence variations. Nevertheless, the discovery of these origins underscores the power of the ssDNA mapping technique to predict origins.
For at least the first 30 minutes after release from alpha factor block in HU, WT and rad53 cells accumulated ssDNA at the same limited set of origins. However, by 1 hour, rad53 cells showed peaks of ssDNA at virtually all known chromosomal origins while WT cells still only showed ssDNA at a subset of origins. This observation permits the classification of origins into two groups whose responses to HU appear to differ: those that can be identified in both WT and rad53 cells (called “Rad53-unchecked origins”); and those that only appear in rad53 cells (called “Rad53-checked origins”). For this comparative analysis we analyzed the 1-hour S phase sample of the WT cells for origin locations and compared them to the 315 significant clusters of ssDNA peaks in rad53 cells from the time course. We did not attempt to analyze the later S phase samples of WT cells since the locations of ssDNA moved significantly from the origins to neighboring regions during the time course. Data smoothing and statistical analysis in ssDNA peak identification for the WT cells were conducted in the same way as for rad53 cells (see Supplementary Information, Normalization, Smoothing, Extrema detection and Identification of significant ssDNA peaks). We identified 113 locations in WT cells at 1 hour that accumulated significant amounts of ssDNA. Out of these 113 locations, 106 match one of the 315 “clustered ssDNA peaks” from rad53 cells within a distance of 4 kb, and thus were defined as Rad53-unchecked origins (see Supplementary Information, Table S1). The 209 remaining ssDNA peak locations from the rad53 cells were defined as the Rad53-checked origins (see Supplementary Information, Table S1). Ninety-six of the 106 Rad53-unchecked origins co-localize to regions previously identified as origins that increase in copy number in WT cells in HU [15] (see Supplementary Information, Fig. S2).
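The two-way classification can be expressed as a simple partition of the rad53 peak list. The sketch below is only an illustration: it classifies from the rad53 side (a rad53 peak is "unchecked" if any WT peak lies within 4 kb), which matches the counts above only when the matches are one-to-one, and the function name and coordinates are hypothetical.

```python
def classify_origins(rad53_peaks_kb, wt_peaks_kb, max_dist_kb=4.0):
    """Split rad53 peak locations into Rad53-unchecked and Rad53-checked.

    unchecked: a WT ssDNA peak lies within max_dist_kb (seen in both strains)
    checked:   no WT peak nearby (seen only in rad53 cells)
    """
    unchecked, checked = [], []
    for peak in rad53_peaks_kb:
        if any(abs(peak - wt) <= max_dist_kb for wt in wt_peaks_kb):
            unchecked.append(peak)
        else:
            checked.append(peak)
    return unchecked, checked


# Hypothetical peak positions (kb) on one chromosome.
rad53_peaks = [10.0, 30.0, 55.0, 80.0, 120.0]
wt_peaks = [11.0, 57.0, 200.0]

unchecked, checked = classify_origins(rad53_peaks, wt_peaks)

# Bookkeeping for the counts reported in the text.
n_rad53_clusters, n_wt_peaks, n_matched = 315, 113, 106
n_checked = n_rad53_clusters - n_matched      # Rad53-checked origins
n_wt_unmatched = n_wt_peaks - n_matched       # WT peaks with no rad53 partner
```

With the reported numbers, 315 − 106 = 209 Rad53-checked origins, and 113 − 106 = 7 WT locations have no rad53 cluster within 4 kb.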
Since rad53 cells fired virtually all origins, we wondered whether the amount of ssDNA at a given origin in rad53 cells is correlated with its replication timing or firing efficiency or both. By addressing this question we hoped to determine the common feature shared by Rad53-checked origins. Because genome-wide systematic studies of origin efficiency have not been done, we compared the magnitude of ssDNA formation with origin efficiency and replication timing (Trep) for just seven origins on chromosome VI [16]. Although there is a positive correlation between the amount of ssDNA formed by 1 hour post release in rad53 cells in HU and replication timing (R = 0.59568), there is a much stronger correlation with origin efficiency (R = 0.89027) (). However, short of a method that allows the determination of origin firing efficiencies on a genomic scale, it is still unclear what chromosomal feature determines whether a given origin is checked by Rad53 in HU. We therefore refrain from assuming that Rad53-checked origins are synonymous with late origins. For example, it has been shown that a small number of origins (6 out of 122) that showed increased copy number in HU-treated WT cells (thus Rad53-unchecked) appeared to be late-firing when replication timing was measured during a synchronized S phase [15].
Origin mapping in Schizosaccharomyces pombe.
The observation that ssDNA formation is restricted to replication origins during an HU-challenged S phase suggests a potential use for our method in mapping origins in other eukaryotes for which there is a RAD53 homologue that can be inactivated by mutation or RNAi-mediated depletion. For example, deletion of the S. pombe cds1 gene, which encodes a homologue of S. cerevisiae Rad53, causes an irreversible cell cycle arrest in HU [17]. Moreover, it has been suggested that Cds1 plays a similar role in origin regulation during HU treatment, i.e., the loss of checkpoint function of Cds1 leads to activation of late origins in the presence of HU [18]. Therefore, we reasoned that the identification of locations of ssDNA formation in WT fission yeast cells should reveal those origins that are capable of firing in the presence of HU and that the same analysis of the Δcds1 cells in HU would reveal all the origin locations.
We synchronized fission yeast cells in either G1 or early S phase via nitrogen starvation or treatment with 12 mM HU, respectively. We then isolated DNA from the nitrogen-starved (G1 control) and HU-arrested (early S phase) cells, differentially labeled the DNA without denaturation with Cy-conjugated dye and co-hybridized it to an S. pombe microarray (Eurogentec). Data analysis was performed as described for budding yeast (Supplementary Information, Normalization, Smoothing, Extrema detection, and Identification of significant ssDNA peaks) except that we smoothed the data over a 12 kb window and employed a less stringent criterion in the determination of significant ssDNA peaks ( legend). We identified 321 significant ssDNA peaks in Δcds1 cells and 241 in WT cells. As shown in and Fig. S3, we detected significant ssDNA peaks at 32 previously identified origins (there are 48 origins included in [19] and references therein). Among the remaining previously described origins, 5 mapped in close proximity to another origin, and thus were detected as a cluster of origins, and 9 were located in the telomeric or centromeric regions whose sequences were not represented on our microarray. The remaining two known origins, pcr1 and pARS727, each showed a distinct small ssDNA peak that did not meet our statistical criteria (). Interestingly, “bubble” structures were previously not observed for pARS727 in its chromosomal location by two-dimensional gel analysis, presumably due to its inefficiency [20]. Out of the 241 ssDNA peaks in WT cells, 196 can be matched to an ssDNA peak in the Δcds1 cells within a 12 kb distance. Our analysis also revealed more ssDNA peaks in Δcds1 cells than in wild type cells, consistent with the hypothesis that Cds1 plays a similar role in origin regulation in the presence of HU.
Figure 5 Single-stranded DNA profiles of S. pombe chromosomes (higher resolution profiles are shown in Fig. S3). Chromosomal coordinates were downloaded from the Sanger Center (sequence version AL137130.1) and from the National Center for Biotechnology ...
We compared our list of 321 significant ssDNA peaks from the Δcds1 cells to 385 A+T-rich islands predicted to be origins of replication (including 48 previously known origins) [19] and found that 71% of ssDNA peaks overlap with these A+T-rich islands within a 12 kb distance, with a mean distance of 2.56 kb and median of 2.22 kb. This finding confirms that the A+T-rich characteristic is an important parameter of S. pombe origins. We also compared the distribution of inter-origin distances in S. pombe to that in S. cerevisiae using ssDNA peak locations mapped in Δcds1 and rad53 cells, respectively. We found that the two distributions are very similar (p = 0.85 in a Student's t-test), with the most prevalent inter-origin distances being approximately 35 kb in both species and the average inter-origin distances being 38.3 and 38.6 kb for S. pombe and S. cerevisiae, respectively ().
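The inter-origin distance distribution can be derived from a list of peak coordinates as sketched below. This is an illustrative example with hypothetical positions and function names; it computes distances between adjacent peaks on a single chromosome and bins them at 10 kb, the bin size used for the histogram.

```python
def inter_origin_distances(origins_kb):
    """Distances (kb) between adjacent origins on one chromosome."""
    ordered = sorted(origins_kb)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def histogram(distances_kb, bin_kb=10.0):
    """Count distances per bin; keys are the lower edge (kb) of each bin."""
    counts = {}
    for d in distances_kb:
        start = int(d // bin_kb) * bin_kb
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical ssDNA peak positions (kb) on one chromosome.
origins = [5.0, 40.0, 78.0, 110.0, 125.0, 170.0]

distances = inter_origin_distances(origins)
bins = histogram(distances)
mean_distance = sum(distances) / len(distances)
most_common_bin = max(bins, key=bins.get)
```

In this toy case the distances are 35, 38, 32, 15 and 45 kb, so the 30 to 40 kb bin is the most populated and the mean is 33 kb. For a whole-genome comparison, distances would be computed per chromosome and then pooled before binning.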
Recently, based on measurement of inter-bubble distances (distances between converging forks) on combed DNA molecules, it has been suggested that origin firing in S. pombe is a stochastic event [21]. Currently our analysis provides no evidence suggesting that origin usage in S. pombe is random. However, the DNA combing method examines single molecules whereas our analysis investigates a population of cells, making it difficult to compare our study with that by Patel et al.
Figure 6 Comparison between the distributions of inter-origin distances in S. cerevisiae (cross-hatched bars) and S. pombe (solid black bars). The bin size for the histogram was set at 10 kb.
In summary, our results demonstrate that the ssDNA mapping technique accurately predicts locations of origins of replication and has wide potential for other eukaryotes, such as humans, where the checkpoint function of Rad53 homologues is conserved and where origin mapping has been difficult. Our ssDNA assay also has the potential to be extended to realms of biology other than DNA replication, such as recombination and repair studies, where ssDNA formation also plays a vital role.