The discovery of the DNA double helix structure half a century ago immediately suggested a mechanism for its duplication by semi-conservative copying of the nucleotide sequence into two DNA daughter strands. Shortly after, a second fundamental step toward the elucidation of the mechanism of DNA replication was taken with the isolation of the first enzyme able to polymerize DNA from a template. In the subsequent years, the basic mechanism of DNA replication and its enzymatic machinery components were elucidated, mostly through genetic approaches and in vitro biochemistry. Most recently, the spatial and temporal organization of the DNA replication process in vivo within the context of chromatin and inside the intact cell is finally beginning to be elucidated. On the one hand, recent advances in genome-wide high-throughput techniques are providing a new wave of information on the progression of genome replication at high spatial resolution. On the other hand, novel super-resolution microscopy techniques are just starting to give us the first glimpses of how DNA replication is organized within the context of single intact cells with high spatial resolution. The integration of these data with time-lapse microscopy analysis will give us the ability to film and dissect the replication of the genome in situ and in real time.
Following the elucidation of the DNA double helix structure by Watson and Crick (Watson and Crick 1953) and the isolation of the first DNA polymerase by Kornberg (Kornberg 1960), a fundamental biological question has been how genomes are duplicated prior to cell division. In the 50 years since these seminal discoveries, the basic mechanisms of DNA replication have been established by a powerful combination of genetics and in vitro biochemistry. From this body of work, it is clear that the fundamental features and components of DNA replication are well conserved throughout evolution from bacteria to mammals. A comprehensive list of proteins that are involved in the process and their activities has been compiled, and it is now possible to provide a relatively detailed description of the DNA replication machinery on a molecular level (Fig. 1) (Perumal et al. 2009). However, genomic DNA in eukaryotic cells is hierarchically packed within the nucleus and genome duplication requires the concerted effort of many thousands of individual replication units. As such, an equally important question is how DNA replication is coordinated in space and time across the entire genome within the living cell.
Every cycling cell needs to duplicate its genome before it divides. This entails not only precisely and completely duplicating its genetic information, which is the focus of this article, but also restoring its epigenetic information to build up the differentiated chromatin structures.
After cells divide, the so-called prereplication complexes including the DNA helicase complex (MCM 2-7) assemble onto the DNA throughout the G1 phase (Blow and Dutta 2005). At the end of G1, input from the cell cycle machinery via the cell-cycle dependent kinases triggers initiation of DNA replication at discrete sites, known as origins of replication, scattered throughout the genome. Each origin fires once per cell cycle and their spacing must ensure that the entire genome is replicated during the S phase. Early results from cell fusion experiments indicated that replicated DNA differed from unreplicated DNA because it was not permissive for replication unless it passed through one mitotic division (Rao and Johnson 1970). Subsequent experiments using cell-free replication assays further refined this idea and a model was proposed whereby replication origins were “licensed” to replicate at late mitosis and G1, and the “license” was removed as the DNA was replicated in S phase (Blow and Laskey 1988). Detailed biochemical evidence is now available and indicates that preventing rebinding of the MCM complex to DNA is the key to avoid rereplication (Blow and Dutta 2005).
In bacteria and lower eukaryotes, replication origins are defined by specific DNA sequences; whereas in metazoans the defining characteristics of these sites remain less clear (Robinson and Bell 2005; Aladjem 2007; Hamlin et al. 2008). At each origin, replication proteins are assembled, which then duplicate one segment (replicon) of the genome in a processive manner (Jacob and Brenner 1963; Jacob 1993). The active units of DNA replication, consisting of the replication machinery (also called replisome), the existing DNA template, and the nascent DNA strand, are referred to as replication forks. Following initiation of DNA synthesis, replication forks proceed bidirectionally from the origin, unwinding the genomic DNA as they traverse the chromosome (Huberman and Riggs 1968).
At the molecular level, replication of a genomic DNA template involves at least two different DNA polymerases, a DNA primase, a DNA helicase, a single strand DNA binding (SSB) protein complex (replication protein A, RPA), DNA topoisomerases, a clamp loading complex (replication factor C, RFC), and a DNA polymerase clamp or processivity factor (proliferating cell nuclear antigen, PCNA). Furthermore, the inherent polarity of the DNA synthesis reaction (5′ to 3′ direction) necessitates discontinuous duplication of the lagging strand in addition to the continuous leading-strand synthesis. This process involves the synthesis of short (180–200 bp in eukaryotic cells) DNA fragments known as Okazaki fragments (Okazaki et al. 1968) that are subsequently processed and ligated together by a number of additional enzymes, including the flap endonucleases (FEN-1) and DNA ligase I (Hubscher and Seo 2001). To explain how the synthesis of both strands is coordinated, an asymmetric dimer of DNA polymerases and associated factors has been proposed first for prokaryotic DNA replication (McHenry 1988) and subsequently extended for eukaryotes (Tsurimoto and Stillman 1989). In order that the dimeric polymerases can extend the two antiparallel strands at the same rate and in the same direction, looping back of the lagging strand into the replisome (“trombone” model) has been postulated, with recycling of the lagging-strand polymerase from the end of one Okazaki fragment to the next RNA primer forming a priming loop (Sinha et al. 1980; Pandey et al. 2009). This situation suggests that the replication machinery at a given replication fork likely consists of at least two functional (sub)modules, one responsible for leading-strand synthesis, and the other for lagging-strand synthesis (Fig. 1). Biochemical evidence points to the existence of large preformed multiprotein replication complexes that contain all activities (Noguchi et al. 1983; Tom et al. 1996). 
However, evidence from live-cell microscopy analysis points to short time interactions between the individual components, suggesting highly dynamic complexes (Sporbert et al. 2002; Sporbert et al. 2005; Schermelleh et al. 2007; Gorisch et al. 2008).
Investigating how the distinct activities required for DNA synthesis are organized within the cell nucleus relates to the larger issue of understanding how DNA replication is regulated on a cellular level. A cell must duplicate its entire genome once and only once every time it divides. Therefore, in the broadest sense, investigating the global regulation of DNA replication can be distilled into the question of how the activity of individual replication units is coordinated throughout a cell cycle. This rather daunting problem can be broken down into more specific questions. How is replication propagated along the chromosomes? How does the cell ensure that the whole genome gets replicated? Where along the chromosome does replication begin? These questions were originally addressed in bacteria, where the replication program proceeds in a rather straightforward fashion with genetically well defined replication origins, which fire once per cell cycle (Mott and Berger 2007).
However, the much larger size and complexity of eukaryotic genomes impose additional difficulties on the organization of DNA replication. Hence, eukaryotic DNA replication represents a more complex situation that remains poorly understood. The confusion centers around two seemingly contradictory observations. First, a number of studies have clearly shown that DNA replication follows a defined, temporal progression (Sparvoli et al. 1994; Cardoso et al. 1997; Jackson and Pombo 1998; Ma et al. 1998; Dimitrova and Gilbert 2000; Leonhardt et al. 2000; Sadoni et al. 2004; Easwaran et al. 2005). Actively transcribed, euchromatic regions of the genome tend to be duplicated early in S phase, whereas heterochromatin, which is more condensed and often transcriptionally silent, replicates late in S phase (Fig. 2 and Movie 1 online at http://cshperspectives.cshlp.org/). This phenomenon was originally described by observing replication along Giemsa stained chromosomes and correlating DNA synthesis with banding patterns (Drouin et al. 1990). Second, eukaryotic replication origins fire in a stochastic fashion throughout S phase (Dijkwel et al. 2002; Patel et al. 2006). Therefore, the distribution of active origins, and thus replication initiation, changes between each cell cycle. Given the random nature of replication origin firing, it is hard to understand how a cell can maintain the temporal progression of replication. Reconciling these data acquired at very different spatial resolution levels (from hundreds of base pairs in stretched DNA fibers and by two-dimensional gel electrophoresis analysis to megabase chromatin domains in whole cells in situ) presents a significant hurdle to our understanding of how replication proceeds in eukaryotic cells.
It should be noted that the budding yeast Saccharomyces cerevisiae represents an exception to the standard eukaryotic strategy for genome duplication. Similar to bacteria, S. cerevisiae possesses well-defined replication origin sequences that can fire at a very efficient rate during S phase, leading to a very homogeneous pattern of DNA replication (Fangman and Brewer 1991; Gilbert 2001). Furthermore, genome-wide analysis of replication initiation indicates no bias for sites of active transcription and no observable delay for any distinct regions of the genome (Raghuraman et al. 2001). For these reasons, it is perhaps misleading to generalize conclusions obtained with this system when contemplating eukaryotic DNA replication timing. Therefore, this article focuses on how DNA replication is regulated within the nucleus of metazoan systems.
The recent development of genomic assays has for the first time permitted a genome-wide examination of replication timing in populations of eukaryotic cells. For example, microarray analysis has been successfully used in a variety of systems to generate genome-wide profiles of replication timing. First performed in S. cerevisiae, and more recently in S. pombe, Drosophila melanogaster, and Homo sapiens cells (Raghuraman et al. 2001; Schubeler et al. 2002; Watanabe et al. 2002; White et al. 2004; Woodfine et al. 2004; Jeon et al. 2005; Eshaghi et al. 2007; Hiratani et al. 2008; Watanabe et al. 2008), this technique usually requires the synchronization of a population of cells at the G1/S boundary or their flow cytometric sorting based on increasing DNA content through S phase. Following release into S phase, the accumulation of newly synthesized DNA over time is measured by hybridization to DNA arrays. Alternatively, isolated nascent DNA can be directly sequenced using novel high-throughput deep sequencing techniques. A summary of these studies is compiled in Table 1.
DNA combing is another technique that has recently been used to systematically investigate replication initiation and elongation at the level of single DNA fibers. Long, individual DNA molecules are stripped of proteins, uniformly stretched across a glass surface, and examined by standard fluorescence microscopy (Bensimon et al. 1994). By pulse-labeling cells with nucleotides prior to this treatment, it is possible to directly examine sites of DNA synthesis (Pasero et al. 2002; Anglana et al. 2003). Originally used with radioactively labeled nucleotides to accurately measure the rate of replication fork progression (Huberman and Riggs 1968), more recent studies have used this approach to compare the efficiency of origin firing between early and late replicating regions (Patel et al. 2006).
The application of these methods has, for the first time, permitted the examination of how DNA replication proceeds at the genomic level. The availability of several complete genome sequences has further allowed in silico evaluation of the correlation of DNA replication with structural and functional genomic features. For example, whereas a relationship between transcriptional activity and replication timing had been suggested by earlier reports analyzing single genes (Gilbert 2002), microarray studies investigating the timing of DNA replication along D. melanogaster and human chromosomes convincingly linked these processes on a genome-wide level (Schubeler et al. 2002; Woodfine et al. 2004; Jeon et al. 2005). Specifically, early replicating regions displayed a strong correlation with gene rich areas that possessed a high GC content and contained actively transcribed genes. In fact, a high-resolution human genome analysis revealed that boundaries between regions with different GC content (the so-called isochores) correlated with borders between DNA replication timing zones (Costantini and Bernardi 2008). In addition, recent studies also showed that a substantial portion of the genome (approx. 60%) does not replicate until much later in S phase (Eshaghi et al. 2007).
The dynamics of origin firing have also been examined by genome-wide analysis. Consistent with earlier reports, this process was found to be stochastic, leading to a random distribution of replication initiation sites across the chromosomes (Patel et al. 2006). It was also shown that early S phase origins are quite inefficient at initiating replication. In contrast, the relative firing efficiency of late S phase origins, although still occurring in a random manner, was significantly higher (Eshaghi et al. 2007). This could reflect the fact that less DNA is “licensed” to replicate at later S phase stages, leading to a seemingly higher firing efficiency. Alternatively, assuming that a factor limiting replication initiation is recycled (firing propensity redistribution) has also allowed reasonable modeling of S phase based on stochastic firing of individual origins (Lygeros et al. 2008). Cyclin-dependent kinases were suggested as the factors limiting initiation. In this view, these factors are recycled to late-firing origins, thus increasing the probability of their activation.
Together, these results have led to the “increasing efficiency model,” which has the potential to explain many of the questions surrounding the regulation of DNA replication. The idea centers around the hypothesis that as a cell progresses through S phase, the overall efficiency with which the remaining fraction of available origins initiates replication increases (Lucas et al. 2000; Hyrien et al. 2003). An appealing aspect of this model is that it accounts for how cells can avoid the problem of gaps in DNA replication. Furthermore, by assuming that different regions of the genome possess variable efficiencies of origin firing, it can also explain how a cell could maintain stochastic firing of replication origins and still replicate its DNA in an ordered fashion (Rhind 2006).
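The combination of stochastic firing and ordered timing can be illustrated with a toy simulation. The sketch below is not taken from any of the cited studies: the function name `simulate_origin_firing`, the parameters `base_efficiencies`, `growth` and `seed`, and the geometric increase in firing probability are all assumptions made for illustration, and passive replication of origins by incoming forks is ignored.

```python
import random


def simulate_origin_firing(base_efficiencies, growth=1.5, seed=0):
    """Toy model: each origin fires at random, with a probability that rises each step.

    base_efficiencies: per-origin firing probability at the first step (hypothetical values).
    growth: assumed factor by which every remaining origin's probability
            is multiplied at each later step; must be > 1.
    Returns the step at which each origin fired.
    """
    if growth <= 1 or any(p <= 0 for p in base_efficiencies):
        raise ValueError("growth must be > 1 and all efficiencies > 0")
    rng = random.Random(seed)
    fired_at = [None] * len(base_efficiencies)
    step = 0
    while any(t is None for t in fired_at):
        scale = growth ** step  # efficiency of remaining origins increases over S phase
        for i, p in enumerate(base_efficiencies):
            if fired_at[i] is None and rng.random() < min(1.0, p * scale):
                fired_at[i] = step
        step += 1
    return fired_at
```

Giving one group of origins a high base efficiency and another a low one (for example `[0.2] * 200 + [0.01] * 200`) yields firing times that differ from run to run for any single origin, while the high-efficiency group still fires earlier on average. Because the probability eventually reaches 1 for every remaining origin, no origin is left unfired in this sketch.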
In general, this proposal provides a satisfactory explanation for how replication timing is modulated in eukaryotic nuclei. However, two very important questions remain. First, how does the firing efficiency of replication origins increase over the course of S phase? Second, what determines the inherent replication efficiency of a given genomic region? Regarding the first question, one model centers on the concept of polymerase recycling, whereby a fixed number of DNA polymerase complexes are available to the cell. At the onset of S phase, only the most efficient and/or accessible origins have a chance of initiating replication. As the genome becomes duplicated, the number of potential origins decreases, thereby increasing the probability that they will fire. The answer to the second question remains less clear, but alterations in chromatin structure are likely to play a central role.
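The arithmetic behind the recycling argument can be made explicit with a minimal sketch. It assumes, purely for illustration, that every complex in a fixed pool initiates at one randomly chosen remaining origin per time step and is then immediately available again; the function name `firing_probability_per_step` and the parameters `n_origins` and `pool_size` are hypothetical, and fork elongation and passive replication are ignored.

```python
def firing_probability_per_step(n_origins, pool_size):
    """Per-origin firing probability at each step for a fixed, recycled pool.

    At every step the pool is spread over the origins that have not yet
    fired, so each remaining origin fires with probability
    pool_size / remaining (capped at 1).
    """
    if n_origins <= 0 or pool_size <= 0:
        raise ValueError("n_origins and pool_size must be positive")
    remaining = n_origins
    probabilities = []
    while remaining > 0:
        probabilities.append(min(1.0, pool_size / remaining))
        remaining -= min(pool_size, remaining)  # fired origins leave the candidate set
    return probabilities
```

With 1000 origins and a pool of 50, the per-origin probability starts at 50/1000 and climbs at every step until the last 50 origins fire with certainty. The pool never changes; only the shrinking number of candidate origins raises the apparent efficiency.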
The likelihood of a connection between chromatin structure and replication timing has been well established (Donaldson 2005). This discussion is most often framed by the correlation between the open chromatin structure present at transcriptionally active regions and the fact that these sites are often replicated early in S phase. In the simplest view, a relaxed chromatin structure results in more accessible genomic DNA, which leads to more efficient binding by both transcription and replication factors. There has been some discussion as to whether replication or transcription plays a causal role in this relationship (i.e., whether early replicating sites “define” regions of transcription or vice versa) but no conclusive results have been obtained. One shortcoming of the genomic methods described earlier is that whereas replication timing profiles certainly reflect the influence of chromatin structure, it is difficult to backtrack and directly examine the nature of this chromatin. This is due in part to the ensemble/pooled nature of genomic analysis, as well as the fact that neither assay permits a particularly detailed examination of nuclear structure.
Chromosomes and chromosomal domains are nonrandomly organized within eukaryotic nuclei and their topology is thought to have functional significance (Cremer and Cremer 2001; Kumaran et al. 2008; Takizawa et al. 2008). Microscopic inspection of nuclei is a powerful approach to investigate the spatio-temporal organization of replication within the context of nuclear architecture. Ongoing DNA synthesis provides a way to directly detect the nuclear sites of DNA replication after introducing labeled nucleotides into the cells. Initially, radioactive thymidine was used (Milner 1969) and later, with the development of antibodies specifically detecting halogenated thymidine analogs (Gratzner 1982; Aten et al. 1992), immunofluorescence analysis of nuclear replication structures became reality (Nakamura et al. 1986; Nakayasu and Berezney 1989; O’Keefe et al. 1992). More recently, live-cell microscopy analysis of replication progression was made possible by introducing fluorescently conjugated nucleotides (Schermelleh et al. 2001) or by expression of fluorescent replication factors (Cardoso et al. 1997; Leonhardt et al. 2000).
With these approaches, DNA replication was found to occur at subnuclear sites called replication foci, which accumulate numerous DNA replication factors and cell cycle proteins (Cardoso et al. 1993; Cardoso et al. 1997). These foci show distinct patterns of localization over the course of S phase; as such, studying their composition and dynamics enables an examination of how DNA replication is regulated on a cellular level. Time-lapse microscopy of living mammalian cells over the course of an entire cell cycle (Fig. 2 and Movie 1 online at http://cshperspectives.cshlp.org/) has shown that early in S phase, immediately following the onset of DNA replication, a multitude of small replication foci are distributed throughout the nucleus. During mid S phase, the replication foci are uniformly larger, and are distributed around the periphery of the nucleoli and nuclear envelope. Finally, at the end of S phase, the sites of replication have been consolidated into a small number of very large foci. In addition to reflecting the stage of S phase progression, the pattern of replication foci also correlates with the nature and topology of the chromatin that is being replicated. For example, early replication patterns represent actively transcribed euchromatin, whereas late replication patterns are associated with heterochromatic regions. Thus, this type of microscopic analysis permits the simultaneous visualization of both replication dynamics and chromatin structure on a single-cell basis.
The role of chromatin modifications and structural rearrangements in replication organization is yet to be established. Some experimental evidence suggests a role of chromatin remodeling and assembly factors in facilitating replication through heterochromatin domains (Collins et al. 2002; Quivy et al. 2008). However, the impact of histone modifications is less clear. For example, disruption of histone H3 lysine 9 trimethylation, the typical epigenetic mark for heterochromatin, does not significantly affect the late replication of mouse chromocenters (Wu et al. 2006). Other histone modifications, such as phosphorylation of the linker histone H1 by Cdk2, have been proposed to play a role in the large-scale decondensation of chromatin associated with replication (Alexandrow and Hamlin 2005).
Examining the dynamics of the replication machinery during S phase has in addition provided a detailed view of how DNA replication proceeds on a cellular level. The combination of time-lapse microscopy with fluorescence photobleaching/activation indicated that the processivity of the replication machinery is built on transient interactions of various replication enzymes with a stable core consisting of the processivity factor PCNA (Sporbert et al. 2002; Sporbert et al. 2005; Schermelleh et al. 2007; Gorisch et al. 2008). The latter stayed associated with the DNA and upon photobleaching no recovery was measured for periods of over 10 minutes. When assembly of PCNA was measured, it was found to occur at sites adjacent to the ones previously labeled. This indicated that once replication is completed at a given site, a new replication focus assembles de novo at a neighboring site (Sporbert et al. 2002; Sadoni et al. 2004). Consistent data has been reported using double nucleotide pulse-chase-pulse experiments, whereby sequentially replicated DNA is labeled by two consecutive pulses of modified nucleotides (Manders et al. 1996; Jackson and Pombo 1998). Thus, replication of a specific genomic region facilitates subsequent loading of new replication factors at neighboring sites. One interpretation of this result is that the act of replication induces local changes in chromatin condensation, which in turn promotes the access/recruitment of replication factors and the initiation of additional replication cycles. Replication would begin at sites with an “open” chromatin conformation, similar to what has previously been proposed. As a result of these initial replication events, the chromatin at adjacent regions, which normally would not support replication initiation, would begin to decondense, leading to an increased probability that origins present at these sites would fire. 
It is tempting to suggest that the activity of replication helicases such as the MCM proteins would promote local chromatin decondensation. However, it is equally feasible that the mere act of DNA polymerization is sufficient to induce such changes. This propagating chromatin fiber decondensation can be visualized as analogous to pulling on a shoelace.
On a genome-wide scale, this model, which we refer to as the domino model, leads to a simple, self-propagating mode by which the entire chromosomes become fully duplicated by simply spreading the replication process using the “nearest neighbor” principle (Fig. 3). This hypothesis also dovetails nicely with the increasing efficiency model, in that it provides a physical basis for why origin efficiency would increase over the course of S phase. The increased efficiency of late origins is not because of polymerase recycling; rather it can be accounted for by the fact that as a cell progresses through S phase, it becomes more and more likely that any given region of the genome is proximal to a site of DNA replication. Therefore, even origins contained in hyper-condensed, late replicating regions such as the heterochromatin become accessible as more and more of the surrounding chromatin undergoes replication.
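The “nearest neighbor” principle can be sketched as a one-dimensional toy model. The code below is an illustration under strong simplifying assumptions, not the authors' implementation: the chromosome is a line of equally sized sites, replication spreads by exactly one site per time step in each direction, and the names `domino_replication_times`, `fraction_proximal`, `n_sites` and `early_sites` are hypothetical.

```python
def domino_replication_times(n_sites, early_sites):
    """Spread replication from early sites to nearest neighbors, one step at a time.

    Returns, for each site, the step at which it is replicated
    (None if it is never reached, e.g. when early_sites is empty).
    """
    times = [None] * n_sites
    frontier = sorted(set(early_sites))
    for site in frontier:
        times[site] = 0  # "open" chromatin replicates first
    step = 0
    while frontier:
        step += 1
        next_frontier = []
        for site in frontier:
            for neighbor in (site - 1, site + 1):
                if 0 <= neighbor < n_sites and times[neighbor] is None:
                    times[neighbor] = step  # neighbor becomes accessible and fires
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return times


def fraction_proximal(times, step):
    """Fraction of still-unreplicated sites that border replicated chromatin after `step`."""
    unreplicated = [t for t in times if t is not None and t > step]
    if not unreplicated:
        return 0.0
    return sum(1 for t in unreplicated if t == step + 1) / len(unreplicated)
```

For a seven-site chromosome with a single early site in the middle, replication time grows with distance from that site, and the fraction of unreplicated sites lying next to replicated chromatin rises from one third to one half to all of them. In this toy setting, that rise is the sense in which any remaining region becomes increasingly likely to be proximal to a site of replication.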
Despite significant advances in the characterization of the process of DNA replication, several basic questions remain unanswered. For example, it is still not fully evident how propagation of DNA replication over chromatin is coordinated with other nuclear processes. In other words, how is duplication of the (epi)genome once and only once per cell cycle achieved with high precision in a highly variable environment including parallel transcription, repair, and other DNA metabolic activities, and on such a differentiated template as chromatin? How are chromatin epistates maintained at every cell cycle? Is transcriptional activity or chromatin structure determining replication timing or are they rather determined by the time they are replicated during S phase?
The interdependency of all these nuclear processes is also influenced by the fact that multiple molecular components acting on DNA metabolism are shared. On the one hand, the fact that many DNA replication factors are shared with DNA repair pathways (as are many factors between repair and transcription) is far more economical for the cell when dealing with similar tasks. On the other hand, this can be a dangerous strategy, as mutations in single factors will have pleiotropic effects and factors can become exhausted and rate limiting.
Finally, understanding the DNA replication process will require the ability to connect data from genomic studies with data from single cells in a unified coherent model. Closing this (temporal and spatial) gap is becoming reality with the advent of novel super-resolution nanoscopy techniques. In fact, recent analysis of DNA replication in intact cells provided much increased resolution and consequently much higher numbers of replication foci throughout S phase (Baddeley et al. 2009). In the near future, a major goal of the field will be to visualize and to characterize in full detail single replicons in intact cells, which were previously only identified in stretched DNA fibers (Fig. 4). The integration of these data with time lapse analysis will give us the ability to film and dissect the replication of the genome in situ and in real time.
We are indebted to Robert M. Martin for the artwork in Figure 1 and to Corella Casas Delucchi for the DNA fibers in Figure 4. We also thank all the past and present members of our laboratory for their many contributions over the years. Last but not least, we thank our many collaborators, who have made our work possible and enjoyable. Our research has been supported by grants from the Deutsche Forschungsgemeinschaft and the Volkswagen Foundation. We regret that because of space constraints, we had to eliminate many important and relevant citations.
Editors: David Spector and Tom Misteli
Additional Perspectives on The Nucleus available at www.cshperspectives.org