This article considers the fidelity of DNA replication performed by eukaryotic DNA polymerases involved in replicating the nuclear genome. DNA replication fidelity can vary widely depending on the DNA polymerase, the composition of the error, the flanking sequence, the presence of DNA damage and the ability to correct errors. As a consequence, defects in processes that determine DNA replication fidelity can confer strong mutator phenotypes whose specificity can help determine the molecular nature of the defect.
The fidelity with which genetic information is replicated depends on the ability of DNA polymerases to select correct nucleotides rather than incorrect or damaged nucleotides for incorporation, without adding or deleting nucleotides. Polymerase selectivity is a major determinant of fidelity both at the replication fork and during synthesis to repair DNA damage. Mismatches generated during DNA chain elongation can be removed by 3′-exonuclease activity of the major replicative polymerases, thereby enhancing fidelity. If a mismatch escapes proofreading, DNA mismatch repair (MMR) can excise the replication error in the nascent strand and replace it with the correct sequence. Under normal circumstances, nucleotide selectivity, proofreading and MMR operate in series to replicate the genome with very high fidelity, thereby contributing to low spontaneous mutation rates (Fig. 1). Consequently, defects in any of these three processes can increase mutation rates, and this mutator phenotype can potentially be a driving force for cancer.
When lesions in DNA generated by endogenous cellular metabolism or exposure to chemical or physical insults from the external environment are not removed prior to replication, they sometimes distort the DNA helix and impede replication fork progression. In such circumstances, cell survival can be enhanced by several different DNA transactions. One such transaction is translesion synthesis (TLS), a process that allows lesions to be tolerated until they can be repaired. TLS is catalyzed by highly specialized, exonuclease-deficient DNA polymerases whose catalytic properties, including fidelity, are quite different from those of the polymerases that perform the bulk of replication. This review focuses on the eukaryotic replicative and TLS DNA polymerases that have critical roles in accurately and efficiently replicating the nuclear genome. Readers interested in DNA synthesis catalyzed in mitochondria, or in additional DNA polymerases that function in DNA repair, can consult other reviews [1–4]. Emphasis here is on the fidelity of replication, how it can be reduced to result in mutator phenotypes associated with cancer, and differences in replication error specificity that may have diagnostic value when considering replication infidelity as a source of point mutations arising in tumor cells.
DNA polymerases are classified by sequence homology into seven families (A, B, C, D, X, Y and RT). Most organisms encode multiple polymerases, often including several members of the same family. The human genome encodes 16 DNA polymerases, and more than half are involved in replicating the nuclear genome (Table 1). Four (Pols α, δ, ε and ζ) are B family members, four (Pols η, κ, ι and Rev1) are Y family members; two (Pols θ and ν) are A family members. These enzymes can have multiple functions, and some uncertainty remains as to exactly where and when they operate in vivo. Nonetheless, their primary functions are currently thought to be in replication and TLS (Table 1).
Pols α, δ and ε perform the vast majority of nuclear DNA replication. Several models have been put forth for their division of labor at the nuclear DNA replication fork [5,6]. These models all share the idea that initiation of replication at origins and of each Okazaki fragment on the lagging strand template requires the RNA primase activity associated with Pol α to synthesize RNA primers of ~10 bases (Fig. 2A). Pol α then extends these primers by incorporating about 20–30 deoxynucleotides. Once DNA chains are initiated, current evidence [5,7] suggests that the leading strand template may be copied primarily by Pol ε, a highly processive enzyme with an associated proofreading exonuclease activity. Pol δ copies primarily the lagging strand template, generating a series of Okazaki fragments of approximately 200–300 base pairs each that are ultimately processed to remove RNA primers and permit ligation.
Although the above situation may apply to replicating undamaged DNA, enzymology becomes more complicated when the fork encounters lesions that distort the DNA helix (Fig. 2B). Family B members like Pols α, δ and ε depend heavily on normal helix geometry for efficient and accurate synthesis [1,9], such that helix-distorting lesions can slow or prevent replication fork progression. To solve this problem, a stalled replication fork can elicit one or more switches among DNA polymerases capable of bypassing lesions. The specialized TLS polymerases include Pol ζ (B family), Pols η, κ, ι and Rev1 (Y family) and Pols θ and ν (A family). These polymerases can incorporate nucleotides opposite lesions and/or extend the resulting primer terminus for one or more nucleotides, thus creating primer-templates that can be used by the major replication enzymes [10,11]. The number of switches involved in TLS, and the identity and number of polymerases required, can vary depending on the lesion, of which there are many types that differ by composition and structure. The number of switches also depends on the substrate preferences of the various polymerases, which also vary widely and can partially overlap. The location and timing of TLS may vary, sometimes occurring at the fork during ongoing DNA replication, sometimes occurring during post-replication gap-filling synthesis (e.g., catalyzed by Pol ζ), or perhaps occurring during excision repair (e.g., Pol κ). In fact, eukaryotic DNA polymerases other than those listed in Table 1 also have been implicated in TLS, including two family X members, Pol β and Pol λ, that fill short gaps during base excision repair and non-homologous end joining of double strand breaks in DNA. Pol β has been shown to be involved in bypass of an abasic site and a d(GpG)-cisplatin adduct [14,15], and PCNA was found to efficiently stimulate the bypass of an abasic site by Pol λ [2,16,17].
Readers interested in the substrate specificities of the TLS polymerases and mechanisms of polymerase switching involving post-translational modifications of polymerases and their accessory proteins can consult any of several recent reviews on this topic [4,18,19].
The potential of replication infidelity to drive cancer via a mutator phenotype may depend heavily on the enzymatic source and the specificity of DNA biosynthetic errors. Studies of DNA synthesis fidelity in vitro reveal wide variations in error rates for the two main types of errors that DNA polymerases generate, single base pair substitutions and single base deletions (Fig. 3). These rates reflect nucleotide selectivity, which prevents errors from forming, and also exonucleolytic proofreading, which corrects mismatches during ongoing replication.
Most DNA polymerases lack intrinsic 3′-exonuclease activity to excise errors. Therefore, their fidelity depends on the ability to prevent incorporation of incorrect dNTPs that lead to base substitutions, and to prevent incorporation involving misaligned substrates that leads to loss or addition of nucleotides. Amazingly, DNA polymerase selectivity can vary over a million-fold range (Fig. 1).
Pols α, δ and ε nearly always insert correct dNTPs onto properly aligned primer-templates. This is illustrated by the low single base substitution and deletion error rates of Pols α, δ and ε. These error rates are for a polymerase that naturally lacks proofreading activity (Pol α) and for polymerases where their intrinsic proofreading exonucleases are intentionally inactivated (Pols δ and ε) (Fig. 3A). This high nucleotide selectivity reflects the fact that the nascent base pair binding pocket of accurate DNA polymerases, whose assembly is induced by dNTP binding, snugly accommodates base pairs with correct Watson–Crick geometry. The induced fit mechanism establishes the active site geometry needed for rapid phosphodiester bond formation. Deviations from correct geometry resulting from incorrect dNTP binding or primer-terminal mismatches strongly reduce the probability of incorporation.
Geometric distortions can also result from strand misalignments that generate one or more unpaired bases, either in the primer strand (leading to additions) or in the template (leading to deletions). Ideas to account for how these misalignments initiate and are stabilized for continued synthesis include classical DNA strand slippage proposed by Streisinger, misinsertion followed by primer relocation, and misalignment of a nucleotide at the active site. Extensive biochemical and structural support for each of these models exists [6,21]. Typically, single base deletion error rates are substantially higher than single base addition error rates. It is also worth noting that the base substitution and deletion error rates in Fig. 3 are “average” values, with wide variations in rates observed among the 12 possible single base–base mismatches, the composition of deletion and insertion mismatches, and the local sequences flanking these mismatches. For example, while the average base substitution error rate of Pol α is about 10⁻⁴, the rate at which it forms C–C mismatches can be much lower, e.g., 10⁻⁶. Other examples of such variability are considered below.
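The count of 12 single base–base mismatches follows from pairing each of the four template bases with the three incoming nucleotides that are not its Watson–Crick partner. The short Python sketch below simply enumerates them; the function name is illustrative and not taken from the article.

```python
from itertools import product

BASES = "ACGT"
# Watson-Crick partner of each template base
CORRECT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def single_base_mismatches():
    """Return (template base, incoming base) pairs that are not Watson-Crick pairs."""
    return [(t, n) for t, n in product(BASES, BASES) if CORRECT[t] != n]


mismatches = single_base_mismatches()
print(len(mismatches))  # number of distinct template-incoming mismatches
print(mismatches)
```

Each of these mismatches can have its own error rate, which is why a single "average" base substitution rate can hide large differences such as the C–C example above.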
The TLS polymerases lack proofreading activity and have lower nucleotide selectivity than the major replicative polymerases, as indicated by their higher error rates for base substitutions and deletions (Table 1 and Fig. 3B). The extreme case is Pol ι. This remarkable enzyme actually inserts dGTP more often than the correct dATP opposite template T, i.e., the error rate for this event approaches one ( and references therein). Structural studies suggest that the generally low fidelity of Y family members is partly due to relaxed geometric selectivity in the nascent base pair binding pocket, which is more open and solvent-accessible than those of more accurate DNA polymerases. For example, recent work shows that Pol η can accommodate two nucleotides in the active site [24,25]. Pols η, κ, θ and ν are not only error-prone for base substitutions, but also for deletions (Table 1 and Fig. 3B). While not the focus of this review, elegant work on Y family polymerases also has been performed using bacterial TLS polymerases, whose functions and properties are reviewed elsewhere [26–28].
The B-family member Pol ζ also has critical roles in TLS. When copying undamaged DNA, Pol ζ has somewhat higher fidelity than the Y-family polymerases, but lower fidelity than the other B-family members (Table 1). The ability of Pol ζ to generate single base mismatches at relatively high rates is consistent with its known contribution to spontaneous mutagenesis, as well as mutagenesis induced by a variety of DNA damaging agents [29,30]. Pol ζ’s high base substitution error rate clearly demonstrates that it has low nucleotide selectivity, consistent with a possible direct role in mutagenic misinsertion of dNTPs in vivo. Also relevant are kinetic studies demonstrating that Pol ζ efficiently extends terminal mismatches. This is true for undamaged DNA as well as for extending damaged termini, the latter being consistent with a role for Pol ζ in the extension step of TLS in a 2-polymerase model (Fig. 2B). A similar role also has been proposed for Pol κ, which, like Pol ζ, is promiscuous for mismatch extension. Numerous reports describe the TLS ability and fidelity of various Y-family (and other) polymerases when encountering a wide range of structurally diverse lesions. These studies [11,23,32–34] show that for a given lesion, bypass efficiency and fidelity are highly polymerase specific.
Once an incorrect dNTP is incorporated into DNA, the mismatched primer terminus is more difficult to extend than a correctly paired and properly aligned primer terminus. The delay in extension caused by a mismatch allows the primer terminus to fray and travel (switch) from the polymerase to the exonuclease active site for excision of the error.
Among the many eukaryotic DNA polymerases, only the three polymerases responsible for the bulk of chain elongation during replication (Pols δ, ε and the mitochondrial replicase Pol γ) contain intrinsic 3′-exonucleolytic proofreading activity. The contribution of proofreading to base substitution fidelity is illustrated by the high fidelity of the exonuclease-proficient Pols δ and ε compared to their exonuclease-deficient derivatives (Fig. 3A). A variety of studies in vitro indicates that proofreading improves replication fidelity by factors ranging from a few-fold to more than 100-fold, depending on the mismatch, the sequence context and the polymerase [1,35]. Proofreading corrects mismatches at the terminus itself, and it also corrects more internal insertion and deletion mismatches resulting from misalignments.
Given a base substitution error rate of ~10⁻⁴ (Fig. 3A) and its role in initiating Okazaki fragments, Pol α may generate many thousands of mismatches during each replication cycle. This leads to the question of whether a separate exonuclease may edit Pol α’s errors. This possibility, which we refer to as extrinsic proofreading, has been examined both biochemically and genetically. The results suggest that the 3′-exonuclease of Pol δ may indeed proofread errors generated by Pol α during initiation of Okazaki fragments. Extrinsic proofreading could be relevant to other DNA transactions that involve exonuclease-deficient polymerases, such as base excision repair and TLS.
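The "many thousands of mismatches" estimate can be reproduced with back-of-the-envelope arithmetic. The sketch below uses the figures quoted in this article (Okazaki fragments of roughly 200–300 bp, about 20–30 deoxynucleotides added by Pol α per fragment, and an error rate of about 10⁻⁴). The haploid human genome size of about 3.1 × 10⁹ bp and the simplification that lagging-strand synthesis totals about one genome length of nucleotides are assumptions added here, not values from the article.

```python
def pol_alpha_mismatches(genome_bp, okazaki_len=250, pol_alpha_dna_len=25,
                         error_rate=1e-4):
    """Rough count of Pol alpha base substitution errors per replication cycle.

    Assumes (simplification) that the lagging-strand product totals about
    genome_bp nucleotides and that every Okazaki fragment starts with
    pol_alpha_dna_len nucleotides of Pol alpha-synthesized DNA.
    """
    fraction_by_pol_alpha = pol_alpha_dna_len / okazaki_len
    nucleotides_by_pol_alpha = genome_bp * fraction_by_pol_alpha
    return nucleotides_by_pol_alpha * error_rate


# Assumed haploid human genome size; midpoints of the ranges given in the text.
estimate = pol_alpha_mismatches(3.1e9)
print(f"{estimate:.0f} mismatches before any correction")
```

Under these assumptions the estimate is in the tens of thousands, which is in line with the statement above. It ignores any Pol α-synthesized DNA that is removed during Okazaki fragment processing, so it should be read as an upper-end illustration rather than a measurement.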
Replication errors that escape proofreading are corrected by DNA mismatch repair (MMR). Eukaryotic MMR (reviewed in [39–41]) requires recognition of mismatches by MutS proteins, followed by binding of MutL proteins. Both these proteins bind and hydrolyze ATP to undergo conformational changes that help coordinate the multiple protein partnerships and reactions needed to find the strand-discrimination signal (still unknown in eukaryotes), incise the nascent strand, excise the replication error, correctly synthesize new DNA and ligate the nascent strand. Generally, MMR improves replication fidelity by about 100- to 1,000-fold (Fig. 1), with exceptions wherein MMR efficiency can be negligible (e.g., some C–C mismatches), or much higher (deletion mismatches in mononucleotide runs, discussed below). MMR is subdivided into multiple “subpathways” that differ by substrate specificity and protein requirements, in ways that are only partly understood. MMR proteins also can modulate recombination, participate in repair of double-strand DNA breaks and signal apoptosis in response to DNA damage. As a consequence, loss of MMR is associated with elevated mutation rates and altered survival in response to DNA damage.
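Because nucleotide selectivity, proofreading and MMR act in series, their contributions can be combined multiplicatively as a first approximation. The Python sketch below assumes the three steps are independent, which is a simplification, and uses illustrative values drawn from the ranges quoted in this article (a selectivity error rate of about 10⁻⁴, proofreading up to about 100-fold, MMR about 100- to 1,000-fold). The function name and the specific numbers chosen are illustrative only.

```python
def overall_error_rate(selectivity_error, proofreading_factor=1.0, mmr_factor=1.0):
    """Approximate final error rate when the three fidelity steps act in series.

    selectivity_error   : errors per nucleotide made at the insertion step
    proofreading_factor : fold improvement from 3'-exonuclease proofreading
    mmr_factor          : fold improvement from mismatch repair
    A factor of 1.0 models complete loss of that correction step.
    """
    return selectivity_error / (proofreading_factor * mmr_factor)


wild_type = overall_error_rate(1e-4, proofreading_factor=100, mmr_factor=1000)
no_mmr = overall_error_rate(1e-4, proofreading_factor=100, mmr_factor=1.0)
no_proofreading = overall_error_rate(1e-4, proofreading_factor=1.0, mmr_factor=1000)

print(f"all three steps intact: {wild_type:.1e}")
print(f"MMR lost:               {no_mmr:.1e}")
print(f"proofreading lost:      {no_proofreading:.1e}")
```

In this simplified picture, losing any one step raises the error rate by that step's fold contribution, which is the sense in which a defect in a single process can produce a mutator phenotype.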
There are multiple ways to perturb each of the three major mechanisms that determine replication fidelity (Fig. 1, right).
Nucleotide selectivity can be reduced by mutations in polymerase genes that alter amino acids comprising the active site or the nascent base pair binding pocket, or even more distant amino acids that indirectly influence selectivity (for further review please see chapter by Sweasy). For the major replicative polymerases whose polymerase activities are critical for cell viability, the probability that such defects will be associated with cancer is likely to be small, because most such mutations reduce polymerase activity and therefore reduce proliferative potential. The probability that a dNTP will be misinserted also depends on the relative concentrations of each of the four dNTPs available for DNA synthesis. For example, an abnormally high concentration of dTTP relative to dCTP will promote misinsertion of dTMP opposite template guanine which, if uncorrected, will yield G–C to A–T substitutions. Interestingly, the four dNTP concentrations are not equal even in normal cells (e.g., the dGTP concentration is typically lower than the other three), and they are highly regulated throughout the cell cycle as well as in response to environmental stress. Any deviation from the normal dNTP ratios has the potential to decrease replication fidelity and generate a mutator phenotype. This is illustrated by the elevated mutation rates in cells containing mutations in genes that regulate dNTP concentrations, e.g., ribonucleotide reductase [43–45]. dNTP pool imbalances induced by means other than genetic defects, e.g., by environmental stress, also have the potential to elevate mutation rates. This could occur in a transient manner that would contribute only to a subset of the multiple mutations needed for tumor formation. dNTP pool imbalances that result from increasing one or more dNTP concentrations may be more relevant to cancer than imbalances resulting from decreasing dNTP concentrations below normal levels, because the latter can limit proliferative potential.
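The dependence of misinsertion on relative dNTP concentrations can be illustrated with a simple competition model, in which the correct and incorrect dNTPs compete for insertion at one template position in proportion to their concentrations and their relative insertion efficiencies. The sketch below is a minimal illustration under that assumption. The discrimination value of 10⁴ and the concentrations are hypothetical numbers chosen for the example, not measurements from the article.

```python
def misinsertion_probability(wrong_conc, right_conc, discrimination):
    """Probability that the incorrect dNTP is inserted at one template position.

    discrimination is the ratio of insertion efficiencies (correct / incorrect)
    at equal concentrations; concentrations may be in any common unit.
    """
    ratio = wrong_conc / right_conc
    return ratio / (discrimination + ratio)


# Hypothetical template G: correct dCTP competes with incorrect dTTP.
balanced = misinsertion_probability(wrong_conc=10.0, right_conc=10.0,
                                    discrimination=1e4)
dttp_excess = misinsertion_probability(wrong_conc=100.0, right_conc=10.0,
                                       discrimination=1e4)

print(f"balanced pools:      {balanced:.2e}")
print(f"10-fold dTTP excess: {dttp_excess:.2e}")
print(f"fold increase:       {dttp_excess / balanced:.1f}")
```

When discrimination is high, the misinsertion probability scales almost linearly with the ratio of incorrect to correct dNTP, so in this model a 10-fold pool imbalance raises the rate of that specific error by close to 10-fold.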
The efficiency of proofreading depends on a balance between polymerization from terminal mismatches and 3′-exonucleolytic excision of the error. Thus, defects in either activity can reduce the contribution of proofreading to replication fidelity (Fig. 1). The exonuclease can be inactivated by active site mutations. Its activity can be reduced by the presence of dNMPs, which are the end products of excision and can bind to the exonuclease active site and inhibit activity. Proofreading can be reduced by mutations that alter the ability of the primer terminus to switch from the polymerase to the exonuclease active site. Proofreading can also be suppressed by mutations in the polymerase active site that promote mismatch extension, or by dNTP pool changes that increase the concentration of the correct dNTPs needed to extend mismatches.
Proofreading potential can also be affected by DNA damage. A fine example involves the common oxidative lesion, 8-oxo-guanine. In its syn conformation, 8-oxo-G pairs with adenine and largely escapes proofreading because it mimics the shape of a correct base pair within the nascent base pair binding pocket of the polymerase. Finally, even some undamaged mismatches are simply poor substrates for normal proofreading. The best example involves inefficient proofreading of insertion and deletion mismatches in repetitive sequences. Efficiency decreases as the length of a repetitive sequence increases, because the unpaired base can reside upstream of the terminus, where it can be protected from excision by intervening correct base pairs. In principle, any of these mechanisms could be relevant to a mutator phenotype associated with cancer, because proofreading is dispensable for eukaryotic cell viability.
MMR is also dispensable for eukaryotic cell viability, yet critical for replication fidelity. Like proofreading, MMR efficiency can be reduced by mutations in genes whose products are required for MMR (Fig. 1). The degree to which these genetic defects elevate mutation rates varies considerably, depending on the nature of the gene mutation, the function of the particular gene product in MMR and the type and location of the replication error considered. Another way to reduce MMR efficiency is to change the expression of MMR genes. MMR can be inactivated by promoter hypermethylation to silence expression of the essential human MMR gene MLH1, a mechanism strongly associated with sporadic colon cancer. The level of MMR proteins needed to correct replication errors under normal conditions may be insufficient when replication errors accumulate at a higher rate. For example, studies have shown that MMR capacity can be saturated by rapid error-prone replication in Escherichia coli, or by the presence of DNA lesions to which MMR proteins can bind. In yeast, over-expression of two MMR genes, MLH1 and MSH3, reduces MMR efficiency, presumably by perturbing normal protein–protein partnerships required for MMR. Exposure of budding yeast to cadmium, a metal that is ubiquitously present in the environment and is a known human carcinogen, can inhibit MMR and strongly elevate the mutation rate. Other environmental exposures may also inhibit MMR. Reduced MMR due to altered expression, saturation, or cadmium inhibition is reversible, again potentially generating transient mutator phenotypes.
Defects in the cellular machinery required for accurate DNA replication have been correlated strongly with mutator phenotypes and increased cancer susceptibility. A particularly well-understood example involves Pol η, which is highly efficient at bypassing cyclobutane pyrimidine dimers, one of the major lesions generated by exposure to ultraviolet light. Cells lacking Pol η have an elevated frequency of UV-light-induced mutations, and patients lacking Pol η have increased susceptibility to sunlight-induced skin cancer [47–49]. This effect has been recapitulated in mouse models [50,51], and additional studies in mice also reveal increased cancer susceptibility due to defects in two other TLS polymerases, Pol ι [52,53] and Pol ζ [54,55]. Defects in the major replicative polymerases, and in polymerases involved in repair synthesis, also are associated with cancer susceptibility. These studies are reviewed elsewhere (for further reading please see Sweasy, Hoffman, Preston and Loeb chapters in this volume), allowing us to turn final attention to the potential value of replication error specificity for identifying the causes of mutator phenotypes.
Finally, we consider examples of how replication error signatures are, or might be, used to probe the mutator hypothesis.
A highly successful current example is microsatellite instability (MSI), the loss or gain of base pairs in short (1–6 base) repetitive DNA sequences. These mutations result from strand slippage during replication to generate one or more unpaired bases. When slippage occurs in repetitive sequences, the extra base(s) comprising the mismatch can be present in otherwise duplex DNA, well upstream of the primer terminus and stabilized by intervening correct base pairs whose number increases as the number of base pairs in the repetitive sequence increases [57–59]. As a consequence, slipped mismatch intermediates are not only frequently generated during replication, they are also more readily extended than terminal mismatches and therefore inefficiently proofread. For these reasons, MMR is the major guardian of genome stability against MSI, as illustrated by the observation that an MMR defect can increase the rate of certain single base deletions by 100,000-fold [60,61]. This is why MSI is now a widely used “replication error signature” that is diagnostic of MMR-defective tumors [62–66] (for further review see chapter by Salk and Horwitz). MSI may also be a useful biomarker for causes of mutator phenotypes other than MMR defects. For example, a recent study reported that if ribonucleotides incorporated into DNA during replication by yeast Pol ε are not removed by RNase H2-dependent repair, they result in 2–5 base pair deletions in repetitive sequences.
Another clear example of a useful error signature, albeit one not directly related to cancer per se, involves a base substitution signature for somatic hypermutation (SHM) of immunoglobulin genes. SHM generates mutations in the variable regions of immunoglobulin genes at an extraordinary rate, thereby contributing to affinity maturation of antibodies. Initial studies of the error specificity of Pol η in vitro demonstrated that human and mouse Pol η preferentially generate mismatches at A–T base pairs, and preferentially in certain sequence contexts. This motivated subsequent studies (e.g. [70–73]) that now strongly support the idea that Pol η is indeed responsible for much of the SHM occurring in vivo at A–T base pairs. Table 2 lists other point mutation signatures of replication infidelity that eventually may have diagnostic relevance for mutator phenotypes and cancer.
Forming a tandem double base substitution by DNA synthesis requires misinsertion, followed by another misinsertion onto a mismatched terminus, and then followed by multiple extensions of a doubly mismatched terminus. Each of these steps alone is normally rare, and therefore accurate replicative DNA polymerases have rarely been seen to generate tandem double base substitutions. However, Pol κ, Pol ζ and especially Pol η [69,72,76] generate tandem double base substitutions at readily detectable rates when copying undamaged DNA. In fact, the accuracy of Pol η is so low that it also generates tandem triple substitutions and mutants with substitutions separated by 1–3 base pairs. Tandem double base substitutions are also a characteristic feature of mutagenesis resulting from exposure to UV light, e.g., CC to TT mutations that are likely generated during TLS of UU dimers resulting from deamination of CC dimers. For these reasons, tandem double base substitutions are a biomarker for involvement of TLS polymerases.
When copying undamaged DNA in vitro, Pol ζ generates multiple substitutions, deletions and/or insertions within a few base pairs of each other. Pol ζ also generates such complex errors in vivo [77,78], more so in genetic backgrounds defective in processing DNA damage. Highly mutagenic DNA synthesis by yeast Pol ζ also has recently been observed when copying long tracts of damaged single stranded DNA. Such long tracts can be generated during the end-resection phase of DNA double strand break repair, or by uncoupling leading and lagging strand replication. This Pol ζ-dependent mutagenesis can introduce multiple mutations spread over thousands of base pairs. Whether tandem, clustered or spread out, the concerted introduction of multiple mutations in a single DNA transaction provides an opportunity for selection in tumor progression, or in evolution, that differs from the selective advantages of single point mutations introduced in series over multiple cell generations (for further discussion, see [79,81,82] and references therein).
Other TLS polymerases also have unusual error specificities (Table 2). These include Pol ι, which frequently inserts incorrect dNTPs opposite template pyrimidines [83,84], Pol ν, which forms T-dGTP mismatches at an unusually high rate [85,86], and Pol θ, which generates single base additions at a higher rate than other polymerases. In addition, mutant alleles of the three major replicative polymerases have high error rates for some mismatches but not others. These biases have been useful for assigning the roles of Pol δ and Pol ε in replicating the leading and lagging strand templates in yeast [7,8], and for identifying potential mutable sequence motifs for Pol δ errors in a whole yeast genome sequencing study.
The above examples of replication error specificity illustrate the potential value of testing the mutator hypothesis by not only looking for increases in spontaneous and stress-induced mutation rates and frequencies, but also actually using replication error signatures as clues to the molecular nature of the defects that initiate and/or promote tumor formation.
We thank Katarzyna Bebenek and Jessica Williams for thoughtful comments on the manuscript. The research conducted in Thomas Kunkel’s laboratory is supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (Projects Z01 ES065070 and Z01 ES065089).
Conflict of interest
The authors declare that there is no conflict of interest.